2014-11-26 06:13 UTC
Video generation in computing has historically been a somewhat tricky business. In modern high-speed systems it's not really an issue, but 30 years ago things were different. A video display must generally be constantly refreshed and fed with an uninterrupted stream of data and synchronization pulses in order to display a stable picture. The data is normally read from a dedicated video memory, over and over again for each screen refresh. The CPU needs to access this memory in between the reads in order to update the picture on the display. This is a timing accident waiting to happen, especially if the system bus needs to run at a completely different clock than the video display, perhaps even totally asynchronous to it. The most common technique used in the past was to simply avoid the issue by running the CPU and the video display from the same clock. Not just the same rate but exactly the same clock. That way the CPU and video interface can take turns accessing the common video memory. This is a simple and effective method but has its limitations in speed and flexibility. I wanted to build a video interface for my MC3 6303 computer that could handle an asynchronous bus clock without getting overly complicated. This is an idea that has been lurking in the back of my head for several years but I've never really taken the time to realize it. Until now.
The video interface I'm presenting here is made to be simple and understandable while using all discrete logic. I've seen many "retro" video projects today that use a modern microcontroller to generate video. That just doesn't feel right. Almost on the brink of cheating.
Specifications
- Standard B/W PAL composite video output
- Fully bitmapped 16kB memory
- Around 400x256 pixels usable resolution depending on the TV/monitor in use
- Built using commonly available IC's
Circuit description
This is a bitmapped interface meaning it has no character generator, sprites, hardware scrolling or any other fancy stuff. The bits in memory are displayed on screen. That's it. Character generator and hardware scrolling would be relatively simple to implement but I've left that out for now.
The main concept is that the whole interface is driven by a constantly running 18-bit binary ripple counter built around two cascaded 4040 counters clocked at 8MHz. All timing is derived from this counter and the same counter is also used for iterating through the video RAM. PAL video lines are 64us long. With a pixel clock of 8MHz there is room for 512 pixels on each line (some of which are hidden under the horizontal sync signal and some of which are outside the safe video area). This means that the lower 9 bits of the counter are counting pixels on each video line and the upper 9 bits are counting video lines on each frame. PAL uses 625 interlaced lines. Interlacing is of no use here so I opted for progressive video with half the resolution, which is roughly 312 lines. The video synchronization pulses are generated from the counter outputs using logic gates. A composite sync is generated by a simple XOR of both HSYNC and VSYNC.
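As a rough illustration of how one free-running counter yields both coordinates, the sketch below splits an 18-bit count into a pixel position and a line number and works out the resulting line and frame timing. The function and constant names are made up for this example; only the 8MHz clock, the 9/9 bit split and the 312-line frame come from the description above.

```python
PIXEL_CLOCK_HZ = 8_000_000   # crystal frequency from the text
PIXELS_PER_LINE = 512        # lower 9 bits of the counter
LINES_PER_FRAME = 312        # the counter is reset after this many lines


def split_counter(count):
    """Return (pixel, line) for an 18-bit counter value."""
    pixel = count & 0x1FF         # lower 9 bits: position within the line
    line = (count >> 9) & 0x1FF   # upper 9 bits: line within the frame
    return pixel, line


line_time_us = PIXELS_PER_LINE * 1_000_000 / PIXEL_CLOCK_HZ
frame_rate_hz = PIXEL_CLOCK_HZ / (PIXELS_PER_LINE * LINES_PER_FRAME)

print(split_counter(1000))      # (488, 1)
print(line_time_us)             # 64.0
print(round(frame_rate_hz, 2))  # 50.08
```

The line time comes out at exactly the 64us PAL line, and the frame rate lands within a fraction of a percent of 50Hz.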
All gates are 74HCT regardless of what the schematics say.
Above is the first part of the schematic for the video interface showing the counters and gates that generate the video synchronization signals. In order to maintain a 50Hz refresh rate the counter needs to be reset after 312 lines. The reset is performed using a diode network that triggers the reset signal at 512 pixels * 312 lines = 159744. Actually I chose the value 159743 as reset value (one pixel early). The precise reset timing can then be set by adjusting R6. If the reset is just a tiny bit off then the image will be skewed at the top and that's not pretty at all.
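The reset arithmetic can be checked with a few lines of Python. The sketch below also lists which counter outputs are high at the reset count, under the assumption (not stated in the text) that the diode network acts as an AND gate with one diode per high bit.

```python
PIXELS_PER_LINE = 512
LINES_PER_FRAME = 312

full_frame = PIXELS_PER_LINE * LINES_PER_FRAME   # counts in one whole frame
reset_count = full_frame - 1                     # chosen one pixel early

# Counter outputs that are high at the reset count (bit 0 = least significant).
high_bits = [bit for bit in range(18) if (reset_count >> bit) & 1]

print(full_frame)                   # 159744
print(format(reset_count, "018b"))  # 100110111111111111
print(len(high_bits))               # 15
```

Fifteen high bits happens to match the 15 BAT42 diodes in the parts list further down, which fits the one-diode-per-bit reading, though the post does not say so explicitly.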
Also visible in the schematic is the shift register for shifting out the individual video pixels. Every eighth pixel the shift register is loaded with a new value, thus every line covers 64 bytes of data. The second part of the schematic just has to make sure the correct data is available for the shift register at this time.
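From the software side, this layout means a pixel's location in memory follows directly from the counter value. The sketch below is only an illustration: it assumes the byte address is the counter with its lowest three bits dropped and truncated to 14 bits for the 16kB memory, and that the most significant bit of each byte is shifted out first. Neither detail is confirmed by the text above, and the function names are invented for the example.

```python
BYTES_PER_LINE = 64  # 512 pixels / 8 pixels per byte


def counter_to_address(count):
    """Byte address fetched for a counter value, keeping 14 bits for 16kB."""
    return (count >> 3) & 0x3FFF


def pixel_to_byte_and_mask(x, y):
    """Map counter coordinates (x, y) to a byte address and a bit mask.

    Assumes the most significant bit is the leftmost pixel.
    """
    address = (y * BYTES_PER_LINE + x // 8) & 0x3FFF
    mask = 0x80 >> (x % 8)
    return address, mask


print(counter_to_address((10 << 9) | 20))  # 642
print(pixel_to_byte_and_mask(20, 10))      # (642, 8)
```

Note that x and y here are raw counter coordinates; the visible picture starts at some offset that depends on the sync timing and the monitor.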
Above is the second part of the schematic. This part shows the bus interface and the video memory (Cypress CY7C199-20PC). The bus interface consists of three latches. These latches trigger on writes to MC3 I/O-page 4 (16kB) and hold the address and data values until the video interface is ready to receive them. My MC3 computer has regular memory in this area so this design is actually mirroring a CPU accessible memory area to a local memory area on the video interface, making the video RAM, from a software point of view, behave as regular RAM that can be both read and written to.
Two timing signals are generated to accomplish this. One signal (generated by IC7A and C1) alternates the RAM direction between writing data from the latches and reading data to the shift register. This means that in a cycle of eight pixels the RAM is read once and written once. The second signal is the write strobe to the memory. I've chosen a relatively fast 20ns memory for this design, the type of memory that can normally be found in 386/486 PC's as cache memory.
The RAM address bus is directly connected to the latches and connected to the counter via 1k resistors. This avoids the need for additional muxes since the latches can then simply force the address bus to a specific value when writing to RAM, regardless of the current counter value.
The bus latches are read at a rate of 1MHz. As long as writes from the system bus are not too frequent then data will not be missed. My MC3 system bus runs at 1.2288MHz and writes are performed nowhere near every E cycle so there is plenty of time for the video interface to read the data. I think a bus rate of up to perhaps 2-4MHz should be possible using this technique since most CPU's cannot perform writes at every cycle anyway.
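To put numbers on "not too frequent", here is a back-of-the-envelope sketch. It uses a deliberately simplified model that is an assumption, not a description of the actual circuit timing: a single-entry latch that is emptied once per eight-pixel slot, so a write may wait up to one full slot period before it is picked up. Setup and hold times and the exact phase of the write slot are ignored.

```python
import math

PIXEL_CLOCK_HZ = 8_000_000
PIXELS_PER_SLOT = 8  # the latches are serviced once per eight pixels


def min_cycles_between_writes(bus_hz):
    """Bus cycles needed between two writes so the first is always picked up.

    Simplified model (an assumption): the worst-case wait for a latched
    write is one full slot period.
    """
    slot_rate_hz = PIXEL_CLOCK_HZ // PIXELS_PER_SLOT  # 1 MHz
    return math.ceil(bus_hz / slot_rate_hz)


for bus_hz in (1_228_800, 2_000_000, 4_000_000):
    print(bus_hz, min_cycles_between_writes(bus_hz))
```

Under this model the 1.2288MHz bus needs at least two E cycles between writes and a 4MHz bus needs four, which is in line with the estimate above that 2-4MHz should still work.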
Construction
As with the other boards in my MC3 system the video interface is built on a prototyping board using soldered wire-wrap wire. I was a little bit worried about signal quality and noise using this method but I've had no problems so far. The video output is very clean.
There is some space left over for possible future additions. The only trimming point is the variable resistor for the counter reset timing. This has not needed any re-adjustments so it appears to be stable.
Note the temporary video output connection and the simple two resistor mixer. I have not yet fully decided on the best way to drive the output. Also take note of the decoupling capacitors on all chips. They are not in the schematic but are important.
Results
Below are some sample photos showing what is possible with this interface.
Quick and dirty text terminal showing various commands from my MC3 monitor program. Resolution is 50 columns by 32 rows when using an 8x8 font.
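Because there is no character generator, a terminal like this has to copy each glyph into the bitmap one pixel row at a time. The sketch below shows the idea on a plain bytearray standing in for video RAM. The glyph data, the function name and the two offset parameters (for wherever the visible area happens to start) are all made up for illustration; the real MC3 routine is not shown in this post.

```python
BYTES_PER_LINE = 64
FONT_HEIGHT = 8


def draw_glyph(vram, glyph, column, row, x_offset_bytes=0, y_offset_lines=0):
    """Copy an 8x8 glyph (8 bytes, top row first) into a bitmapped frame buffer."""
    for i, bits in enumerate(glyph):
        line = y_offset_lines + row * FONT_HEIGHT + i
        vram[line * BYTES_PER_LINE + x_offset_bytes + column] = bits


vram = bytearray(16 * 1024)
letter_a = bytes([0x18, 0x24, 0x42, 0x7E, 0x42, 0x42, 0x42, 0x00])  # made-up glyph
draw_glyph(vram, letter_a, column=2, row=1)

print(vram[8 * 64 + 2], vram[11 * 64 + 2])  # 24 126
print(400 // 8, 256 // 8)                   # 50 32
```

Since every character cell is exactly one byte wide, a glyph costs eight single-byte writes with no bit shifting, which is where the 50 by 32 figure comes from.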
This is a 400x256 photo of the house cat glaring through the window, using dithering to imitate gray scale.
On 2022-06-04 our beloved cat sadly passed away just one month short of his 10th birthday.
We miss you immensely little angel <3
Known issues and limitations
This interface has been running pretty solidly, but I have noted a few minor issues so far.
1. The quick and dirty video output driver stage, consisting of two '04 inverters and two 470ohm resistors, does not really comply with the broadcast standard, but most TVs/monitors have no problem with this. This should be improved upon, hence the currently temporary video connection and driver. Need to sort this out. Now fixed. See below.
2. Since the video runs completely asynchronously to the CPU, some banding may show for fast repeated fills. This may be avoided by implementing a double-buffered memory synchronized with the vertical sync, or by writing only when video is not drawing, but tearing is only visible under certain extreme conditions so it's not really an issue. I have not seen snowing or any other types of artifacts.
3. Scrolling is slow when used as a text terminal. This is the most significant issue so far. Since this interface has no hardware for scrolling in any direction, scrolling has to be performed by copying large amounts of data in video RAM. This causes a slowdown when the interface is used as a terminal where normally the entire screen is scrolled upwards for every new line. Next step would be to implement scrolling in hardware (basically just a latch and an adder) to achieve vertical scrolling with just a single register write.
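To give a feel for the cost in issue 3, the sketch below performs a one-row software scroll on a bytearray standing in for video RAM and reports how many bytes had to move. It assumes 256 displayed lines of 64 bytes each and an 8-pixel-high font; the function name is invented for the example, and on the real CPU every byte is a separate read and write.

```python
BYTES_PER_LINE = 64
VISIBLE_LINES = 256
FONT_HEIGHT = 8


def scroll_up_one_text_row(vram):
    """Software scroll: move everything up by 8 pixel lines, blank the last row."""
    step = FONT_HEIGHT * BYTES_PER_LINE    # 512 bytes per text row
    end = VISIBLE_LINES * BYTES_PER_LINE   # 16384 bytes in total
    vram[0:end - step] = vram[step:end]
    vram[end - step:end] = bytes(step)
    return end - step                      # bytes that had to be copied


vram = bytearray(i % 251 for i in range(16 * 1024))  # arbitrary test pattern
old_second_row_start = vram[512]
copied = scroll_up_one_text_row(vram)

print(copied)                           # 15872
print(vram[0] == old_second_row_start)  # True
```

Nearly the whole 16kB frame buffer is touched for every new line of text, which is why a hardware scroll register is the obvious next step.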
Update 2016-08-07 - Better output stage
I have now built a prettier video output driver stage. It can still be improved upon but it's a lot better than the two resistors that were not even close to the correct impedance. Schematic can be seen below.
This is basically a simple transistor amplifier and signal mixer with two inputs, LUMA and SYNC, taken from IC7 pin 12 and pin 10 respectively. None of the component values are very critical. R3 should be around 75 ohm but 100 ohm worked just fine for me. The brightness of the output video can be adjusted with R5. This also affects the working point of the amplifier and is one of the reasons that the component values are not really critical. D1 is a general silicon small signal diode. I tested with 1N4148 and BAT42. Both worked fine. The reason for using a diode is to make sure that the SYNC signal always pulls the output low regardless of the LUMA level. This ensures proper synchronization even if erroneous pixels are being clocked out.
A proper output stage was really the only issue left to fix on the video interface. Now this board feels very complete.
by Claudio 2014-12-10 11:29 UTC
Nice thing !
I am seriously interested in building this for my homebrew 6502-based computer ! I tried implementing s.th. more sophisticated with a V9938 - but after many weeks it's still somehow not working. Your design could be a way out for me !
Would it be possible for you to add a list of chips and parts please ?
by Daniel 2014-12-12 20:34 UTC
I'm glad you like it Claudio!
Sure thing. Never really made a bill of materials for this project but here it is. Be careful, there may be errors.
74HCT IC's
1x '00
3x '04
2x '08
1x '32
2x '86
1x '165
3x '373
2x '4040
Memory
1x Cypress CY7C199-20PC (or basically any old "fast" cache SRAM)
Diodes
15x BAT42
Resistors
2x 470
1x 680
15x 1k
1x 47k
1x 5k variable
Capacitors
2x 22p
2x 470p
16x 100n (decoupling)
Crystal
1x 8MHz
Connectors and mounting not included. Board is 100x160 euroboard.
I was also looking at the 99xx series of chips at first but ended up with this simple design.
by Dave 2015-03-23 01:09 UTC
Great circuit Daniel.
I agree with you about using a modern microcontroller for generating video, that is cheating.
by Claudio 2016-09-27 21:43 UTC
Hello again !
Now that you built a better output stage I'm really trying to build this project. May I contact you via Mail if I have a question ?
by Daniel 2016-09-28 15:17 UTC
Hi Claudio!
Really fun that you will give my design a go. I have been using this interface now for almost two years without a single glitch. I really like it.
I would recommend that you build part one of the interface first on a breadboard just to get the hang of the idea. You can generate a simple test pattern by connecting some pins of the shift register to the counter output to verify that you have a working signal without the need for a computer or video memory.
I've sent you an e-mail. Just let me know if you need any help :)
by Carlos 2017-01-13 14:46 UTC
Any progress to add character generator, etc?
by Daniel 2017-01-17 14:28 UTC
Yes and no :) Adding an 8x8 character ROM to the existing design is quite trivial. It will require 2k of ROM space for 8 bits of character address and 3 bits of line address. The lower three bits of the 9 bit line counter will be connected to the ROM instead of directly to the VRAM. This will reduce the VRAM requirements to 2k instead of 16k. This will result in a fixed display of around 50x32 characters depending on the amount of monitor overscan. However, I did not feel the need for it as I find the flexibility of the fully bitmapped frame buffer really nice! The only real issue is that full screen scrolling can get a bit slow. To improve that you can insert an 8 bit adder between the line counter (N16 to N9) and the VRAM (A13 to A6) to effectively scroll the screen vertically with a single register write. The scroller can be made using for example two 74HCT283 adders and another 74HCT373 latch to hold the scroll value. I can provide some detailed design ideas if you are interested.
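The address arithmetic described in this reply can be modelled in a few lines. This is only a sketch of one reading of the description: the VRAM holds character codes, its data output supplies the upper ROM address bits, and the scroll adder simply drops its carry. The function names are hypothetical and the exact wiring is an assumption.

```python
def char_rom_address(char_code, line):
    """2k ROM: 8 bits of character code and 3 bits of glyph row."""
    return ((char_code & 0xFF) << 3) | (line & 0x07)


def text_vram_address(column, line):
    """2k character RAM: 64 column slots by 32 text rows."""
    text_row = (line >> 3) & 0x1F
    return (text_row << 6) | (column & 0x3F)


def scrolled_line(line, scroll):
    """8-bit adder between line counter and VRAM; the carry out is dropped."""
    return (line + scroll) & 0xFF


print(char_rom_address(0x41, 13))  # 525
print(text_vram_address(5, 13))    # 69
print(scrolled_line(250, 8))       # 2
```

In the bitmapped design, scrolling the text up by one row would then amount to adding 8 to the scroll register and clearing the row that wraps around to the bottom.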
by me 2018-02-27 18:52 UTC
Hi!
Could you please provide some details about how to implement vertical and horizontal scrolling to this system?
by rhelectronics 2021-01-16 14:17 UTC
Hi
This circuit is amazing, I've built the sync section and testing with just some random bits on the shift register got me some stable vertical lines. Worked first time.
I am interested in how you would implement the character ROM you talk about in another comment. I understand how you would store the data but not how you would tell the video card which characters to display from that ROM.
Thanks
by DrZ 2021-07-23 18:07 UTC
hi,
i like your design! i read a lot on apple computers and modified the internal graphics . your design is really streamlined.
i would like to understand the scrolling: may you help me how you do that? is it by adding 0..7 or 0..312 to the line counter ?
thanks
Armin
by RH Electronics 2021-10-06 21:31 UTC
I've got this working perfectly now with a character ROM.
Not sure if I need hardware scrolling, but the adders are a good idea and will work well if needed.
by dirk gently 2021-12-26 18:16 UTC
How did you make the 4040 count at 8MHz? Are you powering them with elevated supply voltage? I can't get mine to count at 6MHz with 5V power supply
by RH Electronics 2022-02-25 07:08 UTC
Building this never stops! I’ve put most of the discrete logic now into a programmable chip, and I’m now testing out a RAMDAC to give me RGB. Using 2 memory chips for the video ram and colour ram.
Thanks again for a great original design.
by Mosso 2023-05-29 14:14 UTC
I have been thinking of making my own computer, with a custom display like this, except with lower resolution and two modes:
bitmap-based 128x128 and tile-based 256x256
There would be two sets of memory (2k and 1k), accessible by both the display and the cpu.
2k memory would have two uses:
In bitmap mode, it would be used as a screen memory, directly displaying the stored bitmap on the screen.
In tile mode, the bitmap would get "cut" into 256 unique 8x8 pixel sized pieces that would be then placed on the screen using the separate 1k memory.
That separate 1k memory would store placements of the tiles on a 32x32 grid.
I sucked at Physics at school, so Idk if I manage to make it, but how easy would this display be to make?
If I manage to get it to work (somehow), I would also try improving it by adding colors. (using a second set of 1k memory, storing 8x1 sized color attributes from a palette of 16 colors, and able to set different colors for both the 'background' and 'foreground' bits on each attribute)