Analog Video Capture on Linux

For video capture I use an old AIMS Lab Video Highway Xtreme PCI video capture card. This card uses a bt848 chip. To capture video with 'square' pixels you need to run the sample clock at 14.75 MHz exactly. But the bt848 does not use this frequency. Instead, it runs on 4 times the PAL color carrier, or roughly 17.73447 MHz.

In the PAL system, one line of video lasts 64μs. This means that the bt848 takes 17.73447 * 64 ≈ 1135 samples per line. But for square pixels we need 64 * 14.75 = 944 samples per line. So the bt848 chip has a so-called HSCALE register that you can set so that it scales the 1135 samples down to whatever is needed. The calculation for HSCALE, according to the datasheet, is:

HSCALE = [(1135 / Pdesired) - 1] * 4096

For square pixels, Pdesired is 944, as we have seen. Thus HSCALE comes to about 828.75, which is then rounded up to 829. (Curiously the datasheet suggests the rounded-down value of 828.)
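The arithmetic above is easy to check in a few lines of Python. This is only a sketch of the datasheet formula as quoted here; the function and variable names are made up for illustration and do not come from the datasheet or the driver.

```python
# Number of bt848 samples in one 64 microsecond PAL line (4 x colour carrier).
SAMPLES_PER_LINE = 1135


def hscale(pixels_per_line, samples_per_line=SAMPLES_PER_LINE):
    """Datasheet formula: HSCALE = (samples / desired pixels - 1) * 4096."""
    return (samples_per_line / pixels_per_line - 1) * 4096


# Square pixels: 64 microseconds at 14.75 MHz gives 944 pixels per line.
square_pixels_per_line = 64 * 14.75
value = hscale(square_pixels_per_line)

print(f"HSCALE = {value:.2f}")           # the exact (fractional) value
print(f"rounded to nearest: {round(value)}")
print(f"rounded down:       {int(value)}")
```

The fractional value sits three quarters of the way between 828 and 829, which is why rounding to nearest and rounding down disagree.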

The bt848 chip has another register that you need to set, called HACTIVE. This determines the number of pixels per line (after scaling) that are actually sent to the host computer.

In PAL-land, the 'active area' of one line of video (the part that actually contains a picture) is 52μs. This corresponds to 52 * 14.75 = 767 square pixels. This is rounded up to a nice even number, and so HACTIVE becomes 768.

There is another required register, HDELAY, which allows one to fine-tune the position of the active area relative to the beginning of the video line. It is not really relevant to what I want to discuss, so we'll ignore it for now.

In practice you, as a user of the device, should not have to know any of this. All that you care about is that you want to capture PAL video at 768x576 square pixels.

On Linux the low-level programming of the bt848 chip is done by a driver. The driver is exposed to the user via the Video4Linux API.

Here is where things get interesting.

Since (I guess) the Video4Linux (v4l) API is designed to accommodate a multitude of capture hardware, all this HSCALE business is abstracted away from the user. Instead we have to do something entirely different.

In v4l we need to set a video format. Normally this would be some sane default like 768x576 with a certain pixel format.

The next thing that is relevant is the so-called 'crop window'. This defines the active area. In v4l the width of the active area is specified in unscaled samples. For the bt848 we can compute that: it should be 17.73447 * 52 = 922.19 samples, which is rounded down to 922.
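To make the two kinds of width explicit, here is a small sketch that converts a duration on the video line into a sample count, once for the 14.75 MHz square-pixel clock and once for the bt848's own clock. The constant and function names are invented for this example.

```python
LINE_US = 64.0            # duration of one full PAL line, in microseconds
ACTIVE_US = 52.0          # duration of the picture part of the line
SQUARE_PIXEL_MHZ = 14.75  # sample clock that yields square pixels
BT848_MHZ = 17.73447      # roughly 4 x the PAL colour carrier


def samples(duration_us, clock_mhz):
    """Number of samples taken in duration_us at clock_mhz (MHz * us = samples)."""
    return duration_us * clock_mhz


# Square pixels (what the user asks for after scaling).
print(samples(LINE_US, SQUARE_PIXEL_MHZ))    # whole line
print(samples(ACTIVE_US, SQUARE_PIXEL_MHZ))  # active area

# Unscaled bt848 samples (the unit the crop window is specified in).
total_unscaled = samples(LINE_US, BT848_MHZ)
active_unscaled = samples(ACTIVE_US, BT848_MHZ)
print(f"{total_unscaled:.2f}")
print(f"{active_unscaled:.2f}")
```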

For some bizarre reason the bt848 driver sets the default crop window width not to 922, but to 924. According to the driver this is because '924 is divisable by 4 and 3' - what does that have to do with anything!?

Anyway, as a result, the HSCALE value that the v4l driver actually programs into the chip turns out to be not 829, not even 828, but 833! This is nowhere near what it should be. (Can you tell I'm in pedantic mode?)

To adjust HSCALE you need to adjust the cropping window. Unfortunately there is no way to get the correct HSCALE value (apart from hacking the driver). Here's a small table (c_width = width of crop window):

width  c_width  HSCALE  images
768    922      821     capture (blake), diff; capture (pm5544), diff
768    923      827     capture (blake), diff; capture (pm5544), diff
768    924      833     capture (blake), diff; capture (pm5544), diff

The original 'blake' image used in the capture/diffs above is here. This is taken directly from a Blake's 7 season 4 DVD. The DVD was then played on a Pioneer DV-350 DVD player and captured via S-video. Assuming the DVD player outputs at the BT.601 standard 13.5 MHz, the source picture was scaled horizontally by 59/54 to convert BT.601 into square pixels, then overlaid with the inverse of the captures to generate the 'diff' pictures.

The 'pm5544' image was generated from this picture. I scaled it horizontally by 54/59 and then displayed it on a Raspberry Pi and captured its output via composite video. (The Pi also outputs at BT.601 frequency, apparently.)

The best thing to do for now is to set c_width to 923, and/or do some postprocessing to scale the image properly. At least having HSCALE=827 is 'less wrong' than HSCALE=833.
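For what it's worth, here is a minimal sketch of how the crop width could be changed from Python using the V4L2 crop ioctls (VIDIOC_G_CROP and VIDIOC_S_CROP). It makes several assumptions: a Linux system with the generic ioctl number encoding (x86 and ARM; a few architectures differ), a driver that accepts these ioctls, and a device node such as /dev/video0. The helper name set_crop_width is made up, and the code has not been tried against a bt848 card here.

```python
import fcntl
import os
import struct

# struct v4l2_crop { __u32 type; struct v4l2_rect c; } with
# struct v4l2_rect { __s32 left; __s32 top; __u32 width; __u32 height; }
CROP_FORMAT = "IiiII"
CROP_SIZE = struct.calcsize(CROP_FORMAT)

V4L2_BUF_TYPE_VIDEO_CAPTURE = 1


def _ioc(direction, nr, size):
    """Build a 'V' ioctl number (assumes the generic Linux encoding)."""
    return (direction << 30) | (size << 16) | (ord("V") << 8) | nr


VIDIOC_G_CROP = _ioc(3, 59, CROP_SIZE)  # _IOWR('V', 59, struct v4l2_crop)
VIDIOC_S_CROP = _ioc(1, 60, CROP_SIZE)  # _IOW('V', 60, struct v4l2_crop)


def set_crop_width(fd, c_width):
    """Read the current crop window on fd and write it back with a new width.

    The left edge, top edge and height are left as the driver reported them,
    so a narrower window trims the right-hand side.
    """
    buf = bytearray(
        struct.pack(CROP_FORMAT, V4L2_BUF_TYPE_VIDEO_CAPTURE, 0, 0, 0, 0)
    )
    fcntl.ioctl(fd, VIDIOC_G_CROP, buf)  # the driver fills buf in place
    buf_type, left, top, _old_width, height = struct.unpack(CROP_FORMAT, buf)
    new = struct.pack(CROP_FORMAT, buf_type, left, top, c_width, height)
    fcntl.ioctl(fd, VIDIOC_S_CROP, new)


# Usage sketch (needs real hardware, so it is left commented out):
# fd = os.open("/dev/video0", os.O_RDWR)   # hypothetical device path
# set_crop_width(fd, 923)
# ... capture using this same fd ...
# os.close(fd)
```

The helper takes an already open file descriptor on purpose: the capture has to happen on that same descriptor, because by default the driver drops the crop setting again when the device is closed.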

If you want to be really pedantic, you could argue that square pixels are really 52μs/768 wide, not 52μs/767. That corresponds to an HSCALE of about 822. In this case c_width = 922 is the better approximation (and 924 is even more wrong).
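The relation between crop width and HSCALE can be approximated by plugging the ratio c_width / width into the datasheet formula in place of 1135 / Pdesired. The sketch below does just that. It is not the driver's code: the driver uses its own integer arithmetic, so its values can differ by one (for c_width = 924 it programs 833, while this approximation gives 832). The function name is made up.

```python
def approx_hscale(c_width, width=768):
    """Approximate HSCALE when c_width unscaled samples are scaled to width pixels."""
    return (c_width / width - 1) * 4096


# The three crop widths from the table.
for c_width in (922, 923, 924):
    print(c_width, f"{approx_hscale(c_width):.2f}")

# The 'pedantic' case: exactly 52 microseconds of bt848 samples mapped to 768 pixels.
ideal_c_width = 52 * 17.73447
print(f"{ideal_c_width:.2f}", f"{approx_hscale(ideal_c_width):.2f}")
```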

As an aside, the v4l API allows you to change the cropping window, but the change is backed out as soon as you close the video device! To get do-what-I-mean behaviour you have to explicitly set an obscure module parameter (reset_crop=0). This apparently is for 'backward compatibility'. I guess for something that was relevant in 1996.

Michiel Boland

Aug 2020