On 17/03/2026 17:26, Michael Niedermayer via ffmpeg-devel wrote:
Hi Lynne
On Tue, Mar 17, 2026 at 04:13:33PM +0100, Lynne via ffmpeg-devel wrote:
On 17/03/2026 14:54, Michael Niedermayer via ffmpeg-devel wrote:
Hi everyone
STF is funding FFv1 Bayer video support.
The FFv1 specification has no Bayer support, so obviously part of this task
has to be to design the bitstream and compression algorithm and/or how to
map Bayer onto existing non-Bayer FFv1.
If you know an algorithm that should be considered, then please reply.
Similarly, if you know research work that compares Bayer compression
technologies, please reply too.
I wrote the proposal.
In my research, I compared several other existing algorithms, and I think
this is by far the most advanced compression for Bayer data someone's
written.
All other lossless algorithms prioritize speed over compression.
- Redcode RAW (R3D, what RED cameras use, non-public) is based on Haar
[...]
- JPEG2000 supports lossless Bayer, but similar to R3D, does no prediction
[...]
- RawZipper is the only specialized codec for lossless Bayer data out
From these descriptions, these don't sound like they are intended to be
high performance.
RawZipper doesn't seem that well-optimized. JPEG2000 is, though it was
specifically written for ASICs, rather than software implementations.
R3D was mostly about speed, however. It was developed about a decade ago
when RED still used FPGAs in their cameras, rather than ASICs, and they
didn't mind asking customers to buy a lot of their own high performance
SSDs.
[...]
This will be the first new codec specifically designed for Bayer data
storage.
There is a wide range of academic Bayer-CFA image compression algorithms.
These, or parts of them, could be considered.
We also need a test set of Bayer CFA images so we can keep track of the
performance effect that changes have.
Do you have such a test set? If yes, can we put that on samples.ffmpeg.org?
I have a camera that can shoot ProRes RAW, and I can put together a
test set of various content.
So I based this on the best lossy variant, which was Blackmagic RAW. It does
a partial debayering, but here, I keep the difference instead of throwing it
away. Then I use the regular FFv1 prediction scheme to encode it.
The prior art for the decomposition is "Reversible color transform for Bayer
color filter array images" by Iwahashi et al.
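
To illustrate what "keeping the difference instead of throwing it away"
means, here is a minimal Python sketch of a reversible average/difference
lifting step (an S-transform) applied to the two green samples of one 2x2
quad. This is an assumption for illustration only: it is not the transform
from the proposal or from the Iwahashi et al. paper, and the names fwd_green
and inv_green are hypothetical.

```python
def fwd_green(g0, g1):
    """Lossless average/difference of two integer green samples."""
    d = g0 - g1        # the difference a lossy codec could throw away
    a = g1 + (d >> 1)  # equals floor((g0 + g1) / 2)
    return a, d


def inv_green(a, d):
    """Exact inverse of fwd_green (hypothetical helper)."""
    g1 = a - (d >> 1)
    g0 = g1 + d
    return g0, g1


if __name__ == "__main__":
    a, d = fwd_green(10, 7)
    print(a, d, inv_green(a, d))
```

Because the difference is stored next to the average, both greens are
recovered exactly, which is what makes this kind of decomposition usable
in a lossless codec.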
I don't really see too many ways this can be improved. Decorrelating the 2
green channels is already pretty optimal. Possibly using 3 contexts instead
of 2 may help, since the post-RCT green difference could pollute the first
context, but this would only matter for level 4's inter-frame coding
(disabled by default), and would make intra-only suboptimal (reusing the
first context whenever possible is always a plus).
There are a few choices that are possible:
1. Are we storing N*M samples for an NxM CFA image, or more?
The current proposal uses 4 * (width/2) * (2 (or 3, with context=1))
transformed samples of storage per slice.
In other words, 2 total half-width lines for each component. Half as
much per component as GBRAP16, for example.
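
The arithmetic above can be written out as a short Python sketch. The
function names are hypothetical, and the GBRAP16 figure is simply derived
from the "half as much per component" statement (4 components, full-width
lines), not taken from the FFv1 code.

```python
def bayer_line_samples(width, context=0):
    """Samples of line storage per slice for the proposed Bayer mode."""
    lines = 3 if context == 1 else 2   # 2 lines, or 3 with context=1
    return 4 * (width // 2) * lines    # 4 components, half-width lines


def gbrap16_line_samples(width, context=0):
    """Assumed comparison: 4 components with full-width lines."""
    lines = 3 if context == 1 else 2
    return 4 * width * lines


if __name__ == "__main__":
    for w in (1920, 4096):
        print(w, bayer_line_samples(w), gbrap16_line_samples(w))
```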
If you attempt to store a green (predicted or not) plane that is sqrt(2)
subsampled from all luma samples, that is not 2 planes.
Look at this in a monospaced font:
G00 B G32 B G23 B G14 B
R G01 R G33 R G24 R G10
G11 B G02 B G34 B G20 B
R G12 R G03 R G30 R G21
G22 B G13 B G04 B G31 B
All the green samples are addressed in an orthogonal raster. Yes, it's
rotated by 45°, and yes, it has a boundary in the middle, but it can be
represented as 1 plane.
This of course assumes that the green samples are not subject to filtering that
makes them different from each other (which the current proposal would do).
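
The labelling in the diagram can be reproduced with a short Python sketch.
The index rule below is inferred from the diagram itself (an assumption,
not something stated in the mail): in Gij, j steps along the down-right
diagonal and i along the down-left one, with wraparound at the image edges.
It only covers this 8x5 example, and the names green_site and render are
hypothetical.

```python
W, H = 8, 5  # size of the example mosaic in the diagram


def green_site(i, j):
    """Map rotated-raster index (i, j) to (row, col) in the mosaic."""
    r, c = i + j, j - i   # j moves down-right, i moves down-left
    if r >= H:            # the "boundary in the middle": wrap back to the top
        r, c = r - H, c - H
    return r, c % W


def render():
    """Rebuild the diagram rows: even rows are G/B, odd rows are R/G."""
    grid = [["B" if r % 2 == 0 else "R" for _ in range(W)] for r in range(H)]
    for i in range(W // 2):
        for j in range(H):
            r, c = green_site(i, j)
            grid[r][c] = f"G{i}{j}"
    return [" ".join(row) for row in grid]


if __name__ == "__main__":
    print("\n".join(render()))
```

Every green site is hit exactly once by the 4x5 index grid, so the greens
do fit in a single rectangular plane, at the cost of neighbours wrapping
across that boundary.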
Also, a totally different way to encode Bayer-CFA is to convert it to YCbCr
4:2:0 in such a way that, after decompression and conversion to RGB, it is
lossless.
This would store 50% more samples than critically sampled CFA, but these extra
samples would be free parameters that can be freely chosen while maintaining
losslessness, so they could be optimized to minimize the bits per pixel.
Did you see a paper exploring / comparing this?
I did, actually, 10.3390/s22218362. It collated the green quincunx
samples into a 1:2 aspect image, then rotated it to form a 2:1 image,
and fed each plane through a (lossy) JPEG encoder.
This has the downside of requiring much more memory, as decoders would
need to decode the entire green image to undo the rotation.
I think decorrelating each quad is a better mechanism and better fits in
with the design of FFv1, preserving the base prediction process, like
with RGB.
When it comes to decorrelating, the proposed filters are asymmetric.
How would symmetric ones perform? And how do adaptive ones perform?
The current implementation does not prohibit adaptive decorrelation. The
RCT search can still be performed to find more optimal coefficients than
(1, 1).
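
For reference, a minimal Python sketch of a reversible color transform with
tunable coefficients, modelled on how the RCT in FFmpeg's FFv1 encoder is
commonly described (an assumption, not a copy of that code). The function
and parameter names are hypothetical, and the offset that keeps the chroma
values non-negative is left out. The default pair (1, 1) corresponds to the
coefficients mentioned above.

```python
def rct_forward(r, g, b, by_coef=1, ry_coef=1):
    """Forward transform: returns (y, cb, cr) as integers."""
    cb = b - g
    cr = r - g
    y = g + ((cb * by_coef + cr * ry_coef) >> 2)
    return y, cb, cr


def rct_inverse(y, cb, cr, by_coef=1, ry_coef=1):
    """Exact inverse: the same shift term is recomputed from cb and cr."""
    g = y - ((cb * by_coef + cr * ry_coef) >> 2)
    return cr + g, g, cb + g


if __name__ == "__main__":
    for coefs in ((1, 1), (2, 0), (0, 3)):
        ycc = rct_forward(100, 120, 90, *coefs)
        print(coefs, ycc, rct_inverse(*ycc, *coefs))
```

Since the correction term depends only on cb and cr, any coefficient pair
stays lossless, which is why an encoder is free to search for a pair that
compresses better.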
While it can be extended with a more expensive search, I think that
extending the decorrelation process should be done in a way in which
RGBA coding would also benefit, rather than tuning it specifically for Bayer.
Perhaps that's something we can investigate in the future, since custom
RCT coefficients are still marked as unstable.
Since H.264 intra prediction is directional, this is kind of the same
thing. Prediction along an edge should do better than across an edge.
FFv1's prediction filter is actually pretty good for diagonal patterns.
Encoders are also able to tune the quantization tables for each direction
individually.
I do think we could improve the prediction process, though. Currently,
predict() doesn't take the top-right pixel into account. Taking it into
account would benefit not just Bayer, but all FFv1 coding in general.
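
For context, a minimal Python sketch of the median predictor that FFv1 uses,
which reads only the left, top and top-left neighbours. The helper name
median3 and the argument names are hypothetical. No top-right variant is
shown, since the thread does not say how that sample would be used.

```python
def median3(a, b, c):
    """Median of three integers."""
    return sorted((a, b, c))[1]


def predict(left, top, top_left):
    """Median of left, top and the gradient left + top - top_left."""
    return median3(left, top, left + top - top_left)


if __name__ == "__main__":
    # Vertical edge: the column to the left is dark, the row above is bright.
    print(predict(10, 50, 50))
    # Horizontal edge: the row above is dark, the left neighbour is bright.
    print(predict(50, 10, 50))
```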