On 18/03/2026 01:11, Michael Niedermayer via ffmpeg-devel wrote:
Hi Lynne
On Tue, Mar 17, 2026 at 07:12:16PM +0100, Lynne via ffmpeg-devel wrote:
On 17/03/2026 17:26, Michael Niedermayer via ffmpeg-devel wrote:
[...]
If you attempt to store a green (predicted or not) plane that is sqrt(2)
subsampled from all luma samples that is not 2 planes
Look at this in a monospaced font:
G00 B G32 B G23 B G14 B
R G01 R G33 R G24 R G10
G11 B G02 B G34 B G20 B
R G12 R G03 R G30 R G21
G22 B G13 B G04 B G31 B
All the green samples are addressed in an orthogonal raster. Yes, it's rotated
by 45° and yes, it has a boundary in the middle, but it can be represented as
1 plane.
This of course assumes that the green samples are not subject to filtering that
makes them different from each other (which the current proposal would do)
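
To make the labelling in the diagram above concrete, here is a minimal sketch of the index mapping it implies. The function names are hypothetical and the formulas are only read off the 5x8 example (greens at sites where x + y is even, rows wrapping modulo the height, columns modulo the width); this is not taken from any existing implementation.

```python
# Sketch only: helper names are hypothetical, geometry is the 5x8 example.
H, W = 5, 8  # rows and columns of the example CFA tile


def plane_to_cfa(i, j, h=H, w=W):
    """Map green-plane coordinate (i, j) to CFA coordinate (y, x)."""
    y = (i + j) % h          # walking along j moves one row down...
    x = (y - 2 * i) % w      # ...and one column right (with wrap-around)
    return y, x


def cfa_to_plane(y, x, h=H, w=W):
    """Inverse mapping; (y, x) must be a green site, i.e. x + y is even."""
    i = ((y - x) % w) // 2
    j = (y - i) % h
    return i, j


def render(h=H, w=W):
    """Rebuild the ASCII layout, labelling every green site as G<i><j>."""
    rows = []
    for y in range(h):
        cells = []
        for x in range(w):
            if (x + y) % 2 == 0:
                i, j = cfa_to_plane_label(y, x, h, w)
                cells.append(f"G{i}{j}")
            else:
                cells.append("B" if y % 2 == 0 else "R")
        rows.append(" ".join(cells))
    return rows


def cfa_to_plane_label(y, x, h, w):
    # Thin wrapper kept separate so render() reads top to bottom.
    return cfa_to_plane(y, x, h, w)
```

Every green site maps to exactly one (i, j) and back, so the quincunx fits in a single 4x5 plane for this tile. The jump in x where y wraps from the last row to the first is the boundary in the middle.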
Also, a totally different way to encode Bayer CFA is to convert it to YCbCr
4:2:0 in such a way that after decompression and conversion to RGB it is
lossless.
This would store 50% more samples than critically sampled CFA, but these extra
samples would be free parameters that can be freely chosen while maintaining
losslessness, so they could be optimized to minimize the bits per pixel.
Did you see a paper exploring / comparing this?
I did, actually, 10.3390/s22218362. It collated the green quincunx samples
into a 1:2 aspect image, then rotated it to form a 2:1 image, and fed each
plane through a (lossy) JPEG encoder.
This has the downside of requiring much more memory, as decoders would need
to decode the entire green image to undo the rotation.
I would have expected this to be done on a per-slice basis, not a per-image
basis.
Also, if you implemented 10.3390/s22218362, can you provide a link so we can
test it?
This would not change the amount of memory we would need to allocate, we
would simply be buffering the entire slice in memory before inverting it
in the decoder. While it would be faster than doing it for an entire
image at a time, the memory requirement makes this suboptimal.
[...]
When it comes to decorrelating, the proposed filters are asymmetric.
How would symmetric ones perform? And how do adaptive ones perform?
The current implementation does not prohibit adaptive decorrelation. The RCT
search can still be performed to find more optimal coefficients than (1, 1).
While it can be extended with a more expensive search, I think that extending
the decorrelation process should be done in a way in which RGBA coding would
also benefit, rather than tuning it specifically for Bayer. Perhaps that's
something we can investigate in the future, since custom RCT coefficients
are still marked as unstable.
G B G B G B G B
R G R G R G R G
G B G B G B G B
R G R G R G R G
G B G B G B G B
Each blue and red sample has 4 adjacent green samples.
That is, each sample can be decorrelated with an average of 4, 3, 2 or 1
of these samples. That's without considering more distant samples.
(This could be attempted in a local edge-direction-dependent way.)
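
As a rough illustration of the symmetric variant, the sketch below subtracts the mean of whichever adjacent green samples exist (4 in the interior, fewer at the borders) from each red and blue sample. The function names are hypothetical, greens are assumed to sit where x + y is even as in the diagram above, and it assumes the decoder has all green samples available before it reconstructs red and blue.

```python
def green_neighbour_mean(cfa, y, x):
    """Floor of the mean of the in-bounds 4-connected neighbours of (y, x)."""
    h, w = len(cfa), len(cfa[0])
    vals = [cfa[ny][nx]
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
            if 0 <= ny < h and 0 <= nx < w]
    return sum(vals) // len(vals)


def decorrelate(cfa):
    """Replace each red/blue sample by (sample - mean of adjacent greens)."""
    out = [row[:] for row in cfa]
    for y in range(len(cfa)):
        for x in range(len(cfa[0])):
            if (x + y) % 2 == 1:  # red or blue site
                out[y][x] = cfa[y][x] - green_neighbour_mean(cfa, y, x)
    return out


def recorrelate(res):
    """Inverse of decorrelate(); greens are untouched, so the same
    prediction can be recomputed and added back."""
    out = [row[:] for row in res]
    for y in range(len(res)):
        for x in range(len(res[0])):
            if (x + y) % 2 == 1:
                out[y][x] = res[y][x] + green_neighbour_mean(res, y, x)
    return out
```

Because only the green samples feed the prediction and they are stored unchanged, the round trip is exact. An edge-direction-dependent version would pick a subset of the four neighbours instead of always averaging all of them.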
PR22528 uses the left and top green samples for blue and the bottom and right
green samples for red.
Such an asymmetric fixed choice "feels" strange.
The PR is implemented for RGGB16 (R G\n G B), not GBRG16.
The pattern we get is:
R G R G R G R G
G B G B G B G B
R G R G R G R G
G B G B G B G B
R G R G R G R G
G B G B G B G B
The names we use for the samples should have also been a hint.
Also, you effectively use a Haar transform to split the green into 2 green
planes. Why that and not the 5/3 one from J2K? But either feels odd; putting a
wavelet-style transform into FFV1 feels odd.
I really would like to see some numbers that show that this is the best choice.
5/3 is a Wavelet transform that requires 6 samples in both directions,
and would impose a restriction on the slice width and height. Haar can
operate on just a single Bayer quad.
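
For reference, a minimal sketch of a reversible integer Haar step (the S-transform) applied to the two green samples of one quad. This is the generic textbook lifting form with hypothetical function names, and it is not claimed to be bit-exact with what the PR does.

```python
def haar_forward(g0, g1):
    """Split the two greens of a quad into (average, difference)."""
    diff = g0 - g1
    avg = g1 + (diff >> 1)   # equals floor((g0 + g1) / 2)
    return avg, diff


def haar_inverse(avg, diff):
    """Exact inverse of haar_forward()."""
    g1 = avg - (diff >> 1)
    g0 = g1 + diff
    return g0, g1
```

Only the two samples of the quad are touched, so there is no constraint on slice width or height beyond it being a whole number of quads.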
I could have extended the RCT by adding a parameter to bias the split,
however the 2 green samples in each quad are already highly correlated,
since they are adjacent.
Now that I think about it, this transformation moves the sample location
for the resulting Green median to the center of the quad, between the
Red and Blue samples. This means that encoding the Green with the same
range coder context we use for Red and Blue could help compression.