On Mon, Sep 28, 2020 at 10:38 AM Michael Niedermayer <mich...@niedermayer.cc> wrote:
> On Sat, Sep 26, 2020 at 10:01:30PM -0700, Mark Reid wrote: > > On Mon, Sep 14, 2020 at 6:31 PM Mark Reid <mindm...@gmail.com> wrote: > > > > > > > > > > > On Mon, Sep 14, 2020 at 2:44 PM Michael Niedermayer > <mich...@niedermayer.cc> > > > wrote: > > > > > >> On Sun, Sep 13, 2020 at 04:04:42PM -0700, Mark Reid wrote: > > >> > On Sun, Sep 13, 2020 at 8:55 AM Michael Niedermayer > > >> <mich...@niedermayer.cc> > > >> > wrote: > > >> > > > >> > > On Sat, Sep 12, 2020 at 02:07:14AM -0700, mindm...@gmail.com > wrote: > > >> > > > From: Mark Reid <mindm...@gmail.com> > > >> > > > > > >> > > > --- > > >> > > > libswscale/input.c | 12 +++++------- > > >> > > > tests/ref/fate/filter-pixfmts-scale | 8 ++++---- > > >> > > > 2 files changed, 9 insertions(+), 11 deletions(-) > > >> > > > > >> > > Can you provide some tests that show that this is better ? > > >> > > Iam asking that because some of the numbers in some of the code > > >> > > (i dont remember which) where tuned to give more accurate overall > > >> results > > >> > > > > >> > > an example for tests would be converting from A->B->A then > compare to > > >> the > > >> > > orig > > >> > > > > >> > > thx > > >> > > > > >> > > > > >> > Hopefully i can explain this clearly! > > >> > > > >> > It's easier to see the error if you run a black image through the > old > > >> > conversion. > > >> > zero values don't get mapped to zero. (attached sample image) > > >> > > > >> > ffmpeg -i 4x4_zero.exr -pix_fmt rgb48le -f rawvideo > 4x4_zero.rawvideo > > >> > The image should be rgb 0, 0, 0 everywhere but instead it's 353, > 0, 407 > > >> > > > >> > > > >> > I think this is a error in fixed point rounding, the issue is > basically > > >> > boils down to > > >> > > > >> > 128 << 8 != 257 << 7 > > >> > and > > >> > 16 << 8 != 33 << 7 > > >> > > > >> > https://en.wikipedia.org/wiki/YUV#Studio_swing_for_BT.601 > > >> > the 8 bit rgb to yuv formula is > > >> > > > >> > Y = ry*r + gy*g + by*b + 16 > > >> > U = ru*r + gu*g + bu*b + 128 > > >> > V = rv*r + gv*g + bv*b + 128 > > >> > > > >> > I think the studio swing offsets at the end are calculated wrong in > the > > >> old > > >> > code. > > >> > (257 << (RGB2YUV_SHIFT + bpc - 9))) > > >> > 257 is correct for 8 bit rounding but not for 16-bit. > > >> > > > >> > the 257 i believe is from (128 << 1) + 1 > > >> > the +1 is for rounding > > >> > > > >> > for rounding 16-bit (128 << 9) + 1 = 0x10001 > > >> > > > >> > therefore I think the correct rounding any bit depth with the old > > >> formula > > >> > would be (untested) > > >> > (((128 << (bpc - 7)) + 1) << (RGB2YUV_SHIFT-1) ) > > >> > > > >> > I just simplified it to > > >> > (0x10001 << (RGB2YUV_SHIFT - 1)) > > >> > > > >> > The rgb48ToUV and rgb48ToY funcs in input.c use the formula I'm > using. > > >> > > >> You quite possibly are correct, can you test that this actually works > > >> out. The test sample only covers 1 color (black) > > >> a testsample covering a wide range of the color cube would be more > > >> convincing that this change is needed and sufficient to fix this. > > >> > > >> thx > > >> > > > > > > I wrote a small python script to compare the raw gbrpf32le images if > that > > > works? I attached it and also a more colorful test pattern. > > > > > > it runs these two commands and compares the 2 raw float images > > > ffmpeg -y -i test_pattern.exr -f rawvideo original.gbrpf32le > > > ffmpeg -y -i test_pattern.exr -vf > > > format=pix_fmts=rgb48le,format=pix_fmts=gbrpf32le -f rawvideo > > > converted.gbrpf32le > > > > > > python gbrpf32le_diff.py test_pattern.exr > > > > > > without patch: > > > avg error: 237.445495855 > > > min error: 0.0 > > > max error: 468.399102688 > > > > > > with patch: > > > avg error: 15.9312244829 > > > min error: 0.0 > > > max error: 69.467689991 > > > > > > > > > These are floating points scaled to 16-bit values. > > > Even with my patch, I'm kinda disturbed how much error there is. > > > > > > > ping > > I re-wrote the python script as a c swscale test, if that helps > > replicate my results. attached is patch for that. > > it generates an image of random float values and does the > > conversion/comparison . > > > > before patch: > > gbrpf32le -> yuva444p16le -> gbrpf32le > > avg diff: 0.003852 > > min diff: 0.000000 > > max diff: 0.006638 > > > > after patch: > > gbrpf32le -> yuva444p16le -> gbrpf32le > > avg diff: 0.000125 > > min diff: 0.000000 > > max diff: 0.000501 > > is it better for all middle formats ? > Iam asking as it seems this should be rather easy to test with your code > good idea, yes it looks better for other intermediate formats too, see my results below. I'll submit a new version of patch that does this and with the warnings fixed. gbrpf32le -> yuv444p16le -> gbrpf32le avg diff: 0.003852 avg diff: 0.000125 min diff: 0.000000 min diff: 0.000000 max diff: 0.006638 max diff: 0.000501 gbrpf32le -> yuv444p -> gbrpf32le avg diff: 0.004316 avg diff: 0.001804 min diff: 0.000000 min diff: 0.000000 max diff: 0.012704 max diff: 0.006399 gbrpf32le -> yuv444p9le -> gbrpf32le avg diff: 0.004053 avg diff: 0.000906 min diff: 0.000001 min diff: 0.000000 max diff: 0.009402 max diff: 0.003313 gbrpf32le -> yuv444p10le -> gbrpf32le avg diff: 0.003960 avg diff: 0.000467 min diff: 0.000000 min diff: 0.000000 max diff: 0.008123 max diff: 0.001912 gbrpf32le -> yuv444p12le -> gbrpf32le avg diff: 0.003878 avg diff: 0.000166 min diff: 0.000000 min diff: 0.000000 max diff: 0.007011 max diff: 0.000802 gbrpf32le -> yuv444p14le -> gbrpf32le avg diff: 0.003868 avg diff: 0.000127 min diff: 0.000000 min diff: 0.000000 max diff: 0.006729 max diff: 0.000524 gbrpf32le -> rgb24 -> gbrpf32le avg diff: 0.004122 avg diff: 0.001011 min diff: 0.000000 min diff: 0.000000 max diff: 0.008975 max diff: 0.004229 gbrpf32le -> bgr24 -> gbrpf32le avg diff: 0.004122 avg diff: 0.001011 min diff: 0.000000 min diff: 0.000000 max diff: 0.008975 max diff: 0.004229 gbrpf32le -> rgba -> gbrpf32le avg diff: 0.004122 avg diff: 0.001011 min diff: 0.000000 min diff: 0.000000 max diff: 0.008975 max diff: 0.004229 gbrpf32le -> bgra -> gbrpf32le avg diff: 0.004122 avg diff: 0.001011 min diff: 0.000000 min diff: 0.000000 max diff: 0.008975 max diff: 0.004229 gbrpf32le -> argb -> gbrpf32le avg diff: 0.004122 avg diff: 0.001011 min diff: 0.000000 min diff: 0.000000 max diff: 0.008975 max diff: 0.004229 gbrpf32le -> abgr -> gbrpf32le avg diff: 0.004122 avg diff: 0.001011 min diff: 0.000000 min diff: 0.000000 max diff: 0.008975 max diff: 0.004229 gbrpf32le -> 0rgb -> gbrpf32le avg diff: 0.004122 avg diff: 0.001011 min diff: 0.000000 min diff: 0.000000 max diff: 0.008975 max diff: 0.004229 gbrpf32le -> 0bgr -> gbrpf32le avg diff: 0.004122 avg diff: 0.001011 min diff: 0.000000 min diff: 0.000000 max diff: 0.008975 max diff: 0.004229 gbrpf32le -> rgb0 -> gbrpf32le avg diff: 0.004122 avg diff: 0.001011 min diff: 0.000000 min diff: 0.000000 max diff: 0.008975 max diff: 0.004229 gbrpf32le -> bgr0 -> gbrpf32le avg diff: 0.004122 avg diff: 0.001011 min diff: 0.000000 min diff: 0.000000 max diff: 0.008975 max diff: 0.004229 gbrpf32le -> rgb48le -> gbrpf32le avg diff: 0.003851 avg diff: 0.000249 min diff: 0.000000 min diff: 0.000000 max diff: 0.007076 max diff: 0.000990 gbrpf32le -> bgr48le -> gbrpf32le avg diff: 0.003851 avg diff: 0.000249 min diff: 0.000000 min diff: 0.000000 max diff: 0.007076 max diff: 0.000990 gbrpf32le -> rgba64le -> gbrpf32le avg diff: 0.003851 avg diff: 0.000249 min diff: 0.000000 min diff: 0.000000 max diff: 0.007076 max diff: 0.000990 gbrpf32le -> bgra64le -> gbrpf32le avg diff: 0.003851 avg diff: 0.000249 min diff: 0.000000 min diff: 0.000000 max diff: 0.007076 max diff: 0.000990 gbrpf32le -> gbrp -> gbrpf32le avg diff: 0.004122 avg diff: 0.001011 min diff: 0.000000 min diff: 0.000000 max diff: 0.008975 max diff: 0.004229 gbrpf32le -> gbrap -> gbrpf32le avg diff: 0.004122 avg diff: 0.001011 min diff: 0.000000 min diff: 0.000000 max diff: 0.008975 max diff: 0.004229 gbrpf32le -> gbrp9le -> gbrpf32le avg diff: 0.007737 avg diff: 0.003917 min diff: 0.000000 min diff: 0.000000 max diff: 0.014009 max diff: 0.007870 gbrpf32le -> gbrp10le -> gbrpf32le avg diff: 0.007662 avg diff: 0.003841 min diff: 0.000000 min diff: 0.000000 max diff: 0.013605 max diff: 0.007456 gbrpf32le -> gbrap10le -> gbrpf32le avg diff: 0.007662 avg diff: 0.003841 min diff: 0.000000 min diff: 0.000000 max diff: 0.013605 max diff: 0.007456 gbrpf32le -> gbrp12le -> gbrpf32le avg diff: 0.007622 avg diff: 0.003796 min diff: 0.000000 min diff: 0.000000 max diff: 0.013335 max diff: 0.007140 gbrpf32le -> gbrap12le -> gbrpf32le avg diff: 0.007622 avg diff: 0.003796 min diff: 0.000000 min diff: 0.000000 max diff: 0.013335 max diff: 0.007140 gbrpf32le -> gbrp14le -> gbrpf32le avg diff: 0.007620 avg diff: 0.003792 min diff: 0.000000 min diff: 0.000000 max diff: 0.013232 max diff: 0.007034 gbrpf32le -> gbrp16le -> gbrpf32le avg diff: 0.007680 avg diff: 0.003853 min diff: 0.000000 min diff: 0.000000 max diff: 0.013275 max diff: 0.007098 gbrpf32le -> gbrap16le -> gbrpf32le avg diff: 0.007680 avg diff: 0.003853 min diff: 0.000000 min diff: 0.000000 max diff: 0.013275 max diff: 0.007098 > > But from what i see above, obviously this is an improvment and should be > applied > > thx > > [...] > > -- > Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB > > Awnsering whenever a program halts or runs forever is > On a turing machine, in general impossible (turings halting problem). > On any real computer, always possible as a real computer has a finite > number > of states N, and will either halt in less than N cycles or never halt. > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe". _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".