Re: [FFmpeg-devel] [PATCH v2] avfilter/vf_lut3d: add x86-optimized tetrahedral interpolation

2021-10-10 Thread Mark Reid
On Sun, Oct 10, 2021 at 10:29 PM Xiang, Haihao wrote:

> On Sat, 2021-10-09 at 15:24 -0700, Mark Reid wrote:
> > On Sat, Oct 9, 2021 at 4:11 AM Paul B Mahol  wrote:
> >
> > > will test and apply shortly, why 8bit is not covered?
> > >
> >
> > Thanks for taking the time to test. I didn't do 8bit yet because I was
> > trying to limit my testing matrix, and these happen to be my main use
> > cases. I hope to try and incrementally add other pixel formats and
> > interpolation methods.
>
>
> This patch broke the linker with the following configuration:
>
>   configuration: --prefix=/usr --libdir=/usr/lib/x86_64-linux-gnu
> --shlibdir=/usr/lib/x86_64-linux-gnu --enable-opencl --enable-libglslang
> --enable-vulkan --enable-libdrm --enable-shared --enable-pic --enable-gpl
> --disable-stripping --disable-optimizations --disable-static --disable-mmx
> --disable-ssse3 --enable-debug=3 --enable-libmfx --samples=fate-suite/
> --enable-opengl
>
> LD  ffprobe_g
> libavfilter/libavfilter.so: undefined reference to `ff_interp_tetrahedral_pf32_sse2'
> libavfilter/libavfilter.so: undefined reference to `ff_interp_tetrahedral_p16_sse2'
> collect2: error: ld returned 1 exit status
> Makefile:125: recipe for target 'ffprobe_g' failed
> make: *** [ffprobe_g] Error 1
>
> Thanks
> Haihao
>
>
Sorry about that; it appears to be caused by the --disable-optimizations flag.
The SSE2 stub functions are still getting built, but the asm is not. I have
it fixed on my end and will send a patch.
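
As an illustration of this class of link failure, here is a minimal sketch in C. It assumes a hypothetical build macro, HAVE_EXTERNAL_ASM, and illustrative function names (interp_c, interp_sse2, select_interp); none of these are FFmpeg's actual macros or symbols, and this is not necessarily the fix that was sent. The idea is that a reference to an assembly symbol should only exist in the C code when the object defining it is really built. A plain C `if` on a constant may still leave the reference in place in an unoptimized build, whereas a preprocessor guard removes it regardless of optimization level.

```c
/* HAVE_EXTERNAL_ASM is a hypothetical stand-in for a configure-generated
 * macro; it is 0 when the assembly objects are not built. */
#ifndef HAVE_EXTERNAL_ASM
#define HAVE_EXTERNAL_ASM 0
#endif

typedef float (*interp_fn)(float a, float b, float t);

/* Portable fallback that is always compiled. */
static float interp_c(float a, float b, float t)
{
    return a + (b - a) * t;
}

#if HAVE_EXTERNAL_ASM
/* Declared only when the assembly object that defines it is linked in. */
float interp_sse2(float a, float b, float t);
#endif

/* Pick an implementation without ever naming a symbol that was not built. */
static interp_fn select_interp(void)
{
#if HAVE_EXTERNAL_ASM
    return interp_sse2;
#else
    return interp_c;
#endif
}
```

With HAVE_EXTERNAL_ASM left at 0, select_interp() returns interp_c and the object file contains no reference to interp_sse2, so the link succeeds at any optimization level.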



___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".


Re: [FFmpeg-devel] [PATCH v2] avfilter/vf_lut3d: add x86-optimized tetrahedral interpolation

2021-10-10 Thread Xiang, Haihao
On Sat, 2021-10-09 at 15:24 -0700, Mark Reid wrote:
> On Sat, Oct 9, 2021 at 4:11 AM Paul B Mahol  wrote:
> 
> > will test and apply shortly, why 8bit is not covered?
> > 
> 
> Thanks for taking the time to test. I didn't do 8bit yet because I was
> trying to limit my testing matrix, and these happen to be my main use
> cases. I hope to try and incrementally add other pixel formats and
> interpolation methods.


This patch broke the linker with the following configuration:

  configuration: --prefix=/usr --libdir=/usr/lib/x86_64-linux-gnu
--shlibdir=/usr/lib/x86_64-linux-gnu --enable-opencl --enable-libglslang
--enable-vulkan --enable-libdrm --enable-shared --enable-pic --enable-gpl
--disable-stripping --disable-optimizations --disable-static --disable-mmx
--disable-ssse3 --enable-debug=3 --enable-libmfx --samples=fate-suite/
--enable-opengl

LD  ffprobe_g
libavfilter/libavfilter.so: undefined reference to `ff_interp_tetrahedral_pf32_sse2'
libavfilter/libavfilter.so: undefined reference to `ff_interp_tetrahedral_p16_sse2'
collect2: error: ld returned 1 exit status
Makefile:125: recipe for target 'ffprobe_g' failed
make: *** [ffprobe_g] Error 1

Thanks
Haihao



Re: [FFmpeg-devel] [PATCH v2] avfilter/vf_lut3d: add x86-optimized tetrahedral interpolation

2021-10-09 Thread Mark Reid
On Sat, Oct 9, 2021 at 4:11 AM Paul B Mahol  wrote:

> will test and apply shortly, why 8bit is not covered?
>

Thanks for taking the time to test. I didn't do 8bit yet because I was
trying to limit my testing matrix, and these happen to be my main use
cases. I hope to try and incrementally add other pixel formats and
interpolation methods.


Re: [FFmpeg-devel] [PATCH v2] avfilter/vf_lut3d: add x86-optimized tetrahedral interpolation

2021-10-09 Thread Paul B Mahol
Will test and apply shortly. Why is 8-bit not covered?


[FFmpeg-devel] [PATCH v2] avfilter/vf_lut3d: add x86-optimized tetrahedral interpolation

2021-10-05 Thread mindmark
From: Mark Reid 

I spotted a pattern that I hadn't seen before, and it makes the
implementation faster.
The bit-shifting table I was using before is no longer needed, so I was able
to remove quite a few lines.
I also added use of FMA in the AVX2 version.
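
For readers unfamiliar with the method being optimized, scalar tetrahedral interpolation can be sketched as below. This is a generic textbook-style formulation that sorts the three fractional offsets, and it is not the patch's SIMD code or the pattern mentioned above. The names rgb_t, fill_identity and tetra_lookup are hypothetical; the table layout lut[(r*n + g)*n + b] is an assumption chosen for the sketch, and inputs are assumed to lie in [0, 1] with n >= 2.

```c
typedef struct { float r, g, b; } rgb_t;

/* Fill an n*n*n table so that every grid point maps to its own colour. */
static void fill_identity(rgb_t *lut, int n)
{
    for (int r = 0; r < n; r++)
        for (int g = 0; g < n; g++)
            for (int b = 0; b < n; b++) {
                rgb_t *e = &lut[(r * n + g) * n + b];
                e->r = (float)r / (n - 1);
                e->g = (float)g / (n - 1);
                e->b = (float)b / (n - 1);
            }
}

/* Tetrahedral lookup: blend the four corners of the tetrahedron that
 * contains the input point inside its LUT cell. */
static rgb_t tetra_lookup(const rgb_t *lut, int n, rgb_t in)
{
    const float pos[3]    = { in.r * (n - 1), in.g * (n - 1), in.b * (n - 1) };
    const int   stride[3] = { n * n, n, 1 };
    int   order[3] = { 0, 1, 2 };
    float frac[3], w[4];
    int   idx[4];
    rgb_t out = { 0.0f, 0.0f, 0.0f };

    /* Locate the cell origin and the fractional offset on each axis. */
    idx[0] = 0;
    for (int i = 0; i < 3; i++) {
        int cell = (int)pos[i];
        if (cell > n - 2) cell = n - 2;   /* keep cell + 1 inside the table */
        if (cell < 0)     cell = 0;
        frac[i] = pos[i] - (float)cell;
        idx[0] += cell * stride[i];
    }

    /* Sort the axes by descending fractional offset (3-element bubble sort). */
    for (int pass = 0; pass < 2; pass++)
        for (int i = 0; i < 2 - pass; i++)
            if (frac[order[i]] < frac[order[i + 1]]) {
                int tmp = order[i];
                order[i] = order[i + 1];
                order[i + 1] = tmp;
            }

    /* Walk from the cell origin to the opposite corner, largest axis first. */
    for (int i = 0; i < 3; i++)
        idx[i + 1] = idx[i] + stride[order[i]];

    /* Barycentric weights of the four visited corners; they sum to 1. */
    w[0] = 1.0f - frac[order[0]];
    w[1] = frac[order[0]] - frac[order[1]];
    w[2] = frac[order[1]] - frac[order[2]];
    w[3] = frac[order[2]];

    for (int i = 0; i < 4; i++) {
        out.r += w[i] * lut[idx[i]].r;
        out.g += w[i] * lut[idx[i]].g;
        out.b += w[i] * lut[idx[i]].b;
    }
    return out;
}
```

Because the weights are barycentric, an identity table reproduces its input, which makes a convenient sanity check for any optimized variant.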

f32 1920x1080 1 thread with prelut
c impl
1434012700 UNITS in lut3d->interp,       1 runs,      0 skips
1434035335 UNITS in lut3d->interp,       2 runs,      0 skips
1423615347 UNITS in lut3d->interp,       4 runs,      0 skips
1426268863 UNITS in lut3d->interp,       8 runs,      0 skips

sse2
905484420 UNITS in lut3d->interp,       1 runs,      0 skips
905659010 UNITS in lut3d->interp,       2 runs,      0 skips
915167140 UNITS in lut3d->interp,       4 runs,      0 skips
915834222 UNITS in lut3d->interp,       8 runs,      0 skips

avx
574794860 UNITS in lut3d->interp,       1 runs,      0 skips
581035090 UNITS in lut3d->interp,       2 runs,      0 skips
584116720 UNITS in lut3d->interp,       4 runs,      0 skips
581460290 UNITS in lut3d->interp,       8 runs,      0 skips

avx2
301698880 UNITS in lut3d->interp,       1 runs,      0 skips
301982880 UNITS in lut3d->interp,       2 runs,      0 skips
306962430 UNITS in lut3d->interp,       4 runs,      0 skips
305472025 UNITS in lut3d->interp,       8 runs,      0 skips

gbrap16 1920x1080 1 thread with prelut
c impl
1480894840 UNITS in lut3d->interp,       1 runs,      0 skips
1502922990 UNITS in lut3d->interp,       2 runs,      0 skips
1496114307 UNITS in lut3d->interp,       4 runs,      0 skips
1492554551 UNITS in lut3d->interp,       8 runs,      0 skips

sse2
980777180 UNITS in lut3d->interp,       1 runs,      0 skips
986121520 UNITS in lut3d->interp,       2 runs,      0 skips
986489840 UNITS in lut3d->interp,       4 runs,      0 skips
998832248 UNITS in lut3d->interp,       8 runs,      0 skips

avx
622212360 UNITS in lut3d->interp,       1 runs,      0 skips
622981160 UNITS in lut3d->interp,       2 runs,      0 skips
645396315 UNITS in lut3d->interp,       4 runs,      0 skips
641057075 UNITS in lut3d->interp,       8 runs,      0 skips

avx2
321336400 UNITS in lut3d->interp,       1 runs,      0 skips
321268920 UNITS in lut3d->interp,       2 runs,      0 skips
323459895 UNITS in lut3d->interp,       4 runs,      0 skips
324949967 UNITS in lut3d->interp,       8 runs,      0 skips
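
To read the tables above as speedups, the 8-run rows can be divided against the C reference. The helper below is only a sketch for that arithmetic; speedup, f32_units and gbrap16_units are hypothetical names, and the values are copied from the 8-run rows of each table.

```c
/* Ratio of reference cost to optimized cost; values above 1.0 mean faster. */
static double speedup(double reference_units, double optimized_units)
{
    return reference_units / optimized_units;
}

/* 8-run rows from the tables above, in the order: c impl, sse2, avx, avx2. */
static const double f32_units[4] = {
    1426268863.0, 915834222.0, 581460290.0, 305472025.0
};
static const double gbrap16_units[4] = {
    1492554551.0, 998832248.0, 641057075.0, 324949967.0
};
```

For example, speedup(f32_units[0], f32_units[3]) comes out at roughly 4.7, and speedup(gbrap16_units[0], gbrap16_units[3]) at roughly 4.6.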

---
 libavfilter/lut3d.h             |  83 
 libavfilter/vf_lut3d.c          |  61 +--
 libavfilter/x86/Makefile        |   2 +
 libavfilter/x86/vf_lut3d.asm    | 662 
 libavfilter/x86/vf_lut3d_init.c |  88 +
 5 files changed, 840 insertions(+), 56 deletions(-)
 create mode 100644 libavfilter/lut3d.h
 create mode 100644 libavfilter/x86/vf_lut3d.asm
 create mode 100644 libavfilter/x86/vf_lut3d_init.c

diff --git a/libavfilter/lut3d.h b/libavfilter/lut3d.h
new file mode 100644
index 00..ded2a036a5
--- /dev/null
+++ b/libavfilter/lut3d.h
@@ -0,0 +1,83 @@
+/*
+ * Copyright (c) 2013 Clément Bœsch
+ * Copyright (c) 2018 Paul B Mahol
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+#ifndef AVFILTER_LUT3D_H
+#define AVFILTER_LUT3D_H
+
+#include "libavutil/pixdesc.h"
+#include "framesync.h"
+#include "avfilter.h"
+
+enum interp_mode {
+    INTERPOLATE_NEAREST,
+    INTERPOLATE_TRILINEAR,
+    INTERPOLATE_TETRAHEDRAL,
+    INTERPOLATE_PYRAMID,
+    INTERPOLATE_PRISM,
+    NB_INTERP_MODE
+};
+
+struct rgbvec {
+    float r, g, b;
+};
+
+/* 3D LUT don't often go up to level 32, but it is common to have a Hald CLUT
+ * of 512x512 (64x64x64) */
+#define MAX_LEVEL 256
+#define PRELUT_SIZE 65536
+
+typedef struct Lut3DPreLut {
+    int size;
+    float min[3];
+    float max[3];
+    float scale[3];
+    float* lut[3];
+} Lut3DPreLut;
+
+typedef struct LUT3DContext {
+    const AVClass *class;
+    struct rgbvec *lut;
+    int lutsize;
+    int lutsize2;
+    struct rgbvec scale;
+    int interpolation;  ///
+;*
+;* This file is part of FFmpeg.
+;*
+;* FFmpeg is free software; you can redistribute it and/or
+;* modify it under the terms of the GNU Lesser General Public
+;* License as published by the Free Software Foundation; either
+;* version 2.1 of the License, or (at your option) any later version.
+;*
+;* FFmpeg is distributed in t