Hi,

On 20 May 2024 03:42:03 GMT+03:00, Stone Chen <chen.stonec...@gmail.com> wrote:
>Adds checkasm for DMVR SAD AVX2 implementation.
>
>Benchmarks ( AMD 7940HS )
>vvc_sad_8x8_c: 70.0
>vvc_sad_8x8_avx2: 10.0
>vvc_sad_16x16_c: 280.0
>vvc_sad_16x16_avx2: 20.0
>vvc_sad_32x32_c: 1020.0
>vvc_sad_32x32_avx2: 70.0
>vvc_sad_64x64_c: 3560.0
>vvc_sad_64x64_avx2: 270.0
>vvc_sad_128x128_c: 13760.0
>vvc_sad_128x128_avx2: 1070.0
>---
> tests/checkasm/vvc_mc.c | 38 ++++++++++++++++++++++++++++++++++++++
> 1 file changed, 38 insertions(+)

VVC benchmarks have increased checkasm runtime by at least an order of magnitude. It has become so prohibitively slow that I could not even get to the end.

This is not an acceptable situation, and it impedes non-VVC assembler work. Please fix this before you add any new VVC tests.

In the meantime: -1 / Nack all VVC checkasm on my behalf.

>diff --git a/tests/checkasm/vvc_mc.c b/tests/checkasm/vvc_mc.c
>index 97f57cb401..e251400bfc 100644
>--- a/tests/checkasm/vvc_mc.c
>+++ b/tests/checkasm/vvc_mc.c
>@@ -322,8 +322,46 @@ static void check_avg(void)
>     report("avg");
> }
>
>+static void check_vvc_sad(void)
>+{
>+    const int bit_depth = 10;
>+    VVCDSPContext c;
>+    LOCAL_ALIGNED_32(uint16_t, src0, [MAX_CTU_SIZE * MAX_CTU_SIZE * 4]);
>+    LOCAL_ALIGNED_32(uint16_t, src1, [MAX_CTU_SIZE * MAX_CTU_SIZE * 4]);
>+    declare_func(int, const int16_t *src0, const int16_t *src1, int dx, int dy, int block_w, int block_h);
>+
>+    ff_vvc_dsp_init(&c, bit_depth);
>+    memset(src0, 0, MAX_CTU_SIZE * MAX_CTU_SIZE * 4);
>+    memset(src1, 0, MAX_CTU_SIZE * MAX_CTU_SIZE * 4);
>+
>+    randomize_pixels(src0, src1, MAX_CTU_SIZE * MAX_CTU_SIZE * 2);
>+    for (int h = 8; h <= MAX_CTU_SIZE; h *= 2) {
>+        for (int w = 8; w <= MAX_CTU_SIZE; w *= 2) {
>+            for(int offy = 0; offy <= 4; offy++) {
>+                for(int offx = 0; offx <= 4; offx++) {
>+                    if(check_func(c.inter.sad, "vvc_sad_%dx%d", w, h)) {
>+                        int result0;
>+                        int result1;
>+
>+                        result0 = call_ref(src0 + PIXEL_STRIDE * 2 + 2, src1 + PIXEL_STRIDE * 2 + 2, offx, offy, w, h);
>+                        result1 = call_new(src0 + PIXEL_STRIDE * 2 + 2, src1 + PIXEL_STRIDE * 2 + 2, offx, offy, w, h);
>+
>+                        if (result1 != result0)
>+                            fail();
>+                        if(w == h && offx == 0 && offy == 0)
>+                            bench_new(src0 + PIXEL_STRIDE * 2 + 2, src1 + PIXEL_STRIDE * 2 + 2, offx, offy, w, h);
>+                    }
>+                }
>+            }
>+        }
>+    }
>+
>+    report("check_vvc_sad");
>+}
>+
> void checkasm_check_vvc_mc(void)
> {
>+    check_vvc_sad();
>     check_put_vvc_luma();
>     check_put_vvc_luma_uni();
>     check_put_vvc_chroma();