On Fri, Jul 18, 2025 at 3:17 PM Kacper Michajlow <kaspe...@gmail.com> wrote:
>
> On Fri, 18 Jul 2025 at 15:33, Kieran Kunhya via ffmpeg-devel
> <ffmpeg-devel@ffmpeg.org> wrote:
> >
> > On Fri, Jul 18, 2025 at 2:22 PM Kacper Michajlow <kaspe...@gmail.com> wrote:
> > >
> > > On Fri, 18 Jul 2025 at 14:46, Kieran Kunhya via ffmpeg-devel
> > > <ffmpeg-devel@ffmpeg.org> wrote:
> > > >
> > > > On Fri, Jul 18, 2025 at 1:41 PM Kacper Michajlow <kaspe...@gmail.com>
> > > > wrote:
> > > > >
> > > > > On Fri, 18 Jul 2025 at 14:14, Kieran Kunhya via ffmpeg-devel
> > > > > <ffmpeg-devel@ffmpeg.org> wrote:
> > > > > >
> > > > > > > blackdetect8_c:      820.8 ( 1.00x)
> > > > > > > blackdetect8_avx2:   219.2 ( 3.74x)
> > > > > > > blackdetect16_c:     372.8 ( 1.00x)
> > > > > > > blackdetect16_avx2:  201.4 ( 1.85x)
> > > > > > >
> > > > > > > Again, sorry for being pedantic here, but it gives the wrong
> > > > > > > impression especially if you look at this from outside.
> > > > > >
> > > > > > Also misleading as far as I understand because GCC doesn't have
> > > > > > runtime detection like FFmpeg.
> > > > >
> > > > > Speak of... actually GCC does have runtime detection. All you have to
> > > > > do is mark the function with `target_clones` with requested
> > > > > architectures and it will dispatch automatically during runtime the
> > > > > best function to use.
> > > > >
> > > > > See for more information:
> > > > > https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-target_005fclones-function-attribute
> > > >
> > > > It's not as sophisticated as our runtime detection (e.g avx512 vs
> > > > avx512icl which we support).
> > > > Comparing C vs autovectorised code that works only on some platforms
> > > > with forced compilation settings is also unfair.
> > >
> > > In my original message clang build was completely default, no forced
> > > options.
> > >
> > > Handwritten avx512 also works on this specific platform. So comparing
> > > this to autovectorized code (that works on exactly the same platform)
> > > as a baseline makes sense. Furthermore autovectorized code can scale
> > > onto more platforms than handwritten avx512. IMHO comparing things in
> > > the same domain makes more sense.
> > >
> > > The point of my message was that we should have defined a baseline
> > > target, if it is GCC without autovectorization, so be it. But it
> > > should be specified and not implied in the commit description that the
> > > compared result is autovectorized.
> > >
> > > To be honest, I agree with you. It's misleading and unfair, so we
> > > shouldn't make any comparisons. This is not only limited to
> > > autovectorization, scalar code generation also differs. It just
> > > happens to give the biggest difference.
> > >
> > > Context matters, saying "C code performance " is vague. I'm not saying
> > > one way is better than the other, but it doesn't cost anything to
> > > specify it better to avoid miscommunication.
> >
> > It's not fair to compare autovectorised output that's AVX512 that will
> > be called *on any system with AVX512 support including ones that
> > downclock heavily* with AVX512(ICL) checked properly in FFmpeg to run
> > on only non-downlocking systems.
>
> That's the customer/user decision how to compile FFmpeg for best
> performance on their target platform. Also note, you brought up
> avx512, while I agree on the issues with it. I'm commenting on the
> AVX2 patch. I wanted to make general comment about the performance
> metric we share, diving into avx512 issues is kinda a separate topic.

Huh, we should have the best performance for *all* users (all compilers,
all platforms) by default. We have this now for SIMD functions, it's an
open question about autovec for the rest.

Kieran
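
For readers unfamiliar with the `target_clones` attribute mentioned in the quoted text, here is a minimal sketch of compiler-driven runtime dispatch. It assumes GCC (or a recent Clang) on x86-64 Linux, where the attribute is backed by ifunc; on other targets the macro expands to nothing and only the plain C version is built. `count_dark` and `MULTIVERSION` are hypothetical names for illustration, not FFmpeg's blackdetect code.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: target_clones is only enabled on GCC-compatible compilers
 * targeting x86-64 Linux. Elsewhere the function is compiled once. */
#if defined(__GNUC__) && defined(__x86_64__) && defined(__linux__)
#define MULTIVERSION __attribute__((target_clones("default", "avx2")))
#else
#define MULTIVERSION
#endif

/* Hypothetical helper: counts samples at or below a threshold.
 * With target_clones, the compiler emits a baseline clone and an AVX2
 * clone (which it may autovectorize at -O2/-O3), plus a resolver that
 * picks one of them when the program is loaded. */
MULTIVERSION
size_t count_dark(const uint8_t *src, size_t n, uint8_t threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += src[i] <= threshold;
    return count;
}
```

Callers just call `count_dark()`; the choice of clone is made from the CPU's reported features. Whether that selection is fine-grained enough (the avx512 vs avx512icl point raised in the thread) is exactly what is being debated above.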