Hi,

On Wed, Nov 30, 2016 at 7:10 AM, James Darnley <jdarn...@obe.tv> wrote:
> On 2016-11-29 21:09, Carl Eugen Hoyos wrote:
> > 2016-11-29 17:14 GMT+01:00 James Darnley <jdarn...@obe.tv>:
> >> On 2016-11-29 15:30, Carl Eugen Hoyos wrote:
> >>> 2016-11-29 12:52 GMT+01:00 James Darnley <jdarn...@obe.tv>:
> >>>> sse2:
> >>>> complex: 4.13x faster (1514 vs. 367 cycles)
> >>>> simple: 4.38x faster (1836 vs. 419 cycles)
> >>>>
> >>>> avx:
> >>>> complex: 1.07x faster (260 vs. 244 cycles)
> >>>> simple: 1.03x faster (284 vs. 274 cycles)
> >>>
> >>> What are you comparing?
> >
> >> The AVX comparison is it versus SSE2.
> >
> > This wasn't obvious to me.
>
> I've made it more verbose but I'm not sure whether it is any better.
> Care to give your opinion Carl?
>
> > Nehalem:
> >  - sse2:
> >    - complex: 4.13x faster (1514 vs. 367 cycles)
> >    - simple: 4.38x faster (1836 vs. 419 cycles)
> >
> > Haswell:
> >  - sse2:
> >    - complex: 3.61x faster ( 936 vs. 260 cycles)
> >    - simple: 3.97x faster (1126 vs. 284 cycles)
> >  - avx (versus sse2):
> >    - complex: 1.07x faster (260 vs. 244 cycles)
> >    - simple: 1.03x faster (284 vs. 274 cycles)
>
> I included the sse2 results for the Haswell to show that the avx is
> (slightly) better.

Ah! Now it makes sense. I had no idea why your SSE2 results changed from
367 (SSE2 vs. C) to 260 cycles (AVX vs. SSE2).

Ronald
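
For reference, each "Nx faster" figure quoted above is the baseline cycle count divided by the optimized cycle count. Here is a minimal sketch of that arithmetic. It assumes the cycle counts are the rounded values printed in the mail and that the sse2 rows use the C version as baseline (as the "SSE2 vs. C" remark suggests). The helper name `speedup` is hypothetical and not part of any FFmpeg tool.

```python
def speedup(baseline_cycles, optimized_cycles):
    """Return how many times faster the optimized version is than the baseline."""
    if optimized_cycles <= 0:
        raise ValueError("optimized_cycles must be positive")
    return baseline_cycles / optimized_cycles


# Cycle counts copied from the mail: label -> (baseline, optimized).
# The baseline for the sse2 rows is assumed to be the C version.
results = {
    "Nehalem sse2 complex": (1514, 367),
    "Nehalem sse2 simple": (1836, 419),
    "Haswell sse2 complex": (936, 260),
    "Haswell sse2 simple": (1126, 284),
    "Haswell avx complex (versus sse2)": (260, 244),
    "Haswell avx simple (versus sse2)": (284, 274),
}

for label, (base, opt) in results.items():
    print(f"{label}: {speedup(base, opt):.2f}x faster ({base} vs. {opt} cycles)")
```

Recomputing from the rounded counts can differ from the quoted figures in the last digit (936 / 260 is exactly 3.60, where the mail reports 3.61), presumably because the original ratios were taken from unrounded measurements.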