Re: [FFmpeg-devel] [PATCH] avcodec/vp9: ipred_dr_16x16_16 avx2 implementation

2017-06-12 Thread Ronald S. Bultje
Hi,
On Sat, Jun 10, 2017 at 6:01 AM, Ilia Valiakhmetov wrote:
> Signed-off-by: Ilia Valiakhmetov
> ---
> libavcodec/x86/vp9dsp_init_16bpp.c    |  2 ++
> libavcodec/x86/vp9intrapred_16bpp.asm | 56 ++
> +
> 2 files changed, 58 insertions(+)
>
> diff --git a/liba

[FFmpeg-devel] [PATCH] avcodec/vp9: ipred_dr_16x16_16 avx2 implementation

2017-06-10 Thread Ilia Valiakhmetov
Signed-off-by: Ilia Valiakhmetov
---
libavcodec/x86/vp9dsp_init_16bpp.c    |  2 ++
libavcodec/x86/vp9intrapred_16bpp.asm | 56 +++
2 files changed, 58 insertions(+)
diff --git a/libavcodec/x86/vp9dsp_init_16bpp.c b/libavcodec/x86/vp9dsp_init_16bpp.c
index d1b8fc

Re: [FFmpeg-devel] [PATCH] avcodec/vp9: ipred_dr_16x16_16 avx2 implementation

2017-06-10 Thread gh0st
Yes, you are right, I'll send a patch with this fixed, thanks.
On Sat, Jun 10, 2017 at 5:35 AM, Ivan Kalvachev wrote:
> On 6/9/17, Ilia Valiakhmetov wrote:
> > Signed-off-by: Ilia Valiakhmetov
> > ---
> > libavcodec/x86/vp9dsp_init_16bpp.c    |  2 ++
> > libavcodec/x86/vp9intrapred_16bpp.asm

Re: [FFmpeg-devel] [PATCH] avcodec/vp9: ipred_dr_16x16_16 avx2 implementation

2017-06-09 Thread Ivan Kalvachev
On 6/9/17, Ilia Valiakhmetov wrote:
> Signed-off-by: Ilia Valiakhmetov
> ---
> libavcodec/x86/vp9dsp_init_16bpp.c    |  2 ++
> libavcodec/x86/vp9intrapred_16bpp.asm | 56
> +++
> 2 files changed, 58 insertions(+)
>
> diff --git a/libavcodec/x86/vp9dsp_init_16bpp.

[FFmpeg-devel] [PATCH] avcodec/vp9: ipred_dr_16x16_16 avx2 implementation

2017-06-09 Thread Ilia Valiakhmetov
Signed-off-by: Ilia Valiakhmetov
---
libavcodec/x86/vp9dsp_init_16bpp.c    |  2 ++
libavcodec/x86/vp9intrapred_16bpp.asm | 56 +++
2 files changed, 58 insertions(+)
diff --git a/libavcodec/x86/vp9dsp_init_16bpp.c b/libavcodec/x86/vp9dsp_init_16bpp.c
index d1b8fc

Re: [FFmpeg-devel] [PATCH] avcodec/vp9: ipred_dr_16x16_16 avx2 implementation

2017-06-09 Thread gh0st
> I know unaligned loads are not as slow as they used to be,
> but could m1 be produced by m2 and palignr?
I am not sure, can you clarify your question?
> From the comment I assume you don't use the extra two bytes
> that you get from the load, as you mark them as "*"
> generic undefined values
No, t
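For readers unfamiliar with the instruction being asked about: the sketch below is a plain-C model of what 128-bit PALIGNR computes, to show how a register holding data at a small byte offset can be derived from two already-loaded registers instead of issuing a second unaligned load. It is not code from the patch; the function name `palignr128_model` is hypothetical, and the model only covers byte shifts from 0 to 16. Note also that on AVX2 the 256-bit form of the instruction works on each 128-bit lane separately, so this model does not describe a full 256-bit shift.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Scalar model of 128-bit PALIGNR (hypothetical helper, for illustration).
 * Concatenates hi:lo into a 32-byte value with `lo` in the low half,
 * shifts it right by `shift` bytes, and keeps the low 16 bytes.
 */
static void palignr128_model(uint8_t dst[16], const uint8_t hi[16],
                             const uint8_t lo[16], unsigned shift)
{
    uint8_t cat[32];

    assert(shift <= 16);          /* larger shifts are outside this model */
    memcpy(cat, lo, 16);          /* low half of the concatenation */
    memcpy(cat + 16, hi, 16);     /* high half of the concatenation */
    memcpy(dst, cat + shift, 16); /* byte-wise right shift, keep 16 bytes */
}
```

With two adjacent 16-byte blocks of a buffer passed as `lo` and `hi`, a shift of 2 yields the same bytes as a load starting 2 bytes into the buffer, which is the kind of substitution the question is about.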

Re: [FFmpeg-devel] [PATCH] avcodec/vp9: ipred_dr_16x16_16 avx2 implementation

2017-06-09 Thread Ivan Kalvachev
On 6/8/17, Ilia Valiakhmetov wrote:
> vp9_diag_downright_16x16_12bpp_c: 149.0
> vp9_diag_downright_16x16_12bpp_sse2: 67.8
> vp9_diag_downright_16x16_12bpp_ssse3: 45.6
> vp9_diag_downright_16x16_12bpp_avx: 36.6
> vp9_diag_downright_16x16_12bpp_avx2: 25.5
>
> ~30% faster than avx
>
> Signed-off-by:

[FFmpeg-devel] [PATCH] avcodec/vp9: ipred_dr_16x16_16 avx2 implementation

2017-06-08 Thread Ilia Valiakhmetov
vp9_diag_downright_16x16_12bpp_c: 149.0
vp9_diag_downright_16x16_12bpp_sse2: 67.8
vp9_diag_downright_16x16_12bpp_ssse3: 45.6
vp9_diag_downright_16x16_12bpp_avx: 36.6
vp9_diag_downright_16x16_12bpp_avx2: 25.5
~30% faster than avx
Signed-off-by: Ilia Valiakhmetov
---
libavcodec/x86/vp9dsp_init_16
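
The "~30% faster than avx" figure can be checked against the numbers quoted in the message. The sketch below is one way to do that arithmetic; the helper names are hypothetical, and it assumes the figures are timings where lower is better. The message does not state the unit.

```c
#include <assert.h>

/* Percentage reduction in measured time going from `baseline` to `candidate`. */
static double percent_less_time(double baseline, double candidate)
{
    assert(baseline > 0.0);
    return (baseline - candidate) / baseline * 100.0;
}

/* How many times faster `candidate` is than `baseline` (ratio of timings). */
static double speedup_ratio(double baseline, double candidate)
{
    assert(candidate > 0.0);
    return baseline / candidate;
}
```

Under that assumption, `percent_less_time(36.6, 25.5)` comes to about 30.3, which matches the "~30%" in the message when read as a reduction in measured time. The same pair expressed as a ratio, `speedup_ratio(36.6, 25.5)`, is about 1.44, and against the C reference, `speedup_ratio(149.0, 25.5)`, about 5.8.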