Re: [libav-devel] [PATCHv4] arm: vp9: Add NEON optimizations of VP9 MC functions

2016-11-03 Thread Janne Grunau
On 2016-11-02 23:14:28 +0200, Martin Storsjö wrote:
> On Wed, 2 Nov 2016, Martin Storsjö wrote:
> > I'll push the arm version tomorrow unless there's more comments on it.

patch ok. push when you want.

Janne

Re: [libav-devel] [PATCHv4] arm: vp9: Add NEON optimizations of VP9 MC functions

2016-11-02 Thread Martin Storsjö
On Wed, 2 Nov 2016, Martin Storsjö wrote:
+@ Instantiate a horizontal filter function for the given size.
+@ This can work on 4, 8 or 16 pixels in parallel; for larger
+@ widths it will do 16 pixels at a time and loop horizontally.
+@ The actual width is passed in r5, the height in r4 and
+@
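For readers following the comment above, here is a rough C sketch of the tiling it describes: widths of 4, 8 or 16 are handled in one step, and larger widths are walked 16 pixels at a time. The helper names `chunk_width` and `chunks_per_row` are hypothetical and do not appear in the patch; the sketch only models the horizontal stepping, not the NEON filter itself or its register usage.

```c
#include <assert.h>

/* Hypothetical helper: number of pixels handled per horizontal step.
 * Assumption: width is one of 4, 8, 16, 32 or 64, as for VP9 MC blocks. */
static int chunk_width(int width)
{
    assert(width >= 4);
    return width < 16 ? width : 16;
}

/* Hypothetical helper: how many horizontal steps one row needs.
 * For widths up to 16 this is 1; wider blocks loop 16 pixels at a time. */
static int chunks_per_row(int width)
{
    int step = chunk_width(width);
    int chunks = 0;
    for (int x = 0; x < width; x += step)
        chunks++;
    return chunks;
}
```

With these definitions a 64-pixel-wide block takes four steps per row, while a 4- or 8-pixel-wide block takes one.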

Re: [libav-devel] [PATCHv4] arm: vp9: Add NEON optimizations of VP9 MC functions

2016-11-02 Thread Martin Storsjö
On Wed, 2 Nov 2016, Janne Grunau wrote:
On 2016-11-02 13:47:37 +0200, Martin Storsjö wrote:
diff --git a/libavcodec/arm/vp9mc_neon.S b/libavcodec/arm/vp9mc_neon.S
new file mode 100644
index 000..0651ec7
--- /dev/null
+++ b/libavcodec/arm/vp9mc_neon.S
@@ -0,0 +1,764 @@
+
+@ All public

Re: [libav-devel] [PATCHv4] arm: vp9: Add NEON optimizations of VP9 MC functions

2016-11-02 Thread Janne Grunau
On 2016-11-02 13:47:37 +0200, Martin Storsjö wrote:
...
> ---
>  libavcodec/arm/Makefile          |   2 +
>  libavcodec/arm/vp9dsp_init_arm.c | 140 +++
>  libavcodec/arm/vp9mc_neon.S      | 764 +++
>  libavcodec/vp9.h                 |   4 +-
>

[libav-devel] [PATCHv4] arm: vp9: Add NEON optimizations of VP9 MC functions

2016-11-02 Thread Martin Storsjö
This work is sponsored by, and copyright, Google. The filter coefficients are signed values, where the product of the multiplication with one individual filter coefficient doesn't overflow a 16 bit signed value (the largest filter coefficient is 127). But when the products are accumulated, the
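The arithmetic behind the first claim can be checked with a small C sketch. It assumes 8-bit pixels (0..255), which the excerpt does not state explicitly, and takes 127 as the largest coefficient from the message above. The helpers `mul_tap` and `fits_int16` are hypothetical names for illustration only. The second helper shows how little headroom a worst-case product leaves in a 16-bit lane; it is a bound on the numbers, not a model of any actual VP9 filter kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 8-bit pixels; 127 is the largest coefficient quoted above. */
enum { MAX_PIXEL = 255, MAX_COEFF = 127 };

/* A single product always fits in a signed 16-bit lane: 255 * 127 = 32385. */
_Static_assert(MAX_PIXEL * MAX_COEFF <= INT16_MAX,
               "single product fits in int16_t");

/* Hypothetical helper: one pixel times one signed coefficient. */
static int16_t mul_tap(uint8_t pixel, int8_t coeff)
{
    return (int16_t)(pixel * coeff);
}

/* Hypothetical helper: nonzero if a running sum still fits in int16_t. */
static int fits_int16(int32_t sum)
{
    return sum >= INT16_MIN && sum <= INT16_MAX;
}
```

A worst-case product of 32385 leaves only 382 of headroom below INT16_MAX, so a plain 16-bit accumulator cannot be assumed safe once several products are summed.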