Hi,

On Sat, Jul 14, 2012 at 9:29 PM, Justin Ruggles <justin.rugg...@gmail.com> wrote:
> ---
>  libavresample/x86/audio_convert.asm    |   62 ++++++++++++++++++++++++++++++++
>  libavresample/x86/audio_convert_init.c |    9 +++++
>  2 files changed, 71 insertions(+), 0 deletions(-)
>
> diff --git a/libavresample/x86/audio_convert.asm b/libavresample/x86/audio_convert.asm
> index 0ca562a..fdcea3a 100644
> --- a/libavresample/x86/audio_convert.asm
> +++ b/libavresample/x86/audio_convert.asm
> @@ -269,6 +269,68 @@ INIT_XMM avx
>  CONV_S16P_TO_S16_2CH
>  %endif
>
> +;------------------------------------------------------------------------------
> +; void ff_conv_s16p_to_s16_6ch(int16_t *dst, int16_t *const *src, int len,
> +;                              int channels);
> +;------------------------------------------------------------------------------
> +
> +%macro CONV_S16P_TO_S16_6CH 0
> +cglobal conv_s16p_to_s16_6ch, 2,8,6, dst, src, src1, src2, src3, src4, src5, len
> +%if ARCH_X86_64
> +    mov          lend, r2d
> +%else
> +    %define      lend  dword r2m
> +%endif
Eehw, just do:

%if ARCH_X86_64
cglobal ..., 3, 8, 6, dst, src, len, src1, src2, ..
%else
.. what you do up there ..
%endif

> +    movq    [dstq    ], m1
> +    movq    [dstq+  8], m0
> +    movq    [dstq+ 16], m2
> +    movhps  [dstq+ 24], m1
> +    movhps  [dstq+ 32], m0
> +    movhps  [dstq+ 40], m2
> +    add        srcq, mmsize/2
> +    add        dstq, mmsize*3
> +    sub        lend, mmsize/4
> +    jg .loop
> +    REP_RET
> +%endmacro

Here, too, I think you can use imul lenq, 6, then add that to dstq, neg it,
and index dstq as [dstq+lenq+0/8/16/..]. Then add lend, mmsize/4 instead of
sub, jl instead of jg, and you can remove the add dstq, mmsize*3 from the
inner loop.

Does unrolling this by another factor of 2 (and thus being able to use
aligned loads/stores) make a performance difference?

Ronald
_______________________________________________
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel
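[Editor's note: the counter-negation idiom suggested in the review above is easier to see outside of asm. Below is a minimal C sketch of the same transformation; the function name, channel count, and buffers are hypothetical, not the actual libavresample code. The end pointers and the negated index `i` stand in for the `imul`/`neg`/`jl` scheme: the loop body keeps a single counter update instead of per-pointer adds plus a separate `sub`.]

```c
#include <stdint.h>

/* Hypothetical example: interleave 2 planar int16_t channels using the
 * negated-counter idiom.  Advance each pointer past its end, negate the
 * sample count, and let one index run from -len up to 0.  The loop body
 * then needs only the single "++i" (cf. "add lend, mmsize/4" + "jl")
 * instead of separate pointer increments and a down-counting "sub". */
static void interleave_2ch(int16_t *dst, const int16_t *l,
                           const int16_t *r, int len)
{
    dst += 2 * len;          /* one past the end of dst            */
    l   += len;              /* one past the end of each plane     */
    r   += len;
    int i = -len;            /* counts up toward zero              */
    do {
        dst[2 * i]     = l[i];
        dst[2 * i + 1] = r[i];
    } while (++i < 0);       /* "jl .loop" equivalent              */
}
```

With `l = {1,2,3}` and `r = {4,5,6}`, this writes `dst = {1,4,2,5,3,6}`, the same result as a conventional forward loop.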