Re: [libav-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

2011-12-28 Thread Ronald S. Bultje
Hi, dead thread ping. On Mon, Sep 5, 2011 at 6:33 PM, Loren Merritt wrote: > On Sat, Sep 3, 2011, Vitor Sessak wrote: >> On Sat, Sep 3, 2011, Ronald S. Bultje wrote: >>> On Thu, Sep 1, 2011, Vitor Sessak wrote: >>> +%macro LOADA64 2 +   movlps   %1, [%2] +   movhps   %1, [%2 +

Re: [libav-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

2011-09-05 Thread Loren Merritt
On Sat, Sep 3, 2011, Ronald S. Bultje wrote: > On Sat, Sep 3, 2011, Ronald S. Bultje wrote: >> On Thu, Sep 1, 2011, Vitor Sessak wrote: >> >>> +%macro LOADA64 2 >>> +   movlps   %1, [%2] >>> +   movhps   %1, [%2 + 8] >>> +%endmacro >>> + >>> +%macro STOREA64 2 >>> +   movlps   [%1    ], %2 >>> +

Re: [libav-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

2011-09-05 Thread Vitor Sessak
On Mon, Sep 5, 2011 at 10:52 PM, Luca Barbato wrote: > On 9/5/11 8:33 PM, Vitor Sessak wrote: >> >> On Sat, Sep 3, 2011 at 12:50 PM, Vitor Sessak  wrote: >>> >>> On Fri, Sep 2, 2011 at 3:04 PM, Loren Merritt >>>  wrote: On Fri, 2 Sep 2011, Vitor Sessak wrote: > ; input  %1={x1,x

Re: [libav-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

2011-09-05 Thread Luca Barbato
On 9/5/11 8:33 PM, Vitor Sessak wrote: On Sat, Sep 3, 2011 at 12:50 PM, Vitor Sessak wrote: On Fri, Sep 2, 2011 at 3:04 PM, Loren Merritt wrote: On Fri, 2 Sep 2011, Vitor Sessak wrote: ; input %1={x1,x2,x3,x4}, %2={y1,y2,y3,y4} ; output %3={x4,y1,y2,y3} %macro ROTLEFT 3 BUILDINVHIGHLO

Re: [libav-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

2011-09-05 Thread Vitor Sessak
On Sat, Sep 3, 2011 at 12:50 PM, Vitor Sessak wrote: > On Fri, Sep 2, 2011 at 3:04 PM, Loren Merritt wrote: >> On Fri, 2 Sep 2011, Vitor Sessak wrote: >> >>> ; input  %1={x1,x2,x3,x4}, %2={y1,y2,y3,y4} >>> ; output %3={x4,y1,y2,y3} >>> %macro ROTLEFT 3 >>>     BUILDINVHIGHLOW %1, %2, %3 >>>     s

Re: [libav-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

2011-09-03 Thread Vitor Sessak
On Fri, Sep 2, 2011 at 3:04 PM, Loren Merritt wrote: > On Fri, 2 Sep 2011, Vitor Sessak wrote: > >> ; input  %1={x1,x2,x3,x4}, %2={y1,y2,y3,y4} >> ; output %3={x4,y1,y2,y3} >> %macro ROTLEFT 3 >>     BUILDINVHIGHLOW %1, %2, %3 >>     shufps  %3, %3, %2, 0x99 >> %endmacro > > palignr New version w

Re: [libav-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

2011-09-03 Thread Vitor Sessak
On Sat, Sep 3, 2011 at 2:01 AM, Ronald S. Bultje wrote: > Hi, > > On Thu, Sep 1, 2011 at 9:54 PM, Vitor Sessak wrote: >> On Sun, Aug 21, 2011 at 4:53 PM, Vitor Sessak wrote: >>> $subj. A lot faster on my Atom, much less impressive difference for others >>> CPUs. >> >> ping (new version attached

Re: [libav-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

2011-09-03 Thread Måns Rullgård
Vitor Sessak writes: > On Sun, Aug 21, 2011 at 4:53 PM, Vitor Sessak wrote: >> $subj. A lot faster on my Atom, much less impressive difference for others >> CPUs. > > ping (new version attached). > > -Vitor > > From c49e56aeda477c499d523ea34ee65947bae6cc39 Mon Sep 17 00:00:00 2001 > From: Vitor

Re: [libav-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

2011-09-02 Thread Ronald S. Bultje
Hi, On Thu, Sep 1, 2011 at 9:54 PM, Vitor Sessak wrote: > On Sun, Aug 21, 2011 at 4:53 PM, Vitor Sessak wrote: >> $subj. A lot faster on my Atom, much less impressive difference for others >> CPUs. > > ping (new version attached). I really think I'm missing something very very obvious here, or

Re: [libav-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

2011-09-02 Thread Loren Merritt
On Fri, 2 Sep 2011, Vitor Sessak wrote: > ; input %1={x1,x2,x3,x4}, %2={y1,y2,y3,y4} > ; output %3={x4,y1,y2,y3} > %macro ROTLEFT 3 > BUILDINVHIGHLOW %1, %2, %3 > shufps %3, %3, %2, 0x99 > %endmacro palignr --Loren Merritt

Re: [libav-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

2011-09-01 Thread Vitor Sessak
On Sun, Aug 21, 2011 at 4:53 PM, Vitor Sessak wrote: > $subj. A lot faster on my Atom, much less impressive difference for others > CPUs. ping (new version attached). -Vitor From c49e56aeda477c499d523ea34ee65947bae6cc39 Mon Sep 17 00:00:00 2001 From: Vitor Sessak Date: Mon, 22 Aug 2011 07:59:4

[libav-devel] [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36()

2011-08-21 Thread Vitor Sessak
$subj. A lot faster on my Atom, much less impressive difference for others CPUs. -Vitor From a57855f575b0ef85d95dbe318ebe988314c770dc Mon Sep 17 00:00:00 2001 From: Vitor Sessak Date: Sun, 21 Aug 2011 16:44:20 +0200 Subject: [PATCH 2/2] mpegaudiodec: add SSE-optimized imdct36() --- libavcodec/x