Re: [FFmpeg-devel] [PATCH 0/4] Exploit compile-time constant
Patch okay.

Mickaël

On 4 August 2014 at 10:31, Christophe Gisquet christophe.gisq...@gmail.com wrote:
> Hi,
>
> 2014-08-02 14:48 GMT+02:00 Michael Niedermayer michae...@gmx.at:
>> seems to fail with
>> libavcodec/x86/hevc_mc.asm:1258: error: (add:2) cannot reference symbol `MAX_PB_SIZE' in preprocessor
>
> I forgot the initial patch when generating the patchset; you can find it here.
> I expect no changes for the others, so I didn't bother resending them or starting another thread.
>
> --
> Christophe
> 0001-x86-hevc_mc-assume-2nd-source-stride-is-64.patch
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
Re: [FFmpeg-devel] [PATCH 0/4] Exploit compile-time constant
On Fri, Aug 22, 2014 at 11:40:17AM +0200, Mickaël Raulet wrote:
> Patch okay.

Patch applied.

Just to make sure I don't misunderstand: was that okay just for this patch, or for the whole patchset?

Thanks

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In fact, the RIAA has been known to suggest that students drop out of college or go to community college in order to be able to afford settlements. -- The RIAA
Re: [FFmpeg-devel] [PATCH 0/4] Exploit compile-time constant
For the whole patchset.

Mickaël

On 22 August 2014 at 13:25, Michael Niedermayer michae...@gmx.at wrote:
> On Fri, Aug 22, 2014 at 11:40:17AM +0200, Mickaël Raulet wrote:
>> Patch okay.
>
> Patch applied.
>
> Just to make sure I don't misunderstand: was that okay just for this patch, or for the whole patchset?
>
> Thanks
>
> [...]
Re: [FFmpeg-devel] [PATCH 0/4] Exploit compile-time constant
On Fri, Aug 22, 2014 at 02:04:36PM +0200, Mickaël Raulet wrote:
> for the whole patchset.

All applied.

Thanks

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Let us carefully observe those good qualities wherein our enemies excel us and endeavor to excel them, by avoiding what is faulty, and imitating what is excellent in them. -- Plutarch
Re: [FFmpeg-devel] [PATCH 0/4] Exploit compile-time constant
Hi,

2014-08-02 14:48 GMT+02:00 Michael Niedermayer michae...@gmx.at:
> seems to fail with
> libavcodec/x86/hevc_mc.asm:1258: error: (add:2) cannot reference symbol `MAX_PB_SIZE' in preprocessor

I forgot the initial patch when generating the patchset; you can find it here.
I expect no changes for the others, so I didn't bother resending them or starting another thread.

--
Christophe

From 8b13e4350c6662ca4bd2bcab443a1e62f7751b30 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq...@gmail.com
Date: Mon, 28 Jul 2014 08:55:26 +0200
Subject: [PATCH 1/5] x86: hevc_mc: assume 2nd source stride is 64

---
 libavcodec/x86/hevc_mc.asm | 36 +++++++++++++++++++++---------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/libavcodec/x86/hevc_mc.asm b/libavcodec/x86/hevc_mc.asm
index fc78062..51017cf 100644
--- a/libavcodec/x86/hevc_mc.asm
+++ b/libavcodec/x86/hevc_mc.asm
@@ -75,6 +75,8 @@ QPEL_TABLE 8, 8, b, sse4
 QPEL_TABLE 10, 4, w, sse4
 QPEL_TABLE 12, 4, w, sse4
 
+%define MAX_PB_SIZE  64
+
 %define hevc_qpel_filters_sse4_14 hevc_qpel_filters_sse4_10
 
 %if ARCH_X86_64
@@ -377,7 +379,11 @@ QPEL_TABLE 12, 4, w, sse4
 %endmacro
 
 %macro LOOP_END 4
+%ifnum %2
+    add              %1q, 2*%2                   ; dst += dststride
+%else
     lea              %1q, [%1q+2*%2q]            ; dst += dststride
+%endif
     add              %3q, %4q                    ; src += srcstride
     dec          heightd                         ; cmp height
     jnz               .loop                      ; height loop
@@ -548,7 +554,7 @@ cglobal hevc_put_hevc_pel_pixels%1_%2, 5, 5, 3, dst, dststride, src, srcstride,h
     SIMPLE_LOAD       %1, %2, srcq, m0
     MC_PIXEL_COMPUTE  %1, %2
     PEL_10STORE%1     dstq, m0, m1
-    LOOP_END         dst, dststride, src, srcstride
+    LOOP_END         dst, MAX_PB_SIZE, src, srcstride
     RET
 
 cglobal hevc_put_hevc_uni_pel_pixels%1_%2, 5, 5, 3, dst, dststride, src, srcstride,height
@@ -573,7 +579,7 @@ cglobal hevc_put_hevc_bi_pel_pixels%1_%2, 7, 7, 6, dst, dststride, src, srcstrid
     PEL_%2STORE%1     dstq, m0, m1
     add               dstq, dststrideq             ; dst += dststride
     add               srcq, srcstrideq             ; src += srcstride
-    lea              src2q, [src2q+2*src2strideq]  ; src += srcstride
+    add              src2q, 2*MAX_PB_SIZE          ; src += srcstride
     dec            heightd                         ; cmp height
     jnz               .loop                        ; height loop
     RET
@@ -597,7 +603,7 @@ cglobal hevc_put_hevc_epel_h%1_%2, 6, 7, 6, dst, dststride, src, srcstride, heig
     EPEL_LOAD         %2, srcq-%%stride, %%stride, %1
     EPEL_COMPUTE      %2, %1, m4, m5
     PEL_10STORE%1     dstq, m0, m1
-    LOOP_END         dst, dststride, src, srcstride
+    LOOP_END         dst, MAX_PB_SIZE, src, srcstride
     RET
 
 cglobal hevc_put_hevc_uni_epel_h%1_%2, 6, 7, 7, dst, dststride, src, srcstride, height, mx, rfilter
@@ -626,7 +632,7 @@ cglobal hevc_put_hevc_bi_epel_h%1_%2, 8, 9, 7, dst, dststride, src, srcstride, s
     PEL_%2STORE%1     dstq, m0, m1
     add               dstq, dststrideq             ; dst += dststride
     add               srcq, srcstrideq             ; src += srcstride
-    lea              src2q, [src2q+2*src2strideq]  ; src += srcstride
+    add              src2q, 2*MAX_PB_SIZE          ; src += srcstride
     dec            heightd                         ; cmp height
     jnz               .loop                        ; height loop
     RET
@@ -646,7 +652,7 @@ cglobal hevc_put_hevc_epel_v%1_%2, 7, 8, 6, dst, dststride, src, srcstride, heig
     EPEL_LOAD         %2, srcq, srcstride, %1
     EPEL_COMPUTE      %2, %1, m4, m5
     PEL_10STORE%1     dstq, m0, m1
-    LOOP_END         dst, dststride, src, srcstride
+    LOOP_END         dst, MAX_PB_SIZE, src, srcstride
     RET
 
 cglobal hevc_put_hevc_uni_epel_v%1_%2, 7, 8, 7, dst, dststride, src, srcstride, height, r3src, my, rfilter
@@ -679,7 +685,7 @@ cglobal hevc_put_hevc_bi_epel_v%1_%2, 9, 10, 7, dst, dststride, src, srcstride,
     PEL_%2STORE%1     dstq, m0, m1
     add               dstq, dststrideq             ; dst += dststride
     add               srcq, srcstrideq             ; src += srcstride
-    lea              src2q, [src2q+2*src2strideq]  ; src += srcstride
+    add              src2q, 2*MAX_PB_SIZE          ; src += srcstride
     dec            heightd                         ; cmp height
     jnz               .loop                        ; height loop
     RET
@@ -724,7 +730,7 @@ cglobal hevc_put_hevc_epel_hv%1_%2, 7, 9, 12 , dst, dststride, src, srcstride, h
     movdqa            m4, m5
     movdqa            m5, m6
     movdqa            m6, m7
-    LOOP_END         dst, dststride, src, srcstride
+    LOOP_END         dst, MAX_PB_SIZE, src, srcstride
     RET
 
 cglobal hevc_put_hevc_uni_epel_hv%1_%2, 7, 9, 12 , dst, dststride, src, srcstride, height, mx, my, r3src, rfilter
@@ -801,7 +807,7 @@ cglobal hevc_put_hevc_bi_epel_hv%1_%2,
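The idea behind the patch above can be sketched in C. This is only an illustration under stated assumptions, not the FFmpeg code: `put_pixels_var`, `put_pixels_fixed` and `outputs_match` are hypothetical names, and the `<< 6` is a placeholder for the real per-pixel computation. The first function takes the destination stride as a run-time argument; the second assumes the intermediate buffer always has a stride of `MAX_PB_SIZE` elements, so the per-row step becomes a constant (the `add %1q, 2*%2` branch of `LOOP_END`, where the factor 2 is the size of a 16-bit sample).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PB_SIZE 64  /* same value the patch defines on the asm side */

/* Generic form: destination stride (in int16_t elements) is a run-time argument. */
static void put_pixels_var(int16_t *dst, ptrdiff_t dststride,
                           const uint8_t *src, ptrdiff_t srcstride,
                           int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = (int16_t)(src[x] << 6);  /* placeholder computation */
        dst += dststride;
        src += srcstride;
    }
}

/* Specialised form: the destination stride is assumed to be MAX_PB_SIZE,
 * so it is neither passed nor kept in a register. */
static void put_pixels_fixed(int16_t *dst,
                             const uint8_t *src, ptrdiff_t srcstride,
                             int width, int height)
{
    assert(width <= MAX_PB_SIZE);
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++)
            dst[x] = (int16_t)(src[x] << 6);  /* placeholder computation */
        dst += MAX_PB_SIZE;                   /* constant step: 2*64 bytes */
        src += srcstride;
    }
}

/* Hypothetical self-check: both forms must write identical buffers. */
static int outputs_match(void)
{
    enum { W = 8, H = 4 };
    uint8_t src[W * H];
    int16_t a[H * MAX_PB_SIZE], b[H * MAX_PB_SIZE];

    for (int i = 0; i < W * H; i++)
        src[i] = (uint8_t)(i * 7);
    memset(a, 0, sizeof(a));
    memset(b, 0, sizeof(b));

    put_pixels_var(a, MAX_PB_SIZE, src, W, W, H);
    put_pixels_fixed(b, src, W, W, H);
    return memcmp(a, b, sizeof(a)) == 0;
}
```

Calling `outputs_match()` compares the two variants on a small 8x4 block; the specialised form is only valid as long as every caller really uses a `MAX_PB_SIZE`-wide intermediate buffer.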
Re: [FFmpeg-devel] [PATCH 0/4] Exploit compile-time constant
On Mon, Aug 04, 2014 at 10:31:52AM +0200, Christophe Gisquet wrote:
> Hi,
>
> 2014-08-02 14:48 GMT+02:00 Michael Niedermayer michae...@gmx.at:
>> seems to fail with
>> libavcodec/x86/hevc_mc.asm:1258: error: (add:2) cannot reference symbol `MAX_PB_SIZE' in preprocessor
>
> I forgot the initial patch when generating the patchset; you can find it here.
> I expect no changes for the others, so I didn't bother resending them or starting another thread.

Yes, the build works fine now with that patch.

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

What does censorship reveal? It reveals fear. -- Julian Assange
Re: [FFmpeg-devel] [PATCH 0/4] Exploit compile-time constant
Hi,

2014-08-02 14:48 GMT+02:00 Michael Niedermayer michae...@gmx.at:
> is this for apply/push or just RFC/WIP ?

In between. I had expected Mickael Raulet to comment if he saw something incompatible with this. I think the bipred code is a bit more mature since Ronald's comments (iirc), so "premature optimization" is probably a bit strong.

Once Mickael is OK, then I'd agree with you about applying it.

> you say Premature optimization and overall not that useful.
> i would tend to suggest to apply it as it improves speed ...

I was saying this mostly because it doesn't really register overall: MC is around 20% in ffhevc, for starters.

> seems to fail with
> libavcodec/x86/hevc_mc.asm:1258: error: (add:2) cannot reference symbol `MAX_PB_SIZE' in preprocessor

That's actually the biggest beef I have with this patchset: MAX_PB_SIZE is both a C and an asm define, and the two need to be synchronized manually. I don't see how it could go beyond 64 (the max block size in HEVC), so the issue is rhetorical.

I'm busy atm, so I don't expect a new patchset soon.

Best regards,
--
Christophe
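The manual synchronization between the C and asm defines could be guarded on the C side roughly as follows. This is a sketch under assumptions, not something proposed in the thread: `ASM_MAX_PB_SIZE` is a hypothetical hand-maintained mirror of the literal in `hevc_mc.asm`, and `intermediate_row_bytes` is a hypothetical helper. It does not remove the manual step; it only turns a change of the C value into a build failure that points at the asm file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C-side define, as used for the intermediate buffer strides. */
#define MAX_PB_SIZE 64

/* Hypothetical mirror of "%define MAX_PB_SIZE 64" in hevc_mc.asm.
 * It still has to be updated by hand together with the asm file. */
#define ASM_MAX_PB_SIZE 64

/* C11 compile-time check: the build breaks if only one side is changed. */
_Static_assert(MAX_PB_SIZE == ASM_MAX_PB_SIZE,
               "MAX_PB_SIZE differs between C and hevc_mc.asm");

/* Byte step between rows of the 16-bit intermediate buffer,
 * i.e. the 2*MAX_PB_SIZE immediate used by the asm. */
static size_t intermediate_row_bytes(void)
{
    return sizeof(int16_t) * MAX_PB_SIZE;
}
```

With the values above, `intermediate_row_bytes()` yields 128, matching the `2*MAX_PB_SIZE` step in the assembly.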
Re: [FFmpeg-devel] [PATCH 0/4] Exploit compile-time constant
On Mon, Jul 28, 2014 at 05:17:24PM +, Christophe Gisquet wrote:
> MAX_PB_SIZE is used or assumed for various buffer strides. In some cases, it is used as constant parameter(s) to functions.
>
> Make use of that knowledge to:
> - not pass the parameter
> - avoid extra GPR usage
> - precompute addresses / offsets
>
> Premature optimization and overall not that useful.
>
> Before:
> 46092 decicycles in luma, 1028766 runs, 19810 skips
> 10174 decicycles in chroma, 2065859 runs, 31293 skips
>
> After:
> 45634 decicycles in luma, 1027414 runs, 21162 skips
> 9932 decicycles in chroma, 2063780 runs, 33372 skips

Is this for apply/push or just RFC/WIP?

You say "Premature optimization and overall not that useful."
I would tend to suggest applying it, as it improves speed ...

> Christophe Gisquet (4):
>   hevc: move MAX_PB_SIZE declaration
>   hevcdsp: remove compilation-time-fixed parameter
>   hevcdsp: remove more instances of compile-time-fixed parameters
>   x86: hevcdsp: use compilation-time-fixed constant

seems to fail with
libavcodec/x86/hevc_mc.asm:1258: error: (add:2) cannot reference symbol `MAX_PB_SIZE' in preprocessor

[...]

--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Republics decline into democracies and democracies degenerate into despotisms. -- Aristotle
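To put the quoted timer figures in perspective, the relative saving can be worked out directly from the decicycle counts. The helper below is a hypothetical convenience (`percent_saved` is not an FFmpeg function); it only does the arithmetic on the numbers quoted in the cover letter.

```c
#include <assert.h>

/* Relative reduction in percent between two timer readings. */
static double percent_saved(long before, long after)
{
    assert(before > 0);
    return 100.0 * (double)(before - after) / (double)before;
}
```

For the figures above, `percent_saved(46092, 45634)` is just under 1% for luma and `percent_saved(10174, 9932)` is about 2.4% for chroma, which is consistent with both views in the thread: a measurable gain, but a small one.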