Re: [libav-devel] [PATCH] swscale: implement scale19to15 in sse2.
On Sat, 23 Jul 2011, Jason Garrett-Glaser wrote: > On Sat, Jul 23, 2011 at 8:11 PM, Ronald S. Bultje wrote: >> On Sat, Jul 23, 2011 at 7:32 PM, Jason Garrett-Glaser wrote: >>> On Sat, Jul 23, 2011 at 6:52 PM, Ronald S. Bultje >>> wrote: >>> +cglobal scale19To15Fw_sse2, 3, 4, 0 + � �xor � � � � � r3, r3 +.loop: + � �movdqu � � � �m0, [r1+r3*4+ 0] + � �movdqu � � � �m1, [r1+r3*4+16] + � �psrad � � � � m0, 4 + � �psrad � � � � m1, 4 + � �packssdw � � �m0, m1 + � �movdqu [r0+r3*2], m0 + � �add � � � � � r3, 4 + � �cmp � � � � �r3d, r2d + � �jl .loop + � �REP_RET >>> >>> This is going to be load-bottlenecked. �Why can't we guarantee the >>> inputs or outputs are aligned? >> >> Good question, I've asked that before and got no answer. It seems to >> have to do with special-case conditions where ( blah blah ) a crop can >> be done inline and that must destroy the general system. >> >> I suppose I could make two versions, one movdqa and one movdqu, >> depending on alignment of the src/dst, and test that per-iteration. >> Would that work? > > Wouldn't that be 4 versions for all 4 permutations of the possibilities? The load can always be aligned, if you don't mind fixing up the ends outside the loop. --Loren Merritt___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/2] swscale: implement SSE2 version for 9/10-bit horizontal scaling.
On Sat, 23 Jul 2011, Ronald S. Bultje wrote: > +%macro SCALE_16BITS_FUNC 1 > +INIT_XMM > +cglobal hscale%1_sse2, 7, 7, 5 > +%ifdef ARCH_X86_64 > +movsxdr2, r2d > +%endif > +movdqam2, [max_19bit] > +cmp r6d, 4 > +je .scale4 > +cmp r6d, 8 > +je .scale8 > +jmp .scaleX cmp r6d, 8 je .scale8 jg .scaleX > +.scale4: > +lea r1, [r1+r2*4] > +lea r4, [r4+r2*8] > +lea r5, [r5+r2*2] > +neg r2 > +.loop4: > +movsx r0, word [r5+r2*2+0] ; filterPos[0] > +movsx r6, word [r5+r2*2+4] ; filterPos[2] > +movq m0, [r3+r0*2] ; src[filterPos[0] + {0,1,2,3}] > +movhpsm0, [r3+r6*2] ; src[filterPos[2] + {0,1,2,3}] > +movsx r0, word [r5+r2*2+2] ; filterPos[1] > +movsx r6, word [r5+r2*2+6] ; filterPos[3] > +movq m3, [r3+r0*2] ; src[filterPos[1] + {0,1,2,3}] > +movhpsm3, [r3+r6*2] ; src[filterPos[3] + {0,1,2,3}] > +movq m1, [r4+r2*8+ 0] ; filter[{0,1,..,6,7}] > +movhpsm1, [r4+r2*8+16] ; filter[{0,1,..,6,7}] > +movq m4, [r4+r2*8+ 8] ; filter[{8,9,..,14,15}] > +movhpsm4, [r4+r2*8+24] ; filter[{8,9,..,14,15}] Can you make these consecutive? > +pmaddwd m0, m1 > +pmaddwd m3, m4 > +pshufdm1, m0, 00110001b > +pshufdm4, m3, 1000b > +paddd m0, m1 > +paddd m3, m4 > +pblendw m0, m3, 11001100b ; filter[{ 0, 1, 2, > 3}]*src[filterPos[0]+{0,1,2,3}], > +; filter[{ 4, 5, 6, > 7}]*src[filterPos[1]+{0,1,2,3}], > +; filter[{ 8, > 9,10,11}]*src[filterPos[2]+{0,1,2,3}], > +; > filter[{12,13,14,15}]*src[filterPos[3]+{0,1,2,3}] pblendw is sse4 > +psrad m0, %1 - 5 > +pminsdm0, m2 You only clamp one end of the range? > +movdqa [r1+r2*4], m0 > +add r2, 4 > +jl .loop4 > +REP_RET > + > +.scale8: > +shl r2, 1 ; this allows *16 (i.e. now *8) in > lea instructions > +lea r1, [r1+r2*2] > +lea r4, [r4+r2*8] > +lea r5, [r5+r2*1] > +neg r2 > +.loop8: > +movsx r0, word [r5+r2*1+0] ; filterPos[0] > +movsx r6, word [r5+r2*1+2] ; filterPos[1] > +movdqum0, [r3+r0*2] ; src[filterPos[0] + > {0,1,2,3,4,5,6,7}] > +movdqum3, [r3+r6*2] ; src[filterPos[1] + > {0,1,2,3,4,5,6,7}] > +movdqam1, [r4+r2*8 ] ; filter[{0,1,..,6,7}] > +movdqam4, [r4+r2*8+16] ; filter[{8,9,..,14,15}] > +pmaddwd m0, m1 > +pmaddwd m3, m4 memory arg > +pshufdm1, m0, 1110b > +pshufdm4, m3, 1110b > +paddd m0, m1 > +paddd m3, m4 > +punpckldq m0, m3 > +pshufdm1, m0, 1110b > +paddd m0, m1; filter[{0,1,..., 6, > 7}]*src[filterPos[0]+{0,1,...,6,7}], > +; > filter[{8,9,...,14,15}]*src[filterPos[1]+{0,1,...,6,7}], > +psrad m0, %1 - 5 > +pminsdm0, m2 > +movq [r1+r2*2], m0 > +add r2, 4 ; it really only does 2px, see "shl > r2,1" above > +jl .loop8 > +REP_RET > + > +.scaleX: > +%ifdef ARCH_X86_64 > +movsxdr6, r6d > +%endif > +lea r11, [r3+r6*2] ; &src[filterSize] r11 on x86_32? > +mov r6m, r11 Keep it in a register. > +lea r5, [r5+r2*2] > +lea r1, [r1+r2*4] > +neg r2 > +.loopX: > +movsx r0, word [r5+r2*2+0] ; filterPos[0] > +movsxr10, word [r5+r2*2+2] ; filterPos[1] > +pxor m3, m3 > +mov r11, r3 > +.innerloopX: > +movq m0, [r11+r0 *2] ; src[filterPos[0] + {0,1,2,3}] > +movhpsm0, [r11+r10*2] ; src[filterPos[1] + {0,1,2,3}] > +movq m1, [r4 ] ; filter[{0,1,2,3}] > +movhpsm1, [r4+r6*2] ; filter[filtersize+{0,1,2,3}] > +pmaddwd m0, m1 > +paddd m3, m0 > +add r4, 8 > +add r11, 8 > +cmp r11, r6m ; while (src += 4) < &src[filterSize] > +jl .innerloopX > +lea r4, [r4+r6*2] > +pshufdm0, m3, 00110001b > +paddd m0, m3 > +pshufdm0, m0, 1000b Would be lower latency if you did the 2 shuffles in parallel. > +psrad m0, %1 - 5 > +pminsdm0, m2 > +movq [r1+r2*4], m0 > +add r2, 2 > +jl .loopX > +RET > +%endmacro --Loren Merritt ___ libav-devel mailing list libav-devel@libav.org https://list
Re: [libav-devel] [PATCH] swscale: implement scale19to15 in sse2.
On Sat, Jul 23, 2011 at 8:11 PM, Ronald S. Bultje wrote: > Hi, > > On Sat, Jul 23, 2011 at 7:32 PM, Jason Garrett-Glaser wrote: >> On Sat, Jul 23, 2011 at 6:52 PM, Ronald S. Bultje wrote: >>> From 4.89k cycles to 1.23k cycles on a 1980-pixel-wide movie, i.e. >>> ~4x speedup. >>> --- >>> libswscale/x86/scale.asm | 20 >>> libswscale/x86/swscale_mmx.c | 2 ++ >>> 2 files changed, 22 insertions(+), 0 deletions(-) >>> >>> diff --git a/libswscale/x86/scale.asm b/libswscale/x86/scale.asm >>> index e200801..c3a9a75 100644 >>> --- a/libswscale/x86/scale.asm >>> +++ b/libswscale/x86/scale.asm >>> @@ -153,3 +153,23 @@ cglobal hscale%1_sse2, 7, 7, 5 >>> >>> SCALE_16BITS_FUNC 9 >>> SCALE_16BITS_FUNC 10 >>> + >>> +;- >>> +; cut off last 4 bits of a line of pixels, and convert from 32 to 16 bits >>> +; >>> +; void scale19To15Fw_sse2(int16_t *dst, int32_t *src, int len); >>> +;- >>> + >>> +cglobal scale19To15Fw_sse2, 3, 4, 0 >>> + xor r3, r3 >>> +.loop: >>> + movdqu m0, [r1+r3*4+ 0] >>> + movdqu m1, [r1+r3*4+16] >>> + psrad m0, 4 >>> + psrad m1, 4 >>> + packssdw m0, m1 >>> + movdqu [r0+r3*2], m0 >>> + add r3, 4 >>> + cmp r3d, r2d >>> + jl .loop >>> + REP_RET >> >> This is going to be load-bottlenecked. Why can't we guarantee the >> inputs or outputs are aligned? > > Good question, I've asked that before and got no answer. It seems to > have to do with special-case conditions where ( blah blah ) a crop can > be done inline and that must destroy the general system. > > I suppose I could make two versions, one movdqa and one movdqu, > depending on alignment of the src/dst, and test that per-iteration. > Would that work? Wouldn't that be 4 versions for all 4 permutations of the possibilities? Jason ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] swscale: implement scale19to15 in sse2.
Hi, On Sat, Jul 23, 2011 at 7:32 PM, Jason Garrett-Glaser wrote: > On Sat, Jul 23, 2011 at 6:52 PM, Ronald S. Bultje wrote: >> From 4.89k cycles to 1.23k cycles on a 1980-pixel-wide movie, i.e. >> ~4x speedup. >> --- >> libswscale/x86/scale.asm | 20 >> libswscale/x86/swscale_mmx.c | 2 ++ >> 2 files changed, 22 insertions(+), 0 deletions(-) >> >> diff --git a/libswscale/x86/scale.asm b/libswscale/x86/scale.asm >> index e200801..c3a9a75 100644 >> --- a/libswscale/x86/scale.asm >> +++ b/libswscale/x86/scale.asm >> @@ -153,3 +153,23 @@ cglobal hscale%1_sse2, 7, 7, 5 >> >> SCALE_16BITS_FUNC 9 >> SCALE_16BITS_FUNC 10 >> + >> +;- >> +; cut off last 4 bits of a line of pixels, and convert from 32 to 16 bits >> +; >> +; void scale19To15Fw_sse2(int16_t *dst, int32_t *src, int len); >> +;- >> + >> +cglobal scale19To15Fw_sse2, 3, 4, 0 >> + xor r3, r3 >> +.loop: >> + movdqu m0, [r1+r3*4+ 0] >> + movdqu m1, [r1+r3*4+16] >> + psrad m0, 4 >> + psrad m1, 4 >> + packssdw m0, m1 >> + movdqu [r0+r3*2], m0 >> + add r3, 4 >> + cmp r3d, r2d >> + jl .loop >> + REP_RET > > This is going to be load-bottlenecked. Why can't we guarantee the > inputs or outputs are aligned? Good question, I've asked that before and got no answer. It seems to have to do with special-case conditions where ( blah blah ) a crop can be done inline and that must destroy the general system. I suppose I could make two versions, one movdqa and one movdqu, depending on alignment of the src/dst, and test that per-iteration. Would that work? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] swscale: implement scale19to15 in sse2.
On Sat, Jul 23, 2011 at 6:52 PM, Ronald S. Bultje wrote: > From 4.89k cycles to 1.23k cycles on a 1980-pixel-wide movie, i.e. > ~4x speedup. > --- > libswscale/x86/scale.asm | 20 > libswscale/x86/swscale_mmx.c | 2 ++ > 2 files changed, 22 insertions(+), 0 deletions(-) > > diff --git a/libswscale/x86/scale.asm b/libswscale/x86/scale.asm > index e200801..c3a9a75 100644 > --- a/libswscale/x86/scale.asm > +++ b/libswscale/x86/scale.asm > @@ -153,3 +153,23 @@ cglobal hscale%1_sse2, 7, 7, 5 > > SCALE_16BITS_FUNC 9 > SCALE_16BITS_FUNC 10 > + > +;- > +; cut off last 4 bits of a line of pixels, and convert from 32 to 16 bits > +; > +; void scale19To15Fw_sse2(int16_t *dst, int32_t *src, int len); > +;- > + > +cglobal scale19To15Fw_sse2, 3, 4, 0 > + xor r3, r3 > +.loop: > + movdqu m0, [r1+r3*4+ 0] > + movdqu m1, [r1+r3*4+16] > + psrad m0, 4 > + psrad m1, 4 > + packssdw m0, m1 > + movdqu [r0+r3*2], m0 > + add r3, 4 > + cmp r3d, r2d > + jl .loop > + REP_RET This is going to be load-bottlenecked. Why can't we guarantee the inputs or outputs are aligned? Jason ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] swscale: implement scale19to15 in sse2.
>From 4.89k cycles to 1.23k cycles on a 1980-pixel-wide movie, i.e. ~4x speedup. --- libswscale/x86/scale.asm | 20 libswscale/x86/swscale_mmx.c |2 ++ 2 files changed, 22 insertions(+), 0 deletions(-) diff --git a/libswscale/x86/scale.asm b/libswscale/x86/scale.asm index e200801..c3a9a75 100644 --- a/libswscale/x86/scale.asm +++ b/libswscale/x86/scale.asm @@ -153,3 +153,23 @@ cglobal hscale%1_sse2, 7, 7, 5 SCALE_16BITS_FUNC 9 SCALE_16BITS_FUNC 10 + +;- +; cut off last 4 bits of a line of pixels, and convert from 32 to 16 bits +; +; void scale19To15Fw_sse2(int16_t *dst, int32_t *src, int len); +;- + +cglobal scale19To15Fw_sse2, 3, 4, 0 +xor r3, r3 +.loop: +movdqum0, [r1+r3*4+ 0] +movdqum1, [r1+r3*4+16] +psrad m0, 4 +psrad m1, 4 +packssdw m0, m1 +movdqu [r0+r3*2], m0 +add r3, 4 +cmp r3d, r2d +jl .loop +REP_RET diff --git a/libswscale/x86/swscale_mmx.c b/libswscale/x86/swscale_mmx.c index d9995cd..f3abeab 100644 --- a/libswscale/x86/swscale_mmx.c +++ b/libswscale/x86/swscale_mmx.c @@ -184,6 +184,7 @@ extern void ff_hscale9_sse2 (SwsContext *c, int16_t *_dst, int dstW, const uint8_t *_src, const int16_t *filter, const int16_t *filterPos, int filterSize); +extern void ff_scale19To15Fw_sse2(int16_t *dst, const int32_t *src, int len); void ff_sws_init_swScale_mmx(SwsContext *c) { @@ -206,6 +207,7 @@ void ff_sws_init_swScale_mmx(SwsContext *c) c->hScale = ff_hscale10_sse2; } } +c->scale19To15Fw = ff_scale19To15Fw_sse2; } #endif } -- 1.7.2.1 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/2] Move x86inc.asm to libavutil/.
On Sat, Jul 23, 2011 at 9:36 PM, Ronald S. Bultje wrote: > This allows using it in libswscale/ also. > --- > libavcodec/x86/ac3dsp.asm |2 +- > libavcodec/x86/dct32_sse.asm|2 +- > libavcodec/x86/deinterlace.asm |2 +- > libavcodec/x86/dsputil_yasm.asm |2 +- > libavcodec/x86/dsputilenc_yasm.asm |2 +- > libavcodec/x86/fft_mmx.asm |2 +- > libavcodec/x86/fmtconvert.asm |2 +- > libavcodec/x86/h264_chromamc.asm|2 +- > libavcodec/x86/h264_chromamc_10bit.asm |2 +- > libavcodec/x86/h264_deblock.asm |2 +- > libavcodec/x86/h264_deblock_10bit.asm |2 +- > libavcodec/x86/h264_idct.asm|2 +- > libavcodec/x86/h264_idct_10bit.asm |2 +- > libavcodec/x86/h264_intrapred.asm |2 +- > libavcodec/x86/h264_intrapred_10bit.asm |2 +- > libavcodec/x86/h264_qpel_10bit.asm |2 +- > libavcodec/x86/h264_weight.asm |2 +- > libavcodec/x86/h264_weight_10bit.asm|2 +- > libavcodec/x86/vc1dsp_yasm.asm |2 +- > libavcodec/x86/vp3dsp.asm |2 +- > libavcodec/x86/vp56dsp.asm |2 +- > libavcodec/x86/vp8dsp.asm |2 +- > libavcodec/x86/x86inc.asm | 905 > --- > libavutil/x86/x86inc.asm| 905 > +++ > 24 files changed, 927 insertions(+), 927 deletions(-) > delete mode 100644 libavcodec/x86/x86inc.asm > create mode 100644 libavutil/x86/x86inc.asm > Ok if you've tested to make sure nothing breaks. ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH 2/2] swscale: implement SSE2 version for 9/10-bit horizontal scaling.
Speed: ~340k cycles -> ~59k cycles for a 1920-pixel wide movie, i.e. around 5.8x faster. --- libswscale/Makefile |1 + libswscale/x86/scale.asm | 155 ++ libswscale/x86/swscale_mmx.c | 22 ++ 3 files changed, 178 insertions(+), 0 deletions(-) create mode 100644 libswscale/x86/scale.asm diff --git a/libswscale/Makefile b/libswscale/Makefile index 57e867a..88eb6e2 100644 --- a/libswscale/Makefile +++ b/libswscale/Makefile @@ -16,6 +16,7 @@ OBJS-$(HAVE_ALTIVEC) += ppc/swscale_altivec.o\ OBJS-$(HAVE_MMX) += x86/rgb2rgb.o\ x86/swscale_mmx.o\ x86/yuv2rgb_mmx.o +OBJS-$(HAVE_YASM) += x86/scale.o OBJS-$(HAVE_VIS) += sparc/yuv2rgb_vis.o TESTPROGS = colorspace swscale diff --git a/libswscale/x86/scale.asm b/libswscale/x86/scale.asm new file mode 100644 index 000..e200801 --- /dev/null +++ b/libswscale/x86/scale.asm @@ -0,0 +1,155 @@ +;** +;* x86-optimized horizontal line scaling functions +;* Copyright (c) 2011 Ronald S. Bultje +;* +;* This file is part of Libav. +;* +;* Libav is free software; you can redistribute it and/or +;* modify it under the terms of the GNU Lesser General Public +;* License as published by the Free Software Foundation; either +;* version 2.1 of the License, or (at your option) any later version. +;* +;* Libav is distributed in the hope that it will be useful, +;* but WITHOUT ANY WARRANTY; without even the implied warranty of +;* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +;* Lesser General Public License for more details. +;* +;* You should have received a copy of the GNU Lesser General Public +;* License along with Libav; if not, write to the Free Software +;* Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA +;** + +%include "libavutil/x86/x86inc.asm" + +SECTION_RODATA + +max_19bit: times 4 dd (1 << 19) - 1 + +SECTION .text + +;- +; 16-bit horizontal line scaling +; +; static void hscale_c(SwsContext *c, int32_t *_dst, +; int dstW, const uint16_t *_src, +; const int16_t *filter, +; const int16_t *filterPos, int filterSize) +;- + +%macro SCALE_16BITS_FUNC 1 +INIT_XMM +cglobal hscale%1_sse2, 7, 7, 5 +%ifdef ARCH_X86_64 +movsxdr2, r2d +%endif +movdqam2, [max_19bit] +cmp r6d, 4 +je .scale4 +cmp r6d, 8 +je .scale8 +jmp .scaleX + +.scale4: +lea r1, [r1+r2*4] +lea r4, [r4+r2*8] +lea r5, [r5+r2*2] +neg r2 +.loop4: +movsx r0, word [r5+r2*2+0] ; filterPos[0] +movsx r6, word [r5+r2*2+4] ; filterPos[2] +movq m0, [r3+r0*2] ; src[filterPos[0] + {0,1,2,3}] +movhpsm0, [r3+r6*2] ; src[filterPos[2] + {0,1,2,3}] +movsx r0, word [r5+r2*2+2] ; filterPos[1] +movsx r6, word [r5+r2*2+6] ; filterPos[3] +movq m3, [r3+r0*2] ; src[filterPos[1] + {0,1,2,3}] +movhpsm3, [r3+r6*2] ; src[filterPos[3] + {0,1,2,3}] +movq m1, [r4+r2*8+ 0] ; filter[{0,1,..,6,7}] +movhpsm1, [r4+r2*8+16] ; filter[{0,1,..,6,7}] +movq m4, [r4+r2*8+ 8] ; filter[{8,9,..,14,15}] +movhpsm4, [r4+r2*8+24] ; filter[{8,9,..,14,15}] +pmaddwd m0, m1 +pmaddwd m3, m4 +pshufdm1, m0, 00110001b +pshufdm4, m3, 1000b +paddd m0, m1 +paddd m3, m4 +pblendw m0, m3, 11001100b ; filter[{ 0, 1, 2, 3}]*src[filterPos[0]+{0,1,2,3}], +; filter[{ 4, 5, 6, 7}]*src[filterPos[1]+{0,1,2,3}], +; filter[{ 8, 9,10,11}]*src[filterPos[2]+{0,1,2,3}], +; filter[{12,13,14,15}]*src[filterPos[3]+{0,1,2,3}] +psrad m0, %1 - 5 +pminsdm0, m2 +movdqa [r1+r2*4], m0 +add r2, 4 +jl .loop4 +REP_RET + +.scale8: +shl r2, 1 ; this allows *16 (i.e. now *8) in lea instructions +lea r1, [r1+r2*2] +lea r4, [r4+r2*8] +lea r5, [r5+r2*1] +neg r2 +.loop8: +movsx r0, word [r5+r2*1+0] ; filterPos[0] +movsx r6, word [r5+r2*1+2] ; filterPos[1] +movdqum0, [r3+r0*2] ; src[filterPos[0] + {0,1,2,3,4,5,6,7}] +movdqum3, [r3+r6*2] ; src[filterPos[1] + {0,1,2,3,4,5,6,7
Re: [libav-devel] [RFC] Make Bink IDCT take 32-bit coefficients as input
On Sat, Jul 23, 2011 at 07:11:09PM +0100, Måns Rullgård wrote: > Kostya writes: > > >> >> > diff --git a/libavcodec/binkdsp.c b/libavcodec/binkdsp.c > >> >> > new file mode 100644 > >> >> > index 000..456d71d > >> >> > --- /dev/null > >> >> > +++ b/libavcodec/binkdsp.c > >> >> > >> >> [...] > >> >> > >> >> > +static void bink_clear_block_c(int32_t *block) > >> >> > +{ > >> >> > +memset(block, 0, sizeof(*block) * 64); > >> >> > +} > >> >> > + > >> >> > +static void bink_put_block_c(const int32_t *block, uint8_t *restrict > >> >> > pixels) > >> >> > +{ > >> >> > +int i; > >> >> > + > >> >> > +for (i = 0; i < 8; i++) { > >> >> > +pixels[0] = block[0]; > >> >> > +pixels[1] = block[1]; > >> >> > +pixels[2] = block[2]; > >> >> > +pixels[3] = block[3]; > >> >> > +pixels[4] = block[4]; > >> >> > +pixels[5] = block[5]; > >> >> > +pixels[6] = block[6]; > >> >> > +pixels[7] = block[7]; > >> >> > + > >> >> > +pixels += 8; > >> >> > +block += 8; > >> >> > +} > >> >> > +} > >> >> > >> >> These almost exist in dsputil... > >> > > >> > But not for such drastic situation. > >> > >> The 32-bit clear_block() actually exists exactly. It would be a shame > >> not to use it. > > > > It's a pity you have to select high bits_per_sample to use it and then it > > screws other functions like put/get_pixels* > > Enabling 32-bit "DCTELEM" with 8-bit pixels is a 2-line patch to remove > an #if. I disabled that variant since nothing needed it at the time. > Keeping the 16-bit version for add/put_pixels is harder. Ideas for a > clean solution welcome. One possibility is to use two DSPContexts, or > initialise a scratch context and pull the needed pointers out of it. But all of this are hacks :( ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] Make Bink IDCT take 32-bit coefficients as input
Kostya writes: >> >> > diff --git a/libavcodec/binkdsp.c b/libavcodec/binkdsp.c >> >> > new file mode 100644 >> >> > index 000..456d71d >> >> > --- /dev/null >> >> > +++ b/libavcodec/binkdsp.c >> >> >> >> [...] >> >> >> >> > +static void bink_clear_block_c(int32_t *block) >> >> > +{ >> >> > +memset(block, 0, sizeof(*block) * 64); >> >> > +} >> >> > + >> >> > +static void bink_put_block_c(const int32_t *block, uint8_t *restrict >> >> > pixels) >> >> > +{ >> >> > +int i; >> >> > + >> >> > +for (i = 0; i < 8; i++) { >> >> > +pixels[0] = block[0]; >> >> > +pixels[1] = block[1]; >> >> > +pixels[2] = block[2]; >> >> > +pixels[3] = block[3]; >> >> > +pixels[4] = block[4]; >> >> > +pixels[5] = block[5]; >> >> > +pixels[6] = block[6]; >> >> > +pixels[7] = block[7]; >> >> > + >> >> > +pixels += 8; >> >> > +block += 8; >> >> > +} >> >> > +} >> >> >> >> These almost exist in dsputil... >> > >> > But not for such drastic situation. >> >> The 32-bit clear_block() actually exists exactly. It would be a shame >> not to use it. > > It's a pity you have to select high bits_per_sample to use it and then it > screws other functions like put/get_pixels* Enabling 32-bit "DCTELEM" with 8-bit pixels is a 2-line patch to remove an #if. I disabled that variant since nothing needed it at the time. Keeping the 16-bit version for add/put_pixels is harder. Ideas for a clean solution welcome. One possibility is to use two DSPContexts, or initialise a scratch context and pull the needed pointers out of it. -- Måns Rullgård m...@mansr.com ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] Make Bink IDCT take 32-bit coefficients as input
On Sat, Jul 23, 2011 at 06:32:09PM +0100, Måns Rullgård wrote: > Kostya writes: > > > On Sat, Jul 23, 2011 at 03:42:42PM +0100, Måns Rullgård wrote: > >> Kostya writes: > >> > >> > As you may know, Bink IDCT is a very special one, it's defined to be > >> > integer-only (hence lossless and producing the same result on different > >> > platforms). Unfortunately, it needs more than 16-bit input on blocks with > >> > sharp edges (and still producing 8-bit output). Since bloating > >> > DSPContext with > >> > another set of functions is bad, here's an attempt to introduce a > >> > different > >> > context for Bink-specific functions. I don't like my patch much (it still > >> > relies on some of DSPContext stuff), please suggest the better way to > >> > fix it. > > [...] > >> > @@ -669,10 +671,10 @@ static int read_dct_coeffs(GetBitContext *gb, > >> > DCTELEM block[64], const uint8_t * > >> > > >> > quant = quant_matrices[quant_idx]; > >> > > >> > -block[0] = (block[0] * quant[0]) >> 11; > >> > +block[0] = ((int32_t)(block[0] * quant[0])) >> 11; > >> > for (i = 0; i < coef_count; i++) { > >> > int idx = coef_idx[i]; > >> > -block[scan[idx]] = (block[scan[idx]] * quant[idx]) >> 11; > >> > +block[scan[idx]] = ((int32_t)(block[scan[idx]] * quant[idx])) > >> > >> 11; > >> > } > >> > >> What are those casts meant to do? They have no effect. > > > > They do: quant is uint32_t, so at least GCC chooses to multiply result into > > uint32_t, previously it was masked by the fact it stored only low 16 bits of > > result anyway. > > The low 32 bits of a 32x32 multiplication do not depend on the sign of > the inputs. The signedness does affect the shift, of course. Could the > quant tables be made signed instead? That feels more natural since you > do want a signed result. Indeed, changed. > >> > diff --git a/libavcodec/binkdsp.c b/libavcodec/binkdsp.c > >> > new file mode 100644 > >> > index 000..456d71d > >> > --- /dev/null > >> > +++ b/libavcodec/binkdsp.c > >> > >> [...] > >> > >> > +static void bink_clear_block_c(int32_t *block) > >> > +{ > >> > +memset(block, 0, sizeof(*block) * 64); > >> > +} > >> > + > >> > +static void bink_put_block_c(const int32_t *block, uint8_t *restrict > >> > pixels) > >> > +{ > >> > +int i; > >> > + > >> > +for (i = 0; i < 8; i++) { > >> > +pixels[0] = block[0]; > >> > +pixels[1] = block[1]; > >> > +pixels[2] = block[2]; > >> > +pixels[3] = block[3]; > >> > +pixels[4] = block[4]; > >> > +pixels[5] = block[5]; > >> > +pixels[6] = block[6]; > >> > +pixels[7] = block[7]; > >> > + > >> > +pixels += 8; > >> > +block += 8; > >> > +} > >> > +} > >> > >> These almost exist in dsputil... > > > > But not for such drastic situation. > > The 32-bit clear_block() actually exists exactly. It would be a shame > not to use it. It's a pity you have to select high bits_per_sample to use it and then it screws other functions like put/get_pixels* > >> > diff --git a/libavcodec/dsputil.c b/libavcodec/dsputil.c > >> > index 09e58f4..be846d3 100644 > >> > --- a/libavcodec/dsputil.c > >> > +++ b/libavcodec/dsputil.c > >> > @@ -2895,9 +2895,6 @@ av_cold void dsputil_init(DSPContext* c, > >> > AVCodecContext *avctx) > >> > c->idct_put= ff_ea_idct_put_c; > >> > c->idct_permutation_type= FF_NO_IDCT_PERM; > >> > }else if(CONFIG_BINK_DECODER && avctx->idct_algo==FF_IDCT_BINK) > >> > { > >> > -c->idct = ff_bink_idct_c; > >> > -c->idct_add = ff_bink_idct_add_c; > >> > -c->idct_put = ff_bink_idct_put_c; > >> > c->idct_permutation_type = FF_NO_IDCT_PERM; > >> > }else{ //accurate/default > >> > >> That part can be removed entirely if you change the ff_init_scantable() > >> call in bink.c:decode_init() to not use dsp.idct_permuation but pass > >> FF_NO_IDCT_PERM directly (since that is the only one used). > > > > It's actually the whole table you need to pass there, but changed anyway. > > You could also drop that scantable permutation entirely since there is > only one bink idct implementation, and it doesn't need it. In the > unlikely event someone writes a simd version, we can put it back then. Done. >From 013886acce4b00396fa69671ca099e88dcb523ce Mon Sep 17 00:00:00 2001 From: Kostya Shishkov Date: Sat, 23 Jul 2011 15:46:35 +0200 Subject: [PATCH] Make Bink DCT take 32-bit input Since IDC transforming 32-bit input to 8-bit output is unusual and unpractical for most codecs, move Bink IDCT into separate context. Get rid of an additional permutation table while at it since SIMD support for DCT is unlikely to be implemented in foreseeable future. Quantisation tables also have to change type into signed one for proper quantisation of DCT coefficients. --- libavcodec/Makefile |2 +- libavcodec/bink.c | 59 ++- libavcodec/bin
Re: [libav-devel] [RFC] Make Bink IDCT take 32-bit coefficients as input
Kostya writes: > On Sat, Jul 23, 2011 at 03:42:42PM +0100, Måns Rullgård wrote: >> Kostya writes: >> >> > As you may know, Bink IDCT is a very special one, it's defined to be >> > integer-only (hence lossless and producing the same result on different >> > platforms). Unfortunately, it needs more than 16-bit input on blocks with >> > sharp edges (and still producing 8-bit output). Since bloating DSPContext >> > with >> > another set of functions is bad, here's an attempt to introduce a different >> > context for Bink-specific functions. I don't like my patch much (it still >> > relies on some of DSPContext stuff), please suggest the better way to fix >> > it. > [...] >> > @@ -669,10 +671,10 @@ static int read_dct_coeffs(GetBitContext *gb, >> > DCTELEM block[64], const uint8_t * >> > >> > quant = quant_matrices[quant_idx]; >> > >> > -block[0] = (block[0] * quant[0]) >> 11; >> > +block[0] = ((int32_t)(block[0] * quant[0])) >> 11; >> > for (i = 0; i < coef_count; i++) { >> > int idx = coef_idx[i]; >> > -block[scan[idx]] = (block[scan[idx]] * quant[idx]) >> 11; >> > +block[scan[idx]] = ((int32_t)(block[scan[idx]] * quant[idx])) >> >> > 11; >> > } >> >> What are those casts meant to do? They have no effect. > > They do: quant is uint32_t, so at least GCC chooses to multiply result into > uint32_t, previously it was masked by the fact it stored only low 16 bits of > result anyway. The low 32 bits of a 32x32 multiplication do not depend on the sign of the inputs. The signedness does affect the shift, of course. Could the quant tables be made signed instead? That feels more natural since you do want a signed result. >> > diff --git a/libavcodec/binkdsp.c b/libavcodec/binkdsp.c >> > new file mode 100644 >> > index 000..456d71d >> > --- /dev/null >> > +++ b/libavcodec/binkdsp.c >> >> [...] >> >> > +static void bink_clear_block_c(int32_t *block) >> > +{ >> > +memset(block, 0, sizeof(*block) * 64); >> > +} >> > + >> > +static void bink_put_block_c(const int32_t *block, uint8_t *restrict >> > pixels) >> > +{ >> > +int i; >> > + >> > +for (i = 0; i < 8; i++) { >> > +pixels[0] = block[0]; >> > +pixels[1] = block[1]; >> > +pixels[2] = block[2]; >> > +pixels[3] = block[3]; >> > +pixels[4] = block[4]; >> > +pixels[5] = block[5]; >> > +pixels[6] = block[6]; >> > +pixels[7] = block[7]; >> > + >> > +pixels += 8; >> > +block += 8; >> > +} >> > +} >> >> These almost exist in dsputil... > > But not for such drastic situation. The 32-bit clear_block() actually exists exactly. It would be a shame not to use it. >> > diff --git a/libavcodec/dsputil.c b/libavcodec/dsputil.c >> > index 09e58f4..be846d3 100644 >> > --- a/libavcodec/dsputil.c >> > +++ b/libavcodec/dsputil.c >> > @@ -2895,9 +2895,6 @@ av_cold void dsputil_init(DSPContext* c, >> > AVCodecContext *avctx) >> > c->idct_put= ff_ea_idct_put_c; >> > c->idct_permutation_type= FF_NO_IDCT_PERM; >> > }else if(CONFIG_BINK_DECODER && avctx->idct_algo==FF_IDCT_BINK) { >> > -c->idct = ff_bink_idct_c; >> > -c->idct_add = ff_bink_idct_add_c; >> > -c->idct_put = ff_bink_idct_put_c; >> > c->idct_permutation_type = FF_NO_IDCT_PERM; >> > }else{ //accurate/default >> >> That part can be removed entirely if you change the ff_init_scantable() >> call in bink.c:decode_init() to not use dsp.idct_permuation but pass >> FF_NO_IDCT_PERM directly (since that is the only one used). > > It's actually the whole table you need to pass there, but changed anyway. You could also drop that scantable permutation entirely since there is only one bink idct implementation, and it doesn't need it. In the unlikely event someone writes a simd version, we can put it back then. -- Måns Rullgård m...@mansr.com ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] Make Bink IDCT take 32-bit coefficients as input
On Sat, Jul 23, 2011 at 03:42:42PM +0100, Måns Rullgård wrote: > Kostya writes: > > > As you may know, Bink IDCT is a very special one, it's defined to be > > integer-only (hence lossless and producing the same result on different > > platforms). Unfortunately, it needs more than 16-bit input on blocks with > > sharp edges (and still producing 8-bit output). Since bloating DSPContext > > with > > another set of functions is bad, here's an attempt to introduce a different > > context for Bink-specific functions. I don't like my patch much (it still > > relies on some of DSPContext stuff), please suggest the better way to fix > > it. [...] > > @@ -669,10 +671,10 @@ static int read_dct_coeffs(GetBitContext *gb, DCTELEM > > block[64], const uint8_t * > > > > quant = quant_matrices[quant_idx]; > > > > -block[0] = (block[0] * quant[0]) >> 11; > > +block[0] = ((int32_t)(block[0] * quant[0])) >> 11; > > for (i = 0; i < coef_count; i++) { > > int idx = coef_idx[i]; > > -block[scan[idx]] = (block[scan[idx]] * quant[idx]) >> 11; > > +block[scan[idx]] = ((int32_t)(block[scan[idx]] * quant[idx])) >> > > 11; > > } > > What are those casts meant to do? They have no effect. They do: quant is uint32_t, so at least GCC chooses to multiply result into uint32_t, previously it was masked by the fact it stored only low 16 bits of result anyway. > > diff --git a/libavcodec/binkdsp.c b/libavcodec/binkdsp.c > > new file mode 100644 > > index 000..456d71d > > --- /dev/null > > +++ b/libavcodec/binkdsp.c > > [...] > > > +static void bink_clear_block_c(int32_t *block) > > +{ > > +memset(block, 0, sizeof(*block) * 64); > > +} > > + > > +static void bink_put_block_c(const int32_t *block, uint8_t *restrict > > pixels) > > +{ > > +int i; > > + > > +for (i = 0; i < 8; i++) { > > +pixels[0] = block[0]; > > +pixels[1] = block[1]; > > +pixels[2] = block[2]; > > +pixels[3] = block[3]; > > +pixels[4] = block[4]; > > +pixels[5] = block[5]; > > +pixels[6] = block[6]; > > +pixels[7] = block[7]; > > + > > +pixels += 8; > > +block += 8; > > +} > > +} > > These almost exist in dsputil... But not for such drastic situation. > > diff --git a/libavcodec/dsputil.c b/libavcodec/dsputil.c > > index 09e58f4..be846d3 100644 > > --- a/libavcodec/dsputil.c > > +++ b/libavcodec/dsputil.c > > @@ -2895,9 +2895,6 @@ av_cold void dsputil_init(DSPContext* c, > > AVCodecContext *avctx) > > c->idct_put= ff_ea_idct_put_c; > > c->idct_permutation_type= FF_NO_IDCT_PERM; > > }else if(CONFIG_BINK_DECODER && avctx->idct_algo==FF_IDCT_BINK) { > > -c->idct = ff_bink_idct_c; > > -c->idct_add = ff_bink_idct_add_c; > > -c->idct_put = ff_bink_idct_put_c; > > c->idct_permutation_type = FF_NO_IDCT_PERM; > > }else{ //accurate/default > > That part can be removed entirely if you change the ff_init_scantable() > call in bink.c:decode_init() to not use dsp.idct_permuation but pass > FF_NO_IDCT_PERM directly (since that is the only one used). It's actually the whole table you need to pass there, but changed anyway. >From 5b103ed5f6e60aae2b743b708a8c57dbcd1a8d45 Mon Sep 17 00:00:00 2001 From: Kostya Shishkov Date: Sat, 23 Jul 2011 15:46:35 +0200 Subject: [PATCH] Make Bink DCT take 32-bit input --- libavcodec/Makefile |2 +- libavcodec/bink.c | 55 ++ libavcodec/binkdsp.c | 151 + libavcodec/binkdsp.h | 43 ++ libavcodec/binkidct.c | 112 libavcodec/dsputil.c |5 -- 6 files changed, 225 insertions(+), 143 deletions(-) create mode 100644 libavcodec/binkdsp.c create mode 100644 libavcodec/binkdsp.h delete mode 100644 libavcodec/binkidct.c diff --git a/libavcodec/Makefile b/libavcodec/Makefile index 99ecbbf..36e07a9 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -89,7 +89,7 @@ OBJS-$(CONFIG_AURA2_DECODER) += aura.o OBJS-$(CONFIG_AVS_DECODER) += avs.o OBJS-$(CONFIG_BETHSOFTVID_DECODER) += bethsoftvideo.o OBJS-$(CONFIG_BFI_DECODER) += bfi.o -OBJS-$(CONFIG_BINK_DECODER)+= bink.o binkidct.o +OBJS-$(CONFIG_BINK_DECODER)+= bink.o binkdsp.o OBJS-$(CONFIG_BINKAUDIO_DCT_DECODER) += binkaudio.o wma.o OBJS-$(CONFIG_BINKAUDIO_RDFT_DECODER) += binkaudio.o wma.o OBJS-$(CONFIG_BMP_DECODER) += bmp.o msrledec.o diff --git a/libavcodec/bink.c b/libavcodec/bink.c index e085aa5..73fbe37 100644 --- a/libavcodec/bink.c +++ b/libavcodec/bink.c @@ -24,6 +24,7 @@ #include "avcodec.h" #include "dsputil.h" #include "binkdata.h" +#include "binkdsp.h" #include "mathops.h" #define ALT_BITSTREAM_READER_LE @@ -109,6 +110,7 @@ typedef struct Bundle { typedef struct BinkContext { AVCodecContext *avctx; DS
Re: [libav-devel] [PATCH] jpegdec: actually search for and parse RSTn
On Sat, Jul 23, 2011 at 05:14:29PM +0200, Reinhard Tartler wrote: > On Sat, Jul 23, 2011 at 14:44:36 (CEST), Kostya wrote: > > > On Fri, Jul 22, 2011 at 07:53:44PM +0200, Reinhard Tartler wrote: > >> From: Michael Niedermayer > >> > >> Fixes http://ffmpeg.org/trac/ffmpeg/ticket/267 > > > > Please replace it with a description of a problem (from that ticket, for > > example). Our project is not high-quality and thus still demands sensible > > commit messages. > > Okay, what about this message then: > > jpegdec: actually search for and parse RSTn > > Fixes decoding of MJPEG files produced by some UVC Logitec web cameras, > such as "Notebook Pro" and "HD C910". > > References: > http://trac.videolan.org/vlc/ticket/4215 > http://ffmpeg.org/trac/ffmpeg/ticket/267 > > Signed-off-by: Michael Niedermayer > (cherry picked from commit 7b8ed831eb8432d202dad16dedc1758b018bb1fa) Yes, looks much better. ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] jpegdec: actually search for and parse RSTn
On Sat, Jul 23, 2011 at 14:44:36 (CEST), Kostya wrote: > On Fri, Jul 22, 2011 at 07:53:44PM +0200, Reinhard Tartler wrote: >> From: Michael Niedermayer >> >> Fixes http://ffmpeg.org/trac/ffmpeg/ticket/267 > > Please replace it with a description of a problem (from that ticket, for > example). Our project is not high-quality and thus still demands sensible > commit messages. Okay, what about this message then: jpegdec: actually search for and parse RSTn Fixes decoding of MJPEG files produced by some UVC Logitec web cameras, such as "Notebook Pro" and "HD C910". References: http://trac.videolan.org/vlc/ticket/4215 http://ffmpeg.org/trac/ffmpeg/ticket/267 Signed-off-by: Michael Niedermayer (cherry picked from commit 7b8ed831eb8432d202dad16dedc1758b018bb1fa) >> --- >> libavcodec/mjpegdec.c |7 +-- >> 1 files changed, 5 insertions(+), 2 deletions(-) >> >> diff --git a/libavcodec/mjpegdec.c b/libavcodec/mjpegdec.c >> index b28fdf8..a1588df 100644 >> --- a/libavcodec/mjpegdec.c >> +++ b/libavcodec/mjpegdec.c >> @@ -879,9 +879,12 @@ static int mjpeg_decode_scan(MJpegDecodeContext *s, int >> nb_components, int Ah, i >> } >> } >> >> -if (s->restart_interval && !--s->restart_count) { >> +if (s->restart_interval && show_bits(&s->gb, 8) == 0xFF){/* >> skip RSTn */ > a space wouldn't hurt here ^ Will add it. >> +--s->restart_count; >> align_get_bits(&s->gb); >> -skip_bits(&s->gb, 16); /* skip RSTn */ >> +while(show_bits(&s->gb, 8) == 0xFF) >> +skip_bits(&s->gb, 8); >> +skip_bits(&s->gb, 8); >> for (i=0; i> s->last_dc[i] = 1024; >> } >> -- > > After looking at the spec the patch seems passable, Appendix E of the JPEG > spec says "Note that optional X’FF’ fill bytes which may precede any marker > shall be discarded before determining which marker is present." > ___ > libav-devel mailing list > libav-devel@libav.org > https://lists.libav.org/mailman/listinfo/libav-devel -- Gruesse/greetings, Reinhard Tartler, KeyID 945348A4 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] Make Bink IDCT take 32-bit coefficients as input
Kostya writes: > As you may know, Bink IDCT is a very special one, it's defined to be > integer-only (hence lossless and producing the same result on different > platforms). Unfortunately, it needs more than 16-bit input on blocks with > sharp edges (and still producing 8-bit output). Since bloating DSPContext with > another set of functions is bad, here's an attempt to introduce a different > context for Bink-specific functions. I don't like my patch much (it still > relies on some of DSPContext stuff), please suggest the better way to fix it. > > diff --git a/libavcodec/bink.c b/libavcodec/bink.c > index e085aa5..25ee63a 100644 > --- a/libavcodec/bink.c > +++ b/libavcodec/bink.c > @@ -24,6 +24,7 @@ > #include "avcodec.h" > #include "dsputil.h" > #include "binkdata.h" > +#include "binkdsp.h" > #include "mathops.h" > > #define ALT_BITSTREAM_READER_LE > @@ -109,6 +110,7 @@ typedef struct Bundle { > typedef struct BinkContext { > AVCodecContext *avctx; > DSPContext dsp; > +BinkDSPContext bdsp; > AVFramepic, last; > intversion; ///< internal Bink file version > inthas_alpha; > @@ -580,7 +582,7 @@ static inline int binkb_get_value(BinkContext *c, int > bundle_num) > * @param quant_matrices quantization matrices > * @return 0 for success, negative value in other cases > */ > -static int read_dct_coeffs(GetBitContext *gb, DCTELEM block[64], const > uint8_t *scan, > +static int read_dct_coeffs(GetBitContext *gb, int32_t block[64], const > uint8_t *scan, > const uint32_t quant_matrices[16][64], int q) > { > int coef_list[128]; > @@ -669,10 +671,10 @@ static int read_dct_coeffs(GetBitContext *gb, DCTELEM > block[64], const uint8_t * > > quant = quant_matrices[quant_idx]; > > -block[0] = (block[0] * quant[0]) >> 11; > +block[0] = ((int32_t)(block[0] * quant[0])) >> 11; > for (i = 0; i < coef_count; i++) { > int idx = coef_idx[i]; > -block[scan[idx]] = (block[scan[idx]] * quant[idx]) >> 11; > +block[scan[idx]] = ((int32_t)(block[scan[idx]] * quant[idx])) >> 11; > } What are those casts meant to do? They have no effect. > diff --git a/libavcodec/binkdsp.c b/libavcodec/binkdsp.c > new file mode 100644 > index 000..456d71d > --- /dev/null > +++ b/libavcodec/binkdsp.c [...] > +static void bink_clear_block_c(int32_t *block) > +{ > +memset(block, 0, sizeof(*block) * 64); > +} > + > +static void bink_put_block_c(const int32_t *block, uint8_t *restrict pixels) > +{ > +int i; > + > +for (i = 0; i < 8; i++) { > +pixels[0] = block[0]; > +pixels[1] = block[1]; > +pixels[2] = block[2]; > +pixels[3] = block[3]; > +pixels[4] = block[4]; > +pixels[5] = block[5]; > +pixels[6] = block[6]; > +pixels[7] = block[7]; > + > +pixels += 8; > +block += 8; > +} > +} These almost exist in dsputil... > diff --git a/libavcodec/dsputil.c b/libavcodec/dsputil.c > index 09e58f4..be846d3 100644 > --- a/libavcodec/dsputil.c > +++ b/libavcodec/dsputil.c > @@ -2895,9 +2895,6 @@ av_cold void dsputil_init(DSPContext* c, AVCodecContext > *avctx) > c->idct_put= ff_ea_idct_put_c; > c->idct_permutation_type= FF_NO_IDCT_PERM; > }else if(CONFIG_BINK_DECODER && avctx->idct_algo==FF_IDCT_BINK) { > -c->idct = ff_bink_idct_c; > -c->idct_add = ff_bink_idct_add_c; > -c->idct_put = ff_bink_idct_put_c; > c->idct_permutation_type = FF_NO_IDCT_PERM; > }else{ //accurate/default That part can be removed entirely if you change the ff_init_scantable() call in bink.c:decode_init() to not use dsp.idct_permuation but pass FF_NO_IDCT_PERM directly (since that is the only one used). -- Måns Rullgård m...@mansr.com ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [RFC] Make Bink IDCT take 32-bit coefficients as input
As you may know, Bink IDCT is a very special one, it's defined to be integer-only (hence lossless and producing the same result on different platforms). Unfortunately, it needs more than 16-bit input on blocks with sharp edges (and still producing 8-bit output). Since bloating DSPContext with another set of functions is bad, here's an attempt to introduce a different context for Bink-specific functions. I don't like my patch much (it still relies on some of DSPContext stuff), please suggest the better way to fix it. >From f341e08d28637aef990f1d5a86e395f955660304 Mon Sep 17 00:00:00 2001 From: Kostya Shishkov Date: Sat, 23 Jul 2011 15:46:35 +0200 Subject: [PATCH] Make Bink DCT take 32-bit input --- libavcodec/Makefile |2 +- libavcodec/bink.c | 53 ++ libavcodec/binkdsp.c | 146 + libavcodec/binkdsp.h | 42 ++ libavcodec/binkidct.c | 112 - libavcodec/dsputil.c |3 - 6 files changed, 218 insertions(+), 140 deletions(-) create mode 100644 libavcodec/binkdsp.c create mode 100644 libavcodec/binkdsp.h delete mode 100644 libavcodec/binkidct.c diff --git a/libavcodec/Makefile b/libavcodec/Makefile index 99ecbbf..36e07a9 100644 --- a/libavcodec/Makefile +++ b/libavcodec/Makefile @@ -89,7 +89,7 @@ OBJS-$(CONFIG_AURA2_DECODER) += aura.o OBJS-$(CONFIG_AVS_DECODER) += avs.o OBJS-$(CONFIG_BETHSOFTVID_DECODER) += bethsoftvideo.o OBJS-$(CONFIG_BFI_DECODER) += bfi.o -OBJS-$(CONFIG_BINK_DECODER)+= bink.o binkidct.o +OBJS-$(CONFIG_BINK_DECODER)+= bink.o binkdsp.o OBJS-$(CONFIG_BINKAUDIO_DCT_DECODER) += binkaudio.o wma.o OBJS-$(CONFIG_BINKAUDIO_RDFT_DECODER) += binkaudio.o wma.o OBJS-$(CONFIG_BMP_DECODER) += bmp.o msrledec.o diff --git a/libavcodec/bink.c b/libavcodec/bink.c index e085aa5..25ee63a 100644 --- a/libavcodec/bink.c +++ b/libavcodec/bink.c @@ -24,6 +24,7 @@ #include "avcodec.h" #include "dsputil.h" #include "binkdata.h" +#include "binkdsp.h" #include "mathops.h" #define ALT_BITSTREAM_READER_LE @@ -109,6 +110,7 @@ typedef struct Bundle { typedef struct BinkContext { AVCodecContext *avctx; DSPContext dsp; +BinkDSPContext bdsp; AVFramepic, last; intversion; ///< internal Bink file version inthas_alpha; @@ -580,7 +582,7 @@ static inline int binkb_get_value(BinkContext *c, int bundle_num) * @param quant_matrices quantization matrices * @return 0 for success, negative value in other cases */ -static int read_dct_coeffs(GetBitContext *gb, DCTELEM block[64], const uint8_t *scan, +static int read_dct_coeffs(GetBitContext *gb, int32_t block[64], const uint8_t *scan, const uint32_t quant_matrices[16][64], int q) { int coef_list[128]; @@ -669,10 +671,10 @@ static int read_dct_coeffs(GetBitContext *gb, DCTELEM block[64], const uint8_t * quant = quant_matrices[quant_idx]; -block[0] = (block[0] * quant[0]) >> 11; +block[0] = ((int32_t)(block[0] * quant[0])) >> 11; for (i = 0; i < coef_count; i++) { int idx = coef_idx[i]; -block[scan[idx]] = (block[scan[idx]] * quant[idx]) >> 11; +block[scan[idx]] = ((int32_t)(block[scan[idx]] * quant[idx])) >> 11; } return 0; @@ -791,6 +793,7 @@ static int binkb_decode_plane(BinkContext *c, GetBitContext *gb, int plane_idx, const uint8_t *scan; int xoff, yoff; LOCAL_ALIGNED_16(DCTELEM, block, [64]); +LOCAL_ALIGNED_16(int32_t, dctblock, [64]); int coordmap[64]; int ybias = is_key ? -15 : 0; int qp; @@ -845,11 +848,11 @@ static int binkb_decode_plane(BinkContext *c, GetBitContext *gb, int plane_idx, dst[coordmap[*scan++]] = binkb_get_value(c, BINKB_SRC_COLORS); break; case 2: -c->dsp.clear_block(block); -block[0] = binkb_get_value(c, BINKB_SRC_INTRA_DC); +c->bdsp.clear_block(dctblock); +dctblock[0] = binkb_get_value(c, BINKB_SRC_INTRA_DC); qp = binkb_get_value(c, BINKB_SRC_INTRA_Q); -read_dct_coeffs(gb, block, c->scantable.permutated, binkb_intra_quant, qp); -c->dsp.idct_put(dst, stride, block); +read_dct_coeffs(gb, dctblock, c->scantable.permutated, binkb_intra_quant, qp); +c->bdsp.idct_put(dst, stride, dctblock); break; case 3: xoff = binkb_get_value(c, BINKB_SRC_X_OFF); @@ -878,11 +881,11 @@ static int binkb_decode_plane(BinkContext *c, GetBitContext *gb, int plane_idx, } else { put_pixels8x8_overlapped(dst, ref, stride); } -c->dsp.clear_block(block); -block[0] = binkb_get_value(c, BINKB_SRC_INTER_DC); +c->bdsp.clear_b
Re: [libav-devel] 10-bit H.264 Decoding
2011/7/23 Daniel Kang > I've been working on optimizing 10-bit H.264 decoding > on x86/x64 for the past months. Before I started, 10-bit > H.264 HD content was unplayable in real-time, but now, > with a recent CPU, it is. > Overall improvements are about 53%. With threading, > about 61% (with respect to pre-asm with the same > number of threads). Thanks a lot for your work - and well done!! :-) Best regards, Mathias. ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] parseutils: add av_parse_number() function and children, and use them in the ff* tools
On date Wednesday 2011-06-15 00:30:32 +0200, Stefano Sabatini encoded: > On date Wednesday 2011-06-15 00:04:21 +0200, Stefano Sabatini encoded: > > On date Sunday 2011-06-05 00:40:57 +0200, Stefano Sabatini encoded: > > > On date Saturday 2011-05-28 19:28:17 +0200, Stefano Sabatini encoded: > > [...] > > > > With number parsing it is more complicated, the failure may depend > > > > on an invalid range, on a typo, or on the parsed number not being an > > > > integer, discovering the reason of the failure requires duplicating > > > > the logic in the calling code, if you don't report the failure > > > > reason you have a poor feedback (and I don't want to duplicate the > > > > error reporting every time I use that function, check ffmpeg for an > > > > example). > > > > > > > > > Developers can use av_dlog() messages with DEBUG defined. So there is > > > > > something to say for it. Whatever you guys prefer is fine with me, I'd > > > > > have a slight preference for the part where you have a simpler (so 4- > > > > > instead of 6-argument API), but don't care much. > > > > > > > > I believe the two parameters are a modest price for the added > > > > flexibility/functionality. > > > > > > Ping and updated patch (Note for the committer: update libavutil minor > > > number and APIchanges number and date). > > > > Ping and updated patch. > > With a minor fix. > From 414530679acbd9845c940dcfdf5dcbf32472a25a Mon Sep 17 00:00:00 2001 > From: Stefano Sabatini > Date: Sat, 28 May 2011 01:20:27 +0200 > Subject: [PATCH] parseutils: add av_parse_number() function and children > > Also use them in the ff* tools in place of parse_number_or_die(). Updated with some factorization. Also av_parse_number now returns AVERROR(ERANGE) in case of out-of-range value. >From 50a8feae9ce59dbed308c86d4a71ff4f7b9e4f23 Mon Sep 17 00:00:00 2001 From: Stefano Sabatini Date: Sat, 28 May 2011 01:20:27 +0200 Subject: [PATCH] parseutils: add av_parse_number() function and children Also use them in the ff* tools in place of parse_number_or_die(). --- cmdutils.c | 45 +++--- cmdutils.h | 15 --- doc/APIchanges |9 +++ ffmpeg.c | 30 +-- ffplay.c | 15 --- libavutil/parseutils.c | 62 libavutil/parseutils.h | 29 ++ 7 files changed, 132 insertions(+), 73 deletions(-) diff --git a/cmdutils.c b/cmdutils.c index 19c5d72..23baa82 100644 --- a/cmdutils.c +++ b/cmdutils.c @@ -87,25 +87,6 @@ void log_callback_help(void* ptr, int level, const char* fmt, va_list vl) vfprintf(stdout, fmt, vl); } -double parse_number_or_die(const char *context, const char *numstr, int type, double min, double max) -{ -char *tail; -const char *error; -double d = av_strtod(numstr, &tail); -if (*tail) -error= "Expected number for %s but found: %s\n"; -else if (d < min || d > max) -error= "The value for %s was %s which is not within %f - %f\n"; -else if(type == OPT_INT64 && (int64_t)d != d) -error= "Expected int64 for %s but found %s\n"; -else if (type == OPT_INT && (int)d != d) -error= "Expected int for %s but found %s\n"; -else -return d; -fprintf(stderr, error, context, numstr, min, max); -exit(1); -} - int64_t parse_time_or_die(const char *context, const char *timestr, int is_duration) { int64_t us; @@ -214,7 +195,7 @@ void parse_options(int argc, char **argv, const OptionDef *options, void (* parse_arg_function)(const char*)) { const char *opt, *arg; -int optindex, handleoptions=1; +int ret = 0, optindex, handleoptions=1; const OptionDef *po; /* perform system-dependent conversions for arguments list */ @@ -262,16 +243,17 @@ unknown_opt: } else if (po->flags & OPT_BOOL) { *po->u.int_arg = bool_val; } else if (po->flags & OPT_INT) { -*po->u.int_arg = parse_number_or_die(opt, arg, OPT_INT64, INT_MIN, INT_MAX); +ret = av_parse_int(po->u.int_arg, arg, INT_MIN, INT_MAX, 0, NULL); } else if (po->flags & OPT_INT64) { -*po->u.int64_arg = parse_number_or_die(opt, arg, OPT_INT64, INT64_MIN, INT64_MAX); +ret = av_parse_int64(po->u.int64_arg, arg, INT64_MIN, INT64_MAX, 0, NULL); } else if (po->flags & OPT_FLOAT) { -*po->u.float_arg = parse_number_or_die(opt, arg, OPT_FLOAT, -INFINITY, INFINITY); -} else if (po->u.func_arg) { -if (po->u.func_arg(opt, arg) < 0) { -fprintf(stderr, "%s: failed to set value '%s' for option '%s'\n", argv[0], arg, opt); -exit(1); -} +ret = av_parse_float(po->u.float_arg, arg, -INFINITY, INFINITY, 0, NULL); +} else if (po->u.func_arg && po->u.fu
Re: [libav-devel] [PATCH] jpegdec: actually search for and parse RSTn
On Fri, Jul 22, 2011 at 07:53:44PM +0200, Reinhard Tartler wrote: > From: Michael Niedermayer > > Fixes http://ffmpeg.org/trac/ffmpeg/ticket/267 Please replace it with a description of a problem (from that ticket, for example). Our project is not high-quality and thus still demands sensible commit messages. > Signed-off-by: Michael Niedermayer > > (cherry picked from commit 7b8ed831eb8432d202dad16dedc1758b018bb1fa) > --- > libavcodec/mjpegdec.c |7 +-- > 1 files changed, 5 insertions(+), 2 deletions(-) > > diff --git a/libavcodec/mjpegdec.c b/libavcodec/mjpegdec.c > index b28fdf8..a1588df 100644 > --- a/libavcodec/mjpegdec.c > +++ b/libavcodec/mjpegdec.c > @@ -879,9 +879,12 @@ static int mjpeg_decode_scan(MJpegDecodeContext *s, int > nb_components, int Ah, i > } > } > > -if (s->restart_interval && !--s->restart_count) { > +if (s->restart_interval && show_bits(&s->gb, 8) == 0xFF){/* skip > RSTn */ a space wouldn't hurt here ^ > +--s->restart_count; > align_get_bits(&s->gb); > -skip_bits(&s->gb, 16); /* skip RSTn */ > +while(show_bits(&s->gb, 8) == 0xFF) > +skip_bits(&s->gb, 8); > +skip_bits(&s->gb, 8); > for (i=0; i s->last_dc[i] = 1024; > } > -- After looking at the spec the patch seems passable, Appendix E of the JPEG spec says "Note that optional X’FF’ fill bytes which may precede any marker shall be discarded before determining which marker is present." ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] 10-bit H.264 Decoding
I've been working on optimizing 10-bit H.264 decoding on x86/x64 for the past months. Before I started, 10-bit H.264 HD content was unplayable in real-time, but now, with a recent CPU, it is. Overall improvements are about 53%. With threading, about 61% (with respect to pre-asm with the same number of threads). On a recent laptop computer, normal 1080p content can be decoded in real time (with threading). With a high end computer, 1080p50 can be decoded in real time (with just two threads). The actual timing data and patch I used are attached. CPU: Intel(R) Xeon(R) CPU E31270 @ 3.40GHz Percent improvements are taken with respect to the before with the same number of threads. Real was used in all cases for standarization. ~BEFORE~ time ./ffmpeg -vsync 0 -i ../park_joy_paritions_all.264 -v 2 -an -f rawvideo -y /dev/null real0m20.426s user0m18.440s sys 0m1.950s real0m20.393s user0m18.420s sys 0m1.940s real0m20.387s user0m18.410s sys 0m1.940s 20.402 time ./ffmpeg -thread_type 1 -threads 2 -vsync 0 -i ../park_joy_paritions_all.264 -v 2 -an -f rawvideo -y /dev/null real0m12.813s user0m19.600s sys 0m2.290s real0m12.387s user0m19.030s sys 0m2.130s real0m12.386s user0m18.880s sys 0m2.230s 12.529 time ./ffmpeg -thread_type 1 -threads 4 -vsync 0 -i ../park_joy_paritions_all.264 -v 2 -an -f rawvideo -y /dev/null real0m7.050s user0m21.780s sys 0m2.640s real0m7.430s user0m21.290s sys 0m2.750s real0m7.490s user0m21.390s sys 0m2.600s 7.323 time ./ffmpeg -thread_type 1 -threads 6 -vsync 0 -i ../park_joy_paritions_all.264 -v 2 -an -f rawvideo -y /dev/null real0m6.058s user0m25.980s sys 0m2.840s real0m5.922s user0m26.450s sys 0m2.640s real0m6.081s user0m26.550s sys 0m2.780s 6.020 ~AFTER~ time ./ffmpeg -vsync 0 -i ../park_joy_paritions_all.264 -v 2 -an -f rawvideo -y /dev/null real0m13.311s user0m11.450s sys 0m1.840s real0m13.307s user0m11.520s sys 0m1.760s real0m13.333s user0m11.280s sys 0m2.030s 13.317 -> 53.203% speedup time ./ffmpeg -thread_type 1 -threads 2 -vsync 0 -i ../park_joy_paritions_all.264 -v 2 -an -f rawvideo -y /dev/null real0m8.199s user0m11.640s sys 0m1.930s real0m8.084s user0m11.420s sys 0m1.940s real0m8.297s user0m11.440s sys 0m2.320s 8.193 -> 52.913% speedup time ./ffmpeg -thread_type 1 -threads 4 -vsync 0 -i ../park_joy_paritions_all.264 -v 2 -an -f rawvideo -y /dev/null real0m4.579s user0m13.420s sys 0m2.190s real0m4.527s user0m13.480s sys 0m2.110s real0m4.552s user0m13.370s sys 0m2.250s 4.553 -> 60.858% speedup time ./ffmpeg -thread_type 1 -threads 6 -vsync 0 -i ../park_joy_paritions_all.264 -v 2 -an -f rawvideo -y /dev/null real0m4.420s user0m15.120s sys 0m2.370s real0m4.449s user0m14.640s sys 0m2.500s real0m4.754s user0m14.500s sys 0m2.660s 4.541 -> 32.577% speedup patch.diff Description: Binary data ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] cmdutils: add codec_opts parameter to setup_find_stream_info_opts()
On date Sunday 2011-07-17 16:19:28 +0200, Stefano Sabatini encoded: > Avoid brittle and obfuscating reference to a global. > --- > cmdutils.c |2 +- > cmdutils.h |2 +- > ffmpeg.c |2 +- > ffplay.c |2 +- > 4 files changed, 4 insertions(+), 4 deletions(-) Updated. >From ce8b65229980cfb09230f5fde6a0d2427bf6427d Mon Sep 17 00:00:00 2001 From: Stefano Sabatini Date: Sun, 17 Jul 2011 01:32:27 +0200 Subject: [PATCH] cmdutils: add codec_opts parameter to setup_find_stream_info_opts() Avoid brittle and obfuscating reference to a global. --- cmdutils.c |3 ++- cmdutils.h |4 +++- ffmpeg.c |2 +- ffplay.c |2 +- 4 files changed, 7 insertions(+), 4 deletions(-) diff --git a/cmdutils.c b/cmdutils.c index 9278371..6fbfa25 100644 --- a/cmdutils.c +++ b/cmdutils.c @@ -806,7 +806,8 @@ AVDictionary *filter_codec_opts(AVDictionary *opts, enum CodecID codec_id, int e return ret; } -AVDictionary **setup_find_stream_info_opts(AVFormatContext *s) +AVDictionary **setup_find_stream_info_opts(AVFormatContext *s, + const AVDictionary *codec_opts) { int i; AVDictionary **opts; diff --git a/cmdutils.h b/cmdutils.h index 90d02e6..993186f 100644 --- a/cmdutils.h +++ b/cmdutils.h @@ -170,10 +170,12 @@ AVDictionary *filter_codec_opts(AVDictionary *opts, enum CodecID codec_id, int e * be applied to the corresponding stream codec context. * * @param s pointer to a format context + * @param codec_opts a dictionary containing options * @return pointer to the created array of dictionaries, NULL if it * cannot be created */ -AVDictionary **setup_find_stream_info_opts(AVFormatContext *s); +AVDictionary **setup_find_stream_info_opts(AVFormatContext *s, + const AVDictionary *codec_opts); /** * Print an error message to stderr, indicating filename and a human diff --git a/ffmpeg.c b/ffmpeg.c index 776cfab..c76aeac 100644 --- a/ffmpeg.c +++ b/ffmpeg.c @@ -3252,7 +3252,7 @@ static int opt_input_file(const char *opt, const char *filename) } /* Set AVCodecContext options for avformat_find_stream_info */ -opts = setup_find_stream_info_opts(ic); +opts = setup_find_stream_info_opts(ic, codec_opts); orig_nb_streams = ic->nb_streams; /* If not enough info to get the stream parameters, we decode the diff --git a/ffplay.c b/ffplay.c index 77c9d4b..bef75fe 100644 --- a/ffplay.c +++ b/ffplay.c @@ -2343,7 +2343,7 @@ static int decode_thread(void *arg) if(genpts) ic->flags |= AVFMT_FLAG_GENPTS; -opts = setup_find_stream_info_opts(ic); +opts = setup_find_stream_info_opts(ic, codec_opts); orig_nb_streams = ic->nb_streams; err = avformat_find_stream_info(ic, opts); -- 1.7.2.5 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] cmdutils: clarify documentation for setup_find_stream_info_opts()
On date Monday 2011-07-18 12:47:25 +0200, Diego Biurrun encoded: > On Sun, Jul 17, 2011 at 06:47:54PM +0200, Stefano Sabatini wrote: > > > > --- a/cmdutils.h > > +++ b/cmdutils.h > > @@ -159,8 +159,16 @@ void parse_options(int argc, char **argv, const > > OptionDef *options, > > > > +/** > > + * Setup AVCodecContext options for avformat_find_stream_info(). > > + * > > + * Create an array of dictionaries, one dictionary for each stream > > + * contained in s. > > + * Each dictionary will contain the options from codec_opts which can > > + * be applied to the corresponding stream codec context. > > + * > > + * @return pointer to the created array of dictionaries, NULL if it > > + * cannot be created > > */ > > AVDictionary **setup_find_stream_info_opts(AVFormatContext *s); > > This is missing parameter documentation, which will create warnings. Updated. >From bac4009d43ace460c7672002d5dab9b5df4af026 Mon Sep 17 00:00:00 2001 From: Stefano Sabatini Date: Sun, 17 Jul 2011 01:30:43 +0200 Subject: [PATCH] cmdutils: clarify documentation for setup_find_stream_info_opts() --- cmdutils.h | 13 +++-- 1 files changed, 11 insertions(+), 2 deletions(-) diff --git a/cmdutils.h b/cmdutils.h index e09398e..90d02e6 100644 --- a/cmdutils.h +++ b/cmdutils.h @@ -161,8 +161,17 @@ void parse_options(int argc, char **argv, const OptionDef *options, */ AVDictionary *filter_codec_opts(AVDictionary *opts, enum CodecID codec_id, int encoder); -/* - * Setup AVCodecContext options for avformat_find_stream_info. +/** + * Setup AVCodecContext options for avformat_find_stream_info(). + * + * Create an array of dictionaries, one dictionary for each stream + * contained in s. + * Each dictionary will contain the options from codec_opts which can + * be applied to the corresponding stream codec context. + * + * @param s pointer to a format context + * @return pointer to the created array of dictionaries, NULL if it + * cannot be created */ AVDictionary **setup_find_stream_info_opts(AVFormatContext *s); -- 1.7.2.5 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] cmdutils: clarify documentation for filter_codec_opts()
On date Monday 2011-07-18 12:42:42 +0200, Diego Biurrun encoded: > On Sun, Jul 17, 2011 at 06:46:34PM +0200, Stefano Sabatini wrote: > > > > --- a/cmdutils.h > > +++ b/cmdutils.h > > @@ -150,6 +150,12 @@ void parse_options(int argc, char **argv, const > > OptionDef *options, > > > > /** > > * Filter out options for given codec. > > the/a given codec > > > + * > > + * Create a new options dictionary containing only the options from > > + * opts which apply to the codec with ID codec_id. > > + * > > + * @param encoder if non-zero the codec is an encoder, otherwise is a > > decoder > > I think flags like this are ugly, I'd rather create two separate functions > - but maybe that's just me... I tend to disagree, but unrelated. > > + * @return a pointer to the created dictionary > > */ > > AVDictionary *filter_codec_opts(AVDictionary *opts, enum CodecID codec_id, > > int encoder); > > I'm generally not happy about leaving parameters undocumented - it > generates Doxygen warnings. We already drown in Doxygen warnings, > exacerbating the problem is not good. I see the point into always documenting params (this allows e.g. intellisense systems to display some info when you hang over with a pointer device), on the other hand they tend to be redundant. A possibility would be to patch doxygen to disable this specific warning. Updated. >From f86e77b586e9b930457881b99398be43a1e1fd1b Mon Sep 17 00:00:00 2001 From: Stefano Sabatini Date: Sun, 17 Jul 2011 01:20:50 +0200 Subject: [PATCH] cmdutils: clarify documentation for filter_codec_opts() --- cmdutils.h | 10 +- 1 files changed, 9 insertions(+), 1 deletions(-) diff --git a/cmdutils.h b/cmdutils.h index a47f157..e09398e 100644 --- a/cmdutils.h +++ b/cmdutils.h @@ -149,7 +149,15 @@ void parse_options(int argc, char **argv, const OptionDef *options, void (* parse_arg_function)(const char*)); /** - * Filter out options for given codec. + * Filter out options for the given codec. + * + * Create a new options dictionary containing only the options from + * opts which apply to the codec with ID codec_id. + * + * @param opts a dictionary containing options + * @param codec_id the ID which identifies the codec + * @param encoder if non-zero the codec is an encoder, otherwise is a decoder + * @return a pointer to the created dictionary */ AVDictionary *filter_codec_opts(AVDictionary *opts, enum CodecID codec_id, int encoder); -- 1.7.2.5 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel