Re: [libav-devel] [PATCH 1/1] h264: fix memleak on error during SPS parsing
Hi, On Fri, Dec 14, 2012 at 12:09 PM, Janne Grunau janne-li...@jannau.net wrote: Introduced in d7d6efe42b0d. --- libavcodec/h264_ps.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: handle complementary field pairs in await_references()
Hi, On Fri, Dec 14, 2012 at 7:02 AM, Janne Grunau janne-li...@jannau.net wrote: On 2012-12-12 16:16:21 -0800, Ronald S. Bultje wrote: On Wed, Dec 12, 2012 at 3:22 PM, Janne Grunau janne-li...@jannau.net wrote: On 2012-12-12 14:39:34 -0800, Ronald S. Bultje wrote: Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote: Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid field 2. --- libavcodec/h264.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/libavcodec/h264.c b/libavcodec/h264.c index 546b046..d5a54e2 100644 --- a/libavcodec/h264.c +++ b/libavcodec/h264.c @@ -433,7 +433,8 @@ static void await_references(H264Context *h) ff_thread_await_progress(ref_pic-f, FFMIN((row 1), pic_height - 1), 0); -} else if (FIELD_PICTURE !ref_field_picture) { // field referencing one field of a frame +} else if (FIELD_PICTURE + (!ref_field_picture || ref_field 1)) { // field referencing one field of a frame or complementary field pair I don't understand this one. If we're referencing two fields, shouldn't ref_field_picture automatically be true? it isn't and the code that marks both fields of a complementory field pair as available doesn't touch field_picture. Right, but isn't this wrong then? I mean, the reference is field-based, so we're going to reference a field. The part where both fields are available doesn't negate the fact that we're only referencing one. Yes the patch is wrong, I wasn't paying enough attention. A complementary ref field pair can't handled in the same way as a frame. We reported progress on the individual fields. I don't know how to determine which field it references so the patch waits on both. eld It seems to me that a change like this could potentially cause use of reference data before the second field of a reference is fully decoded. Yes, see above. Janne ---8--- Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid field 2. --- libavcodec/h264.c | 5 + 1 file changed, 5 insertions(+) diff --git a/libavcodec/h264.c b/libavcodec/h264.c index f01ff67..be2a98b 100644 --- a/libavcodec/h264.c +++ b/libavcodec/h264.c @@ -438,6 +438,11 @@ static void await_references(H264Context *h) FFMIN(row * 2 + ref_field, pic_height - 1), 0); +} else if (FIELD_PICTURE ref_field_picture ref_field == 2) { // field referencing one field of a complementary field pair +ff_thread_await_progress(ref_pic-f, + FFMIN(row, pic_height - 1), 1); +ff_thread_await_progress(ref_pic-f, + FFMIN(row, pic_height - 1), 0); } else if (FIELD_PICTURE) { I still don't get it, how does a field reference need to wait for two fields? The idea of a field reference is that we only need the data of 1 field. Can I get the sample? I'll try to see what's going on this weekend perhaps... All this looks weird. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] x86inc: fully concatenate tokens to fix macro expansion for nasm
Hi, On Thu, Dec 13, 2012 at 1:48 PM, Janne Grunau janne-li...@jannau.net wrote: Fixes build errors with nasm introduced in 6f40e9f070f7 for stack memory alignment. Noticed by BugMaster. --- libavutil/x86/x86inc.asm | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 01/15] h264: set parameters from SPS whenever it changes
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote: Fixes a crash in the fuzzed sample sample_varPAR.avi_s26638 with alternating bit depths. --- libavcodec/h264.c| 107 +-- libavcodec/h264.h| 2 + libavcodec/h264_ps.c | 3 ++ 3 files changed, 65 insertions(+), 47 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 02/15] lavc: do not overwrite frame parameters of delayed frames
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote: Decoders supporting frame parameter changes can return delayed frames with the old parameters. --- libavcodec/utils.c | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/libavcodec/utils.c b/libavcodec/utils.c index 1185a35..97fa570 100644 --- a/libavcodec/utils.c +++ b/libavcodec/utils.c @@ -1282,11 +1282,16 @@ int attribute_align_arg avcodec_decode_video2(AVCodecContext *avctx, AVFrame *pi else { ret = avctx-codec-decode(avctx, picture, got_picture_ptr, avpkt); -picture-pkt_dts = avpkt-dts; -picture-sample_aspect_ratio = avctx-sample_aspect_ratio; -picture-width = avctx-width; -picture-height = avctx-height; -picture-format = avctx-pix_fmt; +picture-pkt_dts = avpkt-dts; + +/* delayed pictures might have different parameters than + * the context returning them */ +if (!(avctx-codec-capabilities CODEC_CAP_DELAY)) { +picture-sample_aspect_ratio = avctx-sample_aspect_ratio; +picture-width = avctx-width; +picture-height = avctx-height; +picture-format = avctx-pix_fmt; +} } I can live with this. We can clean this further later. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 03/15] mpegvideo: treat delayed pictures as used
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote: This requires to move the avcodec_default_free_buffers() call to ff_MPV_common_end() since otherwise delayed pictures would get freed during a size change. --- libavcodec/h264.h | 6 -- libavcodec/mpegvideo.c | 8 libavcodec/mpegvideo.h | 6 ++ 3 files changed, 10 insertions(+), 10 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 04/15] mpegvideo: align width to 32 for scratch buffer sizes
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote: cmdutis.c's alloc_buffer() uses aligned to 32 width plus 2 edges of 32 pixels as linesize. emu_edge_buffer has to work with the same stride. This makes only a difference for 8 bit per pixel bit depths since we always allocate for 16 bit per pixel. Fixes fuzzed sample nasa-8s2.ts_s244342. --- libavcodec/mpegvideo.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) What if we implement a custom get_buffer() which returns bigger strides? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 05/15] h264: initialize frame-mt context copies properly
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote: --- libavcodec/h264.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/libavcodec/h264.c b/libavcodec/h264.c index e8a22f8..546b046 100644 --- a/libavcodec/h264.c +++ b/libavcodec/h264.c @@ -1125,6 +1125,8 @@ static int decode_init_thread_copy(AVCodecContext *avctx) memset(h-sps_buffers, 0, sizeof(h-sps_buffers)); memset(h-pps_buffers, 0, sizeof(h-pps_buffers)); +h-s.context_initialized = 0; + return 0; } OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 06/15] h264: handle complementary field pairs in await_references()
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote: Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid field 2. --- libavcodec/h264.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/libavcodec/h264.c b/libavcodec/h264.c index 546b046..d5a54e2 100644 --- a/libavcodec/h264.c +++ b/libavcodec/h264.c @@ -433,7 +433,8 @@ static void await_references(H264Context *h) ff_thread_await_progress(ref_pic-f, FFMIN((row 1), pic_height - 1), 0); -} else if (FIELD_PICTURE !ref_field_picture) { // field referencing one field of a frame +} else if (FIELD_PICTURE + (!ref_field_picture || ref_field 1)) { // field referencing one field of a frame or complementary field pair I don't understand this one. If we're referencing two fields, shouldn't ref_field_picture automatically be true? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 04/15] mpegvideo: align width to 32 for scratch buffer sizes
Hi, On Wed, Dec 12, 2012 at 3:12 PM, Janne Grunau janne-li...@jannau.net wrote: On 2012-12-12 14:30:43 -0800, Ronald S. Bultje wrote: Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote: cmdutis.c's alloc_buffer() uses aligned to 32 width plus 2 edges of 32 pixels as linesize. emu_edge_buffer has to work with the same stride. This makes only a difference for 8 bit per pixel bit depths since we always allocate for 16 bit per pixel. Fixes fuzzed sample nasa-8s2.ts_s244342. --- libavcodec/mpegvideo.c | 13 - 1 file changed, 8 insertions(+), 5 deletions(-) What if we implement a custom get_buffer() which returns bigger strides? it fails horrible for high pixel bit depth like it does with this sample. I could add an assumed_linesize into the context and realloc if the linesize returned by get_buffer is larger or maybe preallocate a buffer here if it' feasible. Can't you allocate these buffers after you receive the first buffer from get_buffer() and actually know the linesize? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 11/15] h264: prevent starting new frames after ff_thread_finish_setup()
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote: --- libavcodec/h264.c | 13 - libavcodec/h264.h | 1 + 2 files changed, 13 insertions(+), 1 deletion(-) In what situation does this happen? We should really make this an assert() at some point, and track the cases where it does happen back to ensure we fix it at whatever level. (The current hack to decode two fields in a packet is a hack and should die; we should parse such files better and split at the field boundary.) Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 06/15] h264: handle complementary field pairs in await_references()
Hi, On Wed, Dec 12, 2012 at 4:29 PM, Janne Grunau janne-li...@jannau.net wrote: On 2012-12-12 16:16:21 -0800, Ronald S. Bultje wrote: Hi, On Wed, Dec 12, 2012 at 3:22 PM, Janne Grunau janne-li...@jannau.net wrote: On 2012-12-12 14:39:34 -0800, Ronald S. Bultje wrote: Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote: Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid field 2. --- libavcodec/h264.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/libavcodec/h264.c b/libavcodec/h264.c index 546b046..d5a54e2 100644 --- a/libavcodec/h264.c +++ b/libavcodec/h264.c @@ -433,7 +433,8 @@ static void await_references(H264Context *h) ff_thread_await_progress(ref_pic-f, FFMIN((row 1), pic_height - 1), 0); -} else if (FIELD_PICTURE !ref_field_picture) { // field referencing one field of a frame +} else if (FIELD_PICTURE + (!ref_field_picture || ref_field 1)) { // field referencing one field of a frame or complementary field pair I don't understand this one. If we're referencing two fields, shouldn't ref_field_picture automatically be true? it isn't and the code that marks both fields of a complementory field pair as available doesn't touch field_picture. Right, but isn't this wrong then? I mean, the reference is field-based, so we're going to reference a field. The part where both fields are available doesn't negate the fact that we're only referencing one. It seems to me that a change like this could potentially cause use of reference data before the second field of a reference is fully decoded. I need to check What does the hang look like? ref_field = ref_pic-f.reference; and it waits on ref_field in the next else that breaks when reference is PICT_FRAME that's BOTTOM_FIELD | TOP_FIELD. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
From: Ronald S. Bultje rsbul...@gmail.com Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++ libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 --- libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 185 5 files changed, 191 insertions(+), 101 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index c124c4d..1f1dbc6 100644 --- a/libavcodec/x86/h264_deblock.asm +++ b/libavcodec/x86/h264_deblock.asm @@ -398,14 +398,12 @@ DEBLOCK_LUMA ;- ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- -cglobal deblock_%1_luma_8, 5,5 +cglobal deblock_%1_luma_8, 5,5,8,2*%2 lea r4, [r1*3] dec r2 ; alpha-1 neg r4 dec r3 ; beta-1 add r4, r0 ; pix-3*stride -%assign pad 2*%2+12-(stack_offset15) -SUB esp, pad movam0, [r4+r1] ; p1 movam1, [r4+2*r1] ; p0 @@ -443,22 +441,19 @@ cglobal deblock_%1_luma_8, 5,5 DEBLOCK_P0_Q0 mova[r4+2*r1], m1 mova[r0], m2 -ADD esp, pad RET ;- ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- INIT_MMX cpuname -cglobal deblock_h_luma_8, 0,5 +cglobal deblock_h_luma_8, 0,5,8,0x60+HAVE_ALIGNED_STACK*12 movr0, r0mp movr3, r1m lear4, [r3*3] subr0, 4 lear1, [r0+r4] -%assign pad 0x78-(stack_offset15) -SUBesp, pad -%define pix_tmp esp+12 +%define pix_tmp esp+12*HAVE_ALIGNED_STACK ; transpose 6x16 - tmp space TRANSPOSE6x8_MEM PASS8ROWS(r0, r1, r3, r4), pix_tmp @@ -500,7 +495,6 @@ cglobal deblock_h_luma_8, 0,5 movq m3, [pix_tmp+0x48] TRANSPOSE8x4B_STORE PASS8ROWS(r0, r1, r3, r4) -ADDesp, pad RET %endmacro ; DEBLOCK_LUMA @@ -631,7 +625,7 @@ DEBLOCK_LUMA v, 16 %define mpb_0 m14 %define mpb_1 m15 %else -%define spill(x) [esp+16*x+((stack_offset+4)15)] +%define spill(x) [esp+16*x] %define p2 [r4+r1] %define q2 [r0+2*r1] %define t4 spill(0) @@ -646,10 +640,7 @@ DEBLOCK_LUMA v, 16 ;- ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta ) ;- -cglobal deblock_%1_luma_intra_8, 4,6,16 -%if ARCH_X86_64 == 0 -sub esp, 0x60 -%endif +cglobal deblock_%1_luma_intra_8, 4,6,16,ARCH_X86_64*0x50-0x50 lea r4, [r1*4] lea r5, [r1*3] ; 3*stride dec r2d; alpha-1 @@ -698,9 +689,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16 LUMA_INTRA_SWAP_PQ LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5] .end: -%if ARCH_X86_64 == 0 -add esp, 0x60 -%endif RET INIT_MMX cpuname @@ -737,12 +725,10 @@ cglobal deblock_h_luma_intra_8, 4,9 addrsp, 0x88 RET %else -cglobal deblock_h_luma_intra_8, 2,4 +cglobal deblock_h_luma_intra_8, 2,4,8,0x80 lear3, [r1*3] subr0, 4 lear2, [r0+r3] -%assign pad 0x8c-(stack_offset15) -SUBrsp, pad %define pix_tmp rsp ; transpose 8x16 - tmp space @@ -773,7 +759,6 @@ cglobal deblock_h_luma_intra_8, 2,4 lear0, [r0+r1*8] lear2, [r2+r1*8] TRANSPOSE8x8_MEM PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), PASS8ROWS(r0, r2, r1, r3) -ADDrsp, pad RET %endif ; ARCH_X86_64 %endmacro ; DEBLOCK_LUMA_INTRA diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index ac231cb..73d4990 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2; c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2; -#if HAVE_ALIGNED_STACK c-h264_v_loop_filter_luma = ff_deblock_v_luma_8_sse2; c-h264_h_loop_filter_luma = ff_deblock_h_luma_8_sse2; c-h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_sse2; c-h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_sse2; -#endif /* HAVE_ALIGNED_STACK */ } if (EXTERNAL_SSSE3(mm_flags)) { c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3; c
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Sat, Dec 8, 2012 at 8:41 AM, Måns Rullgård m...@mansr.com wrote: Ronald S. Bultje rsbul...@gmail.com writes: From: Ronald S. Bultje rsbul...@gmail.com Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++ libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 --- libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 185 5 files changed, 191 insertions(+), 101 deletions(-) How is this different from the patch you sent yesterday? It adds a missing %endrep for win64. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
From: Ronald S. Bultje rsbul...@gmail.com Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++ libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 --- libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 185 5 files changed, 191 insertions(+), 101 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index c124c4d..1f1dbc6 100644 --- a/libavcodec/x86/h264_deblock.asm +++ b/libavcodec/x86/h264_deblock.asm @@ -398,14 +398,12 @@ DEBLOCK_LUMA ;- ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- -cglobal deblock_%1_luma_8, 5,5 +cglobal deblock_%1_luma_8, 5,5,8,2*%2 lea r4, [r1*3] dec r2 ; alpha-1 neg r4 dec r3 ; beta-1 add r4, r0 ; pix-3*stride -%assign pad 2*%2+12-(stack_offset15) -SUB esp, pad movam0, [r4+r1] ; p1 movam1, [r4+2*r1] ; p0 @@ -443,22 +441,19 @@ cglobal deblock_%1_luma_8, 5,5 DEBLOCK_P0_Q0 mova[r4+2*r1], m1 mova[r0], m2 -ADD esp, pad RET ;- ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- INIT_MMX cpuname -cglobal deblock_h_luma_8, 0,5 +cglobal deblock_h_luma_8, 0,5,8,0x60+HAVE_ALIGNED_STACK*12 movr0, r0mp movr3, r1m lear4, [r3*3] subr0, 4 lear1, [r0+r4] -%assign pad 0x78-(stack_offset15) -SUBesp, pad -%define pix_tmp esp+12 +%define pix_tmp esp+12*HAVE_ALIGNED_STACK ; transpose 6x16 - tmp space TRANSPOSE6x8_MEM PASS8ROWS(r0, r1, r3, r4), pix_tmp @@ -500,7 +495,6 @@ cglobal deblock_h_luma_8, 0,5 movq m3, [pix_tmp+0x48] TRANSPOSE8x4B_STORE PASS8ROWS(r0, r1, r3, r4) -ADDesp, pad RET %endmacro ; DEBLOCK_LUMA @@ -631,7 +625,7 @@ DEBLOCK_LUMA v, 16 %define mpb_0 m14 %define mpb_1 m15 %else -%define spill(x) [esp+16*x+((stack_offset+4)15)] +%define spill(x) [esp+16*x] %define p2 [r4+r1] %define q2 [r0+2*r1] %define t4 spill(0) @@ -646,10 +640,7 @@ DEBLOCK_LUMA v, 16 ;- ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta ) ;- -cglobal deblock_%1_luma_intra_8, 4,6,16 -%if ARCH_X86_64 == 0 -sub esp, 0x60 -%endif +cglobal deblock_%1_luma_intra_8, 4,6,16,ARCH_X86_64*0x50-0x50 lea r4, [r1*4] lea r5, [r1*3] ; 3*stride dec r2d; alpha-1 @@ -698,9 +689,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16 LUMA_INTRA_SWAP_PQ LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5] .end: -%if ARCH_X86_64 == 0 -add esp, 0x60 -%endif RET INIT_MMX cpuname @@ -737,12 +725,10 @@ cglobal deblock_h_luma_intra_8, 4,9 addrsp, 0x88 RET %else -cglobal deblock_h_luma_intra_8, 2,4 +cglobal deblock_h_luma_intra_8, 2,4,8,0x80 lear3, [r1*3] subr0, 4 lear2, [r0+r3] -%assign pad 0x8c-(stack_offset15) -SUBrsp, pad %define pix_tmp rsp ; transpose 8x16 - tmp space @@ -773,7 +759,6 @@ cglobal deblock_h_luma_intra_8, 2,4 lear0, [r0+r1*8] lear2, [r2+r1*8] TRANSPOSE8x8_MEM PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), PASS8ROWS(r0, r2, r1, r3) -ADDrsp, pad RET %endif ; ARCH_X86_64 %endmacro ; DEBLOCK_LUMA_INTRA diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index ac231cb..73d4990 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2; c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2; -#if HAVE_ALIGNED_STACK c-h264_v_loop_filter_luma = ff_deblock_v_luma_8_sse2; c-h264_h_loop_filter_luma = ff_deblock_h_luma_8_sse2; c-h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_sse2; c-h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_sse2; -#endif /* HAVE_ALIGNED_STACK */ } if (EXTERNAL_SSSE3(mm_flags)) { c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3; c
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Sat, Dec 8, 2012 at 4:12 PM, Ronald S. Bultje rsbul...@gmail.com wrote: From: Ronald S. Bultje rsbul...@gmail.com Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++ libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 --- libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 185 5 files changed, 191 insertions(+), 101 deletions(-) One more fix for invalid stack free'ing if a YMM function on win64 used 6 registers, but no stack. Any more reviews, or can this be applied? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
From: Ronald S. Bultje rsbul...@gmail.com Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++ libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 --- libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 186 5 files changed, 191 insertions(+), 102 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index c124c4d..1f1dbc6 100644 --- a/libavcodec/x86/h264_deblock.asm +++ b/libavcodec/x86/h264_deblock.asm @@ -398,14 +398,12 @@ DEBLOCK_LUMA ;- ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- -cglobal deblock_%1_luma_8, 5,5 +cglobal deblock_%1_luma_8, 5,5,8,2*%2 lea r4, [r1*3] dec r2 ; alpha-1 neg r4 dec r3 ; beta-1 add r4, r0 ; pix-3*stride -%assign pad 2*%2+12-(stack_offset15) -SUB esp, pad movam0, [r4+r1] ; p1 movam1, [r4+2*r1] ; p0 @@ -443,22 +441,19 @@ cglobal deblock_%1_luma_8, 5,5 DEBLOCK_P0_Q0 mova[r4+2*r1], m1 mova[r0], m2 -ADD esp, pad RET ;- ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- INIT_MMX cpuname -cglobal deblock_h_luma_8, 0,5 +cglobal deblock_h_luma_8, 0,5,8,0x60+HAVE_ALIGNED_STACK*12 movr0, r0mp movr3, r1m lear4, [r3*3] subr0, 4 lear1, [r0+r4] -%assign pad 0x78-(stack_offset15) -SUBesp, pad -%define pix_tmp esp+12 +%define pix_tmp esp+12*HAVE_ALIGNED_STACK ; transpose 6x16 - tmp space TRANSPOSE6x8_MEM PASS8ROWS(r0, r1, r3, r4), pix_tmp @@ -500,7 +495,6 @@ cglobal deblock_h_luma_8, 0,5 movq m3, [pix_tmp+0x48] TRANSPOSE8x4B_STORE PASS8ROWS(r0, r1, r3, r4) -ADDesp, pad RET %endmacro ; DEBLOCK_LUMA @@ -631,7 +625,7 @@ DEBLOCK_LUMA v, 16 %define mpb_0 m14 %define mpb_1 m15 %else -%define spill(x) [esp+16*x+((stack_offset+4)15)] +%define spill(x) [esp+16*x] %define p2 [r4+r1] %define q2 [r0+2*r1] %define t4 spill(0) @@ -646,10 +640,7 @@ DEBLOCK_LUMA v, 16 ;- ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta ) ;- -cglobal deblock_%1_luma_intra_8, 4,6,16 -%if ARCH_X86_64 == 0 -sub esp, 0x60 -%endif +cglobal deblock_%1_luma_intra_8, 4,6,16,ARCH_X86_64*0x50-0x50 lea r4, [r1*4] lea r5, [r1*3] ; 3*stride dec r2d; alpha-1 @@ -698,9 +689,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16 LUMA_INTRA_SWAP_PQ LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5] .end: -%if ARCH_X86_64 == 0 -add esp, 0x60 -%endif RET INIT_MMX cpuname @@ -737,12 +725,10 @@ cglobal deblock_h_luma_intra_8, 4,9 addrsp, 0x88 RET %else -cglobal deblock_h_luma_intra_8, 2,4 +cglobal deblock_h_luma_intra_8, 2,4,8,0x80 lear3, [r1*3] subr0, 4 lear2, [r0+r3] -%assign pad 0x8c-(stack_offset15) -SUBrsp, pad %define pix_tmp rsp ; transpose 8x16 - tmp space @@ -773,7 +759,6 @@ cglobal deblock_h_luma_intra_8, 2,4 lear0, [r0+r1*8] lear2, [r2+r1*8] TRANSPOSE8x8_MEM PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), PASS8ROWS(r0, r2, r1, r3) -ADDrsp, pad RET %endif ; ARCH_X86_64 %endmacro ; DEBLOCK_LUMA_INTRA diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index ac231cb..73d4990 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2; c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2; -#if HAVE_ALIGNED_STACK c-h264_v_loop_filter_luma = ff_deblock_v_luma_8_sse2; c-h264_h_loop_filter_luma = ff_deblock_h_luma_8_sse2; c-h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_sse2; c-h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_sse2; -#endif /* HAVE_ALIGNED_STACK */ } if (EXTERNAL_SSSE3(mm_flags)) { c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3; c
[libav-devel] [PATCH] Check for invalid VLC code in zeros_left before writing coefficients.
From: Ronald S. Bultje rsbul...@gmail.com This prevents an invalid write into coeffs[scantable[-1]] if zeros_left itself was an invalid VLC code (and thus -1). --- libavcodec/h264_cavlc.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/libavcodec/h264_cavlc.c b/libavcodec/h264_cavlc.c index 8702122..fa2bd99 100644 --- a/libavcodec/h264_cavlc.c +++ b/libavcodec/h264_cavlc.c @@ -610,17 +610,17 @@ static int decode_residual(H264Context *h, GetBitContext *gb, DCTELEM *block, in } \ } +if(zeros_left0){ +av_log(h-s.avctx, AV_LOG_ERROR, negative number of zero coeffs at %d %d\n, s-mb_x, s-mb_y); +return -1; +} + if (h-pixel_shift) { STORE_BLOCK(int32_t) } else { STORE_BLOCK(int16_t) } -if(zeros_left0){ -av_log(h-s.avctx, AV_LOG_ERROR, negative number of zero coeffs at %d %d\n, s-mb_x, s-mb_y); -return -1; -} - return 0; } -- 1.8.0 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Fri, Dec 7, 2012 at 1:01 PM, Måns Rullgård m...@mansr.com wrote: Ronald S. Bultje rsbul...@gmail.com writes: +%if mmsize = 16 HAVE_ALIGNED_STACK How much overhead would it be to drop HAVE_ALIGNED_STACK entirely? Well, for now, we still have a ton of functions that don't use the cglobal-method of allocating stack. I only ported h264/vp8 loopfilter, nothing else. But anyway, more generally, it's 4-5 instructions per function. For typical functions with an inner loop, that's negligible, but for a select small set of functions, it may be significant. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] golomb: use unsigned arithmetics in svq3_get_ue_golomb()
Hi, On Fri, Nov 30, 2012 at 10:56 AM, Janne Grunau janne-li...@jannau.net wrote: This prevents undefined behaviour of signed left shift if the coded value is larger than 2^31. Large values are most likely invalid and caused errors or by feeding random. Validate every use of svq3_get_ue_golomb() and changed the place there the return value was compared with negative numbers. dirac.c was clean, fixed rv30 and svq3. --- libavcodec/golomb.h | 5 +++-- libavcodec/rv30.c | 6 +++--- libavcodec/svq3.c | 17 - 3 files changed, 14 insertions(+), 14 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Fri, Dec 7, 2012 at 2:01 PM, Måns Rullgård m...@mansr.com wrote: Ronald S. Bultje rsbul...@gmail.com writes: On Fri, Dec 7, 2012 at 1:01 PM, Måns Rullgård m...@mansr.com wrote: Ronald S. Bultje rsbul...@gmail.com writes: +%if mmsize = 16 HAVE_ALIGNED_STACK How much overhead would it be to drop HAVE_ALIGNED_STACK entirely? Well, for now, we still have a ton of functions that don't use the cglobal-method of allocating stack. I only ported h264/vp8 loopfilter, nothing else. But anyway, more generally, it's 4-5 instructions per function. For typical functions with an inner loop, that's negligible, but for a select small set of functions, it may be significant. The remaining functions are ff_h264_idct8_add(4)_10_{sse2,avx}, ff_hadamard8_diff(16)_{sse2,ssse3}, and something in swscale. lavr also. Besides, does anyone still use 32-bit where performance is that critical? This is used for YMM (e.g. avx float) stack alignment (to 32-byte) also, so it will affect 64-bit also. My personal point of view is that the code to take advantage of an actual feature of the compiler/system (alignment) is there. I don't see why we'd remove it. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Fri, Dec 7, 2012 at 2:08 PM, Måns Rullgård m...@mansr.com wrote: Ronald S. Bultje rsbul...@gmail.com writes: Hi, On Fri, Dec 7, 2012 at 2:01 PM, Måns Rullgård m...@mansr.com wrote: Ronald S. Bultje rsbul...@gmail.com writes: On Fri, Dec 7, 2012 at 1:01 PM, Måns Rullgård m...@mansr.com wrote: Ronald S. Bultje rsbul...@gmail.com writes: +%if mmsize = 16 HAVE_ALIGNED_STACK How much overhead would it be to drop HAVE_ALIGNED_STACK entirely? Well, for now, we still have a ton of functions that don't use the cglobal-method of allocating stack. I only ported h264/vp8 loopfilter, nothing else. But anyway, more generally, it's 4-5 instructions per function. For typical functions with an inner loop, that's negligible, but for a select small set of functions, it may be significant. The remaining functions are ff_h264_idct8_add(4)_10_{sse2,avx}, ff_hadamard8_diff(16)_{sse2,ssse3}, and something in swscale. lavr also. There are no references to HAVE_ALIGNED_STACK there. It crashes on x86-32 icc10.x and msvc. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/2] x86: float_dsp: fix loading of the len parameter on x86-32
Hi, On Wed, Dec 5, 2012 at 9:53 AM, Justin Ruggles justin.rugg...@gmail.com wrote: --- libavutil/x86/float_dsp.asm |3 +++ 1 files changed, 3 insertions(+), 0 deletions(-) diff --git a/libavutil/x86/float_dsp.asm b/libavutil/x86/float_dsp.asm index 4a1742f..dc75532 100644 --- a/libavutil/x86/float_dsp.asm +++ b/libavutil/x86/float_dsp.asm @@ -127,6 +127,9 @@ cglobal vector_dmul_scalar, 3,3,3, dst, src, len cglobal vector_dmul_scalar, 4,4,3, dst, src, mul, len %endif %if ARCH_X86_32 +; PROLOGUE loads len from the wrong stack address because mul is an 8-byte +; parameter and PROLOGUE assumes all parameters are 4-byte +mov lenq, [esp+0x18] VBROADCASTSD m0, mulm That loads len twice. Why not: %if ARCH_X86_32 cglobal vector_dmul_scalar, 3,4,3, dst, src, mul, len, lenaddr mov lenq, lenaddrm %else cglobal vector_dmul_scalar, 4,4,3, dst, src, mul, len %endif which is more functionally correct? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/2] x86: float_dsp: fix compilation of ff_vector_dmul_scalar_avx() on x86-32
Hi, On Wed, Dec 5, 2012 at 9:53 AM, Justin Ruggles justin.rugg...@gmail.com wrote: --- libavutil/x86/float_dsp.asm |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/libavutil/x86/float_dsp.asm b/libavutil/x86/float_dsp.asm index d8fd93a..4a1742f 100644 --- a/libavutil/x86/float_dsp.asm +++ b/libavutil/x86/float_dsp.asm @@ -127,7 +127,7 @@ cglobal vector_dmul_scalar, 3,3,3, dst, src, len cglobal vector_dmul_scalar, 4,4,3, dst, src, mul, len %endif %if ARCH_X86_32 -VBROADCASTSD xmm0, mulm +VBROADCASTSD m0, mulm OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: slice-mt: check master context for valid current_picture_ptr
Hi, On Wed, Dec 5, 2012 at 11:10 AM, Janne Grunau janne-li...@jannau.net wrote: Fixes errors in slice based multithreading introduced in 0b300daad2f5. CC: libav-sta...@libav.org --- libavcodec/h264.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libavcodec/h264.c b/libavcodec/h264.c index b9c46ea..317ee95 100644 --- a/libavcodec/h264.c +++ b/libavcodec/h264.c @@ -2782,7 +2782,7 @@ static int decode_slice_header(H264Context *h, H264Context *h0) s-picture_structure = last_pic_structure; s-dropable = last_pic_dropable; return AVERROR_INVALIDDATA; -} else if (!s-current_picture_ptr) { +} else if (!s0-current_picture_ptr) { av_log(s-avctx, AV_LOG_ERROR, unset current_picture_ptr on %d. slice\n, h0-current_slice + 1); OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: slice-mt: get last_pic_dropable from master context
Hi, On Wed, Dec 5, 2012 at 11:00 AM, Janne Grunau janne-li...@jannau.net wrote: Fixes fate-h264-conformance-cvnlfi2_sony_h and smllwebdl.mkv from https://github.com/OpenELEC/OpenELEC.tv/issues/1557 . --- libavcodec/h264.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libavcodec/h264.c b/libavcodec/h264.c index cd8aeec..b9c46ea 100644 --- a/libavcodec/h264.c +++ b/libavcodec/h264.c @@ -2759,7 +2759,7 @@ static int decode_slice_header(H264Context *h, H264Context *h0) h-mb_mbaff= 0; h-mb_aff_frame= 0; last_pic_structure = s0-picture_structure; -last_pic_dropable = s-dropable; +last_pic_dropable = s0-dropable; OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 03/10] SBR DSP x86: implement SSE qmf_post_shuffle
Hi, On Fri, Nov 30, 2012 at 1:14 PM, Christophe Gisquet christophe.gisq...@gmail.com wrote: Hello, 2012/11/30 Loren Merritt lor...@u.washington.edu: If you increment an index into W and z rather than the pointers themselves, then you can eliminate an add and a cmp. I add already tested that, and redid it: cglobal sbr_qmf_post_shuffle, 2,4,3,W,z mov r3q, 32*4 lea r2q, [zq + (64-4)*4] addzq, r3q leaWq, [Wq + 2*r3q] neg r3q .loop: mova m0, [r2q] mova m1, [zq + r3q] xorps m0, [ps_neg] shufps m0, m0, 0x1B mova m2, m0 unpcklps m0, m1 unpckhps m2, m1 mova [Wq + 2*r3q + 0], m0 mova [Wq + 2*r3q + 16], m2 sub r2q, 16 add r3q, 16 jl .loop REP_RET It's 2 cycles slower on Penrynn/Win64 (154 vs 152). Try adding an ALIGN 16 just above .loop:, maybe that fixes it? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 3/4] h264: check ref_count validity for num_ref_idx_active_override_flag
On Mon, Nov 26, 2012 at 7:54 AM, Janne Grunau janne-li...@jannau.net wrote: On 2012-11-26 07:01:20 -0800, Ronald S. Bultje wrote: Hi, On Mon, Nov 26, 2012 at 4:06 AM, Janne Grunau janne-li...@jannau.net wrote: Fixes segfault in the fuzzed sample bipbop234.ts_s226407. CC: libav-sta...@libav.org --- libavcodec/h264.c | 7 ++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/libavcodec/h264.c b/libavcodec/h264.c index eca4636..730e34a 100644 --- a/libavcodec/h264.c +++ b/libavcodec/h264.c @@ -2896,8 +2896,13 @@ static int decode_slice_header(H264Context *h, H264Context *h0) if (num_ref_idx_active_override_flag) { h-ref_count[0] = get_ue_golomb(s-gb) + 1; -if (h-slice_type_nos == AV_PICTURE_TYPE_B) +if (h-ref_count[0] 1) +return AVERROR_INVALIDDATA; +if (h-slice_type_nos == AV_PICTURE_TYPE_B) { h-ref_count[1] = get_ue_golomb(s-gb) + 1; +if (h-ref_count[1] 1) +return AVERROR_INVALIDDATA; +} } if (h-slice_type_nos == AV_PICTURE_TYPE_B) -- 1.7.12.4 get_ue_golomb() guarantees to return something non-negative, so is this an integer overflow? We should then prevent the integer overflow. It returns -1 if the next 32 bits are all 0. That is arguebly not a valid exp-golomb code but occurs with the sample mentioned in the commit message (available in incoming/2012-11-15_fuzz_h264_janne as my other fuzzed samples). Oh I see, patch OK then. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] wavenc: write fact chunk sample count at the correct file position
Hi, On Mon, Nov 26, 2012 at 4:11 PM, Justin Ruggles justin.rugg...@gmail.com wrote: From: Michael Niedermayer michae...@gmx.at Fixes curruption of metadata in the INFO chunk. Signed-off-by: Michael Niedermayer michae...@gmx.at Signed-off-by: Justin Ruggles justin.rugg...@gmail.com --- libavformat/wavenc.c |9 + 1 files changed, 5 insertions(+), 4 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] dsputil: modify scalarproduct_int16 to handle mod8 numbers of loops.
Hi, On Sun, Nov 25, 2012 at 1:18 AM, Diego Biurrun di...@biurrun.de wrote: On Sat, Nov 24, 2012 at 03:43:04PM +0100, Christophe Gisquet wrote: PS: no idea why firefox/gmail treats those patches as octet-stream, and how to fix that. Ronald wrote about this a while back, but there seems to be no easy solution: http://blogs.gnome.org/rbultje/2009/03/04/firefox-gmail-and-content-type-of-attachments/ The solution is basically to use git send-email, not gmail. git send-email can use your gmail account as smtp, if wanted. I can help you set that up if this doesn't help enough: http://morefedora.blogspot.com/2009/02/configuring-git-send-email-to-use-gmail.html (This is kind of a pain on Fedora, since you'll first have to install a bunch of perl extensions. I'm gonna hope you're not using a Mac at this point. See http://kbase.wincent.com/old/knowledge-base/Installing_Net::SMTP::SSL_for_sending_patches_with_Git_over_secure_SMTP.html for gory details.) Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] dsputil: modify scalarproduct_int16 to handle mod8 numbers of loops.
Hi, On Sun, Nov 25, 2012 at 8:43 AM, Ronald S. Bultje rsbul...@gmail.com wrote: (This is kind of a pain on Fedora Whoops, I meant Mac here. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: enable low delay only if no delayed frames were seen
Hi, On Sun, Nov 25, 2012 at 2:40 PM, Janne Grunau janne-li...@jannau.net wrote: Dropping frames is undesirable but that is the only way by which the decoder could return to low delay mode. Instead emit a warning and continue with delayed frames. Fixes a crash in fuzzed sample nasa-8s2.ts_s20033 caused by a larger than expected has_b_frames value. Low delay keeps getting re-enabled from a presumely broken SPS. --- libavcodec/h264.c | 10 -- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/libavcodec/h264.c b/libavcodec/h264.c index f45c572..658d865 100644 --- a/libavcodec/h264.c +++ b/libavcodec/h264.c @@ -3867,8 +3867,14 @@ again: if (s-flags CODEC_FLAG_LOW_DELAY || (h-sps.bitstream_restriction_flag - !h-sps.num_reorder_frames)) -s-low_delay = 1; + !h-sps.num_reorder_frames)) { +if (s-avctx-has_b_frames 1 || h-delayed_pic[0]) +av_log(avctx, AV_LOG_WARNING, Delayed frames seen + reenabling low delay requires a codec + flush.\n); +else +s-low_delay = 1; +} That looks good, thanks. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: reset has_b_frames after enabling low_delay from SPS
Hi, On Wed, Nov 21, 2012 at 5:37 AM, Janne Grunau janne-li...@jannau.netwrote: On 2012-11-21 13:14:34 +, Loren Merritt wrote: On Wed, 21 Nov 2012, Janne Grunau wrote: On 2012-11-16 18:14:29 -0800, Ronald S. Bultje wrote: So I'm going to have to wonder what happens to the delayed frames already cached in h-delayed_pics[]? Are they still output? yes, delayed pictures are returned as long as h-delayed_pics[0] is not NULL. And when you return something from delayed_pics[] instead of the new frame that just got decoded, the new frame gets delayed, so that you don't ever actually switch to low delay mode? Since we can't return multiple frames at once we can't switch to low delay mode once we have delayed pictures. If low delay after switching from a stream with delayed pictures to a low_delay stream is important and dropping frames is an option requiring a decoder flush from sounds reasonable to me. That's one option. We could also simply say that low_delay can't be re-enabled once disabled, so even though the SPS says we're low-delay, just ignore it. I don't really have a preference either way, whichever is easier implementation-wise or whichever you feel is a better representation of what the end user of a non-fuzzed bitstream likely wanted to happen. (In practice, this probably never happens anyway?) Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: reset has_b_frames after enabling low_delay from SPS
Hi, On Wed, Nov 21, 2012 at 12:29 PM, Janne Grunau janne-li...@jannau.netwrote: On 2012-11-21 09:13:27 -0800, Ronald S. Bultje wrote: On Wed, Nov 21, 2012 at 5:37 AM, Janne Grunau janne-li...@jannau.net wrote: On 2012-11-21 13:14:34 +, Loren Merritt wrote: On Wed, 21 Nov 2012, Janne Grunau wrote: On 2012-11-16 18:14:29 -0800, Ronald S. Bultje wrote: So I'm going to have to wonder what happens to the delayed frames already cached in h-delayed_pics[]? Are they still output? yes, delayed pictures are returned as long as h-delayed_pics[0] is not NULL. And when you return something from delayed_pics[] instead of the new frame that just got decoded, the new frame gets delayed, so that you don't ever actually switch to low delay mode? Since we can't return multiple frames at once we can't switch to low delay mode once we have delayed pictures. If low delay after switching from a stream with delayed pictures to a low_delay stream is important and dropping frames is an option requiring a decoder flush from sounds reasonable to me. That's one option. We could also simply say that low_delay can't be re-enabled once disabled, so even though the SPS says we're low-delay, just ignore it. I don't really have a preference either way, whichever is easier implementation-wise or whichever you feel is a better representation of what the end user of a non-fuzzed bitstream likely wanted to happen. (In practice, this probably never happens anyway?) The current patch seems to work fine and fate-h264 still passes so this approach is easier for me. I also think that not dropping frames is less surprising option. Expecting tests of sender and receiver in situations where low delay is important is not unreasonable. I can add a warning when low_delay gets enabled with delayed frames. Well, what I meant is that we probably shouldn't reset s-avctx-has_b_frames and s-low_delay, since that may give the (incorrect) assumption to the application that there is no delay... A warning would be good, yes. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/2] h264: reset has_b_frames after enabling low_delay from SPS
Hi, On Fri, Nov 16, 2012 at 8:43 AM, Janne Grunau janne-li...@jannau.netwrote: Fixes a crash in fuzzed file nasa-8s2.ts_s20033 caused by a too large has_b_frames value. low_delay keeps getting re-enabled from the the presumely broken SPS. --- libavcodec/h264.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/libavcodec/h264.c b/libavcodec/h264.c index c30c478..7935fe6 100644 --- a/libavcodec/h264.c +++ b/libavcodec/h264.c @@ -2346,8 +2346,10 @@ static int h264_set_parameter_from_sps(H264Context *h) if (s-flags CODEC_FLAG_LOW_DELAY || (h-sps.bitstream_restriction_flag - !h-sps.num_reorder_frames)) + !h-sps.num_reorder_frames)) { s-low_delay = 1; +s-avctx-has_b_frames = 0; +} So I'm going to have to wonder what happens to the delayed frames already cached in h-delayed_pics[]? Are they still output? What happens if we concatenate a delayed h264 file by a file that has this kind of sps header? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86: lavr: fix stack allocation for 7 and 8 channel downmixing on x86-32
Hi, On Mon, Nov 12, 2012 at 12:41 PM, Luca Barbato lu_z...@gentoo.org wrote: On 11/12/2012 07:35 PM, Justin Ruggles wrote: Fixes crashes on Win32 and stack overruns on x86-32 in general. Who can test it? http://bugzilla.libav.org/show_bug.cgi?id=338 Reproducible on x86-32 Linux/Mac with asan. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86: lavr: fix stack allocation for 7 and 8 channel downmixing on x86-32
Hi, On Thu, Nov 8, 2012 at 2:17 PM, Justin Ruggles justin.rugg...@gmail.comwrote: On 10/29/2012 04:39 PM, Justin Ruggles wrote: From: Ronald S. Bultje rsbul...@gmail.com Fixes crashes on Win32 and stack overruns on x86-32 in general. --- libavresample/x86/audio_mix.asm |7 +++ 1 files changed, 7 insertions(+), 0 deletions(-) diff --git a/libavresample/x86/audio_mix.asm b/libavresample/x86/audio_mix.asm index 0c4a9bd..cdc787d 100644 --- a/libavresample/x86/audio_mix.asm +++ b/libavresample/x86/audio_mix.asm @@ -280,6 +280,13 @@ cglobal mix_%1_to_%2_%3_flt, 3,in_channels+2,needed_mmregs+matrix_elements_mm, s sub rsp, matrix_elements_stack * mmsize %else %assign pad matrix_elements_stack * mmsize + (mmsize - gprsize) - (stack_offset (mmsize - gprsize)) +; on x86-32 for 7 and 8 channels we need more stack space for src pointers +%if ARCH_X86_32 in_channels = 7 +%assign pad pad + 0x10 +%define src5m [rsp+0xc0] +%define src6m [rsp+0xc4] +%define src7m [rsp+0xc8] +%endif SUB rsp, pad %endif %endif ping Well it's my patch and you reviewed it, isn't that enough? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86: h264_qpel: sign-extend stride argument in put_pixels16_sse2()
Hi, On Mon, Nov 5, 2012 at 1:22 PM, Diego Biurrun di...@biurrun.de wrote: --- This fixes the crash in fate-vp5 with the QPEL code converted to YASM. I'm sending this separately to not spam everybody with 100+kB patches. The patch is meant to be squashed into the QPEL yasmification. ; void put_pixels16_sse2(uint8_t *block, const uint8_t *pixels, int line_size, int h) cglobal put_pixels16, 4,5,4 +movsxdifnidnr2, r2d lea r4, [r2*3] .loop: No, I said each occurence of cglobal needed this, not just one of them. Also, please fix indenting of this instruction to be aligned with the line below. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86: h264: Convert 8-bit QPEL inline assembly to YASM
Hi, On Mon, Nov 5, 2012 at 10:48 AM, Diego Biurrun di...@biurrun.de wrote: +%macro QPEL8_H_LOWPASS_OP 1 +cglobal %1_h264_qpel8_h_lowpass, 4,5 ; dst, src, dstStride, srcStride +mov r4d, 8 Each and every single one of these needs a stride sign extension like you did in dsputil.asm. Do you have numbers to compare performance of qpel (both with START/STOP_TIMER as well as with total wallclock time, e.g. using the time command) before/after this patch? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/2] Use ptrdiff_t instead of int for intra pred stride function parameter.
Hi, On Mon, Oct 29, 2012 at 7:53 AM, Diego Biurrun di...@biurrun.de wrote: On Sun, Oct 28, 2012 at 08:44:54PM -0700, Ronald S. Bultje wrote: From: Ronald S. Bultje rsbul...@gmail.com --- libavcodec/arm/h264pred_init_arm.c | 36 +++--- libavcodec/h264pred.c| 79 libavcodec/h264pred.h| 18 +-- libavcodec/h264pred_template.c | 233 +-- libavcodec/x86/h264_intrapred_init.c | 15 ++- 5 files changed, 264 insertions(+), 117 deletions(-) Why? So you don't have to sign-extend integers in simd functions (or suffer undefined behaviour if you forget). I've done this before for several other simd functions, use git log. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] Remove usage of INIT_AVX in h264_intrapred_10bit.asm.
From: Ronald S. Bultje rsbul...@gmail.com Replace INIT_AVX by INIT_XMM avx. Port the whole file to use cpuflag based function declarations. Remove (now unused) cputype argument in function declaration macros. Change function prototypes to have mmx2 instead of mmxext as suffix, since that's required by cpuflags. --- libavcodec/x86/h264_intrapred_10bit.asm | 306 libavcodec/x86/h264_intrapred_init.c| 40 ++--- 2 files changed, 177 insertions(+), 169 deletions(-) diff --git a/libavcodec/x86/h264_intrapred_10bit.asm b/libavcodec/x86/h264_intrapred_10bit.asm index 529134e..c3f6dc4 100644 --- a/libavcodec/x86/h264_intrapred_10bit.asm +++ b/libavcodec/x86/h264_intrapred_10bit.asm @@ -53,8 +53,8 @@ SECTION .text ;- ; void pred4x4_down_right(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_DR 1 -cglobal pred4x4_down_right_10_%1, 3,3 +%macro PRED4x4_DR 0 +cglobal pred4x4_down_right_10, 3, 3 sub r0, r2 lea r1, [r0+r2*2] movhpsm1, [r1-8] @@ -79,21 +79,22 @@ cglobal pred4x4_down_right_10_%1, 3,3 RET %endmacro -INIT_XMM +INIT_XMM sse2 %define PALIGNR PALIGNR_MMX -PRED4x4_DR sse2 +PRED4x4_DR +INIT_XMM ssse3 %define PALIGNR PALIGNR_SSSE3 -PRED4x4_DR ssse3 +PRED4x4_DR %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_DR avx +INIT_XMM avx +PRED4x4_DR %endif ;- ; void pred4x4_vertical_right(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_VR 1 -cglobal pred4x4_vertical_right_10_%1, 3,3,6 +%macro PRED4x4_VR 0 +cglobal pred4x4_vertical_right_10, 3, 3, 6 sub r0, r2 lea r1, [r0+r2*2] movqm5, [r0]; t3t2t1t0 @@ -119,21 +120,22 @@ cglobal pred4x4_vertical_right_10_%1, 3,3,6 RET %endmacro -INIT_XMM +INIT_XMM sse2 %define PALIGNR PALIGNR_MMX -PRED4x4_VR sse2 +PRED4x4_VR +INIT_XMM ssse3 %define PALIGNR PALIGNR_SSSE3 -PRED4x4_VR ssse3 +PRED4x4_VR %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_VR avx +INIT_XMM avx +PRED4x4_VR %endif ;- ; void pred4x4_horizontal_down(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_HD 1 -cglobal pred4x4_horizontal_down_10_%1, 3,3 +%macro PRED4x4_HD 0 +cglobal pred4x4_horizontal_down_10, 3, 3 subr0, r2 lear1, [r0+r2*2] movq m0, [r0-8] ; lt .. @@ -162,14 +164,15 @@ cglobal pred4x4_horizontal_down_10_%1, 3,3 RET %endmacro -INIT_XMM +INIT_XMM sse2 %define PALIGNR PALIGNR_MMX -PRED4x4_HD sse2 +PRED4x4_HD +INIT_XMM ssse3 %define PALIGNR PALIGNR_SSSE3 -PRED4x4_HD ssse3 +PRED4x4_HD %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_HD avx +INIT_XMM avx +PRED4x4_HD %endif ;- @@ -192,8 +195,8 @@ PRED4x4_HD avx HADDD %1, %2 %endmacro -INIT_MMX -cglobal pred4x4_dc_10_mmxext, 3,3 +INIT_MMX mmx2 +cglobal pred4x4_dc_10, 3, 3 subr0, r2 lear1, [r0+r2*2] movq m2, [r0+r2*1-8] @@ -216,8 +219,8 @@ cglobal pred4x4_dc_10_mmxext, 3,3 ;- ; void pred4x4_down_left(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_DL 1 -cglobal pred4x4_down_left_10_%1, 3,3 +%macro PRED4x4_DL 0 +cglobal pred4x4_down_left_10, 3, 3 subr0, r2 movq m0, [r0] movhps m0, [r1] @@ -236,18 +239,18 @@ cglobal pred4x4_down_left_10_%1, 3,3 RET %endmacro -INIT_XMM -PRED4x4_DL sse2 +INIT_XMM sse2 +PRED4x4_DL %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_DL avx +INIT_XMM avx +PRED4x4_DL %endif ;- ; void pred4x4_vertical_left(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_VL 1 -cglobal pred4x4_vertical_left_10_%1, 3,3 +%macro PRED4x4_VL 0 +cglobal pred4x4_vertical_left_10, 3, 3 subr0, r2 movu m1, [r0] movhps m1, [r1] @@ -265,18 +268,18 @@ cglobal pred4x4_vertical_left_10_%1, 3,3 RET %endmacro -INIT_XMM -PRED4x4_VL sse2 +INIT_XMM sse2 +PRED4x4_VL %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_VL avx +INIT_XMM avx +PRED4x4_VL %endif ;- ; void pred4x4_horizontal_up(pixel *src, const pixel *topright, int stride
Re: [libav-devel] [PATCH 1/2] Use PRED4x4/8x8/8x8L/16x16 macros to declare x86 intrapred prototypes.
Hi, On Mon, Oct 29, 2012 at 10:25 AM, Luca Barbato lu_z...@gentoo.org wrote: On 10/29/2012 04:44 AM, Ronald S. Bultje wrote: From: Ronald S. Bultje rsbul...@gmail.com If you don't mind x86: use PRED4x4/8x8/8x8L/16x16 macros to declare intrapred prototypes might fit better. Will change locally. Any other comments on the code or can I commit with that modified? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86inc: Add cpuflags_mmxext alias for cpuflags_mmx2
Hi, On Mon, Oct 29, 2012 at 5:15 PM, Diego Biurrun di...@biurrun.de wrote: This allows using mmxext as name in Libav while staying compatible with changes to the YASM macro infrastructure imported from x264. --- libavutil/x86/x86inc.asm |1 + 1 files changed, 1 insertions(+), 0 deletions(-) diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm index 1fe9f55..2655154 100644 --- a/libavutil/x86/x86inc.asm +++ b/libavutil/x86/x86inc.asm @@ -35,6 +35,7 @@ ; to x264-de...@videolan.org . %define program_name ff +%define cpuflags_mmxext cpuflags_mmx2 Any such a change to x86inc.asm is OK if and only if x264 accepts this change upstream. Loren? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86inc: Add cpuflags_mmxext alias for cpuflags_mmx2
Hi, On Mon, Oct 29, 2012 at 5:56 PM, Diego Biurrun di...@biurrun.de wrote: On Mon, Oct 29, 2012 at 05:25:08PM -0700, Ronald S. Bultje wrote: On Mon, Oct 29, 2012 at 5:15 PM, Diego Biurrun di...@biurrun.de wrote: This allows using mmxext as name in Libav while staying compatible with changes to the YASM macro infrastructure imported from x264. --- libavutil/x86/x86inc.asm |1 + 1 files changed, 1 insertions(+), 0 deletions(-) --- a/libavutil/x86/x86inc.asm +++ b/libavutil/x86/x86inc.asm @@ -35,6 +35,7 @@ ; to x264-de...@videolan.org . %define program_name ff +%define cpuflags_mmxext cpuflags_mmx2 Any such a change to x86inc.asm is OK if and only if x264 accepts this change upstream. Loren? The x86inc.asm files from x264 and Libav are already different, x264 uses x264 instead of ff as program_name. That's the reason why I chose to place the alias in this very spot: during upstream merges there will be no conflicts that are not a) trivial to fix and b) would have occurred anyway due to the existing difference. This patch is also what a room full of Libav developers accepted as a compromise at VDD. Please don't oppose the consensus of your peers. You keep trying to push this through. No. I do the syncing, and nobody wants to take that job from me, I assure you. Every time the semantics of cglobal change by however little, someone who knows that stuff needs to go in and fix/port it. I prefer to not have to take any additional tasks on me, such as maintaining forky differences. Thus, please, no. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/2] swscale: support gray to 9bit and 10bit formats
Hi, On Mon, Oct 29, 2012 at 7:14 PM, Luca Barbato lu_z...@gentoo.org wrote: With the input of Kostya and Ronald. --- libswscale/swscale.c | 38 -- libswscale/swscale_unscaled.c | 32 ++-- 2 files changed, 66 insertions(+), 4 deletions(-) So it works now, under valgrind? LGTM then. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/2] pixfmt: support more yuva formats
Hi, On Mon, Oct 29, 2012 at 7:14 PM, Luca Barbato lu_z...@gentoo.org wrote: Signed-off-by: Luca Barbato lu_z...@gentoo.org --- libavcodec/raw.c | 19 libavformat/nut.c | 28 + libavutil/pixdesc.c | 234 ++ libavutil/pixfmt.h| 28 + libswscale/utils.c| 18 tests/ref/lavfi/pixdesc | 18 tests/ref/lavfi/pixfmts_copy | 18 tests/ref/lavfi/pixfmts_null | 18 tests/ref/lavfi/pixfmts_scale | 18 tests/ref/lavfi/pixfmts_vflip | 18 10 files changed, 417 insertions(+) OK, assuming it works without new warnings under valgrind. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/4] x86: include x86inc.asm in x86util.asm
Hi, On Mon, Oct 29, 2012 at 7:28 PM, Diego Biurrun di...@biurrun.de wrote: -%include x86inc.asm %include x86util.asm Didn't you come up with the rule that each file should contain all headers it directly takes symbols from? How do these changes fit with that scheme? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 4/4] x86util: Add cpuflags_mmxext alias for cpuflags_mmx2
Hi, On Mon, Oct 29, 2012 at 7:28 PM, Diego Biurrun di...@biurrun.de wrote: mmxext is a more sensible name and more common in outside projects. --- libavutil/x86/x86util.asm |1 + 1 files changed, 1 insertions(+), 0 deletions(-) No. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH 1/2] Remove usage of INIT_AVX in h264_intrapred_10bit.asm.
From: Ronald S. Bultje rsbul...@gmail.com --- libavcodec/x86/h264_intrapred_10bit.asm | 306 libavcodec/x86/h264_intrapred_init.c| 40 ++--- 2 files changed, 177 insertions(+), 169 deletions(-) diff --git a/libavcodec/x86/h264_intrapred_10bit.asm b/libavcodec/x86/h264_intrapred_10bit.asm index 529134e..c3f6dc4 100644 --- a/libavcodec/x86/h264_intrapred_10bit.asm +++ b/libavcodec/x86/h264_intrapred_10bit.asm @@ -53,8 +53,8 @@ SECTION .text ;- ; void pred4x4_down_right(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_DR 1 -cglobal pred4x4_down_right_10_%1, 3,3 +%macro PRED4x4_DR 0 +cglobal pred4x4_down_right_10, 3, 3 sub r0, r2 lea r1, [r0+r2*2] movhpsm1, [r1-8] @@ -79,21 +79,22 @@ cglobal pred4x4_down_right_10_%1, 3,3 RET %endmacro -INIT_XMM +INIT_XMM sse2 %define PALIGNR PALIGNR_MMX -PRED4x4_DR sse2 +PRED4x4_DR +INIT_XMM ssse3 %define PALIGNR PALIGNR_SSSE3 -PRED4x4_DR ssse3 +PRED4x4_DR %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_DR avx +INIT_XMM avx +PRED4x4_DR %endif ;- ; void pred4x4_vertical_right(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_VR 1 -cglobal pred4x4_vertical_right_10_%1, 3,3,6 +%macro PRED4x4_VR 0 +cglobal pred4x4_vertical_right_10, 3, 3, 6 sub r0, r2 lea r1, [r0+r2*2] movqm5, [r0]; t3t2t1t0 @@ -119,21 +120,22 @@ cglobal pred4x4_vertical_right_10_%1, 3,3,6 RET %endmacro -INIT_XMM +INIT_XMM sse2 %define PALIGNR PALIGNR_MMX -PRED4x4_VR sse2 +PRED4x4_VR +INIT_XMM ssse3 %define PALIGNR PALIGNR_SSSE3 -PRED4x4_VR ssse3 +PRED4x4_VR %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_VR avx +INIT_XMM avx +PRED4x4_VR %endif ;- ; void pred4x4_horizontal_down(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_HD 1 -cglobal pred4x4_horizontal_down_10_%1, 3,3 +%macro PRED4x4_HD 0 +cglobal pred4x4_horizontal_down_10, 3, 3 subr0, r2 lear1, [r0+r2*2] movq m0, [r0-8] ; lt .. @@ -162,14 +164,15 @@ cglobal pred4x4_horizontal_down_10_%1, 3,3 RET %endmacro -INIT_XMM +INIT_XMM sse2 %define PALIGNR PALIGNR_MMX -PRED4x4_HD sse2 +PRED4x4_HD +INIT_XMM ssse3 %define PALIGNR PALIGNR_SSSE3 -PRED4x4_HD ssse3 +PRED4x4_HD %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_HD avx +INIT_XMM avx +PRED4x4_HD %endif ;- @@ -192,8 +195,8 @@ PRED4x4_HD avx HADDD %1, %2 %endmacro -INIT_MMX -cglobal pred4x4_dc_10_mmxext, 3,3 +INIT_MMX mmx2 +cglobal pred4x4_dc_10, 3, 3 subr0, r2 lear1, [r0+r2*2] movq m2, [r0+r2*1-8] @@ -216,8 +219,8 @@ cglobal pred4x4_dc_10_mmxext, 3,3 ;- ; void pred4x4_down_left(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_DL 1 -cglobal pred4x4_down_left_10_%1, 3,3 +%macro PRED4x4_DL 0 +cglobal pred4x4_down_left_10, 3, 3 subr0, r2 movq m0, [r0] movhps m0, [r1] @@ -236,18 +239,18 @@ cglobal pred4x4_down_left_10_%1, 3,3 RET %endmacro -INIT_XMM -PRED4x4_DL sse2 +INIT_XMM sse2 +PRED4x4_DL %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_DL avx +INIT_XMM avx +PRED4x4_DL %endif ;- ; void pred4x4_vertical_left(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_VL 1 -cglobal pred4x4_vertical_left_10_%1, 3,3 +%macro PRED4x4_VL 0 +cglobal pred4x4_vertical_left_10, 3, 3 subr0, r2 movu m1, [r0] movhps m1, [r1] @@ -265,18 +268,18 @@ cglobal pred4x4_vertical_left_10_%1, 3,3 RET %endmacro -INIT_XMM -PRED4x4_VL sse2 +INIT_XMM sse2 +PRED4x4_VL %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_VL avx +INIT_XMM avx +PRED4x4_VL %endif ;- ; void pred4x4_horizontal_up(pixel *src, const pixel *topright, int stride) ;- -INIT_MMX -cglobal pred4x4_horizontal_up_10_mmxext, 3,3 +INIT_MMX mmx2 +cglobal pred4x4_horizontal_up_10, 3, 3 sub r0, r2 lea r1, [r0+r2*2] movq m0, [r0+r2*1-8] @@ -309,8 +312,8 @@ cglobal pred4x4_horizontal_up_10_mmxext, 3,3
[libav-devel] [PATCH 2/2] Remove INIT_AVX from x86inc.asm.
From: Ronald S. Bultje rsbul...@gmail.com --- libavutil/x86/x86inc.asm | 8 1 file changed, 8 deletions(-) diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm index d734c6e..1fe9f55 100644 --- a/libavutil/x86/x86inc.asm +++ b/libavutil/x86/x86inc.asm @@ -678,14 +678,6 @@ SECTION .note.GNU-stack noalloc noexec nowrite progbits INIT_CPUFLAGS %1 %endmacro -; FIXME: INIT_AVX can be replaced by INIT_XMM avx -%macro INIT_AVX 0 -INIT_XMM -%assign avx_enabled 1 -%define PALIGNR PALIGNR_SSSE3 -%define RESET_MM_PERMUTATION INIT_AVX -%endmacro - %macro INIT_YMM 0-1+ %assign avx_enabled 1 %define RESET_MM_PERMUTATION INIT_YMM %1 -- 1.7.11.3 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH 1/2] Use PRED4x4/8x8/8x8L/16x16 macros to declare x86 intrapred prototypes.
From: Ronald S. Bultje rsbul...@gmail.com --- libavcodec/x86/h264_intrapred.asm| 80 +- libavcodec/x86/h264_intrapred_init.c | 296 ++- 2 files changed, 190 insertions(+), 186 deletions(-) diff --git a/libavcodec/x86/h264_intrapred.asm b/libavcodec/x86/h264_intrapred.asm index dc418f7..7c6aa12 100644 --- a/libavcodec/x86/h264_intrapred.asm +++ b/libavcodec/x86/h264_intrapred.asm @@ -53,7 +53,7 @@ cextern pw_32 ; void pred16x16_vertical(uint8_t *src, int stride) ;- -cglobal pred16x16_vertical_mmx, 2,3 +cglobal pred16x16_vertical_8_mmx, 2,3 sub r0, r1 mov r2, 8 movq mm0, [r0+0] @@ -68,7 +68,7 @@ cglobal pred16x16_vertical_mmx, 2,3 jg .loop REP_RET -cglobal pred16x16_vertical_sse, 2,3 +cglobal pred16x16_vertical_8_sse, 2,3 sub r0, r1 mov r2, 4 movaps xmm0, [r0] @@ -88,7 +88,7 @@ cglobal pred16x16_vertical_sse, 2,3 ;- %macro PRED16x16_H 0 -cglobal pred16x16_horizontal, 2,3 +cglobal pred16x16_horizontal_8, 2,3 mov r2, 8 %if cpuflag(ssse3) mova m2, [pb_3] @@ -130,7 +130,7 @@ INIT_XMM ;- %macro PRED16x16_DC 0 -cglobal pred16x16_dc, 2,7 +cglobal pred16x16_dc_8, 2,7 mov r4, r0 sub r0, r1 pxor mm0, mm0 @@ -193,7 +193,7 @@ INIT_XMM ;- %macro PRED16x16_TM_MMX 0 -cglobal pred16x16_tm_vp8, 2,5 +cglobal pred16x16_tm_vp8_8, 2,5 subr0, r1 pxor mm7, mm7 movq mm0, [r0+0] @@ -234,7 +234,7 @@ INIT_MMX mmx2 PRED16x16_TM_MMX INIT_MMX -cglobal pred16x16_tm_vp8_sse2, 2,6,6 +cglobal pred16x16_tm_vp8_8_sse2, 2,6,6 sub r0, r1 pxor xmm2, xmm2 movdqa xmm0, [r0] @@ -274,7 +274,7 @@ cglobal pred16x16_tm_vp8_sse2, 2,6,6 ;- %macro H264_PRED16x16_PLANE 1 -cglobal pred16x16_plane_%1, 2,9,7 +cglobal pred16x16_plane_%1_8, 2,9,7 mov r2, r1 ; +stride neg r1 ; -stride @@ -556,7 +556,7 @@ INIT_XMM ;- %macro H264_PRED8x8_PLANE 0 -cglobal pred8x8_plane, 2,9,7 +cglobal pred8x8_plane_8, 2,9,7 mov r2, r1 ; +stride neg r1 ; -stride @@ -730,7 +730,7 @@ INIT_XMM ; void pred8x8_vertical(uint8_t *src, int stride) ;- -cglobal pred8x8_vertical_mmx, 2,2 +cglobal pred8x8_vertical_8_mmx, 2,2 subr0, r1 movq mm0, [r0] %rep 3 @@ -747,7 +747,7 @@ cglobal pred8x8_vertical_mmx, 2,2 ;- %macro PRED8x8_H 0 -cglobal pred8x8_horizontal, 2,3 +cglobal pred8x8_horizontal_8, 2,3 mov r2, 4 %if cpuflag(ssse3) mova m2, [pb_3] @@ -774,7 +774,7 @@ INIT_MMX ;- ; void pred8x8_top_dc_mmxext(uint8_t *src, int stride) ;- -cglobal pred8x8_top_dc_mmxext, 2,5 +cglobal pred8x8_top_dc_8_mmxext, 2,5 sub r0, r1 movq mm0, [r0] pxor mm1, mm1 @@ -809,7 +809,7 @@ cglobal pred8x8_top_dc_mmxext, 2,5 ;- INIT_MMX -cglobal pred8x8_dc_mmxext, 2,5 +cglobal pred8x8_dc_8_mmxext, 2,5 sub r0, r1 pxor m7, m7 movd m0, [r0+0] @@ -869,7 +869,7 @@ cglobal pred8x8_dc_mmxext, 2,5 ; void pred8x8_dc_rv40(uint8_t *src, int stride) ;- -cglobal pred8x8_dc_rv40_mmxext, 2,7 +cglobal pred8x8_dc_rv40_8_mmxext, 2,7 mov r4, r0 sub r0, r1 pxor mm0, mm0 @@ -906,7 +906,7 @@ cglobal pred8x8_dc_rv40_mmxext, 2,7 ;- %macro PRED8x8_TM_MMX 0 -cglobal pred8x8_tm_vp8, 2,6 +cglobal pred8x8_tm_vp8_8, 2,6 subr0, r1 pxor mm7, mm7 movq mm0, [r0] @@ -946,7 +946,7 @@ INIT_MMX mmx2 PRED8x8_TM_MMX INIT_MMX -cglobal pred8x8_tm_vp8_sse2, 2,6,4 +cglobal pred8x8_tm_vp8_8_sse2, 2,6,4 sub r0, r1 pxor xmm1, xmm1 movq xmm0, [r0] @@ -974,7 +974,7 @@ cglobal pred8x8_tm_vp8_sse2, 2,6,4 jg .loop REP_RET -cglobal pred8x8_tm_vp8_ssse3, 2,3,6 +cglobal pred8x8_tm_vp8_8_ssse3, 2,3,6 sub r0, r1 movdqa xmm4, [tm_shuf] pxor xmm1, xmm1 @@ -1016,7 +1016,7 @@ cglobal
[libav-devel] [PATCH 2/2] Use ptrdiff_t instead of int for intra pred stride function parameter.
From: Ronald S. Bultje rsbul...@gmail.com --- libavcodec/arm/h264pred_init_arm.c | 36 +++--- libavcodec/h264pred.c| 79 libavcodec/h264pred.h| 18 +-- libavcodec/h264pred_template.c | 233 +-- libavcodec/x86/h264_intrapred_init.c | 15 ++- 5 files changed, 264 insertions(+), 117 deletions(-) diff --git a/libavcodec/arm/h264pred_init_arm.c b/libavcodec/arm/h264pred_init_arm.c index 371b1a6..39c0121 100644 --- a/libavcodec/arm/h264pred_init_arm.c +++ b/libavcodec/arm/h264pred_init_arm.c @@ -23,25 +23,25 @@ #include libavutil/arm/cpu.h #include libavcodec/h264pred.h -void ff_pred16x16_vert_neon(uint8_t *src, int stride); -void ff_pred16x16_hor_neon(uint8_t *src, int stride); -void ff_pred16x16_plane_neon(uint8_t *src, int stride); -void ff_pred16x16_dc_neon(uint8_t *src, int stride); -void ff_pred16x16_128_dc_neon(uint8_t *src, int stride); -void ff_pred16x16_left_dc_neon(uint8_t *src, int stride); -void ff_pred16x16_top_dc_neon(uint8_t *src, int stride); +void ff_pred16x16_vert_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred16x16_hor_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred16x16_plane_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred16x16_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred16x16_128_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred16x16_left_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred16x16_top_dc_neon(uint8_t *src, ptrdiff_t stride); -void ff_pred8x8_vert_neon(uint8_t *src, int stride); -void ff_pred8x8_hor_neon(uint8_t *src, int stride); -void ff_pred8x8_plane_neon(uint8_t *src, int stride); -void ff_pred8x8_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_128_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_left_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_top_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_l0t_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_0lt_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_l00_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_0l0_dc_neon(uint8_t *src, int stride); +void ff_pred8x8_vert_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_hor_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_plane_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_128_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_left_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_top_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_l0t_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_0lt_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_l00_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_0l0_dc_neon(uint8_t *src, ptrdiff_t stride); static void ff_h264_pred_init_neon(H264PredContext *h, int codec_id, const int bit_depth, const int chroma_format_idc) { diff --git a/libavcodec/h264pred.c b/libavcodec/h264pred.c index fb44046..94cf9d0 100644 --- a/libavcodec/h264pred.c +++ b/libavcodec/h264pred.c @@ -39,7 +39,9 @@ #include h264pred_template.c #undef BIT_DEPTH -static void pred4x4_vertical_vp8_c(uint8_t *src, const uint8_t *topright, int stride){ +static void pred4x4_vertical_vp8_c(uint8_t *src, const uint8_t *topright, + ptrdiff_t stride) +{ const unsigned lt = src[-1-1*stride]; LOAD_TOP_EDGE LOAD_TOP_RIGHT_EDGE @@ -54,7 +56,9 @@ static void pred4x4_vertical_vp8_c(uint8_t *src, const uint8_t *topright, int st AV_WN32A(src+3*stride, v); } -static void pred4x4_horizontal_vp8_c(uint8_t *src, const uint8_t *topright, int stride){ +static void pred4x4_horizontal_vp8_c(uint8_t *src, const uint8_t *topright, + ptrdiff_t stride) +{ const unsigned lt = src[-1-1*stride]; LOAD_LEFT_EDGE @@ -64,7 +68,9 @@ static void pred4x4_horizontal_vp8_c(uint8_t *src, const uint8_t *topright, int AV_WN32A(src+3*stride, ((l2 + 2*l3 + l3 + 2) 2)*0x01010101); } -static void pred4x4_down_left_svq3_c(uint8_t *src, const uint8_t *topright, int stride){ +static void pred4x4_down_left_svq3_c(uint8_t *src, const uint8_t *topright, + ptrdiff_t stride) +{ LOAD_TOP_EDGE LOAD_LEFT_EDGE @@ -86,7 +92,9 @@ static void pred4x4_down_left_svq3_c(uint8_t *src, const uint8_t *topright, int src[3+3*stride]=(l3 + t3)1; } -static void pred4x4_down_left_rv40_c(uint8_t *src, const uint8_t *topright, int stride){ +static void pred4x4_down_left_rv40_c(uint8_t *src, const uint8_t *topright, + ptrdiff_t stride) +{ LOAD_TOP_EDGE LOAD_TOP_RIGHT_EDGE LOAD_LEFT_EDGE @@ -110,7 +118,10 @@ static void pred4x4_down_left_rv40_c(uint8_t *src, const uint8_t *topright, int src[3+3*stride]=(t6 + t7 + 1 + l6 + l7 + 1)2; } -static void pred4x4_down_left_rv40_nodown_c(uint8_t *src, const uint8_t *topright, int stride){ +static void
Re: [libav-devel] [PATCH 2/2] swscale: support gray to 9bit and 10bit formats
Hi, On Fri, Oct 26, 2012 at 8:35 AM, Luca Barbato lu_z...@gentoo.org wrote: Signed-off-by: Luca Barbato lu_z...@gentoo.org --- Partially fixes the issue, there is still something wrong somewhere, possibly a buffer overrun or yet some memory not properly set. How do we reproduce? What kind of msgs do you get in valgrind? Do 32/64bit outputs match? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 11/11] vp8: fix memset() crossing array boundary
Hi, On Fri, Oct 26, 2012 at 5:50 PM, Mans Rullgard m...@mansr.com wrote: Indexing across array boundaries is not allowed by C99. Signed-off-by: Mans Rullgard m...@mansr.com --- libavcodec/vp8.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/libavcodec/vp8.c b/libavcodec/vp8.c index 2766c9e..d9902c4 100644 --- a/libavcodec/vp8.c +++ b/libavcodec/vp8.c @@ -1964,7 +1964,8 @@ static int vp8_decode_frame(AVCodecContext *avctx, void *data, int *data_size, // top edge of 127 for intra prediction if (!(avctx-flags CODEC_FLAG_EMU_EDGE)) { s-top_border[0][15] = s-top_border[0][23] = 127; -memset(s-top_border[1]-1, 127, s-mb_width*sizeof(*s-top_border)+1); +s-top_border[0][31] = 127; +memset(s-top_border[1], 127, s-mb_width*sizeof(*s-top_border)); } OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] swscale: support gray to 9bit and 10bit formats
Hi, On Sun, Oct 14, 2012 at 5:22 PM, Luca Barbato lu_z...@gentoo.org wrote: +if (isPlanar(dstFormat) isALPHA(dstFormat) !alpPixBuf) { +int length = dstW; +int height = dstY - lastDstY; +if (is9_OR_10BPS(dstFormat)) { +const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat); +fill_plane9or10(dst[3], dstStride[3], length, height, lastDstY, +255, desc-comp[3].depth_minus1 + 1, +isBE(dstFormat)); +} else if (is16BPS(c-dstFormat)) + length *= 2; + +fillPlane(dst[3], dstStride[3], length, height, lastDstY, 255); Doesn't this call fillPlane, even if we're 9 or 10 bits? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Wed, Oct 24, 2012 at 10:07 AM, Måns Rullgård m...@mansr.com wrote: Ronald S. Bultje rsbul...@gmail.com writes: From: Ronald S. Bultje rsbul...@gmail.com Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++- libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 167 +--- 5 files changed, 181 insertions(+), 93 deletions(-) What happened to this? Is there something wrong with the patch? I am addressing reviews from the x264 people and am somewhat slow at testing new revisions because of other work... Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] Problem with H.264 and NALU chunks
Hi, On Fri, Oct 19, 2012 at 6:55 PM, Felipe Contreras felipe.contre...@gmail.com wrote: Hi, While testing an mpegts stream I found what appears to be a bug in the H.264 decoder that only happens on certain processors. The stream is here[1], but I've split the specific chunks I get from the demuxer. They are not complete access-units, but I've set the flag CODEC_FLAG2_CHUNKS. On one laptop it works perfectly fine, on another I get a bunch of corruption, errors, and valgrind complains that wrong memory accesses. The first one is a Core i5, the other a Core i7. This happens both on v0.8.3 and the master branch. If I use GStreamer's H.264 parser to aggregate the NAL units into access-units, it works fine. Any ideas? Aren't you supposed to use the internal h264 parser to do the chunking before feeding it into the decoder if you choose to not use the built-in demuxers? (The built-in ones will automatically call the parser for you if you call av_read_frame() instead of av_read_packet().) Or in other words, is this in any way reproducible with avconv/avplay? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] pixfmt: support more yuva formats
Hi, On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato lu_z...@gentoo.org wrote: --- Here an initial patch to support many yuva, apparently either I botched adding them (since I did lots of cut and paste) or they manage the expose some flaws in swscale. valgrind manages to spot something and might be nice if other people try this and see if the values for the fate-lavfi change after each round. If this is what I think it is, search for YUVA420P in swscale and replace it with a generic alpha check. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] pixfmt: support more yuva formats
Hi, On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato lu_z...@gentoo.org wrote: --- Here an initial patch to support many yuva, apparently either I botched adding them (since I did lots of cut and paste) or they manage the expose some flaws in swscale. valgrind manages to spot something and might be nice if other people try this and see if the values for the fate-lavfi change after each round. If this is what I think it is, search for YUVA420P in swscale and replace it with a generic alpha check. Oh I see now. swscale.c:661 if (isPlanar(dstFormat) isALPHA(dstFormat) !alpPixBuf) fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 255); This only works for 8bit. For 9bit, you need a word fill with a value of (1 numbits) - 1. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] pixfmt: support more yuva formats
Hi, On Sun, Oct 14, 2012 at 12:53 PM, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato lu_z...@gentoo.org wrote: --- Here an initial patch to support many yuva, apparently either I botched adding them (since I did lots of cut and paste) or they manage the expose some flaws in swscale. valgrind manages to spot something and might be nice if other people try this and see if the values for the fate-lavfi change after each round. If this is what I think it is, search for YUVA420P in swscale and replace it with a generic alpha check. Oh I see now. swscale.c:661 if (isPlanar(dstFormat) isALPHA(dstFormat) !alpPixBuf) fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 255); This only works for 8bit. For 9bit, you need a word fill with a value of (1 numbits) - 1. Which actually leads to a question: does any of this exist in the real world? I've never, ever, ever seen an image with non-8bpc alpha planes. I hate to invent stuff that has no utility but does carry implementation burden. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] pixfmt: support more yuva formats
Hi, On Sun, Oct 14, 2012 at 1:39 PM, Luca Barbato lu_z...@gentoo.org wrote: On 10/14/2012 09:54 PM, Ronald S. Bultje wrote: Hi, On Sun, Oct 14, 2012 at 12:53 PM, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato lu_z...@gentoo.org wrote: --- Here an initial patch to support many yuva, apparently either I botched adding them (since I did lots of cut and paste) or they manage the expose some flaws in swscale. valgrind manages to spot something and might be nice if other people try this and see if the values for the fate-lavfi change after each round. If this is what I think it is, search for YUVA420P in swscale and replace it with a generic alpha check. Oh I see now. swscale.c:661 if (isPlanar(dstFormat) isALPHA(dstFormat) !alpPixBuf) fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 255); This only works for 8bit. For 9bit, you need a word fill with a value of (1 numbits) - 1. Which actually leads to a question: does any of this exist in the real world? I've never, ever, ever seen an image with non-8bpc alpha planes. I hate to invent stuff that has no utility but does carry implementation burden. You asked me to add those ^^; Uhm, ok, maybe I meant something else? I mean, there really isn't much use for 9-15bpp alpha channels as far as I know. 8bpp is obvious and I can sort of see 16bpp being useful (although I don't think I've ever seen it used). BTW while I was investigating I noticed all the previous 9 and 10 formats look strange, if you have time we could have a look. Can you elaborate? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] pixfmt: support more yuva formats
Hi, On Sun, Oct 14, 2012 at 2:12 PM, Luca Barbato lu_z...@gentoo.org wrote: On 10/14/2012 10:43 PM, Ronald S. Bultje wrote: Hi, On Sun, Oct 14, 2012 at 1:39 PM, Luca Barbato lu_z...@gentoo.org wrote: On 10/14/2012 09:54 PM, Ronald S. Bultje wrote: Hi, On Sun, Oct 14, 2012 at 12:53 PM, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato lu_z...@gentoo.org wrote: --- Here an initial patch to support many yuva, apparently either I botched adding them (since I did lots of cut and paste) or they manage the expose some flaws in swscale. valgrind manages to spot something and might be nice if other people try this and see if the values for the fate-lavfi change after each round. If this is what I think it is, search for YUVA420P in swscale and replace it with a generic alpha check. Oh I see now. swscale.c:661 if (isPlanar(dstFormat) isALPHA(dstFormat) !alpPixBuf) fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 255); This only works for 8bit. For 9bit, you need a word fill with a value of (1 numbits) - 1. Which actually leads to a question: does any of this exist in the real world? I've never, ever, ever seen an image with non-8bpc alpha planes. I hate to invent stuff that has no utility but does carry implementation burden. You asked me to add those ^^; Uhm, ok, maybe I meant something else? I mean, there really isn't much use for 9-15bpp alpha channels as far as I know. 8bpp is obvious and I can sort of see 16bpp being useful (although I don't think I've ever seen it used). BTW while I was investigating I noticed all the previous 9 and 10 formats look strange, if you have time we could have a look. Can you elaborate? try this command ./avconv -v quiet -nostats -y -cpuflags all -i ./tests/vsynth1/%02d.pgm -vcodec rawvideo -pix_fmt yuv420p10 -f nut - | ./avplay - You will see it blue tinted while it should be gray. (you might need to apply my yet to be reviewed patch to fix nut with rawvideo). Well yes, can we first sort out this stuff without nut? I mean, I really don't have much reason to trust nut at this moment. When you test a typical conversion (convert that thing to nut without colorspace conversion, and then ./avplay -vf format=yuv420p10le), it works, so the colorspace conversion itself is correct. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] rmdec: Move code shared with Matroska demuxer to RealMedia common code
Hi, On Sun, Oct 14, 2012 at 3:02 PM, Diego Biurrun di...@biurrun.de wrote: This removes a dependency of the Matroska demuxer on the RealMedia demuxer while only minimally bloating a RealMedia muxer only build. --- libavformat/Makefile |4 ++-- libavformat/rm.c | 41 + libavformat/rmdec.c | 41 - 3 files changed, 43 insertions(+), 43 deletions(-) I'm a little confused, rm.c contains code that is to be shared between rmdec and rmenc. Shouldn't you add a new file (e.g. sipr.c) that contains sipr-specific code to be shared between different demuxers (e.g. rmdec and matroskadec)? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 09/15] swscale: avoid pointless use of compound literals
On Sun, Oct 14, 2012 at 8:11 PM, Mans Rullgard m...@mansr.com wrote: Some compilers (e.g. old gcc) have trouble with these. Signed-off-by: Mans Rullgard m...@mansr.com --- libswscale/swscale_unscaled.c | 17 + 1 file changed, 9 insertions(+), 8 deletions(-) diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c index 70eff72..5efc647 100644 --- a/libswscale/swscale_unscaled.c +++ b/libswscale/swscale_unscaled.c @@ -396,6 +396,11 @@ static int planarRgbToRgbWrapper(SwsContext *c, const uint8_t *src[], uint8_t *dst[], int dstStride[]) { int alpha_first = 0; +const uint8_t *src102[] = { src[1], src[0], src[2] }; +const uint8_t *src201[] = { src[2], src[0], src[1] }; +int stride102[] = { srcStride[1], srcStride[0], srcStride[2] }; +int stride201[] = { srcStride[2], srcStride[0], srcStride[1] }; + if (c-srcFormat != AV_PIX_FMT_GBRP) { av_log(c, AV_LOG_ERROR, unsupported planar RGB conversion %s - %s\n, av_get_pix_fmt_name(c-srcFormat), @@ -405,15 +410,13 @@ static int planarRgbToRgbWrapper(SwsContext *c, const uint8_t *src[], switch (c-dstFormat) { case AV_PIX_FMT_BGR24: -gbr24ptopacked24((const uint8_t *[]) { src[1], src[0], src[2] }, - (int []) { srcStride[1], srcStride[0], srcStride[2] }, +gbr24ptopacked24(src102, stride102, dst[0] + srcSliceY * dstStride[0], dstStride[0], srcSliceH, c-srcW); break; I find it slightly hilarious that there is no way in hell this would have gotten in had the compiler's name been gcc. Why don't you use c99-to-c89 for this so-called compiler? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/2] H.264: Convert 8-bit qpel inlined assembly to yasm
Hi, On Sat, Oct 13, 2012 at 9:05 AM, Diego Biurrun di...@biurrun.de wrote: On Sat, Oct 13, 2012 at 10:04:50AM -0500, Daniel Kang wrote: +%macro op_avgh 3 +movh %3, %2 +pavgb %1, %3 +movh %2, %1 +%endmacro + +%macro op_avg 3 +pavgb %1, %2 +mova %2, %1 +%endmacro + +%macro op_puth 3 +movh %2, %1 +%endmacro + +%macro op_put 3 +mova %2, %1 +%endmacro All of these macros only take 2 parameters. op_avgh takes 3. +.loop: +movh m1, [r1-1] +movh m2, [r1+0] +movh m3, [r1+1] +movh m0, [r1+2] Spaces around operators would help readability IMO; more below... No spaces around operators in assembly dereferences please. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] Fix for FFMPEG vp8dec loopfilter delta values reset at keyframes
Hi, On Fri, Oct 12, 2012 at 7:12 AM, Luca Barbato lu_z...@gentoo.org wrote: On 10/12/2012 12:42 PM, Sami Pietilä wrote: Hi all, while testing a VP8 encoder I found a mismatch between FFMPEG and libvpx VP8 decoders. The reason for this mismatch is that FFMPEG doesn't reset loopfilter delta values at keyframes. Patch that fixes the issue is below. I've verified that the output of ffmpeg after this patch matches libvpx. If you need I can send you a stream that shows the issue. Would be nice to have a small vector to add it to fate, indeed. The patch itself looks like that this specific detail hadn't been considered, while implementing it. If you can point me where in the specification it is stated I'd like to point it in the commit message. It's only natural for all codec state to be reset in a keyframe. It is supposed to be a re-entry point where all values are reset to defaults. The patch itself is LGTM, I was about to push it myself just now. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] Fix for FFMPEG vp8dec loopfilter delta values reset at keyframes
Hi, On Fri, Oct 12, 2012 at 7:18 AM, Luca Barbato lu_z...@gentoo.org wrote: On 10/12/2012 04:15 PM, Ronald S. Bultje wrote: If you can point me where in the specification it is stated I'd like to point it in the commit message. It's only natural for all codec state to be reset in a keyframe. It is supposed to be a re-entry point where all values are reset to defaults. The patch itself is LGTM, I was about to push it myself just now. Please do =) Pushed. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] Support for MSVC.
Hi, On Tue, Oct 9, 2012 at 10:33 AM, aviad rozenhek avia...@gmail.com wrote: On Thu, Jun 14, 2012 at 1:59 AM, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, see attached. It's kinda big, some parts can/should be done differently, but it's a starting point so let's start talking about how to get this in the right way. Some general comments: - only tested on 32bit so far - no it won't compile on non-MSVC with my patch (I had to break the Makefile in a couple of places to get linking to work) - the include of stdlib.h everywhere is because of [1]. We may need to add av_restrict instead. - the include of mathematics.h, avconfig.h and avstring.h everywhere is (I think) legit, to account for snprintf(), M_PI or inline. - the added assembly in fmtconvert.asm helps to get rid of a VLA in fmtconvert_mmx.c. - test.c converts c99 to c89 which MSVC can compile. Don't look at it unless you want to see true pain, it's intended to be shipped outside, like gas-preprocessor.pl. (unit.c and unit2.c are unittests for it.) - the renames of libavutil/x86/cpu.c, libavcodec/x86/mlpdsp.c and libswscale/x86/rgb2rgb.c are to prevent duplicate filenames in the same library, which makes the MSVC linker crap out in debug mode (works fine in release mode). - it mostly passes fate, except zlib-dependent tests (because I didn't try including zlib yet) and wtv (which uses gmtime(), which is broken on Windows since it only supports years up to 3000, and our test file uses 10900.) All other tests pass. Ronald [1] http://social.msdn.microsoft.com/Forums/en-US/vcgeneral/thread/bf3c0191-c15d-4e73-a111-3796608900c7 can you provide prebuilt binaries for c99-to-c89.exe for windows? it will make using it that much easier https://gerrit.chromium.org/gerrit/#/c/34728/ You'll need c99wrap.exe also, but that one is trivial to build yourself, since it has no dependencies. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/5] vc1dec: Invoke edge emulation regardless of MV precision for 1-MV chroma
Hi, On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar mashiat.sar...@gmail.com wrote: This is required due to the way VC-1 handles chroma pull-back which may end up causing negative chroma MV for zero luma MV. Edge emulation needs to be invoked in such cases. This problem only affects chroma Y motion vector. --- libavcodec/vc1dec.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libavcodec/vc1dec.c b/libavcodec/vc1dec.c index 42eb4a5..2683c86 100644 --- a/libavcodec/vc1dec.c +++ b/libavcodec/vc1dec.c @@ -430,7 +430,7 @@ static void vc1_mc_1mv(VC1Context *v, int dir) if (v-rangeredfrm || (v-mv_mode == MV_PMODE_INTENSITY_COMP) || s-h_edge_pos 22 || v_edge_pos 22 || (unsigned)(src_x - s-mspel) s-h_edge_pos - (mx3) - 16 - s-mspel * 3 -|| (unsigned)(src_y - s-mspel) v_edge_pos- (my3) - 16 - s-mspel * 3) { +|| (unsigned)(src_y - 1) v_edge_pos- (my3) - 16 - 3) { I'm slightly curious why this is only the case for the vertical dimension, but not the horizontal. Is that intentional, or an oversight? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 5/5] Double motion vector range for HPEL interlaced picture in proper place
Hi, On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar mashiat.sar...@gmail.com wrote: The existing code is not in the right place and it should cover both interlaced frame and field pictures. --- libavcodec/vc1.c|5 + libavcodec/vc1dec.c |4 2 files changed, 5 insertions(+), 4 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 4/5] vc1dec: Set opposite to the correct value for 1REF field pictures
Hi, On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar mashiat.sar...@gmail.com wrote: --- libavcodec/vc1dec.c | 14 ++ 1 file changed, 10 insertions(+), 4 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/5] vc1dec: Set chroma reference field from REFFIELD for 1REF field pictures
Hi, On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar mashiat.sar...@gmail.com wrote: Interlaced field pictures can have one or two reference pictures, signaled by NUMREF syntax element. For single reference pictures, reference picture is determined by REFFIELD syntax element. --- libavcodec/vc1dec.c |1 + 1 file changed, 1 insertion(+) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/5] vc1dec: Invoke edge emulation regardless of MV precision for 1-MV chroma
Hi, On Tue, Oct 9, 2012 at 4:56 PM, Mashiat Sarker Shakkhar mashiat.sar...@gmail.com wrote: On 10/9/2012 7:37 PM, Ronald S. Bultje wrote: Hi, On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar mashiat.sar...@gmail.com wrote: This is required due to the way VC-1 handles chroma pull-back which may end up causing negative chroma MV for zero luma MV. Edge emulation needs to be invoked in such cases. This problem only affects chroma Y motion vector. --- libavcodec/vc1dec.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libavcodec/vc1dec.c b/libavcodec/vc1dec.c index 42eb4a5..2683c86 100644 --- a/libavcodec/vc1dec.c +++ b/libavcodec/vc1dec.c @@ -430,7 +430,7 @@ static void vc1_mc_1mv(VC1Context *v, int dir) if (v-rangeredfrm || (v-mv_mode == MV_PMODE_INTENSITY_COMP) || s-h_edge_pos 22 || v_edge_pos 22 || (unsigned)(src_x - s-mspel) s-h_edge_pos - (mx3) - 16 - s-mspel * 3 -|| (unsigned)(src_y - s-mspel) v_edge_pos- (my3) - 16 - s-mspel * 3) { +|| (unsigned)(src_y - 1) v_edge_pos- (my3) - 16 - 3) { I'm slightly curious why this is only the case for the vertical dimension, but not the horizontal. Is that intentional, or an oversight? This thread contains my earlier discussion on a rejected patch which tried to fix the same bug: http://patches.libav.org/patch/26240/ OK so it's not because of pullback per se, but because of field adjustment, then it makes sense. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/9] Replace PIX_FMT_* - AV_PIX_FMT_*, PixelFormat - AVPixelFormat
Hi, On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov an...@khirnov.net wrote: [..] Yay! OK. (Didn't review at all, a sed replace can't be that hard, plus this is awesome.) Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.
Hi, On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov an...@khirnov.net wrote: --- avprobe.c |6 +++--- cmdutils.c| 16 +++- tools/graph2dot.c |4 ++-- 3 files changed, 16 insertions(+), 10 deletions(-) diff --git a/avprobe.c b/avprobe.c index 16a5d29..3a3ae0f 100644 --- a/avprobe.c +++ b/avprobe.c @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, int stream_idx) const char *profile; char val_str[128]; AVRational display_aspect_ratio; +const AVPixFmtDescriptor *desc; probe_object_header(stream); @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, int stream_idx) rational_string(val_str, sizeof(val_str), :, display_aspect_ratio)); } -probe_str(pix_fmt, - dec_ctx-pix_fmt != AV_PIX_FMT_NONE ? - av_pix_fmt_descriptors[dec_ctx-pix_fmt].name : unknown); +desc = av_pix_fmt_desc_get(dec_ctx-pix_fmt); I think this is an awful idea, it should be OK to access the array directly. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
From: Ronald S. Bultje rsbul...@gmail.com Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++- libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 167 +--- 5 files changed, 181 insertions(+), 93 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index 940a8f7..43aaf6d 100644 --- a/libavcodec/x86/h264_deblock.asm +++ b/libavcodec/x86/h264_deblock.asm @@ -399,14 +399,12 @@ DEBLOCK_LUMA ;- ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- -cglobal deblock_%1_luma_8, 5,5 +cglobal deblock_%1_luma_8, 5, 6 - HAVE_ALIGNED_STACK, 0, 2 * %2 lea r4, [r1*3] dec r2 ; alpha-1 neg r4 dec r3 ; beta-1 add r4, r0 ; pix-3*stride -%assign pad 2*%2+12-(stack_offset15) -SUB esp, pad movam0, [r4+r1] ; p1 movam1, [r4+2*r1] ; p0 @@ -444,22 +442,19 @@ cglobal deblock_%1_luma_8, 5,5 DEBLOCK_P0_Q0 mova[r4+2*r1], m1 mova[r0], m2 -ADD esp, pad RET ;- ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- INIT_MMX cpuname -cglobal deblock_h_luma_8, 0,5 +cglobal deblock_h_luma_8, 0,6 - HAVE_ALIGNED_STACK, 0, 0x70 movr0, r0mp movr3, r1m lear4, [r3*3] subr0, 4 lear1, [r0+r4] -%assign pad 0x78-(stack_offset15) -SUBesp, pad -%define pix_tmp esp+12 +%define pix_tmp esp+16 ; transpose 6x16 - tmp space TRANSPOSE6x8_MEM PASS8ROWS(r0, r1, r3, r4), pix_tmp @@ -501,7 +496,6 @@ cglobal deblock_h_luma_8, 0,5 movq m3, [pix_tmp+0x48] TRANSPOSE8x4B_STORE PASS8ROWS(r0, r1, r3, r4) -ADDesp, pad RET %endmacro ; DEBLOCK_LUMA @@ -632,7 +626,7 @@ DEBLOCK_LUMA v, 16 %define mpb_0 m14 %define mpb_1 m15 %else -%define spill(x) [esp+16*x+((stack_offset+4)15)] +%define spill(x) [esp+16*x] %define p2 [r4+r1] %define q2 [r0+2*r1] %define t4 spill(0) @@ -647,10 +641,7 @@ DEBLOCK_LUMA v, 16 ;- ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta ) ;- -cglobal deblock_%1_luma_intra_8, 4,6,16 -%if ARCH_X86_64 == 0 -sub esp, 0x60 -%endif +cglobal deblock_%1_luma_intra_8, 4, 6, 16, ARCH_X86_32 * 0x50 lea r4, [r1*4] lea r5, [r1*3] ; 3*stride dec r2d; alpha-1 @@ -699,9 +690,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16 LUMA_INTRA_SWAP_PQ LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5] .end: -%if ARCH_X86_64 == 0 -add esp, 0x60 -%endif RET INIT_MMX cpuname @@ -738,12 +726,10 @@ cglobal deblock_h_luma_intra_8, 4,9 addrsp, 0x88 RET %else -cglobal deblock_h_luma_intra_8, 2,4 +cglobal deblock_h_luma_intra_8, 2, 5 - HAVE_ALIGNED_STACK, 0, 0x80 lear3, [r1*3] subr0, 4 lear2, [r0+r3] -%assign pad 0x8c-(stack_offset15) -SUBrsp, pad %define pix_tmp rsp ; transpose 8x16 - tmp space @@ -774,7 +760,6 @@ cglobal deblock_h_luma_intra_8, 2,4 lear0, [r0+r1*8] lear2, [r2+r1*8] TRANSPOSE8x8_MEM PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), PASS8ROWS(r0, r2, r1, r3) -ADDrsp, pad RET %endif ; ARCH_X86_64 %endmacro ; DEBLOCK_LUMA_INTRA diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index 3f6ded4..8596f26 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2; c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2; -#if HAVE_ALIGNED_STACK c-h264_v_loop_filter_luma = ff_deblock_v_luma_8_sse2; c-h264_h_loop_filter_luma = ff_deblock_h_luma_8_sse2; c-h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_sse2; c-h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_sse2; -#endif /* HAVE_ALIGNED_STACK */ } if (EXTERNAL_SSSE3(mm_flags)) { c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3
Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.
Hi, On Sat, Oct 6, 2012 at 1:28 PM, Anton Khirnov an...@khirnov.net wrote: On Sat, 6 Oct 2012 12:35:44 -0700, Ronald S. Bultje rsbul...@gmail.com wrote: On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov an...@khirnov.net wrote: --- avprobe.c |6 +++--- cmdutils.c| 16 +++- tools/graph2dot.c |4 ++-- 3 files changed, 16 insertions(+), 10 deletions(-) diff --git a/avprobe.c b/avprobe.c index 16a5d29..3a3ae0f 100644 --- a/avprobe.c +++ b/avprobe.c @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, int stream_idx) const char *profile; char val_str[128]; AVRational display_aspect_ratio; +const AVPixFmtDescriptor *desc; probe_object_header(stream); @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, int stream_idx) rational_string(val_str, sizeof(val_str), :, display_aspect_ratio)); } -probe_str(pix_fmt, - dec_ctx-pix_fmt != AV_PIX_FMT_NONE ? - av_pix_fmt_descriptors[dec_ctx-pix_fmt].name : unknown); +desc = av_pix_fmt_desc_get(dec_ctx-pix_fmt); I think this is an awful idea, it should be OK to access the array directly. I don't see any advantage to accessing the array directly. It often leads to smaller code. Instead of AVPixFmtDesc *d = getter(pix_fmt); if (!d) return -1; printf(%d\n, d-something);, you can just do printf(%d\n, array[pix_fmt].something);. Do we really need to babysit our users? It is the only case in all Libav where we export an array like this. We never supported MSVC. So we shouldn't? It fixes the size of AVPixFmtDescriptor for no good reason. Is there a need to extend it? It prevents the caller from accessing all the descriptors when he's using a newer library than the one he compiled against. if (compile_ver runtime_ver) error(); on app init. Likely, you want to do that anyway. And I hear there's some trouble with the evil microsoft thing and data symbols. You have that anyway for any symbol shared between lavu/f/fi/c/r/*. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.
Hi, On Sat, Oct 6, 2012 at 9:19 PM, Anton Khirnov an...@khirnov.net wrote: On Sat, 6 Oct 2012 17:01:57 -0700, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, On Sat, Oct 6, 2012 at 1:28 PM, Anton Khirnov an...@khirnov.net wrote: On Sat, 6 Oct 2012 12:35:44 -0700, Ronald S. Bultje rsbul...@gmail.com wrote: On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov an...@khirnov.net wrote: --- avprobe.c |6 +++--- cmdutils.c| 16 +++- tools/graph2dot.c |4 ++-- 3 files changed, 16 insertions(+), 10 deletions(-) diff --git a/avprobe.c b/avprobe.c index 16a5d29..3a3ae0f 100644 --- a/avprobe.c +++ b/avprobe.c @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, int stream_idx) const char *profile; char val_str[128]; AVRational display_aspect_ratio; +const AVPixFmtDescriptor *desc; probe_object_header(stream); @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, int stream_idx) rational_string(val_str, sizeof(val_str), :, display_aspect_ratio)); } -probe_str(pix_fmt, - dec_ctx-pix_fmt != AV_PIX_FMT_NONE ? - av_pix_fmt_descriptors[dec_ctx-pix_fmt].name : unknown); +desc = av_pix_fmt_desc_get(dec_ctx-pix_fmt); I think this is an awful idea, it should be OK to access the array directly. I don't see any advantage to accessing the array directly. It often leads to smaller code. Instead of AVPixFmtDesc *d = getter(pix_fmt); if (!d) return -1; printf(%d\n, d-something);, you can just do printf(%d\n, array[pix_fmt].something);. Do we really need to babysit our users? Checking for d to be non-NULL is equivalent to checking that pix_fmt is valid. So no, it does not necessarily makes the code longer. It is the only case in all Libav where we export an array like this. We never supported MSVC. So we shouldn't? YES! =p Seriously though, there are only disadvantages to exporting the array directly. It fixes the size of AVPixFmtDescriptor for no good reason. Is there a need to extend it? There might be. You don't want to find out that you need to extend it two months after somebody else just bumped lavu. It prevents the caller from accessing all the descriptors when he's using a newer library than the one he compiled against. if (compile_ver runtime_ver) error(); on app init. Likely, you want to do that anyway. And I hear there's some trouble with the evil microsoft thing and data symbols. You have that anyway for any symbol shared between lavu/f/fi/c/r/*. It should be our long-term goal to remove those. You can't. Stuff like ff_sqrt[], ff_log2[] etc. can't be removed. Never. Sorry. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.
Hi, On Sat, Oct 6, 2012 at 9:47 PM, Anton Khirnov an...@khirnov.net wrote: On Sat, 6 Oct 2012 21:44:03 -0700, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, On Sat, Oct 6, 2012 at 9:19 PM, Anton Khirnov an...@khirnov.net wrote: On Sat, 6 Oct 2012 17:01:57 -0700, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, On Sat, Oct 6, 2012 at 1:28 PM, Anton Khirnov an...@khirnov.net wrote: On Sat, 6 Oct 2012 12:35:44 -0700, Ronald S. Bultje rsbul...@gmail.com wrote: On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov an...@khirnov.net wrote: --- avprobe.c |6 +++--- cmdutils.c| 16 +++- tools/graph2dot.c |4 ++-- 3 files changed, 16 insertions(+), 10 deletions(-) diff --git a/avprobe.c b/avprobe.c index 16a5d29..3a3ae0f 100644 --- a/avprobe.c +++ b/avprobe.c @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, int stream_idx) const char *profile; char val_str[128]; AVRational display_aspect_ratio; +const AVPixFmtDescriptor *desc; probe_object_header(stream); @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, int stream_idx) rational_string(val_str, sizeof(val_str), :, display_aspect_ratio)); } -probe_str(pix_fmt, - dec_ctx-pix_fmt != AV_PIX_FMT_NONE ? - av_pix_fmt_descriptors[dec_ctx-pix_fmt].name : unknown); +desc = av_pix_fmt_desc_get(dec_ctx-pix_fmt); I think this is an awful idea, it should be OK to access the array directly. I don't see any advantage to accessing the array directly. It often leads to smaller code. Instead of AVPixFmtDesc *d = getter(pix_fmt); if (!d) return -1; printf(%d\n, d-something);, you can just do printf(%d\n, array[pix_fmt].something);. Do we really need to babysit our users? Checking for d to be non-NULL is equivalent to checking that pix_fmt is valid. So no, it does not necessarily makes the code longer. It is the only case in all Libav where we export an array like this. We never supported MSVC. So we shouldn't? YES! =p Seriously though, there are only disadvantages to exporting the array directly. It fixes the size of AVPixFmtDescriptor for no good reason. Is there a need to extend it? There might be. You don't want to find out that you need to extend it two months after somebody else just bumped lavu. It prevents the caller from accessing all the descriptors when he's using a newer library than the one he compiled against. if (compile_ver runtime_ver) error(); on app init. Likely, you want to do that anyway. And I hear there's some trouble with the evil microsoft thing and data symbols. You have that anyway for any symbol shared between lavu/f/fi/c/r/*. It should be our long-term goal to remove those. You can't. Stuff like ff_sqrt[], ff_log2[] etc. can't be removed. Never. Sorry. The same way we can never have msvc support? Right. Shall we move on to more practical patches? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/9] pixfmt: add AV_ prefixes to PIX_FMT_*
Hi, On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov an...@khirnov.net wrote: --- doc/APIchanges |8 ++ libavutil/Makefile |1 + libavutil/old_pix_fmts.h | 128 libavutil/pixfmt.h | 298 ++ libavutil/version.h |7 +- 5 files changed, 311 insertions(+), 131 deletions(-) create mode 100644 libavutil/old_pix_fmts.h \o/ OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] Drop broken and unused CABAC test program.
Hi, On Fri, Oct 5, 2012 at 3:50 AM, Diego Biurrun di...@biurrun.de wrote: --- The test program is of doubtful utility, fails to compile on x86_32 and was never part of our test suite. libavcodec/Makefile |3 +- libavcodec/cabac.c | 147 --- 2 files changed, 1 insertions(+), 149 deletions(-) This is probably the 20th time you're submitting this patch. Can you please not do that? Are you just hoping someone isn't paying attention and accidently says yes? Why don't you fix the bug that prevents it from working. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] h264: don't touch H264Context-ref_count[] during MB decoding.
Hi, On Fri, Oct 5, 2012 at 12:25 AM, Diego Biurrun di...@biurrun.de wrote: On Fri, Oct 05, 2012 at 02:48:20AM +0200, Luca Barbato wrote: On 10/05/2012 01:29 AM, Ronald S. Bultje wrote: On Thu, Oct 4, 2012 at 4:27 PM, Ronald S. Bultje rsbul...@gmail.com wrote: Oh right you mean for performance? fate-h264-conformance-cama2_vtc_b fate-h264-conformance-cama3_vtc_b should be ok. I'd push your patch + my style nits soon, I'll build tsan2 soon (yai gentoo live ebuilds) since I'm curious about what's wrong with slice threading I guess most of the issues could be uncovered clearly there. Here's an issue to fix wrt sliced threading: http://fate.libav.org/x86_64-linux-slice-threading/20121005005502 One of the mailinglists recently had an email about our slice threading impl relying on wakeups-only-by-signal, i.e. spurious wakeups could lead to races, can you look at that and see if it fixes it? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
From: Ronald S. Bultje rsbul...@gmail.com Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++- libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 - libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 160 +--- 5 files changed, 175 insertions(+), 92 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index 6abdba2..6137678 100644 --- a/libavcodec/x86/h264_deblock.asm +++ b/libavcodec/x86/h264_deblock.asm @@ -401,14 +401,12 @@ DEBLOCK_LUMA ;- ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- -cglobal deblock_%1_luma_8, 5,5 +cglobal deblock_%1_luma_8, 5, 6 - HAVE_ALIGNED_STACK, 0, 2 * %2 lea r4, [r1*3] dec r2 ; alpha-1 neg r4 dec r3 ; beta-1 add r4, r0 ; pix-3*stride -%assign pad 2*%2+12-(stack_offset15) -SUB esp, pad movam0, [r4+r1] ; p1 movam1, [r4+2*r1] ; p0 @@ -446,22 +444,19 @@ cglobal deblock_%1_luma_8, 5,5 DEBLOCK_P0_Q0 mova[r4+2*r1], m1 mova[r0], m2 -ADD esp, pad RET ;- ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- INIT_MMX cpuname -cglobal deblock_h_luma_8, 0,5 +cglobal deblock_h_luma_8, 0,6 - HAVE_ALIGNED_STACK, 0, 0x70 movr0, r0mp movr3, r1m lear4, [r3*3] subr0, 4 lear1, [r0+r4] -%assign pad 0x78-(stack_offset15) -SUBesp, pad -%define pix_tmp esp+12 +%define pix_tmp esp+16 ; transpose 6x16 - tmp space TRANSPOSE6x8_MEM PASS8ROWS(r0, r1, r3, r4), pix_tmp @@ -503,7 +498,6 @@ cglobal deblock_h_luma_8, 0,5 movq m3, [pix_tmp+0x48] TRANSPOSE8x4B_STORE PASS8ROWS(r0, r1, r3, r4) -ADDesp, pad RET %endmacro ; DEBLOCK_LUMA @@ -636,7 +630,7 @@ DEBLOCK_LUMA v, 16 %define mpb_0 m14 %define mpb_1 m15 %else -%define spill(x) [esp+16*x+((stack_offset+4)15)] +%define spill(x) [esp+16*x] %define p2 [r4+r1] %define q2 [r0+2*r1] %define t4 spill(0) @@ -651,10 +645,7 @@ DEBLOCK_LUMA v, 16 ;- ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta ) ;- -cglobal deblock_%1_luma_intra_8, 4,6,16 -%if ARCH_X86_64 == 0 -sub esp, 0x60 -%endif +cglobal deblock_%1_luma_intra_8, 4, 6, 16, ARCH_X86_32 * 0x50 lea r4, [r1*4] lea r5, [r1*3] ; 3*stride dec r2d; alpha-1 @@ -703,9 +694,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16 LUMA_INTRA_SWAP_PQ LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5] .end: -%if ARCH_X86_64 == 0 -add esp, 0x60 -%endif RET INIT_MMX cpuname @@ -742,12 +730,10 @@ cglobal deblock_h_luma_intra_8, 4,9 addrsp, 0x88 RET %else -cglobal deblock_h_luma_intra_8, 2,4 +cglobal deblock_h_luma_intra_8, 2, 5 - HAVE_ALIGNED_STACK, 0, 0x80 lear3, [r1*3] subr0, 4 lear2, [r0+r3] -%assign pad 0x8c-(stack_offset15) -SUBrsp, pad %define pix_tmp rsp ; transpose 8x16 - tmp space @@ -778,7 +764,6 @@ cglobal deblock_h_luma_intra_8, 2,4 lear0, [r0+r1*8] lear2, [r2+r1*8] TRANSPOSE8x8_MEM PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), PASS8ROWS(r0, r2, r1, r3) -ADDrsp, pad RET %endif ; ARCH_X86_64 %endmacro ; DEBLOCK_LUMA_INTRA diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index bb48867..f84e3e8 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -276,18 +276,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2; c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2; -#if HAVE_ALIGNED_STACK c-h264_v_loop_filter_luma = ff_deblock_v_luma_8_sse2; c-h264_h_loop_filter_luma = ff_deblock_h_luma_8_sse2; c-h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_sse2; c-h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_sse2; -#endif /* HAVE_ALIGNED_STACK */ } if (EXTERNAL_SSSE3(mm_flags)) { c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3
[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
From: Ronald S. Bultje rbul...@chromium.org Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++- libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 +- libavcodec/x86/vp8dsp_init.c| 8 --- libavutil/x86/x86inc.asm| 156 +--- 5 files changed, 173 insertions(+), 90 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index 6abdba2..6137678 100644 --- a/libavcodec/x86/h264_deblock.asm +++ b/libavcodec/x86/h264_deblock.asm @@ -401,14 +401,12 @@ DEBLOCK_LUMA ;- ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- -cglobal deblock_%1_luma_8, 5,5 +cglobal deblock_%1_luma_8, 5, 6 - HAVE_ALIGNED_STACK, 0, 2 * %2 lea r4, [r1*3] dec r2 ; alpha-1 neg r4 dec r3 ; beta-1 add r4, r0 ; pix-3*stride -%assign pad 2*%2+12-(stack_offset15) -SUB esp, pad movam0, [r4+r1] ; p1 movam1, [r4+2*r1] ; p0 @@ -446,22 +444,19 @@ cglobal deblock_%1_luma_8, 5,5 DEBLOCK_P0_Q0 mova[r4+2*r1], m1 mova[r0], m2 -ADD esp, pad RET ;- ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- INIT_MMX cpuname -cglobal deblock_h_luma_8, 0,5 +cglobal deblock_h_luma_8, 0,6 - HAVE_ALIGNED_STACK, 0, 0x70 movr0, r0mp movr3, r1m lear4, [r3*3] subr0, 4 lear1, [r0+r4] -%assign pad 0x78-(stack_offset15) -SUBesp, pad -%define pix_tmp esp+12 +%define pix_tmp esp+16 ; transpose 6x16 - tmp space TRANSPOSE6x8_MEM PASS8ROWS(r0, r1, r3, r4), pix_tmp @@ -503,7 +498,6 @@ cglobal deblock_h_luma_8, 0,5 movq m3, [pix_tmp+0x48] TRANSPOSE8x4B_STORE PASS8ROWS(r0, r1, r3, r4) -ADDesp, pad RET %endmacro ; DEBLOCK_LUMA @@ -636,7 +630,7 @@ DEBLOCK_LUMA v, 16 %define mpb_0 m14 %define mpb_1 m15 %else -%define spill(x) [esp+16*x+((stack_offset+4)15)] +%define spill(x) [esp+16*x] %define p2 [r4+r1] %define q2 [r0+2*r1] %define t4 spill(0) @@ -651,10 +645,7 @@ DEBLOCK_LUMA v, 16 ;- ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta ) ;- -cglobal deblock_%1_luma_intra_8, 4,6,16 -%if ARCH_X86_64 == 0 -sub esp, 0x60 -%endif +cglobal deblock_%1_luma_intra_8, 4, 6, 16, ARCH_X86_32 * 0x50 lea r4, [r1*4] lea r5, [r1*3] ; 3*stride dec r2d; alpha-1 @@ -703,9 +694,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16 LUMA_INTRA_SWAP_PQ LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5] .end: -%if ARCH_X86_64 == 0 -add esp, 0x60 -%endif RET INIT_MMX cpuname @@ -742,12 +730,10 @@ cglobal deblock_h_luma_intra_8, 4,9 addrsp, 0x88 RET %else -cglobal deblock_h_luma_intra_8, 2,4 +cglobal deblock_h_luma_intra_8, 2, 5 - HAVE_ALIGNED_STACK, 0, 0x80 lear3, [r1*3] subr0, 4 lear2, [r0+r3] -%assign pad 0x8c-(stack_offset15) -SUBrsp, pad %define pix_tmp rsp ; transpose 8x16 - tmp space @@ -778,7 +764,6 @@ cglobal deblock_h_luma_intra_8, 2,4 lear0, [r0+r1*8] lear2, [r2+r1*8] TRANSPOSE8x8_MEM PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), PASS8ROWS(r0, r2, r1, r3) -ADDrsp, pad RET %endif ; ARCH_X86_64 %endmacro ; DEBLOCK_LUMA_INTRA diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index bb48867..f84e3e8 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -276,18 +276,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2; c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2; -#if HAVE_ALIGNED_STACK c-h264_v_loop_filter_luma = ff_deblock_v_luma_8_sse2; c-h264_h_loop_filter_luma = ff_deblock_h_luma_8_sse2; c-h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_sse2; c-h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_sse2; -#endif /* HAVE_ALIGNED_STACK */ } if (EXTERNAL_SSSE3(mm_flags)) { c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3
[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
From: Ronald S. Bultje rbul...@chromium.org Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++- libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 - libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 158 +--- 5 files changed, 173 insertions(+), 92 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index 6abdba2..6137678 100644 --- a/libavcodec/x86/h264_deblock.asm +++ b/libavcodec/x86/h264_deblock.asm @@ -401,14 +401,12 @@ DEBLOCK_LUMA ;- ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- -cglobal deblock_%1_luma_8, 5,5 +cglobal deblock_%1_luma_8, 5, 6 - HAVE_ALIGNED_STACK, 0, 2 * %2 lea r4, [r1*3] dec r2 ; alpha-1 neg r4 dec r3 ; beta-1 add r4, r0 ; pix-3*stride -%assign pad 2*%2+12-(stack_offset15) -SUB esp, pad movam0, [r4+r1] ; p1 movam1, [r4+2*r1] ; p0 @@ -446,22 +444,19 @@ cglobal deblock_%1_luma_8, 5,5 DEBLOCK_P0_Q0 mova[r4+2*r1], m1 mova[r0], m2 -ADD esp, pad RET ;- ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- INIT_MMX cpuname -cglobal deblock_h_luma_8, 0,5 +cglobal deblock_h_luma_8, 0,6 - HAVE_ALIGNED_STACK, 0, 0x70 movr0, r0mp movr3, r1m lear4, [r3*3] subr0, 4 lear1, [r0+r4] -%assign pad 0x78-(stack_offset15) -SUBesp, pad -%define pix_tmp esp+12 +%define pix_tmp esp+16 ; transpose 6x16 - tmp space TRANSPOSE6x8_MEM PASS8ROWS(r0, r1, r3, r4), pix_tmp @@ -503,7 +498,6 @@ cglobal deblock_h_luma_8, 0,5 movq m3, [pix_tmp+0x48] TRANSPOSE8x4B_STORE PASS8ROWS(r0, r1, r3, r4) -ADDesp, pad RET %endmacro ; DEBLOCK_LUMA @@ -636,7 +630,7 @@ DEBLOCK_LUMA v, 16 %define mpb_0 m14 %define mpb_1 m15 %else -%define spill(x) [esp+16*x+((stack_offset+4)15)] +%define spill(x) [esp+16*x] %define p2 [r4+r1] %define q2 [r0+2*r1] %define t4 spill(0) @@ -651,10 +645,7 @@ DEBLOCK_LUMA v, 16 ;- ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta ) ;- -cglobal deblock_%1_luma_intra_8, 4,6,16 -%if ARCH_X86_64 == 0 -sub esp, 0x60 -%endif +cglobal deblock_%1_luma_intra_8, 4, 6, 16, ARCH_X86_32 * 0x50 lea r4, [r1*4] lea r5, [r1*3] ; 3*stride dec r2d; alpha-1 @@ -703,9 +694,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16 LUMA_INTRA_SWAP_PQ LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5] .end: -%if ARCH_X86_64 == 0 -add esp, 0x60 -%endif RET INIT_MMX cpuname @@ -742,12 +730,10 @@ cglobal deblock_h_luma_intra_8, 4,9 addrsp, 0x88 RET %else -cglobal deblock_h_luma_intra_8, 2,4 +cglobal deblock_h_luma_intra_8, 2, 5 - HAVE_ALIGNED_STACK, 0, 0x80 lear3, [r1*3] subr0, 4 lear2, [r0+r3] -%assign pad 0x8c-(stack_offset15) -SUBrsp, pad %define pix_tmp rsp ; transpose 8x16 - tmp space @@ -778,7 +764,6 @@ cglobal deblock_h_luma_intra_8, 2,4 lear0, [r0+r1*8] lear2, [r2+r1*8] TRANSPOSE8x8_MEM PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), PASS8ROWS(r0, r2, r1, r3) -ADDrsp, pad RET %endif ; ARCH_X86_64 %endmacro ; DEBLOCK_LUMA_INTRA diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index bb48867..f84e3e8 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -276,18 +276,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2; c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2; -#if HAVE_ALIGNED_STACK c-h264_v_loop_filter_luma = ff_deblock_v_luma_8_sse2; c-h264_h_loop_filter_luma = ff_deblock_h_luma_8_sse2; c-h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_sse2; c-h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_sse2; -#endif /* HAVE_ALIGNED_STACK */ } if (EXTERNAL_SSSE3(mm_flags)) { c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Fri, Oct 5, 2012 at 1:46 PM, Diego Biurrun di...@biurrun.de wrote: On Fri, Oct 05, 2012 at 01:38:44PM -0700, Ronald S. Bultje wrote: From: Ronald S. Bultje rsbul...@gmail.com Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++- libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 - libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 160 +--- 5 files changed, 175 insertions(+), 92 deletions(-) This has tabs in many places - please fix your editor. Fixed. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] h264: don't touch H264Context-ref_count[] during MB decoding.
Hi, On Thu, Oct 4, 2012 at 12:59 PM, Luca Barbato lu_z...@gentoo.org wrote: On 10/04/2012 01:46 AM, Luca Barbato wrote: From: Ronald S. Bultje rsbul...@gmail.com The variable is copied to subsequent threads at the same time, so this may cause wrong ref_count[] values to be copied to subsequent threads. This bug was found using TSAN. Signed-off-by: Luca Barbato lu_z...@gentoo.org --- Few cosmetic nits. The patch looks fine, do you have a vector to try it? just fate-h264 will trigger it sufficiently. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] h264: don't touch H264Context-ref_count[] during MB decoding.
On Thu, Oct 4, 2012 at 4:27 PM, Ronald S. Bultje rsbul...@gmail.com wrote: Hi, On Thu, Oct 4, 2012 at 12:59 PM, Luca Barbato lu_z...@gentoo.org wrote: On 10/04/2012 01:46 AM, Luca Barbato wrote: From: Ronald S. Bultje rsbul...@gmail.com The variable is copied to subsequent threads at the same time, so this may cause wrong ref_count[] values to be copied to subsequent threads. This bug was found using TSAN. Signed-off-by: Luca Barbato lu_z...@gentoo.org --- Few cosmetic nits. The patch looks fine, do you have a vector to try it? just fate-h264 will trigger it sufficiently. Oh right you mean for performance? fate-h264-conformance-cama2_vtc_b fate-h264-conformance-cama3_vtc_b should be ok. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] Shared library support for MSVC build.
Hi, On Wed, Oct 3, 2012 at 4:02 AM, Diego Biurrun di...@biurrun.de wrote: On Tue, Oct 02, 2012 at 05:48:26PM -0700, Ronald S. Bultje wrote: 2) a lot of ff_ symbols, and dsputil_init, are accessed from outside the library in which they exist. These are bugs and should be fixed (i.e. renamed). You seem to be working on the wrong tree. This is Chrome's tree. --- a/configure +++ b/configure @@ -2616,7 +2616,7 @@ probe_cc(){ _flags_filter=msvc_flags -_ld_lib='lib%.a' +_ld_lib='%.lib' _ld_path='-libpath:' Do static libs still work as expected? Will test. @@ -3078,11 +3078,13 @@ case $target_os in shlibdir_default=$bindir_default SLIBPREF= SLIBSUF=.dll + LIBPREF= + LIBSUF=.lib Tabs; fix your editor. Yup, fixed. How does this affect normal gcc MinGW builds? Haven't tested yet. It probably breaks it right now, I'll unbreak that later. --- a/libavcodec/golomb.h +++ b/libavcodec/golomb.h @@ -33,13 +33,14 @@ -extern const uint8_t ff_golomb_vlc_len[512]; -extern const uint8_t ff_ue_golomb_vlc_code[512]; -extern const int8_t ff_se_golomb_vlc_code[512]; -extern const uint8_t ff_ue_golomb_len[256]; +extern AVCODEC_SYMBOL const uint8_t ff_golomb_vlc_len[512]; +extern AVCODEC_SYMBOL const uint8_t ff_ue_golomb_vlc_code[512]; +extern AVCODEC_SYMBOL const int8_t ff_se_golomb_vlc_code[512]; +extern AVCODEC_SYMBOL const uint8_t ff_ue_golomb_len[256]; I somehow doubt marking ff_-prefixed symbols is necessary. --- a/libavcodec/raw.h +++ b/libavcodec/raw.h @@ -28,13 +28,14 @@ -extern const PixelFormatTag ff_raw_pix_fmt_tags[]; +extern AVCODEC_SYMBOL const PixelFormatTag ff_raw_pix_fmt_tags[]; same So, yes, but, these are used between libraries. E.g. ff_raw_pix_fmt_tags is used in libavformat: dhcp-172-31-81-183:ffmpeg rbultje$ grep ff_raw_pix_fmt_tags libavformat/*.c libavformat/utils.c: if(ff_find_pix_fmt(ff_raw_pix_fmt_tags, tag) == st-codec-pix_fmt) and the golomb symbols are used by the golomb fate test. Maybe we can link in a bunch of other .c files from libavcodec instead of linking to avcodec.dll instead (I agree that it's ugly). Note that gcc/mingw shared lib build (see symbol list in libavcodec.v) also exports those symbols. +#ifndef AVCODEC_SYMBOLS_H +#define AVCODEC_SYMBOLS_H + +#include libavutil/avconfig.h + +#if AV_HAVE_SHARED_LIBS defined(_MSC_VER) !defined(COMPILING_avcodec) What's AV_HAVE_SHARED_LIBS? I'm not aware of it anywhere. What do you need avconfig.h for? These AVCODEC_SYMBOL things will be used in public headers, thus the definition needs t exist in a public header. Since I need to use __declspec(dllimport) only for .dll (shared) builds, not for .lib (static) builds, I need to know in a public header whether I'm a shared lib or not - hence AV_HAVE_SHARED_LIBS (I added that to configure, based on CONFIG_SHARED). +#ifndef AVUTIL_SYMBOLS_H +#define AVUTIL_SYMBOLS_H + +#include libavutil/avconfig.h No need for the libavutil/ prefix, it's the same directory. It's generated, i.e. it lives in the build dir, not in the source dir, so if you don't have the libavutil prefix, it won't build, since the header isn't in the same dir (sourcedir vs. builddir) as the file that included it. See other files including specifically avconfig.h, they all use the libavutil/ prefix. --- a/library.mak +++ b/library.mak @@ -94,7 +95,7 @@ endef -$(EXAMPLES) $(TESTPROGS) $(TOOLS): $(THIS_LIB) $(DEP_LIBS) +#$(EXAMPLES) $(TESTPROGS) $(TOOLS): $(THIS_LIB) $(DEP_LIBS) $(TESTPROGS): $(SUBDIR)$(LIBNAME) stray change No, that's fully intentional, although it should be removed. So, what this does, is that it adds DEP_LIBS (plus unrelated stuff) as a dependency to TESTPROGS, i.e. *.dll becomes a dep for e.g. dct-test.exe. If you look at the link rule for TESTPROGS, it uses $$ to link in all listed dependencies. Now note that you shouldn't link against .dll files on Windows, you generate .lib files that contain the linker information to link against .dll files (but .lib without .dll is a static build, sorry if this is confusing). The .lib files are already part of FFEXTRALIBS. So what you end up getting is a linker that tries to link against .dll files directly (link -libpath:libavcodec -out:dct-test.exe dct-test.o [other object files] libavcodec/libavcodec.dll [other dlls] libavcodec.lib [other libs] [external lib dependencies]), which doesn't make any sense, and obviously fails. This change removes the dependency on the .dll as a listed dep, and thus instead it just links like link -libpath:libavcodec -out:dct-test.exe dct-test.o [other object files] libavcodec.lib [other libs] [external lib dependencies], which works as expected. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] Shared library support for MSVC build.
Hi, On Wed, Oct 3, 2012 at 7:44 AM, Måns Rullgård m...@mansr.com wrote: Ronald S. Bultje rsbul...@gmail.com writes: These AVCODEC_SYMBOL things will be used in public headers, thus the definition needs t exist in a public header. Since I need to use __declspec(dllimport) only for .dll (shared) builds, not for .lib (static) builds, I need to know in a public header whether I'm a shared lib or not - hence AV_HAVE_SHARED_LIBS (I added that to configure, based on CONFIG_SHARED). That reasoning is flawed. What if you have both static and shared? What is CONFIG_SHARED then? Schroedinger's cat? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] Shared library support for MSVC build.
Hi, On Wed, Oct 3, 2012 at 8:27 AM, Diego Biurrun di...@biurrun.de wrote: On Wed, Oct 03, 2012 at 07:37:29AM -0700, Ronald S. Bultje wrote: --- a/libavcodec/golomb.h +++ b/libavcodec/golomb.h @@ -33,13 +33,14 @@ -extern const uint8_t ff_golomb_vlc_len[512]; -extern const uint8_t ff_ue_golomb_vlc_code[512]; -extern const int8_t ff_se_golomb_vlc_code[512]; -extern const uint8_t ff_ue_golomb_len[256]; +extern AVCODEC_SYMBOL const uint8_t ff_golomb_vlc_len[512]; +extern AVCODEC_SYMBOL const uint8_t ff_ue_golomb_vlc_code[512]; +extern AVCODEC_SYMBOL const int8_t ff_se_golomb_vlc_code[512]; +extern AVCODEC_SYMBOL const uint8_t ff_ue_golomb_len[256]; I somehow doubt marking ff_-prefixed symbols is necessary. [..] and the golomb symbols are used by the golomb fate test. Maybe we can link in a bunch of other .c files from libavcodec instead of linking to avcodec.dll instead (I agree that it's ugly). Note that gcc/mingw shared lib build (see symbol list in libavcodec.v) also exports those symbols. No: biurrun@passion:/opt/biurrun/libav $ git grep ff_.e_golomb_ libavcodec/golomb.c:const uint8_t ff_ue_golomb_vlc_code[512]={ libavcodec/golomb.c:const int8_t ff_se_golomb_vlc_code[512]={ libavcodec/golomb.c:const uint8_t ff_ue_golomb_len[256]={ libavcodec/golomb.h:extern const uint8_t ff_ue_golomb_vlc_code[512]; libavcodec/golomb.h:extern const int8_t ff_se_golomb_vlc_code[512]; libavcodec/golomb.h:extern const uint8_t ff_ue_golomb_len[256]; libavcodec/golomb.h:return ff_ue_golomb_vlc_code[buf]; libavcodec/golomb.h:return ff_ue_golomb_vlc_code[buf]; libavcodec/golomb.h:return ff_se_golomb_vlc_code[buf]; libavcodec/golomb.h:put_bits(pb, ff_ue_golomb_len[i], i+1); biurrun@passion:/opt/biurrun/libav $ git grep ff_golomb_vlc libavcodec/golomb.c:const uint8_t ff_golomb_vlc_len[512]={ libavcodec/golomb.h:extern const uint8_t ff_golomb_vlc_len[512]; libavcodec/golomb.h:LAST_SKIP_BITS(re, gb, ff_golomb_vlc_len[buf]); libavcodec/golomb.h:LAST_SKIP_BITS(re, gb, ff_golomb_vlc_len[buf]); libavcodec/golomb.h:LAST_SKIP_BITS(re, gb, ff_golomb_vlc_len[buf]); dhcp-172-31-81-183:libav rbultje$ grep golomb libavcodec/golomb-test.c #include golomb.h set_ue_golomb(pb, i); j = get_ue_golomb(gb); [..] set_ue_golomb(pb, EXTEND(i)); j = get_ue_golomb_long(gb); [..] set_se_golomb(pb, i - COUNT / 2); in golomb.h: static inline int get_ue_golomb(GetBitContext *gb){ unsigned int buf; int log; OPEN_READER(re, gb); UPDATE_CACHE(re, gb); buf=GET_CACHE(re, gb); if(buf = (127)){ buf = 32 - 9; LAST_SKIP_BITS(re, gb, ff_golomb_vlc_len[buf]); CLOSE_READER(re, gb); return ff_ue_golomb_vlc_code[buf]; }else{ [..] Maybe Libav fixed the raw pix fmt stuff, but the point is that we unfortunately do use internal test symbols in various places. +#ifndef AVCODEC_SYMBOLS_H +#define AVCODEC_SYMBOLS_H + +#include libavutil/avconfig.h + +#if AV_HAVE_SHARED_LIBS defined(_MSC_VER) !defined(COMPILING_avcodec) What's AV_HAVE_SHARED_LIBS? I'm not aware of it anywhere. What do you need avconfig.h for? These AVCODEC_SYMBOL things will be used in public headers, thus the definition needs t exist in a public header. Since I need to use __declspec(dllimport) only for .dll (shared) builds, not for .lib (static) builds, I need to know in a public header whether I'm a shared lib or not - hence AV_HAVE_SHARED_LIBS (I added that to configure, based on CONFIG_SHARED). The configure change you speak about is missing then; nothing adds that define to avconfig.h ... You're right, so basically I added shared_libs to HAVE_LIST_PUB and set its value to $shared just before printing it into avconfig.h, so it always has the same value as CONFIG_SHARED. Sorry for the confusion. So, what this does, is that it adds DEP_LIBS (plus unrelated stuff) as a dependency to TESTPROGS, i.e. *.dll becomes a dep for e.g. dct-test.exe. If you look at the link rule for TESTPROGS, it uses $$ to link in all listed dependencies. No, that rule uses $$^, which has different semantics. Now note that you shouldn't link against .dll files on Windows, you generate .lib files that contain the linker information to link against .dll files (but .lib without .dll is a static build, sorry if this is confusing). The .lib files are already part of FFEXTRALIBS. So what you end up getting is a linker that tries to link against .dll files directly (link -libpath:libavcodec -out:dct-test.exe dct-test.o [other object files] libavcodec/libavcodec.dll [other dlls] libavcodec.lib [other libs] [external lib dependencies]), which doesn't make any sense, and obviously fails. This change removes the dependency on the .dll as a listed dep, and thus instead it just links like link -libpath:libavcodec -out:dct-test.exe dct-test.o [other object files] libavcodec.lib [other libs
Re: [libav-devel] [PATCH] Shared library support for MSVC build.
Hi, On Wed, Oct 3, 2012 at 9:07 AM, Diego Biurrun di...@biurrun.de wrote: On Wed, Oct 03, 2012 at 08:49:30AM -0700, Ronald S. Bultje wrote: On Wed, Oct 3, 2012 at 8:43 AM, Diego Biurrun di...@biurrun.de wrote: On Wed, Oct 03, 2012 at 08:31:43AM -0700, Ronald S. Bultje wrote: On Wed, Oct 3, 2012 at 7:44 AM, Måns Rullgård m...@mansr.com wrote: Ronald S. Bultje rsbul...@gmail.com writes: These AVCODEC_SYMBOL things will be used in public headers, thus the definition needs t exist in a public header. Since I need to use __declspec(dllimport) only for .dll (shared) builds, not for .lib (static) builds, I need to know in a public header whether I'm a shared lib or not - hence AV_HAVE_SHARED_LIBS (I added that to configure, based on CONFIG_SHARED). That reasoning is flawed. What if you have both static and shared? What is CONFIG_SHARED then? Schroedinger's cat? An indication that shared libraries shall be built. It is, however, independent of CONFIG_STATIC; hence you cannot use it as an indication that only shared libraries are built. It's not clear to me if you need an indication whether shared libs are available or whether *only* shared libs are available. Like Derek said, you can't do both on MSVC. The object files have different semantics (see __declspec(dllimport)) and are not binary compatible (most notably, a __declspec(dllimport) int variable; isn't actually an integer; it's a pointer to the integer in the dll where it actually resides). Does configure fail if you enable both at the same time? I don't think so; that should be addressed. I can certainly add that. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] Shared library support for MSVC build.
From: Ronald S. Bultje rsbul...@gmail.com This patch adds support for .dll shared library generation using MSVC's native build tools. To properly load non-function symbols from DLL files, we prefix them with __declspec(dllimport) on MSVC for shared library builds when accessed from outside the .dll. For optimal performance, we should do this for functions also, but I'm too lazy to do that right now and it works without. Caveats: 1) you need a new version of c99-to-c89, which will generate the .def files which contains the symbols to be exported. The interface for this isn't stable yet, I'd like to be able to specify symbols on the commandline with wildcard support, e.g. c99wrap.exe -symbols 'avpriv_*' link.exe -dll ..., but for now the hardcoded list does its job. Martin suggested I use the existing .v files for this. 2) a lot of ff_ symbols, and dsputil_init, are accessed from outside the library in which they exist. These are bugs and should be fixed (i.e. renamed). --- configure | 14 ++ libavcodec/ac3tab.h |3 ++- libavcodec/dca.h|3 ++- libavcodec/dnxhddata.h |3 ++- libavcodec/golomb.h |9 + libavcodec/mjpeg.h | 16 libavcodec/mpeg4audio.h |3 ++- libavcodec/mpegaudiodata.h |6 -- libavcodec/raw.h|3 ++- libavcodec/snowenc.c|4 ++-- libavcodec/symbols.h| 32 libavformat/gxf.c |3 ++- libavutil/aes.h |4 +++- libavutil/common.h |5 +++-- libavutil/intmath.h |3 ++- libavutil/md5.h |4 +++- libavutil/pca.c | 18 ++ libavutil/pixdesc.h |3 ++- libavutil/sha.h |4 +++- libavutil/symbols.h | 32 libavutil/tree.h|4 +++- library.mak |3 ++- libswresample/swresample-test.c |2 +- tools/ffeval.c |5 + 24 files changed, 142 insertions(+), 44 deletions(-) create mode 100644 libavcodec/symbols.h create mode 100644 libavutil/symbols.h diff --git a/configure b/configure index b7a8b77..3748ffb 100755 --- a/configure +++ b/configure @@ -1247,6 +1247,7 @@ HAVE_LIST_PUB=' bigendian fast_unaligned incompatible_fork_abi +shared_libs ' HAVE_LIST= @@ -2616,7 +2617,7 @@ probe_cc(){ _cc_o='-Fo $@' _cc_e='-P -Fi $@' _flags_filter=msvc_flags -_ld_lib='lib%.a' +_ld_lib='%.lib' _ld_path='-libpath:' _flags='-nologo' _cflags='-D_USE_MATH_DEFINES -Dinline=__inline -FIstdlib.h -Dstrtoll=_strtoi64' @@ -3078,11 +3079,13 @@ case $target_os in shlibdir_default=$bindir_default SLIBPREF= SLIBSUF=.dll +LIBPREF= +LIBSUF=.lib SLIBNAME_WITH_VERSION='$(SLIBPREF)$(FULLNAME)-$(LIBVERSION)$(SLIBSUF)' SLIBNAME_WITH_MAJOR='$(SLIBPREF)$(FULLNAME)-$(LIBMAJOR)$(SLIBSUF)' dlltool=${cross_prefix}dlltool if check_cmd lib.exe -list; then -SLIB_EXTRA_CMD='-lib.exe /machine:$(LIBTARGET) /def:$$(@:$(SLIBSUF)=.def) /out:$(SUBDIR)$(SLIBNAME:$(SLIBSUF)=.lib)' +SLIB_EXTRA_CMD='-lib.exe -machine:$(LIBTARGET) -def:$$(@:$(SLIBSUF)=.def) -out:$(SUBDIR)$(SLIBNAME:$(SLIBSUF)=.lib)' if enabled x86_64; then LIBTARGET=x64 fi @@ -3092,8 +3095,8 @@ case $target_os in SLIB_INSTALL_NAME='$(SLIBNAME_WITH_MAJOR)' SLIB_INSTALL_LINKS= SLIB_INSTALL_EXTRA_SHLIB='$(SLIBNAME:$(SLIBSUF)=.lib)' -SLIB_INSTALL_EXTRA_LIB='lib$(SLIBNAME:$(SLIBSUF)=.dll.a) $(SLIBNAME_WITH_MAJOR:$(SLIBSUF)=.def)' -SHFLAGS='-shared -Wl,--output-def,$$(@:$(SLIBSUF)=.def) -Wl,--out-implib,$(SUBDIR)lib$(SLIBNAME:$(SLIBSUF)=.dll.a) -Wl,--enable-runtime-pseudo-reloc -Wl,--enable-auto-image-base' +SLIB_INSTALL_EXTRA_LIB='lib$(SLIBNAME:$(SLIBSUF)=.dll) $(SLIBNAME_WITH_MAJOR:$(SLIBSUF)=.def)' +SHFLAGS='-dll -def:$$(@:$(SLIBSUF)=.def) -implib:$(SUBDIR)lib$(SLIBNAME:$(SLIBSUF)=.dll.a)' objformat=win32 ranlib=: enable dos_paths @@ -3912,6 +3915,8 @@ elif enabled pathscale; then add_cflags -fstrict-overflow -OPT:wrap_around_unsafe_opt=OFF elif enabled msvc; then enabled x86_32 disable aligned_stack +enabled_all shared static +die Cannot build shared and static at the same time with MSVC fi enabled_any $THREADS_LIST enable threads @@ -4306,6 +4311,7 @@ test $postproc_version != current cat $TMPH EOF #define LIBPOSTPROC_VERSION_MICRO $LIBPOSTPROC_VERSION_MICRO EOF +shared_libs=$shared print_config AV_HAVE_ $TMPH $HAVE_LIST_PUB echo #endif /* AVUTIL_AVCONFIG_H */ $TMPH diff --git a/libavcodec/ac3tab.h b/libavcodec/ac3tab.h
[libav-devel] [PATCH] h264: don't touch H264Context-ref_count[] during MB decoding.
From: Ronald S. Bultje rsbul...@gmail.com The variable is copied to subsequent threads at the same time, so this may cause wrong ref_count[] values to be copied to subsequent threads. --- libavcodec/h264_cabac.c | 41 - libavcodec/h264_cavlc.c | 33 + 2 files changed, 29 insertions(+), 45 deletions(-) diff --git a/libavcodec/h264_cabac.c b/libavcodec/h264_cabac.c index 7e8947b..da3a695 100644 --- a/libavcodec/h264_cabac.c +++ b/libavcodec/h264_cabac.c @@ -2003,11 +2003,6 @@ decode_intra_mb: return 0; } -if(MB_MBAFF){ -h-ref_count[0] = 1; -h-ref_count[1] = 1; -} - fill_decode_caches(h, mb_type); if( IS_INTRA( mb_type ) ) { @@ -2076,10 +2071,11 @@ decode_intra_mb: for( i = 0; i 4; i++ ) { if(IS_DIRECT(h-sub_mb_type[i])) continue; if(IS_DIR(h-sub_mb_type[i], 0, list)){ -if( h-ref_count[list] 1 ){ +int rc = h-ref_count[list] MB_MBAFF; +if (rc 1) { ref[list][i] = decode_cabac_mb_ref( h, list, 4*i ); -if(ref[list][i] = (unsigned)h-ref_count[list]){ -av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref[list][i], h-ref_count[list]); +if (ref[list][i] = (unsigned) rc) { +av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref[list][i], rc); return -1; } }else @@ -2161,11 +2157,11 @@ decode_intra_mb: if(IS_16X16(mb_type)){ for(list=0; listh-list_count; list++){ if(IS_DIR(mb_type, 0, list)){ -int ref; -if(h-ref_count[list] 1){ +int ref, rc = h-ref_count[list] MB_MBAFF; +if (rc 1){ ref= decode_cabac_mb_ref(h, list, 0); -if(ref = (unsigned)h-ref_count[list]){ -av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, h-ref_count[list]); +if (ref = (unsigned) rc) { +av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, rc); return -1; } }else @@ -2189,11 +2185,11 @@ decode_intra_mb: for(list=0; listh-list_count; list++){ for(i=0; i2; i++){ if(IS_DIR(mb_type, i, list)){ -int ref; -if(h-ref_count[list] 1){ +int ref, rc = h-ref_count[list] MB_MBAFF; +if (rc 1) { ref= decode_cabac_mb_ref( h, list, 8*i ); -if(ref = (unsigned)h-ref_count[list]){ -av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, h-ref_count[list]); +if (ref = rc) { +av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, rc); return -1; } }else @@ -2224,11 +2220,11 @@ decode_intra_mb: for(list=0; listh-list_count; list++){ for(i=0; i2; i++){ if(IS_DIR(mb_type, i, list)){ //FIXME optimize -int ref; -if(h-ref_count[list] 1){ +int ref, rc = h-ref_count[list] MB_MBAFF; +if (rc 1) { ref= decode_cabac_mb_ref( h, list, 4*i ); -if(ref = (unsigned)h-ref_count[list]){ -av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, h-ref_count[list]); +if (ref = (unsigned) rc) { +av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, rc); return -1; } }else @@ -2401,10 +2397,5 @@ decode_intra_mb: s-current_picture.f.qscale_table[mb_xy] = s-qscale; write_back_non_zero_count(h); -if(MB_MBAFF){ -h-ref_count[0] = 1; -h-ref_count[1] = 1; -} - return 0; } diff --git a/libavcodec/h264_cavlc.c b/libavcodec/h264_cavlc.c index 06e1ba7..cf3ce8f 100644 --- a/libavcodec/h264_cavlc.c +++ b/libavcodec/h264_cavlc.c @@ -784,11 +784,6 @@ decode_intra_mb: return 0; } -if(MB_MBAFF){ -h-ref_count[0] = 1; -h-ref_count[1] = 1; -} - fill_decode_neighbors(h, mb_type); fill_decode_caches(h, mb_type); @@ -868,7
[libav-devel] [PATCH] h264: don't touch H264Context-ref_count[] during MB decoding.
From: Ronald S. Bultje rsbul...@gmail.com The variable is copied to subsequent threads at the same time, so this may cause wrong ref_count[] values to be copied to subsequent threads. --- libavcodec/h264_cabac.c | 41 - libavcodec/h264_cavlc.c | 33 + 2 files changed, 29 insertions(+), 45 deletions(-) diff --git a/libavcodec/h264_cabac.c b/libavcodec/h264_cabac.c index 7e8947b..97e8128 100644 --- a/libavcodec/h264_cabac.c +++ b/libavcodec/h264_cabac.c @@ -2003,11 +2003,6 @@ decode_intra_mb: return 0; } -if(MB_MBAFF){ -h-ref_count[0] = 1; -h-ref_count[1] = 1; -} - fill_decode_caches(h, mb_type); if( IS_INTRA( mb_type ) ) { @@ -2076,10 +2071,11 @@ decode_intra_mb: for( i = 0; i 4; i++ ) { if(IS_DIRECT(h-sub_mb_type[i])) continue; if(IS_DIR(h-sub_mb_type[i], 0, list)){ -if( h-ref_count[list] 1 ){ +int rc = h-ref_count[list] MB_MBAFF; +if (rc 1) { ref[list][i] = decode_cabac_mb_ref( h, list, 4*i ); -if(ref[list][i] = (unsigned)h-ref_count[list]){ -av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref[list][i], h-ref_count[list]); +if (ref[list][i] = (unsigned) rc) { +av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref[list][i], rc); return -1; } }else @@ -2161,11 +2157,11 @@ decode_intra_mb: if(IS_16X16(mb_type)){ for(list=0; listh-list_count; list++){ if(IS_DIR(mb_type, 0, list)){ -int ref; -if(h-ref_count[list] 1){ +int ref, rc = h-ref_count[list] MB_MBAFF; +if (rc 1){ ref= decode_cabac_mb_ref(h, list, 0); -if(ref = (unsigned)h-ref_count[list]){ -av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, h-ref_count[list]); +if (ref = (unsigned) rc) { +av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, rc); return -1; } }else @@ -2189,11 +2185,11 @@ decode_intra_mb: for(list=0; listh-list_count; list++){ for(i=0; i2; i++){ if(IS_DIR(mb_type, i, list)){ -int ref; -if(h-ref_count[list] 1){ +int ref, rc = h-ref_count[list] MB_MBAFF; +if (rc 1) { ref= decode_cabac_mb_ref( h, list, 8*i ); -if(ref = (unsigned)h-ref_count[list]){ -av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, h-ref_count[list]); +if (ref = (unsigned) rc) { +av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, rc); return -1; } }else @@ -2224,11 +2220,11 @@ decode_intra_mb: for(list=0; listh-list_count; list++){ for(i=0; i2; i++){ if(IS_DIR(mb_type, i, list)){ //FIXME optimize -int ref; -if(h-ref_count[list] 1){ +int ref, rc = h-ref_count[list] MB_MBAFF; +if (rc 1) { ref= decode_cabac_mb_ref( h, list, 4*i ); -if(ref = (unsigned)h-ref_count[list]){ -av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, h-ref_count[list]); +if (ref = (unsigned) rc) { +av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, rc); return -1; } }else @@ -2401,10 +2397,5 @@ decode_intra_mb: s-current_picture.f.qscale_table[mb_xy] = s-qscale; write_back_non_zero_count(h); -if(MB_MBAFF){ -h-ref_count[0] = 1; -h-ref_count[1] = 1; -} - return 0; } diff --git a/libavcodec/h264_cavlc.c b/libavcodec/h264_cavlc.c index 06e1ba7..cf3ce8f 100644 --- a/libavcodec/h264_cavlc.c +++ b/libavcodec/h264_cavlc.c @@ -784,11 +784,6 @@ decode_intra_mb: return 0; } -if(MB_MBAFF){ -h-ref_count[0] = 1; -h-ref_count[1] = 1; -} - fill_decode_neighbors(h, mb_type); fill_decode_caches(h, mb_type
[libav-devel] [PATCH] h264: don't touch H264Context-ref_count[] during MB decoding.
From: Ronald S. Bultje rsbul...@gmail.com The variable is copied to subsequent threads at the same time, so this may cause wrong ref_count[] values to be copied to subsequent threads. This bug was found using TSAN. --- libavcodec/h264_cabac.c | 41 - libavcodec/h264_cavlc.c | 33 + 2 files changed, 29 insertions(+), 45 deletions(-) diff --git a/libavcodec/h264_cabac.c b/libavcodec/h264_cabac.c index 7e8947b..97e8128 100644 --- a/libavcodec/h264_cabac.c +++ b/libavcodec/h264_cabac.c @@ -2003,11 +2003,6 @@ decode_intra_mb: return 0; } -if(MB_MBAFF){ -h-ref_count[0] = 1; -h-ref_count[1] = 1; -} - fill_decode_caches(h, mb_type); if( IS_INTRA( mb_type ) ) { @@ -2076,10 +2071,11 @@ decode_intra_mb: for( i = 0; i 4; i++ ) { if(IS_DIRECT(h-sub_mb_type[i])) continue; if(IS_DIR(h-sub_mb_type[i], 0, list)){ -if( h-ref_count[list] 1 ){ +int rc = h-ref_count[list] MB_MBAFF; +if (rc 1) { ref[list][i] = decode_cabac_mb_ref( h, list, 4*i ); -if(ref[list][i] = (unsigned)h-ref_count[list]){ -av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref[list][i], h-ref_count[list]); +if (ref[list][i] = (unsigned) rc) { +av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref[list][i], rc); return -1; } }else @@ -2161,11 +2157,11 @@ decode_intra_mb: if(IS_16X16(mb_type)){ for(list=0; listh-list_count; list++){ if(IS_DIR(mb_type, 0, list)){ -int ref; -if(h-ref_count[list] 1){ +int ref, rc = h-ref_count[list] MB_MBAFF; +if (rc 1){ ref= decode_cabac_mb_ref(h, list, 0); -if(ref = (unsigned)h-ref_count[list]){ -av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, h-ref_count[list]); +if (ref = (unsigned) rc) { +av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, rc); return -1; } }else @@ -2189,11 +2185,11 @@ decode_intra_mb: for(list=0; listh-list_count; list++){ for(i=0; i2; i++){ if(IS_DIR(mb_type, i, list)){ -int ref; -if(h-ref_count[list] 1){ +int ref, rc = h-ref_count[list] MB_MBAFF; +if (rc 1) { ref= decode_cabac_mb_ref( h, list, 8*i ); -if(ref = (unsigned)h-ref_count[list]){ -av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, h-ref_count[list]); +if (ref = (unsigned) rc) { +av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, rc); return -1; } }else @@ -2224,11 +2220,11 @@ decode_intra_mb: for(list=0; listh-list_count; list++){ for(i=0; i2; i++){ if(IS_DIR(mb_type, i, list)){ //FIXME optimize -int ref; -if(h-ref_count[list] 1){ +int ref, rc = h-ref_count[list] MB_MBAFF; +if (rc 1) { ref= decode_cabac_mb_ref( h, list, 4*i ); -if(ref = (unsigned)h-ref_count[list]){ -av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, h-ref_count[list]); +if (ref = (unsigned) rc) { +av_log(s-avctx, AV_LOG_ERROR, Reference %d = %d\n, ref, rc); return -1; } }else @@ -2401,10 +2397,5 @@ decode_intra_mb: s-current_picture.f.qscale_table[mb_xy] = s-qscale; write_back_non_zero_count(h); -if(MB_MBAFF){ -h-ref_count[0] = 1; -h-ref_count[1] = 1; -} - return 0; } diff --git a/libavcodec/h264_cavlc.c b/libavcodec/h264_cavlc.c index 06e1ba7..cf3ce8f 100644 --- a/libavcodec/h264_cavlc.c +++ b/libavcodec/h264_cavlc.c @@ -784,11 +784,6 @@ decode_intra_mb: return 0; } -if(MB_MBAFF){ -h-ref_count[0] = 1; -h-ref_count[1] = 1; -} - fill_decode_neighbors(h, mb_type
Re: [libav-devel] [PATCH 5/6] x86: af_volume: add SSE2-optimized s16 volume scaling
Hi, On Tue, Oct 2, 2012 at 10:54 AM, Justin Ruggles justin.rugg...@gmail.com wrote: On 09/29/2012 01:17 AM, Justin Ruggles wrote: --- libavfilter/af_volume.c |3 ++ libavfilter/x86/Makefile |3 ++ libavfilter/x86/af_volume.asm| 56 ++ libavfilter/x86/af_volume_init.c | 39 ++ 4 files changed, 101 insertions(+), 0 deletions(-) create mode 100644 libavfilter/x86/af_volume.asm create mode 100644 libavfilter/x86/af_volume_init.c ping. Sorry, lgtm. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] Shared library support for MSVC build.
From: Ronald S. Bultje rsbul...@gmail.com This patch adds support for .dll shared library generation using MSVC's native build tools. To properly load non-function symbols from DLL files, we prefix them with __declspec(dllimport) on MSVC for shared library builds when accessed from outside the .dll. For optimal performance, we should do this for functions also, but I'm too lazy to do that right now and it works without. Caveats: 1) you need a new version of c99-to-c89, which will generate the .def files which contains the symbols to be exported. The interface for this isn't stable yet, I'd like to be able to specify symbols on the commandline with wildcard support, e.g. c99wrap.exe -symbols 'avpriv_*' link.exe -dll ..., but for now the hardcoded list does its job. Martin suggested I use the existing .v files for this. 2) a lot of ff_ symbols, and dsputil_init, are accessed from outside the library in which they exist. These are bugs and should be fixed (i.e. renamed). --- configure | 10 ++ libavcodec/ac3tab.h|3 ++- libavcodec/dca.h |3 ++- libavcodec/dnxhddata.h |3 ++- libavcodec/golomb.h|9 + libavcodec/mjpeg.h | 16 libavcodec/mpeg4audio.h|3 ++- libavcodec/mpegaudiodata.h |6 -- libavcodec/raw.h |3 ++- libavcodec/symbols.h | 32 libavformat/gxf.c |3 ++- libavutil/aes.h|4 +++- libavutil/common.h |5 +++-- libavutil/intmath.h|3 ++- libavutil/md5.h|4 +++- libavutil/pixdesc.h|3 ++- libavutil/sha.h|4 +++- libavutil/symbols.h| 32 libavutil/tree.h |4 +++- library.mak|3 ++- 20 files changed, 120 insertions(+), 33 deletions(-) create mode 100644 libavcodec/symbols.h create mode 100644 libavutil/symbols.h diff --git a/configure b/configure index b7a8b77..2828956 100755 --- a/configure +++ b/configure @@ -2616,7 +2616,7 @@ probe_cc(){ _cc_o='-Fo $@' _cc_e='-P -Fi $@' _flags_filter=msvc_flags -_ld_lib='lib%.a' +_ld_lib='%.lib' _ld_path='-libpath:' _flags='-nologo' _cflags='-D_USE_MATH_DEFINES -Dinline=__inline -FIstdlib.h -Dstrtoll=_strtoi64' @@ -3078,11 +3078,13 @@ case $target_os in shlibdir_default=$bindir_default SLIBPREF= SLIBSUF=.dll + LIBPREF= + LIBSUF=.lib SLIBNAME_WITH_VERSION='$(SLIBPREF)$(FULLNAME)-$(LIBVERSION)$(SLIBSUF)' SLIBNAME_WITH_MAJOR='$(SLIBPREF)$(FULLNAME)-$(LIBMAJOR)$(SLIBSUF)' dlltool=${cross_prefix}dlltool if check_cmd lib.exe -list; then -SLIB_EXTRA_CMD='-lib.exe /machine:$(LIBTARGET) /def:$$(@:$(SLIBSUF)=.def) /out:$(SUBDIR)$(SLIBNAME:$(SLIBSUF)=.lib)' +SLIB_EXTRA_CMD='-lib.exe -machine:$(LIBTARGET) -def:$$(@:$(SLIBSUF)=.def) -out:$(SUBDIR)$(SLIBNAME:$(SLIBSUF)=.lib)' if enabled x86_64; then LIBTARGET=x64 fi @@ -3092,8 +3094,8 @@ case $target_os in SLIB_INSTALL_NAME='$(SLIBNAME_WITH_MAJOR)' SLIB_INSTALL_LINKS= SLIB_INSTALL_EXTRA_SHLIB='$(SLIBNAME:$(SLIBSUF)=.lib)' -SLIB_INSTALL_EXTRA_LIB='lib$(SLIBNAME:$(SLIBSUF)=.dll.a) $(SLIBNAME_WITH_MAJOR:$(SLIBSUF)=.def)' -SHFLAGS='-shared -Wl,--output-def,$$(@:$(SLIBSUF)=.def) -Wl,--out-implib,$(SUBDIR)lib$(SLIBNAME:$(SLIBSUF)=.dll.a) -Wl,--enable-runtime-pseudo-reloc -Wl,--enable-auto-image-base' +SLIB_INSTALL_EXTRA_LIB='lib$(SLIBNAME:$(SLIBSUF)=.dll) $(SLIBNAME_WITH_MAJOR:$(SLIBSUF)=.def)' +SHFLAGS='-dll -def:$$(@:$(SLIBSUF)=.def) -implib:$(SUBDIR)lib$(SLIBNAME:$(SLIBSUF)=.dll.a)' objformat=win32 ranlib=: enable dos_paths diff --git a/libavcodec/ac3tab.h b/libavcodec/ac3tab.h index d6e0eed..0063c8d 100644 --- a/libavcodec/ac3tab.h +++ b/libavcodec/ac3tab.h @@ -24,6 +24,7 @@ #include libavutil/common.h #include ac3.h +#include symbols.h #if CONFIG_HARDCODED_TABLES # define HCONST const @@ -33,7 +34,7 @@ extern const uint16_t ff_ac3_frame_size_tab[38][3]; extern const uint8_t ff_ac3_channels_tab[8]; -extern const uint16_t avpriv_ac3_channel_layout_tab[8]; +extern AVCODEC_SYMBOL const uint16_t avpriv_ac3_channel_layout_tab[8]; extern const uint8_t ff_ac3_enc_channel_map[8][2][6]; extern const uint8_t ff_ac3_dec_channel_map[8][2][6]; extern const uint16_t ff_ac3_sample_rate_tab[3]; diff --git a/libavcodec/dca.h b/libavcodec/dca.h index 4d2a829..e754056 100644 --- a/libavcodec/dca.h +++ b/libavcodec/dca.h @@ -26,6 +26,7 @@ #define AVCODEC_DCA_H #include stdint.h +#include symbols.h /** DCA syncwords, also used for bitstream type detection */ #define DCA_MARKER_RAW_BE 0x7FFE8001 @@ -36,6 +37,6 @@ /** DCA-HD specific block starts
Re: [libav-devel] Explicit symbol visibility Was: [PATCH] Shared library support for MSVC build.
Hi, On Tue, Oct 2, 2012 at 7:07 PM, Luca Barbato lu_z...@gentoo.org wrote: On 10/03/2012 03:47 AM, Diego Elio Pettenò wrote: On 02/10/2012 18:40, Luca Barbato wrote: The patch was rejected since it was _really_ invasive and after lots of back and forth we ended up with pattern matching and code w/out any explicit marking. Which is actually done at link time instead than at compile time. Now going with compile time declaration brings us at least a couple of things: a) LTO-enabled compilers should be able then to know what is internal and what is exported; b) tools like my missingstatic can rely on the hidden flag to know what is supposed to be used or static. On the other hand it seems Ronald's patch is for the other side, i.e. the importing of the symbols, not the exporting.. As I said, Ronald approach covers the minimal amount of changes needed to achieve a specific need (shared libraries in windows), I'd rather to cover everything at once and make it useful for everybody. Here my take on the subject, the sed expression to convert the codebase is left for when I'm less sleepy and if there is enough traction to go that way. A symbol exported from avcodec may be imported in avformat, I don't think your patch handles that case. I.e. rather than having av_api, you'd need avcodec_api, avformat_api, etc., to indicate in which dll a particular symbol resides. As for the more generic question, yes I'd like to be able to mark all symbols (functions as well as tables) as exported or imported, but foresee quite some resistance so I didn't go that way right now. If others like that, we can certainly go down that route. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] avutil: add yuva444p10 pixel format
Hi, On Mon, Oct 1, 2012 at 5:03 AM, Luca Barbato lu_z...@gentoo.org wrote: From: Paul B Mahol one...@gmail.com Signed-off-by: Luca Barbato lu_z...@gentoo.org --- libavutil/pixdesc.c | 26 ++ libavutil/pixfmt.h| 3 ++- libswscale/utils.c| 2 ++ tests/ref/lavfi/pixdesc | 2 ++ tests/ref/lavfi/pixfmts_copy | 2 ++ tests/ref/lavfi/pixfmts_null | 2 ++ tests/ref/lavfi/pixfmts_scale | 2 ++ tests/ref/lavfi/pixfmts_vflip | 2 ++ 8 files changed, 40 insertions(+), 1 deletion(-) diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c index d0d5de0..04801a7 100644 --- a/libavutil/pixdesc.c +++ b/libavutil/pixdesc.c @@ -553,6 +553,32 @@ const AVPixFmtDescriptor av_pix_fmt_descriptors[PIX_FMT_NB] = { }, .flags = PIX_FMT_PLANAR, }, +[PIX_FMT_YUVA444P10LE] = { +.name = yuva444p10le, +.nb_components = 4, +.log2_chroma_w = 0, +.log2_chroma_h = 0, +.comp = { +{ 0, 1, 1, 0, 9 },/* Y */ +{ 1, 1, 1, 0, 9 },/* U */ +{ 2, 1, 1, 0, 9 },/* V */ +{ 3, 1, 1, 0, 9 },/* A */ +}, +.flags = PIX_FMT_PLANAR, +}, +[PIX_FMT_YUVA444P10BE] = { +.name = yuva444p10be, +.nb_components = 4, +.log2_chroma_w = 0, +.log2_chroma_h = 0, +.comp = { +{ 0, 1, 1, 0, 9 },/* Y */ +{ 1, 1, 1, 0, 9 },/* U */ +{ 2, 1, 1, 0, 9 },/* V */ +{ 3, 1, 1, 0, 9 },/* A */ +}, +.flags = PIX_FMT_BE | PIX_FMT_PLANAR, +}, [PIX_FMT_VDPAU_H264] = { .name = vdpau_h264, .log2_chroma_w = 1, diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h index f3c504a..d3f10ee 100644 --- a/libavutil/pixfmt.h +++ b/libavutil/pixfmt.h @@ -159,7 +159,8 @@ enum PixelFormat { PIX_FMT_GBRP16LE, /// planar GBR 4:4:4 48bpp, little endian PIX_FMT_YUVA422P, /// planar YUV 4:2:2 24bpp, (1 Cr Cb sample per 2x1 Y A samples) PIX_FMT_YUVA444P, /// planar YUV 4:4:4 32bpp, (1 Cr Cb sample per 1x1 Y A samples) - +PIX_FMT_YUVA444P10BE, /// planar YUV 4:4:4 40bpp, (1 Cr Cb sample per 1x1 Y A samples) +PIX_FMT_YUVA444P10LE, /// planar YUV 4:4:4 40bpp, (1 Cr Cb sample per 1x1 Y A samples) PIX_FMT_NB,/// number of pixel formats, DO NOT USE THIS if you want to link with shared libav* because the number of formats might differ between versions }; diff --git a/libswscale/utils.c b/libswscale/utils.c index 2f78e87..7d9c120 100644 --- a/libswscale/utils.c +++ b/libswscale/utils.c @@ -110,6 +110,8 @@ static const FormatEntry format_entries[PIX_FMT_NB] = { [PIX_FMT_YUVA420P]= { 1, 1 }, [PIX_FMT_YUVA422P]= { 1, 1 }, [PIX_FMT_YUVA444P]= { 1, 1 }, +[PIX_FMT_YUVA444P10BE]= { 1, 1 }, +[PIX_FMT_YUVA444P10LE]= { 1, 1 }, [PIX_FMT_RGB48BE] = { 1, 1 }, [PIX_FMT_RGB48LE] = { 1, 1 }, [PIX_FMT_RGB565BE]= { 1, 1 }, Why not add support for yuva{420,422}p10{be,le} also? And while at it, Why not add yuva{420,422,444}p9{be.le} at the same time? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel