Re: [libav-devel] [PATCH 1/1] h264: fix memleak on error during SPS parsing

2012-12-14 Thread Ronald S. Bultje
Hi,

On Fri, Dec 14, 2012 at 12:09 PM, Janne Grunau janne-li...@jannau.net wrote:
 Introduced in d7d6efe42b0d.
 ---
  libavcodec/h264_ps.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: handle complementary field pairs in await_references()

2012-12-14 Thread Ronald S. Bultje
Hi,

On Fri, Dec 14, 2012 at 7:02 AM, Janne Grunau janne-li...@jannau.net wrote:
 On 2012-12-12 16:16:21 -0800, Ronald S. Bultje wrote:

 On Wed, Dec 12, 2012 at 3:22 PM, Janne Grunau janne-li...@jannau.net wrote:
  On 2012-12-12 14:39:34 -0800, Ronald S. Bultje wrote:
  Hi,
 
  On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net 
  wrote:
   Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid
   field 2.
   ---
libavcodec/h264.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
  
   diff --git a/libavcodec/h264.c b/libavcodec/h264.c
   index 546b046..d5a54e2 100644
   --- a/libavcodec/h264.c
   +++ b/libavcodec/h264.c
   @@ -433,7 +433,8 @@ static void await_references(H264Context *h)
ff_thread_await_progress(ref_pic-f,
 FFMIN((row  1), 
   pic_height - 1),
 0);
   -} else if (FIELD_PICTURE  !ref_field_picture) { // 
   field referencing one field of a frame
   +} else if (FIELD_PICTURE 
   +   (!ref_field_picture || ref_field  1)) { // 
   field referencing one field of a frame or complementary field pair
 
  I don't understand this one. If we're referencing two fields,
  shouldn't ref_field_picture automatically be true?
 
  it isn't and the code that marks both fields of a complementory field
  pair as available doesn't touch field_picture.

 Right, but isn't this wrong then? I mean, the reference is
 field-based, so we're going to reference a field. The part where both
 fields are available doesn't negate the fact that we're only
 referencing one.

 Yes the patch is wrong, I wasn't paying enough attention.
 A complementary ref field pair can't handled in the same way as a frame.
 We reported progress on the individual fields. I don't know how to
 determine which field it references so the patch waits on both. eld

 It seems to me that a change like this could potentially cause use of
 reference data before the second field of a reference is fully
 decoded.

 Yes, see above.

 Janne

 ---8---
 Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid
 field 2.
 ---
  libavcodec/h264.c | 5 +
  1 file changed, 5 insertions(+)

 diff --git a/libavcodec/h264.c b/libavcodec/h264.c
 index f01ff67..be2a98b 100644
 --- a/libavcodec/h264.c
 +++ b/libavcodec/h264.c
 @@ -438,6 +438,11 @@ static void await_references(H264Context *h)
   FFMIN(row * 2 + ref_field,
 pic_height - 1),
   0);
 +} else if (FIELD_PICTURE  ref_field_picture  ref_field 
 == 2) { // field referencing one field of a complementary field pair
 +ff_thread_await_progress(ref_pic-f,
 + FFMIN(row, pic_height - 1), 1);
 +ff_thread_await_progress(ref_pic-f,
 + FFMIN(row, pic_height - 1), 0);
  } else if (FIELD_PICTURE) {

I still don't get it, how does a field reference need to wait for two
fields? The idea of a field reference is that we only need the data of
1 field.

Can I get the sample? I'll try to see what's going on this weekend
perhaps... All this looks weird.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] x86inc: fully concatenate tokens to fix macro expansion for nasm

2012-12-13 Thread Ronald S. Bultje
Hi,

On Thu, Dec 13, 2012 at 1:48 PM, Janne Grunau janne-li...@jannau.net wrote:
 Fixes build errors with nasm introduced in 6f40e9f070f7 for stack
 memory alignment. Noticed by BugMaster.
 ---
  libavutil/x86/x86inc.asm | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 01/15] h264: set parameters from SPS whenever it changes

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote:
 Fixes a crash in the fuzzed sample sample_varPAR.avi_s26638 with
 alternating bit depths.
 ---
  libavcodec/h264.c| 107 
 +--
  libavcodec/h264.h|   2 +
  libavcodec/h264_ps.c |   3 ++
  3 files changed, 65 insertions(+), 47 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 02/15] lavc: do not overwrite frame parameters of delayed frames

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote:
 Decoders supporting frame parameter changes can return delayed frames
 with the old parameters.
 ---
  libavcodec/utils.c | 15 ++-
  1 file changed, 10 insertions(+), 5 deletions(-)

 diff --git a/libavcodec/utils.c b/libavcodec/utils.c
 index 1185a35..97fa570 100644
 --- a/libavcodec/utils.c
 +++ b/libavcodec/utils.c
 @@ -1282,11 +1282,16 @@ int attribute_align_arg 
 avcodec_decode_video2(AVCodecContext *avctx, AVFrame *pi
  else {
  ret = avctx-codec-decode(avctx, picture, got_picture_ptr,
 avpkt);
 -picture-pkt_dts = avpkt-dts;
 -picture-sample_aspect_ratio = avctx-sample_aspect_ratio;
 -picture-width   = avctx-width;
 -picture-height  = avctx-height;
 -picture-format  = avctx-pix_fmt;
 +picture-pkt_dts = avpkt-dts;
 +
 +/* delayed pictures might have different parameters than
 + * the context returning them */
 +if (!(avctx-codec-capabilities  CODEC_CAP_DELAY)) {
 +picture-sample_aspect_ratio = avctx-sample_aspect_ratio;
 +picture-width   = avctx-width;
 +picture-height  = avctx-height;
 +picture-format  = avctx-pix_fmt;
 +}
  }

I can live with this. We can clean this further later.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 03/15] mpegvideo: treat delayed pictures as used

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote:
 This requires to move the avcodec_default_free_buffers() call to
 ff_MPV_common_end() since otherwise delayed pictures would get freed
 during a size change.
 ---
  libavcodec/h264.h  | 6 --
  libavcodec/mpegvideo.c | 8 
  libavcodec/mpegvideo.h | 6 ++
  3 files changed, 10 insertions(+), 10 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 04/15] mpegvideo: align width to 32 for scratch buffer sizes

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote:
 cmdutis.c's alloc_buffer() uses aligned to 32 width plus 2 edges of 32
 pixels as linesize. emu_edge_buffer has to work with the same stride.
 This makes only a difference for  8 bit per pixel bit depths since we
 always allocate for 16 bit per pixel.

 Fixes fuzzed sample nasa-8s2.ts_s244342.
 ---
  libavcodec/mpegvideo.c | 13 -
  1 file changed, 8 insertions(+), 5 deletions(-)

What if we implement a custom get_buffer() which returns bigger strides?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 05/15] h264: initialize frame-mt context copies properly

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote:
 ---
  libavcodec/h264.c | 2 ++
  1 file changed, 2 insertions(+)

 diff --git a/libavcodec/h264.c b/libavcodec/h264.c
 index e8a22f8..546b046 100644
 --- a/libavcodec/h264.c
 +++ b/libavcodec/h264.c
 @@ -1125,6 +1125,8 @@ static int decode_init_thread_copy(AVCodecContext 
 *avctx)
  memset(h-sps_buffers, 0, sizeof(h-sps_buffers));
  memset(h-pps_buffers, 0, sizeof(h-pps_buffers));

 +h-s.context_initialized = 0;
 +
  return 0;
  }

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 06/15] h264: handle complementary field pairs in await_references()

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote:
 Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid
 field 2.
 ---
  libavcodec/h264.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/libavcodec/h264.c b/libavcodec/h264.c
 index 546b046..d5a54e2 100644
 --- a/libavcodec/h264.c
 +++ b/libavcodec/h264.c
 @@ -433,7 +433,8 @@ static void await_references(H264Context *h)
  ff_thread_await_progress(ref_pic-f,
   FFMIN((row  1), pic_height - 
 1),
   0);
 -} else if (FIELD_PICTURE  !ref_field_picture) { // field 
 referencing one field of a frame
 +} else if (FIELD_PICTURE 
 +   (!ref_field_picture || ref_field  1)) { // field 
 referencing one field of a frame or complementary field pair

I don't understand this one. If we're referencing two fields,
shouldn't ref_field_picture automatically be true?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 04/15] mpegvideo: align width to 32 for scratch buffer sizes

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 3:12 PM, Janne Grunau janne-li...@jannau.net wrote:
 On 2012-12-12 14:30:43 -0800, Ronald S. Bultje wrote:
 Hi,

 On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net 
 wrote:
  cmdutis.c's alloc_buffer() uses aligned to 32 width plus 2 edges of 32
  pixels as linesize. emu_edge_buffer has to work with the same stride.
  This makes only a difference for  8 bit per pixel bit depths since we
  always allocate for 16 bit per pixel.
 
  Fixes fuzzed sample nasa-8s2.ts_s244342.
  ---
   libavcodec/mpegvideo.c | 13 -
   1 file changed, 8 insertions(+), 5 deletions(-)

 What if we implement a custom get_buffer() which returns bigger strides?

 it fails horrible for high pixel bit depth like it does with this
 sample. I could add an assumed_linesize into the context and realloc if
 the linesize returned by get_buffer is larger or maybe preallocate a
 buffer here if it' feasible.

Can't you allocate these buffers after you receive the first buffer
from get_buffer() and actually know the linesize?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 11/15] h264: prevent starting new frames after ff_thread_finish_setup()

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net wrote:
 ---
  libavcodec/h264.c | 13 -
  libavcodec/h264.h |  1 +
  2 files changed, 13 insertions(+), 1 deletion(-)

In what situation does this happen? We should really make this an
assert() at some point, and track the cases where it does happen back
to ensure we fix it at whatever level. (The current hack to decode two
fields in a packet is a hack and should die; we should parse such
files better and split at the field boundary.)

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 06/15] h264: handle complementary field pairs in await_references()

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 4:29 PM, Janne Grunau janne-li...@jannau.net wrote:
 On 2012-12-12 16:16:21 -0800, Ronald S. Bultje wrote:
 Hi,

 On Wed, Dec 12, 2012 at 3:22 PM, Janne Grunau janne-li...@jannau.net wrote:
  On 2012-12-12 14:39:34 -0800, Ronald S. Bultje wrote:
  Hi,
 
  On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau janne-li...@jannau.net 
  wrote:
   Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid
   field 2.
   ---
libavcodec/h264.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
  
   diff --git a/libavcodec/h264.c b/libavcodec/h264.c
   index 546b046..d5a54e2 100644
   --- a/libavcodec/h264.c
   +++ b/libavcodec/h264.c
   @@ -433,7 +433,8 @@ static void await_references(H264Context *h)
ff_thread_await_progress(ref_pic-f,
 FFMIN((row  1), 
   pic_height - 1),
 0);
   -} else if (FIELD_PICTURE  !ref_field_picture) { // 
   field referencing one field of a frame
   +} else if (FIELD_PICTURE 
   +   (!ref_field_picture || ref_field  1)) { // 
   field referencing one field of a frame or complementary field pair
 
  I don't understand this one. If we're referencing two fields,
  shouldn't ref_field_picture automatically be true?
 
  it isn't and the code that marks both fields of a complementory field
  pair as available doesn't touch field_picture.

 Right, but isn't this wrong then? I mean, the reference is
 field-based, so we're going to reference a field. The part where both
 fields are available doesn't negate the fact that we're only
 referencing one.

 It seems to me that a change like this could potentially cause use of
 reference data before the second field of a reference is fully
 decoded.

 I need to check

 What does the hang look like?

 ref_field = ref_pic-f.reference;

 and it waits on ref_field in the next else that breaks when reference is
 PICT_FRAME

that's BOTTOM_FIELD | TOP_FIELD.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-08 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
---
 libavcodec/x86/h264_deblock.asm |  27 ++
 libavcodec/x86/h264dsp_init.c   |   4 +-
 libavcodec/x86/vp8dsp.asm   |  68 ---
 libavcodec/x86/vp8dsp_init.c|   8 --
 libavutil/x86/x86inc.asm| 185 
 5 files changed, 191 insertions(+), 101 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index c124c4d..1f1dbc6 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -398,14 +398,12 @@ DEBLOCK_LUMA
 ;-
 ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
-cglobal deblock_%1_luma_8, 5,5
+cglobal deblock_%1_luma_8, 5,5,8,2*%2
 lea r4, [r1*3]
 dec r2 ; alpha-1
 neg r4
 dec r3 ; beta-1
 add r4, r0 ; pix-3*stride
-%assign pad 2*%2+12-(stack_offset15)
-SUB esp, pad
 
 movam0, [r4+r1]   ; p1
 movam1, [r4+2*r1] ; p0
@@ -443,22 +441,19 @@ cglobal deblock_%1_luma_8, 5,5
 DEBLOCK_P0_Q0
 mova[r4+2*r1], m1
 mova[r0], m2
-ADD esp, pad
 RET
 
 ;-
 ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
 INIT_MMX cpuname
-cglobal deblock_h_luma_8, 0,5
+cglobal deblock_h_luma_8, 0,5,8,0x60+HAVE_ALIGNED_STACK*12
 movr0, r0mp
 movr3, r1m
 lear4, [r3*3]
 subr0, 4
 lear1, [r0+r4]
-%assign pad 0x78-(stack_offset15)
-SUBesp, pad
-%define pix_tmp esp+12
+%define pix_tmp esp+12*HAVE_ALIGNED_STACK
 
 ; transpose 6x16 - tmp space
 TRANSPOSE6x8_MEM  PASS8ROWS(r0, r1, r3, r4), pix_tmp
@@ -500,7 +495,6 @@ cglobal deblock_h_luma_8, 0,5
 movq   m3, [pix_tmp+0x48]
 TRANSPOSE8x4B_STORE  PASS8ROWS(r0, r1, r3, r4)
 
-ADDesp, pad
 RET
 %endmacro ; DEBLOCK_LUMA
 
@@ -631,7 +625,7 @@ DEBLOCK_LUMA v, 16
 %define mpb_0 m14
 %define mpb_1 m15
 %else
-%define spill(x) [esp+16*x+((stack_offset+4)15)]
+%define spill(x) [esp+16*x]
 %define p2 [r4+r1]
 %define q2 [r0+2*r1]
 %define t4 spill(0)
@@ -646,10 +640,7 @@ DEBLOCK_LUMA v, 16
 ;-
 ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta )
 ;-
-cglobal deblock_%1_luma_intra_8, 4,6,16
-%if ARCH_X86_64 == 0
-sub esp, 0x60
-%endif
+cglobal deblock_%1_luma_intra_8, 4,6,16,ARCH_X86_64*0x50-0x50
 lea r4, [r1*4]
 lea r5, [r1*3] ; 3*stride
 dec r2d; alpha-1
@@ -698,9 +689,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16
 LUMA_INTRA_SWAP_PQ
 LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5]
 .end:
-%if ARCH_X86_64 == 0
-add esp, 0x60
-%endif
 RET
 
 INIT_MMX cpuname
@@ -737,12 +725,10 @@ cglobal deblock_h_luma_intra_8, 4,9
 addrsp, 0x88
 RET
 %else
-cglobal deblock_h_luma_intra_8, 2,4
+cglobal deblock_h_luma_intra_8, 2,4,8,0x80
 lear3,  [r1*3]
 subr0,  4
 lear2,  [r0+r3]
-%assign pad 0x8c-(stack_offset15)
-SUBrsp, pad
 %define pix_tmp rsp
 
 ; transpose 8x16 - tmp space
@@ -773,7 +759,6 @@ cglobal deblock_h_luma_intra_8, 2,4
 lear0,  [r0+r1*8]
 lear2,  [r2+r1*8]
 TRANSPOSE8x8_MEM  PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), 
PASS8ROWS(r0, r2, r1, r3)
-ADDrsp, pad
 RET
 %endif ; ARCH_X86_64
 %endmacro ; DEBLOCK_LUMA_INTRA
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index ac231cb..73d4990 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int 
bit_depth,
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2;
 c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2;
 
-#if HAVE_ALIGNED_STACK
 c-h264_v_loop_filter_luma   = 
ff_deblock_v_luma_8_sse2;
 c-h264_h_loop_filter_luma   = 
ff_deblock_h_luma_8_sse2;
 c-h264_v_loop_filter_luma_intra = 
ff_deblock_v_luma_intra_8_sse2;
 c-h264_h_loop_filter_luma_intra = 
ff_deblock_h_luma_intra_8_sse2;
-#endif /* HAVE_ALIGNED_STACK */
 }
 if (EXTERNAL_SSSE3(mm_flags)) {
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3;
 c

Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-08 Thread Ronald S. Bultje
Hi,

On Sat, Dec 8, 2012 at 8:41 AM, Måns Rullgård m...@mansr.com wrote:
 Ronald S. Bultje rsbul...@gmail.com writes:

 From: Ronald S. Bultje rsbul...@gmail.com

 Use this in VP8/H264-8bit loopfilter functions so they can be used if
 there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
 ---
  libavcodec/x86/h264_deblock.asm |  27 ++
  libavcodec/x86/h264dsp_init.c   |   4 +-
  libavcodec/x86/vp8dsp.asm   |  68 ---
  libavcodec/x86/vp8dsp_init.c|   8 --
  libavutil/x86/x86inc.asm| 185 
 
  5 files changed, 191 insertions(+), 101 deletions(-)

 How is this different from the patch you sent yesterday?

It adds a missing %endrep for win64.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-08 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
---
 libavcodec/x86/h264_deblock.asm |  27 ++
 libavcodec/x86/h264dsp_init.c   |   4 +-
 libavcodec/x86/vp8dsp.asm   |  68 ---
 libavcodec/x86/vp8dsp_init.c|   8 --
 libavutil/x86/x86inc.asm| 185 
 5 files changed, 191 insertions(+), 101 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index c124c4d..1f1dbc6 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -398,14 +398,12 @@ DEBLOCK_LUMA
 ;-
 ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
-cglobal deblock_%1_luma_8, 5,5
+cglobal deblock_%1_luma_8, 5,5,8,2*%2
 lea r4, [r1*3]
 dec r2 ; alpha-1
 neg r4
 dec r3 ; beta-1
 add r4, r0 ; pix-3*stride
-%assign pad 2*%2+12-(stack_offset15)
-SUB esp, pad
 
 movam0, [r4+r1]   ; p1
 movam1, [r4+2*r1] ; p0
@@ -443,22 +441,19 @@ cglobal deblock_%1_luma_8, 5,5
 DEBLOCK_P0_Q0
 mova[r4+2*r1], m1
 mova[r0], m2
-ADD esp, pad
 RET
 
 ;-
 ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
 INIT_MMX cpuname
-cglobal deblock_h_luma_8, 0,5
+cglobal deblock_h_luma_8, 0,5,8,0x60+HAVE_ALIGNED_STACK*12
 movr0, r0mp
 movr3, r1m
 lear4, [r3*3]
 subr0, 4
 lear1, [r0+r4]
-%assign pad 0x78-(stack_offset15)
-SUBesp, pad
-%define pix_tmp esp+12
+%define pix_tmp esp+12*HAVE_ALIGNED_STACK
 
 ; transpose 6x16 - tmp space
 TRANSPOSE6x8_MEM  PASS8ROWS(r0, r1, r3, r4), pix_tmp
@@ -500,7 +495,6 @@ cglobal deblock_h_luma_8, 0,5
 movq   m3, [pix_tmp+0x48]
 TRANSPOSE8x4B_STORE  PASS8ROWS(r0, r1, r3, r4)
 
-ADDesp, pad
 RET
 %endmacro ; DEBLOCK_LUMA
 
@@ -631,7 +625,7 @@ DEBLOCK_LUMA v, 16
 %define mpb_0 m14
 %define mpb_1 m15
 %else
-%define spill(x) [esp+16*x+((stack_offset+4)15)]
+%define spill(x) [esp+16*x]
 %define p2 [r4+r1]
 %define q2 [r0+2*r1]
 %define t4 spill(0)
@@ -646,10 +640,7 @@ DEBLOCK_LUMA v, 16
 ;-
 ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta )
 ;-
-cglobal deblock_%1_luma_intra_8, 4,6,16
-%if ARCH_X86_64 == 0
-sub esp, 0x60
-%endif
+cglobal deblock_%1_luma_intra_8, 4,6,16,ARCH_X86_64*0x50-0x50
 lea r4, [r1*4]
 lea r5, [r1*3] ; 3*stride
 dec r2d; alpha-1
@@ -698,9 +689,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16
 LUMA_INTRA_SWAP_PQ
 LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5]
 .end:
-%if ARCH_X86_64 == 0
-add esp, 0x60
-%endif
 RET
 
 INIT_MMX cpuname
@@ -737,12 +725,10 @@ cglobal deblock_h_luma_intra_8, 4,9
 addrsp, 0x88
 RET
 %else
-cglobal deblock_h_luma_intra_8, 2,4
+cglobal deblock_h_luma_intra_8, 2,4,8,0x80
 lear3,  [r1*3]
 subr0,  4
 lear2,  [r0+r3]
-%assign pad 0x8c-(stack_offset15)
-SUBrsp, pad
 %define pix_tmp rsp
 
 ; transpose 8x16 - tmp space
@@ -773,7 +759,6 @@ cglobal deblock_h_luma_intra_8, 2,4
 lear0,  [r0+r1*8]
 lear2,  [r2+r1*8]
 TRANSPOSE8x8_MEM  PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), 
PASS8ROWS(r0, r2, r1, r3)
-ADDrsp, pad
 RET
 %endif ; ARCH_X86_64
 %endmacro ; DEBLOCK_LUMA_INTRA
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index ac231cb..73d4990 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int 
bit_depth,
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2;
 c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2;
 
-#if HAVE_ALIGNED_STACK
 c-h264_v_loop_filter_luma   = 
ff_deblock_v_luma_8_sse2;
 c-h264_h_loop_filter_luma   = 
ff_deblock_h_luma_8_sse2;
 c-h264_v_loop_filter_luma_intra = 
ff_deblock_v_luma_intra_8_sse2;
 c-h264_h_loop_filter_luma_intra = 
ff_deblock_h_luma_intra_8_sse2;
-#endif /* HAVE_ALIGNED_STACK */
 }
 if (EXTERNAL_SSSE3(mm_flags)) {
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3;
 c

Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-08 Thread Ronald S. Bultje
Hi,

On Sat, Dec 8, 2012 at 4:12 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
 From: Ronald S. Bultje rsbul...@gmail.com

 Use this in VP8/H264-8bit loopfilter functions so they can be used if
 there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
 ---
  libavcodec/x86/h264_deblock.asm |  27 ++
  libavcodec/x86/h264dsp_init.c   |   4 +-
  libavcodec/x86/vp8dsp.asm   |  68 ---
  libavcodec/x86/vp8dsp_init.c|   8 --
  libavutil/x86/x86inc.asm| 185 
 
  5 files changed, 191 insertions(+), 101 deletions(-)

One more fix for invalid stack free'ing if a YMM function on win64
used 6 registers, but no stack. Any more reviews, or can this be
applied?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-07 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
---
 libavcodec/x86/h264_deblock.asm |  27 ++
 libavcodec/x86/h264dsp_init.c   |   4 +-
 libavcodec/x86/vp8dsp.asm   |  68 ---
 libavcodec/x86/vp8dsp_init.c|   8 --
 libavutil/x86/x86inc.asm| 186 
 5 files changed, 191 insertions(+), 102 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index c124c4d..1f1dbc6 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -398,14 +398,12 @@ DEBLOCK_LUMA
 ;-
 ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
-cglobal deblock_%1_luma_8, 5,5
+cglobal deblock_%1_luma_8, 5,5,8,2*%2
 lea r4, [r1*3]
 dec r2 ; alpha-1
 neg r4
 dec r3 ; beta-1
 add r4, r0 ; pix-3*stride
-%assign pad 2*%2+12-(stack_offset15)
-SUB esp, pad
 
 movam0, [r4+r1]   ; p1
 movam1, [r4+2*r1] ; p0
@@ -443,22 +441,19 @@ cglobal deblock_%1_luma_8, 5,5
 DEBLOCK_P0_Q0
 mova[r4+2*r1], m1
 mova[r0], m2
-ADD esp, pad
 RET
 
 ;-
 ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
 INIT_MMX cpuname
-cglobal deblock_h_luma_8, 0,5
+cglobal deblock_h_luma_8, 0,5,8,0x60+HAVE_ALIGNED_STACK*12
 movr0, r0mp
 movr3, r1m
 lear4, [r3*3]
 subr0, 4
 lear1, [r0+r4]
-%assign pad 0x78-(stack_offset15)
-SUBesp, pad
-%define pix_tmp esp+12
+%define pix_tmp esp+12*HAVE_ALIGNED_STACK
 
 ; transpose 6x16 - tmp space
 TRANSPOSE6x8_MEM  PASS8ROWS(r0, r1, r3, r4), pix_tmp
@@ -500,7 +495,6 @@ cglobal deblock_h_luma_8, 0,5
 movq   m3, [pix_tmp+0x48]
 TRANSPOSE8x4B_STORE  PASS8ROWS(r0, r1, r3, r4)
 
-ADDesp, pad
 RET
 %endmacro ; DEBLOCK_LUMA
 
@@ -631,7 +625,7 @@ DEBLOCK_LUMA v, 16
 %define mpb_0 m14
 %define mpb_1 m15
 %else
-%define spill(x) [esp+16*x+((stack_offset+4)15)]
+%define spill(x) [esp+16*x]
 %define p2 [r4+r1]
 %define q2 [r0+2*r1]
 %define t4 spill(0)
@@ -646,10 +640,7 @@ DEBLOCK_LUMA v, 16
 ;-
 ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta )
 ;-
-cglobal deblock_%1_luma_intra_8, 4,6,16
-%if ARCH_X86_64 == 0
-sub esp, 0x60
-%endif
+cglobal deblock_%1_luma_intra_8, 4,6,16,ARCH_X86_64*0x50-0x50
 lea r4, [r1*4]
 lea r5, [r1*3] ; 3*stride
 dec r2d; alpha-1
@@ -698,9 +689,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16
 LUMA_INTRA_SWAP_PQ
 LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5]
 .end:
-%if ARCH_X86_64 == 0
-add esp, 0x60
-%endif
 RET
 
 INIT_MMX cpuname
@@ -737,12 +725,10 @@ cglobal deblock_h_luma_intra_8, 4,9
 addrsp, 0x88
 RET
 %else
-cglobal deblock_h_luma_intra_8, 2,4
+cglobal deblock_h_luma_intra_8, 2,4,8,0x80
 lear3,  [r1*3]
 subr0,  4
 lear2,  [r0+r3]
-%assign pad 0x8c-(stack_offset15)
-SUBrsp, pad
 %define pix_tmp rsp
 
 ; transpose 8x16 - tmp space
@@ -773,7 +759,6 @@ cglobal deblock_h_luma_intra_8, 2,4
 lear0,  [r0+r1*8]
 lear2,  [r2+r1*8]
 TRANSPOSE8x8_MEM  PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), 
PASS8ROWS(r0, r2, r1, r3)
-ADDrsp, pad
 RET
 %endif ; ARCH_X86_64
 %endmacro ; DEBLOCK_LUMA_INTRA
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index ac231cb..73d4990 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int 
bit_depth,
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2;
 c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2;
 
-#if HAVE_ALIGNED_STACK
 c-h264_v_loop_filter_luma   = 
ff_deblock_v_luma_8_sse2;
 c-h264_h_loop_filter_luma   = 
ff_deblock_h_luma_8_sse2;
 c-h264_v_loop_filter_luma_intra = 
ff_deblock_v_luma_intra_8_sse2;
 c-h264_h_loop_filter_luma_intra = 
ff_deblock_h_luma_intra_8_sse2;
-#endif /* HAVE_ALIGNED_STACK */
 }
 if (EXTERNAL_SSSE3(mm_flags)) {
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3;
 c

[libav-devel] [PATCH] Check for invalid VLC code in zeros_left before writing coefficients.

2012-12-07 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

This prevents an invalid write into coeffs[scantable[-1]] if zeros_left
itself was an invalid VLC code (and thus -1).
---
 libavcodec/h264_cavlc.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/libavcodec/h264_cavlc.c b/libavcodec/h264_cavlc.c
index 8702122..fa2bd99 100644
--- a/libavcodec/h264_cavlc.c
+++ b/libavcodec/h264_cavlc.c
@@ -610,17 +610,17 @@ static int decode_residual(H264Context *h, GetBitContext 
*gb, DCTELEM *block, in
 } \
 }
 
+if(zeros_left0){
+av_log(h-s.avctx, AV_LOG_ERROR, negative number of zero coeffs at %d 
%d\n, s-mb_x, s-mb_y);
+return -1;
+}
+
 if (h-pixel_shift) {
 STORE_BLOCK(int32_t)
 } else {
 STORE_BLOCK(int16_t)
 }
 
-if(zeros_left0){
-av_log(h-s.avctx, AV_LOG_ERROR, negative number of zero coeffs at %d 
%d\n, s-mb_x, s-mb_y);
-return -1;
-}
-
 return 0;
 }
 
-- 
1.8.0

___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-07 Thread Ronald S. Bultje
Hi,

On Fri, Dec 7, 2012 at 1:01 PM, Måns Rullgård m...@mansr.com wrote:
 Ronald S. Bultje rsbul...@gmail.com writes:

 +%if mmsize = 16  HAVE_ALIGNED_STACK

 How much overhead would it be to drop HAVE_ALIGNED_STACK entirely?

Well, for now, we still have a ton of functions that don't use the
cglobal-method of allocating stack. I only ported h264/vp8 loopfilter,
nothing else.

But anyway, more generally, it's 4-5 instructions per function. For
typical functions with an inner loop, that's negligible, but for a
select small set of functions, it may be significant.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] golomb: use unsigned arithmetics in svq3_get_ue_golomb()

2012-12-07 Thread Ronald S. Bultje
Hi,

On Fri, Nov 30, 2012 at 10:56 AM, Janne Grunau janne-li...@jannau.net wrote:
 This prevents undefined behaviour of signed left shift if the coded
 value is larger than 2^31. Large values are most likely invalid and
 caused errors or by feeding random.

 Validate every use of svq3_get_ue_golomb() and changed the place there
 the return value was compared with negative numbers. dirac.c was clean,
 fixed rv30 and svq3.
 ---
  libavcodec/golomb.h |  5 +++--
  libavcodec/rv30.c   |  6 +++---
  libavcodec/svq3.c   | 17 -
  3 files changed, 14 insertions(+), 14 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-07 Thread Ronald S. Bultje
Hi,

On Fri, Dec 7, 2012 at 2:01 PM, Måns Rullgård m...@mansr.com wrote:
 Ronald S. Bultje rsbul...@gmail.com writes:
 On Fri, Dec 7, 2012 at 1:01 PM, Måns Rullgård m...@mansr.com wrote:
 Ronald S. Bultje rsbul...@gmail.com writes:

 +%if mmsize = 16  HAVE_ALIGNED_STACK

 How much overhead would it be to drop HAVE_ALIGNED_STACK entirely?

 Well, for now, we still have a ton of functions that don't use the
 cglobal-method of allocating stack. I only ported h264/vp8 loopfilter,
 nothing else.

 But anyway, more generally, it's 4-5 instructions per function. For
 typical functions with an inner loop, that's negligible, but for a
 select small set of functions, it may be significant.

 The remaining functions are ff_h264_idct8_add(4)_10_{sse2,avx},
 ff_hadamard8_diff(16)_{sse2,ssse3}, and something in swscale.

lavr also.

 Besides, does anyone still use 32-bit where performance is that
 critical?

This is used for YMM (e.g. avx float) stack alignment (to 32-byte)
also, so it will affect 64-bit also. My personal point of view is that
the code to take advantage of an actual feature of the compiler/system
(alignment) is there. I don't see why we'd remove it.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-07 Thread Ronald S. Bultje
Hi,

On Fri, Dec 7, 2012 at 2:08 PM, Måns Rullgård m...@mansr.com wrote:
 Ronald S. Bultje rsbul...@gmail.com writes:

 Hi,

 On Fri, Dec 7, 2012 at 2:01 PM, Måns Rullgård m...@mansr.com wrote:
 Ronald S. Bultje rsbul...@gmail.com writes:
 On Fri, Dec 7, 2012 at 1:01 PM, Måns Rullgård m...@mansr.com wrote:
 Ronald S. Bultje rsbul...@gmail.com writes:

 +%if mmsize = 16  HAVE_ALIGNED_STACK

 How much overhead would it be to drop HAVE_ALIGNED_STACK entirely?

 Well, for now, we still have a ton of functions that don't use the
 cglobal-method of allocating stack. I only ported h264/vp8 loopfilter,
 nothing else.

 But anyway, more generally, it's 4-5 instructions per function. For
 typical functions with an inner loop, that's negligible, but for a
 select small set of functions, it may be significant.

 The remaining functions are ff_h264_idct8_add(4)_10_{sse2,avx},
 ff_hadamard8_diff(16)_{sse2,ssse3}, and something in swscale.

 lavr also.

 There are no references to HAVE_ALIGNED_STACK there.

It crashes on x86-32 icc10.x and msvc.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/2] x86: float_dsp: fix loading of the len parameter on x86-32

2012-12-05 Thread Ronald S. Bultje
Hi,

On Wed, Dec 5, 2012 at 9:53 AM, Justin Ruggles justin.rugg...@gmail.com wrote:
 ---
  libavutil/x86/float_dsp.asm |3 +++
  1 files changed, 3 insertions(+), 0 deletions(-)

 diff --git a/libavutil/x86/float_dsp.asm b/libavutil/x86/float_dsp.asm
 index 4a1742f..dc75532 100644
 --- a/libavutil/x86/float_dsp.asm
 +++ b/libavutil/x86/float_dsp.asm
 @@ -127,6 +127,9 @@ cglobal vector_dmul_scalar, 3,3,3, dst, src, len
  cglobal vector_dmul_scalar, 4,4,3, dst, src, mul, len
  %endif
  %if ARCH_X86_32
 +; PROLOGUE loads len from the wrong stack address because mul is an 
 8-byte
 +; parameter and PROLOGUE assumes all parameters are 4-byte
 +mov  lenq, [esp+0x18]
  VBROADCASTSD   m0, mulm

That loads len twice. Why not:

%if ARCH_X86_32
cglobal vector_dmul_scalar, 3,4,3, dst, src, mul, len, lenaddr
mov lenq, lenaddrm
%else
cglobal vector_dmul_scalar, 4,4,3, dst, src, mul, len
%endif

which is more functionally correct?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/2] x86: float_dsp: fix compilation of ff_vector_dmul_scalar_avx() on x86-32

2012-12-05 Thread Ronald S. Bultje
Hi,

On Wed, Dec 5, 2012 at 9:53 AM, Justin Ruggles justin.rugg...@gmail.com wrote:
 ---
  libavutil/x86/float_dsp.asm |2 +-
  1 files changed, 1 insertions(+), 1 deletions(-)

 diff --git a/libavutil/x86/float_dsp.asm b/libavutil/x86/float_dsp.asm
 index d8fd93a..4a1742f 100644
 --- a/libavutil/x86/float_dsp.asm
 +++ b/libavutil/x86/float_dsp.asm
 @@ -127,7 +127,7 @@ cglobal vector_dmul_scalar, 3,3,3, dst, src, len
  cglobal vector_dmul_scalar, 4,4,3, dst, src, mul, len
  %endif
  %if ARCH_X86_32
 -VBROADCASTSD xmm0, mulm
 +VBROADCASTSD   m0, mulm

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: slice-mt: check master context for valid current_picture_ptr

2012-12-05 Thread Ronald S. Bultje
Hi,

On Wed, Dec 5, 2012 at 11:10 AM, Janne Grunau janne-li...@jannau.net wrote:
 Fixes errors in slice based multithreading introduced in 0b300daad2f5.

 CC: libav-sta...@libav.org
 ---
  libavcodec/h264.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/libavcodec/h264.c b/libavcodec/h264.c
 index b9c46ea..317ee95 100644
 --- a/libavcodec/h264.c
 +++ b/libavcodec/h264.c
 @@ -2782,7 +2782,7 @@ static int decode_slice_header(H264Context *h, 
 H264Context *h0)
  s-picture_structure = last_pic_structure;
  s-dropable  = last_pic_dropable;
  return AVERROR_INVALIDDATA;
 -} else if (!s-current_picture_ptr) {
 +} else if (!s0-current_picture_ptr) {
  av_log(s-avctx, AV_LOG_ERROR,
 unset current_picture_ptr on %d. slice\n,
 h0-current_slice + 1);

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: slice-mt: get last_pic_dropable from master context

2012-12-05 Thread Ronald S. Bultje
Hi,

On Wed, Dec 5, 2012 at 11:00 AM, Janne Grunau janne-li...@jannau.net wrote:
 Fixes fate-h264-conformance-cvnlfi2_sony_h and smllwebdl.mkv from
 https://github.com/OpenELEC/OpenELEC.tv/issues/1557 .
 ---
  libavcodec/h264.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/libavcodec/h264.c b/libavcodec/h264.c
 index cd8aeec..b9c46ea 100644
 --- a/libavcodec/h264.c
 +++ b/libavcodec/h264.c
 @@ -2759,7 +2759,7 @@ static int decode_slice_header(H264Context *h, 
 H264Context *h0)
  h-mb_mbaff= 0;
  h-mb_aff_frame= 0;
  last_pic_structure = s0-picture_structure;
 -last_pic_dropable  = s-dropable;
 +last_pic_dropable  = s0-dropable;

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 03/10] SBR DSP x86: implement SSE qmf_post_shuffle

2012-11-30 Thread Ronald S. Bultje
Hi,

On Fri, Nov 30, 2012 at 1:14 PM, Christophe Gisquet
christophe.gisq...@gmail.com wrote:
 Hello,

 2012/11/30 Loren Merritt lor...@u.washington.edu:
 If you increment an index into W and z rather than the pointers
 themselves, then you can eliminate an add and a cmp.

 I add already tested that, and redid it:
 cglobal sbr_qmf_post_shuffle, 2,4,3,W,z
   mov   r3q, 32*4
   lea   r2q, [zq + (64-4)*4]
   addzq, r3q
   leaWq, [Wq + 2*r3q]
   neg   r3q
 .loop:
   mova   m0, [r2q]
   mova   m1, [zq  + r3q]
   xorps  m0, [ps_neg]
   shufps m0, m0, 0x1B
   mova   m2, m0
   unpcklps   m0, m1
   unpckhps   m2, m1
   mova  [Wq + 2*r3q +  0], m0
   mova  [Wq + 2*r3q + 16], m2
   sub   r2q, 16
   add   r3q, 16
   jl  .loop
   REP_RET

 It's 2 cycles slower on Penrynn/Win64 (154 vs 152).

Try adding an ALIGN 16 just above .loop:, maybe that fixes it?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 3/4] h264: check ref_count validity for num_ref_idx_active_override_flag

2012-11-26 Thread Ronald S. Bultje
On Mon, Nov 26, 2012 at 7:54 AM, Janne Grunau janne-li...@jannau.net wrote:
 On 2012-11-26 07:01:20 -0800, Ronald S. Bultje wrote:
 Hi,

 On Mon, Nov 26, 2012 at 4:06 AM, Janne Grunau janne-li...@jannau.net wrote:
  Fixes segfault in the fuzzed sample bipbop234.ts_s226407.
 
  CC: libav-sta...@libav.org
  ---
   libavcodec/h264.c | 7 ++-
   1 file changed, 6 insertions(+), 1 deletion(-)
 
  diff --git a/libavcodec/h264.c b/libavcodec/h264.c
  index eca4636..730e34a 100644
  --- a/libavcodec/h264.c
  +++ b/libavcodec/h264.c
  @@ -2896,8 +2896,13 @@ static int decode_slice_header(H264Context *h, 
  H264Context *h0)
 
   if (num_ref_idx_active_override_flag) {
   h-ref_count[0] = get_ue_golomb(s-gb) + 1;
  -if (h-slice_type_nos == AV_PICTURE_TYPE_B)
  +if (h-ref_count[0]  1)
  +return AVERROR_INVALIDDATA;
  +if (h-slice_type_nos == AV_PICTURE_TYPE_B) {
   h-ref_count[1] = get_ue_golomb(s-gb) + 1;
  +if (h-ref_count[1]  1)
  +return AVERROR_INVALIDDATA;
  +}
   }
 
   if (h-slice_type_nos == AV_PICTURE_TYPE_B)
  --
  1.7.12.4

 get_ue_golomb() guarantees to return something non-negative, so is
 this an integer overflow? We should then prevent the integer overflow.

 It returns -1 if the next 32 bits are all 0. That is arguebly not a
 valid exp-golomb code but occurs with the sample mentioned in the commit
 message (available in incoming/2012-11-15_fuzz_h264_janne as my other
 fuzzed samples).

Oh I see, patch OK then.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] wavenc: write fact chunk sample count at the correct file position

2012-11-26 Thread Ronald S. Bultje
Hi,

On Mon, Nov 26, 2012 at 4:11 PM, Justin Ruggles
justin.rugg...@gmail.com wrote:
 From: Michael Niedermayer michae...@gmx.at

 Fixes curruption of metadata in the INFO chunk.

 Signed-off-by: Michael Niedermayer michae...@gmx.at
 Signed-off-by: Justin Ruggles justin.rugg...@gmail.com
 ---
  libavformat/wavenc.c |9 +
  1 files changed, 5 insertions(+), 4 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] dsputil: modify scalarproduct_int16 to handle mod8 numbers of loops.

2012-11-25 Thread Ronald S. Bultje
Hi,

On Sun, Nov 25, 2012 at 1:18 AM, Diego Biurrun di...@biurrun.de wrote:
 On Sat, Nov 24, 2012 at 03:43:04PM +0100, Christophe Gisquet wrote:

 PS: no idea why firefox/gmail treats those patches as octet-stream,
 and how to fix that.

 Ronald wrote about this a while back, but there seems to be no easy
 solution:

 http://blogs.gnome.org/rbultje/2009/03/04/firefox-gmail-and-content-type-of-attachments/

The solution is basically to use git send-email, not gmail. git
send-email can use your gmail account as smtp, if wanted. I can help
you set that up if this doesn't help enough:

http://morefedora.blogspot.com/2009/02/configuring-git-send-email-to-use-gmail.html

(This is kind of a pain on Fedora, since you'll first have to install
a bunch of perl extensions. I'm gonna hope you're not using a Mac at
this point. See
http://kbase.wincent.com/old/knowledge-base/Installing_Net::SMTP::SSL_for_sending_patches_with_Git_over_secure_SMTP.html
for gory details.)

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] dsputil: modify scalarproduct_int16 to handle mod8 numbers of loops.

2012-11-25 Thread Ronald S. Bultje
Hi,

On Sun, Nov 25, 2012 at 8:43 AM, Ronald S. Bultje rsbul...@gmail.com wrote:
 (This is kind of a pain on Fedora

Whoops, I meant Mac here.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: enable low delay only if no delayed frames were seen

2012-11-25 Thread Ronald S. Bultje
Hi,

On Sun, Nov 25, 2012 at 2:40 PM, Janne Grunau janne-li...@jannau.net wrote:
 Dropping frames is undesirable but that is the only way by which the
 decoder could return to low delay mode. Instead emit a warning and
 continue with delayed frames.
 Fixes a crash in fuzzed sample nasa-8s2.ts_s20033 caused by a larger
 than expected has_b_frames value. Low delay keeps getting re-enabled
 from a presumely broken SPS.
 ---
  libavcodec/h264.c | 10 --
  1 file changed, 8 insertions(+), 2 deletions(-)

 diff --git a/libavcodec/h264.c b/libavcodec/h264.c
 index f45c572..658d865 100644
 --- a/libavcodec/h264.c
 +++ b/libavcodec/h264.c
 @@ -3867,8 +3867,14 @@ again:

  if (s-flags  CODEC_FLAG_LOW_DELAY ||
  (h-sps.bitstream_restriction_flag 
 - !h-sps.num_reorder_frames))
 -s-low_delay = 1;
 + !h-sps.num_reorder_frames)) {
 +if (s-avctx-has_b_frames  1 || h-delayed_pic[0])
 +av_log(avctx, AV_LOG_WARNING, Delayed frames seen 
 +   reenabling low delay requires a codec 
 +   flush.\n);
 +else
 +s-low_delay = 1;
 +}

That looks good, thanks.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: reset has_b_frames after enabling low_delay from SPS

2012-11-21 Thread Ronald S. Bultje
Hi,

On Wed, Nov 21, 2012 at 5:37 AM, Janne Grunau janne-li...@jannau.netwrote:

 On 2012-11-21 13:14:34 +, Loren Merritt wrote:
  On Wed, 21 Nov 2012, Janne Grunau wrote:
   On 2012-11-16 18:14:29 -0800, Ronald S. Bultje wrote:
   
So I'm going to have to wonder what happens to the delayed frames
 already
cached in h-delayed_pics[]? Are they still output?
  
   yes, delayed pictures are returned as long as h-delayed_pics[0] is not
   NULL.
 
  And when you return something from delayed_pics[] instead of the new
  frame that just got decoded, the new frame gets delayed, so that you
  don't ever actually switch to low delay mode?

 Since we can't return multiple frames at once we can't switch to low
 delay mode once we have delayed pictures. If low delay after switching
 from a stream with delayed pictures to a low_delay stream is important
 and dropping frames is an option requiring a decoder flush from sounds
 reasonable to me.


That's one option. We could also simply say that low_delay can't be
re-enabled once disabled, so even though the SPS says we're low-delay, just
ignore it. I don't really have a preference either way, whichever is easier
implementation-wise or whichever you feel is a better representation of
what the end user of a non-fuzzed bitstream likely wanted to happen. (In
practice, this probably never happens anyway?)

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: reset has_b_frames after enabling low_delay from SPS

2012-11-21 Thread Ronald S. Bultje
Hi,

On Wed, Nov 21, 2012 at 12:29 PM, Janne Grunau janne-li...@jannau.netwrote:

 On 2012-11-21 09:13:27 -0800, Ronald S. Bultje wrote:
  On Wed, Nov 21, 2012 at 5:37 AM, Janne Grunau janne-li...@jannau.net
 wrote:
   On 2012-11-21 13:14:34 +, Loren Merritt wrote:
On Wed, 21 Nov 2012, Janne Grunau wrote:
 On 2012-11-16 18:14:29 -0800, Ronald S. Bultje wrote:
 
  So I'm going to have to wonder what happens to the delayed frames
  already
  cached in h-delayed_pics[]? Are they still output?

 yes, delayed pictures are returned as long as h-delayed_pics[0]
 is not
 NULL.
   
And when you return something from delayed_pics[] instead of the new
frame that just got decoded, the new frame gets delayed, so that you
don't ever actually switch to low delay mode?
  
   Since we can't return multiple frames at once we can't switch to low
   delay mode once we have delayed pictures. If low delay after switching
   from a stream with delayed pictures to a low_delay stream is important
   and dropping frames is an option requiring a decoder flush from sounds
   reasonable to me.
  
 
  That's one option. We could also simply say that low_delay can't be
  re-enabled once disabled, so even though the SPS says we're low-delay,
 just
  ignore it. I don't really have a preference either way, whichever is
 easier
  implementation-wise or whichever you feel is a better representation of
  what the end user of a non-fuzzed bitstream likely wanted to happen. (In
  practice, this probably never happens anyway?)

 The current patch seems to work fine and fate-h264 still passes so this
 approach is easier for me. I also think that not dropping frames is
 less surprising option. Expecting tests of sender and receiver in
 situations where low delay is important is not unreasonable. I can add a
 warning when low_delay gets enabled with delayed frames.


Well, what I meant is that we probably shouldn't reset
s-avctx-has_b_frames and s-low_delay, since that may give the
(incorrect) assumption to the application that there is no delay... A
warning would be good, yes.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/2] h264: reset has_b_frames after enabling low_delay from SPS

2012-11-16 Thread Ronald S. Bultje
Hi,

On Fri, Nov 16, 2012 at 8:43 AM, Janne Grunau janne-li...@jannau.netwrote:

 Fixes a crash in fuzzed file nasa-8s2.ts_s20033 caused by a too large
 has_b_frames value. low_delay keeps getting re-enabled from the the
 presumely broken SPS.
 ---
  libavcodec/h264.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

 diff --git a/libavcodec/h264.c b/libavcodec/h264.c
 index c30c478..7935fe6 100644
 --- a/libavcodec/h264.c
 +++ b/libavcodec/h264.c
 @@ -2346,8 +2346,10 @@ static int h264_set_parameter_from_sps(H264Context
 *h)

  if (s-flags  CODEC_FLAG_LOW_DELAY ||
  (h-sps.bitstream_restriction_flag 
 - !h-sps.num_reorder_frames))
 + !h-sps.num_reorder_frames)) {
  s-low_delay = 1;
 +s-avctx-has_b_frames = 0;
 +}


So I'm going to have to wonder what happens to the delayed frames already
cached in h-delayed_pics[]? Are they still output? What happens if we
concatenate a delayed h264 file by a file that has this kind of sps header?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86: lavr: fix stack allocation for 7 and 8 channel downmixing on x86-32

2012-11-12 Thread Ronald S. Bultje
Hi,

On Mon, Nov 12, 2012 at 12:41 PM, Luca Barbato lu_z...@gentoo.org wrote:

 On 11/12/2012 07:35 PM, Justin Ruggles wrote:
  Fixes crashes on Win32 and stack overruns on x86-32 in general.

 Who can test it?


http://bugzilla.libav.org/show_bug.cgi?id=338

Reproducible on x86-32 Linux/Mac with asan.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86: lavr: fix stack allocation for 7 and 8 channel downmixing on x86-32

2012-11-08 Thread Ronald S. Bultje
Hi,

On Thu, Nov 8, 2012 at 2:17 PM, Justin Ruggles justin.rugg...@gmail.comwrote:

 On 10/29/2012 04:39 PM, Justin Ruggles wrote:
  From: Ronald S. Bultje rsbul...@gmail.com
 
  Fixes crashes on Win32 and stack overruns on x86-32 in general.
  ---
   libavresample/x86/audio_mix.asm |7 +++
   1 files changed, 7 insertions(+), 0 deletions(-)
 
  diff --git a/libavresample/x86/audio_mix.asm
 b/libavresample/x86/audio_mix.asm
  index 0c4a9bd..cdc787d 100644
  --- a/libavresample/x86/audio_mix.asm
  +++ b/libavresample/x86/audio_mix.asm
  @@ -280,6 +280,13 @@ cglobal mix_%1_to_%2_%3_flt,
 3,in_channels+2,needed_mmregs+matrix_elements_mm, s
   sub   rsp, matrix_elements_stack * mmsize
   %else
   %assign pad matrix_elements_stack * mmsize + (mmsize - gprsize) -
 (stack_offset  (mmsize - gprsize))
  +; on x86-32 for 7 and 8 channels we need more stack space for src
 pointers
  +%if ARCH_X86_32  in_channels = 7
  +%assign pad pad + 0x10
  +%define src5m [rsp+0xc0]
  +%define src6m [rsp+0xc4]
  +%define src7m [rsp+0xc8]
  +%endif
   SUB   rsp, pad
   %endif
   %endif

 ping


 Well it's my patch and you reviewed it, isn't that enough?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86: h264_qpel: sign-extend stride argument in put_pixels16_sse2()

2012-11-05 Thread Ronald S. Bultje
Hi,

On Mon, Nov 5, 2012 at 1:22 PM, Diego Biurrun di...@biurrun.de wrote:
 ---

 This fixes the crash in fate-vp5 with the QPEL code converted to YASM.
 I'm sending this separately to not spam everybody with 100+kB patches.
 The patch is meant to be squashed into the QPEL yasmification.
  ; void put_pixels16_sse2(uint8_t *block, const uint8_t *pixels, int 
 line_size, int h)
  cglobal put_pixels16, 4,5,4
 +movsxdifnidnr2, r2d
  lea r4, [r2*3]
  .loop:

No, I said each occurence of cglobal needed this, not just one of
them. Also, please fix indenting of this instruction to be aligned
with the line below.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86: h264: Convert 8-bit QPEL inline assembly to YASM

2012-11-05 Thread Ronald S. Bultje
Hi,

On Mon, Nov 5, 2012 at 10:48 AM, Diego Biurrun di...@biurrun.de wrote:
 +%macro QPEL8_H_LOWPASS_OP 1
 +cglobal %1_h264_qpel8_h_lowpass, 4,5 ; dst, src, dstStride, srcStride
 +mov   r4d, 8

Each and every single one of these needs a stride sign extension like
you did in dsputil.asm.

Do you have numbers to compare performance of qpel (both with
START/STOP_TIMER as well as with total wallclock time, e.g. using the
time command) before/after this patch?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/2] Use ptrdiff_t instead of int for intra pred stride function parameter.

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 7:53 AM, Diego Biurrun di...@biurrun.de wrote:
 On Sun, Oct 28, 2012 at 08:44:54PM -0700, Ronald S. Bultje wrote:
 From: Ronald S. Bultje rsbul...@gmail.com

 ---
  libavcodec/arm/h264pred_init_arm.c   |  36 +++---
  libavcodec/h264pred.c|  79 
  libavcodec/h264pred.h|  18 +--
  libavcodec/h264pred_template.c   | 233 
 +--
  libavcodec/x86/h264_intrapred_init.c |  15 ++-
  5 files changed, 264 insertions(+), 117 deletions(-)

 Why?

So you don't have to sign-extend integers in simd functions (or suffer
undefined behaviour if you forget). I've done this before for several
other simd functions, use git log.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] Remove usage of INIT_AVX in h264_intrapred_10bit.asm.

2012-10-29 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

Replace INIT_AVX by INIT_XMM avx. Port the whole file to use cpuflag
based function declarations. Remove (now unused) cputype argument in
function declaration macros. Change function prototypes to have mmx2
instead of mmxext as suffix, since that's required by cpuflags.
---
 libavcodec/x86/h264_intrapred_10bit.asm | 306 
 libavcodec/x86/h264_intrapred_init.c|  40 ++---
 2 files changed, 177 insertions(+), 169 deletions(-)

diff --git a/libavcodec/x86/h264_intrapred_10bit.asm 
b/libavcodec/x86/h264_intrapred_10bit.asm
index 529134e..c3f6dc4 100644
--- a/libavcodec/x86/h264_intrapred_10bit.asm
+++ b/libavcodec/x86/h264_intrapred_10bit.asm
@@ -53,8 +53,8 @@ SECTION .text
 ;-
 ; void pred4x4_down_right(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_DR 1
-cglobal pred4x4_down_right_10_%1, 3,3
+%macro PRED4x4_DR 0
+cglobal pred4x4_down_right_10, 3, 3
 sub   r0, r2
 lea   r1, [r0+r2*2]
 movhpsm1, [r1-8]
@@ -79,21 +79,22 @@ cglobal pred4x4_down_right_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
+INIT_XMM sse2
 %define PALIGNR PALIGNR_MMX
-PRED4x4_DR sse2
+PRED4x4_DR
+INIT_XMM ssse3
 %define PALIGNR PALIGNR_SSSE3
-PRED4x4_DR ssse3
+PRED4x4_DR
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_DR avx
+INIT_XMM avx
+PRED4x4_DR
 %endif
 
 ;-
 ; void pred4x4_vertical_right(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_VR 1
-cglobal pred4x4_vertical_right_10_%1, 3,3,6
+%macro PRED4x4_VR 0
+cglobal pred4x4_vertical_right_10, 3, 3, 6
 sub r0, r2
 lea r1, [r0+r2*2]
 movqm5, [r0]; t3t2t1t0
@@ -119,21 +120,22 @@ cglobal pred4x4_vertical_right_10_%1, 3,3,6
 RET
 %endmacro
 
-INIT_XMM
+INIT_XMM sse2
 %define PALIGNR PALIGNR_MMX
-PRED4x4_VR sse2
+PRED4x4_VR
+INIT_XMM ssse3
 %define PALIGNR PALIGNR_SSSE3
-PRED4x4_VR ssse3
+PRED4x4_VR
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_VR avx
+INIT_XMM avx
+PRED4x4_VR
 %endif
 
 ;-
 ; void pred4x4_horizontal_down(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_HD 1
-cglobal pred4x4_horizontal_down_10_%1, 3,3
+%macro PRED4x4_HD 0
+cglobal pred4x4_horizontal_down_10, 3, 3
 subr0, r2
 lear1, [r0+r2*2]
 movq   m0, [r0-8]  ; lt ..
@@ -162,14 +164,15 @@ cglobal pred4x4_horizontal_down_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
+INIT_XMM sse2
 %define PALIGNR PALIGNR_MMX
-PRED4x4_HD sse2
+PRED4x4_HD
+INIT_XMM ssse3
 %define PALIGNR PALIGNR_SSSE3
-PRED4x4_HD ssse3
+PRED4x4_HD
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_HD avx
+INIT_XMM avx
+PRED4x4_HD
 %endif
 
 ;-
@@ -192,8 +195,8 @@ PRED4x4_HD avx
 HADDD   %1, %2
 %endmacro
 
-INIT_MMX
-cglobal pred4x4_dc_10_mmxext, 3,3
+INIT_MMX mmx2
+cglobal pred4x4_dc_10, 3, 3
 subr0, r2
 lear1, [r0+r2*2]
 movq   m2, [r0+r2*1-8]
@@ -216,8 +219,8 @@ cglobal pred4x4_dc_10_mmxext, 3,3
 ;-
 ; void pred4x4_down_left(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_DL 1
-cglobal pred4x4_down_left_10_%1, 3,3
+%macro PRED4x4_DL 0
+cglobal pred4x4_down_left_10, 3, 3
 subr0, r2
 movq   m0, [r0]
 movhps m0, [r1]
@@ -236,18 +239,18 @@ cglobal pred4x4_down_left_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
-PRED4x4_DL sse2
+INIT_XMM sse2
+PRED4x4_DL
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_DL avx
+INIT_XMM avx
+PRED4x4_DL
 %endif
 
 ;-
 ; void pred4x4_vertical_left(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_VL 1
-cglobal pred4x4_vertical_left_10_%1, 3,3
+%macro PRED4x4_VL 0
+cglobal pred4x4_vertical_left_10, 3, 3
 subr0, r2
 movu   m1, [r0]
 movhps m1, [r1]
@@ -265,18 +268,18 @@ cglobal pred4x4_vertical_left_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
-PRED4x4_VL sse2
+INIT_XMM sse2
+PRED4x4_VL
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_VL avx
+INIT_XMM avx
+PRED4x4_VL
 %endif
 
 ;-
 ; void pred4x4_horizontal_up(pixel *src, const pixel *topright, int stride

Re: [libav-devel] [PATCH 1/2] Use PRED4x4/8x8/8x8L/16x16 macros to declare x86 intrapred prototypes.

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 10:25 AM, Luca Barbato lu_z...@gentoo.org wrote:
 On 10/29/2012 04:44 AM, Ronald S. Bultje wrote:
 From: Ronald S. Bultje rsbul...@gmail.com


 If you don't mind

 x86: use PRED4x4/8x8/8x8L/16x16 macros to declare intrapred prototypes

 might fit better.

Will change locally. Any other comments on the code or can I commit
with that modified?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86inc: Add cpuflags_mmxext alias for cpuflags_mmx2

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 5:15 PM, Diego Biurrun di...@biurrun.de wrote:
 This allows using mmxext as name in Libav while staying compatible
 with changes to the YASM macro infrastructure imported from x264.
 ---
  libavutil/x86/x86inc.asm |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

 diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
 index 1fe9f55..2655154 100644
 --- a/libavutil/x86/x86inc.asm
 +++ b/libavutil/x86/x86inc.asm
 @@ -35,6 +35,7 @@
  ; to x264-de...@videolan.org .

  %define program_name ff
 +%define cpuflags_mmxext cpuflags_mmx2

Any such a change to x86inc.asm is OK if and only if x264 accepts this
change upstream. Loren?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86inc: Add cpuflags_mmxext alias for cpuflags_mmx2

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 5:56 PM, Diego Biurrun di...@biurrun.de wrote:
 On Mon, Oct 29, 2012 at 05:25:08PM -0700, Ronald S. Bultje wrote:
 On Mon, Oct 29, 2012 at 5:15 PM, Diego Biurrun di...@biurrun.de wrote:
  This allows using mmxext as name in Libav while staying compatible
  with changes to the YASM macro infrastructure imported from x264.
  ---
   libavutil/x86/x86inc.asm |1 +
   1 files changed, 1 insertions(+), 0 deletions(-)
 
  --- a/libavutil/x86/x86inc.asm
  +++ b/libavutil/x86/x86inc.asm
  @@ -35,6 +35,7 @@
   ; to x264-de...@videolan.org .
 
   %define program_name ff
  +%define cpuflags_mmxext cpuflags_mmx2

 Any such a change to x86inc.asm is OK if and only if x264 accepts this
 change upstream. Loren?

 The x86inc.asm files from x264 and Libav are already different, x264 uses
 x264 instead of ff as program_name.  That's the reason why I chose to
 place the alias in this very spot: during upstream merges there will be
 no conflicts that are not

 a) trivial to fix and
 b) would have occurred anyway due to the existing difference.

 This patch is also what a room full of Libav developers accepted as a
 compromise at VDD.  Please don't oppose the consensus of your peers.

You keep trying to push this through. No.

I do the syncing, and nobody wants to take that job from me, I assure
you. Every time the semantics of cglobal change by however little,
someone who knows that stuff needs to go in and fix/port it. I prefer
to not have to take any additional tasks on me, such as maintaining
forky differences.

Thus, please, no.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/2] swscale: support gray to 9bit and 10bit formats

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 7:14 PM, Luca Barbato lu_z...@gentoo.org wrote:
 With the input of Kostya and Ronald.
 ---
  libswscale/swscale.c  | 38 --
  libswscale/swscale_unscaled.c | 32 ++--
  2 files changed, 66 insertions(+), 4 deletions(-)

So it works now, under valgrind? LGTM then.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/2] pixfmt: support more yuva formats

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 7:14 PM, Luca Barbato lu_z...@gentoo.org wrote:
 Signed-off-by: Luca Barbato lu_z...@gentoo.org
 ---
  libavcodec/raw.c  |  19 
  libavformat/nut.c |  28 +
  libavutil/pixdesc.c   | 234 
 ++
  libavutil/pixfmt.h|  28 +
  libswscale/utils.c|  18 
  tests/ref/lavfi/pixdesc   |  18 
  tests/ref/lavfi/pixfmts_copy  |  18 
  tests/ref/lavfi/pixfmts_null  |  18 
  tests/ref/lavfi/pixfmts_scale |  18 
  tests/ref/lavfi/pixfmts_vflip |  18 
  10 files changed, 417 insertions(+)

OK, assuming it works without new warnings under valgrind.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/4] x86: include x86inc.asm in x86util.asm

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 7:28 PM, Diego Biurrun di...@biurrun.de wrote:
 -%include x86inc.asm
  %include x86util.asm

Didn't you come up with the rule that each file should contain all
headers it directly takes symbols from? How do these changes fit with
that scheme?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 4/4] x86util: Add cpuflags_mmxext alias for cpuflags_mmx2

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 7:28 PM, Diego Biurrun di...@biurrun.de wrote:
 mmxext is a more sensible name and more common in outside projects.
 ---
  libavutil/x86/x86util.asm |1 +
  1 files changed, 1 insertions(+), 0 deletions(-)

No.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH 1/2] Remove usage of INIT_AVX in h264_intrapred_10bit.asm.

2012-10-28 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

---
 libavcodec/x86/h264_intrapred_10bit.asm | 306 
 libavcodec/x86/h264_intrapred_init.c|  40 ++---
 2 files changed, 177 insertions(+), 169 deletions(-)

diff --git a/libavcodec/x86/h264_intrapred_10bit.asm 
b/libavcodec/x86/h264_intrapred_10bit.asm
index 529134e..c3f6dc4 100644
--- a/libavcodec/x86/h264_intrapred_10bit.asm
+++ b/libavcodec/x86/h264_intrapred_10bit.asm
@@ -53,8 +53,8 @@ SECTION .text
 ;-
 ; void pred4x4_down_right(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_DR 1
-cglobal pred4x4_down_right_10_%1, 3,3
+%macro PRED4x4_DR 0
+cglobal pred4x4_down_right_10, 3, 3
 sub   r0, r2
 lea   r1, [r0+r2*2]
 movhpsm1, [r1-8]
@@ -79,21 +79,22 @@ cglobal pred4x4_down_right_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
+INIT_XMM sse2
 %define PALIGNR PALIGNR_MMX
-PRED4x4_DR sse2
+PRED4x4_DR
+INIT_XMM ssse3
 %define PALIGNR PALIGNR_SSSE3
-PRED4x4_DR ssse3
+PRED4x4_DR
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_DR avx
+INIT_XMM avx
+PRED4x4_DR
 %endif
 
 ;-
 ; void pred4x4_vertical_right(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_VR 1
-cglobal pred4x4_vertical_right_10_%1, 3,3,6
+%macro PRED4x4_VR 0
+cglobal pred4x4_vertical_right_10, 3, 3, 6
 sub r0, r2
 lea r1, [r0+r2*2]
 movqm5, [r0]; t3t2t1t0
@@ -119,21 +120,22 @@ cglobal pred4x4_vertical_right_10_%1, 3,3,6
 RET
 %endmacro
 
-INIT_XMM
+INIT_XMM sse2
 %define PALIGNR PALIGNR_MMX
-PRED4x4_VR sse2
+PRED4x4_VR
+INIT_XMM ssse3
 %define PALIGNR PALIGNR_SSSE3
-PRED4x4_VR ssse3
+PRED4x4_VR
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_VR avx
+INIT_XMM avx
+PRED4x4_VR
 %endif
 
 ;-
 ; void pred4x4_horizontal_down(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_HD 1
-cglobal pred4x4_horizontal_down_10_%1, 3,3
+%macro PRED4x4_HD 0
+cglobal pred4x4_horizontal_down_10, 3, 3
 subr0, r2
 lear1, [r0+r2*2]
 movq   m0, [r0-8]  ; lt ..
@@ -162,14 +164,15 @@ cglobal pred4x4_horizontal_down_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
+INIT_XMM sse2
 %define PALIGNR PALIGNR_MMX
-PRED4x4_HD sse2
+PRED4x4_HD
+INIT_XMM ssse3
 %define PALIGNR PALIGNR_SSSE3
-PRED4x4_HD ssse3
+PRED4x4_HD
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_HD avx
+INIT_XMM avx
+PRED4x4_HD
 %endif
 
 ;-
@@ -192,8 +195,8 @@ PRED4x4_HD avx
 HADDD   %1, %2
 %endmacro
 
-INIT_MMX
-cglobal pred4x4_dc_10_mmxext, 3,3
+INIT_MMX mmx2
+cglobal pred4x4_dc_10, 3, 3
 subr0, r2
 lear1, [r0+r2*2]
 movq   m2, [r0+r2*1-8]
@@ -216,8 +219,8 @@ cglobal pred4x4_dc_10_mmxext, 3,3
 ;-
 ; void pred4x4_down_left(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_DL 1
-cglobal pred4x4_down_left_10_%1, 3,3
+%macro PRED4x4_DL 0
+cglobal pred4x4_down_left_10, 3, 3
 subr0, r2
 movq   m0, [r0]
 movhps m0, [r1]
@@ -236,18 +239,18 @@ cglobal pred4x4_down_left_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
-PRED4x4_DL sse2
+INIT_XMM sse2
+PRED4x4_DL
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_DL avx
+INIT_XMM avx
+PRED4x4_DL
 %endif
 
 ;-
 ; void pred4x4_vertical_left(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_VL 1
-cglobal pred4x4_vertical_left_10_%1, 3,3
+%macro PRED4x4_VL 0
+cglobal pred4x4_vertical_left_10, 3, 3
 subr0, r2
 movu   m1, [r0]
 movhps m1, [r1]
@@ -265,18 +268,18 @@ cglobal pred4x4_vertical_left_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
-PRED4x4_VL sse2
+INIT_XMM sse2
+PRED4x4_VL
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_VL avx
+INIT_XMM avx
+PRED4x4_VL
 %endif
 
 ;-
 ; void pred4x4_horizontal_up(pixel *src, const pixel *topright, int stride)
 ;-
-INIT_MMX
-cglobal pred4x4_horizontal_up_10_mmxext, 3,3
+INIT_MMX mmx2
+cglobal pred4x4_horizontal_up_10, 3, 3
 sub   r0, r2
 lea   r1, [r0+r2*2]
 movq  m0, [r0+r2*1-8]
@@ -309,8 +312,8 @@ cglobal pred4x4_horizontal_up_10_mmxext, 3,3

[libav-devel] [PATCH 2/2] Remove INIT_AVX from x86inc.asm.

2012-10-28 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

---
 libavutil/x86/x86inc.asm | 8 
 1 file changed, 8 deletions(-)

diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index d734c6e..1fe9f55 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc.asm
@@ -678,14 +678,6 @@ SECTION .note.GNU-stack noalloc noexec nowrite progbits
 INIT_CPUFLAGS %1
 %endmacro
 
-; FIXME: INIT_AVX can be replaced by INIT_XMM avx
-%macro INIT_AVX 0
-INIT_XMM
-%assign avx_enabled 1
-%define PALIGNR PALIGNR_SSSE3
-%define RESET_MM_PERMUTATION INIT_AVX
-%endmacro
-
 %macro INIT_YMM 0-1+
 %assign avx_enabled 1
 %define RESET_MM_PERMUTATION INIT_YMM %1
-- 
1.7.11.3

___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH 1/2] Use PRED4x4/8x8/8x8L/16x16 macros to declare x86 intrapred prototypes.

2012-10-28 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

---
 libavcodec/x86/h264_intrapred.asm|  80 +-
 libavcodec/x86/h264_intrapred_init.c | 296 ++-
 2 files changed, 190 insertions(+), 186 deletions(-)

diff --git a/libavcodec/x86/h264_intrapred.asm 
b/libavcodec/x86/h264_intrapred.asm
index dc418f7..7c6aa12 100644
--- a/libavcodec/x86/h264_intrapred.asm
+++ b/libavcodec/x86/h264_intrapred.asm
@@ -53,7 +53,7 @@ cextern pw_32
 ; void pred16x16_vertical(uint8_t *src, int stride)
 ;-
 
-cglobal pred16x16_vertical_mmx, 2,3
+cglobal pred16x16_vertical_8_mmx, 2,3
 sub   r0, r1
 mov   r2, 8
 movq mm0, [r0+0]
@@ -68,7 +68,7 @@ cglobal pred16x16_vertical_mmx, 2,3
 jg .loop
 REP_RET
 
-cglobal pred16x16_vertical_sse, 2,3
+cglobal pred16x16_vertical_8_sse, 2,3
 sub   r0, r1
 mov   r2, 4
 movaps xmm0, [r0]
@@ -88,7 +88,7 @@ cglobal pred16x16_vertical_sse, 2,3
 ;-
 
 %macro PRED16x16_H 0
-cglobal pred16x16_horizontal, 2,3
+cglobal pred16x16_horizontal_8, 2,3
 mov   r2, 8
 %if cpuflag(ssse3)
 mova  m2, [pb_3]
@@ -130,7 +130,7 @@ INIT_XMM
 ;-
 
 %macro PRED16x16_DC 0
-cglobal pred16x16_dc, 2,7
+cglobal pred16x16_dc_8, 2,7
 mov   r4, r0
 sub   r0, r1
 pxor  mm0, mm0
@@ -193,7 +193,7 @@ INIT_XMM
 ;-
 
 %macro PRED16x16_TM_MMX 0
-cglobal pred16x16_tm_vp8, 2,5
+cglobal pred16x16_tm_vp8_8, 2,5
 subr0, r1
 pxor  mm7, mm7
 movq  mm0, [r0+0]
@@ -234,7 +234,7 @@ INIT_MMX mmx2
 PRED16x16_TM_MMX
 INIT_MMX
 
-cglobal pred16x16_tm_vp8_sse2, 2,6,6
+cglobal pred16x16_tm_vp8_8_sse2, 2,6,6
 sub  r0, r1
 pxor   xmm2, xmm2
 movdqa xmm0, [r0]
@@ -274,7 +274,7 @@ cglobal pred16x16_tm_vp8_sse2, 2,6,6
 ;-
 
 %macro H264_PRED16x16_PLANE 1
-cglobal pred16x16_plane_%1, 2,9,7
+cglobal pred16x16_plane_%1_8, 2,9,7
 mov  r2, r1   ; +stride
 neg  r1   ; -stride
 
@@ -556,7 +556,7 @@ INIT_XMM
 ;-
 
 %macro H264_PRED8x8_PLANE 0
-cglobal pred8x8_plane, 2,9,7
+cglobal pred8x8_plane_8, 2,9,7
 mov  r2, r1   ; +stride
 neg  r1   ; -stride
 
@@ -730,7 +730,7 @@ INIT_XMM
 ; void pred8x8_vertical(uint8_t *src, int stride)
 ;-
 
-cglobal pred8x8_vertical_mmx, 2,2
+cglobal pred8x8_vertical_8_mmx, 2,2
 subr0, r1
 movq  mm0, [r0]
 %rep 3
@@ -747,7 +747,7 @@ cglobal pred8x8_vertical_mmx, 2,2
 ;-
 
 %macro PRED8x8_H 0
-cglobal pred8x8_horizontal, 2,3
+cglobal pred8x8_horizontal_8, 2,3
 mov   r2, 4
 %if cpuflag(ssse3)
 mova  m2, [pb_3]
@@ -774,7 +774,7 @@ INIT_MMX
 ;-
 ; void pred8x8_top_dc_mmxext(uint8_t *src, int stride)
 ;-
-cglobal pred8x8_top_dc_mmxext, 2,5
+cglobal pred8x8_top_dc_8_mmxext, 2,5
 sub r0, r1
 movq   mm0, [r0]
 pxor   mm1, mm1
@@ -809,7 +809,7 @@ cglobal pred8x8_top_dc_mmxext, 2,5
 ;-
 
 INIT_MMX
-cglobal pred8x8_dc_mmxext, 2,5
+cglobal pred8x8_dc_8_mmxext, 2,5
 sub   r0, r1
 pxor  m7, m7
 movd  m0, [r0+0]
@@ -869,7 +869,7 @@ cglobal pred8x8_dc_mmxext, 2,5
 ; void pred8x8_dc_rv40(uint8_t *src, int stride)
 ;-
 
-cglobal pred8x8_dc_rv40_mmxext, 2,7
+cglobal pred8x8_dc_rv40_8_mmxext, 2,7
 mov   r4, r0
 sub   r0, r1
 pxor  mm0, mm0
@@ -906,7 +906,7 @@ cglobal pred8x8_dc_rv40_mmxext, 2,7
 ;-
 
 %macro PRED8x8_TM_MMX 0
-cglobal pred8x8_tm_vp8, 2,6
+cglobal pred8x8_tm_vp8_8, 2,6
 subr0, r1
 pxor  mm7, mm7
 movq  mm0, [r0]
@@ -946,7 +946,7 @@ INIT_MMX mmx2
 PRED8x8_TM_MMX
 INIT_MMX
 
-cglobal pred8x8_tm_vp8_sse2, 2,6,4
+cglobal pred8x8_tm_vp8_8_sse2, 2,6,4
 sub  r0, r1
 pxor   xmm1, xmm1
 movq   xmm0, [r0]
@@ -974,7 +974,7 @@ cglobal pred8x8_tm_vp8_sse2, 2,6,4
 jg .loop
 REP_RET
 
-cglobal pred8x8_tm_vp8_ssse3, 2,3,6
+cglobal pred8x8_tm_vp8_8_ssse3, 2,3,6
 sub  r0, r1
 movdqa xmm4, [tm_shuf]
 pxor   xmm1, xmm1
@@ -1016,7 +1016,7 @@ cglobal

[libav-devel] [PATCH 2/2] Use ptrdiff_t instead of int for intra pred stride function parameter.

2012-10-28 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

---
 libavcodec/arm/h264pred_init_arm.c   |  36 +++---
 libavcodec/h264pred.c|  79 
 libavcodec/h264pred.h|  18 +--
 libavcodec/h264pred_template.c   | 233 +--
 libavcodec/x86/h264_intrapred_init.c |  15 ++-
 5 files changed, 264 insertions(+), 117 deletions(-)

diff --git a/libavcodec/arm/h264pred_init_arm.c 
b/libavcodec/arm/h264pred_init_arm.c
index 371b1a6..39c0121 100644
--- a/libavcodec/arm/h264pred_init_arm.c
+++ b/libavcodec/arm/h264pred_init_arm.c
@@ -23,25 +23,25 @@
 #include libavutil/arm/cpu.h
 #include libavcodec/h264pred.h
 
-void ff_pred16x16_vert_neon(uint8_t *src, int stride);
-void ff_pred16x16_hor_neon(uint8_t *src, int stride);
-void ff_pred16x16_plane_neon(uint8_t *src, int stride);
-void ff_pred16x16_dc_neon(uint8_t *src, int stride);
-void ff_pred16x16_128_dc_neon(uint8_t *src, int stride);
-void ff_pred16x16_left_dc_neon(uint8_t *src, int stride);
-void ff_pred16x16_top_dc_neon(uint8_t *src, int stride);
+void ff_pred16x16_vert_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred16x16_hor_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred16x16_plane_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred16x16_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred16x16_128_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred16x16_left_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred16x16_top_dc_neon(uint8_t *src, ptrdiff_t stride);
 
-void ff_pred8x8_vert_neon(uint8_t *src, int stride);
-void ff_pred8x8_hor_neon(uint8_t *src, int stride);
-void ff_pred8x8_plane_neon(uint8_t *src, int stride);
-void ff_pred8x8_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_128_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_left_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_top_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_l0t_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_0lt_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_l00_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_0l0_dc_neon(uint8_t *src, int stride);
+void ff_pred8x8_vert_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_hor_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_plane_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_128_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_left_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_top_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_l0t_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_0lt_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_l00_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_0l0_dc_neon(uint8_t *src, ptrdiff_t stride);
 
 static void ff_h264_pred_init_neon(H264PredContext *h, int codec_id, const int 
bit_depth, const int chroma_format_idc)
 {
diff --git a/libavcodec/h264pred.c b/libavcodec/h264pred.c
index fb44046..94cf9d0 100644
--- a/libavcodec/h264pred.c
+++ b/libavcodec/h264pred.c
@@ -39,7 +39,9 @@
 #include h264pred_template.c
 #undef BIT_DEPTH
 
-static void pred4x4_vertical_vp8_c(uint8_t *src, const uint8_t *topright, int 
stride){
+static void pred4x4_vertical_vp8_c(uint8_t *src, const uint8_t *topright,
+   ptrdiff_t stride)
+{
 const unsigned lt = src[-1-1*stride];
 LOAD_TOP_EDGE
 LOAD_TOP_RIGHT_EDGE
@@ -54,7 +56,9 @@ static void pred4x4_vertical_vp8_c(uint8_t *src, const 
uint8_t *topright, int st
 AV_WN32A(src+3*stride, v);
 }
 
-static void pred4x4_horizontal_vp8_c(uint8_t *src, const uint8_t *topright, 
int stride){
+static void pred4x4_horizontal_vp8_c(uint8_t *src, const uint8_t *topright,
+ ptrdiff_t stride)
+{
 const unsigned lt = src[-1-1*stride];
 LOAD_LEFT_EDGE
 
@@ -64,7 +68,9 @@ static void pred4x4_horizontal_vp8_c(uint8_t *src, const 
uint8_t *topright, int
 AV_WN32A(src+3*stride, ((l2 + 2*l3 + l3 + 2)  2)*0x01010101);
 }
 
-static void pred4x4_down_left_svq3_c(uint8_t *src, const uint8_t *topright, 
int stride){
+static void pred4x4_down_left_svq3_c(uint8_t *src, const uint8_t *topright,
+ ptrdiff_t stride)
+{
 LOAD_TOP_EDGE
 LOAD_LEFT_EDGE
 
@@ -86,7 +92,9 @@ static void pred4x4_down_left_svq3_c(uint8_t *src, const 
uint8_t *topright, int
 src[3+3*stride]=(l3 + t3)1;
 }
 
-static void pred4x4_down_left_rv40_c(uint8_t *src, const uint8_t *topright, 
int stride){
+static void pred4x4_down_left_rv40_c(uint8_t *src, const uint8_t *topright,
+ ptrdiff_t stride)
+{
 LOAD_TOP_EDGE
 LOAD_TOP_RIGHT_EDGE
 LOAD_LEFT_EDGE
@@ -110,7 +118,10 @@ static void pred4x4_down_left_rv40_c(uint8_t *src, const 
uint8_t *topright, int
 src[3+3*stride]=(t6 + t7 + 1 + l6 + l7 + 1)2;
 }
 
-static void pred4x4_down_left_rv40_nodown_c(uint8_t *src, const uint8_t 
*topright, int stride){
+static void

Re: [libav-devel] [PATCH 2/2] swscale: support gray to 9bit and 10bit formats

2012-10-26 Thread Ronald S. Bultje
Hi,

On Fri, Oct 26, 2012 at 8:35 AM, Luca Barbato lu_z...@gentoo.org wrote:
 Signed-off-by: Luca Barbato lu_z...@gentoo.org
 ---

 Partially fixes the issue, there is still something wrong somewhere, possibly
 a buffer overrun or yet some memory not properly set.

How do we reproduce? What kind of msgs do you get in valgrind? Do
32/64bit outputs match?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 11/11] vp8: fix memset() crossing array boundary

2012-10-26 Thread Ronald S. Bultje
Hi,

On Fri, Oct 26, 2012 at 5:50 PM, Mans Rullgard m...@mansr.com wrote:
 Indexing across array boundaries is not allowed by C99.

 Signed-off-by: Mans Rullgard m...@mansr.com
 ---
  libavcodec/vp8.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

 diff --git a/libavcodec/vp8.c b/libavcodec/vp8.c
 index 2766c9e..d9902c4 100644
 --- a/libavcodec/vp8.c
 +++ b/libavcodec/vp8.c
 @@ -1964,7 +1964,8 @@ static int vp8_decode_frame(AVCodecContext *avctx, void 
 *data, int *data_size,
  // top edge of 127 for intra prediction
  if (!(avctx-flags  CODEC_FLAG_EMU_EDGE)) {
  s-top_border[0][15] = s-top_border[0][23] = 127;
 -memset(s-top_border[1]-1, 127, 
 s-mb_width*sizeof(*s-top_border)+1);
 +s-top_border[0][31] = 127;
 +memset(s-top_border[1], 127, s-mb_width*sizeof(*s-top_border));
  }

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] swscale: support gray to 9bit and 10bit formats

2012-10-25 Thread Ronald S. Bultje
Hi,

On Sun, Oct 14, 2012 at 5:22 PM, Luca Barbato lu_z...@gentoo.org wrote:
 +if (isPlanar(dstFormat)  isALPHA(dstFormat)  !alpPixBuf) {
 +int length = dstW;
 +int height = dstY - lastDstY;
 +if (is9_OR_10BPS(dstFormat)) {
 +const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat);
 +fill_plane9or10(dst[3], dstStride[3], length, height, lastDstY,
 +255, desc-comp[3].depth_minus1 + 1,
 +isBE(dstFormat));
 +} else if (is16BPS(c-dstFormat))
 +   length *= 2;
 +
 +fillPlane(dst[3], dstStride[3], length, height, lastDstY, 255);

Doesn't this call fillPlane, even if we're 9 or 10 bits?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-10-24 Thread Ronald S. Bultje
Hi,

On Wed, Oct 24, 2012 at 10:07 AM, Måns Rullgård m...@mansr.com wrote:
 Ronald S. Bultje rsbul...@gmail.com writes:

 From: Ronald S. Bultje rsbul...@gmail.com

 Use this in VP8/H264-8bit loopfilter functions so they can be used if
 there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
 ---
  libavcodec/x86/h264_deblock.asm |  27 ++-
  libavcodec/x86/h264dsp_init.c   |   4 +-
  libavcodec/x86/vp8dsp.asm   |  68 
  libavcodec/x86/vp8dsp_init.c|   8 --
  libavutil/x86/x86inc.asm| 167 
 +---
  5 files changed, 181 insertions(+), 93 deletions(-)

 What happened to this?  Is there something wrong with the patch?

I am addressing reviews from the x264 people and am somewhat slow at
testing new revisions because of other work...

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] Problem with H.264 and NALU chunks

2012-10-19 Thread Ronald S. Bultje
Hi,

On Fri, Oct 19, 2012 at 6:55 PM, Felipe Contreras
felipe.contre...@gmail.com wrote:
 Hi,

 While testing an mpegts stream I found what appears to be a bug in the
 H.264 decoder that only happens on certain processors.

 The stream is here[1], but I've split the specific chunks I get from
 the demuxer. They are not complete access-units, but I've set the flag
 CODEC_FLAG2_CHUNKS. On one laptop it works perfectly fine, on another
 I get a bunch of corruption, errors, and valgrind complains that wrong
 memory accesses. The first one is a Core i5, the other a Core i7.

 This happens both on v0.8.3 and the master branch.

 If I use GStreamer's H.264 parser to aggregate the NAL units into
 access-units, it works fine.

 Any ideas?

Aren't you supposed to use the internal h264 parser to do the chunking
before feeding it into the decoder if you choose to not use the
built-in demuxers? (The built-in ones will automatically call the
parser for you if you call av_read_frame() instead of
av_read_packet().)

Or in other words, is this in any way reproducible with avconv/avplay?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [RFC] pixfmt: support more yuva formats

2012-10-14 Thread Ronald S. Bultje
Hi,

On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato lu_z...@gentoo.org wrote:
 ---

 Here an initial patch to support many yuva, apparently either I botched
 adding them (since I did lots of cut and paste) or they manage the expose some
 flaws in swscale. valgrind manages to spot something and might be nice if
 other people try this and see if the values for the fate-lavfi change after
 each round.

If this is what I think it is, search for YUVA420P in swscale and
replace it with a generic alpha check.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [RFC] pixfmt: support more yuva formats

2012-10-14 Thread Ronald S. Bultje
Hi,

On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
 Hi,

 On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato lu_z...@gentoo.org wrote:
 ---

 Here an initial patch to support many yuva, apparently either I botched
 adding them (since I did lots of cut and paste) or they manage the expose 
 some
 flaws in swscale. valgrind manages to spot something and might be nice if
 other people try this and see if the values for the fate-lavfi change after
 each round.

 If this is what I think it is, search for YUVA420P in swscale and
 replace it with a generic alpha check.

Oh I see now.

swscale.c:661
if (isPlanar(dstFormat)  isALPHA(dstFormat)  !alpPixBuf)
fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 255);
This only works for 8bit. For 9bit, you need a word fill with a value
of (1  numbits) - 1.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [RFC] pixfmt: support more yuva formats

2012-10-14 Thread Ronald S. Bultje
Hi,

On Sun, Oct 14, 2012 at 12:53 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
 Hi,

 On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
 Hi,

 On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato lu_z...@gentoo.org wrote:
 ---

 Here an initial patch to support many yuva, apparently either I botched
 adding them (since I did lots of cut and paste) or they manage the expose 
 some
 flaws in swscale. valgrind manages to spot something and might be nice if
 other people try this and see if the values for the fate-lavfi change after
 each round.

 If this is what I think it is, search for YUVA420P in swscale and
 replace it with a generic alpha check.

 Oh I see now.

 swscale.c:661
 if (isPlanar(dstFormat)  isALPHA(dstFormat)  !alpPixBuf)
 fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 255);
 This only works for 8bit. For 9bit, you need a word fill with a value
 of (1  numbits) - 1.

Which actually leads to a question: does any of this exist in the real
world? I've never, ever, ever seen an image with non-8bpc alpha
planes. I hate to invent stuff that has no utility but does carry
implementation burden.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [RFC] pixfmt: support more yuva formats

2012-10-14 Thread Ronald S. Bultje
Hi,

On Sun, Oct 14, 2012 at 1:39 PM, Luca Barbato lu_z...@gentoo.org wrote:
 On 10/14/2012 09:54 PM, Ronald S. Bultje wrote:
 Hi,

 On Sun, Oct 14, 2012 at 12:53 PM, Ronald S. Bultje rsbul...@gmail.com 
 wrote:
 Hi,

 On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje rsbul...@gmail.com 
 wrote:
 Hi,

 On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato lu_z...@gentoo.org wrote:
 ---

 Here an initial patch to support many yuva, apparently either I botched
 adding them (since I did lots of cut and paste) or they manage the expose 
 some
 flaws in swscale. valgrind manages to spot something and might be nice if
 other people try this and see if the values for the fate-lavfi change 
 after
 each round.

 If this is what I think it is, search for YUVA420P in swscale and
 replace it with a generic alpha check.

 Oh I see now.

 swscale.c:661
 if (isPlanar(dstFormat)  isALPHA(dstFormat)  !alpPixBuf)
 fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 
 255);
 This only works for 8bit. For 9bit, you need a word fill with a value
 of (1  numbits) - 1.

 Which actually leads to a question: does any of this exist in the real
 world? I've never, ever, ever seen an image with non-8bpc alpha
 planes. I hate to invent stuff that has no utility but does carry
 implementation burden.

 You asked me to add those ^^;

Uhm, ok, maybe I meant something else? I mean, there really isn't much
use for 9-15bpp alpha channels as far as I know. 8bpp is obvious and I
can sort of see 16bpp being useful (although I don't think I've ever
seen it used).

 BTW while I was investigating I noticed
 all the previous 9 and 10 formats look strange, if you have time we
 could have a look.

Can you elaborate?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [RFC] pixfmt: support more yuva formats

2012-10-14 Thread Ronald S. Bultje
Hi,

On Sun, Oct 14, 2012 at 2:12 PM, Luca Barbato lu_z...@gentoo.org wrote:
 On 10/14/2012 10:43 PM, Ronald S. Bultje wrote:
 Hi,

 On Sun, Oct 14, 2012 at 1:39 PM, Luca Barbato lu_z...@gentoo.org wrote:
 On 10/14/2012 09:54 PM, Ronald S. Bultje wrote:
 Hi,

 On Sun, Oct 14, 2012 at 12:53 PM, Ronald S. Bultje rsbul...@gmail.com 
 wrote:
 Hi,

 On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje rsbul...@gmail.com 
 wrote:
 Hi,

 On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato lu_z...@gentoo.org 
 wrote:
 ---

 Here an initial patch to support many yuva, apparently either I botched
 adding them (since I did lots of cut and paste) or they manage the 
 expose some
 flaws in swscale. valgrind manages to spot something and might be nice 
 if
 other people try this and see if the values for the fate-lavfi change 
 after
 each round.

 If this is what I think it is, search for YUVA420P in swscale and
 replace it with a generic alpha check.

 Oh I see now.

 swscale.c:661
 if (isPlanar(dstFormat)  isALPHA(dstFormat)  !alpPixBuf)
 fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 
 255);
 This only works for 8bit. For 9bit, you need a word fill with a value
 of (1  numbits) - 1.

 Which actually leads to a question: does any of this exist in the real
 world? I've never, ever, ever seen an image with non-8bpc alpha
 planes. I hate to invent stuff that has no utility but does carry
 implementation burden.

 You asked me to add those ^^;

 Uhm, ok, maybe I meant something else? I mean, there really isn't much
 use for 9-15bpp alpha channels as far as I know. 8bpp is obvious and I
 can sort of see 16bpp being useful (although I don't think I've ever
 seen it used).

 BTW while I was investigating I noticed
 all the previous 9 and 10 formats look strange, if you have time we
 could have a look.

 Can you elaborate?

 try this command

 ./avconv -v quiet -nostats -y -cpuflags all -i ./tests/vsynth1/%02d.pgm
 -vcodec rawvideo -pix_fmt yuv420p10 -f nut - | ./avplay -

 You will see it blue tinted while it should be gray.

 (you might need to apply my yet to be reviewed patch to fix nut with
 rawvideo).

Well yes, can we first sort out this stuff without nut? I mean, I
really don't have much reason to trust nut at this moment. When you
test a typical conversion (convert that thing to nut without
colorspace conversion, and then ./avplay -vf format=yuv420p10le), it
works, so the colorspace conversion itself is correct.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] rmdec: Move code shared with Matroska demuxer to RealMedia common code

2012-10-14 Thread Ronald S. Bultje
Hi,

On Sun, Oct 14, 2012 at 3:02 PM, Diego Biurrun di...@biurrun.de wrote:
 This removes a dependency of the Matroska demuxer on the RealMedia
 demuxer while only minimally bloating a RealMedia muxer only build.
 ---
  libavformat/Makefile |4 ++--
  libavformat/rm.c |   41 +
  libavformat/rmdec.c  |   41 -
  3 files changed, 43 insertions(+), 43 deletions(-)

I'm a little confused, rm.c contains code that is to be shared between
rmdec and rmenc. Shouldn't you add a new file (e.g. sipr.c) that
contains sipr-specific code to be shared between different demuxers
(e.g. rmdec and matroskadec)?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 09/15] swscale: avoid pointless use of compound literals

2012-10-14 Thread Ronald S. Bultje
On Sun, Oct 14, 2012 at 8:11 PM, Mans Rullgard m...@mansr.com wrote:
 Some compilers (e.g. old gcc) have trouble with these.

 Signed-off-by: Mans Rullgard m...@mansr.com
 ---
  libswscale/swscale_unscaled.c | 17 +
  1 file changed, 9 insertions(+), 8 deletions(-)

 diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
 index 70eff72..5efc647 100644
 --- a/libswscale/swscale_unscaled.c
 +++ b/libswscale/swscale_unscaled.c
 @@ -396,6 +396,11 @@ static int planarRgbToRgbWrapper(SwsContext *c, const 
 uint8_t *src[],
   uint8_t *dst[], int dstStride[])
  {
  int alpha_first = 0;
 +const uint8_t *src102[] = { src[1], src[0], src[2] };
 +const uint8_t *src201[] = { src[2], src[0], src[1] };
 +int stride102[] = { srcStride[1], srcStride[0], srcStride[2] };
 +int stride201[] = { srcStride[2], srcStride[0], srcStride[1] };
 +
  if (c-srcFormat != AV_PIX_FMT_GBRP) {
  av_log(c, AV_LOG_ERROR, unsupported planar RGB conversion %s - 
 %s\n,
 av_get_pix_fmt_name(c-srcFormat),
 @@ -405,15 +410,13 @@ static int planarRgbToRgbWrapper(SwsContext *c, const 
 uint8_t *src[],

  switch (c-dstFormat) {
  case AV_PIX_FMT_BGR24:
 -gbr24ptopacked24((const uint8_t *[]) { src[1], src[0], src[2] },
 - (int []) { srcStride[1], srcStride[0], srcStride[2] 
 },
 +gbr24ptopacked24(src102, stride102,
   dst[0] + srcSliceY * dstStride[0], dstStride[0],
   srcSliceH, c-srcW);
  break;

I find it slightly hilarious that there is no way in hell this would
have gotten in had the compiler's name been gcc. Why don't you use
c99-to-c89 for this so-called compiler?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/2] H.264: Convert 8-bit qpel inlined assembly to yasm

2012-10-13 Thread Ronald S. Bultje
Hi,

On Sat, Oct 13, 2012 at 9:05 AM, Diego Biurrun di...@biurrun.de wrote:
 On Sat, Oct 13, 2012 at 10:04:50AM -0500, Daniel Kang wrote:
 +%macro op_avgh 3
 +movh   %3, %2
 +pavgb  %1, %3
 +movh   %2, %1
 +%endmacro
 +
 +%macro op_avg 3
 +pavgb  %1, %2
 +mova   %2, %1
 +%endmacro
 +
 +%macro op_puth 3
 +movh   %2, %1
 +%endmacro
 +
 +%macro op_put 3
 +mova   %2, %1
 +%endmacro

 All of these macros only take 2 parameters.

op_avgh takes 3.

 +.loop:
 +movh  m1, [r1-1]
 +movh  m2, [r1+0]
 +movh  m3, [r1+1]
 +movh  m0, [r1+2]

 Spaces around operators would help readability IMO; more below...

No spaces around operators in assembly dereferences please.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] Fix for FFMPEG vp8dec loopfilter delta values reset at keyframes

2012-10-12 Thread Ronald S. Bultje
Hi,

On Fri, Oct 12, 2012 at 7:12 AM, Luca Barbato lu_z...@gentoo.org wrote:
 On 10/12/2012 12:42 PM, Sami Pietilä wrote:
 Hi all,

 while testing a VP8 encoder I found a mismatch between FFMPEG and libvpx
 VP8 decoders. The reason for this mismatch is that FFMPEG doesn't reset
 loopfilter delta values at keyframes. Patch that fixes the issue is below.
 I've verified that the output of ffmpeg after this patch matches libvpx. If
 you need I can send you a stream that shows the issue.

 Would be nice to have a small vector to add it to fate, indeed.

 The patch itself looks like that this specific detail hadn't been
 considered, while implementing it.

 If you can point me where in the specification it is stated I'd like to
 point it in the commit message.

It's only natural for all codec state to be reset in a keyframe. It is
supposed to be a re-entry point where all values are reset to
defaults.

The patch itself is LGTM, I was about to push it myself just now.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] Fix for FFMPEG vp8dec loopfilter delta values reset at keyframes

2012-10-12 Thread Ronald S. Bultje
Hi,

On Fri, Oct 12, 2012 at 7:18 AM, Luca Barbato lu_z...@gentoo.org wrote:
 On 10/12/2012 04:15 PM, Ronald S. Bultje wrote:
 If you can point me where in the specification it is stated I'd like to
 point it in the commit message.

 It's only natural for all codec state to be reset in a keyframe. It is
 supposed to be a re-entry point where all values are reset to
 defaults.

 The patch itself is LGTM, I was about to push it myself just now.

 Please do =)

Pushed.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] Support for MSVC.

2012-10-09 Thread Ronald S. Bultje
Hi,

On Tue, Oct 9, 2012 at 10:33 AM, aviad rozenhek avia...@gmail.com wrote:

 On Thu, Jun 14, 2012 at 1:59 AM, Ronald S. Bultje rsbul...@gmail.com
 wrote:

 Hi,

 see attached. It's kinda big, some parts can/should be done
 differently, but it's a starting point so let's start talking about
 how to get this in the right way.

 Some general comments:
 - only tested on 32bit so far
 - no it won't compile on non-MSVC with my patch (I had to break the
 Makefile in a couple of places to get linking to work)
 - the include of stdlib.h everywhere is because of [1]. We may need to
 add av_restrict instead.
 - the include of mathematics.h, avconfig.h and avstring.h everywhere
 is (I think) legit, to account for snprintf(), M_PI or inline.
 - the added assembly in fmtconvert.asm helps to get rid of a VLA in
 fmtconvert_mmx.c.
 - test.c converts c99 to c89 which MSVC can compile. Don't look at it
 unless you want to see true pain, it's intended to be shipped outside,
 like gas-preprocessor.pl. (unit.c and unit2.c are unittests for it.)
 - the renames of libavutil/x86/cpu.c, libavcodec/x86/mlpdsp.c and
 libswscale/x86/rgb2rgb.c are to prevent duplicate filenames in the
 same library, which makes the MSVC linker crap out in debug mode
 (works fine in release mode).
 - it mostly passes fate, except zlib-dependent tests (because I didn't
 try including zlib yet) and wtv (which uses gmtime(), which is broken
 on Windows since it only supports years up to 3000, and our test file
 uses 10900.) All other tests pass.

 Ronald

 [1]
 http://social.msdn.microsoft.com/Forums/en-US/vcgeneral/thread/bf3c0191-c15d-4e73-a111-3796608900c7


 can you provide prebuilt binaries for c99-to-c89.exe for windows? it will
 make using it that much easier

https://gerrit.chromium.org/gerrit/#/c/34728/

You'll need c99wrap.exe also, but that one is trivial to build
yourself, since it has no dependencies.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/5] vc1dec: Invoke edge emulation regardless of MV precision for 1-MV chroma

2012-10-09 Thread Ronald S. Bultje
Hi,

On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar
mashiat.sar...@gmail.com wrote:
 This is required due to the way VC-1 handles chroma pull-back which may end
 up causing negative chroma MV for zero luma MV. Edge emulation needs to be
 invoked in such cases.

 This problem only affects chroma Y motion vector.
 ---
  libavcodec/vc1dec.c |2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/libavcodec/vc1dec.c b/libavcodec/vc1dec.c
 index 42eb4a5..2683c86 100644
 --- a/libavcodec/vc1dec.c
 +++ b/libavcodec/vc1dec.c
 @@ -430,7 +430,7 @@ static void vc1_mc_1mv(VC1Context *v, int dir)
  if (v-rangeredfrm || (v-mv_mode == MV_PMODE_INTENSITY_COMP)
  || s-h_edge_pos  22 || v_edge_pos  22
  || (unsigned)(src_x - s-mspel)  s-h_edge_pos - (mx3) - 16 - 
 s-mspel * 3
 -|| (unsigned)(src_y - s-mspel)  v_edge_pos- (my3) - 16 - 
 s-mspel * 3) {
 +|| (unsigned)(src_y - 1) v_edge_pos- (my3) - 16 - 3) {

I'm slightly curious why this is only the case for the vertical
dimension, but not the horizontal. Is that intentional, or an
oversight?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 5/5] Double motion vector range for HPEL interlaced picture in proper place

2012-10-09 Thread Ronald S. Bultje
Hi,

On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar
mashiat.sar...@gmail.com wrote:
 The existing code is not in the right place and it should cover both
 interlaced frame and field pictures.
 ---
  libavcodec/vc1.c|5 +
  libavcodec/vc1dec.c |4 
  2 files changed, 5 insertions(+), 4 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 4/5] vc1dec: Set opposite to the correct value for 1REF field pictures

2012-10-09 Thread Ronald S. Bultje
Hi,

On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar
mashiat.sar...@gmail.com wrote:
 ---
  libavcodec/vc1dec.c |   14 ++
  1 file changed, 10 insertions(+), 4 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/5] vc1dec: Set chroma reference field from REFFIELD for 1REF field pictures

2012-10-09 Thread Ronald S. Bultje
Hi,

On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar
mashiat.sar...@gmail.com wrote:
 Interlaced field pictures can have one or two reference pictures, signaled
 by NUMREF syntax element. For single reference pictures, reference picture
 is determined by REFFIELD syntax element.
 ---
  libavcodec/vc1dec.c |1 +
  1 file changed, 1 insertion(+)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/5] vc1dec: Invoke edge emulation regardless of MV precision for 1-MV chroma

2012-10-09 Thread Ronald S. Bultje
Hi,

On Tue, Oct 9, 2012 at 4:56 PM, Mashiat Sarker Shakkhar
mashiat.sar...@gmail.com wrote:
 On 10/9/2012 7:37 PM, Ronald S. Bultje wrote:

 Hi,

 On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar
 mashiat.sar...@gmail.com wrote:

 This is required due to the way VC-1 handles chroma pull-back which may
 end
 up causing negative chroma MV for zero luma MV. Edge emulation needs to
 be
 invoked in such cases.

 This problem only affects chroma Y motion vector.
 ---
   libavcodec/vc1dec.c |2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/libavcodec/vc1dec.c b/libavcodec/vc1dec.c
 index 42eb4a5..2683c86 100644
 --- a/libavcodec/vc1dec.c
 +++ b/libavcodec/vc1dec.c
 @@ -430,7 +430,7 @@ static void vc1_mc_1mv(VC1Context *v, int dir)
   if (v-rangeredfrm || (v-mv_mode == MV_PMODE_INTENSITY_COMP)
   || s-h_edge_pos  22 || v_edge_pos  22
   || (unsigned)(src_x - s-mspel)  s-h_edge_pos - (mx3) - 16 -
 s-mspel * 3
 -|| (unsigned)(src_y - s-mspel)  v_edge_pos- (my3) - 16 -
 s-mspel * 3) {
 +|| (unsigned)(src_y - 1) v_edge_pos- (my3) - 16 -
 3) {


 I'm slightly curious why this is only the case for the vertical
 dimension, but not the horizontal. Is that intentional, or an
 oversight?


 This thread contains my earlier discussion on a rejected patch which tried
 to fix the same bug:

 http://patches.libav.org/patch/26240/

OK so it's not because of pullback per se, but because of field
adjustment, then it makes sense.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/9] Replace PIX_FMT_* - AV_PIX_FMT_*, PixelFormat - AVPixelFormat

2012-10-06 Thread Ronald S. Bultje
Hi,

On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov an...@khirnov.net wrote:
[..]

Yay! OK.

(Didn't review at all, a sed replace can't be that hard, plus this is awesome.)

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.

2012-10-06 Thread Ronald S. Bultje
Hi,

On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov an...@khirnov.net wrote:
 ---
  avprobe.c |6 +++---
  cmdutils.c|   16 +++-
  tools/graph2dot.c |4 ++--
  3 files changed, 16 insertions(+), 10 deletions(-)

 diff --git a/avprobe.c b/avprobe.c
 index 16a5d29..3a3ae0f 100644
 --- a/avprobe.c
 +++ b/avprobe.c
 @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, int 
 stream_idx)
  const char *profile;
  char val_str[128];
  AVRational display_aspect_ratio;
 +const AVPixFmtDescriptor *desc;

  probe_object_header(stream);

 @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, int 
 stream_idx)
rational_string(val_str, sizeof(val_str), :,
display_aspect_ratio));
  }
 -probe_str(pix_fmt,
 -  dec_ctx-pix_fmt != AV_PIX_FMT_NONE ?
 -  av_pix_fmt_descriptors[dec_ctx-pix_fmt].name : 
 unknown);
 +desc = av_pix_fmt_desc_get(dec_ctx-pix_fmt);

I think this is an awful idea, it should be OK to access the array directly.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-10-06 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
---
 libavcodec/x86/h264_deblock.asm |  27 ++-
 libavcodec/x86/h264dsp_init.c   |   4 +-
 libavcodec/x86/vp8dsp.asm   |  68 
 libavcodec/x86/vp8dsp_init.c|   8 --
 libavutil/x86/x86inc.asm| 167 +---
 5 files changed, 181 insertions(+), 93 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index 940a8f7..43aaf6d 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -399,14 +399,12 @@ DEBLOCK_LUMA
 ;-
 ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
-cglobal deblock_%1_luma_8, 5,5
+cglobal deblock_%1_luma_8, 5, 6 - HAVE_ALIGNED_STACK, 0, 2 * %2
 lea r4, [r1*3]
 dec r2 ; alpha-1
 neg r4
 dec r3 ; beta-1
 add r4, r0 ; pix-3*stride
-%assign pad 2*%2+12-(stack_offset15)
-SUB esp, pad
 
 movam0, [r4+r1]   ; p1
 movam1, [r4+2*r1] ; p0
@@ -444,22 +442,19 @@ cglobal deblock_%1_luma_8, 5,5
 DEBLOCK_P0_Q0
 mova[r4+2*r1], m1
 mova[r0], m2
-ADD esp, pad
 RET
 
 ;-
 ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
 INIT_MMX cpuname
-cglobal deblock_h_luma_8, 0,5
+cglobal deblock_h_luma_8, 0,6 - HAVE_ALIGNED_STACK, 0, 0x70
 movr0, r0mp
 movr3, r1m
 lear4, [r3*3]
 subr0, 4
 lear1, [r0+r4]
-%assign pad 0x78-(stack_offset15)
-SUBesp, pad
-%define pix_tmp esp+12
+%define pix_tmp esp+16
 
 ; transpose 6x16 - tmp space
 TRANSPOSE6x8_MEM  PASS8ROWS(r0, r1, r3, r4), pix_tmp
@@ -501,7 +496,6 @@ cglobal deblock_h_luma_8, 0,5
 movq   m3, [pix_tmp+0x48]
 TRANSPOSE8x4B_STORE  PASS8ROWS(r0, r1, r3, r4)
 
-ADDesp, pad
 RET
 %endmacro ; DEBLOCK_LUMA
 
@@ -632,7 +626,7 @@ DEBLOCK_LUMA v, 16
 %define mpb_0 m14
 %define mpb_1 m15
 %else
-%define spill(x) [esp+16*x+((stack_offset+4)15)]
+%define spill(x) [esp+16*x]
 %define p2 [r4+r1]
 %define q2 [r0+2*r1]
 %define t4 spill(0)
@@ -647,10 +641,7 @@ DEBLOCK_LUMA v, 16
 ;-
 ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta )
 ;-
-cglobal deblock_%1_luma_intra_8, 4,6,16
-%if ARCH_X86_64 == 0
-sub esp, 0x60
-%endif
+cglobal deblock_%1_luma_intra_8, 4, 6, 16, ARCH_X86_32 * 0x50
 lea r4, [r1*4]
 lea r5, [r1*3] ; 3*stride
 dec r2d; alpha-1
@@ -699,9 +690,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16
 LUMA_INTRA_SWAP_PQ
 LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5]
 .end:
-%if ARCH_X86_64 == 0
-add esp, 0x60
-%endif
 RET
 
 INIT_MMX cpuname
@@ -738,12 +726,10 @@ cglobal deblock_h_luma_intra_8, 4,9
 addrsp, 0x88
 RET
 %else
-cglobal deblock_h_luma_intra_8, 2,4
+cglobal deblock_h_luma_intra_8, 2, 5 - HAVE_ALIGNED_STACK, 0, 0x80
 lear3,  [r1*3]
 subr0,  4
 lear2,  [r0+r3]
-%assign pad 0x8c-(stack_offset15)
-SUBrsp, pad
 %define pix_tmp rsp
 
 ; transpose 8x16 - tmp space
@@ -774,7 +760,6 @@ cglobal deblock_h_luma_intra_8, 2,4
 lear0,  [r0+r1*8]
 lear2,  [r2+r1*8]
 TRANSPOSE8x8_MEM  PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), 
PASS8ROWS(r0, r2, r1, r3)
-ADDrsp, pad
 RET
 %endif ; ARCH_X86_64
 %endmacro ; DEBLOCK_LUMA_INTRA
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index 3f6ded4..8596f26 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int 
bit_depth,
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2;
 c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2;
 
-#if HAVE_ALIGNED_STACK
 c-h264_v_loop_filter_luma   = 
ff_deblock_v_luma_8_sse2;
 c-h264_h_loop_filter_luma   = 
ff_deblock_h_luma_8_sse2;
 c-h264_v_loop_filter_luma_intra = 
ff_deblock_v_luma_intra_8_sse2;
 c-h264_h_loop_filter_luma_intra = 
ff_deblock_h_luma_intra_8_sse2;
-#endif /* HAVE_ALIGNED_STACK */
 }
 if (EXTERNAL_SSSE3(mm_flags)) {
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3

Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.

2012-10-06 Thread Ronald S. Bultje
Hi,

On Sat, Oct 6, 2012 at 1:28 PM, Anton Khirnov an...@khirnov.net wrote:
 On Sat, 6 Oct 2012 12:35:44 -0700, Ronald S. Bultje rsbul...@gmail.com 
 wrote:
 On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov an...@khirnov.net wrote:
  ---
   avprobe.c |6 +++---
   cmdutils.c|   16 +++-
   tools/graph2dot.c |4 ++--
   3 files changed, 16 insertions(+), 10 deletions(-)
 
  diff --git a/avprobe.c b/avprobe.c
  index 16a5d29..3a3ae0f 100644
  --- a/avprobe.c
  +++ b/avprobe.c
  @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, int 
  stream_idx)
   const char *profile;
   char val_str[128];
   AVRational display_aspect_ratio;
  +const AVPixFmtDescriptor *desc;
 
   probe_object_header(stream);
 
  @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, int 
  stream_idx)
 rational_string(val_str, sizeof(val_str), :,
 display_aspect_ratio));
   }
  -probe_str(pix_fmt,
  -  dec_ctx-pix_fmt != AV_PIX_FMT_NONE ?
  -  av_pix_fmt_descriptors[dec_ctx-pix_fmt].name : 
  unknown);
  +desc = av_pix_fmt_desc_get(dec_ctx-pix_fmt);

 I think this is an awful idea, it should be OK to access the array directly.


 I don't see any advantage to accessing the array directly.

It often leads to smaller code. Instead of AVPixFmtDesc *d =
getter(pix_fmt); if (!d) return -1; printf(%d\n, d-something);, you
can just do printf(%d\n, array[pix_fmt].something);. Do we really
need to babysit our users?

 It is the only case in all Libav where we export an array like this.

We never supported MSVC. So we shouldn't?

 It fixes the size of AVPixFmtDescriptor for no good reason.

Is there a need to extend it?

 It prevents the caller from accessing all the descriptors when he's
 using a newer library than the one he compiled against.

if (compile_ver  runtime_ver) error(); on app init. Likely, you want
to do that anyway.

 And I hear there's some trouble with the evil microsoft thing and data
 symbols.

You have that anyway for any symbol shared between lavu/f/fi/c/r/*.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.

2012-10-06 Thread Ronald S. Bultje
Hi,

On Sat, Oct 6, 2012 at 9:19 PM, Anton Khirnov an...@khirnov.net wrote:

 On Sat, 6 Oct 2012 17:01:57 -0700, Ronald S. Bultje rsbul...@gmail.com 
 wrote:
 Hi,

 On Sat, Oct 6, 2012 at 1:28 PM, Anton Khirnov an...@khirnov.net wrote:
  On Sat, 6 Oct 2012 12:35:44 -0700, Ronald S. Bultje rsbul...@gmail.com 
  wrote:
  On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov an...@khirnov.net wrote:
   ---
avprobe.c |6 +++---
cmdutils.c|   16 +++-
tools/graph2dot.c |4 ++--
3 files changed, 16 insertions(+), 10 deletions(-)
  
   diff --git a/avprobe.c b/avprobe.c
   index 16a5d29..3a3ae0f 100644
   --- a/avprobe.c
   +++ b/avprobe.c
   @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, 
   int stream_idx)
const char *profile;
char val_str[128];
AVRational display_aspect_ratio;
   +const AVPixFmtDescriptor *desc;
  
probe_object_header(stream);
  
   @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, 
   int stream_idx)
  rational_string(val_str, sizeof(val_str), 
   :,
  display_aspect_ratio));
}
   -probe_str(pix_fmt,
   -  dec_ctx-pix_fmt != AV_PIX_FMT_NONE ?
   -  av_pix_fmt_descriptors[dec_ctx-pix_fmt].name : 
   unknown);
   +desc = av_pix_fmt_desc_get(dec_ctx-pix_fmt);
 
  I think this is an awful idea, it should be OK to access the array 
  directly.
 
 
  I don't see any advantage to accessing the array directly.

 It often leads to smaller code. Instead of AVPixFmtDesc *d =
 getter(pix_fmt); if (!d) return -1; printf(%d\n, d-something);, you
 can just do printf(%d\n, array[pix_fmt].something);. Do we really
 need to babysit our users?

 Checking for d to be non-NULL is equivalent to checking that pix_fmt is
 valid. So no, it does not necessarily makes the code longer.

  It is the only case in all Libav where we export an array like this.

 We never supported MSVC. So we shouldn't?


 YES! =p

 Seriously though, there are only disadvantages to exporting the array
 directly.

  It fixes the size of AVPixFmtDescriptor for no good reason.

 Is there a need to extend it?


 There might be. You don't want to find out that you need to extend it
 two months after somebody else just bumped lavu.

  It prevents the caller from accessing all the descriptors when he's
  using a newer library than the one he compiled against.

 if (compile_ver  runtime_ver) error(); on app init. Likely, you want
 to do that anyway.

  And I hear there's some trouble with the evil microsoft thing and data
  symbols.

 You have that anyway for any symbol shared between lavu/f/fi/c/r/*.

 It should be our long-term goal to remove those.

You can't. Stuff like ff_sqrt[], ff_log2[] etc. can't be removed. Never. Sorry.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.

2012-10-06 Thread Ronald S. Bultje
Hi,

On Sat, Oct 6, 2012 at 9:47 PM, Anton Khirnov an...@khirnov.net wrote:

 On Sat, 6 Oct 2012 21:44:03 -0700, Ronald S. Bultje rsbul...@gmail.com 
 wrote:
 Hi,

 On Sat, Oct 6, 2012 at 9:19 PM, Anton Khirnov an...@khirnov.net wrote:
 
  On Sat, 6 Oct 2012 17:01:57 -0700, Ronald S. Bultje rsbul...@gmail.com 
  wrote:
  Hi,
 
  On Sat, Oct 6, 2012 at 1:28 PM, Anton Khirnov an...@khirnov.net wrote:
   On Sat, 6 Oct 2012 12:35:44 -0700, Ronald S. Bultje 
   rsbul...@gmail.com wrote:
   On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov an...@khirnov.net 
   wrote:
---
 avprobe.c |6 +++---
 cmdutils.c|   16 +++-
 tools/graph2dot.c |4 ++--
 3 files changed, 16 insertions(+), 10 deletions(-)
   
diff --git a/avprobe.c b/avprobe.c
index 16a5d29..3a3ae0f 100644
--- a/avprobe.c
+++ b/avprobe.c
@@ -584,6 +584,7 @@ static void show_stream(AVFormatContext 
*fmt_ctx, int stream_idx)
 const char *profile;
 char val_str[128];
 AVRational display_aspect_ratio;
+const AVPixFmtDescriptor *desc;
   
 probe_object_header(stream);
   
@@ -629,9 +630,8 @@ static void show_stream(AVFormatContext 
*fmt_ctx, int stream_idx)
   rational_string(val_str, sizeof(val_str), 
:,
   display_aspect_ratio));
 }
-probe_str(pix_fmt,
-  dec_ctx-pix_fmt != AV_PIX_FMT_NONE ?
-  av_pix_fmt_descriptors[dec_ctx-pix_fmt].name 
: unknown);
+desc = av_pix_fmt_desc_get(dec_ctx-pix_fmt);
  
   I think this is an awful idea, it should be OK to access the array 
   directly.
  
  
   I don't see any advantage to accessing the array directly.
 
  It often leads to smaller code. Instead of AVPixFmtDesc *d =
  getter(pix_fmt); if (!d) return -1; printf(%d\n, d-something);, you
  can just do printf(%d\n, array[pix_fmt].something);. Do we really
  need to babysit our users?
 
  Checking for d to be non-NULL is equivalent to checking that pix_fmt is
  valid. So no, it does not necessarily makes the code longer.
 
   It is the only case in all Libav where we export an array like this.
 
  We never supported MSVC. So we shouldn't?
 
 
  YES! =p
 
  Seriously though, there are only disadvantages to exporting the array
  directly.
 
   It fixes the size of AVPixFmtDescriptor for no good reason.
 
  Is there a need to extend it?
 
 
  There might be. You don't want to find out that you need to extend it
  two months after somebody else just bumped lavu.
 
   It prevents the caller from accessing all the descriptors when he's
   using a newer library than the one he compiled against.
 
  if (compile_ver  runtime_ver) error(); on app init. Likely, you want
  to do that anyway.
 
   And I hear there's some trouble with the evil microsoft thing and data
   symbols.
 
  You have that anyway for any symbol shared between lavu/f/fi/c/r/*.
 
  It should be our long-term goal to remove those.

 You can't. Stuff like ff_sqrt[], ff_log2[] etc. can't be removed. Never. 
 Sorry.


 The same way we can never have msvc support?

Right.

Shall we move on to more practical patches?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/9] pixfmt: add AV_ prefixes to PIX_FMT_*

2012-10-06 Thread Ronald S. Bultje
Hi,

On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov an...@khirnov.net wrote:
 ---
  doc/APIchanges   |8 ++
  libavutil/Makefile   |1 +
  libavutil/old_pix_fmts.h |  128 
  libavutil/pixfmt.h   |  298 
 ++
  libavutil/version.h  |7 +-
  5 files changed, 311 insertions(+), 131 deletions(-)
  create mode 100644 libavutil/old_pix_fmts.h

\o/ OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] Drop broken and unused CABAC test program.

2012-10-05 Thread Ronald S. Bultje
Hi,

On Fri, Oct 5, 2012 at 3:50 AM, Diego Biurrun di...@biurrun.de wrote:
 ---
 The test program is of doubtful utility, fails to compile on x86_32 and
 was never part of our test suite.
  libavcodec/Makefile |3 +-
  libavcodec/cabac.c  |  147 
 ---
  2 files changed, 1 insertions(+), 149 deletions(-)

This is probably the 20th time you're submitting this patch. Can you
please not do that? Are you just hoping someone isn't paying attention
and accidently says yes?

Why don't you fix the bug that prevents it from working.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] h264: don't touch H264Context-ref_count[] during MB decoding.

2012-10-05 Thread Ronald S. Bultje
Hi,

On Fri, Oct 5, 2012 at 12:25 AM, Diego Biurrun di...@biurrun.de wrote:
 On Fri, Oct 05, 2012 at 02:48:20AM +0200, Luca Barbato wrote:
 On 10/05/2012 01:29 AM, Ronald S. Bultje wrote:
  On Thu, Oct 4, 2012 at 4:27 PM, Ronald S. Bultje rsbul...@gmail.com 
  wrote:

  Oh right you mean for performance?
  fate-h264-conformance-cama2_vtc_b
  fate-h264-conformance-cama3_vtc_b
  should be ok.

 I'd push your patch + my style nits soon, I'll build tsan2 soon (yai
 gentoo live ebuilds) since I'm curious about what's wrong with slice
 threading I guess most of the issues could be uncovered clearly there.

 Here's an issue to fix wrt sliced threading:

 http://fate.libav.org/x86_64-linux-slice-threading/20121005005502

One of the mailinglists recently had an email about our slice
threading impl relying on wakeups-only-by-signal, i.e. spurious
wakeups could lead to races, can you look at that and see if it fixes
it?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-10-05 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
---
 libavcodec/x86/h264_deblock.asm |  27 ++-
 libavcodec/x86/h264dsp_init.c   |   4 +-
 libavcodec/x86/vp8dsp.asm   |  68 -
 libavcodec/x86/vp8dsp_init.c|   8 --
 libavutil/x86/x86inc.asm| 160 +---
 5 files changed, 175 insertions(+), 92 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index 6abdba2..6137678 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -401,14 +401,12 @@ DEBLOCK_LUMA
 ;-
 ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
-cglobal deblock_%1_luma_8, 5,5
+cglobal deblock_%1_luma_8, 5, 6 - HAVE_ALIGNED_STACK, 0, 2 * %2
 lea r4, [r1*3]
 dec r2 ; alpha-1
 neg r4
 dec r3 ; beta-1
 add r4, r0 ; pix-3*stride
-%assign pad 2*%2+12-(stack_offset15)
-SUB esp, pad
 
 movam0, [r4+r1]   ; p1
 movam1, [r4+2*r1] ; p0
@@ -446,22 +444,19 @@ cglobal deblock_%1_luma_8, 5,5
 DEBLOCK_P0_Q0
 mova[r4+2*r1], m1
 mova[r0], m2
-ADD esp, pad
 RET
 
 ;-
 ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
 INIT_MMX cpuname
-cglobal deblock_h_luma_8, 0,5
+cglobal deblock_h_luma_8, 0,6 - HAVE_ALIGNED_STACK, 0, 0x70
 movr0, r0mp
 movr3, r1m
 lear4, [r3*3]
 subr0, 4
 lear1, [r0+r4]
-%assign pad 0x78-(stack_offset15)
-SUBesp, pad
-%define pix_tmp esp+12
+%define pix_tmp esp+16
 
 ; transpose 6x16 - tmp space
 TRANSPOSE6x8_MEM  PASS8ROWS(r0, r1, r3, r4), pix_tmp
@@ -503,7 +498,6 @@ cglobal deblock_h_luma_8, 0,5
 movq   m3, [pix_tmp+0x48]
 TRANSPOSE8x4B_STORE  PASS8ROWS(r0, r1, r3, r4)
 
-ADDesp, pad
 RET
 %endmacro ; DEBLOCK_LUMA
 
@@ -636,7 +630,7 @@ DEBLOCK_LUMA v, 16
 %define mpb_0 m14
 %define mpb_1 m15
 %else
-%define spill(x) [esp+16*x+((stack_offset+4)15)]
+%define spill(x) [esp+16*x]
 %define p2 [r4+r1]
 %define q2 [r0+2*r1]
 %define t4 spill(0)
@@ -651,10 +645,7 @@ DEBLOCK_LUMA v, 16
 ;-
 ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta )
 ;-
-cglobal deblock_%1_luma_intra_8, 4,6,16
-%if ARCH_X86_64 == 0
-sub esp, 0x60
-%endif
+cglobal deblock_%1_luma_intra_8, 4, 6, 16, ARCH_X86_32 * 0x50
 lea r4, [r1*4]
 lea r5, [r1*3] ; 3*stride
 dec r2d; alpha-1
@@ -703,9 +694,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16
 LUMA_INTRA_SWAP_PQ
 LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5]
 .end:
-%if ARCH_X86_64 == 0
-add esp, 0x60
-%endif
 RET
 
 INIT_MMX cpuname
@@ -742,12 +730,10 @@ cglobal deblock_h_luma_intra_8, 4,9
 addrsp, 0x88
 RET
 %else
-cglobal deblock_h_luma_intra_8, 2,4
+cglobal deblock_h_luma_intra_8, 2, 5 - HAVE_ALIGNED_STACK, 0, 0x80
 lear3,  [r1*3]
 subr0,  4
 lear2,  [r0+r3]
-%assign pad 0x8c-(stack_offset15)
-SUBrsp, pad
 %define pix_tmp rsp
 
 ; transpose 8x16 - tmp space
@@ -778,7 +764,6 @@ cglobal deblock_h_luma_intra_8, 2,4
 lear0,  [r0+r1*8]
 lear2,  [r2+r1*8]
 TRANSPOSE8x8_MEM  PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), 
PASS8ROWS(r0, r2, r1, r3)
-ADDrsp, pad
 RET
 %endif ; ARCH_X86_64
 %endmacro ; DEBLOCK_LUMA_INTRA
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index bb48867..f84e3e8 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -276,18 +276,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int 
bit_depth,
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2;
 c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2;
 
-#if HAVE_ALIGNED_STACK
 c-h264_v_loop_filter_luma   = 
ff_deblock_v_luma_8_sse2;
 c-h264_h_loop_filter_luma   = 
ff_deblock_h_luma_8_sse2;
 c-h264_v_loop_filter_luma_intra = 
ff_deblock_v_luma_intra_8_sse2;
 c-h264_h_loop_filter_luma_intra = 
ff_deblock_h_luma_intra_8_sse2;
-#endif /* HAVE_ALIGNED_STACK */
 }
 if (EXTERNAL_SSSE3(mm_flags)) {
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3

[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-10-05 Thread Ronald S. Bultje
From: Ronald S. Bultje rbul...@chromium.org

Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
---
 libavcodec/x86/h264_deblock.asm |  27 ++-
 libavcodec/x86/h264dsp_init.c   |   4 +-
 libavcodec/x86/vp8dsp.asm   |  68 +-
 libavcodec/x86/vp8dsp_init.c|   8 ---
 libavutil/x86/x86inc.asm| 156 +---
 5 files changed, 173 insertions(+), 90 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index 6abdba2..6137678 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -401,14 +401,12 @@ DEBLOCK_LUMA
 ;-
 ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
-cglobal deblock_%1_luma_8, 5,5
+cglobal deblock_%1_luma_8, 5, 6 - HAVE_ALIGNED_STACK, 0, 2 * %2
 lea r4, [r1*3]
 dec r2 ; alpha-1
 neg r4
 dec r3 ; beta-1
 add r4, r0 ; pix-3*stride
-%assign pad 2*%2+12-(stack_offset15)
-SUB esp, pad
 
 movam0, [r4+r1]   ; p1
 movam1, [r4+2*r1] ; p0
@@ -446,22 +444,19 @@ cglobal deblock_%1_luma_8, 5,5
 DEBLOCK_P0_Q0
 mova[r4+2*r1], m1
 mova[r0], m2
-ADD esp, pad
 RET
 
 ;-
 ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
 INIT_MMX cpuname
-cglobal deblock_h_luma_8, 0,5
+cglobal deblock_h_luma_8, 0,6 - HAVE_ALIGNED_STACK, 0, 0x70
 movr0, r0mp
 movr3, r1m
 lear4, [r3*3]
 subr0, 4
 lear1, [r0+r4]
-%assign pad 0x78-(stack_offset15)
-SUBesp, pad
-%define pix_tmp esp+12
+%define pix_tmp esp+16
 
 ; transpose 6x16 - tmp space
 TRANSPOSE6x8_MEM  PASS8ROWS(r0, r1, r3, r4), pix_tmp
@@ -503,7 +498,6 @@ cglobal deblock_h_luma_8, 0,5
 movq   m3, [pix_tmp+0x48]
 TRANSPOSE8x4B_STORE  PASS8ROWS(r0, r1, r3, r4)
 
-ADDesp, pad
 RET
 %endmacro ; DEBLOCK_LUMA
 
@@ -636,7 +630,7 @@ DEBLOCK_LUMA v, 16
 %define mpb_0 m14
 %define mpb_1 m15
 %else
-%define spill(x) [esp+16*x+((stack_offset+4)15)]
+%define spill(x) [esp+16*x]
 %define p2 [r4+r1]
 %define q2 [r0+2*r1]
 %define t4 spill(0)
@@ -651,10 +645,7 @@ DEBLOCK_LUMA v, 16
 ;-
 ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta )
 ;-
-cglobal deblock_%1_luma_intra_8, 4,6,16
-%if ARCH_X86_64 == 0
-sub esp, 0x60
-%endif
+cglobal deblock_%1_luma_intra_8, 4, 6, 16, ARCH_X86_32 * 0x50
 lea r4, [r1*4]
 lea r5, [r1*3] ; 3*stride
 dec r2d; alpha-1
@@ -703,9 +694,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16
 LUMA_INTRA_SWAP_PQ
 LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5]
 .end:
-%if ARCH_X86_64 == 0
-add esp, 0x60
-%endif
 RET
 
 INIT_MMX cpuname
@@ -742,12 +730,10 @@ cglobal deblock_h_luma_intra_8, 4,9
 addrsp, 0x88
 RET
 %else
-cglobal deblock_h_luma_intra_8, 2,4
+cglobal deblock_h_luma_intra_8, 2, 5 - HAVE_ALIGNED_STACK, 0, 0x80
 lear3,  [r1*3]
 subr0,  4
 lear2,  [r0+r3]
-%assign pad 0x8c-(stack_offset15)
-SUBrsp, pad
 %define pix_tmp rsp
 
 ; transpose 8x16 - tmp space
@@ -778,7 +764,6 @@ cglobal deblock_h_luma_intra_8, 2,4
 lear0,  [r0+r1*8]
 lear2,  [r2+r1*8]
 TRANSPOSE8x8_MEM  PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), 
PASS8ROWS(r0, r2, r1, r3)
-ADDrsp, pad
 RET
 %endif ; ARCH_X86_64
 %endmacro ; DEBLOCK_LUMA_INTRA
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index bb48867..f84e3e8 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -276,18 +276,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int 
bit_depth,
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2;
 c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2;
 
-#if HAVE_ALIGNED_STACK
 c-h264_v_loop_filter_luma   = 
ff_deblock_v_luma_8_sse2;
 c-h264_h_loop_filter_luma   = 
ff_deblock_h_luma_8_sse2;
 c-h264_v_loop_filter_luma_intra = 
ff_deblock_v_luma_intra_8_sse2;
 c-h264_h_loop_filter_luma_intra = 
ff_deblock_h_luma_intra_8_sse2;
-#endif /* HAVE_ALIGNED_STACK */
 }
 if (EXTERNAL_SSSE3(mm_flags)) {
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3

[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-10-05 Thread Ronald S. Bultje
From: Ronald S. Bultje rbul...@chromium.org

Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
---
 libavcodec/x86/h264_deblock.asm |  27 ++-
 libavcodec/x86/h264dsp_init.c   |   4 +-
 libavcodec/x86/vp8dsp.asm   |  68 -
 libavcodec/x86/vp8dsp_init.c|   8 --
 libavutil/x86/x86inc.asm| 158 +---
 5 files changed, 173 insertions(+), 92 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index 6abdba2..6137678 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -401,14 +401,12 @@ DEBLOCK_LUMA
 ;-
 ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
-cglobal deblock_%1_luma_8, 5,5
+cglobal deblock_%1_luma_8, 5, 6 - HAVE_ALIGNED_STACK, 0, 2 * %2
 lea r4, [r1*3]
 dec r2 ; alpha-1
 neg r4
 dec r3 ; beta-1
 add r4, r0 ; pix-3*stride
-%assign pad 2*%2+12-(stack_offset15)
-SUB esp, pad
 
 movam0, [r4+r1]   ; p1
 movam1, [r4+2*r1] ; p0
@@ -446,22 +444,19 @@ cglobal deblock_%1_luma_8, 5,5
 DEBLOCK_P0_Q0
 mova[r4+2*r1], m1
 mova[r0], m2
-ADD esp, pad
 RET
 
 ;-
 ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
 INIT_MMX cpuname
-cglobal deblock_h_luma_8, 0,5
+cglobal deblock_h_luma_8, 0,6 - HAVE_ALIGNED_STACK, 0, 0x70
 movr0, r0mp
 movr3, r1m
 lear4, [r3*3]
 subr0, 4
 lear1, [r0+r4]
-%assign pad 0x78-(stack_offset15)
-SUBesp, pad
-%define pix_tmp esp+12
+%define pix_tmp esp+16
 
 ; transpose 6x16 - tmp space
 TRANSPOSE6x8_MEM  PASS8ROWS(r0, r1, r3, r4), pix_tmp
@@ -503,7 +498,6 @@ cglobal deblock_h_luma_8, 0,5
 movq   m3, [pix_tmp+0x48]
 TRANSPOSE8x4B_STORE  PASS8ROWS(r0, r1, r3, r4)
 
-ADDesp, pad
 RET
 %endmacro ; DEBLOCK_LUMA
 
@@ -636,7 +630,7 @@ DEBLOCK_LUMA v, 16
 %define mpb_0 m14
 %define mpb_1 m15
 %else
-%define spill(x) [esp+16*x+((stack_offset+4)15)]
+%define spill(x) [esp+16*x]
 %define p2 [r4+r1]
 %define q2 [r0+2*r1]
 %define t4 spill(0)
@@ -651,10 +645,7 @@ DEBLOCK_LUMA v, 16
 ;-
 ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta )
 ;-
-cglobal deblock_%1_luma_intra_8, 4,6,16
-%if ARCH_X86_64 == 0
-sub esp, 0x60
-%endif
+cglobal deblock_%1_luma_intra_8, 4, 6, 16, ARCH_X86_32 * 0x50
 lea r4, [r1*4]
 lea r5, [r1*3] ; 3*stride
 dec r2d; alpha-1
@@ -703,9 +694,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16
 LUMA_INTRA_SWAP_PQ
 LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5]
 .end:
-%if ARCH_X86_64 == 0
-add esp, 0x60
-%endif
 RET
 
 INIT_MMX cpuname
@@ -742,12 +730,10 @@ cglobal deblock_h_luma_intra_8, 4,9
 addrsp, 0x88
 RET
 %else
-cglobal deblock_h_luma_intra_8, 2,4
+cglobal deblock_h_luma_intra_8, 2, 5 - HAVE_ALIGNED_STACK, 0, 0x80
 lear3,  [r1*3]
 subr0,  4
 lear2,  [r0+r3]
-%assign pad 0x8c-(stack_offset15)
-SUBrsp, pad
 %define pix_tmp rsp
 
 ; transpose 8x16 - tmp space
@@ -778,7 +764,6 @@ cglobal deblock_h_luma_intra_8, 2,4
 lear0,  [r0+r1*8]
 lear2,  [r2+r1*8]
 TRANSPOSE8x8_MEM  PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), 
PASS8ROWS(r0, r2, r1, r3)
-ADDrsp, pad
 RET
 %endif ; ARCH_X86_64
 %endmacro ; DEBLOCK_LUMA_INTRA
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index bb48867..f84e3e8 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -276,18 +276,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int 
bit_depth,
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2;
 c-biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2;
 
-#if HAVE_ALIGNED_STACK
 c-h264_v_loop_filter_luma   = 
ff_deblock_v_luma_8_sse2;
 c-h264_h_loop_filter_luma   = 
ff_deblock_h_luma_8_sse2;
 c-h264_v_loop_filter_luma_intra = 
ff_deblock_v_luma_intra_8_sse2;
 c-h264_h_loop_filter_luma_intra = 
ff_deblock_h_luma_intra_8_sse2;
-#endif /* HAVE_ALIGNED_STACK */
 }
 if (EXTERNAL_SSSE3(mm_flags)) {
 c-biweight_h264_pixels_tab[0] = ff_h264_biweight_16_ssse3

Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-10-05 Thread Ronald S. Bultje
Hi,

On Fri, Oct 5, 2012 at 1:46 PM, Diego Biurrun di...@biurrun.de wrote:
 On Fri, Oct 05, 2012 at 01:38:44PM -0700, Ronald S. Bultje wrote:
 From: Ronald S. Bultje rsbul...@gmail.com

 Use this in VP8/H264-8bit loopfilter functions so they can be used if
 there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
 ---
  libavcodec/x86/h264_deblock.asm |  27 ++-
  libavcodec/x86/h264dsp_init.c   |   4 +-
  libavcodec/x86/vp8dsp.asm   |  68 -
  libavcodec/x86/vp8dsp_init.c|   8 --
  libavutil/x86/x86inc.asm| 160 
 +---
  5 files changed, 175 insertions(+), 92 deletions(-)

 This has tabs in many places - please fix your editor.

Fixed.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] h264: don't touch H264Context-ref_count[] during MB decoding.

2012-10-04 Thread Ronald S. Bultje
Hi,

On Thu, Oct 4, 2012 at 12:59 PM, Luca Barbato lu_z...@gentoo.org wrote:
 On 10/04/2012 01:46 AM, Luca Barbato wrote:
 From: Ronald S. Bultje rsbul...@gmail.com

 The variable is copied to subsequent threads at the same time, so this
 may cause wrong ref_count[] values to be copied to subsequent threads.

 This bug was found using TSAN.

 Signed-off-by: Luca Barbato lu_z...@gentoo.org
 ---

 Few cosmetic nits.


 The patch looks fine, do you have a vector to try it?

just fate-h264 will trigger it sufficiently.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] h264: don't touch H264Context-ref_count[] during MB decoding.

2012-10-04 Thread Ronald S. Bultje
On Thu, Oct 4, 2012 at 4:27 PM, Ronald S. Bultje rsbul...@gmail.com wrote:
 Hi,

 On Thu, Oct 4, 2012 at 12:59 PM, Luca Barbato lu_z...@gentoo.org wrote:
 On 10/04/2012 01:46 AM, Luca Barbato wrote:
 From: Ronald S. Bultje rsbul...@gmail.com

 The variable is copied to subsequent threads at the same time, so this
 may cause wrong ref_count[] values to be copied to subsequent threads.

 This bug was found using TSAN.

 Signed-off-by: Luca Barbato lu_z...@gentoo.org
 ---

 Few cosmetic nits.


 The patch looks fine, do you have a vector to try it?

 just fate-h264 will trigger it sufficiently.

Oh right you mean for performance?
fate-h264-conformance-cama2_vtc_b
fate-h264-conformance-cama3_vtc_b
should be ok.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] Shared library support for MSVC build.

2012-10-03 Thread Ronald S. Bultje
Hi,

On Wed, Oct 3, 2012 at 4:02 AM, Diego Biurrun di...@biurrun.de wrote:
 On Tue, Oct 02, 2012 at 05:48:26PM -0700, Ronald S. Bultje wrote:
 2) a lot of ff_ symbols, and dsputil_init, are accessed from outside
 the library in which they exist. These are bugs and should be fixed (i.e.
 renamed).

 You seem to be working on the wrong tree.

This is Chrome's tree.


 --- a/configure
 +++ b/configure
 @@ -2616,7 +2616,7 @@ probe_cc(){
  _flags_filter=msvc_flags
 -_ld_lib='lib%.a'
 +_ld_lib='%.lib'
  _ld_path='-libpath:'

 Do static libs still work as expected?

Will test.

 @@ -3078,11 +3078,13 @@ case $target_os in
  shlibdir_default=$bindir_default
  SLIBPREF=
  SLIBSUF=.dll
 + LIBPREF=
 + LIBSUF=.lib

 Tabs; fix your editor.

Yup, fixed.

 How does this affect normal gcc MinGW builds?

Haven't tested yet. It probably breaks it right now, I'll unbreak that later.

 --- a/libavcodec/golomb.h
 +++ b/libavcodec/golomb.h
 @@ -33,13 +33,14 @@

 -extern const uint8_t ff_golomb_vlc_len[512];
 -extern const uint8_t ff_ue_golomb_vlc_code[512];
 -extern const  int8_t ff_se_golomb_vlc_code[512];
 -extern const uint8_t ff_ue_golomb_len[256];
 +extern AVCODEC_SYMBOL const uint8_t ff_golomb_vlc_len[512];
 +extern AVCODEC_SYMBOL const uint8_t ff_ue_golomb_vlc_code[512];
 +extern AVCODEC_SYMBOL const  int8_t ff_se_golomb_vlc_code[512];
 +extern AVCODEC_SYMBOL const uint8_t ff_ue_golomb_len[256];

 I somehow doubt marking ff_-prefixed symbols is necessary.

 --- a/libavcodec/raw.h
 +++ b/libavcodec/raw.h
 @@ -28,13 +28,14 @@

 -extern const PixelFormatTag ff_raw_pix_fmt_tags[];
 +extern AVCODEC_SYMBOL const PixelFormatTag ff_raw_pix_fmt_tags[];

 same

So, yes, but, these are used between libraries. E.g.
ff_raw_pix_fmt_tags is used in libavformat:
dhcp-172-31-81-183:ffmpeg rbultje$ grep ff_raw_pix_fmt_tags libavformat/*.c
libavformat/utils.c:
if(ff_find_pix_fmt(ff_raw_pix_fmt_tags, tag) == st-codec-pix_fmt)

and the golomb symbols are used by the golomb fate test. Maybe we can
link in a bunch of other .c files from libavcodec instead of linking
to avcodec.dll instead (I agree that it's ugly). Note that gcc/mingw
shared lib build (see symbol list in libavcodec.v) also exports those
symbols.

 +#ifndef AVCODEC_SYMBOLS_H
 +#define AVCODEC_SYMBOLS_H
 +
 +#include libavutil/avconfig.h
 +
 +#if AV_HAVE_SHARED_LIBS  defined(_MSC_VER)  !defined(COMPILING_avcodec)

 What's AV_HAVE_SHARED_LIBS?  I'm not aware of it anywhere.
 What do you need avconfig.h for?

These AVCODEC_SYMBOL things will be used in public headers, thus the
definition needs t exist in a public header. Since I need to use
__declspec(dllimport) only for .dll (shared) builds, not for .lib
(static) builds, I need to know in a public header whether I'm a
shared lib or not - hence AV_HAVE_SHARED_LIBS (I added that to
configure, based on CONFIG_SHARED).

 +#ifndef AVUTIL_SYMBOLS_H
 +#define AVUTIL_SYMBOLS_H
 +
 +#include libavutil/avconfig.h

 No need for the libavutil/ prefix, it's the same directory.

It's generated, i.e. it lives in the build dir, not in the source dir,
so if you don't have the libavutil prefix, it won't build, since the
header isn't in the same dir (sourcedir vs. builddir) as the file that
included it. See other files including specifically avconfig.h, they
all use the libavutil/ prefix.

 --- a/library.mak
 +++ b/library.mak
 @@ -94,7 +95,7 @@ endef

 -$(EXAMPLES) $(TESTPROGS) $(TOOLS): $(THIS_LIB) $(DEP_LIBS)
 +#$(EXAMPLES) $(TESTPROGS) $(TOOLS): $(THIS_LIB) $(DEP_LIBS)
  $(TESTPROGS): $(SUBDIR)$(LIBNAME)

 stray change

No, that's fully intentional, although it should be removed. So, what
this does, is that it adds DEP_LIBS (plus unrelated stuff) as a
dependency to TESTPROGS, i.e. *.dll becomes a dep for e.g.
dct-test.exe. If you look at the link rule for TESTPROGS, it uses $$
to link in all listed dependencies. Now note that you shouldn't link
against .dll files on Windows, you generate .lib files that contain
the linker information to link against .dll files (but .lib without
.dll is a static build, sorry if this is confusing). The .lib files
are already part of FFEXTRALIBS. So what you end up getting is a
linker that tries to link against .dll files directly (link
-libpath:libavcodec -out:dct-test.exe dct-test.o [other object files]
libavcodec/libavcodec.dll [other dlls] libavcodec.lib [other libs]
[external lib dependencies]), which doesn't make any sense, and
obviously fails. This change removes the dependency on the .dll as a
listed dep, and thus instead it just links like link
-libpath:libavcodec -out:dct-test.exe dct-test.o [other object files]
libavcodec.lib [other libs] [external lib dependencies], which works
as expected.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] Shared library support for MSVC build.

2012-10-03 Thread Ronald S. Bultje
Hi,

On Wed, Oct 3, 2012 at 7:44 AM, Måns Rullgård m...@mansr.com wrote:
 Ronald S. Bultje rsbul...@gmail.com writes:
 These AVCODEC_SYMBOL things will be used in public headers, thus the
 definition needs t exist in a public header. Since I need to use
 __declspec(dllimport) only for .dll (shared) builds, not for .lib
 (static) builds, I need to know in a public header whether I'm a
 shared lib or not - hence AV_HAVE_SHARED_LIBS (I added that to
 configure, based on CONFIG_SHARED).

 That reasoning is flawed.  What if you have both static and shared?

What is CONFIG_SHARED then? Schroedinger's cat?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] Shared library support for MSVC build.

2012-10-03 Thread Ronald S. Bultje
Hi,

On Wed, Oct 3, 2012 at 8:27 AM, Diego Biurrun di...@biurrun.de wrote:
 On Wed, Oct 03, 2012 at 07:37:29AM -0700, Ronald S. Bultje wrote:
  --- a/libavcodec/golomb.h
  +++ b/libavcodec/golomb.h
  @@ -33,13 +33,14 @@
 
  -extern const uint8_t ff_golomb_vlc_len[512];
  -extern const uint8_t ff_ue_golomb_vlc_code[512];
  -extern const  int8_t ff_se_golomb_vlc_code[512];
  -extern const uint8_t ff_ue_golomb_len[256];
  +extern AVCODEC_SYMBOL const uint8_t ff_golomb_vlc_len[512];
  +extern AVCODEC_SYMBOL const uint8_t ff_ue_golomb_vlc_code[512];
  +extern AVCODEC_SYMBOL const  int8_t ff_se_golomb_vlc_code[512];
  +extern AVCODEC_SYMBOL const uint8_t ff_ue_golomb_len[256];
 
  I somehow doubt marking ff_-prefixed symbols is necessary.
[..]
 and the golomb symbols are used by the golomb fate test. Maybe we can
 link in a bunch of other .c files from libavcodec instead of linking
 to avcodec.dll instead (I agree that it's ugly). Note that gcc/mingw
 shared lib build (see symbol list in libavcodec.v) also exports those
 symbols.

 No:

 biurrun@passion:/opt/biurrun/libav $ git grep ff_.e_golomb_
 libavcodec/golomb.c:const uint8_t ff_ue_golomb_vlc_code[512]={
 libavcodec/golomb.c:const int8_t ff_se_golomb_vlc_code[512]={
 libavcodec/golomb.c:const uint8_t ff_ue_golomb_len[256]={
 libavcodec/golomb.h:extern const uint8_t ff_ue_golomb_vlc_code[512];
 libavcodec/golomb.h:extern const  int8_t ff_se_golomb_vlc_code[512];
 libavcodec/golomb.h:extern const uint8_t ff_ue_golomb_len[256];
 libavcodec/golomb.h:return ff_ue_golomb_vlc_code[buf];
 libavcodec/golomb.h:return ff_ue_golomb_vlc_code[buf];
 libavcodec/golomb.h:return ff_se_golomb_vlc_code[buf];
 libavcodec/golomb.h:put_bits(pb, ff_ue_golomb_len[i], i+1);
 biurrun@passion:/opt/biurrun/libav $ git grep ff_golomb_vlc
 libavcodec/golomb.c:const uint8_t ff_golomb_vlc_len[512]={
 libavcodec/golomb.h:extern const uint8_t ff_golomb_vlc_len[512];
 libavcodec/golomb.h:LAST_SKIP_BITS(re, gb, ff_golomb_vlc_len[buf]);
 libavcodec/golomb.h:LAST_SKIP_BITS(re, gb, ff_golomb_vlc_len[buf]);
 libavcodec/golomb.h:LAST_SKIP_BITS(re, gb, ff_golomb_vlc_len[buf]);

dhcp-172-31-81-183:libav rbultje$ grep golomb libavcodec/golomb-test.c
#include golomb.h
set_ue_golomb(pb, i);
j = get_ue_golomb(gb);
[..]
set_ue_golomb(pb, EXTEND(i));
j = get_ue_golomb_long(gb);
[..]
set_se_golomb(pb, i - COUNT / 2);

in golomb.h:
static inline int get_ue_golomb(GetBitContext *gb){
unsigned int buf;
int log;

OPEN_READER(re, gb);
UPDATE_CACHE(re, gb);
buf=GET_CACHE(re, gb);

if(buf = (127)){
buf = 32 - 9;
LAST_SKIP_BITS(re, gb, ff_golomb_vlc_len[buf]);
CLOSE_READER(re, gb);

return ff_ue_golomb_vlc_code[buf];
}else{
[..]

Maybe Libav fixed the raw pix fmt stuff, but the point is that we
unfortunately do use internal test symbols in various places.

  +#ifndef AVCODEC_SYMBOLS_H
  +#define AVCODEC_SYMBOLS_H
  +
  +#include libavutil/avconfig.h
  +
  +#if AV_HAVE_SHARED_LIBS  defined(_MSC_VER)  
  !defined(COMPILING_avcodec)
 
  What's AV_HAVE_SHARED_LIBS?  I'm not aware of it anywhere.
  What do you need avconfig.h for?

 These AVCODEC_SYMBOL things will be used in public headers, thus the
 definition needs t exist in a public header. Since I need to use
 __declspec(dllimport) only for .dll (shared) builds, not for .lib
 (static) builds, I need to know in a public header whether I'm a
 shared lib or not - hence AV_HAVE_SHARED_LIBS (I added that to
 configure, based on CONFIG_SHARED).

 The configure change you speak about is missing then; nothing adds
 that define to avconfig.h ...

You're right, so basically I added shared_libs to HAVE_LIST_PUB and
set its value to $shared just before printing it into avconfig.h, so
it always has the same value as CONFIG_SHARED. Sorry for the
confusion.

 So, what this does, is that it adds DEP_LIBS (plus unrelated stuff)
 as a dependency to TESTPROGS, i.e. *.dll becomes a dep for e.g.
 dct-test.exe. If you look at the link rule for TESTPROGS, it uses
 $$ to link in all listed dependencies.

 No, that rule uses $$^, which has different semantics.

 Now note that you shouldn't link
 against .dll files on Windows, you generate .lib files that contain
 the linker information to link against .dll files (but .lib without
 .dll is a static build, sorry if this is confusing). The .lib files
 are already part of FFEXTRALIBS. So what you end up getting is a
 linker that tries to link against .dll files directly (link
 -libpath:libavcodec -out:dct-test.exe dct-test.o [other object files]
 libavcodec/libavcodec.dll [other dlls] libavcodec.lib [other libs]
 [external lib dependencies]), which doesn't make any sense, and
 obviously fails. This change removes the dependency on the .dll as a
 listed dep, and thus instead it just links like link
 -libpath:libavcodec -out:dct-test.exe dct-test.o [other object files]
 libavcodec.lib [other libs

Re: [libav-devel] [PATCH] Shared library support for MSVC build.

2012-10-03 Thread Ronald S. Bultje
Hi,

On Wed, Oct 3, 2012 at 9:07 AM, Diego Biurrun di...@biurrun.de wrote:
 On Wed, Oct 03, 2012 at 08:49:30AM -0700, Ronald S. Bultje wrote:
 On Wed, Oct 3, 2012 at 8:43 AM, Diego Biurrun di...@biurrun.de wrote:
  On Wed, Oct 03, 2012 at 08:31:43AM -0700, Ronald S. Bultje wrote:
  On Wed, Oct 3, 2012 at 7:44 AM, Måns Rullgård m...@mansr.com wrote:
   Ronald S. Bultje rsbul...@gmail.com writes:
   These AVCODEC_SYMBOL things will be used in public headers, thus the
   definition needs t exist in a public header. Since I need to use
   __declspec(dllimport) only for .dll (shared) builds, not for .lib
   (static) builds, I need to know in a public header whether I'm a
   shared lib or not - hence AV_HAVE_SHARED_LIBS (I added that to
   configure, based on CONFIG_SHARED).
  
   That reasoning is flawed.  What if you have both static and shared?
 
  What is CONFIG_SHARED then? Schroedinger's cat?
 
  An indication that shared libraries shall be built.  It is, however,
  independent of CONFIG_STATIC; hence you cannot use it as an indication
  that only shared libraries are built.
 
  It's not clear to me if you need an indication whether shared libs are
  available or whether *only* shared libs are available.

 Like Derek said, you can't do both on MSVC. The object files have
 different semantics (see __declspec(dllimport)) and are not binary
 compatible (most notably, a __declspec(dllimport) int variable; isn't
 actually an integer; it's a pointer to the integer in the dll where it
 actually resides).

 Does configure fail if you enable both at the same time?  I don't think
 so; that should be addressed.

I can certainly add that.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] Shared library support for MSVC build.

2012-10-03 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

This patch adds support for .dll shared library generation using MSVC's
native build tools. To properly load non-function symbols from DLL files,
we prefix them with __declspec(dllimport) on MSVC for shared library
builds when accessed from outside the .dll. For optimal performance, we
should do this for functions also, but I'm too lazy to do that right now
and it works without.

Caveats: 1) you need a new version of c99-to-c89, which will generate
the .def files which contains the symbols to be exported. The interface
for this isn't stable yet, I'd like to be able to specify symbols on the
commandline with wildcard support, e.g. c99wrap.exe -symbols 'avpriv_*'
link.exe -dll ..., but for now the hardcoded list does its job. Martin
suggested I use the existing .v files for this.
2) a lot of ff_ symbols, and dsputil_init, are accessed from outside
the library in which they exist. These are bugs and should be fixed (i.e.
renamed).
---
 configure   |   14 ++
 libavcodec/ac3tab.h |3 ++-
 libavcodec/dca.h|3 ++-
 libavcodec/dnxhddata.h  |3 ++-
 libavcodec/golomb.h |9 +
 libavcodec/mjpeg.h  |   16 
 libavcodec/mpeg4audio.h |3 ++-
 libavcodec/mpegaudiodata.h  |6 --
 libavcodec/raw.h|3 ++-
 libavcodec/snowenc.c|4 ++--
 libavcodec/symbols.h|   32 
 libavformat/gxf.c   |3 ++-
 libavutil/aes.h |4 +++-
 libavutil/common.h  |5 +++--
 libavutil/intmath.h |3 ++-
 libavutil/md5.h |4 +++-
 libavutil/pca.c |   18 ++
 libavutil/pixdesc.h |3 ++-
 libavutil/sha.h |4 +++-
 libavutil/symbols.h |   32 
 libavutil/tree.h|4 +++-
 library.mak |3 ++-
 libswresample/swresample-test.c |2 +-
 tools/ffeval.c  |5 +
 24 files changed, 142 insertions(+), 44 deletions(-)
 create mode 100644 libavcodec/symbols.h
 create mode 100644 libavutil/symbols.h

diff --git a/configure b/configure
index b7a8b77..3748ffb 100755
--- a/configure
+++ b/configure
@@ -1247,6 +1247,7 @@ HAVE_LIST_PUB='
 bigendian
 fast_unaligned
 incompatible_fork_abi
+shared_libs
 '
 
 HAVE_LIST=
@@ -2616,7 +2617,7 @@ probe_cc(){
 _cc_o='-Fo $@'
 _cc_e='-P -Fi $@'
 _flags_filter=msvc_flags
-_ld_lib='lib%.a'
+_ld_lib='%.lib'
 _ld_path='-libpath:'
 _flags='-nologo'
 _cflags='-D_USE_MATH_DEFINES -Dinline=__inline -FIstdlib.h 
-Dstrtoll=_strtoi64'
@@ -3078,11 +3079,13 @@ case $target_os in
 shlibdir_default=$bindir_default
 SLIBPREF=
 SLIBSUF=.dll
+LIBPREF=
+LIBSUF=.lib
 SLIBNAME_WITH_VERSION='$(SLIBPREF)$(FULLNAME)-$(LIBVERSION)$(SLIBSUF)'
 SLIBNAME_WITH_MAJOR='$(SLIBPREF)$(FULLNAME)-$(LIBMAJOR)$(SLIBSUF)'
 dlltool=${cross_prefix}dlltool
 if check_cmd lib.exe -list; then
-SLIB_EXTRA_CMD='-lib.exe /machine:$(LIBTARGET) 
/def:$$(@:$(SLIBSUF)=.def) /out:$(SUBDIR)$(SLIBNAME:$(SLIBSUF)=.lib)'
+SLIB_EXTRA_CMD='-lib.exe -machine:$(LIBTARGET) 
-def:$$(@:$(SLIBSUF)=.def) -out:$(SUBDIR)$(SLIBNAME:$(SLIBSUF)=.lib)'
 if enabled x86_64; then
 LIBTARGET=x64
 fi
@@ -3092,8 +3095,8 @@ case $target_os in
 SLIB_INSTALL_NAME='$(SLIBNAME_WITH_MAJOR)'
 SLIB_INSTALL_LINKS=
 SLIB_INSTALL_EXTRA_SHLIB='$(SLIBNAME:$(SLIBSUF)=.lib)'
-SLIB_INSTALL_EXTRA_LIB='lib$(SLIBNAME:$(SLIBSUF)=.dll.a) 
$(SLIBNAME_WITH_MAJOR:$(SLIBSUF)=.def)'
-SHFLAGS='-shared -Wl,--output-def,$$(@:$(SLIBSUF)=.def) 
-Wl,--out-implib,$(SUBDIR)lib$(SLIBNAME:$(SLIBSUF)=.dll.a) 
-Wl,--enable-runtime-pseudo-reloc -Wl,--enable-auto-image-base'
+SLIB_INSTALL_EXTRA_LIB='lib$(SLIBNAME:$(SLIBSUF)=.dll) 
$(SLIBNAME_WITH_MAJOR:$(SLIBSUF)=.def)'
+SHFLAGS='-dll -def:$$(@:$(SLIBSUF)=.def) 
-implib:$(SUBDIR)lib$(SLIBNAME:$(SLIBSUF)=.dll.a)'
 objformat=win32
 ranlib=:
 enable dos_paths
@@ -3912,6 +3915,8 @@ elif enabled pathscale; then
 add_cflags -fstrict-overflow -OPT:wrap_around_unsafe_opt=OFF
 elif enabled msvc; then
 enabled x86_32  disable aligned_stack
+enabled_all shared static 
+die Cannot build shared and static at the same time with MSVC
 fi
 
 enabled_any $THREADS_LIST   enable threads
@@ -4306,6 +4311,7 @@ test $postproc_version != current  cat  $TMPH EOF
 #define LIBPOSTPROC_VERSION_MICRO $LIBPOSTPROC_VERSION_MICRO
 EOF
 
+shared_libs=$shared
 print_config AV_HAVE_ $TMPH $HAVE_LIST_PUB
 
 echo #endif /* AVUTIL_AVCONFIG_H */  $TMPH
diff --git a/libavcodec/ac3tab.h b/libavcodec/ac3tab.h

[libav-devel] [PATCH] h264: don't touch H264Context-ref_count[] during MB decoding.

2012-10-03 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

The variable is copied to subsequent threads at the same time, so this
may cause wrong ref_count[] values to be copied to subsequent threads.
---
 libavcodec/h264_cabac.c |   41 -
 libavcodec/h264_cavlc.c |   33 +
 2 files changed, 29 insertions(+), 45 deletions(-)

diff --git a/libavcodec/h264_cabac.c b/libavcodec/h264_cabac.c
index 7e8947b..da3a695 100644
--- a/libavcodec/h264_cabac.c
+++ b/libavcodec/h264_cabac.c
@@ -2003,11 +2003,6 @@ decode_intra_mb:
 return 0;
 }
 
-if(MB_MBAFF){
-h-ref_count[0] = 1;
-h-ref_count[1] = 1;
-}
-
 fill_decode_caches(h, mb_type);
 
 if( IS_INTRA( mb_type ) ) {
@@ -2076,10 +2071,11 @@ decode_intra_mb:
 for( i = 0; i  4; i++ ) {
 if(IS_DIRECT(h-sub_mb_type[i])) continue;
 if(IS_DIR(h-sub_mb_type[i], 0, list)){
-if( h-ref_count[list]  1 ){
+int rc = h-ref_count[list]  MB_MBAFF;
+if (rc  1) {
 ref[list][i] = decode_cabac_mb_ref( h, list, 4*i );
-if(ref[list][i] = (unsigned)h-ref_count[list]){
-av_log(s-avctx, AV_LOG_ERROR, Reference %d 
= %d\n, ref[list][i], h-ref_count[list]);
+if (ref[list][i] = (unsigned) rc) {
+av_log(s-avctx, AV_LOG_ERROR, Reference %d 
= %d\n, ref[list][i], rc);
 return -1;
 }
 }else
@@ -2161,11 +2157,11 @@ decode_intra_mb:
 if(IS_16X16(mb_type)){
 for(list=0; listh-list_count; list++){
 if(IS_DIR(mb_type, 0, list)){
-int ref;
-if(h-ref_count[list]  1){
+int ref, rc = h-ref_count[list]  MB_MBAFF;
+if (rc  1){
 ref= decode_cabac_mb_ref(h, list, 0);
-if(ref = (unsigned)h-ref_count[list]){
-av_log(s-avctx, AV_LOG_ERROR, Reference %d = 
%d\n, ref, h-ref_count[list]);
+if (ref = (unsigned) rc) {
+av_log(s-avctx, AV_LOG_ERROR, Reference %d = 
%d\n, ref, rc);
 return -1;
 }
 }else
@@ -2189,11 +2185,11 @@ decode_intra_mb:
 for(list=0; listh-list_count; list++){
 for(i=0; i2; i++){
 if(IS_DIR(mb_type, i, list)){
-int ref;
-if(h-ref_count[list]  1){
+int ref, rc = h-ref_count[list]  MB_MBAFF;
+if (rc  1) {
 ref= decode_cabac_mb_ref( h, list, 8*i );
-if(ref = (unsigned)h-ref_count[list]){
-av_log(s-avctx, AV_LOG_ERROR, Reference 
%d = %d\n, ref, h-ref_count[list]);
+if (ref = rc) {
+av_log(s-avctx, AV_LOG_ERROR, Reference 
%d = %d\n, ref, rc);
 return -1;
 }
 }else
@@ -2224,11 +2220,11 @@ decode_intra_mb:
 for(list=0; listh-list_count; list++){
 for(i=0; i2; i++){
 if(IS_DIR(mb_type, i, list)){ //FIXME optimize
-int ref;
-if(h-ref_count[list]  1){
+int ref, rc = h-ref_count[list]  MB_MBAFF;
+if (rc  1) {
 ref= decode_cabac_mb_ref( h, list, 4*i );
-if(ref = (unsigned)h-ref_count[list]){
-av_log(s-avctx, AV_LOG_ERROR, Reference 
%d = %d\n, ref, h-ref_count[list]);
+if (ref = (unsigned) rc) {
+av_log(s-avctx, AV_LOG_ERROR, Reference 
%d = %d\n, ref, rc);
 return -1;
 }
 }else
@@ -2401,10 +2397,5 @@ decode_intra_mb:
 s-current_picture.f.qscale_table[mb_xy] = s-qscale;
 write_back_non_zero_count(h);
 
-if(MB_MBAFF){
-h-ref_count[0] = 1;
-h-ref_count[1] = 1;
-}
-
 return 0;
 }
diff --git a/libavcodec/h264_cavlc.c b/libavcodec/h264_cavlc.c
index 06e1ba7..cf3ce8f 100644
--- a/libavcodec/h264_cavlc.c
+++ b/libavcodec/h264_cavlc.c
@@ -784,11 +784,6 @@ decode_intra_mb:
 return 0;
 }
 
-if(MB_MBAFF){
-h-ref_count[0] = 1;
-h-ref_count[1] = 1;
-}
-
 fill_decode_neighbors(h, mb_type);
 fill_decode_caches(h, mb_type);
 
@@ -868,7

[libav-devel] [PATCH] h264: don't touch H264Context-ref_count[] during MB decoding.

2012-10-03 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

The variable is copied to subsequent threads at the same time, so this
may cause wrong ref_count[] values to be copied to subsequent threads.
---
 libavcodec/h264_cabac.c |   41 -
 libavcodec/h264_cavlc.c |   33 +
 2 files changed, 29 insertions(+), 45 deletions(-)

diff --git a/libavcodec/h264_cabac.c b/libavcodec/h264_cabac.c
index 7e8947b..97e8128 100644
--- a/libavcodec/h264_cabac.c
+++ b/libavcodec/h264_cabac.c
@@ -2003,11 +2003,6 @@ decode_intra_mb:
 return 0;
 }
 
-if(MB_MBAFF){
-h-ref_count[0] = 1;
-h-ref_count[1] = 1;
-}
-
 fill_decode_caches(h, mb_type);
 
 if( IS_INTRA( mb_type ) ) {
@@ -2076,10 +2071,11 @@ decode_intra_mb:
 for( i = 0; i  4; i++ ) {
 if(IS_DIRECT(h-sub_mb_type[i])) continue;
 if(IS_DIR(h-sub_mb_type[i], 0, list)){
-if( h-ref_count[list]  1 ){
+int rc = h-ref_count[list]  MB_MBAFF;
+if (rc  1) {
 ref[list][i] = decode_cabac_mb_ref( h, list, 4*i );
-if(ref[list][i] = (unsigned)h-ref_count[list]){
-av_log(s-avctx, AV_LOG_ERROR, Reference %d 
= %d\n, ref[list][i], h-ref_count[list]);
+if (ref[list][i] = (unsigned) rc) {
+av_log(s-avctx, AV_LOG_ERROR, Reference %d 
= %d\n, ref[list][i], rc);
 return -1;
 }
 }else
@@ -2161,11 +2157,11 @@ decode_intra_mb:
 if(IS_16X16(mb_type)){
 for(list=0; listh-list_count; list++){
 if(IS_DIR(mb_type, 0, list)){
-int ref;
-if(h-ref_count[list]  1){
+int ref, rc = h-ref_count[list]  MB_MBAFF;
+if (rc  1){
 ref= decode_cabac_mb_ref(h, list, 0);
-if(ref = (unsigned)h-ref_count[list]){
-av_log(s-avctx, AV_LOG_ERROR, Reference %d = 
%d\n, ref, h-ref_count[list]);
+if (ref = (unsigned) rc) {
+av_log(s-avctx, AV_LOG_ERROR, Reference %d = 
%d\n, ref, rc);
 return -1;
 }
 }else
@@ -2189,11 +2185,11 @@ decode_intra_mb:
 for(list=0; listh-list_count; list++){
 for(i=0; i2; i++){
 if(IS_DIR(mb_type, i, list)){
-int ref;
-if(h-ref_count[list]  1){
+int ref, rc = h-ref_count[list]  MB_MBAFF;
+if (rc  1) {
 ref= decode_cabac_mb_ref( h, list, 8*i );
-if(ref = (unsigned)h-ref_count[list]){
-av_log(s-avctx, AV_LOG_ERROR, Reference 
%d = %d\n, ref, h-ref_count[list]);
+if (ref = (unsigned) rc) {
+av_log(s-avctx, AV_LOG_ERROR, Reference 
%d = %d\n, ref, rc);
 return -1;
 }
 }else
@@ -2224,11 +2220,11 @@ decode_intra_mb:
 for(list=0; listh-list_count; list++){
 for(i=0; i2; i++){
 if(IS_DIR(mb_type, i, list)){ //FIXME optimize
-int ref;
-if(h-ref_count[list]  1){
+int ref, rc = h-ref_count[list]  MB_MBAFF;
+if (rc  1) {
 ref= decode_cabac_mb_ref( h, list, 4*i );
-if(ref = (unsigned)h-ref_count[list]){
-av_log(s-avctx, AV_LOG_ERROR, Reference 
%d = %d\n, ref, h-ref_count[list]);
+if (ref = (unsigned) rc) {
+av_log(s-avctx, AV_LOG_ERROR, Reference 
%d = %d\n, ref, rc);
 return -1;
 }
 }else
@@ -2401,10 +2397,5 @@ decode_intra_mb:
 s-current_picture.f.qscale_table[mb_xy] = s-qscale;
 write_back_non_zero_count(h);
 
-if(MB_MBAFF){
-h-ref_count[0] = 1;
-h-ref_count[1] = 1;
-}
-
 return 0;
 }
diff --git a/libavcodec/h264_cavlc.c b/libavcodec/h264_cavlc.c
index 06e1ba7..cf3ce8f 100644
--- a/libavcodec/h264_cavlc.c
+++ b/libavcodec/h264_cavlc.c
@@ -784,11 +784,6 @@ decode_intra_mb:
 return 0;
 }
 
-if(MB_MBAFF){
-h-ref_count[0] = 1;
-h-ref_count[1] = 1;
-}
-
 fill_decode_neighbors(h, mb_type);
 fill_decode_caches(h, mb_type

[libav-devel] [PATCH] h264: don't touch H264Context-ref_count[] during MB decoding.

2012-10-03 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

The variable is copied to subsequent threads at the same time, so this
may cause wrong ref_count[] values to be copied to subsequent threads.

This bug was found using TSAN.
---
 libavcodec/h264_cabac.c |   41 -
 libavcodec/h264_cavlc.c |   33 +
 2 files changed, 29 insertions(+), 45 deletions(-)

diff --git a/libavcodec/h264_cabac.c b/libavcodec/h264_cabac.c
index 7e8947b..97e8128 100644
--- a/libavcodec/h264_cabac.c
+++ b/libavcodec/h264_cabac.c
@@ -2003,11 +2003,6 @@ decode_intra_mb:
 return 0;
 }
 
-if(MB_MBAFF){
-h-ref_count[0] = 1;
-h-ref_count[1] = 1;
-}
-
 fill_decode_caches(h, mb_type);
 
 if( IS_INTRA( mb_type ) ) {
@@ -2076,10 +2071,11 @@ decode_intra_mb:
 for( i = 0; i  4; i++ ) {
 if(IS_DIRECT(h-sub_mb_type[i])) continue;
 if(IS_DIR(h-sub_mb_type[i], 0, list)){
-if( h-ref_count[list]  1 ){
+int rc = h-ref_count[list]  MB_MBAFF;
+if (rc  1) {
 ref[list][i] = decode_cabac_mb_ref( h, list, 4*i );
-if(ref[list][i] = (unsigned)h-ref_count[list]){
-av_log(s-avctx, AV_LOG_ERROR, Reference %d 
= %d\n, ref[list][i], h-ref_count[list]);
+if (ref[list][i] = (unsigned) rc) {
+av_log(s-avctx, AV_LOG_ERROR, Reference %d 
= %d\n, ref[list][i], rc);
 return -1;
 }
 }else
@@ -2161,11 +2157,11 @@ decode_intra_mb:
 if(IS_16X16(mb_type)){
 for(list=0; listh-list_count; list++){
 if(IS_DIR(mb_type, 0, list)){
-int ref;
-if(h-ref_count[list]  1){
+int ref, rc = h-ref_count[list]  MB_MBAFF;
+if (rc  1){
 ref= decode_cabac_mb_ref(h, list, 0);
-if(ref = (unsigned)h-ref_count[list]){
-av_log(s-avctx, AV_LOG_ERROR, Reference %d = 
%d\n, ref, h-ref_count[list]);
+if (ref = (unsigned) rc) {
+av_log(s-avctx, AV_LOG_ERROR, Reference %d = 
%d\n, ref, rc);
 return -1;
 }
 }else
@@ -2189,11 +2185,11 @@ decode_intra_mb:
 for(list=0; listh-list_count; list++){
 for(i=0; i2; i++){
 if(IS_DIR(mb_type, i, list)){
-int ref;
-if(h-ref_count[list]  1){
+int ref, rc = h-ref_count[list]  MB_MBAFF;
+if (rc  1) {
 ref= decode_cabac_mb_ref( h, list, 8*i );
-if(ref = (unsigned)h-ref_count[list]){
-av_log(s-avctx, AV_LOG_ERROR, Reference 
%d = %d\n, ref, h-ref_count[list]);
+if (ref = (unsigned) rc) {
+av_log(s-avctx, AV_LOG_ERROR, Reference 
%d = %d\n, ref, rc);
 return -1;
 }
 }else
@@ -2224,11 +2220,11 @@ decode_intra_mb:
 for(list=0; listh-list_count; list++){
 for(i=0; i2; i++){
 if(IS_DIR(mb_type, i, list)){ //FIXME optimize
-int ref;
-if(h-ref_count[list]  1){
+int ref, rc = h-ref_count[list]  MB_MBAFF;
+if (rc  1) {
 ref= decode_cabac_mb_ref( h, list, 4*i );
-if(ref = (unsigned)h-ref_count[list]){
-av_log(s-avctx, AV_LOG_ERROR, Reference 
%d = %d\n, ref, h-ref_count[list]);
+if (ref = (unsigned) rc) {
+av_log(s-avctx, AV_LOG_ERROR, Reference 
%d = %d\n, ref, rc);
 return -1;
 }
 }else
@@ -2401,10 +2397,5 @@ decode_intra_mb:
 s-current_picture.f.qscale_table[mb_xy] = s-qscale;
 write_back_non_zero_count(h);
 
-if(MB_MBAFF){
-h-ref_count[0] = 1;
-h-ref_count[1] = 1;
-}
-
 return 0;
 }
diff --git a/libavcodec/h264_cavlc.c b/libavcodec/h264_cavlc.c
index 06e1ba7..cf3ce8f 100644
--- a/libavcodec/h264_cavlc.c
+++ b/libavcodec/h264_cavlc.c
@@ -784,11 +784,6 @@ decode_intra_mb:
 return 0;
 }
 
-if(MB_MBAFF){
-h-ref_count[0] = 1;
-h-ref_count[1] = 1;
-}
-
 fill_decode_neighbors(h, mb_type

Re: [libav-devel] [PATCH 5/6] x86: af_volume: add SSE2-optimized s16 volume scaling

2012-10-02 Thread Ronald S. Bultje
Hi,

On Tue, Oct 2, 2012 at 10:54 AM, Justin Ruggles
justin.rugg...@gmail.com wrote:
 On 09/29/2012 01:17 AM, Justin Ruggles wrote:
 ---
  libavfilter/af_volume.c  |3 ++
  libavfilter/x86/Makefile |3 ++
  libavfilter/x86/af_volume.asm|   56 
 ++
  libavfilter/x86/af_volume_init.c |   39 ++
  4 files changed, 101 insertions(+), 0 deletions(-)
  create mode 100644 libavfilter/x86/af_volume.asm
  create mode 100644 libavfilter/x86/af_volume_init.c

 ping.

Sorry, lgtm.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] Shared library support for MSVC build.

2012-10-02 Thread Ronald S. Bultje
From: Ronald S. Bultje rsbul...@gmail.com

This patch adds support for .dll shared library generation using MSVC's
native build tools. To properly load non-function symbols from DLL files,
we prefix them with __declspec(dllimport) on MSVC for shared library
builds when accessed from outside the .dll. For optimal performance, we
should do this for functions also, but I'm too lazy to do that right now
and it works without.

Caveats: 1) you need a new version of c99-to-c89, which will generate
the .def files which contains the symbols to be exported. The interface
for this isn't stable yet, I'd like to be able to specify symbols on the
commandline with wildcard support, e.g. c99wrap.exe -symbols 'avpriv_*'
link.exe -dll ..., but for now the hardcoded list does its job. Martin
suggested I use the existing .v files for this.
2) a lot of ff_ symbols, and dsputil_init, are accessed from outside
the library in which they exist. These are bugs and should be fixed (i.e.
renamed).
---
 configure  |   10 ++
 libavcodec/ac3tab.h|3 ++-
 libavcodec/dca.h   |3 ++-
 libavcodec/dnxhddata.h |3 ++-
 libavcodec/golomb.h|9 +
 libavcodec/mjpeg.h |   16 
 libavcodec/mpeg4audio.h|3 ++-
 libavcodec/mpegaudiodata.h |6 --
 libavcodec/raw.h   |3 ++-
 libavcodec/symbols.h   |   32 
 libavformat/gxf.c  |3 ++-
 libavutil/aes.h|4 +++-
 libavutil/common.h |5 +++--
 libavutil/intmath.h|3 ++-
 libavutil/md5.h|4 +++-
 libavutil/pixdesc.h|3 ++-
 libavutil/sha.h|4 +++-
 libavutil/symbols.h|   32 
 libavutil/tree.h   |4 +++-
 library.mak|3 ++-
 20 files changed, 120 insertions(+), 33 deletions(-)
 create mode 100644 libavcodec/symbols.h
 create mode 100644 libavutil/symbols.h

diff --git a/configure b/configure
index b7a8b77..2828956 100755
--- a/configure
+++ b/configure
@@ -2616,7 +2616,7 @@ probe_cc(){
 _cc_o='-Fo $@'
 _cc_e='-P -Fi $@'
 _flags_filter=msvc_flags
-_ld_lib='lib%.a'
+_ld_lib='%.lib'
 _ld_path='-libpath:'
 _flags='-nologo'
 _cflags='-D_USE_MATH_DEFINES -Dinline=__inline -FIstdlib.h 
-Dstrtoll=_strtoi64'
@@ -3078,11 +3078,13 @@ case $target_os in
 shlibdir_default=$bindir_default
 SLIBPREF=
 SLIBSUF=.dll
+   LIBPREF=
+   LIBSUF=.lib
 SLIBNAME_WITH_VERSION='$(SLIBPREF)$(FULLNAME)-$(LIBVERSION)$(SLIBSUF)'
 SLIBNAME_WITH_MAJOR='$(SLIBPREF)$(FULLNAME)-$(LIBMAJOR)$(SLIBSUF)'
 dlltool=${cross_prefix}dlltool
 if check_cmd lib.exe -list; then
-SLIB_EXTRA_CMD='-lib.exe /machine:$(LIBTARGET) 
/def:$$(@:$(SLIBSUF)=.def) /out:$(SUBDIR)$(SLIBNAME:$(SLIBSUF)=.lib)'
+SLIB_EXTRA_CMD='-lib.exe -machine:$(LIBTARGET) 
-def:$$(@:$(SLIBSUF)=.def) -out:$(SUBDIR)$(SLIBNAME:$(SLIBSUF)=.lib)'
 if enabled x86_64; then
 LIBTARGET=x64
 fi
@@ -3092,8 +3094,8 @@ case $target_os in
 SLIB_INSTALL_NAME='$(SLIBNAME_WITH_MAJOR)'
 SLIB_INSTALL_LINKS=
 SLIB_INSTALL_EXTRA_SHLIB='$(SLIBNAME:$(SLIBSUF)=.lib)'
-SLIB_INSTALL_EXTRA_LIB='lib$(SLIBNAME:$(SLIBSUF)=.dll.a) 
$(SLIBNAME_WITH_MAJOR:$(SLIBSUF)=.def)'
-SHFLAGS='-shared -Wl,--output-def,$$(@:$(SLIBSUF)=.def) 
-Wl,--out-implib,$(SUBDIR)lib$(SLIBNAME:$(SLIBSUF)=.dll.a) 
-Wl,--enable-runtime-pseudo-reloc -Wl,--enable-auto-image-base'
+SLIB_INSTALL_EXTRA_LIB='lib$(SLIBNAME:$(SLIBSUF)=.dll) 
$(SLIBNAME_WITH_MAJOR:$(SLIBSUF)=.def)'
+SHFLAGS='-dll -def:$$(@:$(SLIBSUF)=.def) 
-implib:$(SUBDIR)lib$(SLIBNAME:$(SLIBSUF)=.dll.a)'
 objformat=win32
 ranlib=:
 enable dos_paths
diff --git a/libavcodec/ac3tab.h b/libavcodec/ac3tab.h
index d6e0eed..0063c8d 100644
--- a/libavcodec/ac3tab.h
+++ b/libavcodec/ac3tab.h
@@ -24,6 +24,7 @@
 
 #include libavutil/common.h
 #include ac3.h
+#include symbols.h
 
 #if CONFIG_HARDCODED_TABLES
 #   define HCONST const
@@ -33,7 +34,7 @@
 
 extern const uint16_t ff_ac3_frame_size_tab[38][3];
 extern const uint8_t  ff_ac3_channels_tab[8];
-extern const uint16_t avpriv_ac3_channel_layout_tab[8];
+extern AVCODEC_SYMBOL const uint16_t avpriv_ac3_channel_layout_tab[8];
 extern const uint8_t  ff_ac3_enc_channel_map[8][2][6];
 extern const uint8_t  ff_ac3_dec_channel_map[8][2][6];
 extern const uint16_t ff_ac3_sample_rate_tab[3];
diff --git a/libavcodec/dca.h b/libavcodec/dca.h
index 4d2a829..e754056 100644
--- a/libavcodec/dca.h
+++ b/libavcodec/dca.h
@@ -26,6 +26,7 @@
 #define AVCODEC_DCA_H
 
 #include stdint.h
+#include symbols.h
 
 /** DCA syncwords, also used for bitstream type detection */
 #define DCA_MARKER_RAW_BE 0x7FFE8001
@@ -36,6 +37,6 @@
 /** DCA-HD specific block starts

Re: [libav-devel] Explicit symbol visibility Was: [PATCH] Shared library support for MSVC build.

2012-10-02 Thread Ronald S. Bultje
Hi,

On Tue, Oct 2, 2012 at 7:07 PM, Luca Barbato lu_z...@gentoo.org wrote:
 On 10/03/2012 03:47 AM, Diego Elio Pettenò wrote:
 On 02/10/2012 18:40, Luca Barbato wrote:

 The patch was rejected since it was _really_ invasive and after lots of
 back and forth we ended up with pattern matching and code w/out any
 explicit marking.

 Which is actually done at link time instead than at compile time. Now
 going with compile time declaration brings us at least a couple of
 things: a) LTO-enabled compilers should be able then to know what is
 internal and what is exported; b) tools like my missingstatic can rely
 on the hidden flag to know what is supposed to be used or static.

 On the other hand it seems Ronald's patch is for the other side, i.e.
 the importing of the symbols, not the exporting..


 As I said, Ronald approach covers the minimal amount of changes needed
 to achieve a specific need (shared libraries in windows), I'd rather to
 cover everything at once and make it useful for everybody.

 Here my take on the subject, the sed expression to convert the codebase
 is left for when I'm less sleepy and if there is enough traction to go
 that way.

A symbol exported from avcodec may be imported in avformat, I don't
think your patch handles that case. I.e. rather than having av_api,
you'd need avcodec_api, avformat_api, etc., to indicate in which dll a
particular symbol resides.

As for the more generic question, yes I'd like to be able to mark all
symbols (functions as well as tables) as exported or imported, but
foresee quite some resistance so I didn't go that way right now. If
others like that, we can certainly go down that route.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] avutil: add yuva444p10 pixel format

2012-10-01 Thread Ronald S. Bultje
Hi,

On Mon, Oct 1, 2012 at 5:03 AM, Luca Barbato lu_z...@gentoo.org wrote:
 From: Paul B Mahol one...@gmail.com

 Signed-off-by: Luca Barbato lu_z...@gentoo.org
 ---
  libavutil/pixdesc.c   | 26 ++
  libavutil/pixfmt.h|  3 ++-
  libswscale/utils.c|  2 ++
  tests/ref/lavfi/pixdesc   |  2 ++
  tests/ref/lavfi/pixfmts_copy  |  2 ++
  tests/ref/lavfi/pixfmts_null  |  2 ++
  tests/ref/lavfi/pixfmts_scale |  2 ++
  tests/ref/lavfi/pixfmts_vflip |  2 ++
  8 files changed, 40 insertions(+), 1 deletion(-)

 diff --git a/libavutil/pixdesc.c b/libavutil/pixdesc.c
 index d0d5de0..04801a7 100644
 --- a/libavutil/pixdesc.c
 +++ b/libavutil/pixdesc.c
 @@ -553,6 +553,32 @@ const AVPixFmtDescriptor 
 av_pix_fmt_descriptors[PIX_FMT_NB] = {
  },
  .flags = PIX_FMT_PLANAR,
  },
 +[PIX_FMT_YUVA444P10LE] = {
 +.name = yuva444p10le,
 +.nb_components = 4,
 +.log2_chroma_w = 0,
 +.log2_chroma_h = 0,
 +.comp = {
 +{ 0, 1, 1, 0, 9 },/* Y */
 +{ 1, 1, 1, 0, 9 },/* U */
 +{ 2, 1, 1, 0, 9 },/* V */
 +{ 3, 1, 1, 0, 9 },/* A */
 +},
 +.flags = PIX_FMT_PLANAR,
 +},
 +[PIX_FMT_YUVA444P10BE] = {
 +.name = yuva444p10be,
 +.nb_components = 4,
 +.log2_chroma_w = 0,
 +.log2_chroma_h = 0,
 +.comp = {
 +{ 0, 1, 1, 0, 9 },/* Y */
 +{ 1, 1, 1, 0, 9 },/* U */
 +{ 2, 1, 1, 0, 9 },/* V */
 +{ 3, 1, 1, 0, 9 },/* A */
 +},
 +.flags = PIX_FMT_BE | PIX_FMT_PLANAR,
 +},
  [PIX_FMT_VDPAU_H264] = {
  .name = vdpau_h264,
  .log2_chroma_w = 1,
 diff --git a/libavutil/pixfmt.h b/libavutil/pixfmt.h
 index f3c504a..d3f10ee 100644
 --- a/libavutil/pixfmt.h
 +++ b/libavutil/pixfmt.h
 @@ -159,7 +159,8 @@ enum PixelFormat {
  PIX_FMT_GBRP16LE,  /// planar GBR 4:4:4 48bpp, little endian
  PIX_FMT_YUVA422P,  /// planar YUV 4:2:2 24bpp, (1 Cr  Cb sample per 
 2x1 Y  A samples)
  PIX_FMT_YUVA444P,  /// planar YUV 4:4:4 32bpp, (1 Cr  Cb sample per 
 1x1 Y  A samples)
 -
 +PIX_FMT_YUVA444P10BE, /// planar YUV 4:4:4 40bpp, (1 Cr  Cb sample per 
 1x1 Y  A samples)
 +PIX_FMT_YUVA444P10LE, /// planar YUV 4:4:4 40bpp, (1 Cr  Cb sample per 
 1x1 Y  A samples)
  PIX_FMT_NB,/// number of pixel formats, DO NOT USE THIS if you 
 want to link with shared libav* because the number of formats might differ 
 between versions
  };

 diff --git a/libswscale/utils.c b/libswscale/utils.c
 index 2f78e87..7d9c120 100644
 --- a/libswscale/utils.c
 +++ b/libswscale/utils.c
 @@ -110,6 +110,8 @@ static const FormatEntry format_entries[PIX_FMT_NB] = {
  [PIX_FMT_YUVA420P]= { 1, 1 },
  [PIX_FMT_YUVA422P]= { 1, 1 },
  [PIX_FMT_YUVA444P]= { 1, 1 },
 +[PIX_FMT_YUVA444P10BE]= { 1, 1 },
 +[PIX_FMT_YUVA444P10LE]= { 1, 1 },
  [PIX_FMT_RGB48BE] = { 1, 1 },
  [PIX_FMT_RGB48LE] = { 1, 1 },
  [PIX_FMT_RGB565BE]= { 1, 1 },


Why not add support for yuva{420,422}p10{be,le} also? And while at it,
Why not add yuva{420,422,444}p9{be.le} at the same time?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


<    1   2   3   4   5   6   7   8   9   10   >