Re: [libav-devel] [PATCH] mkv: add support for the VP9 tag

2012-12-18 Thread Ronald S. Bultje
Hi,

On Mon, Dec 17, 2012 at 8:16 PM, Luca Barbato  wrote:
> On 12/16/12 2:09 AM, Luca Barbato wrote:
>>
>> From: Tom Finegan 
>
>
>> +AV_CODEC_ID_VP9,
>>
>
> Ping

I had already said OK - if you add a big phat warning saying this is
experimental and a non-stable bitstream (e.g. under the experimental
flag for muxing purposes).

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: increase dist_scale_factor for up to 32 references

2012-12-18 Thread Ronald S. Bultje
Hi,

On Mon, Dec 17, 2012 at 10:55 PM, Janne Grunau  wrote:
> On 2012-12-17 18:24:37 -0800, Ronald S. Bultje wrote:
>> Hi,
>> On Dec 17, 2012 1:02 PM, "Janne Grunau"  wrote:
>> >
>> > Compute dist_scale_factor_field only for MBAFF since that is the only
>> > case in which it is used.
>>
>> The patch also increases an array size - why?
>
> to handle up to 32 references for field pictures. Alternatively both
> arrays can probably be merged to the usual 16 + 32 size.

Check the code:
+for (i = 0; i < 2 * h->ref_count[0]; i++)
+h->dist_scale_factor_field[field][i^field] =
+get_scale_factor(h, poc, poc1, i+16);
+}
+
+for (i = 0; i < h->ref_count[0]; i++){
 h->dist_scale_factor[i] = get_scale_factor(h, poc, poc1, i);
 }

For arrays of size:

-int dist_scale_factor[16];
+int dist_scale_factor[32];
 int dist_scale_factor_field[2][32];

If i for setting dist_scale_factor_field is [0, ref_count * 2] and
that's [0, 32], them ref_count is 16, thus the second loop i [0,
ref_count] is [0, 16], thus dist_scale_factor[] only needs to be of
size 16.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 13/15] h264: break after error if enough NALs were seen in decode_nal_units

2012-12-17 Thread Ronald S. Bultje
Hi,

On Dec 12, 2012 12:31 PM, "Janne Grunau"  wrote:
>
> ---
>  libavcodec/h264.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)
>
> diff --git a/libavcodec/h264.c b/libavcodec/h264.c
> index 8847f1f..35cc800 100644
> --- a/libavcodec/h264.c
> +++ b/libavcodec/h264.c
> @@ -4070,17 +4070,16 @@ again:
> hx->nal_unit_type, bit_length);
>  }
>
> -if (context_count == h->max_contexts) {
> -execute_decode_slices(h, context_count);
> +if (!err && context_count == h->max_contexts) {
> +err = execute_decode_slices(h, context_count);
>  context_count = 0;
>  }

Does this completely break ER?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 3/3] h264: prevent decoding of slice NALs in extradata

2012-12-17 Thread Ronald S. Bultje
Hi,

On Dec 17, 2012 1:41 PM, "Janne Grunau"  wrote:
>
> It is not posible to call get_buffer during frame-mt codec
> initialization. Libavformat might pass huge amounts of data as
> extradata after parsing broken files. The 'extradata' for the fuzzed
> sample sample_varPAR_s5374_r001-02.avi is 2.8M large and contains
> multiple slices.

OK, maybe log an error if you like.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/3] h264: clear picture copies when unreferencing them

2012-12-17 Thread Ronald S. Bultje
Hi,

On Dec 17, 2012 1:41 PM, "Janne Grunau"  wrote:
>
> Copy a neighbouring frame/field from the list as error resilience
> measure since the decoder assumes frame data pointers of known reference
> to be valid.
>
> Prevents stale references in ref_list in the fuzzed sample
> bipbop234.ts_s20118 caused by not refreshing the ref lists when required
> due to slice decoding errors.

So maybe I'm being an asshole - tell me if I am - but why would we unref an
image if we're still referencing it? That seems broken. Shouldn't the code
that unrefs clear and/or check this?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: increase dist_scale_factor for up to 32 references

2012-12-17 Thread Ronald S. Bultje
Hi,
On Dec 17, 2012 1:02 PM, "Janne Grunau"  wrote:
>
> Compute dist_scale_factor_field only for MBAFF since that is the only
> case in which it is used.

The patch also increases an array size - why?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] Add a VideoDSPContext with core video DSP functions.

2012-12-16 Thread Ronald S. Bultje
Hi,

On Sun, Dec 16, 2012 at 9:33 AM, Luca Barbato  wrote:
> +void ff_videodsp_init_x86(VideoDSPContext *ctx, int bpc)
[..]
> +{
> +#if HAVE_YASM
> +int mm_flags = av_get_cpu_flags();
> +
> +#if ARCH_X86_32
> +if (bpc <= 8 && mm_flags & AV_CPU_FLAG_MMX) {
> +c->emulated_edge_mc = emulated_edge_mc_mmx;

s/c/ctx/.

> +}
> +if (mm_flags & AV_CPU_FLAG_AMD3DNOW) {

AV_CPU_FLAG_3DNOW.

Then it compiles on x86-32 also. I haven't tested on ppc or arm.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] x86inc: remove wrong assert in X86_32 PROLOGUE macro

2012-12-16 Thread Ronald S. Bultje
Hi,

On Sun, Dec 16, 2012 at 1:49 PM, Janne Grunau  wrote:
> On 2012-12-16 12:36:56 -0800, Ronald S. Bultje wrote:
>>
>> On Sun, Dec 16, 2012 at 5:52 AM, Janne Grunau  wrote:
>> >
>> > Nasm also fails on '%1 %+ SUFFIX' with empty SUFFIX in cpuid.asm. It was
>> > previously working since it was always followed by ', %2'. Easiest
>> > solution I've found was adding parameters to cpu_cpuid_test in
>> > cpuid.asm.
>>
>> Does it work if you add a comma behind it (but no arguments)? Or does
>> cglobal_internal then think %0 == 2?
>
> It already failes earlier with:
> libavutil/x86/cpuid.asm:70: error: (cglobal_internal:1) `%ifndef'
> expects macro identifiers
>
> presumeably failing on cglobaled_cpu_cpuid_test,
>
> Adding a unused parameter to both cglobal_internal calls works ugly
> as it is.

The old one used this:

diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index 483ae7b..564ba50 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc.asm
@@ -614,12 +614,8 @@ DECLARE_ARG 7, 8, 9, 10, 11, 12, 13, 14
 ; Applies any symbol mangling needed for C linkage, and sets up a define such t
 ; subsequent uses of the function name automatically refer to the mangled versi
 ; Appends cpuflags to the function name if cpuflags has been specified.
-%macro cglobal 1-2+ ; name, [PROLOGUE args]
-%if %0 == 1
-cglobal_internal %1 %+ SUFFIX
-%else
+%macro cglobal 1-2+ "" ; name, [PROLOGUE args]
 cglobal_internal %1 %+ SUFFIX, %2
-%endif
 %endmacro
 %macro cglobal_internal 1-2+
 %ifndef cglobaled_%1
@@ -640,7 +636,7 @@ DECLARE_ARG 7, 8, 9, 10, 11, 12, 13, 14
 %assign stack_offset 0
 %assign stack_size 0
 %assign stack_size_padded 0
-%if %0 > 1
+%ifnidn %2, ""
 PROLOGUE %2
 %endif
 %endmacro

Does that work?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] x86inc: remove wrong assert in X86_32 PROLOGUE macro

2012-12-16 Thread Ronald S. Bultje
Hi,

On Sun, Dec 16, 2012 at 5:52 AM, Janne Grunau  wrote:
> On 2012-12-14 17:12:05 -0800, Ronald S. Bultje wrote:
>> Hi,
>>
>> On Fri, Dec 14, 2012 at 8:56 AM, Janne Grunau  wrote:
>> > On 2012-12-14 08:32:14 -0800, Ronald S. Bultje wrote:
>> >> Hi,
>> >>
>> >> On Dec 14, 2012 3:30 AM, "Janne Grunau"  wrote:
>> >> >
>> >> > Arguments on the stack are handled properly and functions can use more
>> >> > than arguments than the 7 registers available on x86_32.
>> >> > Fixes nasm build, yasm would fail too if it would error out on %error
>> >> > instead of just emitting a warning.
>> >> > ---
>> >> >  libavutil/x86/x86inc.asm | 1 -
>> >> >  1 file changed, 1 deletion(-)
>> >> >
>> >> > diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
>> >> > index 60d05f4..40705e0 100644
>> >> > --- a/libavutil/x86/x86inc.asm
>> >> > +++ b/libavutil/x86/x86inc.asm
>> >> > @@ -554,7 +554,6 @@ DECLARE_ARG 7, 8, 9, 10, 11, 12, 13, 14
>> >> >  %endif
>> >> >  SETUP_STACK_POINTER %4
>> >> >  ASSERT regs_used <= 7
>> >> > -ASSERT regs_used >= num_args
>> >> >  PUSH_IF_USED 3, 4, 5, 6
>> >> >  ALLOC_STACK %4
>> >> >  LOAD_IF_USED 0, 1, 2, 3, 4, 5, 6
>> >>
>> >> Please submit to x264 first.
>> >>
>> >> Where does this trigger? This shouldn't happen, an assert is the right
>> >> thing to do.
>> >
>> > It is? make fate passes on x86_32 instances and yasm is affected too
>> > except that it emits just a warning on %error.
>> >
>> > It triggers on h264_loop_filter_strength_mmxext in h264_deblock.asm
>> > with 9 arguments.
>>
>> But we have code above that clips both num_regs and regs_used to 7, so
>> both should be 7?
>
> if you mean num_args with num_regs then you removed the code to clamp it
> to 7 in 6f40e9f070f7 'x86inc: support stack mem allocation and
> re-alignment in PROLOGUE'. Re-adding that breaks several fate-h264
> tests.
>
> Nasm also fails on '%1 %+ SUFFIX' with empty SUFFIX in cpuid.asm. It was
> previously working since it was always followed by ', %2'. Easiest
> solution I've found was adding parameters to cpu_cpuid_test in
> cpuid.asm.

Does it work if you add a comma behind it (but no arguments)? Or does
cglobal_internal then think %0 == 2?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/2] mkv: support vp9 tag

2012-12-15 Thread Ronald S. Bultje
Hi,

On Sat, Dec 15, 2012 at 2:28 PM, Luca Barbato  wrote:
> From: Tom Finegan 
>
> ---
>
> Maybe is a bit early but since files are cropping already might be good to 
> have
> it.
>
>  libavcodec/avcodec.h   | 1 +
>  libavformat/matroska.c | 1 +
>  2 files changed, 2 insertions(+)
>
> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
> index 29e3701..75655df 100644
> --- a/libavcodec/avcodec.h
> +++ b/libavcodec/avcodec.h
> @@ -238,6 +238,7 @@ enum AVCodecID {
>  AV_CODEC_ID_KGV1,
>  AV_CODEC_ID_YOP,
>  AV_CODEC_ID_VP8,
> +AV_CODEC_ID_VP9,
>  AV_CODEC_ID_PICTOR,

Breaks ABI.

More generally, a big phat warning is missing that says the bitstream
is not stable yet.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] Add a VideoDSPContext with core video DSP functions.

2012-12-15 Thread Ronald S. Bultje
Hi,

On Sat, Dec 15, 2012 at 2:20 PM, Diego Biurrun  wrote:
> On Sat, Dec 15, 2012 at 09:46:02AM -0800, Ronald S. Bultje wrote:
>> --- /dev/null
>> +++ b/libavcodec/arm/videodsp_arm.S
>> @@ -0,0 +1,33 @@
>> +@
>> +@ ARMv4 optimized DSP utils
>> +@ Copyright (c) 2004 AGAWA Koji 
>> +
>> +#include "config.h"
>> +#include "libavutil/arm/asm.S"
>> +
>> +#if HAVE_ARMV5TE_EXTERNAL
>> +function ff_prefetch_arm, export=1
>> +subsr2,  r2,  #1
>> +pld [r0]
>> +add r0,  r0,  r1
>> +bne ff_prefetch_arm
>> +bx  lr
>> +endfunc
>> +#endif
>
> You continue playing fast and loose with copyright.  This was written
> by Mans, not by Koji Agawa.
>
> Many of the instances where you add your name to the top of files in
> this patch are also rather dubious.

Which ones?

I wrote all videodsp initialization code, and I wrote the x86 asm. I
only see the ppc one as being dubious, where I did write the init code
but not the prefetch, so possibly one name is missing, and perhaps the
C code, which was written and touched by a bunch of people (I'm too
lazy to look up who, feel free to fill it in for me). So my name is
not misplaced anywhere, rather 1-2 names of other people may be
missing here or there.

The part where Mans' name is missing is because it's missing in the
original source file also. I can hardly be blamed for that. Will fix
though.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] Add a VideoDSPContext with core video DSP functions.

2012-12-15 Thread Ronald S. Bultje
Hi,

On Sat, Dec 15, 2012 at 9:15 AM, Janne Grunau  wrote:
> Everything quoted without remarks looks ok, I haven't looked at
> the x86 asm. I guess only tested on x86?

Yes. The emulated edge asm is a move so it needs no review. Review of
the prefetch x86 asm would be nice, since it's a move from inline asm
to yasm.

All issues that you pointed out fixed locally.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: handle complementary field pairs in await_references()

2012-12-14 Thread Ronald S. Bultje
Hi,

On Fri, Dec 14, 2012 at 7:02 AM, Janne Grunau  wrote:
> On 2012-12-12 16:16:21 -0800, Ronald S. Bultje wrote:
>>
>> On Wed, Dec 12, 2012 at 3:22 PM, Janne Grunau  wrote:
>> > On 2012-12-12 14:39:34 -0800, Ronald S. Bultje wrote:
>> >> Hi,
>> >>
>> >> On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau  
>> >> wrote:
>> >> > Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid
>> >> > field 2.
>> >> > ---
>> >> >  libavcodec/h264.c | 3 ++-
>> >> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >> >
>> >> > diff --git a/libavcodec/h264.c b/libavcodec/h264.c
>> >> > index 546b046..d5a54e2 100644
>> >> > --- a/libavcodec/h264.c
>> >> > +++ b/libavcodec/h264.c
>> >> > @@ -433,7 +433,8 @@ static void await_references(H264Context *h)
>> >> >  ff_thread_await_progress(&ref_pic->f,
>> >> >   FFMIN((row >> 1), 
>> >> > pic_height - 1),
>> >> >   0);
>> >> > -} else if (FIELD_PICTURE && !ref_field_picture) { // 
>> >> > field referencing one field of a frame
>> >> > +} else if (FIELD_PICTURE &&
>> >> > +   (!ref_field_picture || ref_field > 1)) { // 
>> >> > field referencing one field of a frame or complementary field pair
>> >>
>> >> I don't understand this one. If we're referencing two fields,
>> >> shouldn't ref_field_picture automatically be true?
>> >
>> > it isn't and the code that marks both fields of a complementory field
>> > pair as available doesn't touch field_picture.
>>
>> Right, but isn't this wrong then? I mean, the reference is
>> field-based, so we're going to reference a field. The part where both
>> fields are available doesn't negate the fact that we're only
>> referencing one.
>
> Yes the patch is wrong, I wasn't paying enough attention.
> A complementary ref field pair can't handled in the same way as a frame.
> We reported progress on the individual fields. I don't know how to
> determine which field it references so the patch waits on both. eld
>
>> It seems to me that a change like this could potentially cause use of
>> reference data before the second field of a reference is fully
>> decoded.
>
> Yes, see above.
>
> Janne
>
> ---8<---
> Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid
> field 2.
> ---
>  libavcodec/h264.c | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/libavcodec/h264.c b/libavcodec/h264.c
> index f01ff67..be2a98b 100644
> --- a/libavcodec/h264.c
> +++ b/libavcodec/h264.c
> @@ -438,6 +438,11 @@ static void await_references(H264Context *h)
>   FFMIN(row * 2 + ref_field,
> pic_height - 1),
>   0);
> +} else if (FIELD_PICTURE && ref_field_picture && ref_field 
> == 2) { // field referencing one field of a complementary field pair
> +ff_thread_await_progress(&ref_pic->f,
> + FFMIN(row, pic_height - 1), 1);
> +ff_thread_await_progress(&ref_pic->f,
> + FFMIN(row, pic_height - 1), 0);
>  } else if (FIELD_PICTURE) {

I still don't get it, how does a field reference need to wait for two
fields? The idea of a field reference is that we only need the data of
1 field.

Can I get the sample? I'll try to see what's going on this weekend
perhaps... All this looks weird.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: fix memleak on error during SPS parsing

2012-12-14 Thread Ronald S. Bultje
Hi,

On Fri, Dec 14, 2012 at 12:09 PM, Janne Grunau  wrote:
> Introduced in d7d6efe42b0d.
> ---
>  libavcodec/h264_ps.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264-mt: handle NAL_DPAs before calling ff_thread_finish_setup

2012-12-14 Thread Ronald S. Bultje
Hi,

On Fri, Dec 14, 2012 at 2:33 PM, Janne Grunau  wrote:
> Hi,
> On 2012-12-12 16:20:50 -0800, Ronald S. Bultje wrote:
>> Hi,
>>
>> On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau  
>> wrote:
>> > ---
>> >  libavcodec/h264.c | 13 -
>> >  libavcodec/h264.h |  1 +
>> >  2 files changed, 13 insertions(+), 1 deletion(-)
>>
>> In what situation does this happen?
>
> One situation is data partitioning. NAL_DPA calls decode_slice_header
> and can start a new frame but wasn't counted as needed frame. See
> attached patch. With a little luck it make this patch and patch 12/15
> obsolete.
>
> Janne
>
> ---8<---
>
> Since a NAL_DPA can start a new frame it has to be handled before
> ff_thread_finish_setup is called.
> ---
>  libavcodec/h264.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/libavcodec/h264.c b/libavcodec/h264.c
> index b863c41..75192ac 100644
> --- a/libavcodec/h264.c
> +++ b/libavcodec/h264.c
> @@ -3929,6 +3929,7 @@ static int decode_nal_units(H264Context *h, const 
> uint8_t *buf, int buf_size)
>  case NAL_PPS:
>  nals_needed = nal_index;
>  break;
> +case NAL_DPA:
>  case NAL_IDR_SLICE:
>  case NAL_SLICE:
>  init_get_bits(&hx->s.gb, ptr, bit_length);
> --
> 1.7.12.4

That looks more in line with the intent of the current code, so is OK
as a short-gap. Ideally, at some point we would add parsing for these
kind of sequences, so they are input as 2 separate pkts to the decoder
in 2 separate threads, thus running concurrently.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] x86inc: remove wrong assert in X86_32 PROLOGUE macro

2012-12-14 Thread Ronald S. Bultje
Hi,

On Fri, Dec 14, 2012 at 8:56 AM, Janne Grunau  wrote:
> On 2012-12-14 08:32:14 -0800, Ronald S. Bultje wrote:
>> Hi,
>>
>> On Dec 14, 2012 3:30 AM, "Janne Grunau"  wrote:
>> >
>> > Arguments on the stack are handled properly and functions can use more
>> > than arguments than the 7 registers available on x86_32.
>> > Fixes nasm build, yasm would fail too if it would error out on %error
>> > instead of just emitting a warning.
>> > ---
>> >  libavutil/x86/x86inc.asm | 1 -
>> >  1 file changed, 1 deletion(-)
>> >
>> > diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
>> > index 60d05f4..40705e0 100644
>> > --- a/libavutil/x86/x86inc.asm
>> > +++ b/libavutil/x86/x86inc.asm
>> > @@ -554,7 +554,6 @@ DECLARE_ARG 7, 8, 9, 10, 11, 12, 13, 14
>> >  %endif
>> >  SETUP_STACK_POINTER %4
>> >  ASSERT regs_used <= 7
>> > -ASSERT regs_used >= num_args
>> >  PUSH_IF_USED 3, 4, 5, 6
>> >  ALLOC_STACK %4
>> >  LOAD_IF_USED 0, 1, 2, 3, 4, 5, 6
>>
>> Please submit to x264 first.
>>
>> Where does this trigger? This shouldn't happen, an assert is the right
>> thing to do.
>
> It is? make fate passes on x86_32 instances and yasm is affected too
> except that it emits just a warning on %error.
>
> It triggers on h264_loop_filter_strength_mmxext in h264_deblock.asm
> with 9 arguments.

But we have code above that clips both num_regs and regs_used to 7, so
both should be 7?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/2] vp56: release frames on error

2012-12-14 Thread Ronald S. Bultje
Hi,

On Dec 14, 2012 12:59 AM, "Luca Barbato"  wrote:
>
> Fixes CVE-2012-2783
>
> CC: libav-sta...@libav.org
> ---
>  libavcodec/vp56.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/vp56.c b/libavcodec/vp56.c
> index 6779ffb..5bd0a1a 100644
> --- a/libavcodec/vp56.c
> +++ b/libavcodec/vp56.c
> @@ -514,8 +514,14 @@ int ff_vp56_decode_frame(AVCodecContext *avctx, void
*data, int *got_frame,
>  s->modelp = &s->models[is_alpha];
>
>  res = s->parse_header(s, buf, remaining_buf_size, &golden_frame);
> -if (res < 0)
> +if (res < 0) {
> +int i;
> +for (i = 0; i < 4; i++) {
> +if (s->frames[i].data[0])
> +avctx->release_buffer(avctx, &s->frames[i]);
> +}
>  return res;
> +}
>
>  if (res == VP56_SIZE_CHANGE) {
>  int i;

What about error resilience / concealment?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] x86inc: remove wrong assert in X86_32 PROLOGUE macro

2012-12-14 Thread Ronald S. Bultje
Hi,

On Dec 14, 2012 3:30 AM, "Janne Grunau"  wrote:
>
> Arguments on the stack are handled properly and functions can use more
> than arguments than the 7 registers available on x86_32.
> Fixes nasm build, yasm would fail too if it would error out on %error
> instead of just emitting a warning.
> ---
>  libavutil/x86/x86inc.asm | 1 -
>  1 file changed, 1 deletion(-)
>
> diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
> index 60d05f4..40705e0 100644
> --- a/libavutil/x86/x86inc.asm
> +++ b/libavutil/x86/x86inc.asm
> @@ -554,7 +554,6 @@ DECLARE_ARG 7, 8, 9, 10, 11, 12, 13, 14
>  %endif
>  SETUP_STACK_POINTER %4
>  ASSERT regs_used <= 7
> -ASSERT regs_used >= num_args
>  PUSH_IF_USED 3, 4, 5, 6
>  ALLOC_STACK %4
>  LOAD_IF_USED 0, 1, 2, 3, 4, 5, 6

Please submit to x264 first.

Where does this trigger? This shouldn't happen, an assert is the right
thing to do.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] x86inc: fully concatenate tokens to fix macro expansion for nasm

2012-12-13 Thread Ronald S. Bultje
Hi,

On Thu, Dec 13, 2012 at 1:48 PM, Janne Grunau  wrote:
> Fixes build errors with nasm introduced in 6f40e9f070f7 for stack
> memory alignment. Noticed by BugMaster.
> ---
>  libavutil/x86/x86inc.asm | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 06/15] h264: handle complementary field pairs in await_references()

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 4:29 PM, Janne Grunau  wrote:
> On 2012-12-12 16:16:21 -0800, Ronald S. Bultje wrote:
>> Hi,
>>
>> On Wed, Dec 12, 2012 at 3:22 PM, Janne Grunau  wrote:
>> > On 2012-12-12 14:39:34 -0800, Ronald S. Bultje wrote:
>> >> Hi,
>> >>
>> >> On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau  
>> >> wrote:
>> >> > Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid
>> >> > field 2.
>> >> > ---
>> >> >  libavcodec/h264.c | 3 ++-
>> >> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >> >
>> >> > diff --git a/libavcodec/h264.c b/libavcodec/h264.c
>> >> > index 546b046..d5a54e2 100644
>> >> > --- a/libavcodec/h264.c
>> >> > +++ b/libavcodec/h264.c
>> >> > @@ -433,7 +433,8 @@ static void await_references(H264Context *h)
>> >> >  ff_thread_await_progress(&ref_pic->f,
>> >> >   FFMIN((row >> 1), 
>> >> > pic_height - 1),
>> >> >   0);
>> >> > -} else if (FIELD_PICTURE && !ref_field_picture) { // 
>> >> > field referencing one field of a frame
>> >> > +} else if (FIELD_PICTURE &&
>> >> > +   (!ref_field_picture || ref_field > 1)) { // 
>> >> > field referencing one field of a frame or complementary field pair
>> >>
>> >> I don't understand this one. If we're referencing two fields,
>> >> shouldn't ref_field_picture automatically be true?
>> >
>> > it isn't and the code that marks both fields of a complementory field
>> > pair as available doesn't touch field_picture.
>>
>> Right, but isn't this wrong then? I mean, the reference is
>> field-based, so we're going to reference a field. The part where both
>> fields are available doesn't negate the fact that we're only
>> referencing one.
>>
>> It seems to me that a change like this could potentially cause use of
>> reference data before the second field of a reference is fully
>> decoded.
>
> I need to check
>
>> What does the hang look like?
>
> ref_field = ref_pic->f.reference;
>
> and it waits on ref_field in the next else that breaks when reference is
> PICT_FRAME

that's BOTTOM_FIELD | TOP_FIELD.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 11/15] h264: prevent starting new frames after ff_thread_finish_setup()

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau  wrote:
> ---
>  libavcodec/h264.c | 13 -
>  libavcodec/h264.h |  1 +
>  2 files changed, 13 insertions(+), 1 deletion(-)

In what situation does this happen? We should really make this an
assert() at some point, and track the cases where it does happen back
to ensure we fix it at whatever level. (The current hack to decode two
fields in a packet is a hack and should die; we should parse such
files better and split at the field boundary.)

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 06/15] h264: handle complementary field pairs in await_references()

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 3:22 PM, Janne Grunau  wrote:
> On 2012-12-12 14:39:34 -0800, Ronald S. Bultje wrote:
>> Hi,
>>
>> On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau  
>> wrote:
>> > Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid
>> > field 2.
>> > ---
>> >  libavcodec/h264.c | 3 ++-
>> >  1 file changed, 2 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/libavcodec/h264.c b/libavcodec/h264.c
>> > index 546b046..d5a54e2 100644
>> > --- a/libavcodec/h264.c
>> > +++ b/libavcodec/h264.c
>> > @@ -433,7 +433,8 @@ static void await_references(H264Context *h)
>> >  ff_thread_await_progress(&ref_pic->f,
>> >   FFMIN((row >> 1), pic_height 
>> > - 1),
>> >   0);
>> > -} else if (FIELD_PICTURE && !ref_field_picture) { // 
>> > field referencing one field of a frame
>> > +} else if (FIELD_PICTURE &&
>> > +   (!ref_field_picture || ref_field > 1)) { // 
>> > field referencing one field of a frame or complementary field pair
>>
>> I don't understand this one. If we're referencing two fields,
>> shouldn't ref_field_picture automatically be true?
>
> it isn't and the code that marks both fields of a complementory field
> pair as available doesn't touch field_picture.

Right, but isn't this wrong then? I mean, the reference is
field-based, so we're going to reference a field. The part where both
fields are available doesn't negate the fact that we're only
referencing one.

It seems to me that a change like this could potentially cause use of
reference data before the second field of a reference is fully
decoded.

What does the hang look like?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 04/15] mpegvideo: align width to 32 for scratch buffer sizes

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 3:12 PM, Janne Grunau  wrote:
> On 2012-12-12 14:30:43 -0800, Ronald S. Bultje wrote:
>> Hi,
>>
>> On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau  
>> wrote:
>> > cmdutis.c's alloc_buffer() uses aligned to 32 width plus 2 edges of 32
>> > pixels as linesize. emu_edge_buffer has to work with the same stride.
>> > This makes only a difference for > 8 bit per pixel bit depths since we
>> > always allocate for 16 bit per pixel.
>> >
>> > Fixes fuzzed sample nasa-8s2.ts_s244342.
>> > ---
>> >  libavcodec/mpegvideo.c | 13 -
>> >  1 file changed, 8 insertions(+), 5 deletions(-)
>>
>> What if we implement a custom get_buffer() which returns bigger strides?
>
> it fails horrible for high pixel bit depth like it does with this
> sample. I could add an assumed_linesize into the context and realloc if
> the linesize returned by get_buffer is larger or maybe preallocate a
> buffer here if it' feasible.

Can't you allocate these buffers after you receive the first buffer
from get_buffer() and actually know the linesize?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 06/15] h264: handle complementary field pairs in await_references()

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau  wrote:
> Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid
> field 2.
> ---
>  libavcodec/h264.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/h264.c b/libavcodec/h264.c
> index 546b046..d5a54e2 100644
> --- a/libavcodec/h264.c
> +++ b/libavcodec/h264.c
> @@ -433,7 +433,8 @@ static void await_references(H264Context *h)
>  ff_thread_await_progress(&ref_pic->f,
>   FFMIN((row >> 1), pic_height - 
> 1),
>   0);
> -} else if (FIELD_PICTURE && !ref_field_picture) { // field 
> referencing one field of a frame
> +} else if (FIELD_PICTURE &&
> +   (!ref_field_picture || ref_field > 1)) { // field 
> referencing one field of a frame or complementary field pair

I don't understand this one. If we're referencing two fields,
shouldn't ref_field_picture automatically be true?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 05/15] h264: initialize frame-mt context copies properly

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau  wrote:
> ---
>  libavcodec/h264.c | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/libavcodec/h264.c b/libavcodec/h264.c
> index e8a22f8..546b046 100644
> --- a/libavcodec/h264.c
> +++ b/libavcodec/h264.c
> @@ -1125,6 +1125,8 @@ static int decode_init_thread_copy(AVCodecContext 
> *avctx)
>  memset(h->sps_buffers, 0, sizeof(h->sps_buffers));
>  memset(h->pps_buffers, 0, sizeof(h->pps_buffers));
>
> +h->s.context_initialized = 0;
> +
>  return 0;
>  }

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 04/15] mpegvideo: align width to 32 for scratch buffer sizes

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau  wrote:
> cmdutis.c's alloc_buffer() uses aligned to 32 width plus 2 edges of 32
> pixels as linesize. emu_edge_buffer has to work with the same stride.
> This makes only a difference for > 8 bit per pixel bit depths since we
> always allocate for 16 bit per pixel.
>
> Fixes fuzzed sample nasa-8s2.ts_s244342.
> ---
>  libavcodec/mpegvideo.c | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)

What if we implement a custom get_buffer() which returns bigger strides?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 03/15] mpegvideo: treat delayed pictures as used

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau  wrote:
> This requires to move the avcodec_default_free_buffers() call to
> ff_MPV_common_end() since otherwise delayed pictures would get freed
> during a size change.
> ---
>  libavcodec/h264.h  | 6 --
>  libavcodec/mpegvideo.c | 8 
>  libavcodec/mpegvideo.h | 6 ++
>  3 files changed, 10 insertions(+), 10 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 02/15] lavc: do not overwrite frame parameters of delayed frames

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau  wrote:
> Decoders supporting frame parameter changes can return delayed frames
> with the old parameters.
> ---
>  libavcodec/utils.c | 15 ++-
>  1 file changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/libavcodec/utils.c b/libavcodec/utils.c
> index 1185a35..97fa570 100644
> --- a/libavcodec/utils.c
> +++ b/libavcodec/utils.c
> @@ -1282,11 +1282,16 @@ int attribute_align_arg 
> avcodec_decode_video2(AVCodecContext *avctx, AVFrame *pi
>  else {
>  ret = avctx->codec->decode(avctx, picture, got_picture_ptr,
> avpkt);
> -picture->pkt_dts = avpkt->dts;
> -picture->sample_aspect_ratio = avctx->sample_aspect_ratio;
> -picture->width   = avctx->width;
> -picture->height  = avctx->height;
> -picture->format  = avctx->pix_fmt;
> +picture->pkt_dts = avpkt->dts;
> +
> +/* delayed pictures might have different parameters than
> + * the context returning them */
> +if (!(avctx->codec->capabilities & CODEC_CAP_DELAY)) {
> +picture->sample_aspect_ratio = avctx->sample_aspect_ratio;
> +picture->width   = avctx->width;
> +picture->height  = avctx->height;
> +picture->format  = avctx->pix_fmt;
> +}
>  }

I can live with this. We can clean this further later.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 01/15] h264: set parameters from SPS whenever it changes

2012-12-12 Thread Ronald S. Bultje
Hi,

On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau  wrote:
> Fixes a crash in the fuzzed sample sample_varPAR.avi_s26638 with
> alternating bit depths.
> ---
>  libavcodec/h264.c| 107 
> +--
>  libavcodec/h264.h|   2 +
>  libavcodec/h264_ps.c |   3 ++
>  3 files changed, 65 insertions(+), 47 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-08 Thread Ronald S. Bultje
Hi,

On Sat, Dec 8, 2012 at 4:12 PM, Ronald S. Bultje  wrote:
> From: "Ronald S. Bultje" 
>
> Use this in VP8/H264-8bit loopfilter functions so they can be used if
> there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
> ---
>  libavcodec/x86/h264_deblock.asm |  27 ++
>  libavcodec/x86/h264dsp_init.c   |   4 +-
>  libavcodec/x86/vp8dsp.asm   |  68 ---
>  libavcodec/x86/vp8dsp_init.c|   8 --
>  libavutil/x86/x86inc.asm| 185 
> 
>  5 files changed, 191 insertions(+), 101 deletions(-)

One more fix for invalid stack free'ing if a YMM function on win64
used >6 registers, but no stack. Any more reviews, or can this be
applied?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-08 Thread Ronald S. Bultje
From: "Ronald S. Bultje" 

Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
---
 libavcodec/x86/h264_deblock.asm |  27 ++
 libavcodec/x86/h264dsp_init.c   |   4 +-
 libavcodec/x86/vp8dsp.asm   |  68 ---
 libavcodec/x86/vp8dsp_init.c|   8 --
 libavutil/x86/x86inc.asm| 185 
 5 files changed, 191 insertions(+), 101 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index c124c4d..1f1dbc6 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -398,14 +398,12 @@ DEBLOCK_LUMA
 ;-
 ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
-cglobal deblock_%1_luma_8, 5,5
+cglobal deblock_%1_luma_8, 5,5,8,2*%2
 lea r4, [r1*3]
 dec r2 ; alpha-1
 neg r4
 dec r3 ; beta-1
 add r4, r0 ; pix-3*stride
-%assign pad 2*%2+12-(stack_offset&15)
-SUB esp, pad
 
 movam0, [r4+r1]   ; p1
 movam1, [r4+2*r1] ; p0
@@ -443,22 +441,19 @@ cglobal deblock_%1_luma_8, 5,5
 DEBLOCK_P0_Q0
 mova[r4+2*r1], m1
 mova[r0], m2
-ADD esp, pad
 RET
 
 ;-
 ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
 INIT_MMX cpuname
-cglobal deblock_h_luma_8, 0,5
+cglobal deblock_h_luma_8, 0,5,8,0x60+HAVE_ALIGNED_STACK*12
 movr0, r0mp
 movr3, r1m
 lear4, [r3*3]
 subr0, 4
 lear1, [r0+r4]
-%assign pad 0x78-(stack_offset&15)
-SUBesp, pad
-%define pix_tmp esp+12
+%define pix_tmp esp+12*HAVE_ALIGNED_STACK
 
 ; transpose 6x16 -> tmp space
 TRANSPOSE6x8_MEM  PASS8ROWS(r0, r1, r3, r4), pix_tmp
@@ -500,7 +495,6 @@ cglobal deblock_h_luma_8, 0,5
 movq   m3, [pix_tmp+0x48]
 TRANSPOSE8x4B_STORE  PASS8ROWS(r0, r1, r3, r4)
 
-ADDesp, pad
 RET
 %endmacro ; DEBLOCK_LUMA
 
@@ -631,7 +625,7 @@ DEBLOCK_LUMA v, 16
 %define mpb_0 m14
 %define mpb_1 m15
 %else
-%define spill(x) [esp+16*x+((stack_offset+4)&15)]
+%define spill(x) [esp+16*x]
 %define p2 [r4+r1]
 %define q2 [r0+2*r1]
 %define t4 spill(0)
@@ -646,10 +640,7 @@ DEBLOCK_LUMA v, 16
 ;-
 ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta )
 ;-
-cglobal deblock_%1_luma_intra_8, 4,6,16
-%if ARCH_X86_64 == 0
-sub esp, 0x60
-%endif
+cglobal deblock_%1_luma_intra_8, 4,6,16,ARCH_X86_64*0x50-0x50
 lea r4, [r1*4]
 lea r5, [r1*3] ; 3*stride
 dec r2d; alpha-1
@@ -698,9 +689,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16
 LUMA_INTRA_SWAP_PQ
 LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5]
 .end:
-%if ARCH_X86_64 == 0
-add esp, 0x60
-%endif
 RET
 
 INIT_MMX cpuname
@@ -737,12 +725,10 @@ cglobal deblock_h_luma_intra_8, 4,9
 addrsp, 0x88
 RET
 %else
-cglobal deblock_h_luma_intra_8, 2,4
+cglobal deblock_h_luma_intra_8, 2,4,8,0x80
 lear3,  [r1*3]
 subr0,  4
 lear2,  [r0+r3]
-%assign pad 0x8c-(stack_offset&15)
-SUBrsp, pad
 %define pix_tmp rsp
 
 ; transpose 8x16 -> tmp space
@@ -773,7 +759,6 @@ cglobal deblock_h_luma_intra_8, 2,4
 lear0,  [r0+r1*8]
 lear2,  [r2+r1*8]
 TRANSPOSE8x8_MEM  PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), 
PASS8ROWS(r0, r2, r1, r3)
-ADDrsp, pad
 RET
 %endif ; ARCH_X86_64
 %endmacro ; DEBLOCK_LUMA_INTRA
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index ac231cb..73d4990 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int 
bit_depth,
 c->biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2;
 c->biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2;
 
-#if HAVE_ALIGNED_STACK
 c->h264_v_loop_filter_luma   = 
ff_deblock_v_luma_8_sse2;
 c->h264_h_loop_filter_luma   = 
ff_deblock_h_luma_8_sse2;
 c->h264_v_loop_filter_luma_intra = 
ff_deblock_v_luma_intra_8_sse2;
 c->h264_h_loop_filter_luma_intra = 
ff_deblock_h_luma_intra_8_sse2;
-#endif /* HAVE_ALIGNED_STACK */
 }
 if (EXTERNAL_SSSE3(mm_flags)) {
 c->biweight_h264_pixels_tab[0

Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-08 Thread Ronald S. Bultje
Hi,

On Sat, Dec 8, 2012 at 8:41 AM, MĂĄns RullgĂĄrd  wrote:
> "Ronald S. Bultje"  writes:
>
>> From: "Ronald S. Bultje" 
>>
>> Use this in VP8/H264-8bit loopfilter functions so they can be used if
>> there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
>> ---
>>  libavcodec/x86/h264_deblock.asm |  27 ++
>>  libavcodec/x86/h264dsp_init.c   |   4 +-
>>  libavcodec/x86/vp8dsp.asm   |  68 ---
>>  libavcodec/x86/vp8dsp_init.c|   8 --
>>  libavutil/x86/x86inc.asm| 185 
>> 
>>  5 files changed, 191 insertions(+), 101 deletions(-)
>
> How is this different from the patch you sent yesterday?

It adds a missing %endrep for win64.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-08 Thread Ronald S. Bultje
From: "Ronald S. Bultje" 

Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
---
 libavcodec/x86/h264_deblock.asm |  27 ++
 libavcodec/x86/h264dsp_init.c   |   4 +-
 libavcodec/x86/vp8dsp.asm   |  68 ---
 libavcodec/x86/vp8dsp_init.c|   8 --
 libavutil/x86/x86inc.asm| 185 
 5 files changed, 191 insertions(+), 101 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index c124c4d..1f1dbc6 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -398,14 +398,12 @@ DEBLOCK_LUMA
 ;-
 ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
-cglobal deblock_%1_luma_8, 5,5
+cglobal deblock_%1_luma_8, 5,5,8,2*%2
 lea r4, [r1*3]
 dec r2 ; alpha-1
 neg r4
 dec r3 ; beta-1
 add r4, r0 ; pix-3*stride
-%assign pad 2*%2+12-(stack_offset&15)
-SUB esp, pad
 
 movam0, [r4+r1]   ; p1
 movam1, [r4+2*r1] ; p0
@@ -443,22 +441,19 @@ cglobal deblock_%1_luma_8, 5,5
 DEBLOCK_P0_Q0
 mova[r4+2*r1], m1
 mova[r0], m2
-ADD esp, pad
 RET
 
 ;-
 ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
 INIT_MMX cpuname
-cglobal deblock_h_luma_8, 0,5
+cglobal deblock_h_luma_8, 0,5,8,0x60+HAVE_ALIGNED_STACK*12
 movr0, r0mp
 movr3, r1m
 lear4, [r3*3]
 subr0, 4
 lear1, [r0+r4]
-%assign pad 0x78-(stack_offset&15)
-SUBesp, pad
-%define pix_tmp esp+12
+%define pix_tmp esp+12*HAVE_ALIGNED_STACK
 
 ; transpose 6x16 -> tmp space
 TRANSPOSE6x8_MEM  PASS8ROWS(r0, r1, r3, r4), pix_tmp
@@ -500,7 +495,6 @@ cglobal deblock_h_luma_8, 0,5
 movq   m3, [pix_tmp+0x48]
 TRANSPOSE8x4B_STORE  PASS8ROWS(r0, r1, r3, r4)
 
-ADDesp, pad
 RET
 %endmacro ; DEBLOCK_LUMA
 
@@ -631,7 +625,7 @@ DEBLOCK_LUMA v, 16
 %define mpb_0 m14
 %define mpb_1 m15
 %else
-%define spill(x) [esp+16*x+((stack_offset+4)&15)]
+%define spill(x) [esp+16*x]
 %define p2 [r4+r1]
 %define q2 [r0+2*r1]
 %define t4 spill(0)
@@ -646,10 +640,7 @@ DEBLOCK_LUMA v, 16
 ;-
 ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta )
 ;-
-cglobal deblock_%1_luma_intra_8, 4,6,16
-%if ARCH_X86_64 == 0
-sub esp, 0x60
-%endif
+cglobal deblock_%1_luma_intra_8, 4,6,16,ARCH_X86_64*0x50-0x50
 lea r4, [r1*4]
 lea r5, [r1*3] ; 3*stride
 dec r2d; alpha-1
@@ -698,9 +689,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16
 LUMA_INTRA_SWAP_PQ
 LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5]
 .end:
-%if ARCH_X86_64 == 0
-add esp, 0x60
-%endif
 RET
 
 INIT_MMX cpuname
@@ -737,12 +725,10 @@ cglobal deblock_h_luma_intra_8, 4,9
 addrsp, 0x88
 RET
 %else
-cglobal deblock_h_luma_intra_8, 2,4
+cglobal deblock_h_luma_intra_8, 2,4,8,0x80
 lear3,  [r1*3]
 subr0,  4
 lear2,  [r0+r3]
-%assign pad 0x8c-(stack_offset&15)
-SUBrsp, pad
 %define pix_tmp rsp
 
 ; transpose 8x16 -> tmp space
@@ -773,7 +759,6 @@ cglobal deblock_h_luma_intra_8, 2,4
 lear0,  [r0+r1*8]
 lear2,  [r2+r1*8]
 TRANSPOSE8x8_MEM  PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), 
PASS8ROWS(r0, r2, r1, r3)
-ADDrsp, pad
 RET
 %endif ; ARCH_X86_64
 %endmacro ; DEBLOCK_LUMA_INTRA
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index ac231cb..73d4990 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int 
bit_depth,
 c->biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2;
 c->biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2;
 
-#if HAVE_ALIGNED_STACK
 c->h264_v_loop_filter_luma   = 
ff_deblock_v_luma_8_sse2;
 c->h264_h_loop_filter_luma   = 
ff_deblock_h_luma_8_sse2;
 c->h264_v_loop_filter_luma_intra = 
ff_deblock_v_luma_intra_8_sse2;
 c->h264_h_loop_filter_luma_intra = 
ff_deblock_h_luma_intra_8_sse2;
-#endif /* HAVE_ALIGNED_STACK */
 }
 if (EXTERNAL_SSSE3(mm_flags)) {
 c->biweight_h264_pixels_tab[0

Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-07 Thread Ronald S. Bultje
Hi,

On Fri, Dec 7, 2012 at 2:08 PM, MĂĄns RullgĂĄrd  wrote:
> "Ronald S. Bultje"  writes:
>
>> Hi,
>>
>> On Fri, Dec 7, 2012 at 2:01 PM, MĂĄns RullgĂĄrd  wrote:
>>> "Ronald S. Bultje"  writes:
>>>> On Fri, Dec 7, 2012 at 1:01 PM, MĂĄns RullgĂĄrd  wrote:
>>>>> "Ronald S. Bultje"  writes:
>>>>>
>>>>>> +%if mmsize <= 16 && HAVE_ALIGNED_STACK
>>>>>
>>>>> How much overhead would it be to drop HAVE_ALIGNED_STACK entirely?
>>>>
>>>> Well, for now, we still have a ton of functions that don't use the
>>>> cglobal-method of allocating stack. I only ported h264/vp8 loopfilter,
>>>> nothing else.
>>>>
>>>> But anyway, more generally, it's 4-5 instructions per function. For
>>>> typical functions with an inner loop, that's negligible, but for a
>>>> select small set of functions, it may be significant.
>>>
>>> The remaining functions are ff_h264_idct8_add(4)_10_{sse2,avx},
>>> ff_hadamard8_diff(16)_{sse2,ssse3}, and something in swscale.
>>
>> lavr also.
>
> There are no references to HAVE_ALIGNED_STACK there.

It crashes on x86-32 icc10.x and msvc.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-07 Thread Ronald S. Bultje
Hi,

On Fri, Dec 7, 2012 at 2:01 PM, MĂĄns RullgĂĄrd  wrote:
> "Ronald S. Bultje"  writes:
>> On Fri, Dec 7, 2012 at 1:01 PM, MĂĄns RullgĂĄrd  wrote:
>>> "Ronald S. Bultje"  writes:
>>>
>>>> +%if mmsize <= 16 && HAVE_ALIGNED_STACK
>>>
>>> How much overhead would it be to drop HAVE_ALIGNED_STACK entirely?
>>
>> Well, for now, we still have a ton of functions that don't use the
>> cglobal-method of allocating stack. I only ported h264/vp8 loopfilter,
>> nothing else.
>>
>> But anyway, more generally, it's 4-5 instructions per function. For
>> typical functions with an inner loop, that's negligible, but for a
>> select small set of functions, it may be significant.
>
> The remaining functions are ff_h264_idct8_add(4)_10_{sse2,avx},
> ff_hadamard8_diff(16)_{sse2,ssse3}, and something in swscale.

lavr also.

> Besides, does anyone still use 32-bit where performance is that
> critical?

This is used for YMM (e.g. avx float) stack alignment (to 32-byte)
also, so it will affect 64-bit also. My personal point of view is that
the code to take advantage of an actual feature of the compiler/system
(alignment) is there. I don't see why we'd remove it.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] golomb: use unsigned arithmetics in svq3_get_ue_golomb()

2012-12-07 Thread Ronald S. Bultje
Hi,

On Fri, Nov 30, 2012 at 10:56 AM, Janne Grunau  wrote:
> This prevents undefined behaviour of signed left shift if the coded
> value is larger than 2^31. Large values are most likely invalid and
> caused errors or by feeding random.
>
> Validate every use of svq3_get_ue_golomb() and changed the place there
> the return value was compared with negative numbers. dirac.c was clean,
> fixed rv30 and svq3.
> ---
>  libavcodec/golomb.h |  5 +++--
>  libavcodec/rv30.c   |  6 +++---
>  libavcodec/svq3.c   | 17 -
>  3 files changed, 14 insertions(+), 14 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-07 Thread Ronald S. Bultje
Hi,

On Fri, Dec 7, 2012 at 1:01 PM, MĂĄns RullgĂĄrd  wrote:
> "Ronald S. Bultje"  writes:
>
>> +%if mmsize <= 16 && HAVE_ALIGNED_STACK
>
> How much overhead would it be to drop HAVE_ALIGNED_STACK entirely?

Well, for now, we still have a ton of functions that don't use the
cglobal-method of allocating stack. I only ported h264/vp8 loopfilter,
nothing else.

But anyway, more generally, it's 4-5 instructions per function. For
typical functions with an inner loop, that's negligible, but for a
select small set of functions, it may be significant.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] Check for invalid VLC code in zeros_left before writing coefficients.

2012-12-07 Thread Ronald S. Bultje
From: "Ronald S. Bultje" 

This prevents an invalid write into coeffs[scantable[-1]] if zeros_left
itself was an invalid VLC code (and thus -1).
---
 libavcodec/h264_cavlc.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/libavcodec/h264_cavlc.c b/libavcodec/h264_cavlc.c
index 8702122..fa2bd99 100644
--- a/libavcodec/h264_cavlc.c
+++ b/libavcodec/h264_cavlc.c
@@ -610,17 +610,17 @@ static int decode_residual(H264Context *h, GetBitContext 
*gb, DCTELEM *block, in
 } \
 }
 
+if(zeros_left<0){
+av_log(h->s.avctx, AV_LOG_ERROR, "negative number of zero coeffs at %d 
%d\n", s->mb_x, s->mb_y);
+return -1;
+}
+
 if (h->pixel_shift) {
 STORE_BLOCK(int32_t)
 } else {
 STORE_BLOCK(int16_t)
 }
 
-if(zeros_left<0){
-av_log(h->s.avctx, AV_LOG_ERROR, "negative number of zero coeffs at %d 
%d\n", s->mb_x, s->mb_y);
-return -1;
-}
-
 return 0;
 }
 
-- 
1.8.0

___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-12-07 Thread Ronald S. Bultje
From: "Ronald S. Bultje" 

Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
---
 libavcodec/x86/h264_deblock.asm |  27 ++
 libavcodec/x86/h264dsp_init.c   |   4 +-
 libavcodec/x86/vp8dsp.asm   |  68 ---
 libavcodec/x86/vp8dsp_init.c|   8 --
 libavutil/x86/x86inc.asm| 186 
 5 files changed, 191 insertions(+), 102 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index c124c4d..1f1dbc6 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -398,14 +398,12 @@ DEBLOCK_LUMA
 ;-
 ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
-cglobal deblock_%1_luma_8, 5,5
+cglobal deblock_%1_luma_8, 5,5,8,2*%2
 lea r4, [r1*3]
 dec r2 ; alpha-1
 neg r4
 dec r3 ; beta-1
 add r4, r0 ; pix-3*stride
-%assign pad 2*%2+12-(stack_offset&15)
-SUB esp, pad
 
 movam0, [r4+r1]   ; p1
 movam1, [r4+2*r1] ; p0
@@ -443,22 +441,19 @@ cglobal deblock_%1_luma_8, 5,5
 DEBLOCK_P0_Q0
 mova[r4+2*r1], m1
 mova[r0], m2
-ADD esp, pad
 RET
 
 ;-
 ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
 INIT_MMX cpuname
-cglobal deblock_h_luma_8, 0,5
+cglobal deblock_h_luma_8, 0,5,8,0x60+HAVE_ALIGNED_STACK*12
 movr0, r0mp
 movr3, r1m
 lear4, [r3*3]
 subr0, 4
 lear1, [r0+r4]
-%assign pad 0x78-(stack_offset&15)
-SUBesp, pad
-%define pix_tmp esp+12
+%define pix_tmp esp+12*HAVE_ALIGNED_STACK
 
 ; transpose 6x16 -> tmp space
 TRANSPOSE6x8_MEM  PASS8ROWS(r0, r1, r3, r4), pix_tmp
@@ -500,7 +495,6 @@ cglobal deblock_h_luma_8, 0,5
 movq   m3, [pix_tmp+0x48]
 TRANSPOSE8x4B_STORE  PASS8ROWS(r0, r1, r3, r4)
 
-ADDesp, pad
 RET
 %endmacro ; DEBLOCK_LUMA
 
@@ -631,7 +625,7 @@ DEBLOCK_LUMA v, 16
 %define mpb_0 m14
 %define mpb_1 m15
 %else
-%define spill(x) [esp+16*x+((stack_offset+4)&15)]
+%define spill(x) [esp+16*x]
 %define p2 [r4+r1]
 %define q2 [r0+2*r1]
 %define t4 spill(0)
@@ -646,10 +640,7 @@ DEBLOCK_LUMA v, 16
 ;-
 ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta )
 ;-
-cglobal deblock_%1_luma_intra_8, 4,6,16
-%if ARCH_X86_64 == 0
-sub esp, 0x60
-%endif
+cglobal deblock_%1_luma_intra_8, 4,6,16,ARCH_X86_64*0x50-0x50
 lea r4, [r1*4]
 lea r5, [r1*3] ; 3*stride
 dec r2d; alpha-1
@@ -698,9 +689,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16
 LUMA_INTRA_SWAP_PQ
 LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5]
 .end:
-%if ARCH_X86_64 == 0
-add esp, 0x60
-%endif
 RET
 
 INIT_MMX cpuname
@@ -737,12 +725,10 @@ cglobal deblock_h_luma_intra_8, 4,9
 addrsp, 0x88
 RET
 %else
-cglobal deblock_h_luma_intra_8, 2,4
+cglobal deblock_h_luma_intra_8, 2,4,8,0x80
 lear3,  [r1*3]
 subr0,  4
 lear2,  [r0+r3]
-%assign pad 0x8c-(stack_offset&15)
-SUBrsp, pad
 %define pix_tmp rsp
 
 ; transpose 8x16 -> tmp space
@@ -773,7 +759,6 @@ cglobal deblock_h_luma_intra_8, 2,4
 lear0,  [r0+r1*8]
 lear2,  [r2+r1*8]
 TRANSPOSE8x8_MEM  PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), 
PASS8ROWS(r0, r2, r1, r3)
-ADDrsp, pad
 RET
 %endif ; ARCH_X86_64
 %endmacro ; DEBLOCK_LUMA_INTRA
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index ac231cb..73d4990 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int 
bit_depth,
 c->biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2;
 c->biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2;
 
-#if HAVE_ALIGNED_STACK
 c->h264_v_loop_filter_luma   = 
ff_deblock_v_luma_8_sse2;
 c->h264_h_loop_filter_luma   = 
ff_deblock_h_luma_8_sse2;
 c->h264_v_loop_filter_luma_intra = 
ff_deblock_v_luma_intra_8_sse2;
 c->h264_h_loop_filter_luma_intra = 
ff_deblock_h_luma_intra_8_sse2;
-#endif /* HAVE_ALIGNED_STACK */
 }
 if (EXTERNAL_SSSE3(mm_flags)) {
 c->biweight_h264_pixels_tab[0

Re: [libav-devel] [PATCH 1/1] h264: slice-mt: get last_pic_dropable from master context

2012-12-05 Thread Ronald S. Bultje
Hi,

On Wed, Dec 5, 2012 at 11:00 AM, Janne Grunau  wrote:
> Fixes fate-h264-conformance-cvnlfi2_sony_h and smllwebdl.mkv from
> https://github.com/OpenELEC/OpenELEC.tv/issues/1557 .
> ---
>  libavcodec/h264.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavcodec/h264.c b/libavcodec/h264.c
> index cd8aeec..b9c46ea 100644
> --- a/libavcodec/h264.c
> +++ b/libavcodec/h264.c
> @@ -2759,7 +2759,7 @@ static int decode_slice_header(H264Context *h, 
> H264Context *h0)
>  h->mb_mbaff= 0;
>  h->mb_aff_frame= 0;
>  last_pic_structure = s0->picture_structure;
> -last_pic_dropable  = s->dropable;
> +last_pic_dropable  = s0->dropable;

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: slice-mt: check master context for valid current_picture_ptr

2012-12-05 Thread Ronald S. Bultje
Hi,

On Wed, Dec 5, 2012 at 11:10 AM, Janne Grunau  wrote:
> Fixes errors in slice based multithreading introduced in 0b300daad2f5.
>
> CC: libav-sta...@libav.org
> ---
>  libavcodec/h264.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavcodec/h264.c b/libavcodec/h264.c
> index b9c46ea..317ee95 100644
> --- a/libavcodec/h264.c
> +++ b/libavcodec/h264.c
> @@ -2782,7 +2782,7 @@ static int decode_slice_header(H264Context *h, 
> H264Context *h0)
>  s->picture_structure = last_pic_structure;
>  s->dropable  = last_pic_dropable;
>  return AVERROR_INVALIDDATA;
> -} else if (!s->current_picture_ptr) {
> +} else if (!s0->current_picture_ptr) {
>  av_log(s->avctx, AV_LOG_ERROR,
> "unset current_picture_ptr on %d. slice\n",
> h0->current_slice + 1);

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/2] x86: float_dsp: fix compilation of ff_vector_dmul_scalar_avx() on x86-32

2012-12-05 Thread Ronald S. Bultje
Hi,

On Wed, Dec 5, 2012 at 9:53 AM, Justin Ruggles  wrote:
> ---
>  libavutil/x86/float_dsp.asm |2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/libavutil/x86/float_dsp.asm b/libavutil/x86/float_dsp.asm
> index d8fd93a..4a1742f 100644
> --- a/libavutil/x86/float_dsp.asm
> +++ b/libavutil/x86/float_dsp.asm
> @@ -127,7 +127,7 @@ cglobal vector_dmul_scalar, 3,3,3, dst, src, len
>  cglobal vector_dmul_scalar, 4,4,3, dst, src, mul, len
>  %endif
>  %if ARCH_X86_32
> -VBROADCASTSD xmm0, mulm
> +VBROADCASTSD   m0, mulm

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/2] x86: float_dsp: fix loading of the len parameter on x86-32

2012-12-05 Thread Ronald S. Bultje
Hi,

On Wed, Dec 5, 2012 at 9:53 AM, Justin Ruggles  wrote:
> ---
>  libavutil/x86/float_dsp.asm |3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/libavutil/x86/float_dsp.asm b/libavutil/x86/float_dsp.asm
> index 4a1742f..dc75532 100644
> --- a/libavutil/x86/float_dsp.asm
> +++ b/libavutil/x86/float_dsp.asm
> @@ -127,6 +127,9 @@ cglobal vector_dmul_scalar, 3,3,3, dst, src, len
>  cglobal vector_dmul_scalar, 4,4,3, dst, src, mul, len
>  %endif
>  %if ARCH_X86_32
> +; PROLOGUE loads len from the wrong stack address because mul is an 
> 8-byte
> +; parameter and PROLOGUE assumes all parameters are 4-byte
> +mov  lenq, [esp+0x18]
>  VBROADCASTSD   m0, mulm

That loads len twice. Why not:

%if ARCH_X86_32
cglobal vector_dmul_scalar, 3,4,3, dst, src, mul, len, lenaddr
mov lenq, lenaddrm
%else
cglobal vector_dmul_scalar, 4,4,3, dst, src, mul, len
%endif

which is more functionally correct?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 03/10] SBR DSP x86: implement SSE qmf_post_shuffle

2012-11-30 Thread Ronald S. Bultje
Hi,

On Fri, Nov 30, 2012 at 1:14 PM, Christophe Gisquet
 wrote:
> Hello,
>
> 2012/11/30 Loren Merritt :
>> If you increment an index into W and z rather than the pointers
>> themselves, then you can eliminate an add and a cmp.
>
> I add already tested that, and redid it:
> cglobal sbr_qmf_post_shuffle, 2,4,3,W,z
>   mov   r3q, 32*4
>   lea   r2q, [zq + (64-4)*4]
>   addzq, r3q
>   leaWq, [Wq + 2*r3q]
>   neg   r3q
> .loop:
>   mova   m0, [r2q]
>   mova   m1, [zq  + r3q]
>   xorps  m0, [ps_neg]
>   shufps m0, m0, 0x1B
>   mova   m2, m0
>   unpcklps   m0, m1
>   unpckhps   m2, m1
>   mova  [Wq + 2*r3q +  0], m0
>   mova  [Wq + 2*r3q + 16], m2
>   sub   r2q, 16
>   add   r3q, 16
>   jl  .loop
>   REP_RET
>
> It's 2 cycles slower on Penrynn/Win64 (154 vs 152).

Try adding an "ALIGN 16" just above ".loop:", maybe that fixes it?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] wavenc: write fact chunk sample count at the correct file position

2012-11-26 Thread Ronald S. Bultje
Hi,

On Mon, Nov 26, 2012 at 4:11 PM, Justin Ruggles
 wrote:
> From: Michael Niedermayer 
>
> Fixes curruption of metadata in the INFO chunk.
>
> Signed-off-by: Michael Niedermayer 
> Signed-off-by: Justin Ruggles 
> ---
>  libavformat/wavenc.c |9 +
>  1 files changed, 5 insertions(+), 4 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 3/4] h264: check ref_count validity for num_ref_idx_active_override_flag

2012-11-26 Thread Ronald S. Bultje
On Mon, Nov 26, 2012 at 7:54 AM, Janne Grunau  wrote:
> On 2012-11-26 07:01:20 -0800, Ronald S. Bultje wrote:
>> Hi,
>>
>> On Mon, Nov 26, 2012 at 4:06 AM, Janne Grunau  wrote:
>> > Fixes segfault in the fuzzed sample bipbop234.ts_s226407.
>> >
>> > CC: libav-sta...@libav.org
>> > ---
>> >  libavcodec/h264.c | 7 ++-
>> >  1 file changed, 6 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/libavcodec/h264.c b/libavcodec/h264.c
>> > index eca4636..730e34a 100644
>> > --- a/libavcodec/h264.c
>> > +++ b/libavcodec/h264.c
>> > @@ -2896,8 +2896,13 @@ static int decode_slice_header(H264Context *h, 
>> > H264Context *h0)
>> >
>> >  if (num_ref_idx_active_override_flag) {
>> >  h->ref_count[0] = get_ue_golomb(&s->gb) + 1;
>> > -if (h->slice_type_nos == AV_PICTURE_TYPE_B)
>> > +if (h->ref_count[0] < 1)
>> > +return AVERROR_INVALIDDATA;
>> > +if (h->slice_type_nos == AV_PICTURE_TYPE_B) {
>> >  h->ref_count[1] = get_ue_golomb(&s->gb) + 1;
>> > +if (h->ref_count[1] < 1)
>> > +return AVERROR_INVALIDDATA;
>> > +}
>> >  }
>> >
>> >  if (h->slice_type_nos == AV_PICTURE_TYPE_B)
>> > --
>> > 1.7.12.4
>>
>> get_ue_golomb() guarantees to return something non-negative, so is
>> this an integer overflow? We should then prevent the integer overflow.
>
> It returns -1 if the next 32 bits are all 0. That is arguebly not a
> valid exp-golomb code but occurs with the sample mentioned in the commit
> message (available in incoming/2012-11-15_fuzz_h264_janne as my other
> fuzzed samples).

Oh I see, patch OK then.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/4] h264: check sps.log2_max_frame_num for validity

2012-11-26 Thread Ronald S. Bultje
Hi,

On Mon, Nov 26, 2012 at 4:27 AM, Janne Grunau  wrote:
> On 2012-11-26 13:22:51 +0100, Kostya Shishkov wrote:
>> On Mon, Nov 26, 2012 at 01:06:18PM +0100, Janne Grunau wrote:
>> > Fixes infinitive or long taking loop in frame num gap code in
>> > the fuzzed sample bipbop234.ts_s223302.
>> >
>> > CC: libav-sta...@libav.org
>> > ---
>> >  libavcodec/h264_ps.c | 9 +
>> >  1 file changed, 9 insertions(+)
>> >
>> > diff --git a/libavcodec/h264_ps.c b/libavcodec/h264_ps.c
>> > index 810f69f..dc6b676 100644
>> > --- a/libavcodec/h264_ps.c
>> > +++ b/libavcodec/h264_ps.c
>> > @@ -37,6 +37,9 @@
>> >  //#undef NDEBUG
>> >  #include 
>> >
>> > +#define MAX_LOG2_MAX_FRAME_NUM(12 + 4)
>> > +#define MIN_LOG2_MAX_FRAME_NUM4
>> > +
>> >  static const AVRational pixel_aspect[17]={
>> >   {0, 1},
>> >   {1, 1},
>> > @@ -349,6 +352,12 @@ int ff_h264_decode_seq_parameter_set(H264Context *h){
>> >  }
>> >
>> >  sps->log2_max_frame_num= get_ue_golomb(&s->gb) + 4;
>> > +if (sps->log2_max_frame_num > MAX_LOG2_MAX_FRAME_NUM ||
>> > +sps->log2_max_frame_num < MIN_LOG2_MAX_FRAME_NUM) {
>> > +av_log(h->s.avctx, AV_LOG_ERROR, "log2_max_frame_num out of range 
>> > "
>> > +   "(4-16): %d\n", sps->log2_max_frame_num);
>> > +return AVERROR_INVALIDDATA;
>> > +}
>> >  sps->poc_type= get_ue_golomb_31(&s->gb);
>> >
>> >  if(sps->poc_type == 0){ //FIXME #define
>> > --
>>
>> LGTM though I suspect it's useless to check for the minimum size (unless it
>> overflows).
>
> The min check is added to protect against overflows. The sample in the
> commit msg doesn't overflow but is only slightly lower than INT_MAX.

Here, too, we should then protect against the actual overflow itself
from happening, not so much check that it just happened.

number = read_golomb();
if number >= MIN_VALUE && number - MIN_VALUE >= MAX_VALUE
  error;
number += MIN_VALUE;

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 3/4] h264: check ref_count validity for num_ref_idx_active_override_flag

2012-11-26 Thread Ronald S. Bultje
Hi,

On Mon, Nov 26, 2012 at 4:06 AM, Janne Grunau  wrote:
> Fixes segfault in the fuzzed sample bipbop234.ts_s226407.
>
> CC: libav-sta...@libav.org
> ---
>  libavcodec/h264.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/h264.c b/libavcodec/h264.c
> index eca4636..730e34a 100644
> --- a/libavcodec/h264.c
> +++ b/libavcodec/h264.c
> @@ -2896,8 +2896,13 @@ static int decode_slice_header(H264Context *h, 
> H264Context *h0)
>
>  if (num_ref_idx_active_override_flag) {
>  h->ref_count[0] = get_ue_golomb(&s->gb) + 1;
> -if (h->slice_type_nos == AV_PICTURE_TYPE_B)
> +if (h->ref_count[0] < 1)
> +return AVERROR_INVALIDDATA;
> +if (h->slice_type_nos == AV_PICTURE_TYPE_B) {
>  h->ref_count[1] = get_ue_golomb(&s->gb) + 1;
> +if (h->ref_count[1] < 1)
> +return AVERROR_INVALIDDATA;
> +}
>  }
>
>  if (h->slice_type_nos == AV_PICTURE_TYPE_B)
> --
> 1.7.12.4

get_ue_golomb() guarantees to return something non-negative, so is
this an integer overflow? We should then prevent the integer overflow.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: enable low delay only if no delayed frames were seen

2012-11-25 Thread Ronald S. Bultje
Hi,

On Sun, Nov 25, 2012 at 2:40 PM, Janne Grunau  wrote:
> Dropping frames is undesirable but that is the only way by which the
> decoder could return to low delay mode. Instead emit a warning and
> continue with delayed frames.
> Fixes a crash in fuzzed sample nasa-8s2.ts_s20033 caused by a larger
> than expected has_b_frames value. Low delay keeps getting re-enabled
> from a presumely broken SPS.
> ---
>  libavcodec/h264.c | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/libavcodec/h264.c b/libavcodec/h264.c
> index f45c572..658d865 100644
> --- a/libavcodec/h264.c
> +++ b/libavcodec/h264.c
> @@ -3867,8 +3867,14 @@ again:
>
>  if (s->flags & CODEC_FLAG_LOW_DELAY ||
>  (h->sps.bitstream_restriction_flag &&
> - !h->sps.num_reorder_frames))
> -s->low_delay = 1;
> + !h->sps.num_reorder_frames)) {
> +if (s->avctx->has_b_frames > 1 || h->delayed_pic[0])
> +av_log(avctx, AV_LOG_WARNING, "Delayed frames seen "
> +   "reenabling low delay requires a codec "
> +   "flush.\n");
> +else
> +s->low_delay = 1;
> +}

That looks good, thanks.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] dsputil: modify scalarproduct_int16 to handle mod8 numbers of loops.

2012-11-25 Thread Ronald S. Bultje
Hi,

On Sun, Nov 25, 2012 at 8:43 AM, Ronald S. Bultje  wrote:
> (This is kind of a pain on Fedora

Whoops, I meant Mac here.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] dsputil: modify scalarproduct_int16 to handle mod8 numbers of loops.

2012-11-25 Thread Ronald S. Bultje
Hi,

On Sun, Nov 25, 2012 at 1:18 AM, Diego Biurrun  wrote:
> On Sat, Nov 24, 2012 at 03:43:04PM +0100, Christophe Gisquet wrote:
>>
>> PS: no idea why firefox/gmail treats those patches as octet-stream,
>> and how to fix that.
>
> Ronald wrote about this a while back, but there seems to be no easy
> solution:
>
> http://blogs.gnome.org/rbultje/2009/03/04/firefox-gmail-and-content-type-of-attachments/

The solution is basically to use git send-email, not gmail. git
send-email can use your gmail account as smtp, if wanted. I can help
you set that up if this doesn't help enough:

http://morefedora.blogspot.com/2009/02/configuring-git-send-email-to-use-gmail.html

(This is kind of a pain on Fedora, since you'll first have to install
a bunch of perl extensions. I'm gonna hope you're not using a Mac at
this point. See
http://kbase.wincent.com/old/knowledge-base/Installing_Net::SMTP::SSL_for_sending_patches_with_Git_over_secure_SMTP.html
for gory details.)

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: reset has_b_frames after enabling low_delay from SPS

2012-11-21 Thread Ronald S. Bultje
Hi,

On Wed, Nov 21, 2012 at 12:29 PM, Janne Grunau wrote:

> On 2012-11-21 09:13:27 -0800, Ronald S. Bultje wrote:
> > On Wed, Nov 21, 2012 at 5:37 AM, Janne Grunau  >wrote:
> > > On 2012-11-21 13:14:34 +, Loren Merritt wrote:
> > > > On Wed, 21 Nov 2012, Janne Grunau wrote:
> > > > > On 2012-11-16 18:14:29 -0800, Ronald S. Bultje wrote:
> > > > > >
> > > > > > So I'm going to have to wonder what happens to the delayed frames
> > > > > > already
> > > > > > cached in h->delayed_pics[]? Are they still output?
> > > > >
> > > > > yes, delayed pictures are returned as long as h->delayed_pics[0]
> is not
> > > > > NULL.
> > > >
> > > > And when you return something from delayed_pics[] instead of the new
> > > > frame that just got decoded, the new frame gets delayed, so that you
> > > > don't ever actually switch to low delay mode?
> > >
> > > Since we can't return multiple frames at once we can't switch to low
> > > delay mode once we have delayed pictures. If low delay after switching
> > > from a stream with delayed pictures to a low_delay stream is important
> > > and dropping frames is an option requiring a decoder flush from sounds
> > > reasonable to me.
> > >
> >
> > That's one option. We could also simply say that low_delay can't be
> > re-enabled once disabled, so even though the SPS says we're low-delay,
> just
> > ignore it. I don't really have a preference either way, whichever is
> easier
> > implementation-wise or whichever you feel is a better representation of
> > what the end user of a non-fuzzed bitstream likely wanted to happen. (In
> > practice, this probably never happens anyway?)
>
> The current patch seems to work fine and fate-h264 still passes so this
> approach is easier for me. I also think that not dropping frames is
> less surprising option. Expecting tests of sender and receiver in
> situations where low delay is important is not unreasonable. I can add a
> warning when low_delay gets enabled with delayed frames.


Well, what I meant is that we probably shouldn't reset
s->avctx->has_b_frames and s->low_delay, since that may give the
(incorrect) assumption to the application that there is no delay... A
warning would be good, yes.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/1] h264: reset has_b_frames after enabling low_delay from SPS

2012-11-21 Thread Ronald S. Bultje
Hi,

On Wed, Nov 21, 2012 at 5:37 AM, Janne Grunau wrote:

> On 2012-11-21 13:14:34 +, Loren Merritt wrote:
> > On Wed, 21 Nov 2012, Janne Grunau wrote:
> > > On 2012-11-16 18:14:29 -0800, Ronald S. Bultje wrote:
> > > >
> > > > So I'm going to have to wonder what happens to the delayed frames
> already
> > > > cached in h->delayed_pics[]? Are they still output?
> > >
> > > yes, delayed pictures are returned as long as h->delayed_pics[0] is not
> > > NULL.
> >
> > And when you return something from delayed_pics[] instead of the new
> > frame that just got decoded, the new frame gets delayed, so that you
> > don't ever actually switch to low delay mode?
>
> Since we can't return multiple frames at once we can't switch to low
> delay mode once we have delayed pictures. If low delay after switching
> from a stream with delayed pictures to a low_delay stream is important
> and dropping frames is an option requiring a decoder flush from sounds
> reasonable to me.
>

That's one option. We could also simply say that low_delay can't be
re-enabled once disabled, so even though the SPS says we're low-delay, just
ignore it. I don't really have a preference either way, whichever is easier
implementation-wise or whichever you feel is a better representation of
what the end user of a non-fuzzed bitstream likely wanted to happen. (In
practice, this probably never happens anyway?)

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [RFC PATCH 1/2] h264: set parameters from SPS whenever it changes

2012-11-16 Thread Ronald S. Bultje
Hi,

On Fri, Nov 16, 2012 at 8:43 AM, Janne Grunau wrote:

> I'm not entirely convinced that is the the correct/best way to fix the
> problem with this sample. It crashes in draw_edges_10_c because the
> image format and parameters has just 8 bit per pixels.
>
> The main problem I see is that it does far too many context re-inits.
>
> Janne
> ---8<---
>
> Fixes a crash in the fuzzed sample sample_varPAR.avi_s26638 with
> alternating bit depths.
> ---
>  libavcodec/h264.c | 88
> ++-
>  1 file changed, 48 insertions(+), 40 deletions(-)
>

So ... I originally added code to not support changing bitdepth at all,
basically to prevent this and alike issues. E.g., what if the reference is
8bit but the next frame is 10bit? What if we're using frame threading at
the same time?

So I wouldn't mind just erroring out altogether, unless you think this can
really be made to work? I mean, if we want to support reinits, with frame
threading, then we should just rewrite the init code and allow complete
reinits and be convinced it always works, like we are for ffvp8.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/2] h264: reset has_b_frames after enabling low_delay from SPS

2012-11-16 Thread Ronald S. Bultje
Hi,

On Fri, Nov 16, 2012 at 8:43 AM, Janne Grunau wrote:

> Fixes a crash in fuzzed file nasa-8s2.ts_s20033 caused by a too large
> has_b_frames value. low_delay keeps getting re-enabled from the the
> presumely broken SPS.
> ---
>  libavcodec/h264.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/h264.c b/libavcodec/h264.c
> index c30c478..7935fe6 100644
> --- a/libavcodec/h264.c
> +++ b/libavcodec/h264.c
> @@ -2346,8 +2346,10 @@ static int h264_set_parameter_from_sps(H264Context
> *h)
>
>  if (s->flags & CODEC_FLAG_LOW_DELAY ||
>  (h->sps.bitstream_restriction_flag &&
> - !h->sps.num_reorder_frames))
> + !h->sps.num_reorder_frames)) {
>  s->low_delay = 1;
> +s->avctx->has_b_frames = 0;
> +}
>

So I'm going to have to wonder what happens to the delayed frames already
cached in h->delayed_pics[]? Are they still output? What happens if we
concatenate a delayed h264 file by a file that has this kind of sps header?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86: lavr: fix stack allocation for 7 and 8 channel downmixing on x86-32

2012-11-12 Thread Ronald S. Bultje
Hi,

On Mon, Nov 12, 2012 at 12:41 PM, Luca Barbato  wrote:

> On 11/12/2012 07:35 PM, Justin Ruggles wrote:
> > Fixes crashes on Win32 and stack overruns on x86-32 in general.
>
> Who can test it?
>

http://bugzilla.libav.org/show_bug.cgi?id=338

Reproducible on x86-32 Linux/Mac with asan.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86: lavr: fix stack allocation for 7 and 8 channel downmixing on x86-32

2012-11-08 Thread Ronald S. Bultje
Hi,

On Thu, Nov 8, 2012 at 2:17 PM, Justin Ruggles wrote:

> On 10/29/2012 04:39 PM, Justin Ruggles wrote:
> > From: Ronald S. Bultje 
> >
> > Fixes crashes on Win32 and stack overruns on x86-32 in general.
> > ---
> >  libavresample/x86/audio_mix.asm |7 +++
> >  1 files changed, 7 insertions(+), 0 deletions(-)
> >
> > diff --git a/libavresample/x86/audio_mix.asm
> b/libavresample/x86/audio_mix.asm
> > index 0c4a9bd..cdc787d 100644
> > --- a/libavresample/x86/audio_mix.asm
> > +++ b/libavresample/x86/audio_mix.asm
> > @@ -280,6 +280,13 @@ cglobal mix_%1_to_%2_%3_flt,
> 3,in_channels+2,needed_mmregs+matrix_elements_mm, s
> >  sub   rsp, matrix_elements_stack * mmsize
> >  %else
> >  %assign pad matrix_elements_stack * mmsize + (mmsize - gprsize) -
> (stack_offset & (mmsize - gprsize))
> > +; on x86-32 for 7 and 8 channels we need more stack space for src
> pointers
> > +%if ARCH_X86_32 && in_channels >= 7
> > +%assign pad pad + 0x10
> > +%define src5m [rsp+0xc0]
> > +%define src6m [rsp+0xc4]
> > +%define src7m [rsp+0xc8]
> > +%endif
> >  SUB   rsp, pad
> >  %endif
> >  %endif
>
> ping


 Well it's my patch and you reviewed it, isn't that enough?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86: h264: Convert 8-bit QPEL inline assembly to YASM

2012-11-05 Thread Ronald S. Bultje
Hi,

On Mon, Nov 5, 2012 at 10:48 AM, Diego Biurrun  wrote:
> +%macro QPEL8_H_LOWPASS_OP 1
> +cglobal %1_h264_qpel8_h_lowpass, 4,5 ; dst, src, dstStride, srcStride
> +mov   r4d, 8

Each and every single one of these needs a stride sign extension like
you did in dsputil.asm.

Do you have numbers to compare performance of qpel (both with
START/STOP_TIMER as well as with total wallclock time, e.g. using the
"time" command) before/after this patch?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86: h264_qpel: sign-extend stride argument in put_pixels16_sse2()

2012-11-05 Thread Ronald S. Bultje
Hi,

On Mon, Nov 5, 2012 at 1:22 PM, Diego Biurrun  wrote:
> ---
>
> This fixes the crash in fate-vp5 with the QPEL code converted to YASM.
> I'm sending this separately to not spam everybody with 100+kB patches.
> The patch is meant to be squashed into the QPEL yasmification.
>  ; void put_pixels16_sse2(uint8_t *block, const uint8_t *pixels, int 
> line_size, int h)
>  cglobal put_pixels16, 4,5,4
> +movsxdifnidnr2, r2d
>  lea r4, [r2*3]
>  .loop:

No, I said each occurence of cglobal needed this, not just one of
them. Also, please fix indenting of this instruction to be aligned
with the line below.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 4/4] x86util: Add cpuflags_mmxext alias for cpuflags_mmx2

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 7:28 PM, Diego Biurrun  wrote:
> "mmxext" is a more sensible name and more common in outside projects.
> ---
>  libavutil/x86/x86util.asm |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)

No.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/4] x86: include x86inc.asm in x86util.asm

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 7:28 PM, Diego Biurrun  wrote:
> -%include "x86inc.asm"
>  %include "x86util.asm"

Didn't you come up with the rule that each file should contain all
headers it directly takes symbols from? How do these changes fit with
that scheme?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/2] pixfmt: support more yuva formats

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 7:14 PM, Luca Barbato  wrote:
> Signed-off-by: Luca Barbato 
> ---
>  libavcodec/raw.c  |  19 
>  libavformat/nut.c |  28 +
>  libavutil/pixdesc.c   | 234 
> ++
>  libavutil/pixfmt.h|  28 +
>  libswscale/utils.c|  18 
>  tests/ref/lavfi/pixdesc   |  18 
>  tests/ref/lavfi/pixfmts_copy  |  18 
>  tests/ref/lavfi/pixfmts_null  |  18 
>  tests/ref/lavfi/pixfmts_scale |  18 
>  tests/ref/lavfi/pixfmts_vflip |  18 
>  10 files changed, 417 insertions(+)

OK, assuming it works without new warnings under valgrind.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/2] swscale: support gray to 9bit and 10bit formats

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 7:14 PM, Luca Barbato  wrote:
> With the input of Kostya and Ronald.
> ---
>  libswscale/swscale.c  | 38 --
>  libswscale/swscale_unscaled.c | 32 ++--
>  2 files changed, 66 insertions(+), 4 deletions(-)

So it works now, under valgrind? LGTM then.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86inc: Add cpuflags_mmxext alias for cpuflags_mmx2

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 5:56 PM, Diego Biurrun  wrote:
> On Mon, Oct 29, 2012 at 05:25:08PM -0700, Ronald S. Bultje wrote:
>> On Mon, Oct 29, 2012 at 5:15 PM, Diego Biurrun  wrote:
>> > This allows using "mmxext" as name in Libav while staying compatible
>> > with changes to the YASM macro infrastructure imported from x264.
>> > ---
>> >  libavutil/x86/x86inc.asm |1 +
>> >  1 files changed, 1 insertions(+), 0 deletions(-)
>> >
>> > --- a/libavutil/x86/x86inc.asm
>> > +++ b/libavutil/x86/x86inc.asm
>> > @@ -35,6 +35,7 @@
>> >  ; to x264-de...@videolan.org .
>> >
>> >  %define program_name ff
>> > +%define cpuflags_mmxext cpuflags_mmx2
>>
>> Any such a change to x86inc.asm is OK if and only if x264 accepts this
>> change upstream. Loren?
>
> The x86inc.asm files from x264 and Libav are already different, x264 uses
> "x264" instead of "ff" as program_name.  That's the reason why I chose to
> place the alias in this very spot: during upstream merges there will be
> no conflicts that are not
>
> a) trivial to fix and
> b) would have occurred anyway due to the existing difference.
>
> This patch is also what a room full of Libav developers accepted as a
> compromise at VDD.  Please don't oppose the consensus of your peers.

You keep trying to push this through. No.

I do the syncing, and nobody wants to take that job from me, I assure
you. Every time the semantics of cglobal change by however little,
someone who knows that stuff needs to go in and fix/port it. I prefer
to not have to take any additional tasks on me, such as maintaining
forky differences.

Thus, please, no.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86inc: Add cpuflags_mmxext alias for cpuflags_mmx2

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 5:15 PM, Diego Biurrun  wrote:
> This allows using "mmxext" as name in Libav while staying compatible
> with changes to the YASM macro infrastructure imported from x264.
> ---
>  libavutil/x86/x86inc.asm |1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
> index 1fe9f55..2655154 100644
> --- a/libavutil/x86/x86inc.asm
> +++ b/libavutil/x86/x86inc.asm
> @@ -35,6 +35,7 @@
>  ; to x264-de...@videolan.org .
>
>  %define program_name ff
> +%define cpuflags_mmxext cpuflags_mmx2

Any such a change to x86inc.asm is OK if and only if x264 accepts this
change upstream. Loren?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/2] Use PRED4x4/8x8/8x8L/16x16 macros to declare x86 intrapred prototypes.

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 10:25 AM, Luca Barbato  wrote:
> On 10/29/2012 04:44 AM, Ronald S. Bultje wrote:
>> From: "Ronald S. Bultje" 
>>
>
> If you don't mind
>
> x86: use PRED4x4/8x8/8x8L/16x16 macros to declare intrapred prototypes
>
> might fit better.

Will change locally. Any other comments on the code or can I commit
with that modified?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] Remove usage of INIT_AVX in h264_intrapred_10bit.asm.

2012-10-29 Thread Ronald S. Bultje
From: "Ronald S. Bultje" 

Replace INIT_AVX by INIT_XMM avx. Port the whole file to use cpuflag
based function declarations. Remove (now unused) cputype argument in
function declaration macros. Change function prototypes to have mmx2
instead of mmxext as suffix, since that's required by cpuflags.
---
 libavcodec/x86/h264_intrapred_10bit.asm | 306 
 libavcodec/x86/h264_intrapred_init.c|  40 ++---
 2 files changed, 177 insertions(+), 169 deletions(-)

diff --git a/libavcodec/x86/h264_intrapred_10bit.asm 
b/libavcodec/x86/h264_intrapred_10bit.asm
index 529134e..c3f6dc4 100644
--- a/libavcodec/x86/h264_intrapred_10bit.asm
+++ b/libavcodec/x86/h264_intrapred_10bit.asm
@@ -53,8 +53,8 @@ SECTION .text
 ;-
 ; void pred4x4_down_right(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_DR 1
-cglobal pred4x4_down_right_10_%1, 3,3
+%macro PRED4x4_DR 0
+cglobal pred4x4_down_right_10, 3, 3
 sub   r0, r2
 lea   r1, [r0+r2*2]
 movhpsm1, [r1-8]
@@ -79,21 +79,22 @@ cglobal pred4x4_down_right_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
+INIT_XMM sse2
 %define PALIGNR PALIGNR_MMX
-PRED4x4_DR sse2
+PRED4x4_DR
+INIT_XMM ssse3
 %define PALIGNR PALIGNR_SSSE3
-PRED4x4_DR ssse3
+PRED4x4_DR
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_DR avx
+INIT_XMM avx
+PRED4x4_DR
 %endif
 
 ;-
 ; void pred4x4_vertical_right(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_VR 1
-cglobal pred4x4_vertical_right_10_%1, 3,3,6
+%macro PRED4x4_VR 0
+cglobal pred4x4_vertical_right_10, 3, 3, 6
 sub r0, r2
 lea r1, [r0+r2*2]
 movqm5, [r0]; t3t2t1t0
@@ -119,21 +120,22 @@ cglobal pred4x4_vertical_right_10_%1, 3,3,6
 RET
 %endmacro
 
-INIT_XMM
+INIT_XMM sse2
 %define PALIGNR PALIGNR_MMX
-PRED4x4_VR sse2
+PRED4x4_VR
+INIT_XMM ssse3
 %define PALIGNR PALIGNR_SSSE3
-PRED4x4_VR ssse3
+PRED4x4_VR
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_VR avx
+INIT_XMM avx
+PRED4x4_VR
 %endif
 
 ;-
 ; void pred4x4_horizontal_down(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_HD 1
-cglobal pred4x4_horizontal_down_10_%1, 3,3
+%macro PRED4x4_HD 0
+cglobal pred4x4_horizontal_down_10, 3, 3
 subr0, r2
 lear1, [r0+r2*2]
 movq   m0, [r0-8]  ; lt ..
@@ -162,14 +164,15 @@ cglobal pred4x4_horizontal_down_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
+INIT_XMM sse2
 %define PALIGNR PALIGNR_MMX
-PRED4x4_HD sse2
+PRED4x4_HD
+INIT_XMM ssse3
 %define PALIGNR PALIGNR_SSSE3
-PRED4x4_HD ssse3
+PRED4x4_HD
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_HD avx
+INIT_XMM avx
+PRED4x4_HD
 %endif
 
 ;-
@@ -192,8 +195,8 @@ PRED4x4_HD avx
 HADDD   %1, %2
 %endmacro
 
-INIT_MMX
-cglobal pred4x4_dc_10_mmxext, 3,3
+INIT_MMX mmx2
+cglobal pred4x4_dc_10, 3, 3
 subr0, r2
 lear1, [r0+r2*2]
 movq   m2, [r0+r2*1-8]
@@ -216,8 +219,8 @@ cglobal pred4x4_dc_10_mmxext, 3,3
 ;-
 ; void pred4x4_down_left(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_DL 1
-cglobal pred4x4_down_left_10_%1, 3,3
+%macro PRED4x4_DL 0
+cglobal pred4x4_down_left_10, 3, 3
 subr0, r2
 movq   m0, [r0]
 movhps m0, [r1]
@@ -236,18 +239,18 @@ cglobal pred4x4_down_left_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
-PRED4x4_DL sse2
+INIT_XMM sse2
+PRED4x4_DL
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_DL avx
+INIT_XMM avx
+PRED4x4_DL
 %endif
 
 ;-
 ; void pred4x4_vertical_left(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_VL 1
-cglobal pred4x4_vertical_left_10_%1, 3,3
+%macro PRED4x4_VL 0
+cglobal pred4x4_vertical_left_10, 3, 3
 subr0, r2
 movu   m1, [r0]
 movhps m1, [r1]
@@ -265,18 +268,18 @@ cglobal pred4x4_vertical_left_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
-PRED4x4_VL sse2
+INIT_XMM sse2
+PRED4x4_VL
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_VL avx
+INIT_XMM avx
+PRED4x4_VL
 %endif
 
 ;-
 ; void pred4x4_horizontal_up(pixel *src, const pixel *topright, int stride)
 ;

Re: [libav-devel] [PATCH 2/2] Use ptrdiff_t instead of int for intra pred "stride" function parameter.

2012-10-29 Thread Ronald S. Bultje
Hi,

On Mon, Oct 29, 2012 at 7:53 AM, Diego Biurrun  wrote:
> On Sun, Oct 28, 2012 at 08:44:54PM -0700, Ronald S. Bultje wrote:
>> From: "Ronald S. Bultje" 
>>
>> ---
>>  libavcodec/arm/h264pred_init_arm.c   |  36 +++---
>>  libavcodec/h264pred.c|  79 
>>  libavcodec/h264pred.h|  18 +--
>>  libavcodec/h264pred_template.c   | 233 
>> +--
>>  libavcodec/x86/h264_intrapred_init.c |  15 ++-
>>  5 files changed, 264 insertions(+), 117 deletions(-)
>
> Why?

So you don't have to sign-extend integers in simd functions (or suffer
undefined behaviour if you forget). I've done this before for several
other simd functions, use "git log".

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH 2/2] Use ptrdiff_t instead of int for intra pred "stride" function parameter.

2012-10-28 Thread Ronald S. Bultje
From: "Ronald S. Bultje" 

---
 libavcodec/arm/h264pred_init_arm.c   |  36 +++---
 libavcodec/h264pred.c|  79 
 libavcodec/h264pred.h|  18 +--
 libavcodec/h264pred_template.c   | 233 +--
 libavcodec/x86/h264_intrapred_init.c |  15 ++-
 5 files changed, 264 insertions(+), 117 deletions(-)

diff --git a/libavcodec/arm/h264pred_init_arm.c 
b/libavcodec/arm/h264pred_init_arm.c
index 371b1a6..39c0121 100644
--- a/libavcodec/arm/h264pred_init_arm.c
+++ b/libavcodec/arm/h264pred_init_arm.c
@@ -23,25 +23,25 @@
 #include "libavutil/arm/cpu.h"
 #include "libavcodec/h264pred.h"
 
-void ff_pred16x16_vert_neon(uint8_t *src, int stride);
-void ff_pred16x16_hor_neon(uint8_t *src, int stride);
-void ff_pred16x16_plane_neon(uint8_t *src, int stride);
-void ff_pred16x16_dc_neon(uint8_t *src, int stride);
-void ff_pred16x16_128_dc_neon(uint8_t *src, int stride);
-void ff_pred16x16_left_dc_neon(uint8_t *src, int stride);
-void ff_pred16x16_top_dc_neon(uint8_t *src, int stride);
+void ff_pred16x16_vert_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred16x16_hor_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred16x16_plane_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred16x16_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred16x16_128_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred16x16_left_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred16x16_top_dc_neon(uint8_t *src, ptrdiff_t stride);
 
-void ff_pred8x8_vert_neon(uint8_t *src, int stride);
-void ff_pred8x8_hor_neon(uint8_t *src, int stride);
-void ff_pred8x8_plane_neon(uint8_t *src, int stride);
-void ff_pred8x8_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_128_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_left_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_top_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_l0t_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_0lt_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_l00_dc_neon(uint8_t *src, int stride);
-void ff_pred8x8_0l0_dc_neon(uint8_t *src, int stride);
+void ff_pred8x8_vert_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_hor_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_plane_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_128_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_left_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_top_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_l0t_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_0lt_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_l00_dc_neon(uint8_t *src, ptrdiff_t stride);
+void ff_pred8x8_0l0_dc_neon(uint8_t *src, ptrdiff_t stride);
 
 static void ff_h264_pred_init_neon(H264PredContext *h, int codec_id, const int 
bit_depth, const int chroma_format_idc)
 {
diff --git a/libavcodec/h264pred.c b/libavcodec/h264pred.c
index fb44046..94cf9d0 100644
--- a/libavcodec/h264pred.c
+++ b/libavcodec/h264pred.c
@@ -39,7 +39,9 @@
 #include "h264pred_template.c"
 #undef BIT_DEPTH
 
-static void pred4x4_vertical_vp8_c(uint8_t *src, const uint8_t *topright, int 
stride){
+static void pred4x4_vertical_vp8_c(uint8_t *src, const uint8_t *topright,
+   ptrdiff_t stride)
+{
 const unsigned lt = src[-1-1*stride];
 LOAD_TOP_EDGE
 LOAD_TOP_RIGHT_EDGE
@@ -54,7 +56,9 @@ static void pred4x4_vertical_vp8_c(uint8_t *src, const 
uint8_t *topright, int st
 AV_WN32A(src+3*stride, v);
 }
 
-static void pred4x4_horizontal_vp8_c(uint8_t *src, const uint8_t *topright, 
int stride){
+static void pred4x4_horizontal_vp8_c(uint8_t *src, const uint8_t *topright,
+ ptrdiff_t stride)
+{
 const unsigned lt = src[-1-1*stride];
 LOAD_LEFT_EDGE
 
@@ -64,7 +68,9 @@ static void pred4x4_horizontal_vp8_c(uint8_t *src, const 
uint8_t *topright, int
 AV_WN32A(src+3*stride, ((l2 + 2*l3 + l3 + 2) >> 2)*0x01010101);
 }
 
-static void pred4x4_down_left_svq3_c(uint8_t *src, const uint8_t *topright, 
int stride){
+static void pred4x4_down_left_svq3_c(uint8_t *src, const uint8_t *topright,
+ ptrdiff_t stride)
+{
 LOAD_TOP_EDGE
 LOAD_LEFT_EDGE
 
@@ -86,7 +92,9 @@ static void pred4x4_down_left_svq3_c(uint8_t *src, const 
uint8_t *topright, int
 src[3+3*stride]=(l3 + t3)>>1;
 }
 
-static void pred4x4_down_left_rv40_c(uint8_t *src, const uint8_t *topright, 
int stride){
+static void pred4x4_down_left_rv40_c(uint8_t *src, const uint8_t *topright,
+ ptrdiff_t stride)
+{
 LOAD_TOP_EDGE
 LOAD_TOP_RIGHT_EDGE
 LOAD_LEFT_EDGE
@@ -110,7 +118,10 @@ static void pred4x4_down_left_rv40_c(uint8_t *src, const 
uint8_t *topright, int
 src[3+3*stride]=(t6 + t7 + 1 + l6 + l7 + 1)>>2;
 }
 
-static void pred4x4_down_left_rv40_nodown_c(uint8_t

[libav-devel] [PATCH 1/2] Use PRED4x4/8x8/8x8L/16x16 macros to declare x86 intrapred prototypes.

2012-10-28 Thread Ronald S. Bultje
From: "Ronald S. Bultje" 

---
 libavcodec/x86/h264_intrapred.asm|  80 +-
 libavcodec/x86/h264_intrapred_init.c | 296 ++-
 2 files changed, 190 insertions(+), 186 deletions(-)

diff --git a/libavcodec/x86/h264_intrapred.asm 
b/libavcodec/x86/h264_intrapred.asm
index dc418f7..7c6aa12 100644
--- a/libavcodec/x86/h264_intrapred.asm
+++ b/libavcodec/x86/h264_intrapred.asm
@@ -53,7 +53,7 @@ cextern pw_32
 ; void pred16x16_vertical(uint8_t *src, int stride)
 ;-
 
-cglobal pred16x16_vertical_mmx, 2,3
+cglobal pred16x16_vertical_8_mmx, 2,3
 sub   r0, r1
 mov   r2, 8
 movq mm0, [r0+0]
@@ -68,7 +68,7 @@ cglobal pred16x16_vertical_mmx, 2,3
 jg .loop
 REP_RET
 
-cglobal pred16x16_vertical_sse, 2,3
+cglobal pred16x16_vertical_8_sse, 2,3
 sub   r0, r1
 mov   r2, 4
 movaps xmm0, [r0]
@@ -88,7 +88,7 @@ cglobal pred16x16_vertical_sse, 2,3
 ;-
 
 %macro PRED16x16_H 0
-cglobal pred16x16_horizontal, 2,3
+cglobal pred16x16_horizontal_8, 2,3
 mov   r2, 8
 %if cpuflag(ssse3)
 mova  m2, [pb_3]
@@ -130,7 +130,7 @@ INIT_XMM
 ;-
 
 %macro PRED16x16_DC 0
-cglobal pred16x16_dc, 2,7
+cglobal pred16x16_dc_8, 2,7
 mov   r4, r0
 sub   r0, r1
 pxor  mm0, mm0
@@ -193,7 +193,7 @@ INIT_XMM
 ;-
 
 %macro PRED16x16_TM_MMX 0
-cglobal pred16x16_tm_vp8, 2,5
+cglobal pred16x16_tm_vp8_8, 2,5
 subr0, r1
 pxor  mm7, mm7
 movq  mm0, [r0+0]
@@ -234,7 +234,7 @@ INIT_MMX mmx2
 PRED16x16_TM_MMX
 INIT_MMX
 
-cglobal pred16x16_tm_vp8_sse2, 2,6,6
+cglobal pred16x16_tm_vp8_8_sse2, 2,6,6
 sub  r0, r1
 pxor   xmm2, xmm2
 movdqa xmm0, [r0]
@@ -274,7 +274,7 @@ cglobal pred16x16_tm_vp8_sse2, 2,6,6
 ;-
 
 %macro H264_PRED16x16_PLANE 1
-cglobal pred16x16_plane_%1, 2,9,7
+cglobal pred16x16_plane_%1_8, 2,9,7
 mov  r2, r1   ; +stride
 neg  r1   ; -stride
 
@@ -556,7 +556,7 @@ INIT_XMM
 ;-
 
 %macro H264_PRED8x8_PLANE 0
-cglobal pred8x8_plane, 2,9,7
+cglobal pred8x8_plane_8, 2,9,7
 mov  r2, r1   ; +stride
 neg  r1   ; -stride
 
@@ -730,7 +730,7 @@ INIT_XMM
 ; void pred8x8_vertical(uint8_t *src, int stride)
 ;-
 
-cglobal pred8x8_vertical_mmx, 2,2
+cglobal pred8x8_vertical_8_mmx, 2,2
 subr0, r1
 movq  mm0, [r0]
 %rep 3
@@ -747,7 +747,7 @@ cglobal pred8x8_vertical_mmx, 2,2
 ;-
 
 %macro PRED8x8_H 0
-cglobal pred8x8_horizontal, 2,3
+cglobal pred8x8_horizontal_8, 2,3
 mov   r2, 4
 %if cpuflag(ssse3)
 mova  m2, [pb_3]
@@ -774,7 +774,7 @@ INIT_MMX
 ;-
 ; void pred8x8_top_dc_mmxext(uint8_t *src, int stride)
 ;-
-cglobal pred8x8_top_dc_mmxext, 2,5
+cglobal pred8x8_top_dc_8_mmxext, 2,5
 sub r0, r1
 movq   mm0, [r0]
 pxor   mm1, mm1
@@ -809,7 +809,7 @@ cglobal pred8x8_top_dc_mmxext, 2,5
 ;-
 
 INIT_MMX
-cglobal pred8x8_dc_mmxext, 2,5
+cglobal pred8x8_dc_8_mmxext, 2,5
 sub   r0, r1
 pxor  m7, m7
 movd  m0, [r0+0]
@@ -869,7 +869,7 @@ cglobal pred8x8_dc_mmxext, 2,5
 ; void pred8x8_dc_rv40(uint8_t *src, int stride)
 ;-
 
-cglobal pred8x8_dc_rv40_mmxext, 2,7
+cglobal pred8x8_dc_rv40_8_mmxext, 2,7
 mov   r4, r0
 sub   r0, r1
 pxor  mm0, mm0
@@ -906,7 +906,7 @@ cglobal pred8x8_dc_rv40_mmxext, 2,7
 ;-
 
 %macro PRED8x8_TM_MMX 0
-cglobal pred8x8_tm_vp8, 2,6
+cglobal pred8x8_tm_vp8_8, 2,6
 subr0, r1
 pxor  mm7, mm7
 movq  mm0, [r0]
@@ -946,7 +946,7 @@ INIT_MMX mmx2
 PRED8x8_TM_MMX
 INIT_MMX
 
-cglobal pred8x8_tm_vp8_sse2, 2,6,4
+cglobal pred8x8_tm_vp8_8_sse2, 2,6,4
 sub  r0, r1
 pxor   xmm1, xmm1
 movq   xmm0, [r0]
@@ -974,7 +974,7 @@ cglobal pred8x8_tm_vp8_sse2, 2,6,4
 jg .loop
 REP_RET
 
-cglobal pred8x8_tm_vp8_ssse3, 2,3,6
+cglobal pred8x8_tm_vp8_8_ssse3, 2,3,6
 sub  r0, r1
 movdqa xmm4, [tm_shuf]
 pxor   xmm1, xmm1
@@ -1016,7 +1016,7 @@ cglobal pred8x8_tm

[libav-devel] [PATCH 2/2] Remove INIT_AVX from x86inc.asm.

2012-10-28 Thread Ronald S. Bultje
From: "Ronald S. Bultje" 

---
 libavutil/x86/x86inc.asm | 8 
 1 file changed, 8 deletions(-)

diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm
index d734c6e..1fe9f55 100644
--- a/libavutil/x86/x86inc.asm
+++ b/libavutil/x86/x86inc.asm
@@ -678,14 +678,6 @@ SECTION .note.GNU-stack noalloc noexec nowrite progbits
 INIT_CPUFLAGS %1
 %endmacro
 
-; FIXME: INIT_AVX can be replaced by INIT_XMM avx
-%macro INIT_AVX 0
-INIT_XMM
-%assign avx_enabled 1
-%define PALIGNR PALIGNR_SSSE3
-%define RESET_MM_PERMUTATION INIT_AVX
-%endmacro
-
 %macro INIT_YMM 0-1+
 %assign avx_enabled 1
 %define RESET_MM_PERMUTATION INIT_YMM %1
-- 
1.7.11.3

___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH 1/2] Remove usage of INIT_AVX in h264_intrapred_10bit.asm.

2012-10-28 Thread Ronald S. Bultje
From: "Ronald S. Bultje" 

---
 libavcodec/x86/h264_intrapred_10bit.asm | 306 
 libavcodec/x86/h264_intrapred_init.c|  40 ++---
 2 files changed, 177 insertions(+), 169 deletions(-)

diff --git a/libavcodec/x86/h264_intrapred_10bit.asm 
b/libavcodec/x86/h264_intrapred_10bit.asm
index 529134e..c3f6dc4 100644
--- a/libavcodec/x86/h264_intrapred_10bit.asm
+++ b/libavcodec/x86/h264_intrapred_10bit.asm
@@ -53,8 +53,8 @@ SECTION .text
 ;-
 ; void pred4x4_down_right(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_DR 1
-cglobal pred4x4_down_right_10_%1, 3,3
+%macro PRED4x4_DR 0
+cglobal pred4x4_down_right_10, 3, 3
 sub   r0, r2
 lea   r1, [r0+r2*2]
 movhpsm1, [r1-8]
@@ -79,21 +79,22 @@ cglobal pred4x4_down_right_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
+INIT_XMM sse2
 %define PALIGNR PALIGNR_MMX
-PRED4x4_DR sse2
+PRED4x4_DR
+INIT_XMM ssse3
 %define PALIGNR PALIGNR_SSSE3
-PRED4x4_DR ssse3
+PRED4x4_DR
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_DR avx
+INIT_XMM avx
+PRED4x4_DR
 %endif
 
 ;-
 ; void pred4x4_vertical_right(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_VR 1
-cglobal pred4x4_vertical_right_10_%1, 3,3,6
+%macro PRED4x4_VR 0
+cglobal pred4x4_vertical_right_10, 3, 3, 6
 sub r0, r2
 lea r1, [r0+r2*2]
 movqm5, [r0]; t3t2t1t0
@@ -119,21 +120,22 @@ cglobal pred4x4_vertical_right_10_%1, 3,3,6
 RET
 %endmacro
 
-INIT_XMM
+INIT_XMM sse2
 %define PALIGNR PALIGNR_MMX
-PRED4x4_VR sse2
+PRED4x4_VR
+INIT_XMM ssse3
 %define PALIGNR PALIGNR_SSSE3
-PRED4x4_VR ssse3
+PRED4x4_VR
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_VR avx
+INIT_XMM avx
+PRED4x4_VR
 %endif
 
 ;-
 ; void pred4x4_horizontal_down(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_HD 1
-cglobal pred4x4_horizontal_down_10_%1, 3,3
+%macro PRED4x4_HD 0
+cglobal pred4x4_horizontal_down_10, 3, 3
 subr0, r2
 lear1, [r0+r2*2]
 movq   m0, [r0-8]  ; lt ..
@@ -162,14 +164,15 @@ cglobal pred4x4_horizontal_down_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
+INIT_XMM sse2
 %define PALIGNR PALIGNR_MMX
-PRED4x4_HD sse2
+PRED4x4_HD
+INIT_XMM ssse3
 %define PALIGNR PALIGNR_SSSE3
-PRED4x4_HD ssse3
+PRED4x4_HD
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_HD avx
+INIT_XMM avx
+PRED4x4_HD
 %endif
 
 ;-
@@ -192,8 +195,8 @@ PRED4x4_HD avx
 HADDD   %1, %2
 %endmacro
 
-INIT_MMX
-cglobal pred4x4_dc_10_mmxext, 3,3
+INIT_MMX mmx2
+cglobal pred4x4_dc_10, 3, 3
 subr0, r2
 lear1, [r0+r2*2]
 movq   m2, [r0+r2*1-8]
@@ -216,8 +219,8 @@ cglobal pred4x4_dc_10_mmxext, 3,3
 ;-
 ; void pred4x4_down_left(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_DL 1
-cglobal pred4x4_down_left_10_%1, 3,3
+%macro PRED4x4_DL 0
+cglobal pred4x4_down_left_10, 3, 3
 subr0, r2
 movq   m0, [r0]
 movhps m0, [r1]
@@ -236,18 +239,18 @@ cglobal pred4x4_down_left_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
-PRED4x4_DL sse2
+INIT_XMM sse2
+PRED4x4_DL
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_DL avx
+INIT_XMM avx
+PRED4x4_DL
 %endif
 
 ;-
 ; void pred4x4_vertical_left(pixel *src, const pixel *topright, int stride)
 ;-
-%macro PRED4x4_VL 1
-cglobal pred4x4_vertical_left_10_%1, 3,3
+%macro PRED4x4_VL 0
+cglobal pred4x4_vertical_left_10, 3, 3
 subr0, r2
 movu   m1, [r0]
 movhps m1, [r1]
@@ -265,18 +268,18 @@ cglobal pred4x4_vertical_left_10_%1, 3,3
 RET
 %endmacro
 
-INIT_XMM
-PRED4x4_VL sse2
+INIT_XMM sse2
+PRED4x4_VL
 %if HAVE_AVX_EXTERNAL
-INIT_AVX
-PRED4x4_VL avx
+INIT_XMM avx
+PRED4x4_VL
 %endif
 
 ;-
 ; void pred4x4_horizontal_up(pixel *src, const pixel *topright, int stride)
 ;-
-INIT_MMX
-cglobal pred4x4_horizontal_up_10_mmxext, 3,3
+INIT_MMX mmx2
+cglobal pred4x4_horizontal_up_10, 3, 3
 sub   r0, r2
 lea   r1, [r0+r2*2]
 movq  m0, [r0+r2*1-8]
@@ -309,8 +312,8 @@ cglobal pred4x4_horizontal_up_10_m

Re: [libav-devel] [PATCH 11/11] vp8: fix memset() crossing array boundary

2012-10-26 Thread Ronald S. Bultje
Hi,

On Fri, Oct 26, 2012 at 5:50 PM, Mans Rullgard  wrote:
> Indexing across array boundaries is not allowed by C99.
>
> Signed-off-by: Mans Rullgard 
> ---
>  libavcodec/vp8.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libavcodec/vp8.c b/libavcodec/vp8.c
> index 2766c9e..d9902c4 100644
> --- a/libavcodec/vp8.c
> +++ b/libavcodec/vp8.c
> @@ -1964,7 +1964,8 @@ static int vp8_decode_frame(AVCodecContext *avctx, void 
> *data, int *data_size,
>  // top edge of 127 for intra prediction
>  if (!(avctx->flags & CODEC_FLAG_EMU_EDGE)) {
>  s->top_border[0][15] = s->top_border[0][23] = 127;
> -memset(s->top_border[1]-1, 127, 
> s->mb_width*sizeof(*s->top_border)+1);
> +s->top_border[0][31] = 127;
> +memset(s->top_border[1], 127, s->mb_width*sizeof(*s->top_border));
>  }

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/2] swscale: support gray to 9bit and 10bit formats

2012-10-26 Thread Ronald S. Bultje
Hi,

On Fri, Oct 26, 2012 at 8:35 AM, Luca Barbato  wrote:
> Signed-off-by: Luca Barbato 
> ---
>
> Partially fixes the issue, there is still something wrong somewhere, possibly
> a buffer overrun or yet some memory not properly set.

How do we reproduce? What kind of msgs do you get in valgrind? Do
32/64bit outputs match?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] swscale: support gray to 9bit and 10bit formats

2012-10-25 Thread Ronald S. Bultje
Hi,

On Sun, Oct 14, 2012 at 5:22 PM, Luca Barbato  wrote:
> +if (isPlanar(dstFormat) && isALPHA(dstFormat) && !alpPixBuf) {
> +int length = dstW;
> +int height = dstY - lastDstY;
> +if (is9_OR_10BPS(dstFormat)) {
> +const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat);
> +fill_plane9or10(dst[3], dstStride[3], length, height, lastDstY,
> +255, desc->comp[3].depth_minus1 + 1,
> +isBE(dstFormat));
> +} else if (is16BPS(c->dstFormat))
> +   length *= 2;
> +
> +fillPlane(dst[3], dstStride[3], length, height, lastDstY, 255);

Doesn't this call fillPlane, even if we're 9 or 10 bits?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-10-24 Thread Ronald S. Bultje
Hi,

On Wed, Oct 24, 2012 at 10:07 AM, MĂĄns RullgĂĄrd  wrote:
> "Ronald S. Bultje"  writes:
>
>> From: "Ronald S. Bultje" 
>>
>> Use this in VP8/H264-8bit loopfilter functions so they can be used if
>> there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
>> ---
>>  libavcodec/x86/h264_deblock.asm |  27 ++-
>>  libavcodec/x86/h264dsp_init.c   |   4 +-
>>  libavcodec/x86/vp8dsp.asm   |  68 
>>  libavcodec/x86/vp8dsp_init.c|   8 --
>>  libavutil/x86/x86inc.asm| 167 
>> +---
>>  5 files changed, 181 insertions(+), 93 deletions(-)
>
> What happened to this?  Is there something wrong with the patch?

I am addressing reviews from the x264 people and am somewhat slow at
testing new revisions because of other work...

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] Problem with H.264 and NALU chunks

2012-10-19 Thread Ronald S. Bultje
Hi,

On Fri, Oct 19, 2012 at 6:55 PM, Felipe Contreras
 wrote:
> Hi,
>
> While testing an mpegts stream I found what appears to be a bug in the
> H.264 decoder that only happens on certain processors.
>
> The stream is here[1], but I've split the specific chunks I get from
> the demuxer. They are not complete access-units, but I've set the flag
> CODEC_FLAG2_CHUNKS. On one laptop it works perfectly fine, on another
> I get a bunch of corruption, errors, and valgrind complains that wrong
> memory accesses. The first one is a Core i5, the other a Core i7.
>
> This happens both on v0.8.3 and the master branch.
>
> If I use GStreamer's H.264 parser to aggregate the NAL units into
> access-units, it works fine.
>
> Any ideas?

Aren't you supposed to use the internal h264 parser to do the chunking
before feeding it into the decoder if you choose to not use the
built-in demuxers? (The built-in ones will automatically call the
parser for you if you call av_read_frame() instead of
av_read_packet().)

Or in other words, is this in any way reproducible with avconv/avplay?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 09/15] swscale: avoid pointless use of compound literals

2012-10-14 Thread Ronald S. Bultje
On Sun, Oct 14, 2012 at 8:11 PM, Mans Rullgard  wrote:
> Some compilers (e.g. old gcc) have trouble with these.
>
> Signed-off-by: Mans Rullgard 
> ---
>  libswscale/swscale_unscaled.c | 17 +
>  1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c
> index 70eff72..5efc647 100644
> --- a/libswscale/swscale_unscaled.c
> +++ b/libswscale/swscale_unscaled.c
> @@ -396,6 +396,11 @@ static int planarRgbToRgbWrapper(SwsContext *c, const 
> uint8_t *src[],
>   uint8_t *dst[], int dstStride[])
>  {
>  int alpha_first = 0;
> +const uint8_t *src102[] = { src[1], src[0], src[2] };
> +const uint8_t *src201[] = { src[2], src[0], src[1] };
> +int stride102[] = { srcStride[1], srcStride[0], srcStride[2] };
> +int stride201[] = { srcStride[2], srcStride[0], srcStride[1] };
> +
>  if (c->srcFormat != AV_PIX_FMT_GBRP) {
>  av_log(c, AV_LOG_ERROR, "unsupported planar RGB conversion %s -> 
> %s\n",
> av_get_pix_fmt_name(c->srcFormat),
> @@ -405,15 +410,13 @@ static int planarRgbToRgbWrapper(SwsContext *c, const 
> uint8_t *src[],
>
>  switch (c->dstFormat) {
>  case AV_PIX_FMT_BGR24:
> -gbr24ptopacked24((const uint8_t *[]) { src[1], src[0], src[2] },
> - (int []) { srcStride[1], srcStride[0], srcStride[2] 
> },
> +gbr24ptopacked24(src102, stride102,
>   dst[0] + srcSliceY * dstStride[0], dstStride[0],
>   srcSliceH, c->srcW);
>  break;

I find it slightly hilarious that there is no way in hell this would
have gotten in had the compiler's name been "gcc". Why don't you use
c99-to-c89 for this so-called "compiler"?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] rmdec: Move code shared with Matroska demuxer to RealMedia common code

2012-10-14 Thread Ronald S. Bultje
Hi,

On Sun, Oct 14, 2012 at 3:02 PM, Diego Biurrun  wrote:
> This removes a dependency of the Matroska demuxer on the RealMedia
> demuxer while only minimally bloating a RealMedia muxer only build.
> ---
>  libavformat/Makefile |4 ++--
>  libavformat/rm.c |   41 +
>  libavformat/rmdec.c  |   41 -
>  3 files changed, 43 insertions(+), 43 deletions(-)

I'm a little confused, rm.c contains code that is to be shared between
rmdec and rmenc. Shouldn't you add a new file (e.g. sipr.c) that
contains sipr-specific code to be shared between different demuxers
(e.g. rmdec and matroskadec)?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [RFC] pixfmt: support more yuva formats

2012-10-14 Thread Ronald S. Bultje
Hi,

On Sun, Oct 14, 2012 at 2:12 PM, Luca Barbato  wrote:
> On 10/14/2012 10:43 PM, Ronald S. Bultje wrote:
>> Hi,
>>
>> On Sun, Oct 14, 2012 at 1:39 PM, Luca Barbato  wrote:
>>> On 10/14/2012 09:54 PM, Ronald S. Bultje wrote:
>>>> Hi,
>>>>
>>>> On Sun, Oct 14, 2012 at 12:53 PM, Ronald S. Bultje  
>>>> wrote:
>>>>> Hi,
>>>>>
>>>>> On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje  
>>>>> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato  
>>>>>> wrote:
>>>>>>> ---
>>>>>>>
>>>>>>> Here an initial patch to support many yuva, apparently either I botched
>>>>>>> adding them (since I did lots of cut and paste) or they manage the 
>>>>>>> expose some
>>>>>>> flaws in swscale. valgrind manages to spot something and might be nice 
>>>>>>> if
>>>>>>> other people try this and see if the values for the fate-lavfi change 
>>>>>>> after
>>>>>>> each round.
>>>>>>
>>>>>> If this is what I think it is, search for YUVA420P in swscale and
>>>>>> replace it with a generic alpha check.
>>>>>
>>>>> Oh I see now.
>>>>>
>>>>> swscale.c:661
>>>>> if (isPlanar(dstFormat) && isALPHA(dstFormat) && !alpPixBuf)
>>>>> fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 
>>>>> 255);
>>>>> This only works for 8bit. For 9bit, you need a word fill with a value
>>>>> of (1 << numbits) - 1.
>>>>
>>>> Which actually leads to a question: does any of this exist in the real
>>>> world? I've never, ever, ever seen an image with non-8bpc alpha
>>>> planes. I hate to invent stuff that has no utility but does carry
>>>> implementation burden.
>>>
>>> You asked me to add those ^^;
>>
>> Uhm, ok, maybe I meant something else? I mean, there really isn't much
>> use for 9-15bpp alpha channels as far as I know. 8bpp is obvious and I
>> can sort of see 16bpp being useful (although I don't think I've ever
>> seen it used).
>>
>>> BTW while I was investigating I noticed
>>> all the previous 9 and 10 formats look strange, if you have time we
>>> could have a look.
>>
>> Can you elaborate?
>
> try this command
>
> ./avconv -v quiet -nostats -y -cpuflags all -i ./tests/vsynth1/%02d.pgm
> -vcodec rawvideo -pix_fmt yuv420p10 -f nut - | ./avplay -
>
> You will see it blue tinted while it should be gray.
>
> (you might need to apply my yet to be reviewed patch to fix nut with
> rawvideo).

Well yes, can we first sort out this stuff without nut? I mean, I
really don't have much reason to trust nut at this moment. When you
test a typical conversion (convert that thing to nut without
colorspace conversion, and then ./avplay -vf format=yuv420p10le), it
works, so the colorspace conversion itself is correct.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [RFC] pixfmt: support more yuva formats

2012-10-14 Thread Ronald S. Bultje
Hi,

On Sun, Oct 14, 2012 at 1:39 PM, Luca Barbato  wrote:
> On 10/14/2012 09:54 PM, Ronald S. Bultje wrote:
>> Hi,
>>
>> On Sun, Oct 14, 2012 at 12:53 PM, Ronald S. Bultje  
>> wrote:
>>> Hi,
>>>
>>> On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje  
>>> wrote:
>>>> Hi,
>>>>
>>>> On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato  wrote:
>>>>> ---
>>>>>
>>>>> Here an initial patch to support many yuva, apparently either I botched
>>>>> adding them (since I did lots of cut and paste) or they manage the expose 
>>>>> some
>>>>> flaws in swscale. valgrind manages to spot something and might be nice if
>>>>> other people try this and see if the values for the fate-lavfi change 
>>>>> after
>>>>> each round.
>>>>
>>>> If this is what I think it is, search for YUVA420P in swscale and
>>>> replace it with a generic alpha check.
>>>
>>> Oh I see now.
>>>
>>> swscale.c:661
>>> if (isPlanar(dstFormat) && isALPHA(dstFormat) && !alpPixBuf)
>>> fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 
>>> 255);
>>> This only works for 8bit. For 9bit, you need a word fill with a value
>>> of (1 << numbits) - 1.
>>
>> Which actually leads to a question: does any of this exist in the real
>> world? I've never, ever, ever seen an image with non-8bpc alpha
>> planes. I hate to invent stuff that has no utility but does carry
>> implementation burden.
>
> You asked me to add those ^^;

Uhm, ok, maybe I meant something else? I mean, there really isn't much
use for 9-15bpp alpha channels as far as I know. 8bpp is obvious and I
can sort of see 16bpp being useful (although I don't think I've ever
seen it used).

> BTW while I was investigating I noticed
> all the previous 9 and 10 formats look strange, if you have time we
> could have a look.

Can you elaborate?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [RFC] pixfmt: support more yuva formats

2012-10-14 Thread Ronald S. Bultje
Hi,

On Sun, Oct 14, 2012 at 12:53 PM, Ronald S. Bultje  wrote:
> Hi,
>
> On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje  wrote:
>> Hi,
>>
>> On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato  wrote:
>>> ---
>>>
>>> Here an initial patch to support many yuva, apparently either I botched
>>> adding them (since I did lots of cut and paste) or they manage the expose 
>>> some
>>> flaws in swscale. valgrind manages to spot something and might be nice if
>>> other people try this and see if the values for the fate-lavfi change after
>>> each round.
>>
>> If this is what I think it is, search for YUVA420P in swscale and
>> replace it with a generic alpha check.
>
> Oh I see now.
>
> swscale.c:661
> if (isPlanar(dstFormat) && isALPHA(dstFormat) && !alpPixBuf)
> fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 255);
> This only works for 8bit. For 9bit, you need a word fill with a value
> of (1 << numbits) - 1.

Which actually leads to a question: does any of this exist in the real
world? I've never, ever, ever seen an image with non-8bpc alpha
planes. I hate to invent stuff that has no utility but does carry
implementation burden.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [RFC] pixfmt: support more yuva formats

2012-10-14 Thread Ronald S. Bultje
Hi,

On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje  wrote:
> Hi,
>
> On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato  wrote:
>> ---
>>
>> Here an initial patch to support many yuva, apparently either I botched
>> adding them (since I did lots of cut and paste) or they manage the expose 
>> some
>> flaws in swscale. valgrind manages to spot something and might be nice if
>> other people try this and see if the values for the fate-lavfi change after
>> each round.
>
> If this is what I think it is, search for YUVA420P in swscale and
> replace it with a generic alpha check.

Oh I see now.

swscale.c:661
if (isPlanar(dstFormat) && isALPHA(dstFormat) && !alpPixBuf)
fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 255);
This only works for 8bit. For 9bit, you need a word fill with a value
of (1 << numbits) - 1.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [RFC] pixfmt: support more yuva formats

2012-10-13 Thread Ronald S. Bultje
Hi,

On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato  wrote:
> ---
>
> Here an initial patch to support many yuva, apparently either I botched
> adding them (since I did lots of cut and paste) or they manage the expose some
> flaws in swscale. valgrind manages to spot something and might be nice if
> other people try this and see if the values for the fate-lavfi change after
> each round.

If this is what I think it is, search for YUVA420P in swscale and
replace it with a generic alpha check.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/2] H.264: Convert 8-bit qpel inlined assembly to yasm

2012-10-13 Thread Ronald S. Bultje
Hi,

On Sat, Oct 13, 2012 at 9:05 AM, Diego Biurrun  wrote:
> On Sat, Oct 13, 2012 at 10:04:50AM -0500, Daniel Kang wrote:
>> +%macro op_avgh 3
>> +movh   %3, %2
>> +pavgb  %1, %3
>> +movh   %2, %1
>> +%endmacro
>> +
>> +%macro op_avg 3
>> +pavgb  %1, %2
>> +mova   %2, %1
>> +%endmacro
>> +
>> +%macro op_puth 3
>> +movh   %2, %1
>> +%endmacro
>> +
>> +%macro op_put 3
>> +mova   %2, %1
>> +%endmacro
>
> All of these macros only take 2 parameters.

op_avgh takes 3.

>> +.loop:
>> +movh  m1, [r1-1]
>> +movh  m2, [r1+0]
>> +movh  m3, [r1+1]
>> +movh  m0, [r1+2]
>
> Spaces around operators would help readability IMO; more below...

No spaces around operators in assembly dereferences please.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] Fix for FFMPEG vp8dec loopfilter delta values reset at keyframes

2012-10-12 Thread Ronald S. Bultje
Hi,

On Fri, Oct 12, 2012 at 7:18 AM, Luca Barbato  wrote:
> On 10/12/2012 04:15 PM, Ronald S. Bultje wrote:
>>> If you can point me where in the specification it is stated I'd like to
>>> point it in the commit message.
>>
>> It's only natural for all codec state to be reset in a keyframe. It is
>> supposed to be a re-entry point where all values are reset to
>> defaults.
>>
>> The patch itself is LGTM, I was about to push it myself just now.
>
> Please do =)

Pushed.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] Fix for FFMPEG vp8dec loopfilter delta values reset at keyframes

2012-10-12 Thread Ronald S. Bultje
Hi,

On Fri, Oct 12, 2012 at 7:12 AM, Luca Barbato  wrote:
> On 10/12/2012 12:42 PM, Sami Pietilä wrote:
>> Hi all,
>>
>> while testing a VP8 encoder I found a mismatch between FFMPEG and libvpx
>> VP8 decoders. The reason for this mismatch is that FFMPEG doesn't reset
>> loopfilter delta values at keyframes. Patch that fixes the issue is below.
>> I've verified that the output of ffmpeg after this patch matches libvpx. If
>> you need I can send you a stream that shows the issue.
>
> Would be nice to have a small vector to add it to fate, indeed.
>
> The patch itself looks like that this specific detail hadn't been
> considered, while implementing it.
>
> If you can point me where in the specification it is stated I'd like to
> point it in the commit message.

It's only natural for all codec state to be reset in a keyframe. It is
supposed to be a re-entry point where all values are reset to
defaults.

The patch itself is LGTM, I was about to push it myself just now.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/5] vc1dec: Invoke edge emulation regardless of MV precision for 1-MV chroma

2012-10-09 Thread Ronald S. Bultje
Hi,

On Tue, Oct 9, 2012 at 4:56 PM, Mashiat Sarker Shakkhar
 wrote:
> On 10/9/2012 7:37 PM, Ronald S. Bultje wrote:
>>
>> Hi,
>>
>> On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar
>>  wrote:
>>>
>>> This is required due to the way VC-1 handles chroma pull-back which may
>>> end
>>> up causing negative chroma MV for zero luma MV. Edge emulation needs to
>>> be
>>> invoked in such cases.
>>>
>>> This problem only affects chroma Y motion vector.
>>> ---
>>>   libavcodec/vc1dec.c |2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/libavcodec/vc1dec.c b/libavcodec/vc1dec.c
>>> index 42eb4a5..2683c86 100644
>>> --- a/libavcodec/vc1dec.c
>>> +++ b/libavcodec/vc1dec.c
>>> @@ -430,7 +430,7 @@ static void vc1_mc_1mv(VC1Context *v, int dir)
>>>   if (v->rangeredfrm || (v->mv_mode == MV_PMODE_INTENSITY_COMP)
>>>   || s->h_edge_pos < 22 || v_edge_pos < 22
>>>   || (unsigned)(src_x - s->mspel) > s->h_edge_pos - (mx&3) - 16 -
>>> s->mspel * 3
>>> -|| (unsigned)(src_y - s->mspel) > v_edge_pos- (my&3) - 16 -
>>> s->mspel * 3) {
>>> +|| (unsigned)(src_y - 1)> v_edge_pos- (my&3) - 16 -
>>> 3) {
>>
>>
>> I'm slightly curious why this is only the case for the vertical
>> dimension, but not the horizontal. Is that intentional, or an
>> oversight?
>
>
> This thread contains my earlier discussion on a rejected patch which tried
> to fix the same bug:
>
> http://patches.libav.org/patch/26240/

OK so it's not because of pullback per se, but because of field
adjustment, then it makes sense.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/5] vc1dec: Set chroma reference field from REFFIELD for 1REF field pictures

2012-10-09 Thread Ronald S. Bultje
Hi,

On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar
 wrote:
> Interlaced field pictures can have one or two reference pictures, signaled
> by NUMREF syntax element. For single reference pictures, reference picture
> is determined by REFFIELD syntax element.
> ---
>  libavcodec/vc1dec.c |1 +
>  1 file changed, 1 insertion(+)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 4/5] vc1dec: Set opposite to the correct value for 1REF field pictures

2012-10-09 Thread Ronald S. Bultje
Hi,

On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar
 wrote:
> ---
>  libavcodec/vc1dec.c |   14 ++
>  1 file changed, 10 insertions(+), 4 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 5/5] Double motion vector range for HPEL interlaced picture in proper place

2012-10-09 Thread Ronald S. Bultje
Hi,

On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar
 wrote:
> The existing code is not in the right place and it should cover both
> interlaced frame and field pictures.
> ---
>  libavcodec/vc1.c|5 +
>  libavcodec/vc1dec.c |4 
>  2 files changed, 5 insertions(+), 4 deletions(-)

OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/5] vc1dec: Invoke edge emulation regardless of MV precision for 1-MV chroma

2012-10-09 Thread Ronald S. Bultje
Hi,

On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar
 wrote:
> This is required due to the way VC-1 handles chroma pull-back which may end
> up causing negative chroma MV for zero luma MV. Edge emulation needs to be
> invoked in such cases.
>
> This problem only affects chroma Y motion vector.
> ---
>  libavcodec/vc1dec.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavcodec/vc1dec.c b/libavcodec/vc1dec.c
> index 42eb4a5..2683c86 100644
> --- a/libavcodec/vc1dec.c
> +++ b/libavcodec/vc1dec.c
> @@ -430,7 +430,7 @@ static void vc1_mc_1mv(VC1Context *v, int dir)
>  if (v->rangeredfrm || (v->mv_mode == MV_PMODE_INTENSITY_COMP)
>  || s->h_edge_pos < 22 || v_edge_pos < 22
>  || (unsigned)(src_x - s->mspel) > s->h_edge_pos - (mx&3) - 16 - 
> s->mspel * 3
> -|| (unsigned)(src_y - s->mspel) > v_edge_pos- (my&3) - 16 - 
> s->mspel * 3) {
> +|| (unsigned)(src_y - 1)> v_edge_pos- (my&3) - 16 - 3) {

I'm slightly curious why this is only the case for the vertical
dimension, but not the horizontal. Is that intentional, or an
oversight?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH] Support for MSVC.

2012-10-09 Thread Ronald S. Bultje
Hi,

On Tue, Oct 9, 2012 at 10:33 AM, aviad rozenhek  wrote:
>
> On Thu, Jun 14, 2012 at 1:59 AM, Ronald S. Bultje 
> wrote:
>>
>> Hi,
>>
>> see attached. It's kinda big, some parts can/should be done
>> differently, but it's a starting point so let's start talking about
>> how to get this in the right way.
>>
>> Some general comments:
>> - only tested on 32bit so far
>> - no it won't compile on non-MSVC with my patch (I had to break the
>> Makefile in a couple of places to get linking to work)
>> - the include of stdlib.h everywhere is because of [1]. We may need to
>> add av_restrict instead.
>> - the include of mathematics.h, avconfig.h and avstring.h everywhere
>> is (I think) legit, to account for snprintf(), M_PI or inline.
>> - the added assembly in fmtconvert.asm helps to get rid of a VLA in
>> fmtconvert_mmx.c.
>> - test.c converts c99 to c89 which MSVC can compile. Don't look at it
>> unless you want to see true pain, it's intended to be shipped outside,
>> like gas-preprocessor.pl. (unit.c and unit2.c are unittests for it.)
>> - the renames of libavutil/x86/cpu.c, libavcodec/x86/mlpdsp.c and
>> libswscale/x86/rgb2rgb.c are to prevent duplicate filenames in the
>> same library, which makes the MSVC linker crap out in debug mode
>> (works fine in release mode).
>> - it mostly passes fate, except zlib-dependent tests (because I didn't
>> try including zlib yet) and wtv (which uses gmtime(), which is broken
>> on Windows since it only supports years up to 3000, and our test file
>> uses 10900.) All other tests pass.
>>
>> Ronald
>>
>> [1]
>> http://social.msdn.microsoft.com/Forums/en-US/vcgeneral/thread/bf3c0191-c15d-4e73-a111-3796608900c7
>
>
> can you provide prebuilt binaries for c99-to-c89.exe for windows? it will
> make using it that much easier

https://gerrit.chromium.org/gerrit/#/c/34728/

You'll need c99wrap.exe also, but that one is trivial to build
yourself, since it has no dependencies.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 1/9] pixfmt: add AV_ prefixes to PIX_FMT_*

2012-10-06 Thread Ronald S. Bultje
Hi,

On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov  wrote:
> ---
>  doc/APIchanges   |8 ++
>  libavutil/Makefile   |1 +
>  libavutil/old_pix_fmts.h |  128 
>  libavutil/pixfmt.h   |  298 
> ++
>  libavutil/version.h  |7 +-
>  5 files changed, 311 insertions(+), 131 deletions(-)
>  create mode 100644 libavutil/old_pix_fmts.h

\o/ OK.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.

2012-10-06 Thread Ronald S. Bultje
Hi,

On Sat, Oct 6, 2012 at 9:47 PM, Anton Khirnov  wrote:
>
> On Sat, 6 Oct 2012 21:44:03 -0700, "Ronald S. Bultje"  
> wrote:
>> Hi,
>>
>> On Sat, Oct 6, 2012 at 9:19 PM, Anton Khirnov  wrote:
>> >
>> > On Sat, 6 Oct 2012 17:01:57 -0700, "Ronald S. Bultje"  
>> > wrote:
>> >> Hi,
>> >>
>> >> On Sat, Oct 6, 2012 at 1:28 PM, Anton Khirnov  wrote:
>> >> > On Sat, 6 Oct 2012 12:35:44 -0700, "Ronald S. Bultje" 
>> >> >  wrote:
>> >> >> On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov  
>> >> >> wrote:
>> >> >> > ---
>> >> >> >  avprobe.c |6 +++---
>> >> >> >  cmdutils.c|   16 +++-
>> >> >> >  tools/graph2dot.c |4 ++--
>> >> >> >  3 files changed, 16 insertions(+), 10 deletions(-)
>> >> >> >
>> >> >> > diff --git a/avprobe.c b/avprobe.c
>> >> >> > index 16a5d29..3a3ae0f 100644
>> >> >> > --- a/avprobe.c
>> >> >> > +++ b/avprobe.c
>> >> >> > @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext 
>> >> >> > *fmt_ctx, int stream_idx)
>> >> >> >  const char *profile;
>> >> >> >  char val_str[128];
>> >> >> >  AVRational display_aspect_ratio;
>> >> >> > +const AVPixFmtDescriptor *desc;
>> >> >> >
>> >> >> >  probe_object_header("stream");
>> >> >> >
>> >> >> > @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext 
>> >> >> > *fmt_ctx, int stream_idx)
>> >> >> >rational_string(val_str, sizeof(val_str), 
>> >> >> > ":",
>> >> >> >&display_aspect_ratio));
>> >> >> >  }
>> >> >> > -probe_str("pix_fmt",
>> >> >> > -  dec_ctx->pix_fmt != AV_PIX_FMT_NONE ?
>> >> >> > -  av_pix_fmt_descriptors[dec_ctx->pix_fmt].name 
>> >> >> > : "unknown");
>> >> >> > +desc = av_pix_fmt_desc_get(dec_ctx->pix_fmt);
>> >> >>
>> >> >> I think this is an awful idea, it should be OK to access the array 
>> >> >> directly.
>> >> >>
>> >> >
>> >> > I don't see any advantage to accessing the array directly.
>> >>
>> >> It often leads to smaller code. Instead of AVPixFmtDesc *d =
>> >> getter(pix_fmt); if (!d) return -1; printf("%d\n", d->something);, you
>> >> can just do printf("%d\n", array[pix_fmt].something);. Do we really
>> >> need to babysit our users?
>> >
>> > Checking for d to be non-NULL is equivalent to checking that pix_fmt is
>> > valid. So no, it does not necessarily makes the code longer.
>> >>
>> >> > It is the only case in all Libav where we export an array like this.
>> >>
>> >> We never supported MSVC. So we shouldn't?
>> >>
>> >
>> > YES! =p
>> >
>> > Seriously though, there are only disadvantages to exporting the array
>> > directly.
>> >
>> >> > It fixes the size of AVPixFmtDescriptor for no good reason.
>> >>
>> >> Is there a need to extend it?
>> >>
>> >
>> > There might be. You don't want to find out that you need to extend it
>> > two months after somebody else just bumped lavu.
>> >
>> >> > It prevents the caller from accessing all the descriptors when he's
>> >> > using a newer library than the one he compiled against.
>> >>
>> >> if (compile_ver > runtime_ver) error(); on app init. Likely, you want
>> >> to do that anyway.
>> >>
>> >> > And I hear there's some trouble with the evil microsoft thing and data
>> >> > symbols.
>> >>
>> >> You have that anyway for any symbol shared between lavu/f/fi/c/r/*.
>> >
>> > It should be our long-term goal to remove those.
>>
>> You can't. Stuff like ff_sqrt[], ff_log2[] etc. can't be removed. Never. 
>> Sorry.
>>
>
> The same way we can never have msvc support?

Right.

Shall we move on to more practical patches?

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.

2012-10-06 Thread Ronald S. Bultje
Hi,

On Sat, Oct 6, 2012 at 9:19 PM, Anton Khirnov  wrote:
>
> On Sat, 6 Oct 2012 17:01:57 -0700, "Ronald S. Bultje"  
> wrote:
>> Hi,
>>
>> On Sat, Oct 6, 2012 at 1:28 PM, Anton Khirnov  wrote:
>> > On Sat, 6 Oct 2012 12:35:44 -0700, "Ronald S. Bultje"  
>> > wrote:
>> >> On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov  wrote:
>> >> > ---
>> >> >  avprobe.c |6 +++---
>> >> >  cmdutils.c|   16 +++-
>> >> >  tools/graph2dot.c |4 ++--
>> >> >  3 files changed, 16 insertions(+), 10 deletions(-)
>> >> >
>> >> > diff --git a/avprobe.c b/avprobe.c
>> >> > index 16a5d29..3a3ae0f 100644
>> >> > --- a/avprobe.c
>> >> > +++ b/avprobe.c
>> >> > @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, 
>> >> > int stream_idx)
>> >> >  const char *profile;
>> >> >  char val_str[128];
>> >> >  AVRational display_aspect_ratio;
>> >> > +const AVPixFmtDescriptor *desc;
>> >> >
>> >> >  probe_object_header("stream");
>> >> >
>> >> > @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, 
>> >> > int stream_idx)
>> >> >rational_string(val_str, sizeof(val_str), 
>> >> > ":",
>> >> >&display_aspect_ratio));
>> >> >  }
>> >> > -probe_str("pix_fmt",
>> >> > -  dec_ctx->pix_fmt != AV_PIX_FMT_NONE ?
>> >> > -  av_pix_fmt_descriptors[dec_ctx->pix_fmt].name : 
>> >> > "unknown");
>> >> > +desc = av_pix_fmt_desc_get(dec_ctx->pix_fmt);
>> >>
>> >> I think this is an awful idea, it should be OK to access the array 
>> >> directly.
>> >>
>> >
>> > I don't see any advantage to accessing the array directly.
>>
>> It often leads to smaller code. Instead of AVPixFmtDesc *d =
>> getter(pix_fmt); if (!d) return -1; printf("%d\n", d->something);, you
>> can just do printf("%d\n", array[pix_fmt].something);. Do we really
>> need to babysit our users?
>
> Checking for d to be non-NULL is equivalent to checking that pix_fmt is
> valid. So no, it does not necessarily makes the code longer.
>>
>> > It is the only case in all Libav where we export an array like this.
>>
>> We never supported MSVC. So we shouldn't?
>>
>
> YES! =p
>
> Seriously though, there are only disadvantages to exporting the array
> directly.
>
>> > It fixes the size of AVPixFmtDescriptor for no good reason.
>>
>> Is there a need to extend it?
>>
>
> There might be. You don't want to find out that you need to extend it
> two months after somebody else just bumped lavu.
>
>> > It prevents the caller from accessing all the descriptors when he's
>> > using a newer library than the one he compiled against.
>>
>> if (compile_ver > runtime_ver) error(); on app init. Likely, you want
>> to do that anyway.
>>
>> > And I hear there's some trouble with the evil microsoft thing and data
>> > symbols.
>>
>> You have that anyway for any symbol shared between lavu/f/fi/c/r/*.
>
> It should be our long-term goal to remove those.

You can't. Stuff like ff_sqrt[], ff_log2[] etc. can't be removed. Never. Sorry.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.

2012-10-06 Thread Ronald S. Bultje
Hi,

On Sat, Oct 6, 2012 at 1:28 PM, Anton Khirnov  wrote:
> On Sat, 6 Oct 2012 12:35:44 -0700, "Ronald S. Bultje"  
> wrote:
>> On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov  wrote:
>> > ---
>> >  avprobe.c |6 +++---
>> >  cmdutils.c|   16 +++-
>> >  tools/graph2dot.c |4 ++--
>> >  3 files changed, 16 insertions(+), 10 deletions(-)
>> >
>> > diff --git a/avprobe.c b/avprobe.c
>> > index 16a5d29..3a3ae0f 100644
>> > --- a/avprobe.c
>> > +++ b/avprobe.c
>> > @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, int 
>> > stream_idx)
>> >  const char *profile;
>> >  char val_str[128];
>> >  AVRational display_aspect_ratio;
>> > +const AVPixFmtDescriptor *desc;
>> >
>> >  probe_object_header("stream");
>> >
>> > @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, int 
>> > stream_idx)
>> >rational_string(val_str, sizeof(val_str), ":",
>> >&display_aspect_ratio));
>> >  }
>> > -probe_str("pix_fmt",
>> > -  dec_ctx->pix_fmt != AV_PIX_FMT_NONE ?
>> > -  av_pix_fmt_descriptors[dec_ctx->pix_fmt].name : 
>> > "unknown");
>> > +desc = av_pix_fmt_desc_get(dec_ctx->pix_fmt);
>>
>> I think this is an awful idea, it should be OK to access the array directly.
>>
>
> I don't see any advantage to accessing the array directly.

It often leads to smaller code. Instead of AVPixFmtDesc *d =
getter(pix_fmt); if (!d) return -1; printf("%d\n", d->something);, you
can just do printf("%d\n", array[pix_fmt].something);. Do we really
need to babysit our users?

> It is the only case in all Libav where we export an array like this.

We never supported MSVC. So we shouldn't?

> It fixes the size of AVPixFmtDescriptor for no good reason.

Is there a need to extend it?

> It prevents the caller from accessing all the descriptors when he's
> using a newer library than the one he compiled against.

if (compile_ver > runtime_ver) error(); on app init. Likely, you want
to do that anyway.

> And I hear there's some trouble with the evil microsoft thing and data
> symbols.

You have that anyway for any symbol shared between lavu/f/fi/c/r/*.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.

2012-10-06 Thread Ronald S. Bultje
From: "Ronald S. Bultje" 

Use this in VP8/H264-8bit loopfilter functions so they can be used if
there is no aligned stack (e.g. MSVC 32bit or ICC 10.x).
---
 libavcodec/x86/h264_deblock.asm |  27 ++-
 libavcodec/x86/h264dsp_init.c   |   4 +-
 libavcodec/x86/vp8dsp.asm   |  68 
 libavcodec/x86/vp8dsp_init.c|   8 --
 libavutil/x86/x86inc.asm| 167 +---
 5 files changed, 181 insertions(+), 93 deletions(-)

diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm
index 940a8f7..43aaf6d 100644
--- a/libavcodec/x86/h264_deblock.asm
+++ b/libavcodec/x86/h264_deblock.asm
@@ -399,14 +399,12 @@ DEBLOCK_LUMA
 ;-
 ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
-cglobal deblock_%1_luma_8, 5,5
+cglobal deblock_%1_luma_8, 5, 6 - HAVE_ALIGNED_STACK, 0, 2 * %2
 lea r4, [r1*3]
 dec r2 ; alpha-1
 neg r4
 dec r3 ; beta-1
 add r4, r0 ; pix-3*stride
-%assign pad 2*%2+12-(stack_offset&15)
-SUB esp, pad
 
 movam0, [r4+r1]   ; p1
 movam1, [r4+2*r1] ; p0
@@ -444,22 +442,19 @@ cglobal deblock_%1_luma_8, 5,5
 DEBLOCK_P0_Q0
 mova[r4+2*r1], m1
 mova[r0], m2
-ADD esp, pad
 RET
 
 ;-
 ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t 
*tc0 )
 ;-
 INIT_MMX cpuname
-cglobal deblock_h_luma_8, 0,5
+cglobal deblock_h_luma_8, 0,6 - HAVE_ALIGNED_STACK, 0, 0x70
 movr0, r0mp
 movr3, r1m
 lear4, [r3*3]
 subr0, 4
 lear1, [r0+r4]
-%assign pad 0x78-(stack_offset&15)
-SUBesp, pad
-%define pix_tmp esp+12
+%define pix_tmp esp+16
 
 ; transpose 6x16 -> tmp space
 TRANSPOSE6x8_MEM  PASS8ROWS(r0, r1, r3, r4), pix_tmp
@@ -501,7 +496,6 @@ cglobal deblock_h_luma_8, 0,5
 movq   m3, [pix_tmp+0x48]
 TRANSPOSE8x4B_STORE  PASS8ROWS(r0, r1, r3, r4)
 
-ADDesp, pad
 RET
 %endmacro ; DEBLOCK_LUMA
 
@@ -632,7 +626,7 @@ DEBLOCK_LUMA v, 16
 %define mpb_0 m14
 %define mpb_1 m15
 %else
-%define spill(x) [esp+16*x+((stack_offset+4)&15)]
+%define spill(x) [esp+16*x]
 %define p2 [r4+r1]
 %define q2 [r0+2*r1]
 %define t4 spill(0)
@@ -647,10 +641,7 @@ DEBLOCK_LUMA v, 16
 ;-
 ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta )
 ;-
-cglobal deblock_%1_luma_intra_8, 4,6,16
-%if ARCH_X86_64 == 0
-sub esp, 0x60
-%endif
+cglobal deblock_%1_luma_intra_8, 4, 6, 16, ARCH_X86_32 * 0x50
 lea r4, [r1*4]
 lea r5, [r1*3] ; 3*stride
 dec r2d; alpha-1
@@ -699,9 +690,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16
 LUMA_INTRA_SWAP_PQ
 LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5]
 .end:
-%if ARCH_X86_64 == 0
-add esp, 0x60
-%endif
 RET
 
 INIT_MMX cpuname
@@ -738,12 +726,10 @@ cglobal deblock_h_luma_intra_8, 4,9
 addrsp, 0x88
 RET
 %else
-cglobal deblock_h_luma_intra_8, 2,4
+cglobal deblock_h_luma_intra_8, 2, 5 - HAVE_ALIGNED_STACK, 0, 0x80
 lear3,  [r1*3]
 subr0,  4
 lear2,  [r0+r3]
-%assign pad 0x8c-(stack_offset&15)
-SUBrsp, pad
 %define pix_tmp rsp
 
 ; transpose 8x16 -> tmp space
@@ -774,7 +760,6 @@ cglobal deblock_h_luma_intra_8, 2,4
 lear0,  [r0+r1*8]
 lear2,  [r2+r1*8]
 TRANSPOSE8x8_MEM  PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), 
PASS8ROWS(r0, r2, r1, r3)
-ADDrsp, pad
 RET
 %endif ; ARCH_X86_64
 %endmacro ; DEBLOCK_LUMA_INTRA
diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c
index 3f6ded4..8596f26 100644
--- a/libavcodec/x86/h264dsp_init.c
+++ b/libavcodec/x86/h264dsp_init.c
@@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int 
bit_depth,
 c->biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2;
 c->biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2;
 
-#if HAVE_ALIGNED_STACK
 c->h264_v_loop_filter_luma   = 
ff_deblock_v_luma_8_sse2;
 c->h264_h_loop_filter_luma   = 
ff_deblock_h_luma_8_sse2;
 c->h264_v_loop_filter_luma_intra = 
ff_deblock_v_luma_intra_8_sse2;
 c->h264_h_loop_filter_luma_intra = 
ff_deblock_h_luma_intra_8_sse2;
-#endif /* HAVE_ALIGNED_STACK */
 }
 if (EXTERNAL_SSSE3(mm_flags)) {
 c->bi

Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.

2012-10-06 Thread Ronald S. Bultje
Hi,

On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov  wrote:
> ---
>  avprobe.c |6 +++---
>  cmdutils.c|   16 +++-
>  tools/graph2dot.c |4 ++--
>  3 files changed, 16 insertions(+), 10 deletions(-)
>
> diff --git a/avprobe.c b/avprobe.c
> index 16a5d29..3a3ae0f 100644
> --- a/avprobe.c
> +++ b/avprobe.c
> @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, int 
> stream_idx)
>  const char *profile;
>  char val_str[128];
>  AVRational display_aspect_ratio;
> +const AVPixFmtDescriptor *desc;
>
>  probe_object_header("stream");
>
> @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, int 
> stream_idx)
>rational_string(val_str, sizeof(val_str), ":",
>&display_aspect_ratio));
>  }
> -probe_str("pix_fmt",
> -  dec_ctx->pix_fmt != AV_PIX_FMT_NONE ?
> -  av_pix_fmt_descriptors[dec_ctx->pix_fmt].name : 
> "unknown");
> +desc = av_pix_fmt_desc_get(dec_ctx->pix_fmt);

I think this is an awful idea, it should be OK to access the array directly.

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


Re: [libav-devel] [PATCH 2/9] Replace PIX_FMT_* -> AV_PIX_FMT_*, PixelFormat -> AVPixelFormat

2012-10-06 Thread Ronald S. Bultje
Hi,

On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov  wrote:
[..]

Yay! OK.

(Didn't review at all, a sed replace can't be that hard, plus this is awesome.)

Ronald
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel


<    1   2   3   4   5   6   7   8   9   10   >