Re: [libav-devel] [PATCH] mkv: add support for the VP9 tag
Hi, On Mon, Dec 17, 2012 at 8:16 PM, Luca Barbato wrote: > On 12/16/12 2:09 AM, Luca Barbato wrote: >> >> From: Tom Finegan > > >> +AV_CODEC_ID_VP9, >> > > Ping I had already said OK - if you add a big phat warning saying this is experimental and a non-stable bitstream (e.g. under the experimental flag for muxing purposes). Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: increase dist_scale_factor for up to 32 references
Hi, On Mon, Dec 17, 2012 at 10:55 PM, Janne Grunau wrote: > On 2012-12-17 18:24:37 -0800, Ronald S. Bultje wrote: >> Hi, >> On Dec 17, 2012 1:02 PM, "Janne Grunau" wrote: >> > >> > Compute dist_scale_factor_field only for MBAFF since that is the only >> > case in which it is used. >> >> The patch also increases an array size - why? > > to handle up to 32 references for field pictures. Alternatively both > arrays can probably be merged to the usual 16 + 32 size. Check the code: +for (i = 0; i < 2 * h->ref_count[0]; i++) +h->dist_scale_factor_field[field][i^field] = +get_scale_factor(h, poc, poc1, i+16); +} + +for (i = 0; i < h->ref_count[0]; i++){ h->dist_scale_factor[i] = get_scale_factor(h, poc, poc1, i); } For arrays of size: -int dist_scale_factor[16]; +int dist_scale_factor[32]; int dist_scale_factor_field[2][32]; If i for setting dist_scale_factor_field is [0, ref_count * 2] and that's [0, 32], them ref_count is 16, thus the second loop i [0, ref_count] is [0, 16], thus dist_scale_factor[] only needs to be of size 16. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 13/15] h264: break after error if enough NALs were seen in decode_nal_units
Hi, On Dec 12, 2012 12:31 PM, "Janne Grunau" wrote: > > --- > libavcodec/h264.c | 15 +++ > 1 file changed, 7 insertions(+), 8 deletions(-) > > diff --git a/libavcodec/h264.c b/libavcodec/h264.c > index 8847f1f..35cc800 100644 > --- a/libavcodec/h264.c > +++ b/libavcodec/h264.c > @@ -4070,17 +4070,16 @@ again: > hx->nal_unit_type, bit_length); > } > > -if (context_count == h->max_contexts) { > -execute_decode_slices(h, context_count); > +if (!err && context_count == h->max_contexts) { > +err = execute_decode_slices(h, context_count); > context_count = 0; > } Does this completely break ER? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 3/3] h264: prevent decoding of slice NALs in extradata
Hi, On Dec 17, 2012 1:41 PM, "Janne Grunau" wrote: > > It is not posible to call get_buffer during frame-mt codec > initialization. Libavformat might pass huge amounts of data as > extradata after parsing broken files. The 'extradata' for the fuzzed > sample sample_varPAR_s5374_r001-02.avi is 2.8M large and contains > multiple slices. OK, maybe log an error if you like. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/3] h264: clear picture copies when unreferencing them
Hi, On Dec 17, 2012 1:41 PM, "Janne Grunau" wrote: > > Copy a neighbouring frame/field from the list as error resilience > measure since the decoder assumes frame data pointers of known reference > to be valid. > > Prevents stale references in ref_list in the fuzzed sample > bipbop234.ts_s20118 caused by not refreshing the ref lists when required > due to slice decoding errors. So maybe I'm being an asshole - tell me if I am - but why would we unref an image if we're still referencing it? That seems broken. Shouldn't the code that unrefs clear and/or check this? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: increase dist_scale_factor for up to 32 references
Hi, On Dec 17, 2012 1:02 PM, "Janne Grunau" wrote: > > Compute dist_scale_factor_field only for MBAFF since that is the only > case in which it is used. The patch also increases an array size - why? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] Add a VideoDSPContext with core video DSP functions.
Hi, On Sun, Dec 16, 2012 at 9:33 AM, Luca Barbato wrote: > +void ff_videodsp_init_x86(VideoDSPContext *ctx, int bpc) [..] > +{ > +#if HAVE_YASM > +int mm_flags = av_get_cpu_flags(); > + > +#if ARCH_X86_32 > +if (bpc <= 8 && mm_flags & AV_CPU_FLAG_MMX) { > +c->emulated_edge_mc = emulated_edge_mc_mmx; s/c/ctx/. > +} > +if (mm_flags & AV_CPU_FLAG_AMD3DNOW) { AV_CPU_FLAG_3DNOW. Then it compiles on x86-32 also. I haven't tested on ppc or arm. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] x86inc: remove wrong assert in X86_32 PROLOGUE macro
Hi, On Sun, Dec 16, 2012 at 1:49 PM, Janne Grunau wrote: > On 2012-12-16 12:36:56 -0800, Ronald S. Bultje wrote: >> >> On Sun, Dec 16, 2012 at 5:52 AM, Janne Grunau wrote: >> > >> > Nasm also fails on '%1 %+ SUFFIX' with empty SUFFIX in cpuid.asm. It was >> > previously working since it was always followed by ', %2'. Easiest >> > solution I've found was adding parameters to cpu_cpuid_test in >> > cpuid.asm. >> >> Does it work if you add a comma behind it (but no arguments)? Or does >> cglobal_internal then think %0 == 2? > > It already failes earlier with: > libavutil/x86/cpuid.asm:70: error: (cglobal_internal:1) `%ifndef' > expects macro identifiers > > presumeably failing on cglobaled_cpu_cpuid_test, > > Adding a unused parameter to both cglobal_internal calls works ugly > as it is. The old one used this: diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm index 483ae7b..564ba50 100644 --- a/libavutil/x86/x86inc.asm +++ b/libavutil/x86/x86inc.asm @@ -614,12 +614,8 @@ DECLARE_ARG 7, 8, 9, 10, 11, 12, 13, 14 ; Applies any symbol mangling needed for C linkage, and sets up a define such t ; subsequent uses of the function name automatically refer to the mangled versi ; Appends cpuflags to the function name if cpuflags has been specified. -%macro cglobal 1-2+ ; name, [PROLOGUE args] -%if %0 == 1 -cglobal_internal %1 %+ SUFFIX -%else +%macro cglobal 1-2+ "" ; name, [PROLOGUE args] cglobal_internal %1 %+ SUFFIX, %2 -%endif %endmacro %macro cglobal_internal 1-2+ %ifndef cglobaled_%1 @@ -640,7 +636,7 @@ DECLARE_ARG 7, 8, 9, 10, 11, 12, 13, 14 %assign stack_offset 0 %assign stack_size 0 %assign stack_size_padded 0 -%if %0 > 1 +%ifnidn %2, "" PROLOGUE %2 %endif %endmacro Does that work? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] x86inc: remove wrong assert in X86_32 PROLOGUE macro
Hi, On Sun, Dec 16, 2012 at 5:52 AM, Janne Grunau wrote: > On 2012-12-14 17:12:05 -0800, Ronald S. Bultje wrote: >> Hi, >> >> On Fri, Dec 14, 2012 at 8:56 AM, Janne Grunau wrote: >> > On 2012-12-14 08:32:14 -0800, Ronald S. Bultje wrote: >> >> Hi, >> >> >> >> On Dec 14, 2012 3:30 AM, "Janne Grunau" wrote: >> >> > >> >> > Arguments on the stack are handled properly and functions can use more >> >> > than arguments than the 7 registers available on x86_32. >> >> > Fixes nasm build, yasm would fail too if it would error out on %error >> >> > instead of just emitting a warning. >> >> > --- >> >> > libavutil/x86/x86inc.asm | 1 - >> >> > 1 file changed, 1 deletion(-) >> >> > >> >> > diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm >> >> > index 60d05f4..40705e0 100644 >> >> > --- a/libavutil/x86/x86inc.asm >> >> > +++ b/libavutil/x86/x86inc.asm >> >> > @@ -554,7 +554,6 @@ DECLARE_ARG 7, 8, 9, 10, 11, 12, 13, 14 >> >> > %endif >> >> > SETUP_STACK_POINTER %4 >> >> > ASSERT regs_used <= 7 >> >> > -ASSERT regs_used >= num_args >> >> > PUSH_IF_USED 3, 4, 5, 6 >> >> > ALLOC_STACK %4 >> >> > LOAD_IF_USED 0, 1, 2, 3, 4, 5, 6 >> >> >> >> Please submit to x264 first. >> >> >> >> Where does this trigger? This shouldn't happen, an assert is the right >> >> thing to do. >> > >> > It is? make fate passes on x86_32 instances and yasm is affected too >> > except that it emits just a warning on %error. >> > >> > It triggers on h264_loop_filter_strength_mmxext in h264_deblock.asm >> > with 9 arguments. >> >> But we have code above that clips both num_regs and regs_used to 7, so >> both should be 7? > > if you mean num_args with num_regs then you removed the code to clamp it > to 7 in 6f40e9f070f7 'x86inc: support stack mem allocation and > re-alignment in PROLOGUE'. Re-adding that breaks several fate-h264 > tests. > > Nasm also fails on '%1 %+ SUFFIX' with empty SUFFIX in cpuid.asm. It was > previously working since it was always followed by ', %2'. Easiest > solution I've found was adding parameters to cpu_cpuid_test in > cpuid.asm. Does it work if you add a comma behind it (but no arguments)? Or does cglobal_internal then think %0 == 2? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/2] mkv: support vp9 tag
Hi, On Sat, Dec 15, 2012 at 2:28 PM, Luca Barbato wrote: > From: Tom Finegan > > --- > > Maybe is a bit early but since files are cropping already might be good to > have > it. > > libavcodec/avcodec.h | 1 + > libavformat/matroska.c | 1 + > 2 files changed, 2 insertions(+) > > diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h > index 29e3701..75655df 100644 > --- a/libavcodec/avcodec.h > +++ b/libavcodec/avcodec.h > @@ -238,6 +238,7 @@ enum AVCodecID { > AV_CODEC_ID_KGV1, > AV_CODEC_ID_YOP, > AV_CODEC_ID_VP8, > +AV_CODEC_ID_VP9, > AV_CODEC_ID_PICTOR, Breaks ABI. More generally, a big phat warning is missing that says the bitstream is not stable yet. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] Add a VideoDSPContext with core video DSP functions.
Hi, On Sat, Dec 15, 2012 at 2:20 PM, Diego Biurrun wrote: > On Sat, Dec 15, 2012 at 09:46:02AM -0800, Ronald S. Bultje wrote: >> --- /dev/null >> +++ b/libavcodec/arm/videodsp_arm.S >> @@ -0,0 +1,33 @@ >> +@ >> +@ ARMv4 optimized DSP utils >> +@ Copyright (c) 2004 AGAWA Koji >> + >> +#include "config.h" >> +#include "libavutil/arm/asm.S" >> + >> +#if HAVE_ARMV5TE_EXTERNAL >> +function ff_prefetch_arm, export=1 >> +subsr2, r2, #1 >> +pld [r0] >> +add r0, r0, r1 >> +bne ff_prefetch_arm >> +bx lr >> +endfunc >> +#endif > > You continue playing fast and loose with copyright. This was written > by Mans, not by Koji Agawa. > > Many of the instances where you add your name to the top of files in > this patch are also rather dubious. Which ones? I wrote all videodsp initialization code, and I wrote the x86 asm. I only see the ppc one as being dubious, where I did write the init code but not the prefetch, so possibly one name is missing, and perhaps the C code, which was written and touched by a bunch of people (I'm too lazy to look up who, feel free to fill it in for me). So my name is not misplaced anywhere, rather 1-2 names of other people may be missing here or there. The part where Mans' name is missing is because it's missing in the original source file also. I can hardly be blamed for that. Will fix though. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] Add a VideoDSPContext with core video DSP functions.
Hi, On Sat, Dec 15, 2012 at 9:15 AM, Janne Grunau wrote: > Everything quoted without remarks looks ok, I haven't looked at > the x86 asm. I guess only tested on x86? Yes. The emulated edge asm is a move so it needs no review. Review of the prefetch x86 asm would be nice, since it's a move from inline asm to yasm. All issues that you pointed out fixed locally. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: handle complementary field pairs in await_references()
Hi, On Fri, Dec 14, 2012 at 7:02 AM, Janne Grunau wrote: > On 2012-12-12 16:16:21 -0800, Ronald S. Bultje wrote: >> >> On Wed, Dec 12, 2012 at 3:22 PM, Janne Grunau wrote: >> > On 2012-12-12 14:39:34 -0800, Ronald S. Bultje wrote: >> >> Hi, >> >> >> >> On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau >> >> wrote: >> >> > Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid >> >> > field 2. >> >> > --- >> >> > libavcodec/h264.c | 3 ++- >> >> > 1 file changed, 2 insertions(+), 1 deletion(-) >> >> > >> >> > diff --git a/libavcodec/h264.c b/libavcodec/h264.c >> >> > index 546b046..d5a54e2 100644 >> >> > --- a/libavcodec/h264.c >> >> > +++ b/libavcodec/h264.c >> >> > @@ -433,7 +433,8 @@ static void await_references(H264Context *h) >> >> > ff_thread_await_progress(&ref_pic->f, >> >> > FFMIN((row >> 1), >> >> > pic_height - 1), >> >> > 0); >> >> > -} else if (FIELD_PICTURE && !ref_field_picture) { // >> >> > field referencing one field of a frame >> >> > +} else if (FIELD_PICTURE && >> >> > + (!ref_field_picture || ref_field > 1)) { // >> >> > field referencing one field of a frame or complementary field pair >> >> >> >> I don't understand this one. If we're referencing two fields, >> >> shouldn't ref_field_picture automatically be true? >> > >> > it isn't and the code that marks both fields of a complementory field >> > pair as available doesn't touch field_picture. >> >> Right, but isn't this wrong then? I mean, the reference is >> field-based, so we're going to reference a field. The part where both >> fields are available doesn't negate the fact that we're only >> referencing one. > > Yes the patch is wrong, I wasn't paying enough attention. > A complementary ref field pair can't handled in the same way as a frame. > We reported progress on the individual fields. I don't know how to > determine which field it references so the patch waits on both. eld > >> It seems to me that a change like this could potentially cause use of >> reference data before the second field of a reference is fully >> decoded. > > Yes, see above. > > Janne > > ---8<--- > Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid > field 2. > --- > libavcodec/h264.c | 5 + > 1 file changed, 5 insertions(+) > > diff --git a/libavcodec/h264.c b/libavcodec/h264.c > index f01ff67..be2a98b 100644 > --- a/libavcodec/h264.c > +++ b/libavcodec/h264.c > @@ -438,6 +438,11 @@ static void await_references(H264Context *h) > FFMIN(row * 2 + ref_field, > pic_height - 1), > 0); > +} else if (FIELD_PICTURE && ref_field_picture && ref_field > == 2) { // field referencing one field of a complementary field pair > +ff_thread_await_progress(&ref_pic->f, > + FFMIN(row, pic_height - 1), 1); > +ff_thread_await_progress(&ref_pic->f, > + FFMIN(row, pic_height - 1), 0); > } else if (FIELD_PICTURE) { I still don't get it, how does a field reference need to wait for two fields? The idea of a field reference is that we only need the data of 1 field. Can I get the sample? I'll try to see what's going on this weekend perhaps... All this looks weird. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: fix memleak on error during SPS parsing
Hi, On Fri, Dec 14, 2012 at 12:09 PM, Janne Grunau wrote: > Introduced in d7d6efe42b0d. > --- > libavcodec/h264_ps.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264-mt: handle NAL_DPAs before calling ff_thread_finish_setup
Hi, On Fri, Dec 14, 2012 at 2:33 PM, Janne Grunau wrote: > Hi, > On 2012-12-12 16:20:50 -0800, Ronald S. Bultje wrote: >> Hi, >> >> On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau >> wrote: >> > --- >> > libavcodec/h264.c | 13 - >> > libavcodec/h264.h | 1 + >> > 2 files changed, 13 insertions(+), 1 deletion(-) >> >> In what situation does this happen? > > One situation is data partitioning. NAL_DPA calls decode_slice_header > and can start a new frame but wasn't counted as needed frame. See > attached patch. With a little luck it make this patch and patch 12/15 > obsolete. > > Janne > > ---8<--- > > Since a NAL_DPA can start a new frame it has to be handled before > ff_thread_finish_setup is called. > --- > libavcodec/h264.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/libavcodec/h264.c b/libavcodec/h264.c > index b863c41..75192ac 100644 > --- a/libavcodec/h264.c > +++ b/libavcodec/h264.c > @@ -3929,6 +3929,7 @@ static int decode_nal_units(H264Context *h, const > uint8_t *buf, int buf_size) > case NAL_PPS: > nals_needed = nal_index; > break; > +case NAL_DPA: > case NAL_IDR_SLICE: > case NAL_SLICE: > init_get_bits(&hx->s.gb, ptr, bit_length); > -- > 1.7.12.4 That looks more in line with the intent of the current code, so is OK as a short-gap. Ideally, at some point we would add parsing for these kind of sequences, so they are input as 2 separate pkts to the decoder in 2 separate threads, thus running concurrently. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] x86inc: remove wrong assert in X86_32 PROLOGUE macro
Hi, On Fri, Dec 14, 2012 at 8:56 AM, Janne Grunau wrote: > On 2012-12-14 08:32:14 -0800, Ronald S. Bultje wrote: >> Hi, >> >> On Dec 14, 2012 3:30 AM, "Janne Grunau" wrote: >> > >> > Arguments on the stack are handled properly and functions can use more >> > than arguments than the 7 registers available on x86_32. >> > Fixes nasm build, yasm would fail too if it would error out on %error >> > instead of just emitting a warning. >> > --- >> > libavutil/x86/x86inc.asm | 1 - >> > 1 file changed, 1 deletion(-) >> > >> > diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm >> > index 60d05f4..40705e0 100644 >> > --- a/libavutil/x86/x86inc.asm >> > +++ b/libavutil/x86/x86inc.asm >> > @@ -554,7 +554,6 @@ DECLARE_ARG 7, 8, 9, 10, 11, 12, 13, 14 >> > %endif >> > SETUP_STACK_POINTER %4 >> > ASSERT regs_used <= 7 >> > -ASSERT regs_used >= num_args >> > PUSH_IF_USED 3, 4, 5, 6 >> > ALLOC_STACK %4 >> > LOAD_IF_USED 0, 1, 2, 3, 4, 5, 6 >> >> Please submit to x264 first. >> >> Where does this trigger? This shouldn't happen, an assert is the right >> thing to do. > > It is? make fate passes on x86_32 instances and yasm is affected too > except that it emits just a warning on %error. > > It triggers on h264_loop_filter_strength_mmxext in h264_deblock.asm > with 9 arguments. But we have code above that clips both num_regs and regs_used to 7, so both should be 7? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/2] vp56: release frames on error
Hi, On Dec 14, 2012 12:59 AM, "Luca Barbato" wrote: > > Fixes CVE-2012-2783 > > CC: libav-sta...@libav.org > --- > libavcodec/vp56.c | 8 +++- > 1 file changed, 7 insertions(+), 1 deletion(-) > > diff --git a/libavcodec/vp56.c b/libavcodec/vp56.c > index 6779ffb..5bd0a1a 100644 > --- a/libavcodec/vp56.c > +++ b/libavcodec/vp56.c > @@ -514,8 +514,14 @@ int ff_vp56_decode_frame(AVCodecContext *avctx, void *data, int *got_frame, > s->modelp = &s->models[is_alpha]; > > res = s->parse_header(s, buf, remaining_buf_size, &golden_frame); > -if (res < 0) > +if (res < 0) { > +int i; > +for (i = 0; i < 4; i++) { > +if (s->frames[i].data[0]) > +avctx->release_buffer(avctx, &s->frames[i]); > +} > return res; > +} > > if (res == VP56_SIZE_CHANGE) { > int i; What about error resilience / concealment? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] x86inc: remove wrong assert in X86_32 PROLOGUE macro
Hi, On Dec 14, 2012 3:30 AM, "Janne Grunau" wrote: > > Arguments on the stack are handled properly and functions can use more > than arguments than the 7 registers available on x86_32. > Fixes nasm build, yasm would fail too if it would error out on %error > instead of just emitting a warning. > --- > libavutil/x86/x86inc.asm | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm > index 60d05f4..40705e0 100644 > --- a/libavutil/x86/x86inc.asm > +++ b/libavutil/x86/x86inc.asm > @@ -554,7 +554,6 @@ DECLARE_ARG 7, 8, 9, 10, 11, 12, 13, 14 > %endif > SETUP_STACK_POINTER %4 > ASSERT regs_used <= 7 > -ASSERT regs_used >= num_args > PUSH_IF_USED 3, 4, 5, 6 > ALLOC_STACK %4 > LOAD_IF_USED 0, 1, 2, 3, 4, 5, 6 Please submit to x264 first. Where does this trigger? This shouldn't happen, an assert is the right thing to do. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] x86inc: fully concatenate tokens to fix macro expansion for nasm
Hi, On Thu, Dec 13, 2012 at 1:48 PM, Janne Grunau wrote: > Fixes build errors with nasm introduced in 6f40e9f070f7 for stack > memory alignment. Noticed by BugMaster. > --- > libavutil/x86/x86inc.asm | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 06/15] h264: handle complementary field pairs in await_references()
Hi, On Wed, Dec 12, 2012 at 4:29 PM, Janne Grunau wrote: > On 2012-12-12 16:16:21 -0800, Ronald S. Bultje wrote: >> Hi, >> >> On Wed, Dec 12, 2012 at 3:22 PM, Janne Grunau wrote: >> > On 2012-12-12 14:39:34 -0800, Ronald S. Bultje wrote: >> >> Hi, >> >> >> >> On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau >> >> wrote: >> >> > Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid >> >> > field 2. >> >> > --- >> >> > libavcodec/h264.c | 3 ++- >> >> > 1 file changed, 2 insertions(+), 1 deletion(-) >> >> > >> >> > diff --git a/libavcodec/h264.c b/libavcodec/h264.c >> >> > index 546b046..d5a54e2 100644 >> >> > --- a/libavcodec/h264.c >> >> > +++ b/libavcodec/h264.c >> >> > @@ -433,7 +433,8 @@ static void await_references(H264Context *h) >> >> > ff_thread_await_progress(&ref_pic->f, >> >> > FFMIN((row >> 1), >> >> > pic_height - 1), >> >> > 0); >> >> > -} else if (FIELD_PICTURE && !ref_field_picture) { // >> >> > field referencing one field of a frame >> >> > +} else if (FIELD_PICTURE && >> >> > + (!ref_field_picture || ref_field > 1)) { // >> >> > field referencing one field of a frame or complementary field pair >> >> >> >> I don't understand this one. If we're referencing two fields, >> >> shouldn't ref_field_picture automatically be true? >> > >> > it isn't and the code that marks both fields of a complementory field >> > pair as available doesn't touch field_picture. >> >> Right, but isn't this wrong then? I mean, the reference is >> field-based, so we're going to reference a field. The part where both >> fields are available doesn't negate the fact that we're only >> referencing one. >> >> It seems to me that a change like this could potentially cause use of >> reference data before the second field of a reference is fully >> decoded. > > I need to check > >> What does the hang look like? > > ref_field = ref_pic->f.reference; > > and it waits on ref_field in the next else that breaks when reference is > PICT_FRAME that's BOTTOM_FIELD | TOP_FIELD. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 11/15] h264: prevent starting new frames after ff_thread_finish_setup()
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau wrote: > --- > libavcodec/h264.c | 13 - > libavcodec/h264.h | 1 + > 2 files changed, 13 insertions(+), 1 deletion(-) In what situation does this happen? We should really make this an assert() at some point, and track the cases where it does happen back to ensure we fix it at whatever level. (The current hack to decode two fields in a packet is a hack and should die; we should parse such files better and split at the field boundary.) Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 06/15] h264: handle complementary field pairs in await_references()
Hi, On Wed, Dec 12, 2012 at 3:22 PM, Janne Grunau wrote: > On 2012-12-12 14:39:34 -0800, Ronald S. Bultje wrote: >> Hi, >> >> On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau >> wrote: >> > Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid >> > field 2. >> > --- >> > libavcodec/h264.c | 3 ++- >> > 1 file changed, 2 insertions(+), 1 deletion(-) >> > >> > diff --git a/libavcodec/h264.c b/libavcodec/h264.c >> > index 546b046..d5a54e2 100644 >> > --- a/libavcodec/h264.c >> > +++ b/libavcodec/h264.c >> > @@ -433,7 +433,8 @@ static void await_references(H264Context *h) >> > ff_thread_await_progress(&ref_pic->f, >> > FFMIN((row >> 1), pic_height >> > - 1), >> > 0); >> > -} else if (FIELD_PICTURE && !ref_field_picture) { // >> > field referencing one field of a frame >> > +} else if (FIELD_PICTURE && >> > + (!ref_field_picture || ref_field > 1)) { // >> > field referencing one field of a frame or complementary field pair >> >> I don't understand this one. If we're referencing two fields, >> shouldn't ref_field_picture automatically be true? > > it isn't and the code that marks both fields of a complementory field > pair as available doesn't touch field_picture. Right, but isn't this wrong then? I mean, the reference is field-based, so we're going to reference a field. The part where both fields are available doesn't negate the fact that we're only referencing one. It seems to me that a change like this could potentially cause use of reference data before the second field of a reference is fully decoded. What does the hang look like? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 04/15] mpegvideo: align width to 32 for scratch buffer sizes
Hi, On Wed, Dec 12, 2012 at 3:12 PM, Janne Grunau wrote: > On 2012-12-12 14:30:43 -0800, Ronald S. Bultje wrote: >> Hi, >> >> On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau >> wrote: >> > cmdutis.c's alloc_buffer() uses aligned to 32 width plus 2 edges of 32 >> > pixels as linesize. emu_edge_buffer has to work with the same stride. >> > This makes only a difference for > 8 bit per pixel bit depths since we >> > always allocate for 16 bit per pixel. >> > >> > Fixes fuzzed sample nasa-8s2.ts_s244342. >> > --- >> > libavcodec/mpegvideo.c | 13 - >> > 1 file changed, 8 insertions(+), 5 deletions(-) >> >> What if we implement a custom get_buffer() which returns bigger strides? > > it fails horrible for high pixel bit depth like it does with this > sample. I could add an assumed_linesize into the context and realloc if > the linesize returned by get_buffer is larger or maybe preallocate a > buffer here if it' feasible. Can't you allocate these buffers after you receive the first buffer from get_buffer() and actually know the linesize? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 06/15] h264: handle complementary field pairs in await_references()
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau wrote: > Fixes hang in HPCAMAPALQ_BRCM_B.264_s14038 while waiting on invalid > field 2. > --- > libavcodec/h264.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/libavcodec/h264.c b/libavcodec/h264.c > index 546b046..d5a54e2 100644 > --- a/libavcodec/h264.c > +++ b/libavcodec/h264.c > @@ -433,7 +433,8 @@ static void await_references(H264Context *h) > ff_thread_await_progress(&ref_pic->f, > FFMIN((row >> 1), pic_height - > 1), > 0); > -} else if (FIELD_PICTURE && !ref_field_picture) { // field > referencing one field of a frame > +} else if (FIELD_PICTURE && > + (!ref_field_picture || ref_field > 1)) { // field > referencing one field of a frame or complementary field pair I don't understand this one. If we're referencing two fields, shouldn't ref_field_picture automatically be true? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 05/15] h264: initialize frame-mt context copies properly
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau wrote: > --- > libavcodec/h264.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/libavcodec/h264.c b/libavcodec/h264.c > index e8a22f8..546b046 100644 > --- a/libavcodec/h264.c > +++ b/libavcodec/h264.c > @@ -1125,6 +1125,8 @@ static int decode_init_thread_copy(AVCodecContext > *avctx) > memset(h->sps_buffers, 0, sizeof(h->sps_buffers)); > memset(h->pps_buffers, 0, sizeof(h->pps_buffers)); > > +h->s.context_initialized = 0; > + > return 0; > } OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 04/15] mpegvideo: align width to 32 for scratch buffer sizes
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau wrote: > cmdutis.c's alloc_buffer() uses aligned to 32 width plus 2 edges of 32 > pixels as linesize. emu_edge_buffer has to work with the same stride. > This makes only a difference for > 8 bit per pixel bit depths since we > always allocate for 16 bit per pixel. > > Fixes fuzzed sample nasa-8s2.ts_s244342. > --- > libavcodec/mpegvideo.c | 13 - > 1 file changed, 8 insertions(+), 5 deletions(-) What if we implement a custom get_buffer() which returns bigger strides? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 03/15] mpegvideo: treat delayed pictures as used
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau wrote: > This requires to move the avcodec_default_free_buffers() call to > ff_MPV_common_end() since otherwise delayed pictures would get freed > during a size change. > --- > libavcodec/h264.h | 6 -- > libavcodec/mpegvideo.c | 8 > libavcodec/mpegvideo.h | 6 ++ > 3 files changed, 10 insertions(+), 10 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 02/15] lavc: do not overwrite frame parameters of delayed frames
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau wrote: > Decoders supporting frame parameter changes can return delayed frames > with the old parameters. > --- > libavcodec/utils.c | 15 ++- > 1 file changed, 10 insertions(+), 5 deletions(-) > > diff --git a/libavcodec/utils.c b/libavcodec/utils.c > index 1185a35..97fa570 100644 > --- a/libavcodec/utils.c > +++ b/libavcodec/utils.c > @@ -1282,11 +1282,16 @@ int attribute_align_arg > avcodec_decode_video2(AVCodecContext *avctx, AVFrame *pi > else { > ret = avctx->codec->decode(avctx, picture, got_picture_ptr, > avpkt); > -picture->pkt_dts = avpkt->dts; > -picture->sample_aspect_ratio = avctx->sample_aspect_ratio; > -picture->width = avctx->width; > -picture->height = avctx->height; > -picture->format = avctx->pix_fmt; > +picture->pkt_dts = avpkt->dts; > + > +/* delayed pictures might have different parameters than > + * the context returning them */ > +if (!(avctx->codec->capabilities & CODEC_CAP_DELAY)) { > +picture->sample_aspect_ratio = avctx->sample_aspect_ratio; > +picture->width = avctx->width; > +picture->height = avctx->height; > +picture->format = avctx->pix_fmt; > +} > } I can live with this. We can clean this further later. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 01/15] h264: set parameters from SPS whenever it changes
Hi, On Wed, Dec 12, 2012 at 12:30 PM, Janne Grunau wrote: > Fixes a crash in the fuzzed sample sample_varPAR.avi_s26638 with > alternating bit depths. > --- > libavcodec/h264.c| 107 > +-- > libavcodec/h264.h| 2 + > libavcodec/h264_ps.c | 3 ++ > 3 files changed, 65 insertions(+), 47 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Sat, Dec 8, 2012 at 4:12 PM, Ronald S. Bultje wrote: > From: "Ronald S. Bultje" > > Use this in VP8/H264-8bit loopfilter functions so they can be used if > there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). > --- > libavcodec/x86/h264_deblock.asm | 27 ++ > libavcodec/x86/h264dsp_init.c | 4 +- > libavcodec/x86/vp8dsp.asm | 68 --- > libavcodec/x86/vp8dsp_init.c| 8 -- > libavutil/x86/x86inc.asm| 185 > > 5 files changed, 191 insertions(+), 101 deletions(-) One more fix for invalid stack free'ing if a YMM function on win64 used >6 registers, but no stack. Any more reviews, or can this be applied? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
From: "Ronald S. Bultje" Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++ libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 --- libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 185 5 files changed, 191 insertions(+), 101 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index c124c4d..1f1dbc6 100644 --- a/libavcodec/x86/h264_deblock.asm +++ b/libavcodec/x86/h264_deblock.asm @@ -398,14 +398,12 @@ DEBLOCK_LUMA ;- ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- -cglobal deblock_%1_luma_8, 5,5 +cglobal deblock_%1_luma_8, 5,5,8,2*%2 lea r4, [r1*3] dec r2 ; alpha-1 neg r4 dec r3 ; beta-1 add r4, r0 ; pix-3*stride -%assign pad 2*%2+12-(stack_offset&15) -SUB esp, pad movam0, [r4+r1] ; p1 movam1, [r4+2*r1] ; p0 @@ -443,22 +441,19 @@ cglobal deblock_%1_luma_8, 5,5 DEBLOCK_P0_Q0 mova[r4+2*r1], m1 mova[r0], m2 -ADD esp, pad RET ;- ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- INIT_MMX cpuname -cglobal deblock_h_luma_8, 0,5 +cglobal deblock_h_luma_8, 0,5,8,0x60+HAVE_ALIGNED_STACK*12 movr0, r0mp movr3, r1m lear4, [r3*3] subr0, 4 lear1, [r0+r4] -%assign pad 0x78-(stack_offset&15) -SUBesp, pad -%define pix_tmp esp+12 +%define pix_tmp esp+12*HAVE_ALIGNED_STACK ; transpose 6x16 -> tmp space TRANSPOSE6x8_MEM PASS8ROWS(r0, r1, r3, r4), pix_tmp @@ -500,7 +495,6 @@ cglobal deblock_h_luma_8, 0,5 movq m3, [pix_tmp+0x48] TRANSPOSE8x4B_STORE PASS8ROWS(r0, r1, r3, r4) -ADDesp, pad RET %endmacro ; DEBLOCK_LUMA @@ -631,7 +625,7 @@ DEBLOCK_LUMA v, 16 %define mpb_0 m14 %define mpb_1 m15 %else -%define spill(x) [esp+16*x+((stack_offset+4)&15)] +%define spill(x) [esp+16*x] %define p2 [r4+r1] %define q2 [r0+2*r1] %define t4 spill(0) @@ -646,10 +640,7 @@ DEBLOCK_LUMA v, 16 ;- ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta ) ;- -cglobal deblock_%1_luma_intra_8, 4,6,16 -%if ARCH_X86_64 == 0 -sub esp, 0x60 -%endif +cglobal deblock_%1_luma_intra_8, 4,6,16,ARCH_X86_64*0x50-0x50 lea r4, [r1*4] lea r5, [r1*3] ; 3*stride dec r2d; alpha-1 @@ -698,9 +689,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16 LUMA_INTRA_SWAP_PQ LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5] .end: -%if ARCH_X86_64 == 0 -add esp, 0x60 -%endif RET INIT_MMX cpuname @@ -737,12 +725,10 @@ cglobal deblock_h_luma_intra_8, 4,9 addrsp, 0x88 RET %else -cglobal deblock_h_luma_intra_8, 2,4 +cglobal deblock_h_luma_intra_8, 2,4,8,0x80 lear3, [r1*3] subr0, 4 lear2, [r0+r3] -%assign pad 0x8c-(stack_offset&15) -SUBrsp, pad %define pix_tmp rsp ; transpose 8x16 -> tmp space @@ -773,7 +759,6 @@ cglobal deblock_h_luma_intra_8, 2,4 lear0, [r0+r1*8] lear2, [r2+r1*8] TRANSPOSE8x8_MEM PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), PASS8ROWS(r0, r2, r1, r3) -ADDrsp, pad RET %endif ; ARCH_X86_64 %endmacro ; DEBLOCK_LUMA_INTRA diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index ac231cb..73d4990 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, c->biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2; c->biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2; -#if HAVE_ALIGNED_STACK c->h264_v_loop_filter_luma = ff_deblock_v_luma_8_sse2; c->h264_h_loop_filter_luma = ff_deblock_h_luma_8_sse2; c->h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_sse2; c->h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_sse2; -#endif /* HAVE_ALIGNED_STACK */ } if (EXTERNAL_SSSE3(mm_flags)) { c->biweight_h264_pixels_tab[0
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Sat, Dec 8, 2012 at 8:41 AM, MĂĄns RullgĂĄrd wrote: > "Ronald S. Bultje" writes: > >> From: "Ronald S. Bultje" >> >> Use this in VP8/H264-8bit loopfilter functions so they can be used if >> there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). >> --- >> libavcodec/x86/h264_deblock.asm | 27 ++ >> libavcodec/x86/h264dsp_init.c | 4 +- >> libavcodec/x86/vp8dsp.asm | 68 --- >> libavcodec/x86/vp8dsp_init.c| 8 -- >> libavutil/x86/x86inc.asm| 185 >> >> 5 files changed, 191 insertions(+), 101 deletions(-) > > How is this different from the patch you sent yesterday? It adds a missing %endrep for win64. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
From: "Ronald S. Bultje" Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++ libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 --- libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 185 5 files changed, 191 insertions(+), 101 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index c124c4d..1f1dbc6 100644 --- a/libavcodec/x86/h264_deblock.asm +++ b/libavcodec/x86/h264_deblock.asm @@ -398,14 +398,12 @@ DEBLOCK_LUMA ;- ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- -cglobal deblock_%1_luma_8, 5,5 +cglobal deblock_%1_luma_8, 5,5,8,2*%2 lea r4, [r1*3] dec r2 ; alpha-1 neg r4 dec r3 ; beta-1 add r4, r0 ; pix-3*stride -%assign pad 2*%2+12-(stack_offset&15) -SUB esp, pad movam0, [r4+r1] ; p1 movam1, [r4+2*r1] ; p0 @@ -443,22 +441,19 @@ cglobal deblock_%1_luma_8, 5,5 DEBLOCK_P0_Q0 mova[r4+2*r1], m1 mova[r0], m2 -ADD esp, pad RET ;- ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- INIT_MMX cpuname -cglobal deblock_h_luma_8, 0,5 +cglobal deblock_h_luma_8, 0,5,8,0x60+HAVE_ALIGNED_STACK*12 movr0, r0mp movr3, r1m lear4, [r3*3] subr0, 4 lear1, [r0+r4] -%assign pad 0x78-(stack_offset&15) -SUBesp, pad -%define pix_tmp esp+12 +%define pix_tmp esp+12*HAVE_ALIGNED_STACK ; transpose 6x16 -> tmp space TRANSPOSE6x8_MEM PASS8ROWS(r0, r1, r3, r4), pix_tmp @@ -500,7 +495,6 @@ cglobal deblock_h_luma_8, 0,5 movq m3, [pix_tmp+0x48] TRANSPOSE8x4B_STORE PASS8ROWS(r0, r1, r3, r4) -ADDesp, pad RET %endmacro ; DEBLOCK_LUMA @@ -631,7 +625,7 @@ DEBLOCK_LUMA v, 16 %define mpb_0 m14 %define mpb_1 m15 %else -%define spill(x) [esp+16*x+((stack_offset+4)&15)] +%define spill(x) [esp+16*x] %define p2 [r4+r1] %define q2 [r0+2*r1] %define t4 spill(0) @@ -646,10 +640,7 @@ DEBLOCK_LUMA v, 16 ;- ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta ) ;- -cglobal deblock_%1_luma_intra_8, 4,6,16 -%if ARCH_X86_64 == 0 -sub esp, 0x60 -%endif +cglobal deblock_%1_luma_intra_8, 4,6,16,ARCH_X86_64*0x50-0x50 lea r4, [r1*4] lea r5, [r1*3] ; 3*stride dec r2d; alpha-1 @@ -698,9 +689,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16 LUMA_INTRA_SWAP_PQ LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5] .end: -%if ARCH_X86_64 == 0 -add esp, 0x60 -%endif RET INIT_MMX cpuname @@ -737,12 +725,10 @@ cglobal deblock_h_luma_intra_8, 4,9 addrsp, 0x88 RET %else -cglobal deblock_h_luma_intra_8, 2,4 +cglobal deblock_h_luma_intra_8, 2,4,8,0x80 lear3, [r1*3] subr0, 4 lear2, [r0+r3] -%assign pad 0x8c-(stack_offset&15) -SUBrsp, pad %define pix_tmp rsp ; transpose 8x16 -> tmp space @@ -773,7 +759,6 @@ cglobal deblock_h_luma_intra_8, 2,4 lear0, [r0+r1*8] lear2, [r2+r1*8] TRANSPOSE8x8_MEM PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), PASS8ROWS(r0, r2, r1, r3) -ADDrsp, pad RET %endif ; ARCH_X86_64 %endmacro ; DEBLOCK_LUMA_INTRA diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index ac231cb..73d4990 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, c->biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2; c->biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2; -#if HAVE_ALIGNED_STACK c->h264_v_loop_filter_luma = ff_deblock_v_luma_8_sse2; c->h264_h_loop_filter_luma = ff_deblock_h_luma_8_sse2; c->h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_sse2; c->h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_sse2; -#endif /* HAVE_ALIGNED_STACK */ } if (EXTERNAL_SSSE3(mm_flags)) { c->biweight_h264_pixels_tab[0
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Fri, Dec 7, 2012 at 2:08 PM, MĂĄns RullgĂĄrd wrote: > "Ronald S. Bultje" writes: > >> Hi, >> >> On Fri, Dec 7, 2012 at 2:01 PM, MĂĄns RullgĂĄrd wrote: >>> "Ronald S. Bultje" writes: >>>> On Fri, Dec 7, 2012 at 1:01 PM, MĂĄns RullgĂĄrd wrote: >>>>> "Ronald S. Bultje" writes: >>>>> >>>>>> +%if mmsize <= 16 && HAVE_ALIGNED_STACK >>>>> >>>>> How much overhead would it be to drop HAVE_ALIGNED_STACK entirely? >>>> >>>> Well, for now, we still have a ton of functions that don't use the >>>> cglobal-method of allocating stack. I only ported h264/vp8 loopfilter, >>>> nothing else. >>>> >>>> But anyway, more generally, it's 4-5 instructions per function. For >>>> typical functions with an inner loop, that's negligible, but for a >>>> select small set of functions, it may be significant. >>> >>> The remaining functions are ff_h264_idct8_add(4)_10_{sse2,avx}, >>> ff_hadamard8_diff(16)_{sse2,ssse3}, and something in swscale. >> >> lavr also. > > There are no references to HAVE_ALIGNED_STACK there. It crashes on x86-32 icc10.x and msvc. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Fri, Dec 7, 2012 at 2:01 PM, MĂĄns RullgĂĄrd wrote: > "Ronald S. Bultje" writes: >> On Fri, Dec 7, 2012 at 1:01 PM, MĂĄns RullgĂĄrd wrote: >>> "Ronald S. Bultje" writes: >>> >>>> +%if mmsize <= 16 && HAVE_ALIGNED_STACK >>> >>> How much overhead would it be to drop HAVE_ALIGNED_STACK entirely? >> >> Well, for now, we still have a ton of functions that don't use the >> cglobal-method of allocating stack. I only ported h264/vp8 loopfilter, >> nothing else. >> >> But anyway, more generally, it's 4-5 instructions per function. For >> typical functions with an inner loop, that's negligible, but for a >> select small set of functions, it may be significant. > > The remaining functions are ff_h264_idct8_add(4)_10_{sse2,avx}, > ff_hadamard8_diff(16)_{sse2,ssse3}, and something in swscale. lavr also. > Besides, does anyone still use 32-bit where performance is that > critical? This is used for YMM (e.g. avx float) stack alignment (to 32-byte) also, so it will affect 64-bit also. My personal point of view is that the code to take advantage of an actual feature of the compiler/system (alignment) is there. I don't see why we'd remove it. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] golomb: use unsigned arithmetics in svq3_get_ue_golomb()
Hi, On Fri, Nov 30, 2012 at 10:56 AM, Janne Grunau wrote: > This prevents undefined behaviour of signed left shift if the coded > value is larger than 2^31. Large values are most likely invalid and > caused errors or by feeding random. > > Validate every use of svq3_get_ue_golomb() and changed the place there > the return value was compared with negative numbers. dirac.c was clean, > fixed rv30 and svq3. > --- > libavcodec/golomb.h | 5 +++-- > libavcodec/rv30.c | 6 +++--- > libavcodec/svq3.c | 17 - > 3 files changed, 14 insertions(+), 14 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Fri, Dec 7, 2012 at 1:01 PM, MĂĄns RullgĂĄrd wrote: > "Ronald S. Bultje" writes: > >> +%if mmsize <= 16 && HAVE_ALIGNED_STACK > > How much overhead would it be to drop HAVE_ALIGNED_STACK entirely? Well, for now, we still have a ton of functions that don't use the cglobal-method of allocating stack. I only ported h264/vp8 loopfilter, nothing else. But anyway, more generally, it's 4-5 instructions per function. For typical functions with an inner loop, that's negligible, but for a select small set of functions, it may be significant. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] Check for invalid VLC code in zeros_left before writing coefficients.
From: "Ronald S. Bultje" This prevents an invalid write into coeffs[scantable[-1]] if zeros_left itself was an invalid VLC code (and thus -1). --- libavcodec/h264_cavlc.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/libavcodec/h264_cavlc.c b/libavcodec/h264_cavlc.c index 8702122..fa2bd99 100644 --- a/libavcodec/h264_cavlc.c +++ b/libavcodec/h264_cavlc.c @@ -610,17 +610,17 @@ static int decode_residual(H264Context *h, GetBitContext *gb, DCTELEM *block, in } \ } +if(zeros_left<0){ +av_log(h->s.avctx, AV_LOG_ERROR, "negative number of zero coeffs at %d %d\n", s->mb_x, s->mb_y); +return -1; +} + if (h->pixel_shift) { STORE_BLOCK(int32_t) } else { STORE_BLOCK(int16_t) } -if(zeros_left<0){ -av_log(h->s.avctx, AV_LOG_ERROR, "negative number of zero coeffs at %d %d\n", s->mb_x, s->mb_y); -return -1; -} - return 0; } -- 1.8.0 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
From: "Ronald S. Bultje" Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++ libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 --- libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 186 5 files changed, 191 insertions(+), 102 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index c124c4d..1f1dbc6 100644 --- a/libavcodec/x86/h264_deblock.asm +++ b/libavcodec/x86/h264_deblock.asm @@ -398,14 +398,12 @@ DEBLOCK_LUMA ;- ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- -cglobal deblock_%1_luma_8, 5,5 +cglobal deblock_%1_luma_8, 5,5,8,2*%2 lea r4, [r1*3] dec r2 ; alpha-1 neg r4 dec r3 ; beta-1 add r4, r0 ; pix-3*stride -%assign pad 2*%2+12-(stack_offset&15) -SUB esp, pad movam0, [r4+r1] ; p1 movam1, [r4+2*r1] ; p0 @@ -443,22 +441,19 @@ cglobal deblock_%1_luma_8, 5,5 DEBLOCK_P0_Q0 mova[r4+2*r1], m1 mova[r0], m2 -ADD esp, pad RET ;- ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- INIT_MMX cpuname -cglobal deblock_h_luma_8, 0,5 +cglobal deblock_h_luma_8, 0,5,8,0x60+HAVE_ALIGNED_STACK*12 movr0, r0mp movr3, r1m lear4, [r3*3] subr0, 4 lear1, [r0+r4] -%assign pad 0x78-(stack_offset&15) -SUBesp, pad -%define pix_tmp esp+12 +%define pix_tmp esp+12*HAVE_ALIGNED_STACK ; transpose 6x16 -> tmp space TRANSPOSE6x8_MEM PASS8ROWS(r0, r1, r3, r4), pix_tmp @@ -500,7 +495,6 @@ cglobal deblock_h_luma_8, 0,5 movq m3, [pix_tmp+0x48] TRANSPOSE8x4B_STORE PASS8ROWS(r0, r1, r3, r4) -ADDesp, pad RET %endmacro ; DEBLOCK_LUMA @@ -631,7 +625,7 @@ DEBLOCK_LUMA v, 16 %define mpb_0 m14 %define mpb_1 m15 %else -%define spill(x) [esp+16*x+((stack_offset+4)&15)] +%define spill(x) [esp+16*x] %define p2 [r4+r1] %define q2 [r0+2*r1] %define t4 spill(0) @@ -646,10 +640,7 @@ DEBLOCK_LUMA v, 16 ;- ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta ) ;- -cglobal deblock_%1_luma_intra_8, 4,6,16 -%if ARCH_X86_64 == 0 -sub esp, 0x60 -%endif +cglobal deblock_%1_luma_intra_8, 4,6,16,ARCH_X86_64*0x50-0x50 lea r4, [r1*4] lea r5, [r1*3] ; 3*stride dec r2d; alpha-1 @@ -698,9 +689,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16 LUMA_INTRA_SWAP_PQ LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5] .end: -%if ARCH_X86_64 == 0 -add esp, 0x60 -%endif RET INIT_MMX cpuname @@ -737,12 +725,10 @@ cglobal deblock_h_luma_intra_8, 4,9 addrsp, 0x88 RET %else -cglobal deblock_h_luma_intra_8, 2,4 +cglobal deblock_h_luma_intra_8, 2,4,8,0x80 lear3, [r1*3] subr0, 4 lear2, [r0+r3] -%assign pad 0x8c-(stack_offset&15) -SUBrsp, pad %define pix_tmp rsp ; transpose 8x16 -> tmp space @@ -773,7 +759,6 @@ cglobal deblock_h_luma_intra_8, 2,4 lear0, [r0+r1*8] lear2, [r2+r1*8] TRANSPOSE8x8_MEM PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), PASS8ROWS(r0, r2, r1, r3) -ADDrsp, pad RET %endif ; ARCH_X86_64 %endmacro ; DEBLOCK_LUMA_INTRA diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index ac231cb..73d4990 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, c->biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2; c->biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2; -#if HAVE_ALIGNED_STACK c->h264_v_loop_filter_luma = ff_deblock_v_luma_8_sse2; c->h264_h_loop_filter_luma = ff_deblock_h_luma_8_sse2; c->h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_sse2; c->h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_sse2; -#endif /* HAVE_ALIGNED_STACK */ } if (EXTERNAL_SSSE3(mm_flags)) { c->biweight_h264_pixels_tab[0
Re: [libav-devel] [PATCH 1/1] h264: slice-mt: get last_pic_dropable from master context
Hi, On Wed, Dec 5, 2012 at 11:00 AM, Janne Grunau wrote: > Fixes fate-h264-conformance-cvnlfi2_sony_h and smllwebdl.mkv from > https://github.com/OpenELEC/OpenELEC.tv/issues/1557 . > --- > libavcodec/h264.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/libavcodec/h264.c b/libavcodec/h264.c > index cd8aeec..b9c46ea 100644 > --- a/libavcodec/h264.c > +++ b/libavcodec/h264.c > @@ -2759,7 +2759,7 @@ static int decode_slice_header(H264Context *h, > H264Context *h0) > h->mb_mbaff= 0; > h->mb_aff_frame= 0; > last_pic_structure = s0->picture_structure; > -last_pic_dropable = s->dropable; > +last_pic_dropable = s0->dropable; OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: slice-mt: check master context for valid current_picture_ptr
Hi, On Wed, Dec 5, 2012 at 11:10 AM, Janne Grunau wrote: > Fixes errors in slice based multithreading introduced in 0b300daad2f5. > > CC: libav-sta...@libav.org > --- > libavcodec/h264.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/libavcodec/h264.c b/libavcodec/h264.c > index b9c46ea..317ee95 100644 > --- a/libavcodec/h264.c > +++ b/libavcodec/h264.c > @@ -2782,7 +2782,7 @@ static int decode_slice_header(H264Context *h, > H264Context *h0) > s->picture_structure = last_pic_structure; > s->dropable = last_pic_dropable; > return AVERROR_INVALIDDATA; > -} else if (!s->current_picture_ptr) { > +} else if (!s0->current_picture_ptr) { > av_log(s->avctx, AV_LOG_ERROR, > "unset current_picture_ptr on %d. slice\n", > h0->current_slice + 1); OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/2] x86: float_dsp: fix compilation of ff_vector_dmul_scalar_avx() on x86-32
Hi, On Wed, Dec 5, 2012 at 9:53 AM, Justin Ruggles wrote: > --- > libavutil/x86/float_dsp.asm |2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/libavutil/x86/float_dsp.asm b/libavutil/x86/float_dsp.asm > index d8fd93a..4a1742f 100644 > --- a/libavutil/x86/float_dsp.asm > +++ b/libavutil/x86/float_dsp.asm > @@ -127,7 +127,7 @@ cglobal vector_dmul_scalar, 3,3,3, dst, src, len > cglobal vector_dmul_scalar, 4,4,3, dst, src, mul, len > %endif > %if ARCH_X86_32 > -VBROADCASTSD xmm0, mulm > +VBROADCASTSD m0, mulm OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/2] x86: float_dsp: fix loading of the len parameter on x86-32
Hi, On Wed, Dec 5, 2012 at 9:53 AM, Justin Ruggles wrote: > --- > libavutil/x86/float_dsp.asm |3 +++ > 1 files changed, 3 insertions(+), 0 deletions(-) > > diff --git a/libavutil/x86/float_dsp.asm b/libavutil/x86/float_dsp.asm > index 4a1742f..dc75532 100644 > --- a/libavutil/x86/float_dsp.asm > +++ b/libavutil/x86/float_dsp.asm > @@ -127,6 +127,9 @@ cglobal vector_dmul_scalar, 3,3,3, dst, src, len > cglobal vector_dmul_scalar, 4,4,3, dst, src, mul, len > %endif > %if ARCH_X86_32 > +; PROLOGUE loads len from the wrong stack address because mul is an > 8-byte > +; parameter and PROLOGUE assumes all parameters are 4-byte > +mov lenq, [esp+0x18] > VBROADCASTSD m0, mulm That loads len twice. Why not: %if ARCH_X86_32 cglobal vector_dmul_scalar, 3,4,3, dst, src, mul, len, lenaddr mov lenq, lenaddrm %else cglobal vector_dmul_scalar, 4,4,3, dst, src, mul, len %endif which is more functionally correct? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 03/10] SBR DSP x86: implement SSE qmf_post_shuffle
Hi, On Fri, Nov 30, 2012 at 1:14 PM, Christophe Gisquet wrote: > Hello, > > 2012/11/30 Loren Merritt : >> If you increment an index into W and z rather than the pointers >> themselves, then you can eliminate an add and a cmp. > > I add already tested that, and redid it: > cglobal sbr_qmf_post_shuffle, 2,4,3,W,z > mov r3q, 32*4 > lea r2q, [zq + (64-4)*4] > addzq, r3q > leaWq, [Wq + 2*r3q] > neg r3q > .loop: > mova m0, [r2q] > mova m1, [zq + r3q] > xorps m0, [ps_neg] > shufps m0, m0, 0x1B > mova m2, m0 > unpcklps m0, m1 > unpckhps m2, m1 > mova [Wq + 2*r3q + 0], m0 > mova [Wq + 2*r3q + 16], m2 > sub r2q, 16 > add r3q, 16 > jl .loop > REP_RET > > It's 2 cycles slower on Penrynn/Win64 (154 vs 152). Try adding an "ALIGN 16" just above ".loop:", maybe that fixes it? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] wavenc: write fact chunk sample count at the correct file position
Hi, On Mon, Nov 26, 2012 at 4:11 PM, Justin Ruggles wrote: > From: Michael Niedermayer > > Fixes curruption of metadata in the INFO chunk. > > Signed-off-by: Michael Niedermayer > Signed-off-by: Justin Ruggles > --- > libavformat/wavenc.c |9 + > 1 files changed, 5 insertions(+), 4 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 3/4] h264: check ref_count validity for num_ref_idx_active_override_flag
On Mon, Nov 26, 2012 at 7:54 AM, Janne Grunau wrote: > On 2012-11-26 07:01:20 -0800, Ronald S. Bultje wrote: >> Hi, >> >> On Mon, Nov 26, 2012 at 4:06 AM, Janne Grunau wrote: >> > Fixes segfault in the fuzzed sample bipbop234.ts_s226407. >> > >> > CC: libav-sta...@libav.org >> > --- >> > libavcodec/h264.c | 7 ++- >> > 1 file changed, 6 insertions(+), 1 deletion(-) >> > >> > diff --git a/libavcodec/h264.c b/libavcodec/h264.c >> > index eca4636..730e34a 100644 >> > --- a/libavcodec/h264.c >> > +++ b/libavcodec/h264.c >> > @@ -2896,8 +2896,13 @@ static int decode_slice_header(H264Context *h, >> > H264Context *h0) >> > >> > if (num_ref_idx_active_override_flag) { >> > h->ref_count[0] = get_ue_golomb(&s->gb) + 1; >> > -if (h->slice_type_nos == AV_PICTURE_TYPE_B) >> > +if (h->ref_count[0] < 1) >> > +return AVERROR_INVALIDDATA; >> > +if (h->slice_type_nos == AV_PICTURE_TYPE_B) { >> > h->ref_count[1] = get_ue_golomb(&s->gb) + 1; >> > +if (h->ref_count[1] < 1) >> > +return AVERROR_INVALIDDATA; >> > +} >> > } >> > >> > if (h->slice_type_nos == AV_PICTURE_TYPE_B) >> > -- >> > 1.7.12.4 >> >> get_ue_golomb() guarantees to return something non-negative, so is >> this an integer overflow? We should then prevent the integer overflow. > > It returns -1 if the next 32 bits are all 0. That is arguebly not a > valid exp-golomb code but occurs with the sample mentioned in the commit > message (available in incoming/2012-11-15_fuzz_h264_janne as my other > fuzzed samples). Oh I see, patch OK then. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/4] h264: check sps.log2_max_frame_num for validity
Hi, On Mon, Nov 26, 2012 at 4:27 AM, Janne Grunau wrote: > On 2012-11-26 13:22:51 +0100, Kostya Shishkov wrote: >> On Mon, Nov 26, 2012 at 01:06:18PM +0100, Janne Grunau wrote: >> > Fixes infinitive or long taking loop in frame num gap code in >> > the fuzzed sample bipbop234.ts_s223302. >> > >> > CC: libav-sta...@libav.org >> > --- >> > libavcodec/h264_ps.c | 9 + >> > 1 file changed, 9 insertions(+) >> > >> > diff --git a/libavcodec/h264_ps.c b/libavcodec/h264_ps.c >> > index 810f69f..dc6b676 100644 >> > --- a/libavcodec/h264_ps.c >> > +++ b/libavcodec/h264_ps.c >> > @@ -37,6 +37,9 @@ >> > //#undef NDEBUG >> > #include >> > >> > +#define MAX_LOG2_MAX_FRAME_NUM(12 + 4) >> > +#define MIN_LOG2_MAX_FRAME_NUM4 >> > + >> > static const AVRational pixel_aspect[17]={ >> > {0, 1}, >> > {1, 1}, >> > @@ -349,6 +352,12 @@ int ff_h264_decode_seq_parameter_set(H264Context *h){ >> > } >> > >> > sps->log2_max_frame_num= get_ue_golomb(&s->gb) + 4; >> > +if (sps->log2_max_frame_num > MAX_LOG2_MAX_FRAME_NUM || >> > +sps->log2_max_frame_num < MIN_LOG2_MAX_FRAME_NUM) { >> > +av_log(h->s.avctx, AV_LOG_ERROR, "log2_max_frame_num out of range >> > " >> > + "(4-16): %d\n", sps->log2_max_frame_num); >> > +return AVERROR_INVALIDDATA; >> > +} >> > sps->poc_type= get_ue_golomb_31(&s->gb); >> > >> > if(sps->poc_type == 0){ //FIXME #define >> > -- >> >> LGTM though I suspect it's useless to check for the minimum size (unless it >> overflows). > > The min check is added to protect against overflows. The sample in the > commit msg doesn't overflow but is only slightly lower than INT_MAX. Here, too, we should then protect against the actual overflow itself from happening, not so much check that it just happened. number = read_golomb(); if number >= MIN_VALUE && number - MIN_VALUE >= MAX_VALUE error; number += MIN_VALUE; Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 3/4] h264: check ref_count validity for num_ref_idx_active_override_flag
Hi, On Mon, Nov 26, 2012 at 4:06 AM, Janne Grunau wrote: > Fixes segfault in the fuzzed sample bipbop234.ts_s226407. > > CC: libav-sta...@libav.org > --- > libavcodec/h264.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/libavcodec/h264.c b/libavcodec/h264.c > index eca4636..730e34a 100644 > --- a/libavcodec/h264.c > +++ b/libavcodec/h264.c > @@ -2896,8 +2896,13 @@ static int decode_slice_header(H264Context *h, > H264Context *h0) > > if (num_ref_idx_active_override_flag) { > h->ref_count[0] = get_ue_golomb(&s->gb) + 1; > -if (h->slice_type_nos == AV_PICTURE_TYPE_B) > +if (h->ref_count[0] < 1) > +return AVERROR_INVALIDDATA; > +if (h->slice_type_nos == AV_PICTURE_TYPE_B) { > h->ref_count[1] = get_ue_golomb(&s->gb) + 1; > +if (h->ref_count[1] < 1) > +return AVERROR_INVALIDDATA; > +} > } > > if (h->slice_type_nos == AV_PICTURE_TYPE_B) > -- > 1.7.12.4 get_ue_golomb() guarantees to return something non-negative, so is this an integer overflow? We should then prevent the integer overflow. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: enable low delay only if no delayed frames were seen
Hi, On Sun, Nov 25, 2012 at 2:40 PM, Janne Grunau wrote: > Dropping frames is undesirable but that is the only way by which the > decoder could return to low delay mode. Instead emit a warning and > continue with delayed frames. > Fixes a crash in fuzzed sample nasa-8s2.ts_s20033 caused by a larger > than expected has_b_frames value. Low delay keeps getting re-enabled > from a presumely broken SPS. > --- > libavcodec/h264.c | 10 -- > 1 file changed, 8 insertions(+), 2 deletions(-) > > diff --git a/libavcodec/h264.c b/libavcodec/h264.c > index f45c572..658d865 100644 > --- a/libavcodec/h264.c > +++ b/libavcodec/h264.c > @@ -3867,8 +3867,14 @@ again: > > if (s->flags & CODEC_FLAG_LOW_DELAY || > (h->sps.bitstream_restriction_flag && > - !h->sps.num_reorder_frames)) > -s->low_delay = 1; > + !h->sps.num_reorder_frames)) { > +if (s->avctx->has_b_frames > 1 || h->delayed_pic[0]) > +av_log(avctx, AV_LOG_WARNING, "Delayed frames seen " > + "reenabling low delay requires a codec " > + "flush.\n"); > +else > +s->low_delay = 1; > +} That looks good, thanks. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] dsputil: modify scalarproduct_int16 to handle mod8 numbers of loops.
Hi, On Sun, Nov 25, 2012 at 8:43 AM, Ronald S. Bultje wrote: > (This is kind of a pain on Fedora Whoops, I meant Mac here. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] dsputil: modify scalarproduct_int16 to handle mod8 numbers of loops.
Hi, On Sun, Nov 25, 2012 at 1:18 AM, Diego Biurrun wrote: > On Sat, Nov 24, 2012 at 03:43:04PM +0100, Christophe Gisquet wrote: >> >> PS: no idea why firefox/gmail treats those patches as octet-stream, >> and how to fix that. > > Ronald wrote about this a while back, but there seems to be no easy > solution: > > http://blogs.gnome.org/rbultje/2009/03/04/firefox-gmail-and-content-type-of-attachments/ The solution is basically to use git send-email, not gmail. git send-email can use your gmail account as smtp, if wanted. I can help you set that up if this doesn't help enough: http://morefedora.blogspot.com/2009/02/configuring-git-send-email-to-use-gmail.html (This is kind of a pain on Fedora, since you'll first have to install a bunch of perl extensions. I'm gonna hope you're not using a Mac at this point. See http://kbase.wincent.com/old/knowledge-base/Installing_Net::SMTP::SSL_for_sending_patches_with_Git_over_secure_SMTP.html for gory details.) Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: reset has_b_frames after enabling low_delay from SPS
Hi, On Wed, Nov 21, 2012 at 12:29 PM, Janne Grunau wrote: > On 2012-11-21 09:13:27 -0800, Ronald S. Bultje wrote: > > On Wed, Nov 21, 2012 at 5:37 AM, Janne Grunau >wrote: > > > On 2012-11-21 13:14:34 +, Loren Merritt wrote: > > > > On Wed, 21 Nov 2012, Janne Grunau wrote: > > > > > On 2012-11-16 18:14:29 -0800, Ronald S. Bultje wrote: > > > > > > > > > > > > So I'm going to have to wonder what happens to the delayed frames > > > > > > already > > > > > > cached in h->delayed_pics[]? Are they still output? > > > > > > > > > > yes, delayed pictures are returned as long as h->delayed_pics[0] > is not > > > > > NULL. > > > > > > > > And when you return something from delayed_pics[] instead of the new > > > > frame that just got decoded, the new frame gets delayed, so that you > > > > don't ever actually switch to low delay mode? > > > > > > Since we can't return multiple frames at once we can't switch to low > > > delay mode once we have delayed pictures. If low delay after switching > > > from a stream with delayed pictures to a low_delay stream is important > > > and dropping frames is an option requiring a decoder flush from sounds > > > reasonable to me. > > > > > > > That's one option. We could also simply say that low_delay can't be > > re-enabled once disabled, so even though the SPS says we're low-delay, > just > > ignore it. I don't really have a preference either way, whichever is > easier > > implementation-wise or whichever you feel is a better representation of > > what the end user of a non-fuzzed bitstream likely wanted to happen. (In > > practice, this probably never happens anyway?) > > The current patch seems to work fine and fate-h264 still passes so this > approach is easier for me. I also think that not dropping frames is > less surprising option. Expecting tests of sender and receiver in > situations where low delay is important is not unreasonable. I can add a > warning when low_delay gets enabled with delayed frames. Well, what I meant is that we probably shouldn't reset s->avctx->has_b_frames and s->low_delay, since that may give the (incorrect) assumption to the application that there is no delay... A warning would be good, yes. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/1] h264: reset has_b_frames after enabling low_delay from SPS
Hi, On Wed, Nov 21, 2012 at 5:37 AM, Janne Grunau wrote: > On 2012-11-21 13:14:34 +, Loren Merritt wrote: > > On Wed, 21 Nov 2012, Janne Grunau wrote: > > > On 2012-11-16 18:14:29 -0800, Ronald S. Bultje wrote: > > > > > > > > So I'm going to have to wonder what happens to the delayed frames > already > > > > cached in h->delayed_pics[]? Are they still output? > > > > > > yes, delayed pictures are returned as long as h->delayed_pics[0] is not > > > NULL. > > > > And when you return something from delayed_pics[] instead of the new > > frame that just got decoded, the new frame gets delayed, so that you > > don't ever actually switch to low delay mode? > > Since we can't return multiple frames at once we can't switch to low > delay mode once we have delayed pictures. If low delay after switching > from a stream with delayed pictures to a low_delay stream is important > and dropping frames is an option requiring a decoder flush from sounds > reasonable to me. > That's one option. We could also simply say that low_delay can't be re-enabled once disabled, so even though the SPS says we're low-delay, just ignore it. I don't really have a preference either way, whichever is easier implementation-wise or whichever you feel is a better representation of what the end user of a non-fuzzed bitstream likely wanted to happen. (In practice, this probably never happens anyway?) Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC PATCH 1/2] h264: set parameters from SPS whenever it changes
Hi, On Fri, Nov 16, 2012 at 8:43 AM, Janne Grunau wrote: > I'm not entirely convinced that is the the correct/best way to fix the > problem with this sample. It crashes in draw_edges_10_c because the > image format and parameters has just 8 bit per pixels. > > The main problem I see is that it does far too many context re-inits. > > Janne > ---8<--- > > Fixes a crash in the fuzzed sample sample_varPAR.avi_s26638 with > alternating bit depths. > --- > libavcodec/h264.c | 88 > ++- > 1 file changed, 48 insertions(+), 40 deletions(-) > So ... I originally added code to not support changing bitdepth at all, basically to prevent this and alike issues. E.g., what if the reference is 8bit but the next frame is 10bit? What if we're using frame threading at the same time? So I wouldn't mind just erroring out altogether, unless you think this can really be made to work? I mean, if we want to support reinits, with frame threading, then we should just rewrite the init code and allow complete reinits and be convinced it always works, like we are for ffvp8. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/2] h264: reset has_b_frames after enabling low_delay from SPS
Hi, On Fri, Nov 16, 2012 at 8:43 AM, Janne Grunau wrote: > Fixes a crash in fuzzed file nasa-8s2.ts_s20033 caused by a too large > has_b_frames value. low_delay keeps getting re-enabled from the the > presumely broken SPS. > --- > libavcodec/h264.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/libavcodec/h264.c b/libavcodec/h264.c > index c30c478..7935fe6 100644 > --- a/libavcodec/h264.c > +++ b/libavcodec/h264.c > @@ -2346,8 +2346,10 @@ static int h264_set_parameter_from_sps(H264Context > *h) > > if (s->flags & CODEC_FLAG_LOW_DELAY || > (h->sps.bitstream_restriction_flag && > - !h->sps.num_reorder_frames)) > + !h->sps.num_reorder_frames)) { > s->low_delay = 1; > +s->avctx->has_b_frames = 0; > +} > So I'm going to have to wonder what happens to the delayed frames already cached in h->delayed_pics[]? Are they still output? What happens if we concatenate a delayed h264 file by a file that has this kind of sps header? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86: lavr: fix stack allocation for 7 and 8 channel downmixing on x86-32
Hi, On Mon, Nov 12, 2012 at 12:41 PM, Luca Barbato wrote: > On 11/12/2012 07:35 PM, Justin Ruggles wrote: > > Fixes crashes on Win32 and stack overruns on x86-32 in general. > > Who can test it? > http://bugzilla.libav.org/show_bug.cgi?id=338 Reproducible on x86-32 Linux/Mac with asan. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86: lavr: fix stack allocation for 7 and 8 channel downmixing on x86-32
Hi, On Thu, Nov 8, 2012 at 2:17 PM, Justin Ruggles wrote: > On 10/29/2012 04:39 PM, Justin Ruggles wrote: > > From: Ronald S. Bultje > > > > Fixes crashes on Win32 and stack overruns on x86-32 in general. > > --- > > libavresample/x86/audio_mix.asm |7 +++ > > 1 files changed, 7 insertions(+), 0 deletions(-) > > > > diff --git a/libavresample/x86/audio_mix.asm > b/libavresample/x86/audio_mix.asm > > index 0c4a9bd..cdc787d 100644 > > --- a/libavresample/x86/audio_mix.asm > > +++ b/libavresample/x86/audio_mix.asm > > @@ -280,6 +280,13 @@ cglobal mix_%1_to_%2_%3_flt, > 3,in_channels+2,needed_mmregs+matrix_elements_mm, s > > sub rsp, matrix_elements_stack * mmsize > > %else > > %assign pad matrix_elements_stack * mmsize + (mmsize - gprsize) - > (stack_offset & (mmsize - gprsize)) > > +; on x86-32 for 7 and 8 channels we need more stack space for src > pointers > > +%if ARCH_X86_32 && in_channels >= 7 > > +%assign pad pad + 0x10 > > +%define src5m [rsp+0xc0] > > +%define src6m [rsp+0xc4] > > +%define src7m [rsp+0xc8] > > +%endif > > SUB rsp, pad > > %endif > > %endif > > ping Well it's my patch and you reviewed it, isn't that enough? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86: h264: Convert 8-bit QPEL inline assembly to YASM
Hi, On Mon, Nov 5, 2012 at 10:48 AM, Diego Biurrun wrote: > +%macro QPEL8_H_LOWPASS_OP 1 > +cglobal %1_h264_qpel8_h_lowpass, 4,5 ; dst, src, dstStride, srcStride > +mov r4d, 8 Each and every single one of these needs a stride sign extension like you did in dsputil.asm. Do you have numbers to compare performance of qpel (both with START/STOP_TIMER as well as with total wallclock time, e.g. using the "time" command) before/after this patch? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86: h264_qpel: sign-extend stride argument in put_pixels16_sse2()
Hi, On Mon, Nov 5, 2012 at 1:22 PM, Diego Biurrun wrote: > --- > > This fixes the crash in fate-vp5 with the QPEL code converted to YASM. > I'm sending this separately to not spam everybody with 100+kB patches. > The patch is meant to be squashed into the QPEL yasmification. > ; void put_pixels16_sse2(uint8_t *block, const uint8_t *pixels, int > line_size, int h) > cglobal put_pixels16, 4,5,4 > +movsxdifnidnr2, r2d > lea r4, [r2*3] > .loop: No, I said each occurence of cglobal needed this, not just one of them. Also, please fix indenting of this instruction to be aligned with the line below. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 4/4] x86util: Add cpuflags_mmxext alias for cpuflags_mmx2
Hi, On Mon, Oct 29, 2012 at 7:28 PM, Diego Biurrun wrote: > "mmxext" is a more sensible name and more common in outside projects. > --- > libavutil/x86/x86util.asm |1 + > 1 files changed, 1 insertions(+), 0 deletions(-) No. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/4] x86: include x86inc.asm in x86util.asm
Hi, On Mon, Oct 29, 2012 at 7:28 PM, Diego Biurrun wrote: > -%include "x86inc.asm" > %include "x86util.asm" Didn't you come up with the rule that each file should contain all headers it directly takes symbols from? How do these changes fit with that scheme? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/2] pixfmt: support more yuva formats
Hi, On Mon, Oct 29, 2012 at 7:14 PM, Luca Barbato wrote: > Signed-off-by: Luca Barbato > --- > libavcodec/raw.c | 19 > libavformat/nut.c | 28 + > libavutil/pixdesc.c | 234 > ++ > libavutil/pixfmt.h| 28 + > libswscale/utils.c| 18 > tests/ref/lavfi/pixdesc | 18 > tests/ref/lavfi/pixfmts_copy | 18 > tests/ref/lavfi/pixfmts_null | 18 > tests/ref/lavfi/pixfmts_scale | 18 > tests/ref/lavfi/pixfmts_vflip | 18 > 10 files changed, 417 insertions(+) OK, assuming it works without new warnings under valgrind. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/2] swscale: support gray to 9bit and 10bit formats
Hi, On Mon, Oct 29, 2012 at 7:14 PM, Luca Barbato wrote: > With the input of Kostya and Ronald. > --- > libswscale/swscale.c | 38 -- > libswscale/swscale_unscaled.c | 32 ++-- > 2 files changed, 66 insertions(+), 4 deletions(-) So it works now, under valgrind? LGTM then. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86inc: Add cpuflags_mmxext alias for cpuflags_mmx2
Hi, On Mon, Oct 29, 2012 at 5:56 PM, Diego Biurrun wrote: > On Mon, Oct 29, 2012 at 05:25:08PM -0700, Ronald S. Bultje wrote: >> On Mon, Oct 29, 2012 at 5:15 PM, Diego Biurrun wrote: >> > This allows using "mmxext" as name in Libav while staying compatible >> > with changes to the YASM macro infrastructure imported from x264. >> > --- >> > libavutil/x86/x86inc.asm |1 + >> > 1 files changed, 1 insertions(+), 0 deletions(-) >> > >> > --- a/libavutil/x86/x86inc.asm >> > +++ b/libavutil/x86/x86inc.asm >> > @@ -35,6 +35,7 @@ >> > ; to x264-de...@videolan.org . >> > >> > %define program_name ff >> > +%define cpuflags_mmxext cpuflags_mmx2 >> >> Any such a change to x86inc.asm is OK if and only if x264 accepts this >> change upstream. Loren? > > The x86inc.asm files from x264 and Libav are already different, x264 uses > "x264" instead of "ff" as program_name. That's the reason why I chose to > place the alias in this very spot: during upstream merges there will be > no conflicts that are not > > a) trivial to fix and > b) would have occurred anyway due to the existing difference. > > This patch is also what a room full of Libav developers accepted as a > compromise at VDD. Please don't oppose the consensus of your peers. You keep trying to push this through. No. I do the syncing, and nobody wants to take that job from me, I assure you. Every time the semantics of cglobal change by however little, someone who knows that stuff needs to go in and fix/port it. I prefer to not have to take any additional tasks on me, such as maintaining forky differences. Thus, please, no. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86inc: Add cpuflags_mmxext alias for cpuflags_mmx2
Hi, On Mon, Oct 29, 2012 at 5:15 PM, Diego Biurrun wrote: > This allows using "mmxext" as name in Libav while staying compatible > with changes to the YASM macro infrastructure imported from x264. > --- > libavutil/x86/x86inc.asm |1 + > 1 files changed, 1 insertions(+), 0 deletions(-) > > diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm > index 1fe9f55..2655154 100644 > --- a/libavutil/x86/x86inc.asm > +++ b/libavutil/x86/x86inc.asm > @@ -35,6 +35,7 @@ > ; to x264-de...@videolan.org . > > %define program_name ff > +%define cpuflags_mmxext cpuflags_mmx2 Any such a change to x86inc.asm is OK if and only if x264 accepts this change upstream. Loren? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/2] Use PRED4x4/8x8/8x8L/16x16 macros to declare x86 intrapred prototypes.
Hi, On Mon, Oct 29, 2012 at 10:25 AM, Luca Barbato wrote: > On 10/29/2012 04:44 AM, Ronald S. Bultje wrote: >> From: "Ronald S. Bultje" >> > > If you don't mind > > x86: use PRED4x4/8x8/8x8L/16x16 macros to declare intrapred prototypes > > might fit better. Will change locally. Any other comments on the code or can I commit with that modified? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] Remove usage of INIT_AVX in h264_intrapred_10bit.asm.
From: "Ronald S. Bultje" Replace INIT_AVX by INIT_XMM avx. Port the whole file to use cpuflag based function declarations. Remove (now unused) cputype argument in function declaration macros. Change function prototypes to have mmx2 instead of mmxext as suffix, since that's required by cpuflags. --- libavcodec/x86/h264_intrapred_10bit.asm | 306 libavcodec/x86/h264_intrapred_init.c| 40 ++--- 2 files changed, 177 insertions(+), 169 deletions(-) diff --git a/libavcodec/x86/h264_intrapred_10bit.asm b/libavcodec/x86/h264_intrapred_10bit.asm index 529134e..c3f6dc4 100644 --- a/libavcodec/x86/h264_intrapred_10bit.asm +++ b/libavcodec/x86/h264_intrapred_10bit.asm @@ -53,8 +53,8 @@ SECTION .text ;- ; void pred4x4_down_right(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_DR 1 -cglobal pred4x4_down_right_10_%1, 3,3 +%macro PRED4x4_DR 0 +cglobal pred4x4_down_right_10, 3, 3 sub r0, r2 lea r1, [r0+r2*2] movhpsm1, [r1-8] @@ -79,21 +79,22 @@ cglobal pred4x4_down_right_10_%1, 3,3 RET %endmacro -INIT_XMM +INIT_XMM sse2 %define PALIGNR PALIGNR_MMX -PRED4x4_DR sse2 +PRED4x4_DR +INIT_XMM ssse3 %define PALIGNR PALIGNR_SSSE3 -PRED4x4_DR ssse3 +PRED4x4_DR %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_DR avx +INIT_XMM avx +PRED4x4_DR %endif ;- ; void pred4x4_vertical_right(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_VR 1 -cglobal pred4x4_vertical_right_10_%1, 3,3,6 +%macro PRED4x4_VR 0 +cglobal pred4x4_vertical_right_10, 3, 3, 6 sub r0, r2 lea r1, [r0+r2*2] movqm5, [r0]; t3t2t1t0 @@ -119,21 +120,22 @@ cglobal pred4x4_vertical_right_10_%1, 3,3,6 RET %endmacro -INIT_XMM +INIT_XMM sse2 %define PALIGNR PALIGNR_MMX -PRED4x4_VR sse2 +PRED4x4_VR +INIT_XMM ssse3 %define PALIGNR PALIGNR_SSSE3 -PRED4x4_VR ssse3 +PRED4x4_VR %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_VR avx +INIT_XMM avx +PRED4x4_VR %endif ;- ; void pred4x4_horizontal_down(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_HD 1 -cglobal pred4x4_horizontal_down_10_%1, 3,3 +%macro PRED4x4_HD 0 +cglobal pred4x4_horizontal_down_10, 3, 3 subr0, r2 lear1, [r0+r2*2] movq m0, [r0-8] ; lt .. @@ -162,14 +164,15 @@ cglobal pred4x4_horizontal_down_10_%1, 3,3 RET %endmacro -INIT_XMM +INIT_XMM sse2 %define PALIGNR PALIGNR_MMX -PRED4x4_HD sse2 +PRED4x4_HD +INIT_XMM ssse3 %define PALIGNR PALIGNR_SSSE3 -PRED4x4_HD ssse3 +PRED4x4_HD %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_HD avx +INIT_XMM avx +PRED4x4_HD %endif ;- @@ -192,8 +195,8 @@ PRED4x4_HD avx HADDD %1, %2 %endmacro -INIT_MMX -cglobal pred4x4_dc_10_mmxext, 3,3 +INIT_MMX mmx2 +cglobal pred4x4_dc_10, 3, 3 subr0, r2 lear1, [r0+r2*2] movq m2, [r0+r2*1-8] @@ -216,8 +219,8 @@ cglobal pred4x4_dc_10_mmxext, 3,3 ;- ; void pred4x4_down_left(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_DL 1 -cglobal pred4x4_down_left_10_%1, 3,3 +%macro PRED4x4_DL 0 +cglobal pred4x4_down_left_10, 3, 3 subr0, r2 movq m0, [r0] movhps m0, [r1] @@ -236,18 +239,18 @@ cglobal pred4x4_down_left_10_%1, 3,3 RET %endmacro -INIT_XMM -PRED4x4_DL sse2 +INIT_XMM sse2 +PRED4x4_DL %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_DL avx +INIT_XMM avx +PRED4x4_DL %endif ;- ; void pred4x4_vertical_left(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_VL 1 -cglobal pred4x4_vertical_left_10_%1, 3,3 +%macro PRED4x4_VL 0 +cglobal pred4x4_vertical_left_10, 3, 3 subr0, r2 movu m1, [r0] movhps m1, [r1] @@ -265,18 +268,18 @@ cglobal pred4x4_vertical_left_10_%1, 3,3 RET %endmacro -INIT_XMM -PRED4x4_VL sse2 +INIT_XMM sse2 +PRED4x4_VL %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_VL avx +INIT_XMM avx +PRED4x4_VL %endif ;- ; void pred4x4_horizontal_up(pixel *src, const pixel *topright, int stride) ;
Re: [libav-devel] [PATCH 2/2] Use ptrdiff_t instead of int for intra pred "stride" function parameter.
Hi, On Mon, Oct 29, 2012 at 7:53 AM, Diego Biurrun wrote: > On Sun, Oct 28, 2012 at 08:44:54PM -0700, Ronald S. Bultje wrote: >> From: "Ronald S. Bultje" >> >> --- >> libavcodec/arm/h264pred_init_arm.c | 36 +++--- >> libavcodec/h264pred.c| 79 >> libavcodec/h264pred.h| 18 +-- >> libavcodec/h264pred_template.c | 233 >> +-- >> libavcodec/x86/h264_intrapred_init.c | 15 ++- >> 5 files changed, 264 insertions(+), 117 deletions(-) > > Why? So you don't have to sign-extend integers in simd functions (or suffer undefined behaviour if you forget). I've done this before for several other simd functions, use "git log". Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH 2/2] Use ptrdiff_t instead of int for intra pred "stride" function parameter.
From: "Ronald S. Bultje" --- libavcodec/arm/h264pred_init_arm.c | 36 +++--- libavcodec/h264pred.c| 79 libavcodec/h264pred.h| 18 +-- libavcodec/h264pred_template.c | 233 +-- libavcodec/x86/h264_intrapred_init.c | 15 ++- 5 files changed, 264 insertions(+), 117 deletions(-) diff --git a/libavcodec/arm/h264pred_init_arm.c b/libavcodec/arm/h264pred_init_arm.c index 371b1a6..39c0121 100644 --- a/libavcodec/arm/h264pred_init_arm.c +++ b/libavcodec/arm/h264pred_init_arm.c @@ -23,25 +23,25 @@ #include "libavutil/arm/cpu.h" #include "libavcodec/h264pred.h" -void ff_pred16x16_vert_neon(uint8_t *src, int stride); -void ff_pred16x16_hor_neon(uint8_t *src, int stride); -void ff_pred16x16_plane_neon(uint8_t *src, int stride); -void ff_pred16x16_dc_neon(uint8_t *src, int stride); -void ff_pred16x16_128_dc_neon(uint8_t *src, int stride); -void ff_pred16x16_left_dc_neon(uint8_t *src, int stride); -void ff_pred16x16_top_dc_neon(uint8_t *src, int stride); +void ff_pred16x16_vert_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred16x16_hor_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred16x16_plane_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred16x16_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred16x16_128_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred16x16_left_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred16x16_top_dc_neon(uint8_t *src, ptrdiff_t stride); -void ff_pred8x8_vert_neon(uint8_t *src, int stride); -void ff_pred8x8_hor_neon(uint8_t *src, int stride); -void ff_pred8x8_plane_neon(uint8_t *src, int stride); -void ff_pred8x8_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_128_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_left_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_top_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_l0t_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_0lt_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_l00_dc_neon(uint8_t *src, int stride); -void ff_pred8x8_0l0_dc_neon(uint8_t *src, int stride); +void ff_pred8x8_vert_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_hor_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_plane_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_128_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_left_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_top_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_l0t_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_0lt_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_l00_dc_neon(uint8_t *src, ptrdiff_t stride); +void ff_pred8x8_0l0_dc_neon(uint8_t *src, ptrdiff_t stride); static void ff_h264_pred_init_neon(H264PredContext *h, int codec_id, const int bit_depth, const int chroma_format_idc) { diff --git a/libavcodec/h264pred.c b/libavcodec/h264pred.c index fb44046..94cf9d0 100644 --- a/libavcodec/h264pred.c +++ b/libavcodec/h264pred.c @@ -39,7 +39,9 @@ #include "h264pred_template.c" #undef BIT_DEPTH -static void pred4x4_vertical_vp8_c(uint8_t *src, const uint8_t *topright, int stride){ +static void pred4x4_vertical_vp8_c(uint8_t *src, const uint8_t *topright, + ptrdiff_t stride) +{ const unsigned lt = src[-1-1*stride]; LOAD_TOP_EDGE LOAD_TOP_RIGHT_EDGE @@ -54,7 +56,9 @@ static void pred4x4_vertical_vp8_c(uint8_t *src, const uint8_t *topright, int st AV_WN32A(src+3*stride, v); } -static void pred4x4_horizontal_vp8_c(uint8_t *src, const uint8_t *topright, int stride){ +static void pred4x4_horizontal_vp8_c(uint8_t *src, const uint8_t *topright, + ptrdiff_t stride) +{ const unsigned lt = src[-1-1*stride]; LOAD_LEFT_EDGE @@ -64,7 +68,9 @@ static void pred4x4_horizontal_vp8_c(uint8_t *src, const uint8_t *topright, int AV_WN32A(src+3*stride, ((l2 + 2*l3 + l3 + 2) >> 2)*0x01010101); } -static void pred4x4_down_left_svq3_c(uint8_t *src, const uint8_t *topright, int stride){ +static void pred4x4_down_left_svq3_c(uint8_t *src, const uint8_t *topright, + ptrdiff_t stride) +{ LOAD_TOP_EDGE LOAD_LEFT_EDGE @@ -86,7 +92,9 @@ static void pred4x4_down_left_svq3_c(uint8_t *src, const uint8_t *topright, int src[3+3*stride]=(l3 + t3)>>1; } -static void pred4x4_down_left_rv40_c(uint8_t *src, const uint8_t *topright, int stride){ +static void pred4x4_down_left_rv40_c(uint8_t *src, const uint8_t *topright, + ptrdiff_t stride) +{ LOAD_TOP_EDGE LOAD_TOP_RIGHT_EDGE LOAD_LEFT_EDGE @@ -110,7 +118,10 @@ static void pred4x4_down_left_rv40_c(uint8_t *src, const uint8_t *topright, int src[3+3*stride]=(t6 + t7 + 1 + l6 + l7 + 1)>>2; } -static void pred4x4_down_left_rv40_nodown_c(uint8_t
[libav-devel] [PATCH 1/2] Use PRED4x4/8x8/8x8L/16x16 macros to declare x86 intrapred prototypes.
From: "Ronald S. Bultje" --- libavcodec/x86/h264_intrapred.asm| 80 +- libavcodec/x86/h264_intrapred_init.c | 296 ++- 2 files changed, 190 insertions(+), 186 deletions(-) diff --git a/libavcodec/x86/h264_intrapred.asm b/libavcodec/x86/h264_intrapred.asm index dc418f7..7c6aa12 100644 --- a/libavcodec/x86/h264_intrapred.asm +++ b/libavcodec/x86/h264_intrapred.asm @@ -53,7 +53,7 @@ cextern pw_32 ; void pred16x16_vertical(uint8_t *src, int stride) ;- -cglobal pred16x16_vertical_mmx, 2,3 +cglobal pred16x16_vertical_8_mmx, 2,3 sub r0, r1 mov r2, 8 movq mm0, [r0+0] @@ -68,7 +68,7 @@ cglobal pred16x16_vertical_mmx, 2,3 jg .loop REP_RET -cglobal pred16x16_vertical_sse, 2,3 +cglobal pred16x16_vertical_8_sse, 2,3 sub r0, r1 mov r2, 4 movaps xmm0, [r0] @@ -88,7 +88,7 @@ cglobal pred16x16_vertical_sse, 2,3 ;- %macro PRED16x16_H 0 -cglobal pred16x16_horizontal, 2,3 +cglobal pred16x16_horizontal_8, 2,3 mov r2, 8 %if cpuflag(ssse3) mova m2, [pb_3] @@ -130,7 +130,7 @@ INIT_XMM ;- %macro PRED16x16_DC 0 -cglobal pred16x16_dc, 2,7 +cglobal pred16x16_dc_8, 2,7 mov r4, r0 sub r0, r1 pxor mm0, mm0 @@ -193,7 +193,7 @@ INIT_XMM ;- %macro PRED16x16_TM_MMX 0 -cglobal pred16x16_tm_vp8, 2,5 +cglobal pred16x16_tm_vp8_8, 2,5 subr0, r1 pxor mm7, mm7 movq mm0, [r0+0] @@ -234,7 +234,7 @@ INIT_MMX mmx2 PRED16x16_TM_MMX INIT_MMX -cglobal pred16x16_tm_vp8_sse2, 2,6,6 +cglobal pred16x16_tm_vp8_8_sse2, 2,6,6 sub r0, r1 pxor xmm2, xmm2 movdqa xmm0, [r0] @@ -274,7 +274,7 @@ cglobal pred16x16_tm_vp8_sse2, 2,6,6 ;- %macro H264_PRED16x16_PLANE 1 -cglobal pred16x16_plane_%1, 2,9,7 +cglobal pred16x16_plane_%1_8, 2,9,7 mov r2, r1 ; +stride neg r1 ; -stride @@ -556,7 +556,7 @@ INIT_XMM ;- %macro H264_PRED8x8_PLANE 0 -cglobal pred8x8_plane, 2,9,7 +cglobal pred8x8_plane_8, 2,9,7 mov r2, r1 ; +stride neg r1 ; -stride @@ -730,7 +730,7 @@ INIT_XMM ; void pred8x8_vertical(uint8_t *src, int stride) ;- -cglobal pred8x8_vertical_mmx, 2,2 +cglobal pred8x8_vertical_8_mmx, 2,2 subr0, r1 movq mm0, [r0] %rep 3 @@ -747,7 +747,7 @@ cglobal pred8x8_vertical_mmx, 2,2 ;- %macro PRED8x8_H 0 -cglobal pred8x8_horizontal, 2,3 +cglobal pred8x8_horizontal_8, 2,3 mov r2, 4 %if cpuflag(ssse3) mova m2, [pb_3] @@ -774,7 +774,7 @@ INIT_MMX ;- ; void pred8x8_top_dc_mmxext(uint8_t *src, int stride) ;- -cglobal pred8x8_top_dc_mmxext, 2,5 +cglobal pred8x8_top_dc_8_mmxext, 2,5 sub r0, r1 movq mm0, [r0] pxor mm1, mm1 @@ -809,7 +809,7 @@ cglobal pred8x8_top_dc_mmxext, 2,5 ;- INIT_MMX -cglobal pred8x8_dc_mmxext, 2,5 +cglobal pred8x8_dc_8_mmxext, 2,5 sub r0, r1 pxor m7, m7 movd m0, [r0+0] @@ -869,7 +869,7 @@ cglobal pred8x8_dc_mmxext, 2,5 ; void pred8x8_dc_rv40(uint8_t *src, int stride) ;- -cglobal pred8x8_dc_rv40_mmxext, 2,7 +cglobal pred8x8_dc_rv40_8_mmxext, 2,7 mov r4, r0 sub r0, r1 pxor mm0, mm0 @@ -906,7 +906,7 @@ cglobal pred8x8_dc_rv40_mmxext, 2,7 ;- %macro PRED8x8_TM_MMX 0 -cglobal pred8x8_tm_vp8, 2,6 +cglobal pred8x8_tm_vp8_8, 2,6 subr0, r1 pxor mm7, mm7 movq mm0, [r0] @@ -946,7 +946,7 @@ INIT_MMX mmx2 PRED8x8_TM_MMX INIT_MMX -cglobal pred8x8_tm_vp8_sse2, 2,6,4 +cglobal pred8x8_tm_vp8_8_sse2, 2,6,4 sub r0, r1 pxor xmm1, xmm1 movq xmm0, [r0] @@ -974,7 +974,7 @@ cglobal pred8x8_tm_vp8_sse2, 2,6,4 jg .loop REP_RET -cglobal pred8x8_tm_vp8_ssse3, 2,3,6 +cglobal pred8x8_tm_vp8_8_ssse3, 2,3,6 sub r0, r1 movdqa xmm4, [tm_shuf] pxor xmm1, xmm1 @@ -1016,7 +1016,7 @@ cglobal pred8x8_tm
[libav-devel] [PATCH 2/2] Remove INIT_AVX from x86inc.asm.
From: "Ronald S. Bultje" --- libavutil/x86/x86inc.asm | 8 1 file changed, 8 deletions(-) diff --git a/libavutil/x86/x86inc.asm b/libavutil/x86/x86inc.asm index d734c6e..1fe9f55 100644 --- a/libavutil/x86/x86inc.asm +++ b/libavutil/x86/x86inc.asm @@ -678,14 +678,6 @@ SECTION .note.GNU-stack noalloc noexec nowrite progbits INIT_CPUFLAGS %1 %endmacro -; FIXME: INIT_AVX can be replaced by INIT_XMM avx -%macro INIT_AVX 0 -INIT_XMM -%assign avx_enabled 1 -%define PALIGNR PALIGNR_SSSE3 -%define RESET_MM_PERMUTATION INIT_AVX -%endmacro - %macro INIT_YMM 0-1+ %assign avx_enabled 1 %define RESET_MM_PERMUTATION INIT_YMM %1 -- 1.7.11.3 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH 1/2] Remove usage of INIT_AVX in h264_intrapred_10bit.asm.
From: "Ronald S. Bultje" --- libavcodec/x86/h264_intrapred_10bit.asm | 306 libavcodec/x86/h264_intrapred_init.c| 40 ++--- 2 files changed, 177 insertions(+), 169 deletions(-) diff --git a/libavcodec/x86/h264_intrapred_10bit.asm b/libavcodec/x86/h264_intrapred_10bit.asm index 529134e..c3f6dc4 100644 --- a/libavcodec/x86/h264_intrapred_10bit.asm +++ b/libavcodec/x86/h264_intrapred_10bit.asm @@ -53,8 +53,8 @@ SECTION .text ;- ; void pred4x4_down_right(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_DR 1 -cglobal pred4x4_down_right_10_%1, 3,3 +%macro PRED4x4_DR 0 +cglobal pred4x4_down_right_10, 3, 3 sub r0, r2 lea r1, [r0+r2*2] movhpsm1, [r1-8] @@ -79,21 +79,22 @@ cglobal pred4x4_down_right_10_%1, 3,3 RET %endmacro -INIT_XMM +INIT_XMM sse2 %define PALIGNR PALIGNR_MMX -PRED4x4_DR sse2 +PRED4x4_DR +INIT_XMM ssse3 %define PALIGNR PALIGNR_SSSE3 -PRED4x4_DR ssse3 +PRED4x4_DR %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_DR avx +INIT_XMM avx +PRED4x4_DR %endif ;- ; void pred4x4_vertical_right(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_VR 1 -cglobal pred4x4_vertical_right_10_%1, 3,3,6 +%macro PRED4x4_VR 0 +cglobal pred4x4_vertical_right_10, 3, 3, 6 sub r0, r2 lea r1, [r0+r2*2] movqm5, [r0]; t3t2t1t0 @@ -119,21 +120,22 @@ cglobal pred4x4_vertical_right_10_%1, 3,3,6 RET %endmacro -INIT_XMM +INIT_XMM sse2 %define PALIGNR PALIGNR_MMX -PRED4x4_VR sse2 +PRED4x4_VR +INIT_XMM ssse3 %define PALIGNR PALIGNR_SSSE3 -PRED4x4_VR ssse3 +PRED4x4_VR %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_VR avx +INIT_XMM avx +PRED4x4_VR %endif ;- ; void pred4x4_horizontal_down(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_HD 1 -cglobal pred4x4_horizontal_down_10_%1, 3,3 +%macro PRED4x4_HD 0 +cglobal pred4x4_horizontal_down_10, 3, 3 subr0, r2 lear1, [r0+r2*2] movq m0, [r0-8] ; lt .. @@ -162,14 +164,15 @@ cglobal pred4x4_horizontal_down_10_%1, 3,3 RET %endmacro -INIT_XMM +INIT_XMM sse2 %define PALIGNR PALIGNR_MMX -PRED4x4_HD sse2 +PRED4x4_HD +INIT_XMM ssse3 %define PALIGNR PALIGNR_SSSE3 -PRED4x4_HD ssse3 +PRED4x4_HD %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_HD avx +INIT_XMM avx +PRED4x4_HD %endif ;- @@ -192,8 +195,8 @@ PRED4x4_HD avx HADDD %1, %2 %endmacro -INIT_MMX -cglobal pred4x4_dc_10_mmxext, 3,3 +INIT_MMX mmx2 +cglobal pred4x4_dc_10, 3, 3 subr0, r2 lear1, [r0+r2*2] movq m2, [r0+r2*1-8] @@ -216,8 +219,8 @@ cglobal pred4x4_dc_10_mmxext, 3,3 ;- ; void pred4x4_down_left(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_DL 1 -cglobal pred4x4_down_left_10_%1, 3,3 +%macro PRED4x4_DL 0 +cglobal pred4x4_down_left_10, 3, 3 subr0, r2 movq m0, [r0] movhps m0, [r1] @@ -236,18 +239,18 @@ cglobal pred4x4_down_left_10_%1, 3,3 RET %endmacro -INIT_XMM -PRED4x4_DL sse2 +INIT_XMM sse2 +PRED4x4_DL %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_DL avx +INIT_XMM avx +PRED4x4_DL %endif ;- ; void pred4x4_vertical_left(pixel *src, const pixel *topright, int stride) ;- -%macro PRED4x4_VL 1 -cglobal pred4x4_vertical_left_10_%1, 3,3 +%macro PRED4x4_VL 0 +cglobal pred4x4_vertical_left_10, 3, 3 subr0, r2 movu m1, [r0] movhps m1, [r1] @@ -265,18 +268,18 @@ cglobal pred4x4_vertical_left_10_%1, 3,3 RET %endmacro -INIT_XMM -PRED4x4_VL sse2 +INIT_XMM sse2 +PRED4x4_VL %if HAVE_AVX_EXTERNAL -INIT_AVX -PRED4x4_VL avx +INIT_XMM avx +PRED4x4_VL %endif ;- ; void pred4x4_horizontal_up(pixel *src, const pixel *topright, int stride) ;- -INIT_MMX -cglobal pred4x4_horizontal_up_10_mmxext, 3,3 +INIT_MMX mmx2 +cglobal pred4x4_horizontal_up_10, 3, 3 sub r0, r2 lea r1, [r0+r2*2] movq m0, [r0+r2*1-8] @@ -309,8 +312,8 @@ cglobal pred4x4_horizontal_up_10_m
Re: [libav-devel] [PATCH 11/11] vp8: fix memset() crossing array boundary
Hi, On Fri, Oct 26, 2012 at 5:50 PM, Mans Rullgard wrote: > Indexing across array boundaries is not allowed by C99. > > Signed-off-by: Mans Rullgard > --- > libavcodec/vp8.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/libavcodec/vp8.c b/libavcodec/vp8.c > index 2766c9e..d9902c4 100644 > --- a/libavcodec/vp8.c > +++ b/libavcodec/vp8.c > @@ -1964,7 +1964,8 @@ static int vp8_decode_frame(AVCodecContext *avctx, void > *data, int *data_size, > // top edge of 127 for intra prediction > if (!(avctx->flags & CODEC_FLAG_EMU_EDGE)) { > s->top_border[0][15] = s->top_border[0][23] = 127; > -memset(s->top_border[1]-1, 127, > s->mb_width*sizeof(*s->top_border)+1); > +s->top_border[0][31] = 127; > +memset(s->top_border[1], 127, s->mb_width*sizeof(*s->top_border)); > } OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/2] swscale: support gray to 9bit and 10bit formats
Hi, On Fri, Oct 26, 2012 at 8:35 AM, Luca Barbato wrote: > Signed-off-by: Luca Barbato > --- > > Partially fixes the issue, there is still something wrong somewhere, possibly > a buffer overrun or yet some memory not properly set. How do we reproduce? What kind of msgs do you get in valgrind? Do 32/64bit outputs match? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] swscale: support gray to 9bit and 10bit formats
Hi, On Sun, Oct 14, 2012 at 5:22 PM, Luca Barbato wrote: > +if (isPlanar(dstFormat) && isALPHA(dstFormat) && !alpPixBuf) { > +int length = dstW; > +int height = dstY - lastDstY; > +if (is9_OR_10BPS(dstFormat)) { > +const AVPixFmtDescriptor *desc = av_pix_fmt_desc_get(dstFormat); > +fill_plane9or10(dst[3], dstStride[3], length, height, lastDstY, > +255, desc->comp[3].depth_minus1 + 1, > +isBE(dstFormat)); > +} else if (is16BPS(c->dstFormat)) > + length *= 2; > + > +fillPlane(dst[3], dstStride[3], length, height, lastDstY, 255); Doesn't this call fillPlane, even if we're 9 or 10 bits? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
Hi, On Wed, Oct 24, 2012 at 10:07 AM, MĂĄns RullgĂĄrd wrote: > "Ronald S. Bultje" writes: > >> From: "Ronald S. Bultje" >> >> Use this in VP8/H264-8bit loopfilter functions so they can be used if >> there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). >> --- >> libavcodec/x86/h264_deblock.asm | 27 ++- >> libavcodec/x86/h264dsp_init.c | 4 +- >> libavcodec/x86/vp8dsp.asm | 68 >> libavcodec/x86/vp8dsp_init.c| 8 -- >> libavutil/x86/x86inc.asm| 167 >> +--- >> 5 files changed, 181 insertions(+), 93 deletions(-) > > What happened to this? Is there something wrong with the patch? I am addressing reviews from the x264 people and am somewhat slow at testing new revisions because of other work... Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] Problem with H.264 and NALU chunks
Hi, On Fri, Oct 19, 2012 at 6:55 PM, Felipe Contreras wrote: > Hi, > > While testing an mpegts stream I found what appears to be a bug in the > H.264 decoder that only happens on certain processors. > > The stream is here[1], but I've split the specific chunks I get from > the demuxer. They are not complete access-units, but I've set the flag > CODEC_FLAG2_CHUNKS. On one laptop it works perfectly fine, on another > I get a bunch of corruption, errors, and valgrind complains that wrong > memory accesses. The first one is a Core i5, the other a Core i7. > > This happens both on v0.8.3 and the master branch. > > If I use GStreamer's H.264 parser to aggregate the NAL units into > access-units, it works fine. > > Any ideas? Aren't you supposed to use the internal h264 parser to do the chunking before feeding it into the decoder if you choose to not use the built-in demuxers? (The built-in ones will automatically call the parser for you if you call av_read_frame() instead of av_read_packet().) Or in other words, is this in any way reproducible with avconv/avplay? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 09/15] swscale: avoid pointless use of compound literals
On Sun, Oct 14, 2012 at 8:11 PM, Mans Rullgard wrote: > Some compilers (e.g. old gcc) have trouble with these. > > Signed-off-by: Mans Rullgard > --- > libswscale/swscale_unscaled.c | 17 + > 1 file changed, 9 insertions(+), 8 deletions(-) > > diff --git a/libswscale/swscale_unscaled.c b/libswscale/swscale_unscaled.c > index 70eff72..5efc647 100644 > --- a/libswscale/swscale_unscaled.c > +++ b/libswscale/swscale_unscaled.c > @@ -396,6 +396,11 @@ static int planarRgbToRgbWrapper(SwsContext *c, const > uint8_t *src[], > uint8_t *dst[], int dstStride[]) > { > int alpha_first = 0; > +const uint8_t *src102[] = { src[1], src[0], src[2] }; > +const uint8_t *src201[] = { src[2], src[0], src[1] }; > +int stride102[] = { srcStride[1], srcStride[0], srcStride[2] }; > +int stride201[] = { srcStride[2], srcStride[0], srcStride[1] }; > + > if (c->srcFormat != AV_PIX_FMT_GBRP) { > av_log(c, AV_LOG_ERROR, "unsupported planar RGB conversion %s -> > %s\n", > av_get_pix_fmt_name(c->srcFormat), > @@ -405,15 +410,13 @@ static int planarRgbToRgbWrapper(SwsContext *c, const > uint8_t *src[], > > switch (c->dstFormat) { > case AV_PIX_FMT_BGR24: > -gbr24ptopacked24((const uint8_t *[]) { src[1], src[0], src[2] }, > - (int []) { srcStride[1], srcStride[0], srcStride[2] > }, > +gbr24ptopacked24(src102, stride102, > dst[0] + srcSliceY * dstStride[0], dstStride[0], > srcSliceH, c->srcW); > break; I find it slightly hilarious that there is no way in hell this would have gotten in had the compiler's name been "gcc". Why don't you use c99-to-c89 for this so-called "compiler"? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] rmdec: Move code shared with Matroska demuxer to RealMedia common code
Hi, On Sun, Oct 14, 2012 at 3:02 PM, Diego Biurrun wrote: > This removes a dependency of the Matroska demuxer on the RealMedia > demuxer while only minimally bloating a RealMedia muxer only build. > --- > libavformat/Makefile |4 ++-- > libavformat/rm.c | 41 + > libavformat/rmdec.c | 41 - > 3 files changed, 43 insertions(+), 43 deletions(-) I'm a little confused, rm.c contains code that is to be shared between rmdec and rmenc. Shouldn't you add a new file (e.g. sipr.c) that contains sipr-specific code to be shared between different demuxers (e.g. rmdec and matroskadec)? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] pixfmt: support more yuva formats
Hi, On Sun, Oct 14, 2012 at 2:12 PM, Luca Barbato wrote: > On 10/14/2012 10:43 PM, Ronald S. Bultje wrote: >> Hi, >> >> On Sun, Oct 14, 2012 at 1:39 PM, Luca Barbato wrote: >>> On 10/14/2012 09:54 PM, Ronald S. Bultje wrote: >>>> Hi, >>>> >>>> On Sun, Oct 14, 2012 at 12:53 PM, Ronald S. Bultje >>>> wrote: >>>>> Hi, >>>>> >>>>> On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje >>>>> wrote: >>>>>> Hi, >>>>>> >>>>>> On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato >>>>>> wrote: >>>>>>> --- >>>>>>> >>>>>>> Here an initial patch to support many yuva, apparently either I botched >>>>>>> adding them (since I did lots of cut and paste) or they manage the >>>>>>> expose some >>>>>>> flaws in swscale. valgrind manages to spot something and might be nice >>>>>>> if >>>>>>> other people try this and see if the values for the fate-lavfi change >>>>>>> after >>>>>>> each round. >>>>>> >>>>>> If this is what I think it is, search for YUVA420P in swscale and >>>>>> replace it with a generic alpha check. >>>>> >>>>> Oh I see now. >>>>> >>>>> swscale.c:661 >>>>> if (isPlanar(dstFormat) && isALPHA(dstFormat) && !alpPixBuf) >>>>> fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, >>>>> 255); >>>>> This only works for 8bit. For 9bit, you need a word fill with a value >>>>> of (1 << numbits) - 1. >>>> >>>> Which actually leads to a question: does any of this exist in the real >>>> world? I've never, ever, ever seen an image with non-8bpc alpha >>>> planes. I hate to invent stuff that has no utility but does carry >>>> implementation burden. >>> >>> You asked me to add those ^^; >> >> Uhm, ok, maybe I meant something else? I mean, there really isn't much >> use for 9-15bpp alpha channels as far as I know. 8bpp is obvious and I >> can sort of see 16bpp being useful (although I don't think I've ever >> seen it used). >> >>> BTW while I was investigating I noticed >>> all the previous 9 and 10 formats look strange, if you have time we >>> could have a look. >> >> Can you elaborate? > > try this command > > ./avconv -v quiet -nostats -y -cpuflags all -i ./tests/vsynth1/%02d.pgm > -vcodec rawvideo -pix_fmt yuv420p10 -f nut - | ./avplay - > > You will see it blue tinted while it should be gray. > > (you might need to apply my yet to be reviewed patch to fix nut with > rawvideo). Well yes, can we first sort out this stuff without nut? I mean, I really don't have much reason to trust nut at this moment. When you test a typical conversion (convert that thing to nut without colorspace conversion, and then ./avplay -vf format=yuv420p10le), it works, so the colorspace conversion itself is correct. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] pixfmt: support more yuva formats
Hi, On Sun, Oct 14, 2012 at 1:39 PM, Luca Barbato wrote: > On 10/14/2012 09:54 PM, Ronald S. Bultje wrote: >> Hi, >> >> On Sun, Oct 14, 2012 at 12:53 PM, Ronald S. Bultje >> wrote: >>> Hi, >>> >>> On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje >>> wrote: >>>> Hi, >>>> >>>> On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato wrote: >>>>> --- >>>>> >>>>> Here an initial patch to support many yuva, apparently either I botched >>>>> adding them (since I did lots of cut and paste) or they manage the expose >>>>> some >>>>> flaws in swscale. valgrind manages to spot something and might be nice if >>>>> other people try this and see if the values for the fate-lavfi change >>>>> after >>>>> each round. >>>> >>>> If this is what I think it is, search for YUVA420P in swscale and >>>> replace it with a generic alpha check. >>> >>> Oh I see now. >>> >>> swscale.c:661 >>> if (isPlanar(dstFormat) && isALPHA(dstFormat) && !alpPixBuf) >>> fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, >>> 255); >>> This only works for 8bit. For 9bit, you need a word fill with a value >>> of (1 << numbits) - 1. >> >> Which actually leads to a question: does any of this exist in the real >> world? I've never, ever, ever seen an image with non-8bpc alpha >> planes. I hate to invent stuff that has no utility but does carry >> implementation burden. > > You asked me to add those ^^; Uhm, ok, maybe I meant something else? I mean, there really isn't much use for 9-15bpp alpha channels as far as I know. 8bpp is obvious and I can sort of see 16bpp being useful (although I don't think I've ever seen it used). > BTW while I was investigating I noticed > all the previous 9 and 10 formats look strange, if you have time we > could have a look. Can you elaborate? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] pixfmt: support more yuva formats
Hi, On Sun, Oct 14, 2012 at 12:53 PM, Ronald S. Bultje wrote: > Hi, > > On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje wrote: >> Hi, >> >> On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato wrote: >>> --- >>> >>> Here an initial patch to support many yuva, apparently either I botched >>> adding them (since I did lots of cut and paste) or they manage the expose >>> some >>> flaws in swscale. valgrind manages to spot something and might be nice if >>> other people try this and see if the values for the fate-lavfi change after >>> each round. >> >> If this is what I think it is, search for YUVA420P in swscale and >> replace it with a generic alpha check. > > Oh I see now. > > swscale.c:661 > if (isPlanar(dstFormat) && isALPHA(dstFormat) && !alpPixBuf) > fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 255); > This only works for 8bit. For 9bit, you need a word fill with a value > of (1 << numbits) - 1. Which actually leads to a question: does any of this exist in the real world? I've never, ever, ever seen an image with non-8bpc alpha planes. I hate to invent stuff that has no utility but does carry implementation burden. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] pixfmt: support more yuva formats
Hi, On Sat, Oct 13, 2012 at 11:13 PM, Ronald S. Bultje wrote: > Hi, > > On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato wrote: >> --- >> >> Here an initial patch to support many yuva, apparently either I botched >> adding them (since I did lots of cut and paste) or they manage the expose >> some >> flaws in swscale. valgrind manages to spot something and might be nice if >> other people try this and see if the values for the fate-lavfi change after >> each round. > > If this is what I think it is, search for YUVA420P in swscale and > replace it with a generic alpha check. Oh I see now. swscale.c:661 if (isPlanar(dstFormat) && isALPHA(dstFormat) && !alpPixBuf) fillPlane(dst[3], dstStride[3], dstW, dstY - lastDstY, lastDstY, 255); This only works for 8bit. For 9bit, you need a word fill with a value of (1 << numbits) - 1. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [RFC] pixfmt: support more yuva formats
Hi, On Sat, Oct 13, 2012 at 11:06 PM, Luca Barbato wrote: > --- > > Here an initial patch to support many yuva, apparently either I botched > adding them (since I did lots of cut and paste) or they manage the expose some > flaws in swscale. valgrind manages to spot something and might be nice if > other people try this and see if the values for the fate-lavfi change after > each round. If this is what I think it is, search for YUVA420P in swscale and replace it with a generic alpha check. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/2] H.264: Convert 8-bit qpel inlined assembly to yasm
Hi, On Sat, Oct 13, 2012 at 9:05 AM, Diego Biurrun wrote: > On Sat, Oct 13, 2012 at 10:04:50AM -0500, Daniel Kang wrote: >> +%macro op_avgh 3 >> +movh %3, %2 >> +pavgb %1, %3 >> +movh %2, %1 >> +%endmacro >> + >> +%macro op_avg 3 >> +pavgb %1, %2 >> +mova %2, %1 >> +%endmacro >> + >> +%macro op_puth 3 >> +movh %2, %1 >> +%endmacro >> + >> +%macro op_put 3 >> +mova %2, %1 >> +%endmacro > > All of these macros only take 2 parameters. op_avgh takes 3. >> +.loop: >> +movh m1, [r1-1] >> +movh m2, [r1+0] >> +movh m3, [r1+1] >> +movh m0, [r1+2] > > Spaces around operators would help readability IMO; more below... No spaces around operators in assembly dereferences please. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] Fix for FFMPEG vp8dec loopfilter delta values reset at keyframes
Hi, On Fri, Oct 12, 2012 at 7:18 AM, Luca Barbato wrote: > On 10/12/2012 04:15 PM, Ronald S. Bultje wrote: >>> If you can point me where in the specification it is stated I'd like to >>> point it in the commit message. >> >> It's only natural for all codec state to be reset in a keyframe. It is >> supposed to be a re-entry point where all values are reset to >> defaults. >> >> The patch itself is LGTM, I was about to push it myself just now. > > Please do =) Pushed. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] Fix for FFMPEG vp8dec loopfilter delta values reset at keyframes
Hi, On Fri, Oct 12, 2012 at 7:12 AM, Luca Barbato wrote: > On 10/12/2012 12:42 PM, Sami Pietilä wrote: >> Hi all, >> >> while testing a VP8 encoder I found a mismatch between FFMPEG and libvpx >> VP8 decoders. The reason for this mismatch is that FFMPEG doesn't reset >> loopfilter delta values at keyframes. Patch that fixes the issue is below. >> I've verified that the output of ffmpeg after this patch matches libvpx. If >> you need I can send you a stream that shows the issue. > > Would be nice to have a small vector to add it to fate, indeed. > > The patch itself looks like that this specific detail hadn't been > considered, while implementing it. > > If you can point me where in the specification it is stated I'd like to > point it in the commit message. It's only natural for all codec state to be reset in a keyframe. It is supposed to be a re-entry point where all values are reset to defaults. The patch itself is LGTM, I was about to push it myself just now. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/5] vc1dec: Invoke edge emulation regardless of MV precision for 1-MV chroma
Hi, On Tue, Oct 9, 2012 at 4:56 PM, Mashiat Sarker Shakkhar wrote: > On 10/9/2012 7:37 PM, Ronald S. Bultje wrote: >> >> Hi, >> >> On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar >> wrote: >>> >>> This is required due to the way VC-1 handles chroma pull-back which may >>> end >>> up causing negative chroma MV for zero luma MV. Edge emulation needs to >>> be >>> invoked in such cases. >>> >>> This problem only affects chroma Y motion vector. >>> --- >>> libavcodec/vc1dec.c |2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/libavcodec/vc1dec.c b/libavcodec/vc1dec.c >>> index 42eb4a5..2683c86 100644 >>> --- a/libavcodec/vc1dec.c >>> +++ b/libavcodec/vc1dec.c >>> @@ -430,7 +430,7 @@ static void vc1_mc_1mv(VC1Context *v, int dir) >>> if (v->rangeredfrm || (v->mv_mode == MV_PMODE_INTENSITY_COMP) >>> || s->h_edge_pos < 22 || v_edge_pos < 22 >>> || (unsigned)(src_x - s->mspel) > s->h_edge_pos - (mx&3) - 16 - >>> s->mspel * 3 >>> -|| (unsigned)(src_y - s->mspel) > v_edge_pos- (my&3) - 16 - >>> s->mspel * 3) { >>> +|| (unsigned)(src_y - 1)> v_edge_pos- (my&3) - 16 - >>> 3) { >> >> >> I'm slightly curious why this is only the case for the vertical >> dimension, but not the horizontal. Is that intentional, or an >> oversight? > > > This thread contains my earlier discussion on a rejected patch which tried > to fix the same bug: > > http://patches.libav.org/patch/26240/ OK so it's not because of pullback per se, but because of field adjustment, then it makes sense. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/5] vc1dec: Set chroma reference field from REFFIELD for 1REF field pictures
Hi, On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar wrote: > Interlaced field pictures can have one or two reference pictures, signaled > by NUMREF syntax element. For single reference pictures, reference picture > is determined by REFFIELD syntax element. > --- > libavcodec/vc1dec.c |1 + > 1 file changed, 1 insertion(+) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 4/5] vc1dec: Set opposite to the correct value for 1REF field pictures
Hi, On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar wrote: > --- > libavcodec/vc1dec.c | 14 ++ > 1 file changed, 10 insertions(+), 4 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 5/5] Double motion vector range for HPEL interlaced picture in proper place
Hi, On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar wrote: > The existing code is not in the right place and it should cover both > interlaced frame and field pictures. > --- > libavcodec/vc1.c|5 + > libavcodec/vc1dec.c |4 > 2 files changed, 5 insertions(+), 4 deletions(-) OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/5] vc1dec: Invoke edge emulation regardless of MV precision for 1-MV chroma
Hi, On Tue, Oct 9, 2012 at 1:38 PM, Mashiat Sarker Shakkhar wrote: > This is required due to the way VC-1 handles chroma pull-back which may end > up causing negative chroma MV for zero luma MV. Edge emulation needs to be > invoked in such cases. > > This problem only affects chroma Y motion vector. > --- > libavcodec/vc1dec.c |2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/libavcodec/vc1dec.c b/libavcodec/vc1dec.c > index 42eb4a5..2683c86 100644 > --- a/libavcodec/vc1dec.c > +++ b/libavcodec/vc1dec.c > @@ -430,7 +430,7 @@ static void vc1_mc_1mv(VC1Context *v, int dir) > if (v->rangeredfrm || (v->mv_mode == MV_PMODE_INTENSITY_COMP) > || s->h_edge_pos < 22 || v_edge_pos < 22 > || (unsigned)(src_x - s->mspel) > s->h_edge_pos - (mx&3) - 16 - > s->mspel * 3 > -|| (unsigned)(src_y - s->mspel) > v_edge_pos- (my&3) - 16 - > s->mspel * 3) { > +|| (unsigned)(src_y - 1)> v_edge_pos- (my&3) - 16 - 3) { I'm slightly curious why this is only the case for the vertical dimension, but not the horizontal. Is that intentional, or an oversight? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] Support for MSVC.
Hi, On Tue, Oct 9, 2012 at 10:33 AM, aviad rozenhek wrote: > > On Thu, Jun 14, 2012 at 1:59 AM, Ronald S. Bultje > wrote: >> >> Hi, >> >> see attached. It's kinda big, some parts can/should be done >> differently, but it's a starting point so let's start talking about >> how to get this in the right way. >> >> Some general comments: >> - only tested on 32bit so far >> - no it won't compile on non-MSVC with my patch (I had to break the >> Makefile in a couple of places to get linking to work) >> - the include of stdlib.h everywhere is because of [1]. We may need to >> add av_restrict instead. >> - the include of mathematics.h, avconfig.h and avstring.h everywhere >> is (I think) legit, to account for snprintf(), M_PI or inline. >> - the added assembly in fmtconvert.asm helps to get rid of a VLA in >> fmtconvert_mmx.c. >> - test.c converts c99 to c89 which MSVC can compile. Don't look at it >> unless you want to see true pain, it's intended to be shipped outside, >> like gas-preprocessor.pl. (unit.c and unit2.c are unittests for it.) >> - the renames of libavutil/x86/cpu.c, libavcodec/x86/mlpdsp.c and >> libswscale/x86/rgb2rgb.c are to prevent duplicate filenames in the >> same library, which makes the MSVC linker crap out in debug mode >> (works fine in release mode). >> - it mostly passes fate, except zlib-dependent tests (because I didn't >> try including zlib yet) and wtv (which uses gmtime(), which is broken >> on Windows since it only supports years up to 3000, and our test file >> uses 10900.) All other tests pass. >> >> Ronald >> >> [1] >> http://social.msdn.microsoft.com/Forums/en-US/vcgeneral/thread/bf3c0191-c15d-4e73-a111-3796608900c7 > > > can you provide prebuilt binaries for c99-to-c89.exe for windows? it will > make using it that much easier https://gerrit.chromium.org/gerrit/#/c/34728/ You'll need c99wrap.exe also, but that one is trivial to build yourself, since it has no dependencies. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/9] pixfmt: add AV_ prefixes to PIX_FMT_*
Hi, On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov wrote: > --- > doc/APIchanges |8 ++ > libavutil/Makefile |1 + > libavutil/old_pix_fmts.h | 128 > libavutil/pixfmt.h | 298 > ++ > libavutil/version.h |7 +- > 5 files changed, 311 insertions(+), 131 deletions(-) > create mode 100644 libavutil/old_pix_fmts.h \o/ OK. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.
Hi, On Sat, Oct 6, 2012 at 9:47 PM, Anton Khirnov wrote: > > On Sat, 6 Oct 2012 21:44:03 -0700, "Ronald S. Bultje" > wrote: >> Hi, >> >> On Sat, Oct 6, 2012 at 9:19 PM, Anton Khirnov wrote: >> > >> > On Sat, 6 Oct 2012 17:01:57 -0700, "Ronald S. Bultje" >> > wrote: >> >> Hi, >> >> >> >> On Sat, Oct 6, 2012 at 1:28 PM, Anton Khirnov wrote: >> >> > On Sat, 6 Oct 2012 12:35:44 -0700, "Ronald S. Bultje" >> >> > wrote: >> >> >> On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov >> >> >> wrote: >> >> >> > --- >> >> >> > avprobe.c |6 +++--- >> >> >> > cmdutils.c| 16 +++- >> >> >> > tools/graph2dot.c |4 ++-- >> >> >> > 3 files changed, 16 insertions(+), 10 deletions(-) >> >> >> > >> >> >> > diff --git a/avprobe.c b/avprobe.c >> >> >> > index 16a5d29..3a3ae0f 100644 >> >> >> > --- a/avprobe.c >> >> >> > +++ b/avprobe.c >> >> >> > @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext >> >> >> > *fmt_ctx, int stream_idx) >> >> >> > const char *profile; >> >> >> > char val_str[128]; >> >> >> > AVRational display_aspect_ratio; >> >> >> > +const AVPixFmtDescriptor *desc; >> >> >> > >> >> >> > probe_object_header("stream"); >> >> >> > >> >> >> > @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext >> >> >> > *fmt_ctx, int stream_idx) >> >> >> >rational_string(val_str, sizeof(val_str), >> >> >> > ":", >> >> >> >&display_aspect_ratio)); >> >> >> > } >> >> >> > -probe_str("pix_fmt", >> >> >> > - dec_ctx->pix_fmt != AV_PIX_FMT_NONE ? >> >> >> > - av_pix_fmt_descriptors[dec_ctx->pix_fmt].name >> >> >> > : "unknown"); >> >> >> > +desc = av_pix_fmt_desc_get(dec_ctx->pix_fmt); >> >> >> >> >> >> I think this is an awful idea, it should be OK to access the array >> >> >> directly. >> >> >> >> >> > >> >> > I don't see any advantage to accessing the array directly. >> >> >> >> It often leads to smaller code. Instead of AVPixFmtDesc *d = >> >> getter(pix_fmt); if (!d) return -1; printf("%d\n", d->something);, you >> >> can just do printf("%d\n", array[pix_fmt].something);. Do we really >> >> need to babysit our users? >> > >> > Checking for d to be non-NULL is equivalent to checking that pix_fmt is >> > valid. So no, it does not necessarily makes the code longer. >> >> >> >> > It is the only case in all Libav where we export an array like this. >> >> >> >> We never supported MSVC. So we shouldn't? >> >> >> > >> > YES! =p >> > >> > Seriously though, there are only disadvantages to exporting the array >> > directly. >> > >> >> > It fixes the size of AVPixFmtDescriptor for no good reason. >> >> >> >> Is there a need to extend it? >> >> >> > >> > There might be. You don't want to find out that you need to extend it >> > two months after somebody else just bumped lavu. >> > >> >> > It prevents the caller from accessing all the descriptors when he's >> >> > using a newer library than the one he compiled against. >> >> >> >> if (compile_ver > runtime_ver) error(); on app init. Likely, you want >> >> to do that anyway. >> >> >> >> > And I hear there's some trouble with the evil microsoft thing and data >> >> > symbols. >> >> >> >> You have that anyway for any symbol shared between lavu/f/fi/c/r/*. >> > >> > It should be our long-term goal to remove those. >> >> You can't. Stuff like ff_sqrt[], ff_log2[] etc. can't be removed. Never. >> Sorry. >> > > The same way we can never have msvc support? Right. Shall we move on to more practical patches? Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.
Hi, On Sat, Oct 6, 2012 at 9:19 PM, Anton Khirnov wrote: > > On Sat, 6 Oct 2012 17:01:57 -0700, "Ronald S. Bultje" > wrote: >> Hi, >> >> On Sat, Oct 6, 2012 at 1:28 PM, Anton Khirnov wrote: >> > On Sat, 6 Oct 2012 12:35:44 -0700, "Ronald S. Bultje" >> > wrote: >> >> On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov wrote: >> >> > --- >> >> > avprobe.c |6 +++--- >> >> > cmdutils.c| 16 +++- >> >> > tools/graph2dot.c |4 ++-- >> >> > 3 files changed, 16 insertions(+), 10 deletions(-) >> >> > >> >> > diff --git a/avprobe.c b/avprobe.c >> >> > index 16a5d29..3a3ae0f 100644 >> >> > --- a/avprobe.c >> >> > +++ b/avprobe.c >> >> > @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, >> >> > int stream_idx) >> >> > const char *profile; >> >> > char val_str[128]; >> >> > AVRational display_aspect_ratio; >> >> > +const AVPixFmtDescriptor *desc; >> >> > >> >> > probe_object_header("stream"); >> >> > >> >> > @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, >> >> > int stream_idx) >> >> >rational_string(val_str, sizeof(val_str), >> >> > ":", >> >> >&display_aspect_ratio)); >> >> > } >> >> > -probe_str("pix_fmt", >> >> > - dec_ctx->pix_fmt != AV_PIX_FMT_NONE ? >> >> > - av_pix_fmt_descriptors[dec_ctx->pix_fmt].name : >> >> > "unknown"); >> >> > +desc = av_pix_fmt_desc_get(dec_ctx->pix_fmt); >> >> >> >> I think this is an awful idea, it should be OK to access the array >> >> directly. >> >> >> > >> > I don't see any advantage to accessing the array directly. >> >> It often leads to smaller code. Instead of AVPixFmtDesc *d = >> getter(pix_fmt); if (!d) return -1; printf("%d\n", d->something);, you >> can just do printf("%d\n", array[pix_fmt].something);. Do we really >> need to babysit our users? > > Checking for d to be non-NULL is equivalent to checking that pix_fmt is > valid. So no, it does not necessarily makes the code longer. >> >> > It is the only case in all Libav where we export an array like this. >> >> We never supported MSVC. So we shouldn't? >> > > YES! =p > > Seriously though, there are only disadvantages to exporting the array > directly. > >> > It fixes the size of AVPixFmtDescriptor for no good reason. >> >> Is there a need to extend it? >> > > There might be. You don't want to find out that you need to extend it > two months after somebody else just bumped lavu. > >> > It prevents the caller from accessing all the descriptors when he's >> > using a newer library than the one he compiled against. >> >> if (compile_ver > runtime_ver) error(); on app init. Likely, you want >> to do that anyway. >> >> > And I hear there's some trouble with the evil microsoft thing and data >> > symbols. >> >> You have that anyway for any symbol shared between lavu/f/fi/c/r/*. > > It should be our long-term goal to remove those. You can't. Stuff like ff_sqrt[], ff_log2[] etc. can't be removed. Never. Sorry. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.
Hi, On Sat, Oct 6, 2012 at 1:28 PM, Anton Khirnov wrote: > On Sat, 6 Oct 2012 12:35:44 -0700, "Ronald S. Bultje" > wrote: >> On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov wrote: >> > --- >> > avprobe.c |6 +++--- >> > cmdutils.c| 16 +++- >> > tools/graph2dot.c |4 ++-- >> > 3 files changed, 16 insertions(+), 10 deletions(-) >> > >> > diff --git a/avprobe.c b/avprobe.c >> > index 16a5d29..3a3ae0f 100644 >> > --- a/avprobe.c >> > +++ b/avprobe.c >> > @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, int >> > stream_idx) >> > const char *profile; >> > char val_str[128]; >> > AVRational display_aspect_ratio; >> > +const AVPixFmtDescriptor *desc; >> > >> > probe_object_header("stream"); >> > >> > @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, int >> > stream_idx) >> >rational_string(val_str, sizeof(val_str), ":", >> >&display_aspect_ratio)); >> > } >> > -probe_str("pix_fmt", >> > - dec_ctx->pix_fmt != AV_PIX_FMT_NONE ? >> > - av_pix_fmt_descriptors[dec_ctx->pix_fmt].name : >> > "unknown"); >> > +desc = av_pix_fmt_desc_get(dec_ctx->pix_fmt); >> >> I think this is an awful idea, it should be OK to access the array directly. >> > > I don't see any advantage to accessing the array directly. It often leads to smaller code. Instead of AVPixFmtDesc *d = getter(pix_fmt); if (!d) return -1; printf("%d\n", d->something);, you can just do printf("%d\n", array[pix_fmt].something);. Do we really need to babysit our users? > It is the only case in all Libav where we export an array like this. We never supported MSVC. So we shouldn't? > It fixes the size of AVPixFmtDescriptor for no good reason. Is there a need to extend it? > It prevents the caller from accessing all the descriptors when he's > using a newer library than the one he compiled against. if (compile_ver > runtime_ver) error(); on app init. Likely, you want to do that anyway. > And I hear there's some trouble with the evil microsoft thing and data > symbols. You have that anyway for any symbol shared between lavu/f/fi/c/r/*. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] x86inc: support stack mem allocation and re-alignment in PROLOGUE.
From: "Ronald S. Bultje" Use this in VP8/H264-8bit loopfilter functions so they can be used if there is no aligned stack (e.g. MSVC 32bit or ICC 10.x). --- libavcodec/x86/h264_deblock.asm | 27 ++- libavcodec/x86/h264dsp_init.c | 4 +- libavcodec/x86/vp8dsp.asm | 68 libavcodec/x86/vp8dsp_init.c| 8 -- libavutil/x86/x86inc.asm| 167 +--- 5 files changed, 181 insertions(+), 93 deletions(-) diff --git a/libavcodec/x86/h264_deblock.asm b/libavcodec/x86/h264_deblock.asm index 940a8f7..43aaf6d 100644 --- a/libavcodec/x86/h264_deblock.asm +++ b/libavcodec/x86/h264_deblock.asm @@ -399,14 +399,12 @@ DEBLOCK_LUMA ;- ; void deblock_v8_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- -cglobal deblock_%1_luma_8, 5,5 +cglobal deblock_%1_luma_8, 5, 6 - HAVE_ALIGNED_STACK, 0, 2 * %2 lea r4, [r1*3] dec r2 ; alpha-1 neg r4 dec r3 ; beta-1 add r4, r0 ; pix-3*stride -%assign pad 2*%2+12-(stack_offset&15) -SUB esp, pad movam0, [r4+r1] ; p1 movam1, [r4+2*r1] ; p0 @@ -444,22 +442,19 @@ cglobal deblock_%1_luma_8, 5,5 DEBLOCK_P0_Q0 mova[r4+2*r1], m1 mova[r0], m2 -ADD esp, pad RET ;- ; void deblock_h_luma( uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0 ) ;- INIT_MMX cpuname -cglobal deblock_h_luma_8, 0,5 +cglobal deblock_h_luma_8, 0,6 - HAVE_ALIGNED_STACK, 0, 0x70 movr0, r0mp movr3, r1m lear4, [r3*3] subr0, 4 lear1, [r0+r4] -%assign pad 0x78-(stack_offset&15) -SUBesp, pad -%define pix_tmp esp+12 +%define pix_tmp esp+16 ; transpose 6x16 -> tmp space TRANSPOSE6x8_MEM PASS8ROWS(r0, r1, r3, r4), pix_tmp @@ -501,7 +496,6 @@ cglobal deblock_h_luma_8, 0,5 movq m3, [pix_tmp+0x48] TRANSPOSE8x4B_STORE PASS8ROWS(r0, r1, r3, r4) -ADDesp, pad RET %endmacro ; DEBLOCK_LUMA @@ -632,7 +626,7 @@ DEBLOCK_LUMA v, 16 %define mpb_0 m14 %define mpb_1 m15 %else -%define spill(x) [esp+16*x+((stack_offset+4)&15)] +%define spill(x) [esp+16*x] %define p2 [r4+r1] %define q2 [r0+2*r1] %define t4 spill(0) @@ -647,10 +641,7 @@ DEBLOCK_LUMA v, 16 ;- ; void deblock_v_luma_intra( uint8_t *pix, int stride, int alpha, int beta ) ;- -cglobal deblock_%1_luma_intra_8, 4,6,16 -%if ARCH_X86_64 == 0 -sub esp, 0x60 -%endif +cglobal deblock_%1_luma_intra_8, 4, 6, 16, ARCH_X86_32 * 0x50 lea r4, [r1*4] lea r5, [r1*3] ; 3*stride dec r2d; alpha-1 @@ -699,9 +690,6 @@ cglobal deblock_%1_luma_intra_8, 4,6,16 LUMA_INTRA_SWAP_PQ LUMA_INTRA_P012 [r0], [r0+r1], [r0+2*r1], [r0+r5] .end: -%if ARCH_X86_64 == 0 -add esp, 0x60 -%endif RET INIT_MMX cpuname @@ -738,12 +726,10 @@ cglobal deblock_h_luma_intra_8, 4,9 addrsp, 0x88 RET %else -cglobal deblock_h_luma_intra_8, 2,4 +cglobal deblock_h_luma_intra_8, 2, 5 - HAVE_ALIGNED_STACK, 0, 0x80 lear3, [r1*3] subr0, 4 lear2, [r0+r3] -%assign pad 0x8c-(stack_offset&15) -SUBrsp, pad %define pix_tmp rsp ; transpose 8x16 -> tmp space @@ -774,7 +760,6 @@ cglobal deblock_h_luma_intra_8, 2,4 lear0, [r0+r1*8] lear2, [r2+r1*8] TRANSPOSE8x8_MEM PASS8ROWS(pix_tmp+8, pix_tmp+0x38, 0x10, 0x30), PASS8ROWS(r0, r2, r1, r3) -ADDrsp, pad RET %endif ; ARCH_X86_64 %endmacro ; DEBLOCK_LUMA_INTRA diff --git a/libavcodec/x86/h264dsp_init.c b/libavcodec/x86/h264dsp_init.c index 3f6ded4..8596f26 100644 --- a/libavcodec/x86/h264dsp_init.c +++ b/libavcodec/x86/h264dsp_init.c @@ -275,18 +275,16 @@ void ff_h264dsp_init_x86(H264DSPContext *c, const int bit_depth, c->biweight_h264_pixels_tab[0] = ff_h264_biweight_16_sse2; c->biweight_h264_pixels_tab[1] = ff_h264_biweight_8_sse2; -#if HAVE_ALIGNED_STACK c->h264_v_loop_filter_luma = ff_deblock_v_luma_8_sse2; c->h264_h_loop_filter_luma = ff_deblock_h_luma_8_sse2; c->h264_v_loop_filter_luma_intra = ff_deblock_v_luma_intra_8_sse2; c->h264_h_loop_filter_luma_intra = ff_deblock_h_luma_intra_8_sse2; -#endif /* HAVE_ALIGNED_STACK */ } if (EXTERNAL_SSSE3(mm_flags)) { c->bi
Re: [libav-devel] [PATCH 4/9] tools: do not use av_pix_fmt_descriptors directly.
Hi, On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov wrote: > --- > avprobe.c |6 +++--- > cmdutils.c| 16 +++- > tools/graph2dot.c |4 ++-- > 3 files changed, 16 insertions(+), 10 deletions(-) > > diff --git a/avprobe.c b/avprobe.c > index 16a5d29..3a3ae0f 100644 > --- a/avprobe.c > +++ b/avprobe.c > @@ -584,6 +584,7 @@ static void show_stream(AVFormatContext *fmt_ctx, int > stream_idx) > const char *profile; > char val_str[128]; > AVRational display_aspect_ratio; > +const AVPixFmtDescriptor *desc; > > probe_object_header("stream"); > > @@ -629,9 +630,8 @@ static void show_stream(AVFormatContext *fmt_ctx, int > stream_idx) >rational_string(val_str, sizeof(val_str), ":", >&display_aspect_ratio)); > } > -probe_str("pix_fmt", > - dec_ctx->pix_fmt != AV_PIX_FMT_NONE ? > - av_pix_fmt_descriptors[dec_ctx->pix_fmt].name : > "unknown"); > +desc = av_pix_fmt_desc_get(dec_ctx->pix_fmt); I think this is an awful idea, it should be OK to access the array directly. Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 2/9] Replace PIX_FMT_* -> AV_PIX_FMT_*, PixelFormat -> AVPixelFormat
Hi, On Sat, Oct 6, 2012 at 9:58 AM, Anton Khirnov wrote: [..] Yay! OK. (Didn't review at all, a sed replace can't be that hard, plus this is awesome.) Ronald ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel