On 21/04/15 7:49 PM, James Almer wrote: > On 21/04/15 7:12 PM, Michael Niedermayer wrote: >> On Tue, Apr 21, 2015 at 06:51:52PM -0300, James Almer wrote: >>> On 21/04/15 6:38 PM, Michael Niedermayer wrote: >>>> On Tue, Apr 21, 2015 at 05:45:25PM -0300, James Almer wrote: >>>>> Signed-off-by: James Almer <jamr...@gmail.com> >>>>> diff --git a/libavcodec/ffv1.h b/libavcodec/ffv1.h >>>>> index 5081397..bfc4d71 100644 >>>>> --- a/libavcodec/ffv1.h >>>>> +++ b/libavcodec/ffv1.h >>>>> @@ -143,7 +143,7 @@ static av_always_inline int fold(int diff, int bits) >>>>> diff = (int8_t)diff; >>>>> else { >>>>> diff += 1 << (bits - 1); >>>>> - diff &= (1 << bits) - 1; >>>>> + diff = av_mod_uintp2(diff, bits); >>>>> diff -= 1 << (bits - 1); >>>>> } >>>>> >>>> >>>> iam not sure some of these changes are are a good idea >>>> for example above the mask has to be calculated anyway what can be >>>> gained from av_mod_uintp2() use here ? >>>> i think this should be bechmarked when its in performance critical code >>> >>> "1 << (bits - 1)" is not the same as "(1 << bits) - 1". The latter is not >>> going to be >>> calculated if the arch optimized version of av_mod_uintp2 is used. >> >> yep, i should not review patches when i am in a (IRC) meeting >> >> before the patch fold looks like this: (for the non 8bit case) >> addl 100(%rsp), %ecx >> andl 104(%rsp), %ecx >> subl 100(%rsp), %ecx >> .L869: >> afterwards it looks like this: >> >> addl 104(%rsp), %ecx >> andl 64(%rsp), %ecx >> subl 104(%rsp), %ecx >> .L869: >> >> so it seems ok >> >> >>> >>> The one file where i don't know if there's going to be any gain is in one >>> of the opus_celt.c >>> changes, where the mask needs to be calculated even if arch optimized >>> av_mod_uintp2 is used. >>> That one can be removed, but every other change is pretty straightforward. >> >> ok, then iam fine with the patch >> >> Thanks > > I'm going to apply the changes that are straightforward (single line changes > that didn't use > mask variables prior to this patch, or that created one but then used it > once). av_mod_uintp2 > should generate the same assembly with the non bmi2 optimized version.
Done and pushed, thanks. _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org http://ffmpeg.org/mailman/listinfo/ffmpeg-devel