Hi,

On Wed, Jul 6, 2011 at 4:18 PM, Jason Garrett-Glaser <[email protected]> wrote:
> On Wed, Jul 6, 2011 at 4:10 PM, Ronald S. Bultje <[email protected]> wrote:
>> Hi,
>>
>> On Wed, Jul 6, 2011 at 4:08 PM, Ronald S. Bultje <[email protected]> wrote:
>>> I believe the reason this exists is so that we duplicate the code, and
>>> the shift becomes a constant in the code, rather than a variable to be
>>> loaded into cl. This leaves cl free for other uses and leads to somewhat
>>> of a speedup.
>>
>> Actually, it may not be the shift alone, but also the fact that the
>> shift is absent entirely in the 8bit version. So it's actually the
>> 8bit version that gets significantly faster with this, the 10bit
>> maybe not so much (or just a little bit, at best, because cl is not
>> clobbered).
>
> Except that:
>
> 1) hl_motion is ALREADY always_inline'd.  So it doesn't matter!
>
> 2) The mc_dir_part chain of functions is not fully inlined into
> hl_motion, despite the use of "inline".  If we force this, the code
> size increases by 500 kilobytes for no real speed gain.  There might
> be some gain to be had in tweaking this.

Hm, I see. decode_mb_simple() is already av_always_inline as well, so
I guess this is indeed just duplication for no real gain. Weird. I
assume you tested that this doesn't slow down regular (simple) h264
content, in which case it's fine.
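
For reference, a minimal sketch of the effect I had in mind. This is not
the actual h264 code: ALWAYS_INLINE_SKETCH, sample_offset(), offset_8bit()
and offset_10bit() are made-up names, and the attribute syntax assumes
GCC or Clang. Since the helper is force-inlined and each caller passes a
literal pixel_shift, the compiler should see the shift as a constant (and
can drop it when it is 0) without us duplicating any code by hand.

```c
#include <assert.h>

/* Hypothetical stand-in for av_always_inline (GCC/Clang attribute syntax). */
#define ALWAYS_INLINE_SKETCH static inline __attribute__((always_inline))

/* Hypothetical helper: pixel_shift is 0 for 8-bit samples and 1 for
 * 10-bit samples stored in 16 bits. stride is in bytes. */
ALWAYS_INLINE_SKETCH int sample_offset(int x, int y, int stride, int pixel_shift)
{
    return (x << pixel_shift) + y * stride;
}

/* Each wrapper passes a literal, so after inlining the shift count is a
 * compile-time constant rather than a value loaded into cl. */
static int offset_8bit(int x, int y, int stride)
{
    return sample_offset(x, y, stride, 0);
}

static int offset_10bit(int x, int y, int stride)
{
    return sample_offset(x, y, stride, 1);
}

/* Sanity check of the arithmetic only; it says nothing about speed. */
static void check_offsets(void)
{
    assert(offset_8bit(3, 2, 16) == 35);   /* 3 + 2*16        */
    assert(offset_10bit(3, 2, 32) == 70);  /* (3 << 1) + 2*32 */
}
```

Whether this actually produces a measurable gain would have to be
checked in the disassembly and with benchmarks, which is your point.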

Ronald
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
