On Wed, Jul 6, 2011 at 4:10 PM, Ronald S. Bultje <[email protected]> wrote:
> Hi,
>
> On Wed, Jul 6, 2011 at 4:08 PM, Ronald S. Bultje <[email protected]> wrote:
>> I believe the reason this exists is so that we duplicate the code, and
>> shifting becomes a constant in the code, rather than a variable to be
>> loaded into cl. This leaves cl free for others and leads to somewhat of
>> a speedup.
>
> Actually, it may not be the shift alone, but also the fact that the
> shift is completely not there for the 8bit version. So it's actually
> the 8bit version that gets significantly faster with this, the 10bit
> maybe not so much (or just a little bit, at best, because cl is not
> clobbered).

Except that:

1) hl_motion is ALREADY always_inline'd.  So it doesn't matter!

2) The mc_dir_part chain of functions is not fully inlined into
hl_motion, despite the use of "inline".  If we force this, the code
size increases by 500 kilobytes for no real speed gain.  There might
be some gain to be had in tweaking this.
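
To make the pattern under discussion concrete, here is a minimal sketch,
not the actual h264 code. The names copy_row, copy_row_8 and copy_row_16
and the ALWAYS_INLINE macro are made up for illustration. The point is
only that when a force-inlined function is called with a literal
pixel_shift, the compiler can fold the shift to a constant (or drop it
entirely for the 8-bit case) in each wrapper:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative macro (an assumption, not the project's own definition):
 * force inlining on GCC/Clang, fall back to a plain hint elsewhere. */
#if defined(__GNUC__)
#define ALWAYS_INLINE static inline __attribute__((always_inline))
#else
#define ALWAYS_INLINE static inline
#endif

/* Hypothetical stand-in for an hl_motion-style template: copies `width`
 * pixels, where a pixel is 1 byte (pixel_shift == 0) or 2 bytes
 * (pixel_shift == 1). */
ALWAYS_INLINE void copy_row(uint8_t *dst, const uint8_t *src,
                            size_t width, int pixel_shift)
{
    assert(pixel_shift == 0 || pixel_shift == 1);
    /* With a literal pixel_shift at the call site, this shift is a
     * compile-time constant after inlining; for 0 it can vanish. */
    size_t bytes = width << pixel_shift;
    for (size_t i = 0; i < bytes; i++)
        dst[i] = src[i];
}

/* 8-bit instantiation: pixel_shift is the constant 0. */
void copy_row_8(uint8_t *dst, const uint8_t *src, size_t width)
{
    copy_row(dst, src, width, 0);
}

/* High-bit-depth instantiation: pixel_shift is the constant 1. */
void copy_row_16(uint8_t *dst, const uint8_t *src, size_t width)
{
    copy_row(dst, src, width, 1);
}
```

Whether this duplication pays off depends on how far down the call chain
the inlining actually reaches, which is what point 2 is about.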

Jason
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
