Bin Chen wrote:
> On Tue, 2007-10-16 at 10:04 +0200, Denis Oliver Kropp wrote:
>> Bin Chen wrote:
>>> On Tue, 2007-10-16 at 09:23 +0200, Denis Oliver Kropp wrote:
>>>> Phil Endecott wrote:
>>>>> I've also noticed that djpeg runs about 15% faster if compiled with -Os 
>>>>> rather than -O4.
>>>> On x86?
>>>>
>>>> On embedded architectures, -O2 is often better than -O3, but if you have
>>>> a very small instruction cache, -Os could be best.
>>> That's interesting. Why is -O2 better?
>> -O3 can produce code (loop unrolling etc.) where the cache penalty is
>> bigger than the speed improvement.
>>
> Thanks Phil. So what is loop unrolling? Does it expand a loop into repeated
> machine code to reduce the penalty of jumping back?

Yes.

> Such as 
> 
> for (i = 0; i < 5; i++) {
>     do sth for i;
> }
> 
> expands to
> 
> do sth for 0
> do sth for 1
> ...
> do sth for 4
> 
> The expanded code doesn't need to jump, so it can improve prefetch
> efficiency.

Fewer instructions are executed in total, but more instructions have to be
fetched, so the larger code can overflow a small instruction cache.
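
To make the trade-off concrete, here is a minimal C sketch, unrolled by hand
for illustration only. The work() helper and both function names are made up
for this example. A compiler applies the equivalent transformation on its own
when it judges unrolling worthwhile, and the code it generates will differ.

```c
/* Hypothetical per-element work; stands in for "do sth for i". */
static int work(int i)
{
    return i * i;
}

/* Rolled: one copy of the body, plus an increment, a compare and a
 * backward branch on every iteration. */
int sum_rolled(void)
{
    int total = 0;
    for (int i = 0; i < 5; i++)
        total += work(i);
    return total;
}

/* Fully unrolled by hand: no loop counter and no branch, but five
 * copies of the body that all have to be fetched. */
int sum_unrolled(void)
{
    int total = 0;
    total += work(0);
    total += work(1);
    total += work(2);
    total += work(3);
    total += work(4);
    return total;
}
```

Both functions return the same value. The unrolled one executes fewer
instructions but takes up more space in the instruction cache.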

-- 
Best regards,
  Denis Oliver Kropp

.------------------------------------------.
| DirectFB - Hardware accelerated graphics |
| http://www.directfb.org/                 |
"------------------------------------------"

