David Dyck wrote:
In the mspgcc documentation, "Appendix E - Tips and tricks"
http://mspgcc.sourceforge.net/doc_appendixE.html
trick #18 ends with:
  Alternatively, you can mark these loops as volatile:
    for(i=0;i<1234;i++) __asm__ __volatile__("; loop");
  This example does not introduce any extra code and will work as you think.
If I use -O2 or -Os, the loop is still partially unrolled,
even if I throw in -fno-unroll-loops.
(If I add -fno-strength-reduce, the loop is not unrolled.)
Surprisingly, changing the limit from 1234 to 1000 makes the compiler
unroll the loop 25x (so the index is incremented by 25 on each pass):
  limit   #times unrolled
  1234    2
  1024    32
  1024    27
  1000    25
I would imagine that these similarly sized loops would
execute at unexpected speeds!
(Better to heed the previous warning and use timers,
or make the index volatile.)
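For reference, making the index volatile might look like the sketch
below. The helper name spin_delay and its return value are hypothetical
(they are not from the mspgcc documentation); the return value is only
there so the iteration count can be checked. Each pass still performs a
real load and store of i, so the loop should survive optimisation, but
the time per pass still depends on the compiler and the clock.

```c
/* Hypothetical helper (not from the mspgcc documentation): spin for
 * `count` iterations and report how many passes were executed. */
static unsigned int spin_delay(unsigned int count)
{
    /* volatile: every read and write of i must really be performed,
     * so the compiler may not delete the loop or skip accesses. */
    volatile unsigned int i;
    unsigned int executed = 0;

    for (i = 0; i < count; i++) {
        executed++;   /* bookkeeping only, so the result can be inspected */
    }
    return executed;
}
```

A call such as spin_delay(1234) then replaces the bare for loop. This
fixes the "loop disappears" problem, not the "how long does it take"
problem, so a timer is still the safer choice for real delays.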
Depending on exactly what you do,
  for(i=0;i<1234;i++) __asm__ __volatile__("; loop");
can be optimised away completely by the compiler if "i" is a local
variable.
The new documentation I am preparing now suggests the following:
Regardless of these optimisation issues, this type of delay loop is
poor programming style - you have no idea how long or short a delay it
might produce (although there is an obvious minimum bound!). It would
be better, and more reliable, to define something like:
  static void __inline__ brief_pause(register unsigned int n)
  {
      __asm__ __volatile__ (
          "1: \n"
          " dec %[n] \n"
          " jne 1b \n"
          : [n] "+r"(n));
  }
and call this routine where needed. This is simple, compact, and
predictable.
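To pick a value of n for brief_pause, something like the following
sketch could be used. It assumes about 3 CPU cycles per pass of the
dec/jne loop (1 for a register dec, 2 for the jump, which is my reading
of the MSP430 instruction timings and worth checking against the CPU
user's guide). The names pause_count_for_us and
PAUSE_CYCLES_PER_ITERATION are hypothetical, and call overhead is
ignored.

```c
#include <stdint.h>

/* Assumption: one pass of the dec/jne loop costs about 3 CPU cycles. */
#define PAUSE_CYCLES_PER_ITERATION 3u

/* Hypothetical helper: loop count giving at least `microseconds` of
 * delay at a CPU clock of `cpu_hz`, clamped to a 16-bit register. */
static unsigned int pause_count_for_us(uint32_t cpu_hz, uint32_t microseconds)
{
    /* Total CPU cycles available in the requested interval. */
    uint64_t cycles = ((uint64_t)cpu_hz * microseconds) / 1000000u;

    /* Round up so the delay is never shorter than requested. */
    uint64_t count = (cycles + PAUSE_CYCLES_PER_ITERATION - 1u)
                     / PAUSE_CYCLES_PER_ITERATION;

    /* n == 0 must be avoided: dec would wrap 0 to 0xFFFF and the loop
     * would run through the whole 16-bit range before reaching zero. */
    if (count == 0u) {
        count = 1u;
    }
    /* The MSP430 register holding n is 16 bits wide. */
    if (count > 65535u) {
        count = 65535u;
    }
    return (unsigned int)count;
}
```

For example, at 1 MHz a 300 us pause works out to 300 cycles, so
brief_pause(pause_count_for_us(1000000UL, 300)) would pass n = 100.
Longer delays than the clamp allows are better handled by a timer.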
Dmitry,
Is unrolling a loop by *25X* a good idea? That sounds like a real bulky
code generator, and unrolling by more than a few doesn't gain a lot. I
don't seem to see such extreme unrolling, but maybe I just don't hit the
right conditions in my code.
Regards,
Steve