When you delete the *(DU+j)=Element line, the loop is most likely removed
entirely.  The only relevant calculations at that point are for k and
the final value of Element, and neither is used afterwards.  I expect
that GCC notices that the loop is essentially a NOP and drops it.  That
is more likely the reason for the dramatic "performance improvement".
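
One way to check this is to time a variant that skips the store to DU but
still keeps the arithmetic observable.  The following is only a sketch
under assumptions: the function names and the global "sink" are
hypothetical, the loop body is copied from the quoted code (including the
(j+1)*Np row offset, so P must hold at least (Np+1)*Np doubles), and it is
written as plain user-space C rather than as the actual ISR.

```c
#include <assert.h>

/* Hypothetical global: a volatile write cannot be optimized away. */
volatile double sink;

/* Original form: every row result is stored to DU, so the work is live. */
void multiply_store(const double *P, const double *W, const double *f,
                    double *DU, int Np)
{
    int i, j, k;
    double Element;

    assert(Np > 0);                 /* sketch-level sanity check */
    for (j = 0; j < Np; j++) {
        Element = 0.0;
        k = (j + 1) * Np;           /* row offset as in the quoted code */
        for (i = 0; i < Np; i++)
            Element += *(P + k + i) * (*(W + i) - *(f + i));
        *(DU + j) = Element;
    }
}

/* Timing variant: no store to DU, but the row results are summed and
 * written once to the volatile sink, so the compiler has to keep the
 * multiply-accumulate work instead of deleting the loop as dead code. */
void multiply_no_store(const double *P, const double *W, const double *f,
                       int Np)
{
    int i, j, k;
    double Element, total = 0.0;

    assert(Np > 0);
    for (j = 0; j < Np; j++) {
        Element = 0.0;
        k = (j + 1) * Np;
        for (i = 0; i < Np; i++)
            Element += *(P + k + i) * (*(W + i) - *(f + i));
        total += Element;           /* keeps each row result in use */
    }
    sink = total;                   /* single observable side effect */
}
```

If multiply_no_store takes about as long as multiply_store, the store to
DU was never the expensive part, and the 9µs figure probably measured an
empty loop.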

Ty

> On Thu, 24 Jan 2002, Arne Linder wrote:
> 
> 
>>Hello everybody,
>>
>>
>>my question is a bit off topic, but since I know from the conference in
>>Milan that there are a lot of programming specialists among us, I hope
>>you can help me.
>>
>>
>>In my real-time application an ISR has to do a matrix multiplication in
>>real time. This is done by the following code:
>>
>>===
>>
>> for (j=0; j<Np; j++)
>>
>>    {
>>
>>      Element=0.0;
>>
>>      k=(j+1)*Np;
>>
>>      for (i=0; i<Np; i++)
>>
>>        Element+= *(P+k+i) * (*(W+i) - *(f+i));
>>
>>      *(DU+j)=Element;
>>
>>    }
>>
>>===
>>
>>
>>*DU, *W and *f are declared as double.
>>
>>Compiler options: -O2 -D__KERNEL__ -DMODULE -c
>>-I/usr/src/rtai-1.3/include -funroll-loops
>>
>>
>>With Np=100 the calculation time is 51µs. This seems quite high for a
>>900MHz AMD Duron processor, so I tried to find out which part of the
>>routine consumes most of the time. I found that deleting the statement
>>"*(DU+j)=Element" reduces the processing time to 9µs.
>>
>>My first thought was that cache misses when accessing *DU cause that
>>huge difference, but changing "*(DU+j)=Element" into
>>"Element=*(DU+j)" also resulted in 9µs processing time. (Of course the
>>program didn't give the desired results in that case.)
>>
>>
>>In my opinion the gcc compiler (version 2.95.2 19991024) refuses to
>>unroll the loops correctly when the line "*(DU+j)=Element;" is present
>>in the outer for-loop. Does anyone have an idea how to solve this
>>problem?
>>
>>
>>Greetings from Wuppertal
>>
>>
>>Arne Linder
>>
>>
>>
>>Dipl.-Ing. Arne Linder
>>Labor fuer elektrische Maschinen und Antriebe
>>Fachbereich 13
>>Bergische Universitaet - Gesamthochschule Wuppertal
>>D-42097 Wuppertal
>>e-mail: [EMAIL PROTECTED]
>>-- [rtl] ---
>>To unsubscribe:
>>echo "unsubscribe rtl" | mail [EMAIL PROTECTED] OR
>>echo "unsubscribe rtl <Your_email>" | mail [EMAIL PROTECTED]
>>--
>>For more information on Real-Time Linux see:
>>http://www.rtlinux.org/
>>
>>
> 
> 



-- 
Tyson D Sawyer                             iRobot Corporation
Senior Systems Engineer                    Military Systems Division
[EMAIL PROTECTED]                         Robots for the Real World
603-532-6900 ext 206                       http://www.irobot.com
