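When you delete the *(DU+j)=Element line, the loop is most likely removed entirely. The only relevant calculations at that point are for k and the final value of Element, and neither is used after the loop. I expect that GCC will notice that the loop is essentially a NOP and remove it. The same reasoning applies to your Element=*(DU+j) variant, since nothing computed in the loop is stored or read afterwards. This is more likely the reason for the dramatic "performance improvement" than cache misses or loop unrolling.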
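To illustrate the point, here is a minimal sketch, assuming C99 and plain user-space code rather than the actual ISR. The function names multiply_with_store and multiply_checksum are hypothetical and not part of the original program. The first function is the original loop with array indexing; the second shows one way to time the arithmetic without the store to DU while still giving the compiler a result it has to keep.

```c
#include <assert.h>
#include <stddef.h>

/* Same arithmetic as the quoted loop:
 * DU[j] = sum over i of P[(j+1)*Np + i] * (W[i] - f[i]).
 * The store to DU[j] is the only observable effect of the loops. */
void multiply_with_store(const double *P, const double *W, const double *f,
                         double *DU, size_t Np)
{
    assert(P && W && f && DU);
    for (size_t j = 0; j < Np; j++) {
        double Element = 0.0;
        size_t k = (j + 1) * Np;  /* row offset, kept as in the original */
        for (size_t i = 0; i < Np; i++)
            Element += P[k + i] * (W[i] - f[i]);
        DU[j] = Element;
    }
}

/* Timing variant (hypothetical): no store to DU, but every Element is
 * folded into a value that is returned. As long as the caller uses the
 * return value, the optimizer cannot treat the loops as dead code. */
double multiply_checksum(const double *P, const double *W, const double *f,
                         size_t Np)
{
    assert(P && W && f);
    double sum = 0.0;
    for (size_t j = 0; j < Np; j++) {
        double Element = 0.0;
        size_t k = (j + 1) * Np;
        for (size_t i = 0; i < Np; i++)
            Element += P[k + i] * (W[i] - f[i]);
        sum += Element;  /* keeps the inner loop's result live */
    }
    return sum;
}
```

If the timing of multiply_checksum stays close to that of multiply_with_store, the store to DU was never the bottleneck, and the 9µs figure most likely measured an empty loop. Note that the k=(j+1)*Np offset means P must hold at least (Np+1)*Np doubles.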
Ty

> On Thu, 24 Jan 2002, Arne Linder wrote:
>
>> Hello everybody,
>>
>> my question is a bit off topic, but since I know from the conference in
>> Milan, that there are a lot of programming specialists among us, I hope,
>> you can help me.
>>
>> In my real-time application an ISR has to do a matrix multiplication in
>> real time. This is done by the following code:
>>
>> ===
>>   for (j=0; j<Np; j++)
>>   {
>>     Element=0.0;
>>     k=(j+1)*Np;
>>     for (i=0; i<Np; i++)
>>       Element+= *(P+k+i) * (*(W+i) - *(f+i));
>>     *(DU+j)=Element;
>>   }
>> ===
>>
>> *DU, *W and *f are declared as double.
>>
>> Compiler options: -O2 -D__KERNEL__ -DMODULE -c
>> -I/usr/src/rtai-1.3/include -funroll-loops
>>
>> With Np=100 the calculation time is 51µs. This seems to be quite high
>> for an AMD Duron processor with 900MHz. So I tried to find out, which
>> part of the routine is consuming most of the time. I realized, that
>> deleting the command "*(DU+j)=Element" reduces the processing time
>> to 9µs.
>>
>> My first thought was, that due to cache misses the time to access *DU
>> gives that huge difference, but changing "*(DU+j)=Element" into
>> "Element=*(DU+j)" also resulted in 9µs processing time. (Of course the
>> program didn't give the desired results in that case.)
>>
>> In my opinion the gcc compiler (version 2.95.2 19991024) refuses to
>> unroll the loops correctly when inserting the line "*(DU+j)=Element;"
>> into the outer for-loop. Does someone have an idea to solve this
>> problem?
>>
>> Greetings from Wuppertal
>>
>> Arne Linder
>>
>> Dipl.-Ing. Arne Linder
>> Labor fuer elektrische Maschinen und Antriebe
>> Fachbereich 13
>> Bergische Universitaet - Gesamthochschule Wuppertal
>> D-42097 Wuppertal
>> e-mail: [EMAIL PROTECTED]
>> -- [rtl] ---
>> To unsubscribe:
>> echo "unsubscribe rtl" | mail [EMAIL PROTECTED] OR
>> echo "unsubscribe rtl <Your_email>" | mail [EMAIL PROTECTED]
>> --
>> For more information on Real-Time Linux see:
>> http://www.rtlinux.org/

--
Tyson D Sawyer             iRobot Corporation
Senior Systems Engineer    Military Systems Division
[EMAIL PROTECTED]          Robots for the Real World
603-532-6900 ext 206       http://www.irobot.com