On Fri, Oct 23, 2009 at 14:46, spop at gcc dot gnu dot org <gcc-bugzi...@gcc.gnu.org> wrote:
> and the code generated by CLooG for the interchange looks like this:
>
> for (scat_1=0;scat_1<=2;scat_1++) {
>   for (scat_3=0;scat_3<=2;scat_3++) {
>     S4(scat_1,scat_3) ;
>     for (scat_5=0;scat_5<=2;scat_5++) {
>       S5(scat_1,scat_5,scat_3) ;
>     }
>     S7(scat_1,scat_3) ;
>     S18(scat_1,scat_3) ;
>   }
S7 and S18 should not be generated before S5 has finished executing over all the iterations of the original innermost loop (do k=1,20). S7 and S18 contain the end of the reduction and the write to the array xs(i,j), which is independent of the k loop.

>   for (scat_3=3;scat_3<=19;scat_3++) {
>     for (scat_5=0;scat_5<=2;scat_5++) {
>       S5(scat_1,scat_5,scat_3) ;
>     }
>   }
> }
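
To make the ordering problem concrete, here is a minimal C sketch that replays the statement order of the generated code for a single value of scat_1. The statement bodies are assumptions made only for illustration and are not taken from the testcase: S4 is assumed to initialize the reduction, S5 to add 1 to it, and S7/S18 to store the result into xs. The names original_order, generated_order, acc, NJ and NK are hypothetical.

```c
enum { NJ = 3, NK = 20 };   /* assumed extents: 3 values of j, 20 values of k */

/* Original loop order: the whole k loop runs before the result is stored. */
static void original_order(int xs[NJ]) {
    for (int j = 0; j < NJ; j++) {
        int acc = 0;                  /* assumed role of S4: start the reduction */
        for (int k = 0; k < NK; k++)
            acc += 1;                 /* assumed role of S5: accumulate */
        xs[j] = acc;                  /* S7/S18: end of reduction, write xs */
    }
}

/* Statement order of the generated code above, for one scat_1.
   The reduction variable is expanded to one slot per j (an assumption). */
static void generated_order(int xs[NJ]) {
    int acc[NJ] = {0};
    for (int s3 = 0; s3 <= 2; s3++) {
        acc[s3] = 0;                  /* S4(scat_1,scat_3) */
        for (int s5 = 0; s5 <= 2; s5++)
            acc[s5] += 1;             /* S5(scat_1,scat_5,scat_3) */
        xs[s3] = acc[s3];             /* S7, S18: stored after very few S5 runs */
    }
    for (int s3 = 3; s3 <= 19; s3++)  /* remaining S5 instances run too late */
        for (int s5 = 0; s5 <= 2; s5++)
            acc[s5] += 1;
}
```

Under these assumptions original_order fills xs with the full reduction over all 20 iterations, while generated_order stores each xs entry before the second loop nest has run, so the S5 instances for scat_3 = 3..19 never reach xs.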