I am looking into reversing loops to increase efficiency. There is already PR22041 for this, and an old patch by Zdenek (https://gcc.gnu.org/ml/gcc-patches/2006-01/msg01851.html) which never made it to mainline.
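As a hand-written illustration of the idea (not compiler output), reversing a simple copy loop by hand would look roughly like the sketch below. The names N_ITER, copy_forward, copy_reversed and check_copies are made up for this example, and the reversed form assumes that a and c do not overlap, so the order of the iterations does not matter.

```c
#include <assert.h>
#include <string.h>

#define N_ITER 800  /* hypothetical constant trip count */

/* Forward form: the counter runs up and is compared against the bound. */
void copy_forward (double *a, double *c)
{
  for (unsigned int i = 0; i < N_ITER; ++i)
    c[i] = a[i];
}

/* Hand-reversed form: the counter runs down, so the exit test is a
   compare against zero.  Elements are copied last to first, which is
   only valid because the iterations are independent (a and c are
   assumed not to overlap).  */
void copy_reversed (double *a, double *c)
{
  for (unsigned int i = N_ITER; i != 0; --i)
    c[i - 1] = a[i - 1];
}

/* Hypothetical self-check: both forms must produce the same result. */
void check_copies (void)
{
  static double a[N_ITER], c1[N_ITER], c2[N_ITER];

  for (unsigned int i = 0; i < N_ITER; ++i)
    a[i] = (double) i;

  copy_forward (a, c1);
  copy_reversed (a, c2);
  assert (memcmp (c1, c2, sizeof c1) == 0);
}
```

Whether the count-down form is actually cheaper depends on the target and on which ivs ivopt ends up choosing.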
For a constant loop count, the ivcanon pass adds a reverse iv, but it is not selected by ivopt. For example:

    void copy (unsigned int N, double *a, double *c)
    {
      for (int i = 0; i < 800; ++i)
        c[i] = a[i];
    }

The ivcanon pass reports:

    Added canonical iv to loop 1, 799 iterations.

    ivtmp_14 = ivtmp_15 - 1;

ivopt then selects candidate 10:

    Candidate 10:
      Var befor: ivtmp.11
      Var after: ivtmp.11
      Incr POS: before exit test
      IV struct:
        Type: sizetype
        Base: 0
        Step: 8
        Biv: N

If we look at the groups:

    Group 0:
      Type: ADDRESS
      Use 0.0:
        At stmt: _5 = *_3;
        At pos: *_3
        IV struct:
          Type: double *
          Base: a_9(D)
          Step: 8
          Object: (void *) a_9(D)
          Biv: N
          Overflowness wrto loop niter: Overflow

    Group 1:
      Type: ADDRESS
      Use 1.0:
        At stmt: *_4 = _5;
        At pos: *_4
        IV struct:
          Type: double *
          Base: c_10(D)
          Step: 8
          Object: (void *) c_10(D)
          Biv: N
          Overflowness wrto loop niter: Overflow

    Group 2:
      Type: COMPARE
      Use 2.0:
        At stmt: if (ivtmp_14 != 0)
        At pos: ivtmp_14
        IV struct:
          Type: unsigned int
          Base: 799
          Step: 4294967295
          Biv: Y
          Overflowness wrto loop niter: Overflow

The ivopt cost model assumes that groups 0 and 1 have infinite cost for the iv added by the ivcanon pass, because that iv has lower precision.

If I change the example to:

    void copy (unsigned int N, double *a, double *c)
    {
      for (long i = 0; i < 800; ++i)
        c[i] = a[i];
    }

it still has a higher cost for groups 0 and 1, this time due to the negative step. I think this can be improved.

My questions are:

1. For the case where the loop count is not constant, can we make ivcanon add a reverse iv with the current implementation? Can ivopt be taught to select the reverse iv?

2. Or is the patch by Zdenek a better option? I am rebasing it for the trunk.

Thanks,
Kugan