I am looking into reversing loops to increase efficiency. There is
already PR22041 for this, and an old patch by Zdenek,
https://gcc.gnu.org/ml/gcc-patches/2006-01/msg01851.html, which never
made it to mainline.

For a constant loop count, the ivcanon pass adds a reverse iv, but it
is not selected by ivopt.

For example:

void copy (unsigned int N, double *a, double *c)
{
  for (int i = 0; i < 800; ++i)
    c[i] = a[i];
}

The ivcanon pass reports: Added canonical iv to loop 1, 799 iterations.
ivtmp_14 = ivtmp_15 - 1;
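
At the source level, the effect of the canonical iv can be pictured roughly as follows. This is only a hand-written sketch of what the dump describes, not the GIMPLE that ivcanon actually emits; the function names copy_up and copy_canon and the variable ivtmp are made up for illustration.

```c
/* The original loop: the exit test uses the up-counting i.  */
void copy_up (double *a, double *c)
{
  for (int i = 0; i < 800; ++i)
    c[i] = a[i];
}

/* Sketch of the loop after ivcanon: a second, down-counting iv
   (ivtmp) controls the exit, while i still drives the addresses.  */
void copy_canon (double *a, double *c)
{
  unsigned int ivtmp = 800;
  int i = 0;
  do
    {
      c[i] = a[i];
      ++i;
      ivtmp = ivtmp - 1;   /* corresponds to ivtmp_14 = ivtmp_15 - 1 */
    }
  while (ivtmp != 0);      /* corresponds to if (ivtmp_14 != 0) */
}
```

Both functions copy the same 800 elements; the second one just carries two ivs, which is why ivopt then has to decide which one to keep.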

In ivopt, candidate 10 is selected:

Candidate 10:
Var befor: ivtmp.11
Var after: ivtmp.11
Incr POS: before exit test
IV struct:
Type: sizetype
Base: 0
Step: 8
Biv: N

If we look at the groups:

Group 0:
Type: ADDRESS
Use 0.0:
At stmt: _5 = *_3;
At pos: *_3
IV struct:
Type: double *
Base: a_9(D)
Step: 8
Object: (void *) a_9(D)
Biv: N
Overflowness wrto loop niter: Overflow

Group 1:
Type: ADDRESS
Use 1.0:
At stmt: *_4 = _5;
At pos: *_4
IV struct:
Type: double *
Base: c_10(D)
Step: 8
Object: (void *) c_10(D)
Biv: N
Overflowness wrto loop niter: Overflow

Group 2:
Type: COMPARE
Use 2.0:
At stmt: if (ivtmp_14 != 0)
At pos: ivtmp_14
IV struct:
Type: unsigned int
Base: 799
Step: 4294967295
Biv: Y
Overflowness wrto loop niter: Overflow

The ivopt cost model assigns infinite cost to group 0 and group 1 for
the iv added by the ivcanon pass, because that iv (unsigned int) has
lower precision than the address ivs.
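
To make the precision point concrete, here is a rough sketch of the arithmetic involved. It assumes a target with 32-bit unsigned int, 64-bit sizetype and 8-byte double (e.g. LP64). The helper names are hypothetical, and this is not meant to show how ivopts computes anything internally.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the element accessed while the down-counting iv
   has value ivtmp (800 on the first iteration, 1 on the last).
   The subtraction is done in 32 bits and then has to be widened
   before scaling by the 8-byte step of the address ivs.  */
size_t offset_from_countdown (uint32_t ivtmp)
{
  uint32_t index = 800u - ivtmp;        /* 0 .. 799, in 32 bits */
  return (size_t) index * sizeof (double);
}

/* Adding the step 4294967295 from group 2 is the same as
   subtracting 1 modulo 2^32.  */
uint32_t step_down (uint32_t ivtmp)
{
  return ivtmp + 4294967295u;
}
```

The widening conversion in offset_from_countdown is the extra work that a 32-bit candidate would need in order to serve the address uses in groups 0 and 1.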

If I change the example to:

void copy (unsigned int N, double *a, double *c)
{
  for (long i = 0; i < 800; ++i)
    c[i] = a[i];
}

It still has a higher cost for group 0 and group 1 due to the negative
step. I think this can be improved. My questions are:

1. For the case where the loop count is not constant, can we make
ivcanon add a reverse iv with the current implementation? Can ivopt
be taught to select the reverse iv?

2. Or is the patch by Zdenek a better option? I am rebasing it onto trunk.
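
For reference, the kind of result I have in mind for the non-constant case looks roughly like this when written by hand. It is only a sketch: the function names are made up, and the reversed form is valid only under the assumption that a and c do not overlap, since the order of the accesses changes.

```c
/* Forward loop with a non-constant trip count.  */
void copy_fwd (unsigned int N, double *a, double *c)
{
  for (unsigned int i = 0; i < N; ++i)
    c[i] = a[i];
}

/* Hand-reversed form: a single down-counting iv serves both the
   exit test (compare against zero) and the addresses.  Assumes
   that a and c do not overlap.  */
void copy_rev (unsigned int N, double *a, double *c)
{
  for (unsigned int i = N; i != 0; --i)
    c[i - 1] = a[i - 1];
}
```

With a single iv compared against zero, no separate up-counting iv is needed for the addresses.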

Thanks,
Kugan
