Re: Infinite number of iterations in loop [v850, mep]
Hi, My name is Chris Gallimore and I am a headhunter. I am currently recruiting for a compiler design engineer with knowledge of GCC to join Ericsson in the UK. I was hoping that somebody here might be interested or might know somebody. If so, please drop me an email at chris.gallim...@wenhamcarter.com Best regards, Chris -- View this message in context: http://gcc.1065356.n5.nabble.com/Infinite-number-of-iterations-in-loop-v850-mep-tp984790p1003192.html Sent from the gcc - Dev mailing list archive at Nabble.com.
RE: Infinite number of iterations in loop [v850, mep]
> -Original Message- > From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Paulo > Matos > Sent: 09 January 2014 16:48 > To: Richard Biener > Cc: Andrew Haley; gcc@gcc.gnu.org; Jan Hubicka > Subject: RE: Infinite number of iterations in loop [v850, mep] > > > I would like some comments on the following patch that seems to work but I > think > it could be generalized. > The idea is for the specific infinite condition of type (and reg int), we can > search for the definition of reg, > check nonzero_bits and check that they don't match any of the bits in int. > Forget the patch, I am implementing a much more robust patch with the same objective which I will post with a request for comments in a separate thread. Paulo Matos
RE: Infinite number of iterations in loop [v850, mep]
> -Original Message- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: 08 January 2014 14:42 > To: Paulo Matos > Cc: Andrew Haley; gcc@gcc.gnu.org; Jan Hubicka > Subject: Re: Infinite number of iterations in loop [v850, mep] > > On Wed, Jan 8, 2014 at 3:09 PM, Paulo Matos wrote: > >> -Original Message- > >> From: Richard Biener [mailto:richard.guent...@gmail.com] > >> Sent: 08 January 2014 11:03 > >> To: Paulo Matos > >> Cc: Andrew Haley; gcc@gcc.gnu.org > >> Subject: Re: Infinite number of iterations in loop [v850, mep] > >> > >> That was refering to the case with extern b. For the above case the > >> issue must be sth else. Trying a cross to v850-elf to see if it > >> reproduces for me (if 'b' is a stack or argument slot then we might > >> bogously think that *c++ = 0 may clobber it, otherwise RTL > >> number of iteration analysis might just be confused). > >> > >> So for example (should be arch independent) > >> > >> struct X { int i; int j; int k; int l[24]; }; > >> > >> int foo (struct X x, int *p) > >> { > >> int z = x.j; > >> *p = 1; > >> return z; > >> } > >> > >> see if there is a anti-dependence between x.j and *p on the RTL side > >> (at least the code dispatching to the tree oracle using the MEM_EXPRs > >> should save you from that). > >> > >> So - v850 at least doesn't pass b in memory and the doloop recognition > >> works for me (on trunk). > >> > > > > You are right, everything is fine with the above example regarding the anti- > dependence and with the loop as well. I got confused with mine not generating > a > loop for > > void fn1 (unsigned int b) > > { > > unsigned int a; > > for (a = 0; a < b; a++) > > *c++ = 0; > > } > > > > but that simply because in our case it is not profitable. > > > > However, for the case: > > void matrix_add_const(unsigned int N, short *A, short val) { > > unsigned int i,j; > > for (i=0; i > for (j=0; j >A[i*N+j] += val; > > } > > } > > } > > > > GCC thinks for v850 and my port that the inner loop might be infinite. > > It looks like GCC is mangling the loop so much that the obviousness that the > inner loop is finite is lost. > > > > This however turns out to be very performance degrading. Using -fno-ivopts > makes generation of loops work again both in my port and v850. > > Is there a way to fine-tune ivopts besides trying to tune the costs or do > > you > reckon this is something iv-analysis should be smarter about? > > Well. We have > > Loop 2 is simple: > simple exit 5 -> 7 > infinite if: (expr_list:REG_DEP_TRUE (and:SI (reg:SI 76) > (const_int 1 [0x1])) > (nil)) > number of iterations: (lshiftrt:SI (plus:SI (minus:SI (reg:SI 68 [ D.1398 ]) > (reg:SI 64 [ ivtmp___6 ])) > (const_int -2 [0xfffe])) > (const_int 1 [0x1])) > upper bound: 2147483646 > realistic bound: -1 > Doloop: Possible infinite iteration case. > Doloop: The loop is not suitable. > > as we replaced the induction variable by a pointer induction with > step 2. So this might be a very common issue for RTL loop opts, > the upper bound of the IV is 2 * N in this case, so 2 * N & 1 > should be always false and thus "infinite" be optimized. > > (insn 34 33 36 3 (parallel [ > (set (reg:SI 76) > (plus:SI (reg/v:SI 71 [ N ]) > (reg/v:SI 71 [ N ]))) > (clobber (reg:CC 32 psw)) > ]) 21 {addsi3} > (expr_list:REG_UNUSED (reg:CC 32 psw) > (nil))) > > that doesn't look too difficult to do with the above definition. > nonzero_bits might be of use here, not sure (not my area of > expertise). > I would like some comments on the following patch that seems to work but I think it could be generalized. The idea is for the specific infinite condition of type (and reg int), we can search for the definition of reg, check nonzero_bits and check that they don't match any of the bits in int. diff --git a/gcc/loop-iv.c b/gcc/loop-iv.c index 4c34007..215fd22 100644 --- a/gcc/loop-iv.c +++ b/gcc/loop-iv.c @@ -2064,6 +2064,50 @@ simplify_using_initial_values (struct loop *loop, enum rtx_code op, rtx *expr) e = single_pred_edge (e->src); } + /* For certain patterns we can do even better, like (and (reg) 1). */ + if (GET_CODE (*expr) == AND + && REG_P (XEXP (*expr, 0)
RE: Infinite number of iterations in loop [v850, mep]
> -Original Message- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: 08 January 2014 14:42 > To: Paulo Matos > Cc: Andrew Haley; gcc@gcc.gnu.org; Jan Hubicka > Subject: Re: Infinite number of iterations in loop [v850, mep] > > Well. We have > > Loop 2 is simple: > simple exit 5 -> 7 > infinite if: (expr_list:REG_DEP_TRUE (and:SI (reg:SI 76) > (const_int 1 [0x1])) > (nil)) > number of iterations: (lshiftrt:SI (plus:SI (minus:SI (reg:SI 68 [ D.1398 ]) > (reg:SI 64 [ ivtmp___6 ])) > (const_int -2 [0xfffe])) > (const_int 1 [0x1])) > upper bound: 2147483646 > realistic bound: -1 > Doloop: Possible infinite iteration case. > Doloop: The loop is not suitable. > > as we replaced the induction variable by a pointer induction with > step 2. So this might be a very common issue for RTL loop opts, > the upper bound of the IV is 2 * N in this case, so 2 * N & 1 > should be always false and thus "infinite" be optimized. > > (insn 34 33 36 3 (parallel [ > (set (reg:SI 76) > (plus:SI (reg/v:SI 71 [ N ]) > (reg/v:SI 71 [ N ]))) > (clobber (reg:CC 32 psw)) > ]) 21 {addsi3} > (expr_list:REG_UNUSED (reg:CC 32 psw) > (nil))) > > that doesn't look too difficult to do with the above definition. > nonzero_bits might be of use here, not sure (not my area of > expertise). > I am trying to do something that shouldn't be too hard with the current df infrastructure but I don't think I am doing it the right way. Once I have the assumption (and:SI (reg:SI 76) (const_int 1 [0x1])) I need to reach the definition of reg 76 which is insn 34. Generally we can only do this if there if no other definition of reg 76 except one in a dominator basic block. The CFG looks like: BB2 / \ BB3 BB7 | \ BB6exit \ BB4 - /\ BB5 BB9 | \ | BB8 BB7 (exit)BB4 (loop) | BB6 (loop) BB3 contains insn 34 and there's no other definition of reg 76 in loop BB6->BB4->BB5->BB8 or the inner BB4->BB9. Is there a way to do this search for definition of reg 76 automatically? I can see a df_find_def, however this requires me to have insn 34 already. I need to search for it. Is there anything in GCC to do this already? I am sure GCC must be doing this already somewhere. Cheers, Paulo Matos > Richard. >
RE: Infinite number of iterations in loop [v850, mep]
> -Original Message- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: 08 January 2014 14:42 > To: Paulo Matos > Cc: Andrew Haley; gcc@gcc.gnu.org; Jan Hubicka > Subject: Re: Infinite number of iterations in loop [v850, mep] > > Well. We have > > Loop 2 is simple: > simple exit 5 -> 7 > infinite if: (expr_list:REG_DEP_TRUE (and:SI (reg:SI 76) > (const_int 1 [0x1])) > (nil)) > number of iterations: (lshiftrt:SI (plus:SI (minus:SI (reg:SI 68 [ D.1398 ]) > (reg:SI 64 [ ivtmp___6 ])) > (const_int -2 [0xfffe])) > (const_int 1 [0x1])) > upper bound: 2147483646 > realistic bound: -1 > Doloop: Possible infinite iteration case. > Doloop: The loop is not suitable. > > as we replaced the induction variable by a pointer induction with > step 2. So this might be a very common issue for RTL loop opts, > the upper bound of the IV is 2 * N in this case, so 2 * N & 1 > should be always false and thus "infinite" be optimized. > > (insn 34 33 36 3 (parallel [ > (set (reg:SI 76) > (plus:SI (reg/v:SI 71 [ N ]) > (reg/v:SI 71 [ N ]))) > (clobber (reg:CC 32 psw)) > ]) 21 {addsi3} > (expr_list:REG_UNUSED (reg:CC 32 psw) > (nil))) > > that doesn't look too difficult to do with the above definition. > nonzero_bits might be of use here, not sure (not my area of > expertise). > You're right. After having had a look at the code this looks like something to be added to simplify_using_initial_values. I will try to patch it up and will post it upstream once complete for comments. Thanks, -- Paulo Matos
Re: Infinite number of iterations in loop [v850, mep]
On Wed, Jan 8, 2014 at 3:09 PM, Paulo Matos wrote: >> -Original Message- >> From: Richard Biener [mailto:richard.guent...@gmail.com] >> Sent: 08 January 2014 11:03 >> To: Paulo Matos >> Cc: Andrew Haley; gcc@gcc.gnu.org >> Subject: Re: Infinite number of iterations in loop [v850, mep] >> >> That was refering to the case with extern b. For the above case the >> issue must be sth else. Trying a cross to v850-elf to see if it >> reproduces for me (if 'b' is a stack or argument slot then we might >> bogously think that *c++ = 0 may clobber it, otherwise RTL >> number of iteration analysis might just be confused). >> >> So for example (should be arch independent) >> >> struct X { int i; int j; int k; int l[24]; }; >> >> int foo (struct X x, int *p) >> { >> int z = x.j; >> *p = 1; >> return z; >> } >> >> see if there is a anti-dependence between x.j and *p on the RTL side >> (at least the code dispatching to the tree oracle using the MEM_EXPRs >> should save you from that). >> >> So - v850 at least doesn't pass b in memory and the doloop recognition >> works for me (on trunk). >> > > You are right, everything is fine with the above example regarding the > anti-dependence and with the loop as well. I got confused with mine not > generating a loop for > void fn1 (unsigned int b) > { > unsigned int a; > for (a = 0; a < b; a++) > *c++ = 0; > } > > but that simply because in our case it is not profitable. > > However, for the case: > void matrix_add_const(unsigned int N, short *A, short val) { > unsigned int i,j; > for (i=0; i for (j=0; jA[i*N+j] += val; > } > } > } > > GCC thinks for v850 and my port that the inner loop might be infinite. > It looks like GCC is mangling the loop so much that the obviousness that the > inner loop is finite is lost. > > This however turns out to be very performance degrading. Using -fno-ivopts > makes generation of loops work again both in my port and v850. > Is there a way to fine-tune ivopts besides trying to tune the costs or do you > reckon this is something iv-analysis should be smarter about? Well. We have Loop 2 is simple: simple exit 5 -> 7 infinite if: (expr_list:REG_DEP_TRUE (and:SI (reg:SI 76) (const_int 1 [0x1])) (nil)) number of iterations: (lshiftrt:SI (plus:SI (minus:SI (reg:SI 68 [ D.1398 ]) (reg:SI 64 [ ivtmp___6 ])) (const_int -2 [0xfffe])) (const_int 1 [0x1])) upper bound: 2147483646 realistic bound: -1 Doloop: Possible infinite iteration case. Doloop: The loop is not suitable. as we replaced the induction variable by a pointer induction with step 2. So this might be a very common issue for RTL loop opts, the upper bound of the IV is 2 * N in this case, so 2 * N & 1 should be always false and thus "infinite" be optimized. (insn 34 33 36 3 (parallel [ (set (reg:SI 76) (plus:SI (reg/v:SI 71 [ N ]) (reg/v:SI 71 [ N ]))) (clobber (reg:CC 32 psw)) ]) 21 {addsi3} (expr_list:REG_UNUSED (reg:CC 32 psw) (nil))) that doesn't look too difficult to do with the above definition. nonzero_bits might be of use here, not sure (not my area of expertise). Richard. > Paulo Matos > >> Richard. >> >> > The same situation occurs with a coremark function: >> > void matrix_add_const(ee_u32 N, MATDAT *A, MATDAT val) { >> > ee_u32 i,j; >> > for (i=0; i> > for (j=0; j> >A[i*N+j] += val; >> > } >> > } >> > } >> > >> > It also says the inner loop might be infinite but it can't N is given as >> argument. j starts as 0, N is unsigned like N. This will loop N times. GCC >> cannot >> possibly assume array A will overwrite the value of N in the loop. This seems >> like something is wrong in alias analysis. >> > >> >> -- >> >> PMatos
RE: Infinite number of iterations in loop [v850, mep]
> -Original Message- > From: Richard Biener [mailto:richard.guent...@gmail.com] > Sent: 08 January 2014 11:03 > To: Paulo Matos > Cc: Andrew Haley; gcc@gcc.gnu.org > Subject: Re: Infinite number of iterations in loop [v850, mep] > > That was refering to the case with extern b. For the above case the > issue must be sth else. Trying a cross to v850-elf to see if it > reproduces for me (if 'b' is a stack or argument slot then we might > bogously think that *c++ = 0 may clobber it, otherwise RTL > number of iteration analysis might just be confused). > > So for example (should be arch independent) > > struct X { int i; int j; int k; int l[24]; }; > > int foo (struct X x, int *p) > { > int z = x.j; > *p = 1; > return z; > } > > see if there is a anti-dependence between x.j and *p on the RTL side > (at least the code dispatching to the tree oracle using the MEM_EXPRs > should save you from that). > > So - v850 at least doesn't pass b in memory and the doloop recognition > works for me (on trunk). > You are right, everything is fine with the above example regarding the anti-dependence and with the loop as well. I got confused with mine not generating a loop for void fn1 (unsigned int b) { unsigned int a; for (a = 0; a < b; a++) *c++ = 0; } but that simply because in our case it is not profitable. However, for the case: void matrix_add_const(unsigned int N, short *A, short val) { unsigned int i,j; for (i=0; i Richard. > > > The same situation occurs with a coremark function: > > void matrix_add_const(ee_u32 N, MATDAT *A, MATDAT val) { > > ee_u32 i,j; > > for (i=0; i > for (j=0; j >A[i*N+j] += val; > > } > > } > > } > > > > It also says the inner loop might be infinite but it can't N is given as > argument. j starts as 0, N is unsigned like N. This will loop N times. GCC > cannot > possibly assume array A will overwrite the value of N in the loop. This seems > like something is wrong in alias analysis. > > > >> -- > >> PMatos
Re: Infinite number of iterations in loop [v850, mep]
On Tue, Jan 7, 2014 at 4:47 PM, Paulo Matos wrote: >> -Original Message- >> From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Paulo >> Matos >> Sent: 13 November 2013 16:14 >> To: Andrew Haley >> Cc: gcc@gcc.gnu.org >> Subject: RE: Infinite number of iterations in loop [v850, mep] >> >> > -Original Message- >> > From: Andrew Haley [mailto:a...@redhat.com] >> > Sent: 13 November 2013 15:56 >> > To: Paulo Matos >> > Cc: gcc@gcc.gnu.org >> > Subject: Re: Infinite number of iterations in loop [v850, mep] >> > >> > On 11/13/2013 03:48 PM, Paulo Matos wrote: >> > >> > Because GCC does not know that *c++ = 0; will not overwrite b . I >> > suppose you could argue that it's not really infinite, because a will >> > eventually equal 0x, but I think that's what is going on. >> > >> > Andrew. >> > >> >> >> I will try to investigate further. >> > > After re-encountering this issue something is amiss. I think this is > incorrect. > In the example: > extern int *c; > > void fn1 (unsigned int b) > { > unsigned int a; > for (a = 0; a < b; a++) > *c++ = 0; > } > > It doesn't make sense to assume *c++ will overwrite b. b is an argument to > the function. That was refering to the case with extern b. For the above case the issue must be sth else. Trying a cross to v850-elf to see if it reproduces for me (if 'b' is a stack or argument slot then we might bogously think that *c++ = 0 may clobber it, otherwise RTL number of iteration analysis might just be confused). So for example (should be arch independent) struct X { int i; int j; int k; int l[24]; }; int foo (struct X x, int *p) { int z = x.j; *p = 1; return z; } see if there is a anti-dependence between x.j and *p on the RTL side (at least the code dispatching to the tree oracle using the MEM_EXPRs should save you from that). So - v850 at least doesn't pass b in memory and the doloop recognition works for me (on trunk). Richard. > The same situation occurs with a coremark function: > void matrix_add_const(ee_u32 N, MATDAT *A, MATDAT val) { > ee_u32 i,j; > for (i=0; i for (j=0; jA[i*N+j] += val; > } > } > } > > It also says the inner loop might be infinite but it can't N is given as > argument. j starts as 0, N is unsigned like N. This will loop N times. GCC > cannot possibly assume array A will overwrite the value of N in the loop. > This seems like something is wrong in alias analysis. > >> -- >> PMatos
RE: Infinite number of iterations in loop [v850, mep]
> -Original Message- > From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of Paulo > Matos > Sent: 13 November 2013 16:14 > To: Andrew Haley > Cc: gcc@gcc.gnu.org > Subject: RE: Infinite number of iterations in loop [v850, mep] > > > -Original Message- > > From: Andrew Haley [mailto:a...@redhat.com] > > Sent: 13 November 2013 15:56 > > To: Paulo Matos > > Cc: gcc@gcc.gnu.org > > Subject: Re: Infinite number of iterations in loop [v850, mep] > > > > On 11/13/2013 03:48 PM, Paulo Matos wrote: > > > > Because GCC does not know that *c++ = 0; will not overwrite b . I > > suppose you could argue that it's not really infinite, because a will > > eventually equal 0x, but I think that's what is going on. > > > > Andrew. > > > > > I will try to investigate further. > After re-encountering this issue something is amiss. I think this is incorrect. In the example: extern int *c; void fn1 (unsigned int b) { unsigned int a; for (a = 0; a < b; a++) *c++ = 0; } It doesn't make sense to assume *c++ will overwrite b. b is an argument to the function. The same situation occurs with a coremark function: void matrix_add_const(ee_u32 N, MATDAT *A, MATDAT val) { ee_u32 i,j; for (i=0; i -- > PMatos
RE: Infinite number of iterations in loop [v850, mep]
> -Original Message- > From: Andrew Haley [mailto:a...@redhat.com] > Sent: 13 November 2013 15:56 > To: Paulo Matos > Cc: gcc@gcc.gnu.org > Subject: Re: Infinite number of iterations in loop [v850, mep] > > On 11/13/2013 03:48 PM, Paulo Matos wrote: > > Because GCC does not know that *c++ = 0; will not overwrite b . I > suppose you could argue that it's not really infinite, because a will > eventually equal 0x, but I think that's what is going on. > > Andrew. > I think you might be right, since this works: extern int * __restrict c; extern unsigned int * __restrict b; void fn1 (void) { unsigned int a; for (a = 0; a < *b; a++) *c++ = 0; } I will try to investigate further. -- PMatos
Re: Infinite number of iterations in loop [v850, mep]
On 11/13/2013 03:48 PM, Paulo Matos wrote: > I cannot understand GCC's reasoning that the second loop is not > simple. The only source code difference is that unsigned int b is > extern. However, it will always be higher than 'a' (unsigned int) > and the loop can't possibly be infinite. > > Does anybody know why GCC is behaving this way? Because GCC does not know that *c++ = 0; will not overwrite b . I suppose you could argue that it's not really infinite, because a will eventually equal 0x, but I think that's what is going on. Andrew.