Re: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-08-03 Thread Richard Sandiford
"Yangfei (Felix)"  writes:
> Hi Richard,
>
> Thanks for reviewing this fix and the detailed suggestions : -)
> Looks like my colleague Yunde was having some issue setting up his local 
> repo.
> I have prepared one for him.  Attached please find the patch.
> Bootstrapped and tested on aarch64-linux-gnu.  Please help install if 
> it's good to go.

Thanks, pushed to trunk.

Richard


RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-08-03 Thread Yangfei (Felix)
Hi Richard,

Thanks for reviewing this fix and the detailed suggestions : -)
Looks like my colleague Yunde was having some issue setting up his local 
repo.
I have prepared one for him.  Attached please find the patch.
Bootstrapped and tested on aarch64-linux-gnu.  Please help install if it's 
good to go.

Felix

> -Original Message-
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Friday, July 31, 2020 5:33 PM
> To: Zhongyunde 
> Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) 
> Subject: Re: [PATCH PR95696] regrename creates overlapping register
> allocations for vliw
> 
> Thanks for the update, looks good.  Could you post a changelog too so that I
> can use it when committing?
> 
> The changelog was the only reason I didn't just push the patch, but FWIW, a
> couple of very minor things…
> 
> Zhongyunde  writes:
> > diff --git a/gcc/regrename.c b/gcc/regrename.c old mode 100644 new
> > mode 100755 index 637b3cbe6d7..815ed22805d
> > --- a/gcc/regrename.c
> > +++ b/gcc/regrename.c
> > @@ -684,10 +684,12 @@ merge_chains (du_head_p c1, du_head_p c2)
> >c1->cannot_rename |= c2->cannot_rename;  }
> >
> > -/* Analyze the current function and build chains for renaming.  */
> > +/* Analyze the current function and build chains for renaming.
> > +   If INCLUDE_ALL_BLOCKS_P is set to true, should process all blocks,
> > +   ignoring BB_DISABLE_SCHEDULE.  The default value is true.  */
> 
> I think s/should// here, since GCC comments usually use an imperative style.
> 
> > @@ -737,6 +739,14 @@ regrename_analyze (bitmap bb_mask)
> >if (dump_file)
> > fprintf (dump_file, "\nprocessing block %d:\n", bb1->index);
> >
> > +  if (!include_all_block_p && (bb1->flags & BB_DISABLE_SCHEDULE) != 0)
> > +   {
> > + if (dump_file)
> > +   fprintf (dump_file, "avoid disrupting the sms schedule of bb %d\n",
> > +bb1->index);
> 
> bb1->index should be indented below “dump_file”.
> 
> Richard


pr95696-v0.diff
Description: pr95696-v0.diff


Re: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-31 Thread Richard Sandiford
Thanks for the update, looks good.  Could you post a changelog too
so that I can use it when committing?

The changelog was the only reason I didn't just push the patch,
but FWIW, a couple of very minor things…

Zhongyunde  writes:
> diff --git a/gcc/regrename.c b/gcc/regrename.c
> old mode 100644
> new mode 100755
> index 637b3cbe6d7..815ed22805d
> --- a/gcc/regrename.c
> +++ b/gcc/regrename.c
> @@ -684,10 +684,12 @@ merge_chains (du_head_p c1, du_head_p c2)
>c1->cannot_rename |= c2->cannot_rename;
>  }
>  
> -/* Analyze the current function and build chains for renaming.  */
> +/* Analyze the current function and build chains for renaming.
> +   If INCLUDE_ALL_BLOCKS_P is set to true, should process all blocks,
> +   ignoring BB_DISABLE_SCHEDULE.  The default value is true.  */

I think s/should// here, since GCC comments usually use an imperative
style.

> @@ -737,6 +739,14 @@ regrename_analyze (bitmap bb_mask)
>if (dump_file)
>   fprintf (dump_file, "\nprocessing block %d:\n", bb1->index);
>  
> +  if (!include_all_block_p && (bb1->flags & BB_DISABLE_SCHEDULE) != 0)
> + {
> +   if (dump_file)
> + fprintf (dump_file, "avoid disrupting the sms schedule of bb %d\n",
> +  bb1->index);

bb1->index should be indented below “dump_file”.

Richard


RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-30 Thread Zhongyunde

> -Original Message-
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Tuesday, July 28, 2020 1:33 AM
> To: Zhongyunde 
> Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) 
> Subject: Re: [PATCH PR95696] regrename creates overlapping register
> allocations for vliw
> 
> Zhongyunde  writes:
> > I reconsider the issue and update patch attached.
> > Yes, If the kernel loop bb's information doesn't use in regrename, it
> > also need not be collected to save compile time.
> 
> Thanks.  The point about other callers was a good one though.
> I think it would be OK to add a new parameter to regrename_analyze that
> says whether it should process all blocks, ignoring BB_DISABLE_SCHEDULE.
> The default value could be true, with only regrename itself passing false.

Thanks for your detail advice, and I add a new parameter to regrename_analyze.

> Why do you need the BB_MODIFIED test?  The point in time that that flag
> is measured from is somewhat arbitrary.  Also, the modification that
> caused the flag to be set might not have invalidated the schedule.

Yes, in fact the BB_MODIFIED test is not need.

> Richard
> 
> >
> >> -Original Message-
> >> From: Zhongyunde
> >> Sent: Sunday, July 26, 2020 3:29 PM
> >> To: 'Richard Sandiford' 
> >> Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) 
> >> Subject: RE: [PATCH PR95696] regrename creates overlapping register
> >> allocations for vliw
> >>
> >>
> >> > >> It's interesting that this is for a testcase using SMS.  One of
> >> > >> the traditional problems with the GCC implementation of SMS has
> >> > >> been ensuring that later passes don't mess up the scheduled
> >> > >> loop.  So in your testcase, does register allocation succeed for
> >> > >> the SMS loop without invalidating the bundling decisions?
> >> > >
> >> > > Yes.
> >> > >
> >> > >> If so, then it's probably better to avoid running regrename on it at
> all.
> >> > >> It mostly exists to help the second scheduling pass, but the
> >> > >> second scheduling pass shouldn't be messing with an SMS loop
> anyway.
> >> > >> Also, although the patch deals with one case in which regrename
> >> > >> could disrupt the bundling, there are others too.
> >> > >>
> >> > >> So maybe one option would be to make regrename ignore blocks
> >> > >> that have BB_DISABLE_SCHEDULE set.  (Sorry if that's been
> >> > >> discussed and discounted
> >> > >> already.)
> >> > >
> >> > > ok, according your advice, I make a new patch attached.
> >> >
> >> > Thanks.  I think we should treat the SMS and the REG_UNUSED stuff
> >> > as separate patches though.
> >> >
> >> > For the SMS part, I think a better place to enforce the rule is in
> >> > build_def_use.  If that function returns false early for
> >> > BB_DISABLE_SCHEDULE, we will avoid disrupting the schedule for the
> >> > block without wasting too much compile time on it, and we'll still
> >> > keep the pass structures internally correct.  (It would also be
> >> > good to have a dump_file message to say that that's what we're
> >> > doing.)
> >>
> >> > Do you still need the REG_UNUSED stuff with the SMS patch?  If so,
> >> > could you describe the (presumably non-SMS) cases that are
> affected?
> >>
> >> Yes, the non-SMS basic block should not be affected.
> >> An alternate method attached can avoid use REG_UNUSED stuff for BB
> >> with BB_DISABLE_SCHEDUL.
> >>
> >> I don't change build_def_use to return false early as I find some
> >> other optimization reuse the function regrename_analyze to creat
> >> def/use chain info of the kernel loop body in our target.
> >>
> >> > TBH, given that the bundling information is so uncertain at this
> >> > stage, I think it would be better to have a mode in which regrename
> >> > ignores REG_UNUSED notes altogether.  Perhaps we could put it
> under
> >> a
> >> > --param, which targets could then set to whichever default they
> prefer.
> >> > The default should be the current behaviour though.
> >>
> >>
> >> > Thanks,
> >> > Richard
> >
> > diff --git a/gcc/regrename.c b/gcc/regrename.c index
> > c38173a77..1df58e52d 100644
> > --- a/gcc/regrename.c
> > +++ b/gcc/regrename.c
> > @@ -737,6 +737,14 @@ regrename_analyze (bitmap bb_mask)
> >if (dump_file)
> > fprintf (dump_file, "\nprocessing block %d:\n", bb1->index);
> >
> > +  if ((bb1->flags & BB_DISABLE_SCHEDULE) != 0
> > + && (bb1->flags & BB_MODIFIED) == 0)
> > +   {
> > + if (dump_file)
> > +   fprintf (dump_file, "skip to avoid disrupting the sms schedule\n");
> > + continue;
> > +   }
> > +
> >init_rename_info (this_info, bb1);
> >
> >success = build_def_use (bb1);


PR95696_5.patch
Description: PR95696_5.patch


Re: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-27 Thread Richard Sandiford
Zhongyunde  writes:
> I reconsider the issue and update patch attached.
> Yes, If the kernel loop bb's information doesn't use in regrename, it
> also need not be collected to save compile time.

Thanks.  The point about other callers was a good one though.
I think it would be OK to add a new parameter to regrename_analyze
that says whether it should process all blocks, ignoring
BB_DISABLE_SCHEDULE.  The default value could be true, with only
regrename itself passing false.

Why do you need the BB_MODIFIED test?  The point in time that that
flag is measured from is somewhat arbitrary.  Also, the modification
that caused the flag to be set might not have invalidated the schedule.

Richard

>
>> -Original Message-
>> From: Zhongyunde
>> Sent: Sunday, July 26, 2020 3:29 PM
>> To: 'Richard Sandiford' 
>> Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) 
>> Subject: RE: [PATCH PR95696] regrename creates overlapping register
>> allocations for vliw
>> 
>> 
>> > >> It's interesting that this is for a testcase using SMS.  One of the
>> > >> traditional problems with the GCC implementation of SMS has been
>> > >> ensuring that later passes don't mess up the scheduled loop.  So in
>> > >> your testcase, does register allocation succeed for the SMS loop
>> > >> without invalidating the bundling decisions?
>> > >
>> > > Yes.
>> > >
>> > >> If so, then it's probably better to avoid running regrename on it at 
>> > >> all.
>> > >> It mostly exists to help the second scheduling pass, but the second
>> > >> scheduling pass shouldn't be messing with an SMS loop anyway.
>> > >> Also, although the patch deals with one case in which regrename
>> > >> could disrupt the bundling, there are others too.
>> > >>
>> > >> So maybe one option would be to make regrename ignore blocks that
>> > >> have BB_DISABLE_SCHEDULE set.  (Sorry if that's been discussed and
>> > >> discounted
>> > >> already.)
>> > >
>> > > ok, according your advice, I make a new patch attached.
>> >
>> > Thanks.  I think we should treat the SMS and the REG_UNUSED stuff as
>> > separate patches though.
>> >
>> > For the SMS part, I think a better place to enforce the rule is in
>> > build_def_use.  If that function returns false early for
>> > BB_DISABLE_SCHEDULE, we will avoid disrupting the schedule for the
>> > block without wasting too much compile time on it, and we'll still
>> > keep the pass structures internally correct.  (It would also be good
>> > to have a dump_file message to say that that's what we're doing.)
>> 
>> > Do you still need the REG_UNUSED stuff with the SMS patch?  If so,
>> > could you describe the (presumably non-SMS) cases that are affected?
>> 
>> Yes, the non-SMS basic block should not be affected.
>> An alternate method attached can avoid use REG_UNUSED stuff for BB with
>> BB_DISABLE_SCHEDUL.
>> 
>> I don't change build_def_use to return false early as I find some other
>> optimization reuse the function regrename_analyze to creat def/use chain
>> info of the kernel loop body in our target.
>> 
>> > TBH, given that the bundling information is so uncertain at this
>> > stage, I think it would be better to have a mode in which regrename
>> > ignores REG_UNUSED notes altogether.  Perhaps we could put it under
>> a
>> > --param, which targets could then set to whichever default they prefer.
>> > The default should be the current behaviour though.
>> 
>> 
>> > Thanks,
>> > Richard
>
> diff --git a/gcc/regrename.c b/gcc/regrename.c
> index c38173a77..1df58e52d 100644
> --- a/gcc/regrename.c
> +++ b/gcc/regrename.c
> @@ -737,6 +737,14 @@ regrename_analyze (bitmap bb_mask)
>if (dump_file)
>   fprintf (dump_file, "\nprocessing block %d:\n", bb1->index);
>  
> +  if ((bb1->flags & BB_DISABLE_SCHEDULE) != 0
> +   && (bb1->flags & BB_MODIFIED) == 0)
> + {
> +   if (dump_file)
> + fprintf (dump_file, "skip to avoid disrupting the sms schedule\n");
> +   continue;
> + }
> +
>init_rename_info (this_info, bb1);
>  
>success = build_def_use (bb1);


RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-26 Thread Zhongyunde
I reconsider the issue and update patch attached.
Yes, If the kernel loop bb's information doesn't use in regrename, it also need 
not be collected to save compile time.

> -Original Message-
> From: Zhongyunde
> Sent: Sunday, July 26, 2020 3:29 PM
> To: 'Richard Sandiford' 
> Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) 
> Subject: RE: [PATCH PR95696] regrename creates overlapping register
> allocations for vliw
> 
> 
> > >> It's interesting that this is for a testcase using SMS.  One of the
> > >> traditional problems with the GCC implementation of SMS has been
> > >> ensuring that later passes don't mess up the scheduled loop.  So in
> > >> your testcase, does register allocation succeed for the SMS loop
> > >> without invalidating the bundling decisions?
> > >
> > > Yes.
> > >
> > >> If so, then it's probably better to avoid running regrename on it at all.
> > >> It mostly exists to help the second scheduling pass, but the second
> > >> scheduling pass shouldn't be messing with an SMS loop anyway.
> > >> Also, although the patch deals with one case in which regrename
> > >> could disrupt the bundling, there are others too.
> > >>
> > >> So maybe one option would be to make regrename ignore blocks that
> > >> have BB_DISABLE_SCHEDULE set.  (Sorry if that's been discussed and
> > >> discounted
> > >> already.)
> > >
> > > ok, according your advice, I make a new patch attached.
> >
> > Thanks.  I think we should treat the SMS and the REG_UNUSED stuff as
> > separate patches though.
> >
> > For the SMS part, I think a better place to enforce the rule is in
> > build_def_use.  If that function returns false early for
> > BB_DISABLE_SCHEDULE, we will avoid disrupting the schedule for the
> > block without wasting too much compile time on it, and we'll still
> > keep the pass structures internally correct.  (It would also be good
> > to have a dump_file message to say that that's what we're doing.)
> 
> > Do you still need the REG_UNUSED stuff with the SMS patch?  If so,
> > could you describe the (presumably non-SMS) cases that are affected?
> 
> Yes, the non-SMS basic block should not be affected.
> An alternate method attached can avoid use REG_UNUSED stuff for BB with
> BB_DISABLE_SCHEDUL.
> 
> I don't change build_def_use to return false early as I find some other
> optimization reuse the function regrename_analyze to creat def/use chain
> info of the kernel loop body in our target.
> 
> > TBH, given that the bundling information is so uncertain at this
> > stage, I think it would be better to have a mode in which regrename
> > ignores REG_UNUSED notes altogether.  Perhaps we could put it under
> a
> > --param, which targets could then set to whichever default they prefer.
> > The default should be the current behaviour though.
> 
> 
> > Thanks,
> > Richard


PR95696_3.patch
Description: PR95696_3.patch


RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-26 Thread Zhongyunde

> >> It's interesting that this is for a testcase using SMS.  One of the
> >> traditional problems with the GCC implementation of SMS has been
> >> ensuring that later passes don't mess up the scheduled loop.  So in
> >> your testcase, does register allocation succeed for the SMS loop
> >> without invalidating the bundling decisions?
> >
> > Yes.
> >
> >> If so, then it's probably better to avoid running regrename on it at all.
> >> It mostly exists to help the second scheduling pass, but the second
> >> scheduling pass shouldn't be messing with an SMS loop anyway.  Also,
> >> although the patch deals with one case in which regrename could
> >> disrupt the bundling, there are others too.
> >>
> >> So maybe one option would be to make regrename ignore blocks that
> >> have BB_DISABLE_SCHEDULE set.  (Sorry if that's been discussed and
> >> discounted
> >> already.)
> >
> > ok, according your advice, I make a new patch attached.
> 
> Thanks.  I think we should treat the SMS and the REG_UNUSED stuff as
> separate patches though.
> 
> For the SMS part, I think a better place to enforce the rule is in
> build_def_use.  If that function returns false early for
> BB_DISABLE_SCHEDULE, we will avoid disrupting the schedule for the
> block without wasting too much compile time on it, and we'll still keep the
> pass structures internally correct.  (It would also be good to have a
> dump_file message to say that that's what we're doing.)

> Do you still need the REG_UNUSED stuff with the SMS patch?  If so, could
> you describe the (presumably non-SMS) cases that are affected?

Yes, the non-SMS basic block should not be affected. 
An alternate method attached can avoid use REG_UNUSED stuff for BB with 
BB_DISABLE_SCHEDUL.

I don't change build_def_use to return false early as I find some other 
optimization reuse the function
regrename_analyze to creat def/use chain info of the kernel loop body in our 
target.

> TBH, given that the bundling information is so uncertain at this stage, I
> think it would be better to have a mode in which regrename ignores
> REG_UNUSED notes altogether.  Perhaps we could put it under a --param,
> which targets could then set to whichever default they prefer.
> The default should be the current behaviour though.


> Thanks,
> Richard


PR95696_2.patch
Description: PR95696_2.patch


Re: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-22 Thread Richard Sandiford
Zhongyunde  writes:
>> -Original Message-
>> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
>> Sent: Wednesday, July 22, 2020 12:12 AM
>> To: Zhongyunde 
>> Cc: gcc-patches@gcc.gnu.org; Yangfei (A) 
>> Subject: Re: 答复: [PATCH PR95696] regrename creates overlapping
>> register allocations for vliw
>> 
>> Zhongyunde  writes:
>> > Thanks for your review.
>> >
>> > First of all, this is an optimization.
>> 
>> OK, good.
>> 
>> >gcc do sms before reload, and here each insn use pseudo-register.
>> After reload, they are allocated hard-register, then the regrename pass try
>> to adjust the register number with def/use chain created by
>> build_def_use.
>> >  As now gcc doesn't consider the VLIW bundles, so regrename pass may
>> updated a reg which may not really unused, which will bring in invalid
>> VLIW bundles.
>> >Before the final schedule, we usually recheck the validation of VLIW
>> bundles, and reschedule the conflicted insns into two VLIW to make them
>> validation to avoid above issue, so this is not a correctness issue.
>> >  Certainly, reschedule the conflicted insns into two VLIW will destroy
>> the kernel loop's sms schedule result, and usually it will be harmful to the
>> performance.
>> 
>> Yeah.  The reason I was worried about the TI markers being stale is that, in
>> general, register allocation can introduce new spills and reloads, can add
>> and remove instructions, and can convert instructions into different forms
>> (e.g. as a result of register elimination).
>> There are then post-reload optimisers that can change the code further.
>> All these things could invalidate the VLIW bundling done by the first
>> scheduler.
>> 
>> It sounds like that's not happening in your motivating testcase, and the
>> VLIW bundling is still correct (for this loop) by the time that regrename
>> runs.  Is that right?
>
> Yes, it is right.
>
>> It's interesting that this is for a testcase using SMS.  One of the 
>> traditional
>> problems with the GCC implementation of SMS has been ensuring that
>> later passes don't mess up the scheduled loop.  So in your testcase, does
>> register allocation succeed for the SMS loop without invalidating the
>> bundling decisions?
>
> Yes.
>
>> If so, then it's probably better to avoid running regrename on it at all.
>> It mostly exists to help the second scheduling pass, but the second
>> scheduling pass shouldn't be messing with an SMS loop anyway.  Also,
>> although the patch deals with one case in which regrename could disrupt
>> the bundling, there are others too.
>> 
>> So maybe one option would be to make regrename ignore blocks that
>> have BB_DISABLE_SCHEDULE set.  (Sorry if that's been discussed and
>> discounted
>> already.)
>
> ok, according your advice, I make a new patch attached.

Thanks.  I think we should treat the SMS and the REG_UNUSED stuff as
separate patches though.

For the SMS part, I think a better place to enforce the rule
is in build_def_use.  If that function returns false early for
BB_DISABLE_SCHEDULE, we will avoid disrupting the schedule for the
block without wasting too much compile time on it, and we'll still
keep the pass structures internally correct.  (It would also be good
to have a dump_file message to say that that's what we're doing.)

Do you still need the REG_UNUSED stuff with the SMS patch?  If so,
could you describe the (presumably non-SMS) cases that are affected?

TBH, given that the bundling information is so uncertain at this stage,
I think it would be better to have a mode in which regrename ignores
REG_UNUSED notes altogether.  Perhaps we could put it under a --param,
which targets could then set to whichever default they prefer.
The default should be the current behaviour though.

Thanks,
Richard


RE: RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-22 Thread Zhongyunde

> -Original Message-
> From: Richard Sandiford [mailto:richard.sandif...@arm.com]
> Sent: Wednesday, July 22, 2020 12:12 AM
> To: Zhongyunde 
> Cc: gcc-patches@gcc.gnu.org; Yangfei (A) 
> Subject: Re: 答复: [PATCH PR95696] regrename creates overlapping
> register allocations for vliw
> 
> Zhongyunde  writes:
> > Thanks for your review.
> >
> > First of all, this is an optimization.
> 
> OK, good.
> 
> >gcc do sms before reload, and here each insn use pseudo-register.
> After reload, they are allocated hard-register, then the regrename pass try
> to adjust the register number with def/use chain created by
> build_def_use.
> >  As now gcc doesn't consider the VLIW bundles, so regrename pass may
> updated a reg which may not really unused, which will bring in invalid
> VLIW bundles.
> >Before the final schedule, we usually recheck the validation of VLIW
> bundles, and reschedule the conflicted insns into two VLIW to make them
> validation to avoid above issue, so this is not a correctness issue.
> >  Certainly, reschedule the conflicted insns into two VLIW will destroy
> the kernel loop's sms schedule result, and usually it will be harmful to the
> performance.
> 
> Yeah.  The reason I was worried about the TI markers being stale is that, in
> general, register allocation can introduce new spills and reloads, can add
> and remove instructions, and can convert instructions into different forms
> (e.g. as a result of register elimination).
> There are then post-reload optimisers that can change the code further.
> All these things could invalidate the VLIW bundling done by the first
> scheduler.
> 
> It sounds like that's not happening in your motivating testcase, and the
> VLIW bundling is still correct (for this loop) by the time that regrename
> runs.  Is that right?

Yes, it is right.

> It's interesting that this is for a testcase using SMS.  One of the 
> traditional
> problems with the GCC implementation of SMS has been ensuring that
> later passes don't mess up the scheduled loop.  So in your testcase, does
> register allocation succeed for the SMS loop without invalidating the
> bundling decisions?

Yes.

> If so, then it's probably better to avoid running regrename on it at all.
> It mostly exists to help the second scheduling pass, but the second
> scheduling pass shouldn't be messing with an SMS loop anyway.  Also,
> although the patch deals with one case in which regrename could disrupt
> the bundling, there are others too.
> 
> So maybe one option would be to make regrename ignore blocks that
> have BB_DISABLE_SCHEDULE set.  (Sorry if that's been discussed and
> discounted
> already.)

ok, according your advice, I make a new patch attached.

> Thanks,
> Richard


PR95696_1.patch
Description: PR95696_1.patch


Re: 答复: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-21 Thread Richard Sandiford
Zhongyunde  writes:
> Thanks for your review.
>
> First of all, this is an optimization.

OK, good.

>gcc do sms before reload, and here each insn use pseudo-register. After 
> reload, they are allocated hard-register, then the regrename pass try to 
> adjust the register number with def/use chain created by build_def_use.
>  As now gcc doesn't consider the VLIW bundles, so regrename pass may updated 
> a reg which may not really unused, which will bring in invalid VLIW bundles. 
>Before the final schedule, we usually recheck the validation of VLIW 
> bundles, and reschedule the conflicted insns into two VLIW to make them 
> validation to avoid above issue, so this is not a correctness issue. 
>  Certainly, reschedule the conflicted insns into two VLIW will destroy the 
> kernel loop's sms schedule result, and usually it will be harmful to the 
> performance.

Yeah.  The reason I was worried about the TI markers being stale is
that, in general, register allocation can introduce new spills and
reloads, can add and remove instructions, and can convert instructions
into different forms (e.g. as a result of register elimination).
There are then post-reload optimisers that can change the code further.
All these things could invalidate the VLIW bundling done by the first
scheduler.

It sounds like that's not happening in your motivating testcase, and the
VLIW bundling is still correct (for this loop) by the time that regrename
runs.  Is that right?

It's interesting that this is for a testcase using SMS.  One of the
traditional problems with the GCC implementation of SMS has been
ensuring that later passes don't mess up the scheduled loop.  So in your
testcase, does register allocation succeed for the SMS loop without
invalidating the bundling decisions?

If so, then it's probably better to avoid running regrename on it at all.
It mostly exists to help the second scheduling pass, but the second
scheduling pass shouldn't be messing with an SMS loop anyway.  Also,
although the patch deals with one case in which regrename could disrupt
the bundling, there are others too.

So maybe one option would be to make regrename ignore blocks that have
BB_DISABLE_SCHEDULE set.  (Sorry if that's been discussed and discounted
already.)

Thanks,
Richard


答复: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-21 Thread Zhongyunde
Thanks for your review.

First of all, this is an optimization.
   gcc do sms before reload, and here each insn use pseudo-register. After 
reload, they are allocated hard-register, then the regrename pass try to adjust 
the register number with def/use chain created by build_def_use.
 As now gcc doesn't consider the VLIW bundles, so regrename pass may updated a 
reg which may not really unused, which will bring in invalid VLIW bundles. 
   Before the final schedule, we usually recheck the validation of VLIW 
bundles, and reschedule the conflicted insns into two VLIW to make them 
validation to avoid above issue, so this is not a correctness issue. 
 Certainly, reschedule the conflicted insns into two VLIW will destroy the 
kernel loop's sms schedule result, and usually it will be harmful to the 
performance.


-邮件原件-
发件人: Richard Sandiford [mailto:richard.sandif...@arm.com] 
发送时间: 2020年7月21日 0:05
收件人: Zhongyunde 
抄送: gcc-patches@gcc.gnu.org; Yangfei (A) 
主题: Re: [PATCH PR95696] regrename creates overlapping register allocations for 
vliw

Hi,

Zhongyunde  writes:
> Hi,
>
> In most target, it is limited to issue two insns with change the same 
> register. So a register is not realy unused if there is another insn, which 
> set the register in the save VLIW.
>
> For example, The insn 73 start with insn:TI, so it will be issued together 
> with others insns until a new insn start with insn:TI, here is insn 243.
>
> The regrename pass known the mode V2VF in insn 73 need two successive 
> registers, i.e. v2 and v3, here is dump snippet before the regrename:
>
>
>
> ==
> 
>
> (insn:TI 73 76 71 4 (set (reg/v:V2VF 37 v2 [orig:180 _62 ] [180])
>
> (unspec:V2VF [
>
> (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])
>
> (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])
>
> ] UNSPEC_HFSQMAG_32X32)) "../test_modify.c":57 710 
> {hfsqmag_v2vf}
>
>  (expr_list:REG_DEAD (reg/v:VHF 43 v8 [orig:210 Dest_value ] 
> [210])
>
> (expr_list:REG_UNUSED (reg:VHF 38 v3)
>
> (expr_list:REG_STAGE (const_int 2 [0x2])
>
> (expr_list:REG_CYCLE (const_int 2 [0x2])
>
> (expr_list:REG_UNITS (const_int 256 [0x100])
>
> (nil)))
>
>
>
> (insn 71 73 243 4 (set (reg:VHF 43 v8 [orig:265 MEM[(const vfloat32x16 
> *)Src_base_134] ] [265])
>
> (mem:VHF (reg/v/f:DI 13 a13 [orig:207 Src_base ] [207]) [1 
> MEM[(const vfloat32x16 *)Src_base_134]+0 S64 A512])) 
> "../test_modify.c":56 450 {movvhf_internal}
>
>  (expr_list:REG_STAGE (const_int 1 [0x1])
>
> (expr_list:REG_CYCLE (const_int 2 [0x2])
>
> (nil
>
>
>
> (insn:TI 243 …
>
>
>
> Then, in the regrename, the insn 71 will be transformed as following code 
> with register v3, and there is an conflict between insn 73 and insn 71 (as 
> both of them set the v3 register).
>
>
>
> Register v2 (2): 73 [SVEC_REGS]
>
> Register v8 (1): 71 [VEC_ALL_REGS]
>
>
>
> 
>
>
>
> (insn 71 73 243 4 (set (reg:VHF 38 v3 [orig:265 MEM[(const vfloat32x16 
> *)Src_base_134] ] [265])
>
> (mem:VHF (reg/v/f:DI 13 a13 [orig:207 Src_base ] [207]) [1 
> MEM[(const vfloat32x16 *)Src_base_134]+0 S64 A512])) 
> "../test_modify.c":56 450 {movvhf_internal}
>
>  (expr_list:REG_STAGE (const_int 1 [0x1])
>
> (expr_list:REG_CYCLE (const_int 2 [0x2])

Do you need this for correctness, or is it “just” an optimisation?

The reason for asking is that regrename runs quite a long time after the first 
scheduling pass, with things like register allocation happening in between.  
The :TI markers on the instructions are therefore going to be very stale by the 
time that regrename runs.

If it's just a heuristic then it might be OK to use them anyway, but it would 
be worth having a comment to say what's going on.

One alternative would be to run the DFA over the instructions to recompute the 
VLIW bundles, but that isn't simple.  It also isn't necessarily a better 
heuristic, since the second scheduling pass is likely to change things anyway.

There's also the problem that :TI markers are used on non-VLIW targets too, 
whereas the problem is really specific to VLIW targets.

Unfortunately, I can't think of any good counter-suggestions at the moment…

Thanks,
Richard

>
> diff --git a/gcc/regrename.c b/gcc/regrename.c index 
> c38173a77..e54794413 100644
> --- a/gcc/regrename.c
> +++ b/gcc/regrename.c
> @@ -1614,12 +1614,26 @@ record_out_operands (rtx_insn *insn, bool 
> earlyclobber, insn_rr_info *in

Re: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-20 Thread Richard Sandiford
Hi,

Zhongyunde  writes:
> Hi,
>
> In most target, it is limited to issue two insns with change the same 
> register. So a register is not realy unused if there is another insn, which 
> set the register in the save VLIW.
>
> For example, The insn 73 start with insn:TI, so it will be issued together 
> with others insns until a new insn start with insn:TI, here is insn 243.
>
> The regrename pass known the mode V2VF in insn 73 need two successive 
> registers, i.e. v2 and v3, here is dump snippet before the regrename:
>
>
>
> ==
>
> (insn:TI 73 76 71 4 (set (reg/v:V2VF 37 v2 [orig:180 _62 ] [180])
>
> (unspec:V2VF [
>
> (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])
>
> (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])
>
> ] UNSPEC_HFSQMAG_32X32)) "../test_modify.c":57 710 {hfsqmag_v2vf}
>
>  (expr_list:REG_DEAD (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])
>
> (expr_list:REG_UNUSED (reg:VHF 38 v3)
>
> (expr_list:REG_STAGE (const_int 2 [0x2])
>
> (expr_list:REG_CYCLE (const_int 2 [0x2])
>
> (expr_list:REG_UNITS (const_int 256 [0x100])
>
> (nil)))
>
>
>
> (insn 71 73 243 4 (set (reg:VHF 43 v8 [orig:265 MEM[(const vfloat32x16 
> *)Src_base_134] ] [265])
>
> (mem:VHF (reg/v/f:DI 13 a13 [orig:207 Src_base ] [207]) [1 MEM[(const 
> vfloat32x16 *)Src_base_134]+0 S64 A512])) "../test_modify.c":56 450 
> {movvhf_internal}
>
>  (expr_list:REG_STAGE (const_int 1 [0x1])
>
> (expr_list:REG_CYCLE (const_int 2 [0x2])
>
> (nil
>
>
>
> (insn:TI 243 …
>
>
>
> Then, in the regrename, the insn 71 will be transformed as following code 
> with register v3, and there is an conflict between insn 73 and insn 71 (as 
> both of them set the v3 register).
>
>
>
> Register v2 (2): 73 [SVEC_REGS]
>
> Register v8 (1): 71 [VEC_ALL_REGS]
>
>
>
> 
>
>
>
> (insn 71 73 243 4 (set (reg:VHF 38 v3 [orig:265 MEM[(const vfloat32x16 
> *)Src_base_134] ] [265])
>
> (mem:VHF (reg/v/f:DI 13 a13 [orig:207 Src_base ] [207]) [1 MEM[(const 
> vfloat32x16 *)Src_base_134]+0 S64 A512])) "../test_modify.c":56 450 
> {movvhf_internal}
>
>  (expr_list:REG_STAGE (const_int 1 [0x1])
>
> (expr_list:REG_CYCLE (const_int 2 [0x2])

Do you need this for correctness, or is it “just” an optimisation?

The reason for asking is that regrename runs quite a long time after
the first scheduling pass, with things like register allocation
happening in between.  The :TI markers on the instructions are
therefore going to be very stale by the time that regrename runs.

If it's just a heuristic then it might be OK to use them anyway,
but it would be worth having a comment to say what's going on.

One alternative would be to run the DFA over the instructions
to recompute the VLIW bundles, but that isn't simple.  It also isn't
necessarily a better heuristic, since the second scheduling pass is
likely to change things anyway.

There's also the problem that :TI markers are used on non-VLIW
targets too, whereas the problem is really specific to VLIW targets.

Unfortunately, I can't think of any good counter-suggestions
at the moment…

Thanks,
Richard

>
> diff --git a/gcc/regrename.c b/gcc/regrename.c
> index c38173a77..e54794413 100644
> --- a/gcc/regrename.c
> +++ b/gcc/regrename.c
> @@ -1614,12 +1614,26 @@ record_out_operands (rtx_insn *insn, bool 
> earlyclobber, insn_rr_info *insn_info)
>cur_operand = NULL;
>  }
>  
> +/* Get the first real insn of next vliw in current BB.  */
> +static rtx_insn *
> +get_next_vliw_first_insn (rtx_insn *cur_insn, basic_block bb)
> +{
> +  rtx_insn *insn = next_real_insn (cur_insn);
> +
> +  for (; insn && insn != BB_END (bb); insn = next_real_insn (insn))
> +if (GET_MODE (insn) == TImode)
> +  return insn;
> +
> +  return cur_insn;
> +}
> +
>  /* Build def/use chain.  */
>  
>  static bool
>  build_def_use (basic_block bb)
>  {
>rtx_insn *insn;
> +  rtx_insn *vliw_start_insn = NULL;
>unsigned HOST_WIDE_INT untracked_operands;
>  
>fail_current_block = false;
> @@ -1663,6 +1677,9 @@ build_def_use (basic_block bb)
>to be marked unrenamable or even cause us to abort the entire
>basic block.  */
>  
> +   if (GET_MODE (insn) == TImode)
> + vliw_start_insn = insn;
> +
> extract_constrain_insn (insn);
> preprocess_constraints (insn);
> const operand_alternative *op_alt = which_op_alt ();
> @@ -1858,17 +1875,26 @@ build_def_use (basic_block bb)
> scan_rtx (insn, &XEXP (note, 0), ALL_REGS, mark_access,
>   OP_INOUT);
>  
> -   /* Step 7: Close chains for registers that were never
> -  really used here.  */
> -   for (note = REG_NOTES (insn); note; note = XEXP (note, 1))
> - if (REG_NOTE_KIND (note) == REG_UNUSED)
> -   {
> -

[PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-19 Thread Zhongyunde
Hi,

In most target, it is limited to issue two insns with change the same register. 
So a register is not realy unused if there is another insn, which set the 
register in the save VLIW.

For example, The insn 73 start with insn:TI, so it will be issued together with 
others insns until a new insn start with insn:TI, here is insn 243.

The regrename pass known the mode V2VF in insn 73 need two successive 
registers, i.e. v2 and v3, here is dump snippet before the regrename:



==

(insn:TI 73 76 71 4 (set (reg/v:V2VF 37 v2 [orig:180 _62 ] [180])

(unspec:V2VF [

(reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])

(reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])

] UNSPEC_HFSQMAG_32X32)) "../test_modify.c":57 710 {hfsqmag_v2vf}

 (expr_list:REG_DEAD (reg/v:VHF 43 v8 [orig:210 Dest_value ] [210])

(expr_list:REG_UNUSED (reg:VHF 38 v3)

(expr_list:REG_STAGE (const_int 2 [0x2])

(expr_list:REG_CYCLE (const_int 2 [0x2])

(expr_list:REG_UNITS (const_int 256 [0x100])

(nil)))



(insn 71 73 243 4 (set (reg:VHF 43 v8 [orig:265 MEM[(const vfloat32x16 
*)Src_base_134] ] [265])

(mem:VHF (reg/v/f:DI 13 a13 [orig:207 Src_base ] [207]) [1 MEM[(const 
vfloat32x16 *)Src_base_134]+0 S64 A512])) "../test_modify.c":56 450 
{movvhf_internal}

 (expr_list:REG_STAGE (const_int 1 [0x1])

(expr_list:REG_CYCLE (const_int 2 [0x2])

(nil



(insn:TI 243 …



Then, in the regrename, the insn 71 will be transformed as following code with 
register v3, and there is an conflict between insn 73 and insn 71 (as both of 
them set the v3 register).



Register v2 (2): 73 [SVEC_REGS]

Register v8 (1): 71 [VEC_ALL_REGS]







(insn 71 73 243 4 (set (reg:VHF 38 v3 [orig:265 MEM[(const vfloat32x16 
*)Src_base_134] ] [265])

(mem:VHF (reg/v/f:DI 13 a13 [orig:207 Src_base ] [207]) [1 MEM[(const 
vfloat32x16 *)Src_base_134]+0 S64 A512])) "../test_modify.c":56 450 
{movvhf_internal}

 (expr_list:REG_STAGE (const_int 1 [0x1])

(expr_list:REG_CYCLE (const_int 2 [0x2])


PR95696.patch
Description: PR95696.patch


[PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-16 Thread zhongyunde via Gcc-patches
hi,   Insometarget,itislimitedtoissuetwoinsnstogetherwithchangethesameregister,so Imakeapatchtoextendtheliverangeuntiltheendofvliwtoavoidit.(Theinsn73startwithinsn:TI,soitwillbeissuedtogetherwithothersinsnsuntilanewinsnstartwithinsn:TI,suchasinsn71)TheregrenamecanknownthemodeV2VFininsn73needtwosuccessiveregisters,i.e.v2andv3,hereisdumpsnippetbeforetheregrename.(insn:TI7376714(set(reg/v:V2VF37v2[orig:180_62][180])(unspec:V2VF[(reg/v:VHF43v8[orig:210Dest_value][210])(reg/v:VHF43v8[orig:210Dest_value][210])]UNSPEC_HFSQMAG_32X32))"../test_modify.c":57710{hfsqmag_v2vf}(expr_list:REG_DEAD(reg/v:VHF43v8[orig:210Dest_value][210])(expr_list:REG_UNUSED(reg:VHF38v3)(expr_list:REG_STAGE(const_int2[0x2])(expr_list:REG_CYCLE(const_int2[0x2])(expr_list:REG_UNITS(const_int256[0x100])(nil)))(insn71732434(set(reg:VHF43v8[orig:265MEM[(constvfloat32x16*)Src_base_134]][265])(mem:VHF(reg/v/f:DI13a13[orig:207Src_base][207])[1MEM[(constvfloat32x16*)Src_base_134]+0S64A512]))"../test_modify.c":56450{movvhf_internal}(expr_list:REG_STAGE(const_int1[0x1])(expr_list:REG_CYCLE(const_int2[0x2])(nilThen,intheregrename,theinsn71willbetransformedintofollowingcodewithregisterv3,sothereisanconflictbetweeninsn73andinsn71,asbothofthemsetthev3register.Registerv2(2):73[SVEC_REGS]Registerv8(1):71[VEC_ALL_REGS](insn71732434(set(reg:VHF38v3[orig:265MEM[(constvfloat32x16*)Src_base_134]][265])(mem:VHF(reg/v/f:DI13a13[orig:207Src_base][207])[1MEM[(constvfloat32x16*)Src_base_134]+0S64A512]))"../test_modify.c":56450{movvhf_internal}(expr_list:REG_STAGE(const_int1[0x1]) (expr_list:REG_CYCLE(const_int2[0x2]) 随心邮-在微信里收发邮件,及时省电又安心

PR95696.patch
Description: Binary data