Re: in/out operands and auto-inc-dec
> On Jun 11, 2018, at 3:04 PM, Jeff Law wrote:
>
> On 06/04/2018 09:02 AM, Paul Koning wrote:
>>
>> ...
>>
>> By "multiple memory operands" do you mean both source and dest in
>> memory?
> Yes and no :-) I suspect no real thought was given to what happens when
> there's more than one auto-inc in the pattern, regardless of what
> happens in the final instruction.
>
> I realize that in your case the operand appears twice in the RTL, but
> just once in the final output. One might argue that if the operands in
> the pattern are tied together via a match_dup that we ought to be able
> to support it. But I doubt anyone has thought much about it.

The description of the matching constraint makes it clear that it's meant to be different from match_dup. The way it's described is (paraphrasing) that it acts like match_dup except if the operand it refers to has a side effect in it. In that case, it matches an operand RTX with the side effect removed. The example given is: if the referenced operand is *p++, the matching constraint looks for *p.

And that indeed would be right. If I have a two-operand add instruction,

    (set (mem (post_inc (reg))) (add (mem (reg)) operand2))

would make sense. The documentation makes me expect such a thing to show up and be recognized by a matching constraint. But the auto-inc code doesn't do this, and what's more interesting, it seems that the old code (in flow.c) didn't either.

paul
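The matching-constraint rule as paraphrased in the message above can be sketched as a toy model in C. This is only an illustration of the paraphrase, not GCC code: `toy_addr`, `toy_same` and `toy_matching_constraint` are hypothetical names invented here, and a real operand is an RTX, not a two-field struct.

```c
#include <stdbool.h>

/* Toy stand-in for a memory operand's address.  Hypothetical; NOT GCC's rtx. */
enum side_effect { SE_NONE, SE_POST_INC };

struct toy_addr {
    int base_reg;             /* register holding the address */
    enum side_effect effect;  /* side effect applied to base_reg, if any */
};

/* Exact structural equality.  In this toy model it stands in for a
   comparison that does not look past side effects. */
bool toy_same(struct toy_addr a, struct toy_addr b)
{
    return a.base_reg == b.base_reg && a.effect == b.effect;
}

/* The rule as paraphrased in the thread: the candidate must equal the
   referenced operand with its side effect removed, so a reference to
   *p++ is satisfied by *p. */
bool toy_matching_constraint(struct toy_addr referenced,
                             struct toy_addr candidate)
{
    struct toy_addr stripped = referenced;
    stripped.effect = SE_NONE;
    return toy_same(stripped, candidate);
}
```

With a destination of `{0, SE_POST_INC}` and a source of `{0, SE_NONE}`, `toy_same` rejects the pair while `toy_matching_constraint` accepts it, which is the distinction from match_dup that the message describes.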
Re: in/out operands and auto-inc-dec
On 06/04/2018 09:02 AM, Paul Koning wrote:
>
>
>> On Jun 4, 2018, at 10:09 AM, Jeff Law wrote:
>>
>> On 06/04/2018 08:06 AM, Paul Koning wrote:
>>>
>>>> On Jun 4, 2018, at 9:51 AM, Jeff Law wrote:
>>>>
>>>> On 06/04/2018 07:31 AM, Paul Koning wrote:
>>>>> The internals manual in its description of the "matching
>>>>> constraint" says that it works for cases where the in and out
>>>>> operands are somewhat different, such as *p++ vs. *p.
>>>>> Obviously that is meant to cover post_inc side effects.
>>>>>
>>>>> The curious thing is that auto-inc-dec.c specifically avoids
>>>>> doing this: if it finds what looks like a suitable candidate
>>>>> for auto-inc or auto-dec optimization but that operand occurs
>>>>> more than once in the insn, it doesn't make the change. The
>>>>> result is code that's both larger and slower for machines
>>>>> that have post_inc etc. addressing modes. The gccint
>>>>> documentation suggests that it was the intent to optimize
>>>>> this case, so I wonder why it is avoided.
>>>> I wouldn't be terribly surprised if the old flow.c based auto-inc
>>>> discovery handled this, but the newer auto-inc-dec.c doesn't. The
>>>> docs were probably written prior to the conversion.
>>>
>>> That fits, because there is a reference to "the flow pass of the
>>> compiler" when these constructs are introduced in section 14.16.
>>>
>>> So is this an omission, or is there a reason why that
>>> optimization was removed?
>> I would guess omission, probably on the assumption it wasn't
>> terribly important and there wasn't really a good way to test it.
>> There aren't many targets that use auto-inc getting a lot of
>> attention these days, and those that do can't have multiple memory
>> operands.
>
> By "multiple memory operands" do you mean both source and dest in
> memory?
Yes and no :-) I suspect no real thought was given to what happens when
there's more than one auto-inc in the pattern, regardless of what
happens in the final instruction.

I realize that in your case the operand appears twice in the RTL, but
just once in the final output.

One might argue that if the operands in the pattern are tied together
via a match_dup that we ought to be able to support it. But I doubt
anyone has thought much about it.

Jeff
Re: in/out operands and auto-inc-dec
> On Jun 4, 2018, at 10:09 AM, Jeff Law wrote:
>
> On 06/04/2018 08:06 AM, Paul Koning wrote:
>>
>>
>>> On Jun 4, 2018, at 9:51 AM, Jeff Law wrote:
>>>
>>> On 06/04/2018 07:31 AM, Paul Koning wrote:
>>>> The internals manual in its description of the "matching constraint"
>>>> says that it works for cases where the in and out operands are
>>>> somewhat different, such as *p++ vs. *p. Obviously that is meant to
>>>> cover post_inc side effects.
>>>>
>>>> The curious thing is that auto-inc-dec.c specifically avoids doing
>>>> this: if it finds what looks like a suitable candidate for auto-inc
>>>> or auto-dec optimization but that operand occurs more than once in
>>>> the insn, it doesn't make the change. The result is code that's both
>>>> larger and slower for machines that have post_inc etc. addressing
>>>> modes. The gccint documentation suggests that it was the intent to
>>>> optimize this case, so I wonder why it is avoided.
>>> I wouldn't be terribly surprised if the old flow.c based auto-inc
>>> discovery handled this, but the newer auto-inc-dec.c doesn't. The docs
>>> were probably written prior to the conversion.
>>
>> That fits, because there is a reference to "the flow pass of the compiler"
>> when these constructs are introduced in section 14.16.
>>
>> So is this an omission, or is there a reason why that optimization was
>> removed?
> I would guess omission, probably on the assumption it wasn't terribly
> important and there wasn't really a good way to test it. There aren't
> many targets that use auto-inc getting a lot of attention these days,
> and those that do can't have multiple memory operands.

By "multiple memory operands" do you mean both source and dest in memory? Ok, but I didn't mean that specifically. The issue is on an instruction with a read/modify/write destination operand, like two-operand add. If the destination looks like a candidate for post-inc, it's skipped because it shows up twice in the RTL -- since that uses three operand notation.

For example:

    for (int i = 0; i < n; i++)
        *p++ += i;

produces (on pdp11):

    add $02, r0
    add r1, -02(r0)

rather than simply "add r1, (r0)+". But if I change the += to =, the expected optimization does take place.

paul
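The test case in the message above can be written out as two small functions so that the generated code for the two variants can be compared side by side. This is a minimal sketch: the function names `rmw_loop` and `store_loop` and the `int *` element type are assumptions made here, since the original post shows only the loop.

```c
/* Read/modify/write through a post-incremented pointer.  This is the
   case described in the thread, where the destination appears twice
   in the RTL and the auto-inc candidate is skipped. */
void rmw_loop(int *p, int n)
{
    for (int i = 0; i < n; i++)
        *p++ += i;
}

/* Plain store through the same pointer.  Per the thread, this is the
   variant where the expected post-increment optimization takes place. */
void store_loop(int *p, int n)
{
    for (int i = 0; i < n; i++)
        *p++ = i;
}
```

Compiling both functions to assembly with optimization enabled (for example with `-O2 -S`) on a target that has post-increment addressing should show the difference described above; the output is not reproduced here.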
Re: in/out operands and auto-inc-dec
On 06/04/2018 08:06 AM, Paul Koning wrote:
>
>
>> On Jun 4, 2018, at 9:51 AM, Jeff Law wrote:
>>
>> On 06/04/2018 07:31 AM, Paul Koning wrote:
>>> The internals manual in its description of the "matching constraint" says
>>> that it works for cases where the in and out operands are somewhat
>>> different, such as *p++ vs. *p. Obviously that is meant to cover post_inc
>>> side effects.
>>>
>>> The curious thing is that auto-inc-dec.c specifically avoids doing this: if
>>> it finds what looks like a suitable candidate for auto-inc or auto-dec
>>> optimization but that operand occurs more than once in the insn, it doesn't
>>> make the change. The result is code that's both larger and slower for
>>> machines that have post_inc etc. addressing modes. The gccint
>>> documentation suggests that it was the intent to optimize this case, so I
>>> wonder why it is avoided.
>> I wouldn't be terribly surprised if the old flow.c based auto-inc
>> discovery handled this, but the newer auto-inc-dec.c doesn't. The docs
>> were probably written prior to the conversion.
>
> That fits, because there is a reference to "the flow pass of the compiler"
> when these constructs are introduced in section 14.16.
>
> So is this an omission, or is there a reason why that optimization was
> removed?
I would guess omission, probably on the assumption it wasn't terribly
important and there wasn't really a good way to test it. There aren't
many targets that use auto-inc getting a lot of attention these days,
and those that do can't have multiple memory operands.

Jeff
Re: in/out operands and auto-inc-dec
> On Jun 4, 2018, at 9:51 AM, Jeff Law wrote:
>
> On 06/04/2018 07:31 AM, Paul Koning wrote:
>> The internals manual in its description of the "matching constraint" says
>> that it works for cases where the in and out operands are somewhat
>> different, such as *p++ vs. *p. Obviously that is meant to cover post_inc
>> side effects.
>>
>> The curious thing is that auto-inc-dec.c specifically avoids doing this: if
>> it finds what looks like a suitable candidate for auto-inc or auto-dec
>> optimization but that operand occurs more than once in the insn, it doesn't
>> make the change. The result is code that's both larger and slower for
>> machines that have post_inc etc. addressing modes. The gccint documentation
>> suggests that it was the intent to optimize this case, so I wonder why it is
>> avoided.
> I wouldn't be terribly surprised if the old flow.c based auto-inc
> discovery handled this, but the newer auto-inc-dec.c doesn't. The docs
> were probably written prior to the conversion.

That fits, because there is a reference to "the flow pass of the compiler" when these constructs are introduced in section 14.16.

So is this an omission, or is there a reason why that optimization was removed?

paul
Re: in/out operands and auto-inc-dec
On 06/04/2018 07:31 AM, Paul Koning wrote:
> The internals manual in its description of the "matching constraint" says
> that it works for cases where the in and out operands are somewhat different,
> such as *p++ vs. *p. Obviously that is meant to cover post_inc side effects.
>
> The curious thing is that auto-inc-dec.c specifically avoids doing this: if
> it finds what looks like a suitable candidate for auto-inc or auto-dec
> optimization but that operand occurs more than once in the insn, it doesn't
> make the change. The result is code that's both larger and slower for
> machines that have post_inc etc. addressing modes. The gccint documentation
> suggests that it was the intent to optimize this case, so I wonder why it is
> avoided.
I wouldn't be terribly surprised if the old flow.c based auto-inc
discovery handled this, but the newer auto-inc-dec.c doesn't. The docs
were probably written prior to the conversion.

jeff
in/out operands and auto-inc-dec
The internals manual in its description of the "matching constraint" says that it works for cases where the in and out operands are somewhat different, such as *p++ vs. *p. Obviously that is meant to cover post_inc side effects.

The curious thing is that auto-inc-dec.c specifically avoids doing this: if it finds what looks like a suitable candidate for auto-inc or auto-dec optimization but that operand occurs more than once in the insn, it doesn't make the change. The result is code that's both larger and slower for machines that have post_inc etc. addressing modes. The gccint documentation suggests that it was the intent to optimize this case, so I wonder why it is avoided.

paul