On Sat, Apr 14, 2012 at 9:26 AM, Tom de Vries <tom_devr...@mentor.com> wrote: > On 27/01/12 21:37, Tom de Vries wrote: >> On 24/01/12 11:40, Richard Guenther wrote: >>> On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries <tom_devr...@mentor.com> >>> wrote: >>>> Richard, >>>> Jakub, >>>> >>>> the following patch fixes PR51879. >>>> >>>> Consider the following test-case: >>>> ... >>>> int bar (int); >>>> void baz (int); >>>> >>>> void >>>> foo (int y) >>>> { >>>> int a; >>>> if (y == 6) >>>> a = bar (7); >>>> else >>>> a = bar (7); >>>> baz (a); >>>> } >>>> ... >>>> >>>> after compiling at -02, the representation looks like this before >>>> tail-merging: >>>> ... >>>> # BLOCK 3 freq:1991 >>>> # PRED: 2 [19.9%] (true,exec) >>>> # .MEMD.1714_7 = VDEF <.MEMD.1714_6(D)> >>>> # USE = nonlocal >>>> # CLB = nonlocal >>>> aD.1709_3 = barD.1703 (7); >>>> goto <bb 5>; >>>> # SUCC: 5 [100.0%] (fallthru,exec) >>>> >>>> # BLOCK 4 freq:8009 >>>> # PRED: 2 [80.1%] (false,exec) >>>> # .MEMD.1714_8 = VDEF <.MEMD.1714_6(D)> >>>> # USE = nonlocal >>>> # CLB = nonlocal >>>> aD.1709_4 = barD.1703 (7); >>>> # SUCC: 5 [100.0%] (fallthru,exec) >>>> >>>> # BLOCK 5 freq:10000 >>>> # PRED: 3 [100.0%] (fallthru,exec) 4 [100.0%] (fallthru,exec) >>>> # aD.1709_1 = PHI <aD.1709_3(3), aD.1709_4(4)> >>>> # .MEMD.1714_5 = PHI <.MEMD.1714_7(3), .MEMD.1714_8(4)> >>>> # .MEMD.1714_9 = VDEF <.MEMD.1714_5> >>>> # USE = nonlocal >>>> # CLB = nonlocal >>>> bazD.1705 (aD.1709_1); >>>> # VUSE <.MEMD.1714_9> >>>> return; >>>> ... >>>> >>>> the patch allows aD.1709_4 to be value numbered to aD.1709_3, and >>>> .MEMD.1714_8 >>>> to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5. >>>> >>>> The patch makes sure non-const/pure call results (gimple_vdef and >>>> gimple_call_lhs) are properly value numbered. >>>> >>>> Bootstrapped and reg-tested on x86_64. >>>> >>>> ok for stage1? >>> >>> The following cannot really work: >>> >>> @@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl >>> result = vn_reference_lookup_1 (&vr1, NULL); >>> if (result) >>> { >>> - changed = set_ssa_val_to (lhs, result); >>> + tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result)); >>> + if (vdef) >>> + changed |= set_ssa_val_to (vdef, result_vdef); >>> + changed |= set_ssa_val_to (lhs, result); >>> >>> because 'result' may be not an SSA name. It might also not have >>> a proper definition statement (if VN_INFO (result)->needs_insertion >>> is true). So you at least need to guard things properly. >>> >> >> Right. And that also doesn't work if the function is without lhs, such as in >> the >> new test-case pr51879-6.c. >> >> I fixed this by storing both lhs and vdef, such that I don't have to derive >> the vdef from the lhs. >> >>> (On a side-note - I _did_ want to remove value-numbering virtual operands >>> at some point ...) >>> >> >> Doing so willl hurt performance of tail-merging in its current form. >> OTOH, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51964#c0 shows that >> value numbering as used in tail-merging has its limitations too. >> Do you have any ideas how to address that one? >> >>> @@ -3359,8 +3366,10 @@ visit_use (tree use) >>> /* ??? We should handle stores from calls. */ >>> else if (TREE_CODE (lhs) == SSA_NAME) >>> { >>> + tree vuse = gimple_vuse (stmt); >>> if (!gimple_call_internal_p (stmt) >>> - && gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST)) >>> + && (gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST) >>> + || (vuse && SSA_VAL (vuse) != VN_TOP))) >>> changed = visit_reference_op_call (lhs, stmt); >>> else >>> changed = defs_to_varying (stmt); >>> >>> ... exactly because of the issue that a stmt has multiple defs. Btw, >>> vuse should have been visited here or be part of our SCC, so, why do >>> you need this check? >>> >> >> Removed now, that was a workaround for a bug in an earlier version of the >> patch, >> that I didn't need anymore. >> >> Bootstrapped and reg-tested on x86_64. >> >> OK for stage1? >> > > Richard, > > quoting you in http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00618.html: > ... > I think these fixes hint at that we should > use "structural" equality as fallback if value-numbering doesn't equate > two stmt effects. Thus, treat two stmts with exactly the same operands > and flags as equal and using value-numbering to canonicalize operands > (when they are SSA names) for that comparison, or use VN entirely > if there are no side-effects on the stmt. > > Changing value-numbering of virtual operands, even if it looks correct in the > simple cases you change, doesn't look like a general solution for the missed > tail merging opportunities. > ... > > The test-case pr51879-6.c shows a case where improving value numbering will > help > tail-merging, but structural equality comparison not: > ... > # BLOCK 3 freq:1991 > # PRED: 2 [19.9%] (true,exec) > # .MEMD.1717_7 = VDEF <.MEMD.1717_6(D)> > # USE = nonlocal > # CLB = nonlocal > blaD.1708 (5); > # .MEMD.1717_8 = VDEF <.MEMD.1717_7> > # USE = nonlocal > # CLB = nonlocal > aD.1712_3 = barD.1704 (7); > goto <bb 5>; > # SUCC: 5 [100.0%] (fallthru,exec) > > # BLOCK 4 freq:8009 > # PRED: 2 [80.1%] (false,exec) > # .MEMD.1717_9 = VDEF <.MEMD.1717_6(D)> > # USE = nonlocal > # CLB = nonlocal > blaD.1708 (5); > # .MEMD.1717_10 = VDEF <.MEMD.1717_9> > # USE = nonlocal > # CLB = nonlocal > aD.1712_4 = barD.1704 (7); > # SUCC: 5 [100.0%] (fallthru,exec) > > # BLOCK 5 freq:10000 > # PRED: 3 [100.0%] (fallthru,exec) 4 [100.0%] (fallthru,exec) > # aD.1712_1 = PHI <aD.1712_3(3), aD.1712_4(4)> > # .MEMD.1717_5 = PHI <.MEMD.1717_8(3), .MEMD.1717_10(4)> > # .MEMD.1717_11 = VDEF <.MEMD.1717_5> > # USE = nonlocal > # CLB = nonlocal > bazD.1706 (aD.1712_1); > # VUSE <.MEMD.1717_11> > return; > ... > by structurally comparing the gimples in both blocks, we can see that these > are > equal. > What we can't see, is that aD.1712_3 and aD.1712_4 have the same value. For > that, we need value numbering to number the vuses the same. And if we get > value > numbering to number these values equal, we don't need the structural > comparison. > > In this case structural comparison cannot be used as fallback for > value-numbering. So, ok for trunk? > > And if not ok, do you have a suggestion of how to deal with this otherwise?
Hmm. I'm not sure we can conclude that they have the same value! +int bar (int); +void baz (int); +void bla (int); + +void +foo (int y) +{ + int a; + if (y == 6) + { + bla (5); + a = bar (7); + } + else + { + bla (5); + a = bar (7); + } + baz (a); +} I can implement bla as void bla(int) { a = random (); } thus a would not have the same value on both paths - but that is not what matters - because obviously we can still merge the blocks. That is, we cannot be sure that the side-effects on memory of bla (5) and bla (5) are the same. But is that not what your value-numbering changes will assert? (not sure how you achieve this, but it looks like you insert/lookup general calls, thus not pure/const ones - that could easily lead to miscompiles(?)). Richard. Richard.