On Sat, Apr 14, 2012 at 9:26 AM, Tom de Vries <tom_devr...@mentor.com> wrote:
> On 27/01/12 21:37, Tom de Vries wrote:
>> On 24/01/12 11:40, Richard Guenther wrote:
>>> On Mon, Jan 23, 2012 at 10:27 PM, Tom de Vries <tom_devr...@mentor.com> 
>>> wrote:
>>>> Richard,
>>>> Jakub,
>>>>
>>>> the following patch fixes PR51879.
>>>>
>>>> Consider the following test-case:
>>>> ...
>>>> int bar (int);
>>>> void baz (int);
>>>>
>>>> void
>>>> foo (int y)
>>>> {
>>>>  int a;
>>>>  if (y == 6)
>>>>    a = bar (7);
>>>>  else
>>>>    a = bar (7);
>>>>  baz (a);
>>>> }
>>>> ...
>>>>
>>>> after compiling at -02, the representation looks like this before 
>>>> tail-merging:
>>>> ...
>>>>  # BLOCK 3 freq:1991
>>>>  # PRED: 2 [19.9%]  (true,exec)
>>>>  # .MEMD.1714_7 = VDEF <.MEMD.1714_6(D)>
>>>>  # USE = nonlocal
>>>>  # CLB = nonlocal
>>>>  aD.1709_3 = barD.1703 (7);
>>>>  goto <bb 5>;
>>>>  # SUCC: 5 [100.0%]  (fallthru,exec)
>>>>
>>>>  # BLOCK 4 freq:8009
>>>>  # PRED: 2 [80.1%]  (false,exec)
>>>>  # .MEMD.1714_8 = VDEF <.MEMD.1714_6(D)>
>>>>  # USE = nonlocal
>>>>  # CLB = nonlocal
>>>>  aD.1709_4 = barD.1703 (7);
>>>>  # SUCC: 5 [100.0%]  (fallthru,exec)
>>>>
>>>>  # BLOCK 5 freq:10000
>>>>  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
>>>>  # aD.1709_1 = PHI <aD.1709_3(3), aD.1709_4(4)>
>>>>  # .MEMD.1714_5 = PHI <.MEMD.1714_7(3), .MEMD.1714_8(4)>
>>>>  # .MEMD.1714_9 = VDEF <.MEMD.1714_5>
>>>>  # USE = nonlocal
>>>>  # CLB = nonlocal
>>>>  bazD.1705 (aD.1709_1);
>>>>  # VUSE <.MEMD.1714_9>
>>>>  return;
>>>> ...
>>>>
>>>> the patch allows aD.1709_4 to be value numbered to aD.1709_3, and 
>>>> .MEMD.1714_8
>>>> to .MEMD.1714_7, which enables tail-merging of blocks 4 and 5.
>>>>
>>>> The patch makes sure non-const/pure call results (gimple_vdef and
>>>> gimple_call_lhs) are properly value numbered.
>>>>
>>>> Bootstrapped and reg-tested on x86_64.
>>>>
>>>> ok for stage1?
>>>
>>> The following cannot really work:
>>>
>>> @@ -2600,7 +2601,11 @@ visit_reference_op_call (tree lhs, gimpl
>>>    result = vn_reference_lookup_1 (&vr1, NULL);
>>>    if (result)
>>>      {
>>> -      changed = set_ssa_val_to (lhs, result);
>>> +      tree result_vdef = gimple_vdef (SSA_NAME_DEF_STMT (result));
>>> +      if (vdef)
>>> +       changed |= set_ssa_val_to (vdef, result_vdef);
>>> +      changed |= set_ssa_val_to (lhs, result);
>>>
>>> because 'result' may be not an SSA name.  It might also not have
>>> a proper definition statement (if VN_INFO (result)->needs_insertion
>>> is true).  So you at least need to guard things properly.
>>>
>>
>> Right. And that also doesn't work if the function is without lhs, such as in 
>> the
>> new test-case pr51879-6.c.
>>
>> I fixed this by storing both lhs and vdef, such that I don't have to derive
>> the vdef from the lhs.
>>
>>> (On a side-note - I _did_ want to remove value-numbering virtual operands
>>> at some point ...)
>>>
>>
>> Doing so willl hurt performance of tail-merging in its current form.
>> OTOH, http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51964#c0 shows that
>> value numbering as used in tail-merging has its limitations too.
>> Do you have any ideas how to address that one?
>>
>>> @@ -3359,8 +3366,10 @@ visit_use (tree use)
>>>           /* ???  We should handle stores from calls.  */
>>>           else if (TREE_CODE (lhs) == SSA_NAME)
>>>             {
>>> +             tree vuse = gimple_vuse (stmt);
>>>               if (!gimple_call_internal_p (stmt)
>>> -                 && gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST))
>>> +                 && (gimple_call_flags (stmt) & (ECF_PURE | ECF_CONST)
>>> +                     || (vuse && SSA_VAL (vuse) != VN_TOP)))
>>>                 changed = visit_reference_op_call (lhs, stmt);
>>>               else
>>>                 changed = defs_to_varying (stmt);
>>>
>>> ... exactly because of the issue that a stmt has multiple defs.  Btw,
>>> vuse should have been visited here or be part of our SCC, so, why do
>>> you need this check?
>>>
>>
>> Removed now, that was a workaround for a bug in an earlier version of the 
>> patch,
>> that I didn't need anymore.
>>
>> Bootstrapped and reg-tested on x86_64.
>>
>> OK for stage1?
>>
>
> Richard,
>
> quoting you in http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00618.html:
> ...
> I think these fixes hint at that we should
> use "structural" equality as fallback if value-numbering doesn't equate
> two stmt effects.  Thus, treat two stmts with exactly the same operands
> and flags as equal and using value-numbering to canonicalize operands
> (when they are SSA names) for that comparison, or use VN entirely
> if there are no side-effects on the stmt.
>
> Changing value-numbering of virtual operands, even if it looks correct in the
> simple cases you change, doesn't look like a general solution for the missed
> tail merging opportunities.
> ...
>
> The test-case pr51879-6.c shows a case where improving value numbering will 
> help
> tail-merging, but structural equality comparison not:
> ...
>  # BLOCK 3 freq:1991
>  # PRED: 2 [19.9%]  (true,exec)
>  # .MEMD.1717_7 = VDEF <.MEMD.1717_6(D)>
>  # USE = nonlocal
>  # CLB = nonlocal
>  blaD.1708 (5);
>  # .MEMD.1717_8 = VDEF <.MEMD.1717_7>
>  # USE = nonlocal
>  # CLB = nonlocal
>  aD.1712_3 = barD.1704 (7);
>  goto <bb 5>;
>  # SUCC: 5 [100.0%]  (fallthru,exec)
>
>  # BLOCK 4 freq:8009
>  # PRED: 2 [80.1%]  (false,exec)
>  # .MEMD.1717_9 = VDEF <.MEMD.1717_6(D)>
>  # USE = nonlocal
>  # CLB = nonlocal
>  blaD.1708 (5);
>  # .MEMD.1717_10 = VDEF <.MEMD.1717_9>
>  # USE = nonlocal
>  # CLB = nonlocal
>  aD.1712_4 = barD.1704 (7);
>  # SUCC: 5 [100.0%]  (fallthru,exec)
>
>  # BLOCK 5 freq:10000
>  # PRED: 3 [100.0%]  (fallthru,exec) 4 [100.0%]  (fallthru,exec)
>  # aD.1712_1 = PHI <aD.1712_3(3), aD.1712_4(4)>
>  # .MEMD.1717_5 = PHI <.MEMD.1717_8(3), .MEMD.1717_10(4)>
>  # .MEMD.1717_11 = VDEF <.MEMD.1717_5>
>  # USE = nonlocal
>  # CLB = nonlocal
>  bazD.1706 (aD.1712_1);
>  # VUSE <.MEMD.1717_11>
>  return;
> ...
> by structurally comparing the gimples in both blocks, we can see that these 
> are
> equal.
> What we can't see, is that aD.1712_3 and aD.1712_4 have the same value. For
> that, we need value numbering to number the vuses the same. And if we get 
> value
> numbering to number these values equal, we don't need the structural 
> comparison.
>
> In this case structural comparison cannot be used as fallback for
> value-numbering. So, ok for trunk?
>
> And if not ok, do you have a suggestion of how to deal with this otherwise?

Hmm.  I'm not sure we can conclude that they have the same value!

+int bar (int);
+void baz (int);
+void bla (int);
+
+void
+foo (int y)
+{
+  int a;
+  if (y == 6)
+    {
+      bla (5);
+      a = bar (7);
+    }
+  else
+    {
+      bla (5);
+      a = bar (7);
+    }
+  baz (a);
+}

I can implement bla as

void bla(int) { a = random (); }

thus a would not have the same value on both paths - but that is not what
matters - because obviously we can still merge the blocks.  That is,
we cannot be sure that the side-effects on memory of bla (5) and bla (5)
are the same.  But is that not what your value-numbering changes will assert?
(not sure how you achieve this, but it looks like you insert/lookup general
calls, thus not pure/const ones - that could easily lead to miscompiles(?)).

Richard.

Richard.

Reply via email to