Re: Surprising CFG construction with goto from then to else

Jørgen Kvalsvik Thu, 08 Sep 2022 03:31:05 -0700

On 02/09/2022 14:22, Richard Biener wrote:

On Fri, Sep 2, 2022 at 11:50 AM Jørgen Kvalsvik <j...@lambda.is> wrote:



Hello,

I played some more with odd programs and the effect on control flow
graph construction (as a part of condition coverage support [1]) and
came across this:

int fn (int a, int b, int c) {
      int x = 0;
      if (a && b) {
          if (c) {
              goto a_;
          } else {
              x = a;
          }
      } else {
a_:
          x = (a - b);
      }

      return x;
}

Run through gcov --conditions I get:

          4:    5:    if (a && b) {
condition outcomes covered 2/2
condition outcomes covered 2/2
          2:    6:        if (c) {
condition outcomes covered 2/2

Which is clearly not correct. So I started digging into why and dump the
CFG as the coverage profiling sees it https://i.imgur.com/d0q72rA.png
[2]. I apologize for the labeling, but A2 = a, A3 = b, A5 = c and A9 the
else block. The problem, which is what confuses the algorithm, is that a
and b don't share A9 as a successor (on false) as I would expect.

If I add some operation before the label the problem disappears and a
and b share false-destination again https://i.imgur.com/PSrfaLC.png [3].

      } else {
          x++;
a_:
          x = (a - b);
      }

          4:    5:    if (a && b) {
condition outcomes covered 4/4
          2:    6:        if (c) {
condition outcomes covered 2/2


When dumping the cfg in the former case with -fdump-tree-cfg-graph I get
a CFG without the split destinations in a and b
https://i.imgur.com/05MCjzp.png [3]. I would assume from this that the
graph dump happens after _more_ CFG transformations than the branch
profiling.

So my questions are:

1. Is the control flow graph expected to be constructed as such where a
and b don't share outcome, or is it to be considered a bug?
2. If yes, would it be problematic to push the branch coverage and
condition profiling to a later stage where the cfg has been fixed?


I would say you should only see more nodes merged.  It's a bit hard to follow
what you say with the namings - I usually run cc1 in gdb, breaking at
execute_build_cfg where you can do, after build_gimple_cfg finished
(and before cleanup_tree_cfg ()) do a 'dot-fn' in gdb which produces a nice
picture of the CFG and code with graphviz.

It looks like I would have expected, in particular we do not force a
new basic-block to be generated for a_: after the D.1991: artificial
label we have for the else.  That might be premature optimization
for your case (but the cleanup_tree_cfg () would immediately do
that as well).

Richard.

I did some more digging into this and have isolated the problem to edgesplitting inside the branch_prob () function itself.


gcc/profile.cc:1248

  if (last
      && gimple_has_location (last)
      && !RESERVED_LOCATION_P (e->goto_locus)
      && !single_succ_p (bb)
      && (LOCATION_FILE (e->goto_locus)
          != LOCATION_FILE (gimple_location (last))
          || (LOCATION_LINE (e->goto_locus)
              != LOCATION_LINE (gimple_location (last)))))
  {
      basic_block new_bb = split_edge (e);
      edge ne = single_succ_edge (new_bb);
      ne->goto_locus = e->goto_locus;
  }

Based on the cleaned-up cfg that gcc dumps later it looks like thissplit only lives through the branch coverage/profiling phase (it maybleed slightly later but it shouldn't be of significance).

Out of curiosity I removed the splitting altogether and no tests failedwhen running make check-gcc check-g++ RUNTESTFLAGS="gcov.exp". Either itwas not covered by tests in the first place, or whatever behaviour thischeck is meant to fix is resolved elsewhere. I have to admit I don'treally see a difference with/without this patch, but I don't know whatto look for.


The check was first introduced in 2005 by Jan (cc):

commit d783b2a2dc91e1d2c1fea78cac2b6c6c73b3680d
Author: Jan Hubicka <j...@suse.cz>
Date:   Thu Aug 4 00:10:54 2005 +0200

profile.c (branch_prob): Split edges with goto locus on them to getproper line counts.



            * profile.c (branch_prob): Split edges with goto locus on them
            to get proper line counts.

* tree-cfg.c (make_cond_expr_edges): Record user gotolocuses, if any.


            * gcov-1.C: Fix switch counts.
            * gcov-4b.c: Likewise.

What stands out to me in the check is that it uses location-file andlocation-line to decide if to split the edge. I added a few prints tosee when the file/line is set:

2 int goto1 (int a) {



   3     if (a)
   4         goto end;
   5
   6     return 1;
   7 end:
   8     x += a;
   9     return 0;
  10 }

if (a_5(D) != 0)



        edge (true)
        last goto2.c:3
        goto (null):0

    if (a_5(D) != 0)
        edge (false)
        last goto2.c:3
        goto (null):0

    // predicted unlikely by goto predictor.
        edge (fallthru)
        last goto2.c:4
        goto goto2.c:4

The goto statement is the only with with a location for both the basicblock and the edge.

12 int goto2 (int a) {



  13     if (a) { goto end; }
  14     else   { label: a++; }
  15
  16     return 1;
  17 end:
  18     x += a;
  19     return 0;
  20 }

    if (a_5(D) != 0)
        edge (true)
        last goto2.c:13
        goto (null):0

    if (a_5(D) != 0)
        edge (false)
        last goto2.c:13
        goto goto2.c:14

    // predicted unlikely by goto predictor.
        edge (fallthru)
        last goto2.c:13
        goto goto2.c:13

Now the else block has two locations as well, with the edge labele->goto_locus being inside the else block. Note that this label is_unrelated_ to the edge jump a (false) -> else.


Now a function without gotos:

  22 int goto3 (int a, int b) {
  23     if (a && b) {
  24         x += a * b;
  25     } else {
  26         x -= 1;
  27     }
  28     return 0;
  29 }

    if (a_7(D) != 0)
        edge (true)
        last goto2.c:23
        goto (null):0

    if (a_7(D) != 0)
        edge (false)
        last goto2.c:23
        goto (null):0

    if (b_8(D) != 0)
        edge (true)
        last goto2.c:23
        goto (null):0

    if (b_8(D) != 0)
        edge (false)
        last goto2.c:23
        goto (null):0

    x = _3;
        edge (fallthru)
        last goto2.c:24
        goto goto2.c:24

    x = _5;
        edge (fallthru)
        last goto2.c:26
        goto (null):0

Now the checks if (a) and (b) don't have goto_locus on the edges. Forcompleteness I included the implied jumps in the then/else blocks which_do_ have edge locus in the then case.


Finally, the case that expose the problem for me:

  31 int goto4 (int a, int b) {
  32     if (a) {
  33         if (b) goto elseblock;
  34         else a++;
  35     } else {
  36 elseblock:
  37         a--;
  38     }
  39
  40     return a;
  41 }

    if (a_2(D) != 0)
        edge (true)
        last goto2.c:32
        goto (null):0

    if (a_2(D) != 0)
        edge (false)
        last goto2.c:32
        goto goto2.c:36

    if (b_3(D) != 0)
        edge (true)
        last goto2.c:33
        goto (null):0

    if (b_3(D) != 0)
        edge (false)
        last goto2.c:33
        goto (null):0

Again, a (false) has a goto_locus because there is an unrelated label atthe top of the else block. For completeness, it also applies happenswhen there's a label on top of then:


  43 int goto5 (int a, int b) {
  44     if (a) {
  45 then:
  46         a++;
  47     } else {

48 a--;



  49     }
  50
  51     return a;
  52 }

    if (a_2(D) != 0)
        edge (true)
        last goto2.c:44
        goto goto2.c:45

    if (a_2(D) != 0)
        edge (false)
        last goto2.c:44
        goto (null):0

This causes the edge to split which probably isn't a problem for thebranch coverage, but it is problematic for my condition coveragealgorithm. So how do we solve this?

1. Is the edge splitting necessary? I didn't find a test that coversthis (there might be one, any idea?). If the edge splitting is notnecessary anymore then removing it should be fine for my coverage needs.2. Assuming the edge split is necessary making a decision on source file+ line seems vulnerable to source formatting:


  54 int goto6 (int a) {

55 if (a) label: goto end;



  56     return 0;
  57 end:
  58     return 1;
  59
  60 }

    if (a_2(D) != 0)
        edge (true)
        last goto2.c:55
        goto goto2.c:55

    if (a_2(D) != 0)
        edge (false)
        last goto2.c:55
        goto (null):0

    // predicted unlikely by goto predictor.
        edge (fallthru)
        last goto2.c:55
        goto goto2.c:55

Now unrelated gotos all have the same file/line signature. Since theedge to the goto_locus is unrelated to the label itself.3. Assuming the test is fine and necessary I *could* work around theproblem by recording the original edge somewhere (it is very importantthat all conditions in an expression share the same then/else basicblocks), or enough metadata to make a virtual edge, but that is a hack Ihope I don't have to do.

Anyway, the problem is not with the cfg construction itself, but an edgesplitting that happens specifically for profiling.


CC Martin, maybe you have any idea?

Thanks,
Jørgen.

Re: Surprising CFG construction with goto from then to else

Reply via email to