Hello
> > Thanks.
> > 
> > I think my earlier analysis was wrong.
Sorry for late reply.  I was looking into it again yesterday but was bit
confused about what is goin gon here.
> > 
> > With the caveat that I'm not as familiar with the IPA code as other
> > parts of the compiler, what I think is going on is:
> > 
> > WPA streams in the input callgraph:
> > 
> >   main/1 -------------------> a/0  ----------------------> b/2
> >                  |                           |
> >           lto_stmt_uid == 1          lto_stmt_uid == 1
> > 
> > WPA generates a clone of a, and streams out this callgraph:
> > 
> >   main/1 -------------------> a.isra/3  ----------------------> b/2
> > 
> > Without -fanalyzer, the callgraph edges have NULL call_stmt and so
> > the
> > existing lto_stmt_uid values are used when streaming out:
> > 
> >   main/1 -------------------> a.isra/3  ----------------------> b/2
> >                  |                           |
> >          lto_stmt_uid == 1          lto_stmt_uid == 1
> > 
> > 
> > With -fanalyzer, the functions were materialized, so the callgraph
> > edges have non-NULL call-stmt.
> > 
> > lto_prepare_function_for_streaming is called for "main/1", but not
> > for
> > "a.isra/3", as it is a clone (see the condition at line 310):
> > 
> >     307   /* Do body modifications needed for streaming before we
> > fork out
> >     308      worker processes.  */
> >     309   FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
> >     310     if (!node->clone_of && gimple_has_body_p (node->decl))
> >     311       lto_prepare_function_for_streaming (node);
> > 
> > Hence the uids are written to within main/1, but not within a.isra/3
> > (I
> > think).

This is still correct. The idea here is that clone does not have its own
body but shares it with the original function until it gets materialized
and then it is turned to real function itself.

At WPA time some functions are loaded into memory while others are just
rerefences to LTO data.
So either
  1) we have function referencing LTO data and then the function
     and all its clones should use lto_stmt_uid and stmt pointers should
     be NULL or 
  2) we have function loaded in memory and the function and all its
     clones should have lto_stmt_uid undefined and stmt points used.

Since you load function to memory we should end up using stmt pointers.
Since there is only one body, you want to call
lto_prepare_function_for_streaming just once - for the function and not
for its clones.
> > Hence this callgraph is written out if the functions bodies were
> > loaded:
> > 
> >   main/1 -------------------> a.isra/3  ----------------------> b/2
> >                  |                           |
> >          call_stmt->uid == 0           call_stmt->uid == 2
> >         so lto_stmt_uid == 1          so lto_stmt_uid == 3
> > 
> > 
> > At the ltrans stream-in, it reads in the edges, and then
> > calls fixup_call_stmt_edges_1 on both the original node *and* any
> > clones:
> > 
> >         1237        fixup_call_stmt_edges_1 (orig, stmts, fn);
> >         1238      if (orig->clones)
> >         1239        for (node = orig->clones; node != orig;)
> >         1240          {
> >         1241            if (!node->thunk)
> >         1242              fixup_call_stmt_edges_1 (node, stmts, fn);

This is also intended - if you load the body and to maintain the
invariant above, you want to change the representation of all edges
(including those in clones)
> > 
> > WPA has renumbered the stmts in non-clone functions, but has not
> > renumbered them for clones.

And this is where things seem to be wrong - since function body is in
memory at WPA time, the uids sould be cleared by fixup_call_stmt_edges
at WPA time and thus edge->call_stmt != 0 for all callgraph edges in the
function and all its clones.
Consequently while streaming out we should use:
uid = !edge->call_stmt ? edge->lto_stmt_uid
                       : gimple_uid (edge->call_stmt) + 1;
So all the uids should go from gimple_uid (edge->call_stmt) and those
all points to the call statements that has been renumbered
by lto_prepare_function.  So I do not see how these are going out of
sync.
> > 
> > Note that fixup_call_stmt_edges passes the same array of stmts to
> > both
> > calls to fixup_call_stmt_edges_1 (both the original and the clones)
> > i.e. it's trying to fixup the edges for "a.isra/3" with the stmts in
> > "a/0".   Should it be doing this?  "stmts" is constructed in
Yes, since these are same call statements (there is only one gimple body
around).
> > input_function by walking the BBs of cfun.  Is that going to find the
> > stmts of the clones?  (sorry, I'm not familiar with how clones work
> > within our IPA implementation).
> > 
> > fixup_call_stmt_edges_1 attempts to fixup the call edge from a.isra/3
> > to b/2, which has lto_stmt_uid == 3 and thus stmt->uid == 2, but
> > there
> > are only 2 stmts within the function "a/0", hence it's one past the
> > end
> > of the array, and fails with:
> > test.c:3:5: fatal error: Cgraph edge statement index out of range
> > 
> > I wonder if this can happen any time WPA loads a function body that
> > makes a call that then gets cloned; it just happens that -fanalyzer
> > is
> > loading the function body.

I guess one needs to figure out how WPA end up streaming different
indexes.  I will try to de bug that.

Honza
> 
> It's still not clear to me what's going on with clones in the above,
> e.g. why we don't fix up clones in lto_prepare_function_for_streaming.
> However, given that the analyzer part of the patch fixes things and
> seems low risk, I've gone ahead and committed the following simpler
> version of the patch to trunk for gcc 11, to unbreak the combination
> of -fanalyzer and -flto for the case where a function gets cloned.
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> Pushed to trunk as r11-8142-g17f3c2b8ac477b07ca0aafbc7d74ba305dc1ee33.
> 
> Here's the commit message and patch:
> 
> gimple.h has this comment for gimple's uid field:
> 
>   /* UID of this statement.  This is used by passes that want to
>      assign IDs to statements.  It must be assigned and used by each
>      pass.  By default it should be assumed to contain garbage.  */
>   unsigned uid;
> 
> and gimple_set_uid has:
> 
>    Please note that this UID property is supposed to be undefined at
>    pass boundaries.  This means that a given pass should not assume it
>    contains any useful value when the pass starts and thus can set it
>    to any value it sees fit.
> 
> which suggests that any pass can use the uid field as an arbitrary
> scratch space.
> 
> PR analyzer/98599 reports a case where this error occurs in LTO mode:
>   fatal error: Cgraph edge statement index out of range
> on certain inputs with -fanalyzer.
> 
> The error occurs in the LTRANS phase after -fanalyzer runs in the
> WPA phase.  The analyzer pass writes to the uid fields of all stmts.
> 
> The error occurs when LTRANS is streaming callgraph edges back in.
> The LTO format uses stmt uids to associate call stmts with callgraph
> edges between WPA and LTRANS.
> For example, in lto-cgraph.c, lto_output_edge writes out the
> gimple_uid, and input_edge reads it back in.
> 
> lto_prepare_function_for_streaming has code to renumber the stmt UIDs
> when the code is streamed back out, but for some reason this isn't
> called for clones:
>     307         /* Do body modifications needed for streaming before we fork 
> out
>     308            worker processes.  */
>     309         FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
>     310           if (!node->clone_of && gimple_has_body_p (node->decl))
>     311             lto_prepare_function_for_streaming (node);
> 
> Hence the combination of -fanalyzer and -flto will fail in LTRANS's
> stream-in if any function clones are encountered.
> 
> It's not fully clear to me why this isn't done for clones, and what the
> correct fix should be to allow arbitrary changes to uids within WPA
> passes.
> 
> In the meantime, this patch works around the issue by updating the
> analyzer to save and restore the UIDs, fixing the error.
> 
> gcc/analyzer/ChangeLog:
>       PR analyzer/98599
>       * supergraph.cc (saved_uids::make_uid_unique): New.
>       (saved_uids::restore_uids): New.
>       (supergraph::supergraph): Replace assignments to stmt->uid with
>       calls to m_stmt_uids.make_uid_unique.
>       (supergraph::~supergraph): New.
>       * supergraph.h (class saved_uids): New.
>       (supergraph::~supergraph): New decl.
>       (supergraph::m_stmt_uids): New field.
> 
> gcc/testsuite/ChangeLog:
>       PR analyzer/98599
>       * gcc.dg/analyzer/pr98599-a.c: New test.
>       * gcc.dg/analyzer/pr98599-b.c: New test.
> ---
>  gcc/analyzer/supergraph.cc                | 57 +++++++++++++++++++++--
>  gcc/analyzer/supergraph.h                 | 15 ++++++
>  gcc/testsuite/gcc.dg/analyzer/pr98599-a.c |  8 ++++
>  gcc/testsuite/gcc.dg/analyzer/pr98599-b.c |  1 +
>  4 files changed, 77 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr98599-a.c
>  create mode 100644 gcc/testsuite/gcc.dg/analyzer/pr98599-b.c
> 
> diff --git a/gcc/analyzer/supergraph.cc b/gcc/analyzer/supergraph.cc
> index 4b934568db6..8611d0f8689 100644
> --- a/gcc/analyzer/supergraph.cc
> +++ b/gcc/analyzer/supergraph.cc
> @@ -87,6 +87,50 @@ supergraph_call_edge (function *fun, gimple *stmt)
>    return edge;
>  }
>  
> +/* class saved_uids.
> +
> +   In order to ensure consistent results without relying on the ordering
> +   of pointer values we assign a uid to each gimple stmt, globally unique
> +   across all functions.
> +
> +   Normally, the stmt uids are a scratch space that each pass can freely
> +   assign its own values to.  However, in the case of LTO, the uids are
> +   used to associate call stmts with callgraph edges between the WPA phase
> +   (where the analyzer runs in LTO mode) and the LTRANS phase; if the
> +   analyzer changes them in the WPA phase, it leads to errors when
> +   streaming the code back in at LTRANS.
> +   lto_prepare_function_for_streaming has code to renumber the stmt UIDs
> +   when the code is streamed back out, but for some reason this isn't
> +   called for clones.
> +
> +   Hence, as a workaround, this class has responsibility for tracking
> +   the original uids and restoring them once the pass is complete
> +   (in the supergraph dtor).  */
> +
> +/* Give STMT a globally unique uid, storing its original uid so it can
> +   later be restored.  */
> +
> +void
> +saved_uids::make_uid_unique (gimple *stmt)
> +{
> +  unsigned next_uid = m_old_stmt_uids.length ();
> +  unsigned old_stmt_uid = stmt->uid;
> +  stmt->uid = next_uid;
> +  m_old_stmt_uids.safe_push
> +    (std::pair<gimple *, unsigned> (stmt, old_stmt_uid));
> +}
> +
> +/* Restore the saved uids of all stmts.  */
> +
> +void
> +saved_uids::restore_uids () const
> +{
> +  unsigned i;
> +  std::pair<gimple *, unsigned> *pair;
> +  FOR_EACH_VEC_ELT (m_old_stmt_uids, i, pair)
> +    pair->first->uid = pair->second;
> +}
> +
>  /* supergraph's ctor.  Walk the callgraph, building supernodes for each
>     CFG basic block, splitting the basic blocks at callsites.  Join
>     together the supernodes with interprocedural and intraprocedural
> @@ -101,8 +145,6 @@ supergraph::supergraph (logger *logger)
>  
>    /* First pass: make supernodes (and assign UIDs to the gimple stmts).  */
>    {
> -    unsigned next_uid = 0;
> -
>      /* Sort the cgraph_nodes?  */
>      cgraph_node *node;
>      FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
> @@ -127,7 +169,7 @@ supergraph::supergraph (logger *logger)
>           {
>             gimple *stmt = gsi_stmt (gpi);
>             m_stmt_to_node_t.put (stmt, node_for_stmts);
> -           stmt->uid = next_uid++;
> +           m_stmt_uids.make_uid_unique (stmt);
>           }
>  
>         /* Append statements from BB to the current supernode, splitting
> @@ -139,7 +181,7 @@ supergraph::supergraph (logger *logger)
>             gimple *stmt = gsi_stmt (gsi);
>             node_for_stmts->m_stmts.safe_push (stmt);
>             m_stmt_to_node_t.put (stmt, node_for_stmts);
> -           stmt->uid = next_uid++;
> +           m_stmt_uids.make_uid_unique (stmt);
>             if (cgraph_edge *edge = supergraph_call_edge (fun, stmt))
>               {
>                 m_cgraph_edge_to_caller_prev_node.put(edge, node_for_stmts);
> @@ -257,6 +299,13 @@ supergraph::supergraph (logger *logger)
>    }
>  }
>  
> +/* supergraph's dtor.  Reset stmt uids.  */
> +
> +supergraph::~supergraph ()
> +{
> +  m_stmt_uids.restore_uids ();
> +}
> +
>  /* Dump this graph in .dot format to PP, using DUMP_ARGS.
>     Cluster the supernodes by function, then by BB from original CFG.  */
>  
> diff --git a/gcc/analyzer/supergraph.h b/gcc/analyzer/supergraph.h
> index 5d1268e555f..f4090fd5e0e 100644
> --- a/gcc/analyzer/supergraph.h
> +++ b/gcc/analyzer/supergraph.h
> @@ -79,6 +79,18 @@ struct supergraph_traits
>    typedef supercluster cluster_t;
>  };
>  
> +/* A class to manage the setting and restoring of statement uids.  */
> +
> +class saved_uids
> +{
> +public:
> +  void make_uid_unique (gimple *stmt);
> +  void restore_uids () const;
> +
> +private:
> +  auto_vec<std::pair<gimple *, unsigned> > m_old_stmt_uids;
> +};
> +
>  /* A "supergraph" is a directed graph formed by joining together all CFGs,
>     linking them via interprocedural call and return edges.
>  
> @@ -90,6 +102,7 @@ class supergraph : public digraph<supergraph_traits>
>  {
>  public:
>    supergraph (logger *logger);
> +  ~supergraph ();
>  
>    supernode *get_node_for_function_entry (function *fun) const
>    {
> @@ -205,6 +218,8 @@ private:
>  
>    typedef hash_map<const function *, unsigned> function_to_num_snodes_t;
>    function_to_num_snodes_t m_function_to_num_snodes;
> +
> +  saved_uids m_stmt_uids;
>  };
>  
>  /* A node within a supergraph.  */
> diff --git a/gcc/testsuite/gcc.dg/analyzer/pr98599-a.c 
> b/gcc/testsuite/gcc.dg/analyzer/pr98599-a.c
> new file mode 100644
> index 00000000000..2bbf37b0e6e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/pr98599-a.c
> @@ -0,0 +1,8 @@
> +/* { dg-do link } */
> +/* { dg-require-effective-target lto } */
> +/* { dg-additional-options "-Os -flto" } */
> +/* { dg-additional-sources pr98599-b.c } */
> +
> +int b(int x);
> +int a() { b(5); }
> +int main() { a(); }
> diff --git a/gcc/testsuite/gcc.dg/analyzer/pr98599-b.c 
> b/gcc/testsuite/gcc.dg/analyzer/pr98599-b.c
> new file mode 100644
> index 00000000000..cfdeb3bf134
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/pr98599-b.c
> @@ -0,0 +1 @@
> +int b(int x) { return x; }
> -- 
> 2.26.2
> 

Reply via email to