On Wed, 5 Feb 2014, Jakub Jelinek wrote:

> On Wed, Feb 05, 2014 at 02:20:05PM +0100, Richard Biener wrote:
> > > Using !!optimize to determine if we should switch local ABI to regparm
> > > convention isn't compatible with optimize attribute, as !!optimize is
> > > whether the current function is being optimized, but for the ABI decisions
> > > we actually need the caller and callee to agree on the calling convention.
> > > 
> > > Fixed by looking at callee's optimize settings all the time.
> > > 
> > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > Seeing a 2nd variant of such code I'd rather have an abstraction
> > for this ... optimize_fn ()?  opt_for_fn (fn, OPT_...)?
> 
> Here is an updated version of the patch, unfortunately this one regressed in
> quite a lot of tests.  The problem is that by using opt_for_fn there
> it also sets the failure reason for all functions at -O0 that don't have
> the optimize attribute at all.  Now copy_forbidden is used by the inliner,
> where we have to handle always_inline functions, other -O0 functions we likely
> don't want to inline, be it because the cmd line options are -O0 or because
> the function we are considering for inlining has optimize (0), and
> for versioning, where I'd say we want to never version the -O0 functions.
> 
> So, where do we want to do that instead?  E.g. should it be e.g. in
> tree_versionable_function_p directly and let the inliner (if it doesn't do
> already) also treat optimize(0) functions that aren't always_inline as
> noinline?
> 
> I guess even without this patch my PR60026 fix broke inlining of
> __attribute__((always_inline, optimize (0))) functions.
> 
> 2014-02-05  Jakub Jelinek  <ja...@redhat.com>
> 
>       PR target/60062
>       * tree.h (opts_for_fn): New inline function.
>       (opt_for_fn): Define.
>       * config/i386/i386.c (ix86_function_regparm): Use
>       opt_for_fn (decl, optimize) instead of optimize.
>       * tree-inline.c (copy_forbidden): Use opt_for_fn.
> 
>       * gcc.c-torture/execute/pr60062.c: New test.
>       * gcc.c-torture/execute/pr60072.c: New test.
> 
> --- gcc/tree.h.jj     2014-01-20 19:16:29.000000000 +0100
> +++ gcc/tree.h        2014-02-05 15:54:04.681904492 +0100
> @@ -4470,6 +4470,20 @@ may_be_aliased (const_tree var)
>             || TREE_ADDRESSABLE (var)));
>  }
>  
> +/* Return pointer to optimization flags of FNDECL.  */
> +static inline struct cl_optimization *
> +opts_for_fn (const_tree fndecl)
> +{
> +  tree fn_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fndecl);
> +  if (fn_opts == NULL_TREE)
> +    fn_opts = optimization_default_node;
> +  return TREE_OPTIMIZATION (fn_opts);
> +}
> +
> +/* opt flag for function FNDECL, e.g. opts_for_fn (fndecl, optimize) is
> +   the optimization level of function fndecl.  */
> +#define opt_for_fn(fndecl, opt) (opts_for_fn (fndecl)->x_##opt)
> +
>  /* For anonymous aggregate types, we need some sort of name to
>     hold on to.  In practice, this should not appear, but it should
>     not be harmful if it does.  */
> --- gcc/config/i386/i386.c.jj 2014-02-05 10:37:36.166167116 +0100
> +++ gcc/config/i386/i386.c    2014-02-05 15:56:36.779146394 +0100
> @@ -5608,7 +5608,12 @@ ix86_function_regparm (const_tree type,
>    /* Use register calling convention for local functions when possible.  */
>    if (decl
>        && TREE_CODE (decl) == FUNCTION_DECL
> -      && optimize
> +      /* Caller and callee must agree on the calling convention, so
> +      checking here just optimize means that with
> +      __attribute__((optimize (...))) caller could use regparm convention
> +      and callee not, or vice versa.  Instead look at whether the callee
> +      is optimized or not.  */
> +      && opt_for_fn (decl, optimize)
>        && !(profile_flag && !flag_fentry))
>      {
>        /* FIXME: remove this CONST_CAST when cgraph.[ch] is constified.  */
> --- gcc/tree-inline.c.jj      2014-02-04 14:03:31.000000000 +0100
> +++ gcc/tree-inline.c 2014-02-05 15:55:34.747448968 +0100
> @@ -3315,16 +3315,10 @@ copy_forbidden (struct function *fun, tr
>       goto fail;
>        }
>  
> -  tree fs_opts;
> -  fs_opts = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (fun->decl);
> -  if (fs_opts)
> +  if (!opt_for_fn (fun->decl, optimize))

Maybe

  if (!opt_for_fn (fun->decl, optimize)
      && optimization_default_node->x_optimize != 0)

?

But yes, copy_forbidden is a bit big hammer considering always_inline
and similar stuff.  I think we need to handle -O0 at all consumers
selectively.

>      {
> -      struct cl_optimization *os = TREE_OPTIMIZATION (fs_opts);
> -      if (!os->x_optimize)
> -     {
> -       reason = G_("function %q+F compiled without optimizations");
> -       goto fail;
> -     }
> +      reason = G_("function %q+F compiled without optimizations");
> +      goto fail;
>      }
>  
>   fail:
> --- gcc/testsuite/gcc.c-torture/execute/pr60062.c.jj  2014-02-05 
> 15:55:48.639388697 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr60062.c     2014-02-05 
> 15:55:48.639388697 +0100
> @@ -0,0 +1,25 @@
> +/* PR target/60062 */
> +
> +int a;
> +
> +static void
> +foo (const char *p1, int p2)
> +{
> +  if (__builtin_strcmp (p1, "hello") != 0)
> +    __builtin_abort ();
> +}
> +
> +static void
> +bar (const char *p1)
> +{
> +  if (__builtin_strcmp (p1, "hello") != 0)
> +    __builtin_abort ();
> +}
> +
> +__attribute__((optimize (0))) int
> +main ()
> +{
> +  foo ("hello", a);
> +  bar ("hello");
> +  return 0;
> +}
> --- gcc/testsuite/gcc.c-torture/execute/pr60072.c.jj  2014-02-05 
> 15:55:48.640388641 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr60072.c     2014-02-05 
> 15:55:48.640388641 +0100
> @@ -0,0 +1,16 @@
> +/* PR target/60072 */
> +
> +int c = 1;
> +
> +__attribute__ ((optimize (1)))
> +static int *foo (int *p)
> +{
> +  return p;
> +}
> +
> +int
> +main ()
> +{
> +  *foo (&c) = 2;
> +  return c - 2;
> +}
> 
> 
>       Jakub
> 
> 

-- 
Richard Biener <rguent...@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imend"orffer

Reply via email to