Re: PR #53525 - track-macro-expansion performance regression

Dodji Seketeli Wed, 18 Jul 2012 11:58:53 -0700

Hello Dimitrios,

> With the attached patches I introduce four new obstacks in struct
> cpp_reader to substitute malloc's/realloc's when expanding
> macros. Numbers have been posted in the PR, but to summarize:
> 
> before: 0.785 s or  2201 M instr
> after:  0.760 s or  2108 M instr
> 
> Memory overhead is some tens kilobytes worst case. Tested on x86, no
> regressions. I think this version of the patch is OK to merge, besides
> some TODO comments (I'd appreciate opinions on them) and the following
> maybe:


Thank you for your time and dedication.

I am not a maintainer of any kind, so I can not approve or deny your
patch.  I am just chiming in to help with a few comments and CC the
maintainers.

> IMHO the new obstack_{mark,release} functions are the ones that will
> be harder to apply, because they make gcc's obstacks even more
> different than libc's. I sent the patch to libc-alpha but the feedback
> I got was that I should first make the two obstack versions (gcc,libc)
> identical, and then try to push changes. I've noted the primary
> differences and plan on tackling this, but it's not as trivial as it
> seems.
> 
> So if it's not OK, where could the new obstack_{mark,release} go?

I am letting the maintainers reply to this one.  :-)

Please find my comments below.

For each sub-project that your patch modifies, you need to add a
GNU-Style ChangeLog.  The custom is to add it separately (e.g inline
or in attachment), not as part of the diff.  That way, applying the
patch is less likely to trigger conflicts.

> === modified file 'include/libiberty.h'
> --- include/libiberty.h       2011-09-28 19:04:30 +0000
> +++ include/libiberty.h       2012-07-07 16:04:01 +0000

[...]

> -/* Type-safe obstack allocator.  */
> +/* Type-safe obstack allocator. You must first initialize the obstack with
> +   obstack_init() or _obstack_begin(). */

Why not just using the _obstack_begin function?  Why introducing the
use of the obstack_init macro?

>  #define XOBNEW(O, T)         ((T *) obstack_alloc ((O), sizeof (T)))
>  #define XOBNEWVEC(O, T, N)   ((T *) obstack_alloc ((O), sizeof (T) * (N)))
>  #define XOBNEWVAR(O, T, S)   ((T *) obstack_alloc ((O), (S)))
> -#define XOBFINISH(O, T)         ((T) obstack_finish ((O)))

I think this change is not needed.  You basically remove this line to
replace it with the hunk below, that comes later in the patch:

> +#define XOBFINISH(O, PT)     ((PT) obstack_finish ((O)))

So I believe you could do away with the change.

> +#define XOBDELETE(O, T, P)   (obstack_free ((O), (P)))

If you are not using the T parameter in the definition of the macro,
then you might as well remove it from the macro parameters.

> +
> +#define XOBGROW(O, T, D)     obstack_grow ((O), (D), sizeof (T))
> +#define XOBGROWVEC(O, T, D, N)       obstack_grow ((O), (D), sizeof (T) * 
> (N))
> +#define XOBSHRINK(O, T)              obstack_blank ((O), -1 * sizeof (T))
> +#define XOBSHRINKVEC(O, T, N)        obstack_blank ((O), -1 * sizeof (T) * 
> (N))

Maybe these new macros could use some comments at least to make it
easier to figure out what these O, T, D, N parameters mean.  I
understand that it is not done that way for the existing macros, but I
guess we could use some improvement here.  :-)

[...]

> === modified file 'libcpp/init.c'
> --- libcpp/init.c     2012-04-30 11:43:43 +0000
> +++ libcpp/init.c     2012-07-07 16:04:01 +0000

Missing ChangeLog for this changes of libcpp.

> @@ -241,10 +241,20 @@ cpp_create_reader (enum c_lang lang, has
>    /* The expression parser stack.  */
>    _cpp_expand_op_stack (pfile);
>  
> +#define obstack_chunk_alloc     ((void *(*) (long)) xmalloc)
> +#define obstack_chunk_free      ((void (*) (void *)) free)
> +
>    /* Initialize the buffer obstack.  */
> -  _obstack_begin (&pfile->buffer_ob, 0, 0,
> -               (void *(*) (long)) xmalloc,
> -               (void (*) (void *)) free);
> +  obstack_init(&pfile->buffer_ob);

Same comment as earlier.  Why obstack_init instead of just using
_obstack_begin?

> +
> +  /* Initialize the macro obstacks. */
> +  obstack_init (&pfile->exp_ob);
> +  if (CPP_OPTION (pfile, track_macro_expansion))
> +    {
> +      obstack_init (&pfile->virt_locs_ob);
> +      obstack_init (&pfile->arg_locs_ob);
> +      obstack_init (&pfile->exp_locs_ob);
> +    }

Same comment as above.

[...]

> === modified file 'libcpp/internal.h'
> --- libcpp/internal.h 2012-05-29 09:36:29 +0000
> +++ libcpp/internal.h 2012-07-07 17:18:53 +0000

[...]

@@ -555,6 +555,13 @@ struct cpp_reader

[...]

> +  /* Obstacks used to speed up macro expansion and virt_locs tracking. */

I'd say something like:

/* Obstacks used for fast memory allocation during macro expansion and
   virtual location tracking. */

+  struct obstack exp_ob;       /* for expanding macro arguments */

I'd rather call this field args_exp_ob, to make the name more
meaningful.

+ struct obstack exp_locs_ob;   /* virt_locs of expanded macro arguments */

Likewise, I'd call this field args_exp_virt_locs_ob.

+  struct obstack virt_locs_ob; /* virt locs for all other macros */

I'd change the comment here to:

    /* Virtual locations for tokens resulting from macro expansion.  */

Also, please make sure to add a dot at the end of the comments,
followed by two spaces before the */.

> === modified file 'libcpp/macro.c'
> --- libcpp/macro.c    2012-05-29 14:53:50 +0000
> +++ libcpp/macro.c    2012-07-07 18:08:44 +0000
@@ -24,10 +24,20 @@ along with this program; see the file CO
  You are forbidden to forbid anyone else to use, share and improve
  what you give them.   Help stamp out software-hoarding!  */
 
+

This is an unnecessary white space change.

[...]

+#define MACRO_ASSERT(EXPR)                     \
+  do {                                         \
+    if (! (EXPR))                              \
+      abort ();                                        \
+  } while (0)
+

I believe this macro should be guarded by an #ifdef ENABLED_CHECKING,
so that the macro expands to nothing when building the compiler with
--disable-checking.

>  static const cpp_token **arg_token_ptr_at (const macro_arg *,
> @@ -276,7 +279,7 @@ _cpp_builtin_macro_text (cpp_reader *pfi
>       unsigned int len;
>       const char *name;
>       uchar *buf;
> -     
> +

This is an unnecessary white space change.  At a very least, it should
go into a separate cleanup patch.

[...]

> @@ -361,7 +364,7 @@ _cpp_builtin_macro_text (cpp_reader *pfi
>           {
>             cpp_errno (pfile, CPP_DL_WARNING,
>                        "could not determine date and time");
> -             
> +

Likewise.

> @@ -388,7 +391,7 @@ _cpp_builtin_macro_text (cpp_reader *pfi
>        sprintf ((char *) result, "%u", number);
>      }
>  
> -  return result;      
> +  return result;
>  }

Likewise.

[...]

>  
>  /* Convert builtin macros like __FILE__ to a token and push it on the
> @@ -734,6 +737,52 @@ _cpp_arguments_ok (cpp_reader *pfile, cp
>    return false;
>  }
>  
> +/* Returns next token after pragma (either EOF or EOL). */
> +
> +static const cpp_token *
> +handle_pragma (cpp_reader *pfile, const cpp_token *token,
> +            _cpp_buff **pragma_buff)
> +{
> +  cpp_token *newtok = _cpp_temp_token (pfile);
> +
> +  /* CPP_PRAGMA token lives in directive_result, which will
> +     be overwritten on the next directive.  */
> +  *newtok = *token;
> +  token = newtok;
> +  do
> +    {
> +      if (*pragma_buff == NULL
> +       || BUFF_ROOM (*pragma_buff) < sizeof (cpp_token *))
> +     {
> +       _cpp_buff *next;
> +       if (*pragma_buff == NULL)
> +         *pragma_buff = _cpp_get_buff (pfile, 32 * sizeof (cpp_token *));
> +       else
> +         {
> +           next = *pragma_buff;
> +           *pragma_buff = _cpp_get_buff (pfile,
> +                                         (BUFF_FRONT (*pragma_buff)
> +                                          - (*pragma_buff)->base) * 2);
> +           (*pragma_buff)->next = next;
> +         }
> +     }
> +      *(const cpp_token **) BUFF_FRONT (*pragma_buff) = token;
> +      BUFF_FRONT (*pragma_buff) += sizeof (cpp_token *);
> +      if (token->type == CPP_PRAGMA_EOL)
> +     break;
> +
> +      token = cpp_get_token_1 (pfile, NULL);
> +    }
> +  while (token->type != CPP_EOF);
> +
> +  /* In deferred pragmas parsing_args and prevent_expansion
> +     had been changed, reset it.  */
> +  pfile->state.parsing_args = 2;
> +  pfile->state.prevent_expansion = 1;
> +
> +  return token;
> +}
> +

I understand that the macro expansion code is not always simple to
grok, and I understand the pressure to simplify it.  But I think that
factorizing this new handle_pragma function, out of the hunk:

[...]

> @@ -844,63 +880,40 @@ collect_args (cpp_reader *pfile, const c
>           break;
>         else if (token->type == CPP_PRAGMA)
>           {
> -           cpp_token *newtok = _cpp_temp_token (pfile);
> -
> -           /* CPP_PRAGMA token lives in directive_result, which will
> -              be overwritten on the next directive.  */
> -           *newtok = *token;
> -           token = newtok;
> -           do
> -             {
> -               if (*pragma_buff == NULL
> -                   || BUFF_ROOM (*pragma_buff) < sizeof (cpp_token *))
> -                 {
> -                   _cpp_buff *next;
> -                   if (*pragma_buff == NULL)
> -                     *pragma_buff
> -                       = _cpp_get_buff (pfile, 32 * sizeof (cpp_token *));
> -                   else
> -                     {
> -                       next = *pragma_buff;
> -                       *pragma_buff
> -                         = _cpp_get_buff (pfile,
> -                                          (BUFF_FRONT (*pragma_buff)
> -                                           - (*pragma_buff)->base) * 2);
> -                       (*pragma_buff)->next = next;
> -                     }
> -                 }
> -               *(const cpp_token **) BUFF_FRONT (*pragma_buff) = token;
> -               BUFF_FRONT (*pragma_buff) += sizeof (cpp_token *);
> -               if (token->type == CPP_PRAGMA_EOL)
> -                 break;
> -               token = cpp_get_token_1 (pfile, &virt_loc);
> -             }
> -           while (token->type != CPP_EOF);
> -
> -           /* In deferred pragmas parsing_args and prevent_expansion
> -              had been changed, reset it.  */
> -           pfile->state.parsing_args = 2;
> -           pfile->state.prevent_expansion = 1;
> +           /* Get last token after the pragma. */
> +           token = handle_pragma (pfile, token, pragma_buff);

... belongs to a different patch, which purpose would be to "cleanup"
this code, as opposed to the purpose of the current patch which is to
use obstack for virtual location tracking code.

[...]

>  /* Reads and returns the arguments to a function-like macro
>     invocation.  Assumes the opening parenthesis has been processed.
>     If there is an error, emits an appropriate diagnostic and returns
> @@ -757,7 +806,6 @@ collect_args (cpp_reader *pfile, const c
>    const cpp_token *token;
>    unsigned int argc;
>    source_location virt_loc;
> -  bool track_macro_expansion_p = CPP_OPTION (pfile, track_macro_expansion);
>    unsigned num_args_alloced = 0;
>  
>    macro = node->value.macro;
> @@ -769,6 +817,8 @@ collect_args (cpp_reader *pfile, const c
>  #define DEFAULT_NUM_TOKENS_PER_MACRO_ARG 50
>  #define ARG_TOKENS_EXTENT 1000
>  
> +  /* TODO maybe macro_args should be stored in a separate obstack growing
> +     little by little? */

Does the memory (de)allocation macro_args show up in your profile?

>    buff = _cpp_get_buff (pfile, argc * (DEFAULT_NUM_TOKENS_PER_MACRO_ARG
>                                      * sizeof (cpp_token *)
>                                      + sizeof (macro_arg)));
> @@ -781,23 +831,17 @@ collect_args (cpp_reader *pfile, const c
>    /* Collect the tokens making up each argument.  We don't yet know
>       how many arguments have been supplied, whether too many or too
>       few.  Hence the slightly bizarre usage of "argc" and "arg".  */
> -  do
> +
> +  do                         /* for each macro argument */
>      {

I believe there should be a period at the end of the comment, followed
by two spaces before the "*/".  That is the GCC style of commenting.

[...]

> -      for (;;)
> +      for (;;)                       /* for each token in argument */

Likewise, commenting style.

[...]

>             if (token->type == CPP_EOF)
>               break;
> -           else
> -             continue;
> +           continue;

I think this change is unnecessary.

>           }
> -       set_arg_token (arg, token, virt_loc,
> -                      ntokens, MACRO_ARG_TOKEN_NORMAL,
> -                      CPP_OPTION (pfile, track_macro_expansion));
> +
> +       /* TODO grow an obstack for tokens here? */

Same question as earlier.  Did the (de)allocation appear in your
profile?

> +       arg->first[ntokens] = token;
> +       if (CPP_OPTION (pfile, track_macro_expansion))
> +         XOBGROW (&pfile->arg_locs_ob, source_location, &virt_loc);
> +
>         ntokens++;
>       }
>  
>        /* Drop trailing padding.  */
>        while (ntokens > 0 && arg->first[ntokens - 1]->type == CPP_PADDING)
> +      {

The '{' should have two space indentation wrt the column of the
'while' keyword.

>       ntokens--;
> +     if (CPP_OPTION (pfile, track_macro_expansion))
> +       XOBSHRINK (&pfile->arg_locs_ob, source_location);
> +      }

Likewise, indentation of the '}'.

[...]

> +       /* At this point the array of virt_locs is finished. */

Two spaces after dot.

[...]

>        if (macro->fun_like)
>       {
> +       /* Mark where we start allocating so that we can free with the
> +          simple call to obstack_release() later.
> +
> +          It's not enough just to do obstack_alloc/free() because in
> +          recursive calls of enter_macro_context() happen through
> +          expand_arg()->_cpp_get_token_1() and *growing* objects may be
> +          interrupted but may need to continue growing after returning. */

Two spaces after dot.

Also, I would amend the comment to make it clear that from here on,
subroutines like funlike_invocation_p, expand_arg, collect_args, etc,
are going to allocate memory for the expansion of macros that are
present in the arguments of the current function-like macro we are
dealing with.

That memory is in a obstack that is global to all function-like
macros.  So we need to mark where the obstack memory for the arguments
of *this* function-like macro starts, so that we can later release
that memory only, and avoid releasing/corrupting memory of the
arguments of other intermixed function-like macro calls.

[...]

> +       void *expanded_locs_marker = obstack_mark (&pfile->exp_locs_ob);
> +       void *arg_locs_marker = obstack_mark (&pfile->arg_locs_ob);
> +       void *expanded_marker = obstack_mark (&pfile->exp_ob);
> +
>         _cpp_buff *buff;
>         unsigned num_args = 0;
>  
> @@ -1069,11 +1093,29 @@ enter_macro_context (cpp_reader *pfile,
>         pfile->keep_tokens--;
>         pfile->state.prevent_expansion--;
>  
> +       if (buff != NULL)
> +         {
> +           if (macro->paramc > 0)
> +             replace_args (pfile, node, macro,
> +                           (macro_arg *) buff->base,
> +                           location);
> +           _cpp_free_buff (buff);
> +         }
> +
> +       /* Free everything that was allocated in these obstacks after the
> +          obstack_mark() operation. */

Two spaces after the dot.

[...]

> +       /* If no opening parenthesis was found by funlike_invocation_p() or
> +          if an error happened when it called collect_args(). */

Likewise.

>         if (buff == NULL)
>           {
>             if (CPP_WTRADITIONAL (pfile) && ! node->value.macro->syshdr)
>               cpp_warning (pfile, CPP_W_TRADITIONAL,
> - "function-like macro \"%s\" must be used with arguments in traditional C",
> +                          "function-like macro \"%s\" must be used "
> +                          "with arguments in traditional C",
>                            NODE_NAME (node));

This change should go in a separate cleanup patch.

[...]

>  /* Get the pointer to the location of the argument token of the
>     function-like macro argument ARG.  This function must be called
>     only when we -ftrack-macro-expansion is on.  */
> @@ -1289,11 +1250,11 @@ arg_token_ptr_at (const macro_arg *arg,
>      case MACRO_ARG_TOKEN_NORMAL:
>        tokens_ptr = arg->first;
>        break;
> -    case MACRO_ARG_TOKEN_STRINGIFIED:      
> +    case MACRO_ARG_TOKEN_STRINGIFIED:

Unnecessary white space change.

>        tokens_ptr = (const cpp_token **) &arg->stringified;
>        break;
>      case MACRO_ARG_TOKEN_EXPANDED:
> -     tokens_ptr = arg->expanded;
> +      tokens_ptr = arg->expanded;
>        break;
>      }
>  
> @@ -1407,12 +1368,12 @@ macro_arg_token_iter_get_location (const
>     want each tokens resulting from function-like macro arguments
>     expansion to have a different location or not.
>  
> -   E.g, consider this function-like macro: 
> +   E.g, consider this function-like macro:

Likewise.

>          #define M(x) x - 3
>  
>     Then consider us "calling" it (and thus expanding it) like:
> -   
> +

Likewise.

>         M(1+4)
>  
>     It will be expanded into:
> @@ -1526,7 +1487,7 @@ replace_args (cpp_reader *pfile, cpp_has
>       location that records many things like the locus of the expansion
>       point as well as the original locus inside the definition of the
>       macro.  This location is called a virtual location.
> -     
> +

Likewise.

>       So the buffer BUFF holds a set of cpp_token*, and the buffer
>       VIRT_LOCS holds the virtual locations of the tokens held by BUFF.
>  
> @@ -1534,7 +1495,7 @@ replace_args (cpp_reader *pfile, cpp_has
>       context, when the latter is pushed.  The memory allocated to
>       store the tokens and their locations is going to be freed once
>       the context of macro expansion is popped.
> -     
> +

Likewise.

>       As far as tokens are concerned, the memory overhead of
>       -ftrack-macro-expansion is proportional to the number of
>       macros that get expanded multiplied by sizeof (source_location).
> @@ -1548,6 +1509,7 @@ replace_args (cpp_reader *pfile, cpp_has
>       the macro expansion, copy the tokens and replace the arguments.
>       This memory must be freed when the context of the macro MACRO is
>       popped.  */
> +  /* TODO why is total 0 sometimes? Should we return? */

Hmmh.  Maybe it's when a function-like macro has an empty
replacement-list?  Even in that case, I don't think we should return.
It's important to keep the information "we are expanding this macro to
nothing".  So we need to push a zero-length macro expansion context.

>    buff = tokens_buff_new (pfile, total, track_macro_exp ? &virt_locs : NULL);

[...]

> @@ -1811,6 +1773,7 @@ next_context (cpp_reader *pfile)
>  
>    if (result == 0)
>      {
> +      /* TODO obstack of contexts in cpp_reader? Too much to change? */

I am afraid that will increase peak memory usage quite a bit.  The
management of the context buffers' memory didn't involve freeing
contexts.  I introduced the memory freeing for these (after measuring
and seeing that the peak memory usage for contexts was high).

But if you are brave enough and maintainers think it could be
worthwhile, maybe you could try and measure both speed and space to
see.

[...]

> @@ -1984,7 +1947,7 @@ tokens_buff_put_token_to (const cpp_toke
>                         source_location *virt_loc_dest,
>                         const cpp_token *token,
>                         source_location virt_loc,
> -                       source_location parm_def_loc,                   
> +                       source_location parm_def_loc,

Unnecessary white space change.

>                         const struct line_map *map,
>                         unsigned int macro_token_index)
>  {
> @@ -2035,7 +1998,7 @@ tokens_buff_add_token (_cpp_buff *buffer
>  {
>    const cpp_token **result;
>    source_location *virt_loc_dest = NULL;
> -  unsigned token_index = 
> +  unsigned token_index =

Likewise.

[...]

> @@ -2280,7 +2199,7 @@ consume_next_token_from_context (cpp_rea
>        *location = (*token)->src_loc;
>        FIRST (c).token++;
>      }
> -  else if ((c)->tokens_kind == TOKENS_KIND_INDIRECT)         
> +  else if ((c)->tokens_kind == TOKENS_KIND_INDIRECT)

Unnecessary white space change.

>      {
>        *token = *FIRST (c).ptoken;
>        *location = (*token)->src_loc;
> @@ -2370,7 +2289,7 @@ cpp_get_token_1 (cpp_reader *pfile, sour
>             goto out;
>           }
>       }
> -      else
> +      else                   /* end of context */

GCC Comment style.

>       {
>         if (pfile->context->c.macro)
>           ++num_expanded_macros_counter;
> @@ -2433,7 +2352,7 @@ cpp_get_token_1 (cpp_reader *pfile, sour
>               }
>           }
>         else
> -         ret = enter_macro_context (pfile, node, result, 
> +         ret = enter_macro_context (pfile, node, result,

Unnecessary white space change.

[...]


> === modified file 'include/obstack.h'
> --- include/obstack.h 2011-10-22 01:35:29 +0000
> +++ include/obstack.h 2012-07-07 16:04:01 +0000

Missing ChangeLog

[...]

> +static inline void obstack_release (struct obstack *o, void *p)
> +{

Missing comment for this function.

> +  void *old_base = *(void **) p;
> +  obstack_free (o, p);
> +  o->object_base = (char *) old_base;
> +}
> +
> +
>  #ifdef __cplusplus
>  }    /* C++ */
>  #endif

Cheers.


-- 
                Dodji

Re: PR #53525 - track-macro-expansion performance regression

Reply via email to