Georg-Johann Lay <a...@gjlay.de> writes:
> So I had that situation where in an insn attribute, providing
> a block of code (rather than just an expression) would be
> useful.
>
> Expressions can provided by means of symbol_ref, like in
>
> (set (attr "length")
>       (symbol_ref ("1 + GET_MODE_SIZE (<MODE>mode)")))
>
> However providing a block of code gives a syntax error from
> the compiler, *NOT* from md_reader:
>
> (set (attr "length")
>       (symbol_ref
>        {
>          int len = 1;
>          return len;
>        }))
>
> This means such syntax is already supported to some degree,
> there's just no semantics assigned to such code.
>
> Blocks of code are already supported in insn predicates,
> like in
>
> (define_predicate "my_operand"
>    (match_code "code_label,label_ref,symbol_ref,plus,const")
> {
>    some code...
>    return true-or-false;
> })
>
> In the insn attribute case, I hacked a bit and supported
> blocks of code like in the example above.  The biggest change
> is that class attr_desc has to be moved from genattrtab.cc to
> read-md.h so that it is a complete type as required by
> md_reader::fprint_c_condition().
>
> That method prints to code for symbol_ref and some others, and
> it has to know the type of the attribute, like "int" for the
> "length" attribute.  The implementation in fprint_c_condition()
> is straight forward:
>
> When cond (which is the payload string of symbol_ref, including the
> '{}'s) starts with '{', the print a lambda that's called in place,
> like in
>
>     print "( [&]() -> <return_type> <cond> () )"
>
> The "&" capture is required so that variables like "insn" are
> accessible. "operands[]" and "which_alternative" are global,
> thus also accessible.
>
> Attached is the code I have so far (which is by no means a
> proposed patch, so I am posting here on gcc@).
>
> As far as I can tell, there is no performance penalty, e.g.
> in build times, when the feature is not used.  Of course instead
> of such syntax, a custom function could be used, or the
> braces-brackets-parentheses-gibberish could be written out
> in the symbol_ref as an expression.  Though I think this
> could be a nice addition, in particular because the scanning
> side in md_reader already supports the syntax.

Looks good to me.  I know you said it wasn't a patch submission,
but it looks mostly ready to go.  Some comments below:

> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 7f4335e0aac..3e46693e8c2 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -10265,6 +10265,56 @@ so there is no need to explicitly convert the 
> expression into a boolean
>  (match_test "(x & 2) != 0")
>  @end smallexample
>  
> +@cindex @code{symbol_ref} and attributes
> +@item (symbol_ref "@var{quoted-c-expr}")
> +
> +Specifies the value of the attribute sub-expression as a C expression,
> +where the surrounding quotes are not part of the expression.
> +Similar to @code{match_test}, variables @var{insn}, @var{operands[]}
> +and @var{which_alternative} are available.  Moreover, code and mode
> +attributes can be used to compose the resulting C expression, like in
> +
> +@smallexample
> +(set (attr "length")
> +     (symbol_ref ("1 + GET_MODE_SIZE (<MODE>mode)")))
> +@end smallexample
> +
> +where the according insn has exactly one mode iterator.
> +See @ref{Mode Iterators} and @ref{Code Iterators}.

I got the impression s/See @ref/@xref/ was recommended for sentence
references.

> +
> +@item  (symbol_ref "@{ @var{quoted-c-code} @}")
> +@itemx (symbol_ref @{ @var{c-code} @})
> +
> +The value of this subexpression is determined by running a block
> +of C code which returns the desired value.
> +The braces are part of the code, whereas the quotes in the quoted form are 
> not.
> +
> +This variant of @code{symbol_ref} allows for more comlpex code than
> +just a single C expression, like for example:
> +
> +@smallexample
> +(set (attr "length")
> +     (symbol_ref
> +      @{
> +        int len;
> +        some_function (insn, <CODE>, <MODE>mode, & len);
> +        return len;
> +      @}))
> +@end smallexample
> +
> +for an insn that has one code iterator and one mode iterator.
> +Again, variables @var{insn}, @var{operands[]} and @var{which_alternative}
> +can be used.  The unquoted form only supports a subset of C,
> +for example no C comments are supported, and strings that contain
> +characters like @samp{@}} are problematic and may need to be escaped
> +as @samp{\@}}.

By unquoted form, do you mean (symbol_ref { ... })?  I'd have expected
that to be better than "{ ... }" (or at least, I thought that was the
intention when { ... } was added).  I was going to suggest not documenting
the "{ ... }" form until I saw this.

> +
> +The return type is @code{int} for the @var{length} attribute, and
> +@code{enum attr_@var{name}} for an insn attribute named @var{name}.
> +The types and available enum values can be looked up in
> +@file{$builddir/gcc/insn-attr-common.h}.
> +
> +
>  @cindex @code{le} and attributes
>  @cindex @code{leu} and attributes
>  @cindex @code{lt} and attributes
> diff --git a/gcc/genattrtab.cc b/gcc/genattrtab.cc
> index 03c7d6c74a3..20d45ba88af 100644
> --- a/gcc/genattrtab.cc
> +++ b/gcc/genattrtab.cc
> @@ -168,22 +168,6 @@ struct attr_value
>    int has_asm_insn;          /* True if this value used for `asm' insns */
>  };
>  
> -/* Structure for each attribute.  */
> -
> -class attr_desc
> -{
> -public:
> -  char *name;                        /* Name of attribute.  */
> -  const char *enum_name;     /* Enum name for DEFINE_ENUM_NAME.  */
> -  class attr_desc *next;     /* Next attribute.  */
> -  struct attr_value *first_value; /* First value of this attribute.  */
> -  struct attr_value *default_val; /* Default value for this attribute.  */
> -  file_location loc;         /* Where in the .md files it occurs.  */
> -  unsigned is_numeric        : 1;    /* Values of this attribute are 
> numeric.  */
> -  unsigned is_const  : 1;    /* Attribute value constant for each run.  */
> -  unsigned is_special        : 1;    /* Don't call `write_attr_set'.  */
> -};
> -
>  /* Structure for each DEFINE_DELAY.  */
>  
>  class delay_desc
> @@ -3368,7 +3352,7 @@ write_test_expr (FILE *outf, rtx exp, unsigned int 
> attrs_cached, int flags,
>  {
>    int comparison_operator = 0;
>    RTX_CODE code;
> -  class attr_desc *attr;
> +  class attr_desc *attr = nullptr;
>  
>    if (emit_parens)
>      fprintf (outf, "(");
> @@ -3708,7 +3692,7 @@ write_test_expr (FILE *outf, rtx exp, unsigned int 
> attrs_cached, int flags,
>  
>      /* A random C expression.  */
>      case SYMBOL_REF:
> -      rtx_reader_ptr->fprint_c_condition (outf, XSTR (exp, 0));
> +      rtx_reader_ptr->fprint_c_condition (outf, XSTR (exp, 0), attr);
>        break;
>  
>      /* The address of the branch target.  */
> @@ -4052,12 +4036,7 @@ write_attr_get (FILE *outf, class attr_desc *attr)
>  
>    /* Write out start of function, then all values with explicit `case' lines,
>       then a `default', then the value with the most uses.  */
> -  if (attr->enum_name)
> -    fprintf (outf, "enum %s\n", attr->enum_name);
> -  else if (!attr->is_numeric)
> -    fprintf (outf, "enum attr_%s\n", attr->name);
> -  else
> -    fprintf (outf, "int\n");
> +  attr->fprint_type (outf, "\n");
>  
>    /* If the attribute name starts with a star, the remainder is the name of
>       the subroutine to use, instead of `get_attr_...'.  */
> @@ -4389,7 +4368,7 @@ write_attr_value (FILE *outf, class attr_desc *attr, 
> rtx value)
>        break;
>  
>      case SYMBOL_REF:
> -      rtx_reader_ptr->fprint_c_condition (outf, XSTR (value, 0));
> +      rtx_reader_ptr->fprint_c_condition (outf, XSTR (value, 0), attr);
>        break;
>  
>      case ATTR:
> diff --git a/gcc/read-md.cc b/gcc/read-md.cc
> index 93d1ea43781..a0025575cb1 100644
> --- a/gcc/read-md.cc
> +++ b/gcc/read-md.cc
> @@ -170,31 +170,62 @@ md_reader::join_c_conditions (const char *cond1, const 
> char *cond2)
>     directive for COND if its original file position is known.  */
>  
>  void
> -md_reader::fprint_c_condition (FILE *outf, const char *cond)
> +md_reader::fprint_c_condition (FILE *outf, const char *cond,
> +                            const attr_desc *attr)
>  {
>    const char **halves = (const char **) htab_find (m_joined_conditions, 
> &cond);
>    if (halves != 0)
>      {
>        fprintf (outf, "(");
> -      fprint_c_condition (outf, halves[1]);
> +      fprint_c_condition (outf, halves[1], attr);
>        fprintf (outf, " && ");
> -      fprint_c_condition (outf, halves[2]);
> +      fprint_c_condition (outf, halves[2], attr);
>        fprintf (outf, ")");
>      }
>    else
>      {
>        fputc ('\n', outf);
>        fprint_md_ptr_loc (outf, cond);
> -      fprintf (outf, "(%s)", cond);
> +      if (cond[0] == '{')
> +     {
> +       const struct ptr_loc *loc = get_md_ptr_loc (cond);
> +
> +       if (! attr)
> +         {
> +           error_at (loc->loc, "TODO md_reader::fprint_c_condition: "
> +                     "const attr_desc *attr is nullptr");
> +           exit (11);
> +         }

I think we should just skip the explicit return type if we don't
know what it is.  Simple cases will still work, and we should get
a reasonably sensible error from the compiler for other cases.
(And when we do, we can patch the caller to pass an appropriate type.)
> +
> +       // Like "( [&]() -> <return_type> <cond> () )"
> +       // Where COND is actually the body of a function, including the
> +       // outer {}'s.  Print this as a lambda that's evaluated in place.
> +       // The capture-all is required to have access to variables
> +       // like "insn".  Objects like "operands[]" and "which_alternative"
> +       // are accessible since they are global.
> +       fprintf (outf, "( [&]() -> ");
> +       attr->fprint_type (outf, " ");
> +       fprintf (outf, "%s () )", cond);
> +
> +       if (! strstr (cond, "return"))
> +         {
> +           error_at (loc->loc, "error: function body needs at least one"
> +                     " 'return' statement");
> +           error_at (loc->loc, "this is the body:\n%s\n", cond);
> +           exit (12);
> +         }

IMO it'd be better to leave the compiler to enforce this.  There are
many other ways in which the lambda body could be malformed.

> +     }
> +      else
> +     fprintf (outf, "(%s)", cond);
>      }
>  }
>  
>  /* Special fprint_c_condition for writing to STDOUT.  */
>  
>  void
> -md_reader::print_c_condition (const char *cond)
> +md_reader::print_c_condition (const char *cond, const attr_desc *desc)
>  {
> -  fprint_c_condition (stdout, cond);
> +  fprint_c_condition (stdout, cond, desc);
>  }
>  
>  /* A vfprintf-like function for reporting an error against line LINENO
> diff --git a/gcc/read-md.h b/gcc/read-md.h
> index 9703551a8fd..ae10b651de1 100644
> --- a/gcc/read-md.h
> +++ b/gcc/read-md.h
> @@ -132,6 +132,38 @@ struct overloaded_name {
>    overloaded_instance **next_instance_ptr;
>  };
>  
> +/* Structure for each attribute.  */
> +
> +struct attr_value;
> +
> +class attr_desc
> +{
> +public:
> +  char *name;                        /* Name of attribute.  */
> +  const char *enum_name;     /* Enum name for DEFINE_ENUM_NAME.  */
> +  class attr_desc *next;     /* Next attribute.  */
> +  struct attr_value *first_value; /* First value of this attribute.  */
> +  struct attr_value *default_val; /* Default value for this attribute.  */
> +  file_location loc;         /* Where in the .md files it occurs.  */
> +  unsigned is_numeric        : 1;    /* Values of this attribute are 
> numeric.  */
> +  unsigned is_const  : 1;    /* Attribute value constant for each run.  */
> +  unsigned is_special        : 1;    /* Don't call `write_attr_set'.  */
> +
> +  // Print the return type for functions like get_attr_<attribute-name>
> +  // to stream OUTF, followed by SUFFIX which should be white-space(s).
> +  void fprint_type (FILE *outf, const char *suffix) const
> +  {
> +    if (enum_name)
> +      fprintf (outf, "enum %s", enum_name);
> +    else if (! is_numeric)
> +      fprintf (outf, "enum attr_%s", name);
> +    else
> +      fprintf (outf, "int");
> +
> +    fprintf (outf, "%s", suffix);

It shouldn't be necessary to emit the enum tag these days.  If removing
it causes anything to break, I think we should fix whatever that breaking
thing is.  Could you try doing that, as a pre-patch?  Or I can give it a
go, if you'd rather not.

If we do that, then we can just a return a const char * for the type.
And then in turn we can pass a const char * to (f)print_c_condition.
The MD reader then wouldn't need to know about attributes.

Thanks,
Richard

> +  }
> +};
> +
>  struct mapping;
>  
>  /* A class for reading .md files and RTL dump files.
> @@ -204,8 +236,8 @@ class md_reader
>    void handle_enum (file_location loc, bool md_p);
>  
>    const char *join_c_conditions (const char *cond1, const char *cond2);
> -  void fprint_c_condition (FILE *outf, const char *cond);
> -  void print_c_condition (const char *cond);
> +  void fprint_c_condition (FILE *outf, const char *cond, const attr_desc * = 
> nullptr);
> +  void print_c_condition (const char *cond, const attr_desc * = nullptr);
>  
>    /* Defined in read-rtl.cc.  */
>    const char *apply_iterator_to_string (const char *string);

Reply via email to