Am 13.07.24 um 13:44 schrieb Richard Sandiford:
Georg-Johann Lay <a...@gjlay.de> writes:
So I had that situation where in an insn attribute, providing
a block of code (rather than just an expression) would be
useful.

Expressions can provided by means of symbol_ref, like in

(set (attr "length")
     (symbol_ref ("1 + GET_MODE_SIZE (<MODE>mode)")))

However providing a block of code gives a syntax error from
the compiler, *NOT* from md_reader:

(set (attr "length")
       (symbol_ref
        {
          int len = 1;
          return len;
        }))

This means such syntax is already supported to some degree,
there's just no semantics assigned to such code.

Blocks of code are already supported in insn predicates,
like in

(define_predicate "my_operand"
    (match_code "code_label,label_ref,symbol_ref,plus,const")
{
    some code...
    return true-or-false;
})

In the insn attribute case, I hacked a bit and supported
blocks of code like in the example above.  The biggest change
is that class attr_desc has to be moved from genattrtab.cc to
read-md.h so that it is a complete type as required by
md_reader::fprint_c_condition().

That method prints to code for symbol_ref and some others, and
it has to know the type of the attribute, like "int" for the
"length" attribute.  The implementation in fprint_c_condition()
is straight forward:

When cond (which is the payload string of symbol_ref, including the
'{}'s) starts with '{', the print a lambda that's called in place,
like in

     print "( [&]() -> <return_type> <cond> () )"

The "&" capture is required so that variables like "insn" are
accessible. "operands[]" and "which_alternative" are global,
thus also accessible.

Attached is the code I have so far (which is by no means a
proposed patch, so I am posting here on gcc@).

As far as I can tell, there is no performance penalty, e.g.
in build times, when the feature is not used.  Of course instead
of such syntax, a custom function could be used, or the
braces-brackets-parentheses-gibberish could be written out
in the symbol_ref as an expression.  Though I think this
could be a nice addition, in particular because the scanning
side in md_reader already supports the syntax.

Looks good to me.  I know you said it wasn't a patch submission,
but it looks mostly ready to go.  Some comments below:

diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 7f4335e0aac..3e46693e8c2 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -10265,6 +10265,56 @@ so there is no need to explicitly convert the 
expression into a boolean
  (match_test "(x & 2) != 0")
  @end smallexample
+@cindex @code{symbol_ref} and attributes
+@item (symbol_ref "@var{quoted-c-expr}")
+
+Specifies the value of the attribute sub-expression as a C expression,
+where the surrounding quotes are not part of the expression.
+Similar to @code{match_test}, variables @var{insn}, @var{operands[]}
+and @var{which_alternative} are available.  Moreover, code and mode
+attributes can be used to compose the resulting C expression, like in
+
+@smallexample
+(set (attr "length")
+     (symbol_ref ("1 + GET_MODE_SIZE (<MODE>mode)")))
+@end smallexample
+
+where the according insn has exactly one mode iterator.
+See @ref{Mode Iterators} and @ref{Code Iterators}.

I got the impression s/See @ref/@xref/ was recommended for sentence
references.

+
+@item  (symbol_ref "@{ @var{quoted-c-code} @}")
+@itemx (symbol_ref @{ @var{c-code} @})
+
+The value of this subexpression is determined by running a block
+of C code which returns the desired value.
+The braces are part of the code, whereas the quotes in the quoted form are not.
+
+This variant of @code{symbol_ref} allows for more comlpex code than
+just a single C expression, like for example:
+
+@smallexample
+(set (attr "length")
+     (symbol_ref
+      @{
+        int len;
+        some_function (insn, <CODE>, <MODE>mode, & len);
+        return len;
+      @}))
+@end smallexample
+
+for an insn that has one code iterator and one mode iterator.
+Again, variables @var{insn}, @var{operands[]} and @var{which_alternative}
+can be used.  The unquoted form only supports a subset of C,
+for example no C comments are supported, and strings that contain
+characters like @samp{@}} are problematic and may need to be escaped
+as @samp{\@}}.

By unquoted form, do you mean (symbol_ref { ... })?  I'd have expected
that to be better than "{ ... }" (or at least, I thought that was the
intention when { ... } was added).  I was going to suggest not documenting
the "{ ... }" form until I saw this.

+
+The return type is @code{int} for the @var{length} attribute, and
+@code{enum attr_@var{name}} for an insn attribute named @var{name}.
+The types and available enum values can be looked up in
+@file{$builddir/gcc/insn-attr-common.h}.
+
+
  @cindex @code{le} and attributes
  @cindex @code{leu} and attributes
  @cindex @code{lt} and attributes
diff --git a/gcc/genattrtab.cc b/gcc/genattrtab.cc
index 03c7d6c74a3..20d45ba88af 100644
--- a/gcc/genattrtab.cc
+++ b/gcc/genattrtab.cc
@@ -168,22 +168,6 @@ struct attr_value
    int has_asm_insn;           /* True if this value used for `asm' insns */
  };
-/* Structure for each attribute. */
-
-class attr_desc
-{
-public:
-  char *name;                  /* Name of attribute.  */
-  const char *enum_name;       /* Enum name for DEFINE_ENUM_NAME.  */
-  class attr_desc *next;       /* Next attribute.  */
-  struct attr_value *first_value; /* First value of this attribute.  */
-  struct attr_value *default_val; /* Default value for this attribute.  */
-  file_location loc;           /* Where in the .md files it occurs.  */
-  unsigned is_numeric  : 1;    /* Values of this attribute are numeric.  */
-  unsigned is_const    : 1;    /* Attribute value constant for each run.  */
-  unsigned is_special  : 1;    /* Don't call `write_attr_set'.  */
-};
-
  /* Structure for each DEFINE_DELAY.  */
class delay_desc
@@ -3368,7 +3352,7 @@ write_test_expr (FILE *outf, rtx exp, unsigned int 
attrs_cached, int flags,
  {
    int comparison_operator = 0;
    RTX_CODE code;
-  class attr_desc *attr;
+  class attr_desc *attr = nullptr;
if (emit_parens)
      fprintf (outf, "(");
@@ -3708,7 +3692,7 @@ write_test_expr (FILE *outf, rtx exp, unsigned int 
attrs_cached, int flags,
/* A random C expression. */
      case SYMBOL_REF:
-      rtx_reader_ptr->fprint_c_condition (outf, XSTR (exp, 0));
+      rtx_reader_ptr->fprint_c_condition (outf, XSTR (exp, 0), attr);
        break;
/* The address of the branch target. */
@@ -4052,12 +4036,7 @@ write_attr_get (FILE *outf, class attr_desc *attr)
/* Write out start of function, then all values with explicit `case' lines,
       then a `default', then the value with the most uses.  */
-  if (attr->enum_name)
-    fprintf (outf, "enum %s\n", attr->enum_name);
-  else if (!attr->is_numeric)
-    fprintf (outf, "enum attr_%s\n", attr->name);
-  else
-    fprintf (outf, "int\n");
+  attr->fprint_type (outf, "\n");
/* If the attribute name starts with a star, the remainder is the name of
       the subroutine to use, instead of `get_attr_...'.  */
@@ -4389,7 +4368,7 @@ write_attr_value (FILE *outf, class attr_desc *attr, rtx 
value)
        break;
case SYMBOL_REF:
-      rtx_reader_ptr->fprint_c_condition (outf, XSTR (value, 0));
+      rtx_reader_ptr->fprint_c_condition (outf, XSTR (value, 0), attr);
        break;
case ATTR:
diff --git a/gcc/read-md.cc b/gcc/read-md.cc
index 93d1ea43781..a0025575cb1 100644
--- a/gcc/read-md.cc
+++ b/gcc/read-md.cc
@@ -170,31 +170,62 @@ md_reader::join_c_conditions (const char *cond1, const 
char *cond2)
     directive for COND if its original file position is known.  */
void
-md_reader::fprint_c_condition (FILE *outf, const char *cond)
+md_reader::fprint_c_condition (FILE *outf, const char *cond,
+                              const attr_desc *attr)
  {
    const char **halves = (const char **) htab_find (m_joined_conditions, 
&cond);
    if (halves != 0)
      {
        fprintf (outf, "(");
-      fprint_c_condition (outf, halves[1]);
+      fprint_c_condition (outf, halves[1], attr);
        fprintf (outf, " && ");
-      fprint_c_condition (outf, halves[2]);
+      fprint_c_condition (outf, halves[2], attr);
        fprintf (outf, ")");
      }
    else
      {
        fputc ('\n', outf);
        fprint_md_ptr_loc (outf, cond);
-      fprintf (outf, "(%s)", cond);
+      if (cond[0] == '{')
+       {
+         const struct ptr_loc *loc = get_md_ptr_loc (cond);
+
+         if (! attr)
+           {
+             error_at (loc->loc, "TODO md_reader::fprint_c_condition: "
+                       "const attr_desc *attr is nullptr");
+             exit (11);
+           }

I think we should just skip the explicit return type if we don't
know what it is.  Simple cases will still work, and we should get
a reasonably sensible error from the compiler for other cases.
(And when we do, we can patch the caller to pass an appropriate type.)

Ok.

+
+         // Like "( [&]() -> <return_type> <cond> () )"
+         // Where COND is actually the body of a function, including the
+         // outer {}'s.  Print this as a lambda that's evaluated in place.
+         // The capture-all is required to have access to variables
+         // like "insn".  Objects like "operands[]" and "which_alternative"
+         // are accessible since they are global.
+         fprintf (outf, "( [&]() -> ");
+         attr->fprint_type (outf, " ");
+         fprintf (outf, "%s () )", cond);
+
+         if (! strstr (cond, "return"))
+           {
+             error_at (loc->loc, "error: function body needs at least one"
+                       " 'return' statement");
+             error_at (loc->loc, "this is the body:\n%s\n", cond);
+             exit (12);
+           }

IMO it'd be better to leave the compiler to enforce this.  There are
many other ways in which the lambda body could be malformed.

+       }
+      else
+       fprintf (outf, "(%s)", cond);
      }
  }
/* Special fprint_c_condition for writing to STDOUT. */ void
-md_reader::print_c_condition (const char *cond)
+md_reader::print_c_condition (const char *cond, const attr_desc *desc)
  {
-  fprint_c_condition (stdout, cond);
+  fprint_c_condition (stdout, cond, desc);
  }
/* A vfprintf-like function for reporting an error against line LINENO
diff --git a/gcc/read-md.h b/gcc/read-md.h
index 9703551a8fd..ae10b651de1 100644
--- a/gcc/read-md.h
+++ b/gcc/read-md.h
@@ -132,6 +132,38 @@ struct overloaded_name {
    overloaded_instance **next_instance_ptr;
  };
+/* Structure for each attribute. */
+
+struct attr_value;
+
+class attr_desc
+{
+public:
+  char *name;                  /* Name of attribute.  */
+  const char *enum_name;       /* Enum name for DEFINE_ENUM_NAME.  */
+  class attr_desc *next;       /* Next attribute.  */
+  struct attr_value *first_value; /* First value of this attribute.  */
+  struct attr_value *default_val; /* Default value for this attribute.  */
+  file_location loc;           /* Where in the .md files it occurs.  */
+  unsigned is_numeric  : 1;    /* Values of this attribute are numeric.  */
+  unsigned is_const    : 1;    /* Attribute value constant for each run.  */
+  unsigned is_special  : 1;    /* Don't call `write_attr_set'.  */
+
+  // Print the return type for functions like get_attr_<attribute-name>
+  // to stream OUTF, followed by SUFFIX which should be white-space(s).
+  void fprint_type (FILE *outf, const char *suffix) const
+  {
+    if (enum_name)
+      fprintf (outf, "enum %s", enum_name);
+    else if (! is_numeric)
+      fprintf (outf, "enum attr_%s", name);
+    else
+      fprintf (outf, "int");
+
+    fprintf (outf, "%s", suffix);

It shouldn't be necessary to emit the enum tag these days.  If removing

Hi Richard,

I am not familiar with the gensupport policies, which is the reason why
the feature is just a suggestion / proposal and not a patch.
IMO patches should not come from someone like me who has no experience
in that area; better someone more experienced would take it over.

it causes anything to break, I think we should fix whatever that breaking
thing is.  Could you try doing that, as a pre-patch?  Or I can give it a
go, if you'd rather not.

Yes please.

If we do that, then we can just a return a const char * for the type.

Yes, const char* would be easier. I just didn't know how to alloc one,
and where.  A new const char* property in class attr_desc_would solve
it.

And then in turn we can pass a const char * to (f)print_c_condition.
The MD reader then wouldn't need to know about attributes.

Thanks,
Richard

When this feature makes it into GCC, then match_test should behave
similar, I guess?  I.e. support function bodies that return bool.
I just wasn't sure which caller of fprint_c_condition runs with
match_test resp. symbol_ref from which context (insn attribute or
predicate, etc).

Thanks for looking into this and for considering it as an extension.

The shortcomings like non-support of pathological comments like
/* } */ is probably not such a big issue. And fixing it would have
to touch the md scanner / lexer and have side effects I don't know,
like on build performance and stability of course.  That part could
be fixed when someone actually needs it.

Thanks,

Johann

+  }
+};
+
  struct mapping;
/* A class for reading .md files and RTL dump files.
@@ -204,8 +236,8 @@ class md_reader
    void handle_enum (file_location loc, bool md_p);
const char *join_c_conditions (const char *cond1, const char *cond2);
-  void fprint_c_condition (FILE *outf, const char *cond);
-  void print_c_condition (const char *cond);
+  void fprint_c_condition (FILE *outf, const char *cond, const attr_desc * = 
nullptr);
+  void print_c_condition (const char *cond, const attr_desc * = nullptr);
/* Defined in read-rtl.cc. */
    const char *apply_iterator_to_string (const char *string);

Reply via email to