Am 13.07.24 um 13:44 schrieb Richard Sandiford:
Georg-Johann Lay <[email protected]> writes:So I had that situation where in an insn attribute, providing a block of code (rather than just an expression) would be useful.Expressions can provided by means of symbol_ref, like in (set (attr "length") (symbol_ref ("1 + GET_MODE_SIZE (<MODE>mode)"))) However providing a block of code gives a syntax error from the compiler, *NOT* from md_reader: (set (attr "length") (symbol_ref { int len = 1; return len; })) This means such syntax is already supported to some degree, there's just no semantics assigned to such code. Blocks of code are already supported in insn predicates, like in (define_predicate "my_operand" (match_code "code_label,label_ref,symbol_ref,plus,const") { some code... return true-or-false; }) In the insn attribute case, I hacked a bit and supported blocks of code like in the example above. The biggest change is that class attr_desc has to be moved from genattrtab.cc to read-md.h so that it is a complete type as required by md_reader::fprint_c_condition(). That method prints to code for symbol_ref and some others, and it has to know the type of the attribute, like "int" for the "length" attribute. The implementation in fprint_c_condition() is straight forward: When cond (which is the payload string of symbol_ref, including the '{}'s) starts with '{', the print a lambda that's called in place, like in print "( [&]() -> <return_type> <cond> () )" The "&" capture is required so that variables like "insn" are accessible. "operands[]" and "which_alternative" are global, thus also accessible. Attached is the code I have so far (which is by no means a proposed patch, so I am posting here on gcc@). As far as I can tell, there is no performance penalty, e.g. in build times, when the feature is not used. Of course instead of such syntax, a custom function could be used, or the braces-brackets-parentheses-gibberish could be written out in the symbol_ref as an expression. Though I think this could be a nice addition, in particular because the scanning side in md_reader already supports the syntax.Looks good to me. I know you said it wasn't a patch submission, but it looks mostly ready to go. Some comments below:diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 7f4335e0aac..3e46693e8c2 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -10265,6 +10265,56 @@ so there is no need to explicitly convert the expression into a boolean (match_test "(x & 2) != 0") @end smallexample+@cindex @code{symbol_ref} and attributes+@item (symbol_ref "@var{quoted-c-expr}") + +Specifies the value of the attribute sub-expression as a C expression, +where the surrounding quotes are not part of the expression. +Similar to @code{match_test}, variables @var{insn}, @var{operands[]} +and @var{which_alternative} are available. Moreover, code and mode +attributes can be used to compose the resulting C expression, like in + +@smallexample +(set (attr "length") + (symbol_ref ("1 + GET_MODE_SIZE (<MODE>mode)"))) +@end smallexample + +where the according insn has exactly one mode iterator. +See @ref{Mode Iterators} and @ref{Code Iterators}.I got the impression s/See @ref/@xref/ was recommended for sentence references.+ +@item (symbol_ref "@{ @var{quoted-c-code} @}") +@itemx (symbol_ref @{ @var{c-code} @}) + +The value of this subexpression is determined by running a block +of C code which returns the desired value. +The braces are part of the code, whereas the quotes in the quoted form are not. + +This variant of @code{symbol_ref} allows for more comlpex code than +just a single C expression, like for example: + +@smallexample +(set (attr "length") + (symbol_ref + @{ + int len; + some_function (insn, <CODE>, <MODE>mode, & len); + return len; + @})) +@end smallexample + +for an insn that has one code iterator and one mode iterator. +Again, variables @var{insn}, @var{operands[]} and @var{which_alternative} +can be used. The unquoted form only supports a subset of C, +for example no C comments are supported, and strings that contain +characters like @samp{@}} are problematic and may need to be escaped +as @samp{\@}}.By unquoted form, do you mean (symbol_ref { ... })? I'd have expected that to be better than "{ ... }" (or at least, I thought that was the intention when { ... } was added). I was going to suggest not documenting the "{ ... }" form until I saw this.+ +The return type is @code{int} for the @var{length} attribute, and +@code{enum attr_@var{name}} for an insn attribute named @var{name}. +The types and available enum values can be looked up in +@file{$builddir/gcc/insn-attr-common.h}. + + @cindex @code{le} and attributes @cindex @code{leu} and attributes @cindex @code{lt} and attributes diff --git a/gcc/genattrtab.cc b/gcc/genattrtab.cc index 03c7d6c74a3..20d45ba88af 100644 --- a/gcc/genattrtab.cc +++ b/gcc/genattrtab.cc @@ -168,22 +168,6 @@ struct attr_value int has_asm_insn; /* True if this value used for `asm' insns */ };-/* Structure for each attribute. */- -class attr_desc -{ -public: - char *name; /* Name of attribute. */ - const char *enum_name; /* Enum name for DEFINE_ENUM_NAME. */ - class attr_desc *next; /* Next attribute. */ - struct attr_value *first_value; /* First value of this attribute. */ - struct attr_value *default_val; /* Default value for this attribute. */ - file_location loc; /* Where in the .md files it occurs. */ - unsigned is_numeric : 1; /* Values of this attribute are numeric. */ - unsigned is_const : 1; /* Attribute value constant for each run. */ - unsigned is_special : 1; /* Don't call `write_attr_set'. */ -}; - /* Structure for each DEFINE_DELAY. */class delay_desc@@ -3368,7 +3352,7 @@ write_test_expr (FILE *outf, rtx exp, unsigned int attrs_cached, int flags, { int comparison_operator = 0; RTX_CODE code; - class attr_desc *attr; + class attr_desc *attr = nullptr;if (emit_parens)fprintf (outf, "("); @@ -3708,7 +3692,7 @@ write_test_expr (FILE *outf, rtx exp, unsigned int attrs_cached, int flags,/* A random C expression. */case SYMBOL_REF: - rtx_reader_ptr->fprint_c_condition (outf, XSTR (exp, 0)); + rtx_reader_ptr->fprint_c_condition (outf, XSTR (exp, 0), attr); break;/* The address of the branch target. */@@ -4052,12 +4036,7 @@ write_attr_get (FILE *outf, class attr_desc *attr)/* Write out start of function, then all values with explicit `case' lines,then a `default', then the value with the most uses. */ - if (attr->enum_name) - fprintf (outf, "enum %s\n", attr->enum_name); - else if (!attr->is_numeric) - fprintf (outf, "enum attr_%s\n", attr->name); - else - fprintf (outf, "int\n"); + attr->fprint_type (outf, "\n");/* If the attribute name starts with a star, the remainder is the name ofthe subroutine to use, instead of `get_attr_...'. */ @@ -4389,7 +4368,7 @@ write_attr_value (FILE *outf, class attr_desc *attr, rtx value) break;case SYMBOL_REF:- rtx_reader_ptr->fprint_c_condition (outf, XSTR (value, 0)); + rtx_reader_ptr->fprint_c_condition (outf, XSTR (value, 0), attr); break;case ATTR:diff --git a/gcc/read-md.cc b/gcc/read-md.cc index 93d1ea43781..a0025575cb1 100644 --- a/gcc/read-md.cc +++ b/gcc/read-md.cc @@ -170,31 +170,62 @@ md_reader::join_c_conditions (const char *cond1, const char *cond2) directive for COND if its original file position is known. */void-md_reader::fprint_c_condition (FILE *outf, const char *cond) +md_reader::fprint_c_condition (FILE *outf, const char *cond, + const attr_desc *attr) { const char **halves = (const char **) htab_find (m_joined_conditions, &cond); if (halves != 0) { fprintf (outf, "("); - fprint_c_condition (outf, halves[1]); + fprint_c_condition (outf, halves[1], attr); fprintf (outf, " && "); - fprint_c_condition (outf, halves[2]); + fprint_c_condition (outf, halves[2], attr); fprintf (outf, ")"); } else { fputc ('\n', outf); fprint_md_ptr_loc (outf, cond); - fprintf (outf, "(%s)", cond); + if (cond[0] == '{') + { + const struct ptr_loc *loc = get_md_ptr_loc (cond); + + if (! attr) + { + error_at (loc->loc, "TODO md_reader::fprint_c_condition: " + "const attr_desc *attr is nullptr"); + exit (11); + }I think we should just skip the explicit return type if we don't know what it is. Simple cases will still work, and we should get a reasonably sensible error from the compiler for other cases. (And when we do, we can patch the caller to pass an appropriate type.)
Ok.
+ + // Like "( [&]() -> <return_type> <cond> () )" + // Where COND is actually the body of a function, including the + // outer {}'s. Print this as a lambda that's evaluated in place. + // The capture-all is required to have access to variables + // like "insn". Objects like "operands[]" and "which_alternative" + // are accessible since they are global. + fprintf (outf, "( [&]() -> "); + attr->fprint_type (outf, " "); + fprintf (outf, "%s () )", cond); + + if (! strstr (cond, "return")) + { + error_at (loc->loc, "error: function body needs at least one" + " 'return' statement"); + error_at (loc->loc, "this is the body:\n%s\n", cond); + exit (12); + }IMO it'd be better to leave the compiler to enforce this. There are many other ways in which the lambda body could be malformed.+ } + else + fprintf (outf, "(%s)", cond); } }/* Special fprint_c_condition for writing to STDOUT. */ void-md_reader::print_c_condition (const char *cond) +md_reader::print_c_condition (const char *cond, const attr_desc *desc) { - fprint_c_condition (stdout, cond); + fprint_c_condition (stdout, cond, desc); }/* A vfprintf-like function for reporting an error against line LINENOdiff --git a/gcc/read-md.h b/gcc/read-md.h index 9703551a8fd..ae10b651de1 100644 --- a/gcc/read-md.h +++ b/gcc/read-md.h @@ -132,6 +132,38 @@ struct overloaded_name { overloaded_instance **next_instance_ptr; };+/* Structure for each attribute. */+ +struct attr_value; + +class attr_desc +{ +public: + char *name; /* Name of attribute. */ + const char *enum_name; /* Enum name for DEFINE_ENUM_NAME. */ + class attr_desc *next; /* Next attribute. */ + struct attr_value *first_value; /* First value of this attribute. */ + struct attr_value *default_val; /* Default value for this attribute. */ + file_location loc; /* Where in the .md files it occurs. */ + unsigned is_numeric : 1; /* Values of this attribute are numeric. */ + unsigned is_const : 1; /* Attribute value constant for each run. */ + unsigned is_special : 1; /* Don't call `write_attr_set'. */ + + // Print the return type for functions like get_attr_<attribute-name> + // to stream OUTF, followed by SUFFIX which should be white-space(s). + void fprint_type (FILE *outf, const char *suffix) const + { + if (enum_name) + fprintf (outf, "enum %s", enum_name); + else if (! is_numeric) + fprintf (outf, "enum attr_%s", name); + else + fprintf (outf, "int"); + + fprintf (outf, "%s", suffix);It shouldn't be necessary to emit the enum tag these days. If removing
Hi Richard, I am not familiar with the gensupport policies, which is the reason why the feature is just a suggestion / proposal and not a patch. IMO patches should not come from someone like me who has no experience in that area; better someone more experienced would take it over.
it causes anything to break, I think we should fix whatever that breaking thing is. Could you try doing that, as a pre-patch? Or I can give it a go, if you'd rather not.
Yes please.
If we do that, then we can just a return a const char * for the type.
Yes, const char* would be easier. I just didn't know how to alloc one, and where. A new const char* property in class attr_desc_would solve it.
And then in turn we can pass a const char * to (f)print_c_condition. The MD reader then wouldn't need to know about attributes. Thanks, Richard
When this feature makes it into GCC, then match_test should behave similar, I guess? I.e. support function bodies that return bool. I just wasn't sure which caller of fprint_c_condition runs with match_test resp. symbol_ref from which context (insn attribute or predicate, etc). Thanks for looking into this and for considering it as an extension. The shortcomings like non-support of pathological comments like /* } */ is probably not such a big issue. And fixing it would have to touch the md scanner / lexer and have side effects I don't know, like on build performance and stability of course. That part could be fixed when someone actually needs it. Thanks, Johann
+ } +}; + struct mapping;/* A class for reading .md files and RTL dump files.@@ -204,8 +236,8 @@ class md_reader void handle_enum (file_location loc, bool md_p);const char *join_c_conditions (const char *cond1, const char *cond2);- void fprint_c_condition (FILE *outf, const char *cond); - void print_c_condition (const char *cond); + void fprint_c_condition (FILE *outf, const char *cond, const attr_desc * = nullptr); + void print_c_condition (const char *cond, const attr_desc * = nullptr);/* Defined in read-rtl.cc. */const char *apply_iterator_to_string (const char *string);
