Georg-Johann Lay <a...@gjlay.de> writes:
So I had that situation where in an insn attribute, providing
a block of code (rather than just an expression) would be
useful.
Expressions can provided by means of symbol_ref, like in
(set (attr "length")
(symbol_ref ("1 + GET_MODE_SIZE (<MODE>mode)")))
However providing a block of code gives a syntax error from
the compiler, *NOT* from md_reader:
(set (attr "length")
(symbol_ref
{
int len = 1;
return len;
}))
This means such syntax is already supported to some degree,
there's just no semantics assigned to such code.
Blocks of code are already supported in insn predicates,
like in
(define_predicate "my_operand"
(match_code "code_label,label_ref,symbol_ref,plus,const")
{
some code...
return true-or-false;
})
In the insn attribute case, I hacked a bit and supported
blocks of code like in the example above. The biggest change
is that class attr_desc has to be moved from genattrtab.cc to
read-md.h so that it is a complete type as required by
md_reader::fprint_c_condition().
That method prints to code for symbol_ref and some others, and
it has to know the type of the attribute, like "int" for the
"length" attribute. The implementation in fprint_c_condition()
is straight forward:
When cond (which is the payload string of symbol_ref, including the
'{}'s) starts with '{', the print a lambda that's called in place,
like in
print "( [&]() -> <return_type> <cond> () )"
The "&" capture is required so that variables like "insn" are
accessible. "operands[]" and "which_alternative" are global,
thus also accessible.
Attached is the code I have so far (which is by no means a
proposed patch, so I am posting here on gcc@).
As far as I can tell, there is no performance penalty, e.g.
in build times, when the feature is not used. Of course instead
of such syntax, a custom function could be used, or the
braces-brackets-parentheses-gibberish could be written out
in the symbol_ref as an expression. Though I think this
could be a nice addition, in particular because the scanning
side in md_reader already supports the syntax.
Looks good to me. I know you said it wasn't a patch submission,
but it looks mostly ready to go. Some comments below:
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 7f4335e0aac..3e46693e8c2 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -10265,6 +10265,56 @@ so there is no need to explicitly convert the
expression into a boolean
(match_test "(x & 2) != 0")
@end smallexample
+@cindex @code{symbol_ref} and attributes
+@item (symbol_ref "@var{quoted-c-expr}")
+
+Specifies the value of the attribute sub-expression as a C expression,
+where the surrounding quotes are not part of the expression.
+Similar to @code{match_test}, variables @var{insn}, @var{operands[]}
+and @var{which_alternative} are available. Moreover, code and mode
+attributes can be used to compose the resulting C expression, like in
+
+@smallexample
+(set (attr "length")
+ (symbol_ref ("1 + GET_MODE_SIZE (<MODE>mode)")))
+@end smallexample
+
+where the according insn has exactly one mode iterator.
+See @ref{Mode Iterators} and @ref{Code Iterators}.
I got the impression s/See @ref/@xref/ was recommended for sentence
references.
+
+@item (symbol_ref "@{ @var{quoted-c-code} @}")
+@itemx (symbol_ref @{ @var{c-code} @})
+
+The value of this subexpression is determined by running a block
+of C code which returns the desired value.
+The braces are part of the code, whereas the quotes in the quoted form are not.
+
+This variant of @code{symbol_ref} allows for more comlpex code than
+just a single C expression, like for example:
+
+@smallexample
+(set (attr "length")
+ (symbol_ref
+ @{
+ int len;
+ some_function (insn, <CODE>, <MODE>mode, & len);
+ return len;
+ @}))
+@end smallexample
+
+for an insn that has one code iterator and one mode iterator.
+Again, variables @var{insn}, @var{operands[]} and @var{which_alternative}
+can be used. The unquoted form only supports a subset of C,
+for example no C comments are supported, and strings that contain
+characters like @samp{@}} are problematic and may need to be escaped
+as @samp{\@}}.
By unquoted form, do you mean (symbol_ref { ... })? I'd have expected
that to be better than "{ ... }" (or at least, I thought that was the
intention when { ... } was added). I was going to suggest not documenting
the "{ ... }" form until I saw this.
+
+The return type is @code{int} for the @var{length} attribute, and
+@code{enum attr_@var{name}} for an insn attribute named @var{name}.
+The types and available enum values can be looked up in
+@file{$builddir/gcc/insn-attr-common.h}.
+
+
@cindex @code{le} and attributes
@cindex @code{leu} and attributes
@cindex @code{lt} and attributes
diff --git a/gcc/genattrtab.cc b/gcc/genattrtab.cc
index 03c7d6c74a3..20d45ba88af 100644
--- a/gcc/genattrtab.cc
+++ b/gcc/genattrtab.cc
@@ -168,22 +168,6 @@ struct attr_value
int has_asm_insn; /* True if this value used for `asm' insns */
};
-/* Structure for each attribute. */
-
-class attr_desc
-{
-public:
- char *name; /* Name of attribute. */
- const char *enum_name; /* Enum name for DEFINE_ENUM_NAME. */
- class attr_desc *next; /* Next attribute. */
- struct attr_value *first_value; /* First value of this attribute. */
- struct attr_value *default_val; /* Default value for this attribute. */
- file_location loc; /* Where in the .md files it occurs. */
- unsigned is_numeric : 1; /* Values of this attribute are numeric. */
- unsigned is_const : 1; /* Attribute value constant for each run. */
- unsigned is_special : 1; /* Don't call `write_attr_set'. */
-};
-
/* Structure for each DEFINE_DELAY. */
class delay_desc
@@ -3368,7 +3352,7 @@ write_test_expr (FILE *outf, rtx exp, unsigned int
attrs_cached, int flags,
{
int comparison_operator = 0;
RTX_CODE code;
- class attr_desc *attr;
+ class attr_desc *attr = nullptr;
if (emit_parens)
fprintf (outf, "(");
@@ -3708,7 +3692,7 @@ write_test_expr (FILE *outf, rtx exp, unsigned int
attrs_cached, int flags,
/* A random C expression. */
case SYMBOL_REF:
- rtx_reader_ptr->fprint_c_condition (outf, XSTR (exp, 0));
+ rtx_reader_ptr->fprint_c_condition (outf, XSTR (exp, 0), attr);
break;
/* The address of the branch target. */
@@ -4052,12 +4036,7 @@ write_attr_get (FILE *outf, class attr_desc *attr)
/* Write out start of function, then all values with explicit `case' lines,
then a `default', then the value with the most uses. */
- if (attr->enum_name)
- fprintf (outf, "enum %s\n", attr->enum_name);
- else if (!attr->is_numeric)
- fprintf (outf, "enum attr_%s\n", attr->name);
- else
- fprintf (outf, "int\n");
+ attr->fprint_type (outf, "\n");
/* If the attribute name starts with a star, the remainder is the name of
the subroutine to use, instead of `get_attr_...'. */
@@ -4389,7 +4368,7 @@ write_attr_value (FILE *outf, class attr_desc *attr, rtx
value)
break;
case SYMBOL_REF:
- rtx_reader_ptr->fprint_c_condition (outf, XSTR (value, 0));
+ rtx_reader_ptr->fprint_c_condition (outf, XSTR (value, 0), attr);
break;
case ATTR:
diff --git a/gcc/read-md.cc b/gcc/read-md.cc
index 93d1ea43781..a0025575cb1 100644
--- a/gcc/read-md.cc
+++ b/gcc/read-md.cc
@@ -170,31 +170,62 @@ md_reader::join_c_conditions (const char *cond1, const
char *cond2)
directive for COND if its original file position is known. */
void
-md_reader::fprint_c_condition (FILE *outf, const char *cond)
+md_reader::fprint_c_condition (FILE *outf, const char *cond,
+ const attr_desc *attr)
{
const char **halves = (const char **) htab_find (m_joined_conditions,
&cond);
if (halves != 0)
{
fprintf (outf, "(");
- fprint_c_condition (outf, halves[1]);
+ fprint_c_condition (outf, halves[1], attr);
fprintf (outf, " && ");
- fprint_c_condition (outf, halves[2]);
+ fprint_c_condition (outf, halves[2], attr);
fprintf (outf, ")");
}
else
{
fputc ('\n', outf);
fprint_md_ptr_loc (outf, cond);
- fprintf (outf, "(%s)", cond);
+ if (cond[0] == '{')
+ {
+ const struct ptr_loc *loc = get_md_ptr_loc (cond);
+
+ if (! attr)
+ {
+ error_at (loc->loc, "TODO md_reader::fprint_c_condition: "
+ "const attr_desc *attr is nullptr");
+ exit (11);
+ }
I think we should just skip the explicit return type if we don't
know what it is. Simple cases will still work, and we should get
a reasonably sensible error from the compiler for other cases.
(And when we do, we can patch the caller to pass an appropriate type.)
+
+ // Like "( [&]() -> <return_type> <cond> () )"
+ // Where COND is actually the body of a function, including the
+ // outer {}'s. Print this as a lambda that's evaluated in place.
+ // The capture-all is required to have access to variables
+ // like "insn". Objects like "operands[]" and "which_alternative"
+ // are accessible since they are global.
+ fprintf (outf, "( [&]() -> ");
+ attr->fprint_type (outf, " ");
+ fprintf (outf, "%s () )", cond);
+
+ if (! strstr (cond, "return"))
+ {
+ error_at (loc->loc, "error: function body needs at least one"
+ " 'return' statement");
+ error_at (loc->loc, "this is the body:\n%s\n", cond);
+ exit (12);
+ }
IMO it'd be better to leave the compiler to enforce this. There are
many other ways in which the lambda body could be malformed.
+ }
+ else
+ fprintf (outf, "(%s)", cond);
}
}
/* Special fprint_c_condition for writing to STDOUT. */
void
-md_reader::print_c_condition (const char *cond)
+md_reader::print_c_condition (const char *cond, const attr_desc *desc)
{
- fprint_c_condition (stdout, cond);
+ fprint_c_condition (stdout, cond, desc);
}
/* A vfprintf-like function for reporting an error against line LINENO
diff --git a/gcc/read-md.h b/gcc/read-md.h
index 9703551a8fd..ae10b651de1 100644
--- a/gcc/read-md.h
+++ b/gcc/read-md.h
@@ -132,6 +132,38 @@ struct overloaded_name {
overloaded_instance **next_instance_ptr;
};
+/* Structure for each attribute. */
+
+struct attr_value;
+
+class attr_desc
+{
+public:
+ char *name; /* Name of attribute. */
+ const char *enum_name; /* Enum name for DEFINE_ENUM_NAME. */
+ class attr_desc *next; /* Next attribute. */
+ struct attr_value *first_value; /* First value of this attribute. */
+ struct attr_value *default_val; /* Default value for this attribute. */
+ file_location loc; /* Where in the .md files it occurs. */
+ unsigned is_numeric : 1; /* Values of this attribute are numeric. */
+ unsigned is_const : 1; /* Attribute value constant for each run. */
+ unsigned is_special : 1; /* Don't call `write_attr_set'. */
+
+ // Print the return type for functions like get_attr_<attribute-name>
+ // to stream OUTF, followed by SUFFIX which should be white-space(s).
+ void fprint_type (FILE *outf, const char *suffix) const
+ {
+ if (enum_name)
+ fprintf (outf, "enum %s", enum_name);
+ else if (! is_numeric)
+ fprintf (outf, "enum attr_%s", name);
+ else
+ fprintf (outf, "int");
+
+ fprintf (outf, "%s", suffix);
It shouldn't be necessary to emit the enum tag these days. If removing