[merging both threads, thanks for the answers]
On Mon, 1 Oct 2012, Richard Guenther wrote:
optabs should be fixed instead, an is_gimple_val condition is implicitely
val != 0.
For vectors, I think it should be val < 0 (with an appropriate cast of val
to a signed integer vector type if necessary). Or (val & highbit) != 0, but
that's longer.
I don't think so. Throughout the compiler we generally assume false == 0
and anything else is true. (yes, for FP there is STORE_FLAG_VALUE, but
it's scope is quite limited - if we want sth similar for vectors we'd have to
invent it).
See below.
If we for example have
predicate = a < b;
x = predicate ? d : e;
y = predicate ? f : g;
we ideally want to re-use the predicate computation on targets where
that would be optimal (and combine should be able to recover the
case where it is not).
That I don't understand. The vcond instruction implemented by targets takes
as arguments d, e, cmp, a, b and emits the comparison itself. I don't see
how I can avoid sending to the targets both (d,e,<,a,b) and (f,g,<,a,b).
They will notice eventually that a<b is computed twice and remove one of the
two, but I don't see how to do that in optabs.c. Or I can compute x = a < b,
use x < 0 as the comparison passed to the targets, and expect targets (those
for which it is true) to recognize that < 0 is useless in a vector condition
(PR54700), or is useless on a comparison result.
But that's a limitation of how vcond works. ISTR there is/was a vselect
instruction as well, taking a "mask" and two vectors to select from. At least
that's how vcond works internally for some sub-targets.
vselect seems to only appear in config/. Would it be defined as:
vselect(m,a,b)=(a&m)|(b&~m) ? I would almost be tempted to just define a
pattern in .md files and let combine handle it, although it might be one
instruction too long for that (and if m is x<y, ~m might look like x>=y).
Or would it match the OpenCL select: "For each component of a vector type,
result[i] = if MSB of c[i] is set ? b[i] : a[i]."? Or the pattern with &
and | but with a precondition that the value of each element of the mask
must be 0 or ±1?
I don't find vcond that bad, as long as targets check for trivial
comparisons in the expansion (what trivial means may depend on the
platform). It is quite flexible for targets.
On Mon, 1 Oct 2012, Richard Guenther wrote:
tmp = fold_build2_loc (gimple_location (def_stmt),
code,
- boolean_type_node,
+ TREE_TYPE (cond),
That's obvious.
Ok, I'll test and commit that line separately.
+ if (TREE_CODE (op0) == VECTOR_CST && TREE_CODE (op1) == VECTOR_CST)
+ {
+ int count = VECTOR_CST_NELTS (op0);
+ tree *elts = XALLOCAVEC (tree, count);
+ gcc_assert (TREE_CODE (type) == VECTOR_TYPE);
+
+ for (int i = 0; i < count; i++)
+ {
+ tree elem_type = TREE_TYPE (type);
+ tree elem0 = VECTOR_CST_ELT (op0, i);
+ tree elem1 = VECTOR_CST_ELT (op1, i);
+
+ elts[i] = fold_relational_const (code, elem_type,
+ elem0, elem1);
+
+ if(elts[i] == NULL_TREE)
+ return NULL_TREE;
+
+ elts[i] = fold_negate_const (elts[i], elem_type);
I think you need to invent something new similar to STORE_FLAG_VALUE
or use STORE_FLAG_VALUE here. With the above you try to map
{0, 1} to {0, -1} which is only true if the operation on the element types
returns {0, 1} (thus, STORE_FLAG_VALUE is 1).
Er, seems to me that constant folding of a scalar comparison in the
front/middle-end only returns {0, 1}.
+/* Return true if EXPR is an integer constant representing true. */
+
+bool
+integer_truep (const_tree expr)
+{
+ STRIP_NOPS (expr);
+
+ switch (TREE_CODE (expr))
+ {
+ case INTEGER_CST:
+ /* Do not just test != 0, some places expect the value 1. */
+ return (TREE_INT_CST_LOW (expr) == 1
+ && TREE_INT_CST_HIGH (expr) == 0);
I wonder if using STORE_FLAG_VALUE is better here (note that it
usually differs for FP vs. integral comparisons and the mode passed
to STORE_FLAG_VALUE is that of the comparison result).
I notice there is already a VECTOR_STORE_FLAG_VALUE (used only once in
simplify-rtx, in a way that seems a bit strange but I'll try to
understand that later). Thanks for showing me this macro, it seems
important indeed. However the STORE_FLAG_VALUE mechanism seems to be for
the RTL level.
It looks like it would be possible to have 3 different semantics:
source code is OpenCL, middle-end whatever we want (0 / 1 for instance),
and back-end is whatever the target wants. The front-end would generate
for a<b : vec_cond_expr(a<b,-1,0) and for a?b:c : vec_cond_expr(a<0,b,c)
and there is no need for the middle-end to use the same representation
of comparisons as the front-ends or targets (expand of a vec_cond_expr
whose first argument is not a comparison would use != 0 if we chose a 0
/ 1 encoding).
However, since the front-ends and many targets agree on the OpenCL
semantics, it means introducing a conversion back and forth in the
middle-end, which may complicate things a bit. Note also that we already
have constant_boolean_node that returns a vector of -1 for true, and
that the front-end doesn't currently generate a vec_cond_expr for a<b.
It is true though that using 1 for true would fit better with the scalar
ops.
Well, if someone feels like taking a decision... I am happy with either
choice, as long as I know where to go.
That said, until we are sure what semantics we want here (forwprop
for example doesn't look at 'comparisons' but operations on special
values and types) I'd prefer to not introduce integer_truep ().
I completely agree that defining the semantics comes first :-)
--
Marc Glisse