Re: Vector Comparison patch

Richard Guenther Tue, 23 Aug 2011 01:18:24 -0700

On Mon, Aug 22, 2011 at 11:11 PM, Artem Shinkarov
<[email protected]> wrote:
> I'll just send you my current version. I'll be a little bit more specific.
>
> The problem starts when you try to lower the following expression:
>
> x = a > b;
> x1 = vcond <x != 0, -1, 0>
> vcond <x1, c, d>
>
> Now, you go from the beginning to the end of the block, and you cannot
> leave a > b, because only vconds are valid expressions to expand.
>
> Now, you meet a > b first. You try to transform it into vcond <a > b,
> -1, 0>, you build this expression, then you try to gimplify it, and
> you see that you have something like:
>
> x' = a >b;
> x = vcond <x', -1, 0>
> x1 = vcond <x != 0, -1, 0>
> vcond <x1, c, d>
>
> and your gsi stands at the x1 now, so the gimplification created a
> comparison that optab would not understand. And I am not really sure
> that you would be able to solve this problem easily.
>
> It would helpr, if you could create vcond<x, op, y, op0, op1>, but you
> cant and x op y is a single tree that must be gimplified, and I am not
> sure that you can persuade gimplifier to leave this expression
> untouched.
>
> In the attachment the current version of the patch.


I can't reproduce it with your patch.  For

#define vector(elcount, type)  \
    __attribute__((vector_size((elcount)*sizeof(type)))) type

vector (4, float) x, y;
vector (4, int) a,b;
int
main (int argc, char *argv[])
{
  vector (4, int) i0 = x < y;
  vector (4, int) i1 = i0 ? a : b;
  return 0;
}

I get from the C frontend:

  vector(4) int i0 =  VEC_COND_EXPR < SAVE_EXPR <x < y> , { -1, -1,
-1, -1 } , { 0, 0, 0, 0 } > ;
  vector(4) int i1 =  VEC_COND_EXPR < SAVE_EXPR <i0> , SAVE_EXPR <a> ,
SAVE_EXPR <b> > ;

but I have expected i0 != 0 in the second VEC_COND_EXPR.

I do see that the gimplifier pulls away the condition for the first
VEC_COND_EXPR though:

  x.0 = x;
  y.1 = y;
  D.2735 = x.0 < y.1;
  D.2734 = D.2735;
  D.2736 = D.2734;
  i0 = [vec_cond_expr]  VEC_COND_EXPR < D.2736 , { -1, -1, -1, -1 } ,
{ 0, 0, 0, 0 } > ;

which is, I believe because of the SAVE_EXPR wrapped around the
comparison.  Why do you bother wrapping all operands in save-exprs?

With that the

  /* Currently the expansion of VEC_COND_EXPR does not allow
     expessions where the type of vectors you compare differs
     form the type of vectors you select from. For the time
     being we insert implicit conversions.  */
  if ((COMPARISON_CLASS_P (ifexp)
       && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != TREE_TYPE (op1))
      || TREE_TYPE (ifexp) != TREE_TYPE (op1))

checks will fail (because ifexp is a SAVE_EXPR).

I'll run into errors when not adding the SAVE_EXPR around the ifexp,
the transform into x < y ? {-1,...} : {0,...} is not happening.

>
> Thanks,
> Artem.
>
>
> On Mon, Aug 22, 2011 at 9:58 PM, Richard Guenther
> <[email protected]> wrote:
>> On Mon, Aug 22, 2011 at 10:49 PM, Artem Shinkarov
>> <[email protected]> wrote:
>>> On Mon, Aug 22, 2011 at 9:42 PM, Richard Guenther
>>> <[email protected]> wrote:
>>>> On Mon, Aug 22, 2011 at 5:58 PM, Artem Shinkarov
>>>> <[email protected]> wrote:
>>>>> On Mon, Aug 22, 2011 at 4:50 PM, Richard Guenther
>>>>> <[email protected]> wrote:
>>>>>> On Mon, Aug 22, 2011 at 5:43 PM, Artem Shinkarov
>>>>>> <[email protected]> wrote:
>>>>>>> On Mon, Aug 22, 2011 at 4:34 PM, Richard Guenther
>>>>>>> <[email protected]> wrote:
>>>>>>>> On Mon, Aug 22, 2011 at 5:21 PM, Artem Shinkarov
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> On Mon, Aug 22, 2011 at 4:01 PM, Richard Guenther
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>> On Mon, Aug 22, 2011 at 2:05 PM, Artem Shinkarov
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>> On Mon, Aug 22, 2011 at 12:25 PM, Richard Guenther
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>> On Mon, Aug 22, 2011 at 12:53 AM, Artem Shinkarov
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>> Richard
>>>>>>>>>>>>>
>>>>>>>>>>>>> I formalized an approach a little-bit, now it works without target
>>>>>>>>>>>>> hooks, but some polishing is still required. I want you to 
>>>>>>>>>>>>> comment on
>>>>>>>>>>>>> the several important approaches that I use in the patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So how does it work.
>>>>>>>>>>>>> 1) All the vector comparisons at the level of  type-checker are
>>>>>>>>>>>>> introduced using VEC_COND_EXPR with constant selection operands 
>>>>>>>>>>>>> being
>>>>>>>>>>>>> {-1} and {0}. For example v0 > v1 is transformed into 
>>>>>>>>>>>>> VEC_COND_EXPR<v0
>>>>>>>>>>>>>> v1, {-1}, {0}>.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2) When optabs expand VEC_COND_EXPR, two cases are considered:
>>>>>>>>>>>>> 2.a) first operand of VEC_COND_EXPR is comparison, in that case 
>>>>>>>>>>>>> nothing changes.
>>>>>>>>>>>>> 2.b) first operand is something else, in that case, we specially 
>>>>>>>>>>>>> mark
>>>>>>>>>>>>> this case, recognize it in the backend, and do not create a
>>>>>>>>>>>>> comparison, but use the mask as it was a result of some 
>>>>>>>>>>>>> comparison.
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3) In order to make sure that mask in VEC_COND_EXPR<mask, v0, v1> 
>>>>>>>>>>>>> is a
>>>>>>>>>>>>> vector comparison we use is_vector_comparison function, if it 
>>>>>>>>>>>>> returns
>>>>>>>>>>>>> false, then we replace mask with mask != {0}.
>>>>>>>>>>>>>
>>>>>>>>>>>>> So we end-up with the following functionality:
>>>>>>>>>>>>> VEC_COND_EXPR<mask, v0,v1> -- if we know that mask is a result of
>>>>>>>>>>>>> comparison of two vectors, we leave it as it is, otherwise change 
>>>>>>>>>>>>> with
>>>>>>>>>>>>> mask != {0}.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Plain vector comparison a <op> b is represented with 
>>>>>>>>>>>>> VEC_COND_EXPR,
>>>>>>>>>>>>> which correctly expands, without creating useless masking.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Basically for me there are two questions:
>>>>>>>>>>>>> 1) Can we perform information passing in optabs in a nicer way?
>>>>>>>>>>>>> 2) How is_vector_comparison could be improved? I have several 
>>>>>>>>>>>>> ideas,
>>>>>>>>>>>>> like checking if constant vector all consists of 0 and -1, and so 
>>>>>>>>>>>>> on.
>>>>>>>>>>>>> But first is it conceptually fine.
>>>>>>>>>>>>>
>>>>>>>>>>>>> P.S. I tired to put the functionality of is_vector_comparison in
>>>>>>>>>>>>> tree-ssa-forwprop, but the thing is that it is called only with 
>>>>>>>>>>>>> -On,
>>>>>>>>>>>>> which I find inappropriate, and the functionality gets more
>>>>>>>>>>>>> complicated.
>>>>>>>>>>>>
>>>>>>>>>>>> Why is it inappropriate to not optimize it at -O0?  If the user
>>>>>>>>>>>> separates comparison and ?: expression it's his own fault.
>>>>>>>>>>>
>>>>>>>>>>> Well, because all the information is there, and I perfectly envision
>>>>>>>>>>> the case when user expressed comparison separately, just to avoid 
>>>>>>>>>>> code
>>>>>>>>>>> duplication.
>>>>>>>>>>>
>>>>>>>>>>> Like:
>>>>>>>>>>> mask = a > b;
>>>>>>>>>>> res1 = mask ? v0 : v1;
>>>>>>>>>>> res2 = mask ? v2 : v3;
>>>>>>>>>>>
>>>>>>>>>>> Which in this case would be different from
>>>>>>>>>>> res1 = a > b ? v0 : v1;
>>>>>>>>>>> res2 = a > b ? v2 : v3;
>>>>>>>>>>>
>>>>>>>>>>>> Btw, the new hook is still in the patch.
>>>>>>>>>>>>
>>>>>>>>>>>> I would simply always create != 0 if it isn't and let optimizers
>>>>>>>>>>>> (tree-ssa-forwprop.c) optimize this.  You still have to deal with
>>>>>>>>>>>> non-comparison operands during expansion though, but if
>>>>>>>>>>>> you always forced a != 0 from the start you can then simply
>>>>>>>>>>>> interpret it as a proper comparison result (in which case I'd
>>>>>>>>>>>> modify the backends to have an alternate pattern or directly
>>>>>>>>>>>> expand to masking operations - using the fake comparison
>>>>>>>>>>>> RTX is too much of a hack).
>>>>>>>>>>>
>>>>>>>>>>> Richard, I think you didn't get the problem.
>>>>>>>>>>> I really need the way, to pass the information, that the expression
>>>>>>>>>>> that is in the first operand of vcond is an appropriate mask. I 
>>>>>>>>>>> though
>>>>>>>>>>> for quite a while and this hack is the only answer I found, is 
>>>>>>>>>>> there a
>>>>>>>>>>> better way to do that. I could for example introduce another
>>>>>>>>>>> tree-node, but it would be overkill as well.
>>>>>>>>>>>
>>>>>>>>>>> Now why do I need it so much:
>>>>>>>>>>> I want to implement the comparison in a way that {1, 5, 0, -1} is
>>>>>>>>>>> actually {-1,-1,-1,-1}. So whenever I am not sure that mask of
>>>>>>>>>>> VEC_COND_EXPR is a real comparison I transform it to mask != {0} 
>>>>>>>>>>> (not
>>>>>>>>>>> always). To check the stuff, I use is_vector_comparison in
>>>>>>>>>>> tree-vect-generic.
>>>>>>>>>>>
>>>>>>>>>>> So I really have the difference between mask ? x : y and mask != 
>>>>>>>>>>> {0} ?
>>>>>>>>>>> x : y, otherwise I could treat mask != {0} in the backend as just
>>>>>>>>>>> mask.
>>>>>>>>>>>
>>>>>>>>>>> If this link between optabs and backend breaks, then the patch falls
>>>>>>>>>>> apart. Because every time the comparison is taken out 
>>>>>>>>>>> VEC_COND_EXPR, I
>>>>>>>>>>> will have to put != {0}. Keep in mind, that I cannot always put the
>>>>>>>>>>> comparison inside the VEC_COND_EXPR, what if it is defined in a
>>>>>>>>>>> function-comparison, or somehow else?
>>>>>>>>>>>
>>>>>>>>>>> So what would be an appropriate way to connect optabs and the 
>>>>>>>>>>> backend?
>>>>>>>>>>
>>>>>>>>>> Well, there is no problem in having the only valid mask operand for
>>>>>>>>>> VEC_COND_EXPRs being either a comparison or a {-1,...} / {0,....} 
>>>>>>>>>> mask.
>>>>>>>>>> Just the C parser has to transform mask ? vec1 : vec2 to mask != 0 ?
>>>>>>>>>> vec1 : vec2.
>>>>>>>>>
>>>>>>>>> This happens already in the new version of patch (not submitted yet).
>>>>>>>>>
>>>>>>>>>> This comparison can be eliminated by optimization passes
>>>>>>>>>> that
>>>>>>>>>> either replace it by the real comparison computing the mask or just
>>>>>>>>>> propagating the information this mask is already {-1,...} / {0,....} 
>>>>>>>>>> by simply
>>>>>>>>>> dropping the comparison against zero.
>>>>>>>>>
>>>>>>>>> This is not a problem, because the backend recognizes these patterns,
>>>>>>>>> so no optimization is needed in this part.
>>>>>>>>
>>>>>>>> I mean for
>>>>>>>>
>>>>>>>>  mask = v1 < v2 ? {-1,...} : {0,...};
>>>>>>>>  val = VCOND_EXPR <mask != 0, v3, v4>;
>>>>>>>>
>>>>>>>> optimizers can see how mask is defined and drop the != 0 test or 
>>>>>>>> replace
>>>>>>>> it by v1 < v2.
>>>>>>>
>>>>>>> Yes, sure.
>>>>>>>
>>>>>>>>>> For the backends I'd have vcond patterns for both an embedded 
>>>>>>>>>> comparison
>>>>>>>>>> and for a mask.  (Now we can rewind the discussion a bit and allow
>>>>>>>>>> arbitrary masks and define a vcond with a mask operand to do bitwise
>>>>>>>>>> selection - what matters is the C frontend semantics which we need to
>>>>>>>>>> translate to what the middle-end thinks of a VEC_COND_EXPR, they
>>>>>>>>>> do not have to agree).
>>>>>>>>>
>>>>>>>>> But it seems like another combinatorial explosion here. Considering
>>>>>>>>> what Richard said in his e-mail, in order to support "generic" vcond,
>>>>>>>>> we just need to enumerate all the possible cases. Or I didn't
>>>>>>>>> understand right?
>>>>>>>>
>>>>>>>> Well, the question is still what VCOND_EXPR and thus the vcond pattern
>>>>>>>> semantically does for a non-comparison operand.  I'd argue that using
>>>>>>>> the bitwise selection semantic gives us maximum flexibility and a 
>>>>>>>> native
>>>>>>>> instruction with AMD XOP.  This non-comparison VCOND_EXPR is
>>>>>>>> also easy to implement in the middle-end expansion code if there is
>>>>>>>> no native instruction for it - by simply emitting the bitwise 
>>>>>>>> operations.
>>>>>>>>
>>>>>>>> But I have the feeling we are talking past each other ...?
>>>>>>>
>>>>>>> I am all for the bitwise behaviour in the backend pattern, that is
>>>>>>> something that I rely on at the moment. What I don't want to have is
>>>>>>> the same behaviour in the frontend. So If we can guarantee, that we
>>>>>>> add != 0, when we don't know the "nature" of the mask, then I am
>>>>>>> perfectly fine with the back-end having bitwise-selection behaviour.
>>>>>>
>>>>>> Well, the C frontend would simply always add that != 0 (because it
>>>>>> doesn't know).
>>>>>>
>>>>>>>>> I mean, I don't mind of course, but it seems to me that it would be
>>>>>>>>> cleaner to have one generic enough pattern.
>>>>>>>>>
>>>>>>>>> Is there seriously no way to pass something from optab into the 
>>>>>>>>> backend??
>>>>>>>>
>>>>>>>> You can pass operands.  And information is implicitly encoded in the 
>>>>>>>> name.
>>>>>>>
>>>>>>> I didn't quite get that, could you give an example?
>>>>>>
>>>>>> It was a larger variant of "no, apart from what is obvious".
>>>>>
>>>>> Ha, joking again. :)
>>>>>
>>>>>>>>>> If the mask is computed by a function you are of course out of luck,
>>>>>>>>>> but I don't see how you'd manage to infer knowledge from nowhere 
>>>>>>>>>> either.
>>>>>>>>>
>>>>>>>>> Well, take simpler example
>>>>>>>>>
>>>>>>>>> a = {0};
>>>>>>>>> for ( ; *p; p += 16)
>>>>>>>>>  a &= pattern > (vec)*p;
>>>>>>>>>
>>>>>>>>> res = a ? v0 : v1;
>>>>>>>>>
>>>>>>>>> In this case it is simple to analyse that a is a comparison, but you
>>>>>>>>> cannot embed the operations of a into VEC_COND_EXPR.
>>>>>>>>
>>>>>>>> Sure, but if the above is C source the frontend would generate
>>>>>>>> res = a != 0 ? v0 : v1; initially.  An optimization pass could still
>>>>>>>> track this information and replace VEC_COND_EXPR <a != 0, v0, v1>
>>>>>>>> with VEC_COND_EXPR <a, v0, v1> (no existing one would track
>>>>>>>> vector contents though).
>>>>>>>
>>>>>>> Yeah, sure. My point is, that we must be able to pass this information
>>>>>>> in the backend, that we checked everything, and we are sure that a is
>>>>>>> a corerct mask, please don't add any != 0 to it.
>>>>>>
>>>>>> But all masks are correct as soon as they appear in a VEC_COND_EXPR.
>>>>>> That's the whole point of the bitwise semantics.  It's only the C 
>>>>>> frontend
>>>>>> that needs to be careful to impose its stricter semantics.
>>>>>>
>>>>>> Richard.
>>>>>>
>>>>>
>>>>> Ok, I see the last difference in the approaches we envision.
>>>>> I am assuming, that frontend does not put != 0, but the later
>>>>> optimisations (veclower in my case) check every mask in VEC_COND_EXPR
>>>>> and does the same functionality as you describe. So the philosophical
>>>>> question why it is better to first add and then remove, rather than
>>>>> just add if needed?
>>>>
>>>> Well, it's "better be right than sorry".  Thus, default to the
>>>> conservatively correct
>>>> way and let optimizers "optimize" it.
>>>
>>> How can we get sorry, it is impossible to skip the vcond during the
>>> optimisation, but whatever, it is not really so important when to add.
>>> Currently I have a bigger problem, see below.
>>>>
>>>>> In all the rest I think we agreed.
>>>>
>>>> Fine.
>>>>
>>>> Thanks,
>>>> Richard.
>>>>
>>>>>
>>>>> Artem.
>>>>>
>>>>
>>>
>>> I found out that I cannot really gimplify correctly the vcond<a >b ,
>>> c, d> expression when a > b is vcond<a > b, -1, 0>. The problem is
>>> that gimplifier pulls a > b always as a separate expression during the
>>> gimplification, and I don't think that we can avoid it. So what
>>> happens is:
>>>
>>> vcond <a > b , c , d>
>>> transformed to
>>> x = b > c;
>>> x1 = vcond <x , -1, 0>
>>> vcond <x1, c, d>
>>>
>>> and so on, infinitely long.
>>
>> Sounds like a bug that is possible to fix.
>>
>>> In order to fix the problem we need whether to introduce a new code
>>> like VEC_COMP_LT, VEC_COMP_GT, and so on
>>> whether a builtin function which we would lower
>>> whether stick back to the idea of hook.
>>>
>>> Anyway, representing a >b using vcond does not work.
>>
>> Well, sure it will work, it just needs some work appearantly.
>>
>>> What would be your thinking here?
>>
>> Do you have a patch that exposes this problem?  I can have a look
>> tomorrow.
>>
>> Richard.
>>
>>>
>>> Thanks,
>>> Artem.
>>>
>>
>

Re: Vector Comparison patch

Reply via email to