On Mon, Aug 22, 2011 at 4:50 PM, Richard Guenther
<richard.guent...@gmail.com> wrote:
> On Mon, Aug 22, 2011 at 5:43 PM, Artem Shinkarov
> <artyom.shinkar...@gmail.com> wrote:
>> On Mon, Aug 22, 2011 at 4:34 PM, Richard Guenther
>> <richard.guent...@gmail.com> wrote:
>>> On Mon, Aug 22, 2011 at 5:21 PM, Artem Shinkarov
>>> <artyom.shinkar...@gmail.com> wrote:
>>>> On Mon, Aug 22, 2011 at 4:01 PM, Richard Guenther
>>>> <richard.guent...@gmail.com> wrote:
>>>>> On Mon, Aug 22, 2011 at 2:05 PM, Artem Shinkarov
>>>>> <artyom.shinkar...@gmail.com> wrote:
>>>>>> On Mon, Aug 22, 2011 at 12:25 PM, Richard Guenther
>>>>>> <richard.guent...@gmail.com> wrote:
>>>>>>> On Mon, Aug 22, 2011 at 12:53 AM, Artem Shinkarov
>>>>>>> <artyom.shinkar...@gmail.com> wrote:
>>>>>>>> Richard
>>>>>>>>
>>>>>>>> I formalized an approach a little-bit, now it works without target
>>>>>>>> hooks, but some polishing is still required. I want you to comment on
>>>>>>>> the several important approaches that I use in the patch.
>>>>>>>>
>>>>>>>> So how does it work.
>>>>>>>> 1) All the vector comparisons at the level of  type-checker are
>>>>>>>> introduced using VEC_COND_EXPR with constant selection operands being
>>>>>>>> {-1} and {0}. For example v0 > v1 is transformed into VEC_COND_EXPR<v0
>>>>>>>>> v1, {-1}, {0}>.
>>>>>>>>
>>>>>>>> 2) When optabs expand VEC_COND_EXPR, two cases are considered:
>>>>>>>> 2.a) first operand of VEC_COND_EXPR is comparison, in that case 
>>>>>>>> nothing changes.
>>>>>>>> 2.b) first operand is something else, in that case, we specially mark
>>>>>>>> this case, recognize it in the backend, and do not create a
>>>>>>>> comparison, but use the mask as it was a result of some comparison.
>>>>>>>>
>>>>>>>> 3) In order to make sure that mask in VEC_COND_EXPR<mask, v0, v1> is a
>>>>>>>> vector comparison we use is_vector_comparison function, if it returns
>>>>>>>> false, then we replace mask with mask != {0}.
>>>>>>>>
>>>>>>>> So we end-up with the following functionality:
>>>>>>>> VEC_COND_EXPR<mask, v0,v1> -- if we know that mask is a result of
>>>>>>>> comparison of two vectors, we leave it as it is, otherwise change with
>>>>>>>> mask != {0}.
>>>>>>>>
>>>>>>>> Plain vector comparison a <op> b is represented with VEC_COND_EXPR,
>>>>>>>> which correctly expands, without creating useless masking.
>>>>>>>>
>>>>>>>>
>>>>>>>> Basically for me there are two questions:
>>>>>>>> 1) Can we perform information passing in optabs in a nicer way?
>>>>>>>> 2) How is_vector_comparison could be improved? I have several ideas,
>>>>>>>> like checking if constant vector all consists of 0 and -1, and so on.
>>>>>>>> But first is it conceptually fine.
>>>>>>>>
>>>>>>>> P.S. I tired to put the functionality of is_vector_comparison in
>>>>>>>> tree-ssa-forwprop, but the thing is that it is called only with -On,
>>>>>>>> which I find inappropriate, and the functionality gets more
>>>>>>>> complicated.
>>>>>>>
>>>>>>> Why is it inappropriate to not optimize it at -O0?  If the user
>>>>>>> separates comparison and ?: expression it's his own fault.
>>>>>>
>>>>>> Well, because all the information is there, and I perfectly envision
>>>>>> the case when user expressed comparison separately, just to avoid code
>>>>>> duplication.
>>>>>>
>>>>>> Like:
>>>>>> mask = a > b;
>>>>>> res1 = mask ? v0 : v1;
>>>>>> res2 = mask ? v2 : v3;
>>>>>>
>>>>>> Which in this case would be different from
>>>>>> res1 = a > b ? v0 : v1;
>>>>>> res2 = a > b ? v2 : v3;
>>>>>>
>>>>>>> Btw, the new hook is still in the patch.
>>>>>>>
>>>>>>> I would simply always create != 0 if it isn't and let optimizers
>>>>>>> (tree-ssa-forwprop.c) optimize this.  You still have to deal with
>>>>>>> non-comparison operands during expansion though, but if
>>>>>>> you always forced a != 0 from the start you can then simply
>>>>>>> interpret it as a proper comparison result (in which case I'd
>>>>>>> modify the backends to have an alternate pattern or directly
>>>>>>> expand to masking operations - using the fake comparison
>>>>>>> RTX is too much of a hack).
>>>>>>
>>>>>> Richard, I think you didn't get the problem.
>>>>>> I really need the way, to pass the information, that the expression
>>>>>> that is in the first operand of vcond is an appropriate mask. I though
>>>>>> for quite a while and this hack is the only answer I found, is there a
>>>>>> better way to do that. I could for example introduce another
>>>>>> tree-node, but it would be overkill as well.
>>>>>>
>>>>>> Now why do I need it so much:
>>>>>> I want to implement the comparison in a way that {1, 5, 0, -1} is
>>>>>> actually {-1,-1,-1,-1}. So whenever I am not sure that mask of
>>>>>> VEC_COND_EXPR is a real comparison I transform it to mask != {0} (not
>>>>>> always). To check the stuff, I use is_vector_comparison in
>>>>>> tree-vect-generic.
>>>>>>
>>>>>> So I really have the difference between mask ? x : y and mask != {0} ?
>>>>>> x : y, otherwise I could treat mask != {0} in the backend as just
>>>>>> mask.
>>>>>>
>>>>>> If this link between optabs and backend breaks, then the patch falls
>>>>>> apart. Because every time the comparison is taken out VEC_COND_EXPR, I
>>>>>> will have to put != {0}. Keep in mind, that I cannot always put the
>>>>>> comparison inside the VEC_COND_EXPR, what if it is defined in a
>>>>>> function-comparison, or somehow else?
>>>>>>
>>>>>> So what would be an appropriate way to connect optabs and the backend?
>>>>>
>>>>> Well, there is no problem in having the only valid mask operand for
>>>>> VEC_COND_EXPRs being either a comparison or a {-1,...} / {0,....} mask.
>>>>> Just the C parser has to transform mask ? vec1 : vec2 to mask != 0 ?
>>>>> vec1 : vec2.
>>>>
>>>> This happens already in the new version of patch (not submitted yet).
>>>>
>>>>> This comparison can be eliminated by optimization passes
>>>>> that
>>>>> either replace it by the real comparison computing the mask or just
>>>>> propagating the information this mask is already {-1,...} / {0,....} by 
>>>>> simply
>>>>> dropping the comparison against zero.
>>>>
>>>> This is not a problem, because the backend recognizes these patterns,
>>>> so no optimization is needed in this part.
>>>
>>> I mean for
>>>
>>>  mask = v1 < v2 ? {-1,...} : {0,...};
>>>  val = VCOND_EXPR <mask != 0, v3, v4>;
>>>
>>> optimizers can see how mask is defined and drop the != 0 test or replace
>>> it by v1 < v2.
>>
>> Yes, sure.
>>
>>>>> For the backends I'd have vcond patterns for both an embedded comparison
>>>>> and for a mask.  (Now we can rewind the discussion a bit and allow
>>>>> arbitrary masks and define a vcond with a mask operand to do bitwise
>>>>> selection - what matters is the C frontend semantics which we need to
>>>>> translate to what the middle-end thinks of a VEC_COND_EXPR, they
>>>>> do not have to agree).
>>>>
>>>> But it seems like another combinatorial explosion here. Considering
>>>> what Richard said in his e-mail, in order to support "generic" vcond,
>>>> we just need to enumerate all the possible cases. Or I didn't
>>>> understand right?
>>>
>>> Well, the question is still what VCOND_EXPR and thus the vcond pattern
>>> semantically does for a non-comparison operand.  I'd argue that using
>>> the bitwise selection semantic gives us maximum flexibility and a native
>>> instruction with AMD XOP.  This non-comparison VCOND_EXPR is
>>> also easy to implement in the middle-end expansion code if there is
>>> no native instruction for it - by simply emitting the bitwise operations.
>>>
>>> But I have the feeling we are talking past each other ...?
>>
>> I am all for the bitwise behaviour in the backend pattern, that is
>> something that I rely on at the moment. What I don't want to have is
>> the same behaviour in the frontend. So If we can guarantee, that we
>> add != 0, when we don't know the "nature" of the mask, then I am
>> perfectly fine with the back-end having bitwise-selection behaviour.
>
> Well, the C frontend would simply always add that != 0 (because it
> doesn't know).
>
>>>> I mean, I don't mind of course, but it seems to me that it would be
>>>> cleaner to have one generic enough pattern.
>>>>
>>>> Is there seriously no way to pass something from optab into the backend??
>>>
>>> You can pass operands.  And information is implicitly encoded in the name.
>>
>> I didn't quite get that, could you give an example?
>
> It was a larger variant of "no, apart from what is obvious".

Ha, joking again. :)

>>>>> If the mask is computed by a function you are of course out of luck,
>>>>> but I don't see how you'd manage to infer knowledge from nowhere either.
>>>>
>>>> Well, take simpler example
>>>>
>>>> a = {0};
>>>> for ( ; *p; p += 16)
>>>>  a &= pattern > (vec)*p;
>>>>
>>>> res = a ? v0 : v1;
>>>>
>>>> In this case it is simple to analyse that a is a comparison, but you
>>>> cannot embed the operations of a into VEC_COND_EXPR.
>>>
>>> Sure, but if the above is C source the frontend would generate
>>> res = a != 0 ? v0 : v1; initially.  An optimization pass could still
>>> track this information and replace VEC_COND_EXPR <a != 0, v0, v1>
>>> with VEC_COND_EXPR <a, v0, v1> (no existing one would track
>>> vector contents though).
>>
>> Yeah, sure. My point is, that we must be able to pass this information
>> in the backend, that we checked everything, and we are sure that a is
>> a corerct mask, please don't add any != 0 to it.
>
> But all masks are correct as soon as they appear in a VEC_COND_EXPR.
> That's the whole point of the bitwise semantics.  It's only the C frontend
> that needs to be careful to impose its stricter semantics.
>
> Richard.
>

Ok, I see the last difference in the approaches we envision.
I am assuming, that frontend does not put != 0, but the later
optimisations (veclower in my case) check every mask in VEC_COND_EXPR
and does the same functionality as you describe. So the philosophical
question why it is better to first add and then remove, rather than
just add if needed?

In all the rest I think we agreed.


Artem.

Reply via email to