On Mon, Aug 22, 2011 at 11:11 PM, Artem Shinkarov <artyom.shinkar...@gmail.com> wrote: > I'll just send you my current version. I'll be a little bit more specific. > > The problem starts when you try to lower the following expression: > > x = a > b; > x1 = vcond <x != 0, -1, 0> > vcond <x1, c, d> > > Now, you go from the beginning to the end of the block, and you cannot > leave a > b, because only vconds are valid expressions to expand. > > Now, you meet a > b first. You try to transform it into vcond <a > b, > -1, 0>, you build this expression, then you try to gimplify it, and > you see that you have something like: > > x' = a >b; > x = vcond <x', -1, 0> > x1 = vcond <x != 0, -1, 0> > vcond <x1, c, d> > > and your gsi stands at the x1 now, so the gimplification created a > comparison that optab would not understand. And I am not really sure > that you would be able to solve this problem easily. > > It would helpr, if you could create vcond<x, op, y, op0, op1>, but you > cant and x op y is a single tree that must be gimplified, and I am not > sure that you can persuade gimplifier to leave this expression > untouched. > > In the attachment the current version of the patch.
I can't reproduce it with your patch. For #define vector(elcount, type) \ __attribute__((vector_size((elcount)*sizeof(type)))) type vector (4, float) x, y; vector (4, int) a,b; int main (int argc, char *argv[]) { vector (4, int) i0 = x < y; vector (4, int) i1 = i0 ? a : b; return 0; } I get from the C frontend: vector(4) int i0 = VEC_COND_EXPR < SAVE_EXPR <x < y> , { -1, -1, -1, -1 } , { 0, 0, 0, 0 } > ; vector(4) int i1 = VEC_COND_EXPR < SAVE_EXPR <i0> , SAVE_EXPR <a> , SAVE_EXPR <b> > ; but I have expected i0 != 0 in the second VEC_COND_EXPR. I do see that the gimplifier pulls away the condition for the first VEC_COND_EXPR though: x.0 = x; y.1 = y; D.2735 = x.0 < y.1; D.2734 = D.2735; D.2736 = D.2734; i0 = [vec_cond_expr] VEC_COND_EXPR < D.2736 , { -1, -1, -1, -1 } , { 0, 0, 0, 0 } > ; which is, I believe because of the SAVE_EXPR wrapped around the comparison. Why do you bother wrapping all operands in save-exprs? With that the /* Currently the expansion of VEC_COND_EXPR does not allow expessions where the type of vectors you compare differs form the type of vectors you select from. For the time being we insert implicit conversions. */ if ((COMPARISON_CLASS_P (ifexp) && TREE_TYPE (TREE_OPERAND (ifexp, 0)) != TREE_TYPE (op1)) || TREE_TYPE (ifexp) != TREE_TYPE (op1)) checks will fail (because ifexp is a SAVE_EXPR). I'll run into errors when not adding the SAVE_EXPR around the ifexp, the transform into x < y ? {-1,...} : {0,...} is not happening. > > Thanks, > Artem. > > > On Mon, Aug 22, 2011 at 9:58 PM, Richard Guenther > <richard.guent...@gmail.com> wrote: >> On Mon, Aug 22, 2011 at 10:49 PM, Artem Shinkarov >> <artyom.shinkar...@gmail.com> wrote: >>> On Mon, Aug 22, 2011 at 9:42 PM, Richard Guenther >>> <richard.guent...@gmail.com> wrote: >>>> On Mon, Aug 22, 2011 at 5:58 PM, Artem Shinkarov >>>> <artyom.shinkar...@gmail.com> wrote: >>>>> On Mon, Aug 22, 2011 at 4:50 PM, Richard Guenther >>>>> <richard.guent...@gmail.com> wrote: >>>>>> On Mon, Aug 22, 2011 at 5:43 PM, Artem Shinkarov >>>>>> <artyom.shinkar...@gmail.com> wrote: >>>>>>> On Mon, Aug 22, 2011 at 4:34 PM, Richard Guenther >>>>>>> <richard.guent...@gmail.com> wrote: >>>>>>>> On Mon, Aug 22, 2011 at 5:21 PM, Artem Shinkarov >>>>>>>> <artyom.shinkar...@gmail.com> wrote: >>>>>>>>> On Mon, Aug 22, 2011 at 4:01 PM, Richard Guenther >>>>>>>>> <richard.guent...@gmail.com> wrote: >>>>>>>>>> On Mon, Aug 22, 2011 at 2:05 PM, Artem Shinkarov >>>>>>>>>> <artyom.shinkar...@gmail.com> wrote: >>>>>>>>>>> On Mon, Aug 22, 2011 at 12:25 PM, Richard Guenther >>>>>>>>>>> <richard.guent...@gmail.com> wrote: >>>>>>>>>>>> On Mon, Aug 22, 2011 at 12:53 AM, Artem Shinkarov >>>>>>>>>>>> <artyom.shinkar...@gmail.com> wrote: >>>>>>>>>>>>> Richard >>>>>>>>>>>>> >>>>>>>>>>>>> I formalized an approach a little-bit, now it works without target >>>>>>>>>>>>> hooks, but some polishing is still required. I want you to >>>>>>>>>>>>> comment on >>>>>>>>>>>>> the several important approaches that I use in the patch. >>>>>>>>>>>>> >>>>>>>>>>>>> So how does it work. >>>>>>>>>>>>> 1) All the vector comparisons at the level of type-checker are >>>>>>>>>>>>> introduced using VEC_COND_EXPR with constant selection operands >>>>>>>>>>>>> being >>>>>>>>>>>>> {-1} and {0}. For example v0 > v1 is transformed into >>>>>>>>>>>>> VEC_COND_EXPR<v0 >>>>>>>>>>>>>> v1, {-1}, {0}>. >>>>>>>>>>>>> >>>>>>>>>>>>> 2) When optabs expand VEC_COND_EXPR, two cases are considered: >>>>>>>>>>>>> 2.a) first operand of VEC_COND_EXPR is comparison, in that case >>>>>>>>>>>>> nothing changes. >>>>>>>>>>>>> 2.b) first operand is something else, in that case, we specially >>>>>>>>>>>>> mark >>>>>>>>>>>>> this case, recognize it in the backend, and do not create a >>>>>>>>>>>>> comparison, but use the mask as it was a result of some >>>>>>>>>>>>> comparison. >>>>>>>>>>>>> >>>>>>>>>>>>> 3) In order to make sure that mask in VEC_COND_EXPR<mask, v0, v1> >>>>>>>>>>>>> is a >>>>>>>>>>>>> vector comparison we use is_vector_comparison function, if it >>>>>>>>>>>>> returns >>>>>>>>>>>>> false, then we replace mask with mask != {0}. >>>>>>>>>>>>> >>>>>>>>>>>>> So we end-up with the following functionality: >>>>>>>>>>>>> VEC_COND_EXPR<mask, v0,v1> -- if we know that mask is a result of >>>>>>>>>>>>> comparison of two vectors, we leave it as it is, otherwise change >>>>>>>>>>>>> with >>>>>>>>>>>>> mask != {0}. >>>>>>>>>>>>> >>>>>>>>>>>>> Plain vector comparison a <op> b is represented with >>>>>>>>>>>>> VEC_COND_EXPR, >>>>>>>>>>>>> which correctly expands, without creating useless masking. >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> Basically for me there are two questions: >>>>>>>>>>>>> 1) Can we perform information passing in optabs in a nicer way? >>>>>>>>>>>>> 2) How is_vector_comparison could be improved? I have several >>>>>>>>>>>>> ideas, >>>>>>>>>>>>> like checking if constant vector all consists of 0 and -1, and so >>>>>>>>>>>>> on. >>>>>>>>>>>>> But first is it conceptually fine. >>>>>>>>>>>>> >>>>>>>>>>>>> P.S. I tired to put the functionality of is_vector_comparison in >>>>>>>>>>>>> tree-ssa-forwprop, but the thing is that it is called only with >>>>>>>>>>>>> -On, >>>>>>>>>>>>> which I find inappropriate, and the functionality gets more >>>>>>>>>>>>> complicated. >>>>>>>>>>>> >>>>>>>>>>>> Why is it inappropriate to not optimize it at -O0? If the user >>>>>>>>>>>> separates comparison and ?: expression it's his own fault. >>>>>>>>>>> >>>>>>>>>>> Well, because all the information is there, and I perfectly envision >>>>>>>>>>> the case when user expressed comparison separately, just to avoid >>>>>>>>>>> code >>>>>>>>>>> duplication. >>>>>>>>>>> >>>>>>>>>>> Like: >>>>>>>>>>> mask = a > b; >>>>>>>>>>> res1 = mask ? v0 : v1; >>>>>>>>>>> res2 = mask ? v2 : v3; >>>>>>>>>>> >>>>>>>>>>> Which in this case would be different from >>>>>>>>>>> res1 = a > b ? v0 : v1; >>>>>>>>>>> res2 = a > b ? v2 : v3; >>>>>>>>>>> >>>>>>>>>>>> Btw, the new hook is still in the patch. >>>>>>>>>>>> >>>>>>>>>>>> I would simply always create != 0 if it isn't and let optimizers >>>>>>>>>>>> (tree-ssa-forwprop.c) optimize this. You still have to deal with >>>>>>>>>>>> non-comparison operands during expansion though, but if >>>>>>>>>>>> you always forced a != 0 from the start you can then simply >>>>>>>>>>>> interpret it as a proper comparison result (in which case I'd >>>>>>>>>>>> modify the backends to have an alternate pattern or directly >>>>>>>>>>>> expand to masking operations - using the fake comparison >>>>>>>>>>>> RTX is too much of a hack). >>>>>>>>>>> >>>>>>>>>>> Richard, I think you didn't get the problem. >>>>>>>>>>> I really need the way, to pass the information, that the expression >>>>>>>>>>> that is in the first operand of vcond is an appropriate mask. I >>>>>>>>>>> though >>>>>>>>>>> for quite a while and this hack is the only answer I found, is >>>>>>>>>>> there a >>>>>>>>>>> better way to do that. I could for example introduce another >>>>>>>>>>> tree-node, but it would be overkill as well. >>>>>>>>>>> >>>>>>>>>>> Now why do I need it so much: >>>>>>>>>>> I want to implement the comparison in a way that {1, 5, 0, -1} is >>>>>>>>>>> actually {-1,-1,-1,-1}. So whenever I am not sure that mask of >>>>>>>>>>> VEC_COND_EXPR is a real comparison I transform it to mask != {0} >>>>>>>>>>> (not >>>>>>>>>>> always). To check the stuff, I use is_vector_comparison in >>>>>>>>>>> tree-vect-generic. >>>>>>>>>>> >>>>>>>>>>> So I really have the difference between mask ? x : y and mask != >>>>>>>>>>> {0} ? >>>>>>>>>>> x : y, otherwise I could treat mask != {0} in the backend as just >>>>>>>>>>> mask. >>>>>>>>>>> >>>>>>>>>>> If this link between optabs and backend breaks, then the patch falls >>>>>>>>>>> apart. Because every time the comparison is taken out >>>>>>>>>>> VEC_COND_EXPR, I >>>>>>>>>>> will have to put != {0}. Keep in mind, that I cannot always put the >>>>>>>>>>> comparison inside the VEC_COND_EXPR, what if it is defined in a >>>>>>>>>>> function-comparison, or somehow else? >>>>>>>>>>> >>>>>>>>>>> So what would be an appropriate way to connect optabs and the >>>>>>>>>>> backend? >>>>>>>>>> >>>>>>>>>> Well, there is no problem in having the only valid mask operand for >>>>>>>>>> VEC_COND_EXPRs being either a comparison or a {-1,...} / {0,....} >>>>>>>>>> mask. >>>>>>>>>> Just the C parser has to transform mask ? vec1 : vec2 to mask != 0 ? >>>>>>>>>> vec1 : vec2. >>>>>>>>> >>>>>>>>> This happens already in the new version of patch (not submitted yet). >>>>>>>>> >>>>>>>>>> This comparison can be eliminated by optimization passes >>>>>>>>>> that >>>>>>>>>> either replace it by the real comparison computing the mask or just >>>>>>>>>> propagating the information this mask is already {-1,...} / {0,....} >>>>>>>>>> by simply >>>>>>>>>> dropping the comparison against zero. >>>>>>>>> >>>>>>>>> This is not a problem, because the backend recognizes these patterns, >>>>>>>>> so no optimization is needed in this part. >>>>>>>> >>>>>>>> I mean for >>>>>>>> >>>>>>>> mask = v1 < v2 ? {-1,...} : {0,...}; >>>>>>>> val = VCOND_EXPR <mask != 0, v3, v4>; >>>>>>>> >>>>>>>> optimizers can see how mask is defined and drop the != 0 test or >>>>>>>> replace >>>>>>>> it by v1 < v2. >>>>>>> >>>>>>> Yes, sure. >>>>>>> >>>>>>>>>> For the backends I'd have vcond patterns for both an embedded >>>>>>>>>> comparison >>>>>>>>>> and for a mask. (Now we can rewind the discussion a bit and allow >>>>>>>>>> arbitrary masks and define a vcond with a mask operand to do bitwise >>>>>>>>>> selection - what matters is the C frontend semantics which we need to >>>>>>>>>> translate to what the middle-end thinks of a VEC_COND_EXPR, they >>>>>>>>>> do not have to agree). >>>>>>>>> >>>>>>>>> But it seems like another combinatorial explosion here. Considering >>>>>>>>> what Richard said in his e-mail, in order to support "generic" vcond, >>>>>>>>> we just need to enumerate all the possible cases. Or I didn't >>>>>>>>> understand right? >>>>>>>> >>>>>>>> Well, the question is still what VCOND_EXPR and thus the vcond pattern >>>>>>>> semantically does for a non-comparison operand. I'd argue that using >>>>>>>> the bitwise selection semantic gives us maximum flexibility and a >>>>>>>> native >>>>>>>> instruction with AMD XOP. This non-comparison VCOND_EXPR is >>>>>>>> also easy to implement in the middle-end expansion code if there is >>>>>>>> no native instruction for it - by simply emitting the bitwise >>>>>>>> operations. >>>>>>>> >>>>>>>> But I have the feeling we are talking past each other ...? >>>>>>> >>>>>>> I am all for the bitwise behaviour in the backend pattern, that is >>>>>>> something that I rely on at the moment. What I don't want to have is >>>>>>> the same behaviour in the frontend. So If we can guarantee, that we >>>>>>> add != 0, when we don't know the "nature" of the mask, then I am >>>>>>> perfectly fine with the back-end having bitwise-selection behaviour. >>>>>> >>>>>> Well, the C frontend would simply always add that != 0 (because it >>>>>> doesn't know). >>>>>> >>>>>>>>> I mean, I don't mind of course, but it seems to me that it would be >>>>>>>>> cleaner to have one generic enough pattern. >>>>>>>>> >>>>>>>>> Is there seriously no way to pass something from optab into the >>>>>>>>> backend?? >>>>>>>> >>>>>>>> You can pass operands. And information is implicitly encoded in the >>>>>>>> name. >>>>>>> >>>>>>> I didn't quite get that, could you give an example? >>>>>> >>>>>> It was a larger variant of "no, apart from what is obvious". >>>>> >>>>> Ha, joking again. :) >>>>> >>>>>>>>>> If the mask is computed by a function you are of course out of luck, >>>>>>>>>> but I don't see how you'd manage to infer knowledge from nowhere >>>>>>>>>> either. >>>>>>>>> >>>>>>>>> Well, take simpler example >>>>>>>>> >>>>>>>>> a = {0}; >>>>>>>>> for ( ; *p; p += 16) >>>>>>>>> a &= pattern > (vec)*p; >>>>>>>>> >>>>>>>>> res = a ? v0 : v1; >>>>>>>>> >>>>>>>>> In this case it is simple to analyse that a is a comparison, but you >>>>>>>>> cannot embed the operations of a into VEC_COND_EXPR. >>>>>>>> >>>>>>>> Sure, but if the above is C source the frontend would generate >>>>>>>> res = a != 0 ? v0 : v1; initially. An optimization pass could still >>>>>>>> track this information and replace VEC_COND_EXPR <a != 0, v0, v1> >>>>>>>> with VEC_COND_EXPR <a, v0, v1> (no existing one would track >>>>>>>> vector contents though). >>>>>>> >>>>>>> Yeah, sure. My point is, that we must be able to pass this information >>>>>>> in the backend, that we checked everything, and we are sure that a is >>>>>>> a corerct mask, please don't add any != 0 to it. >>>>>> >>>>>> But all masks are correct as soon as they appear in a VEC_COND_EXPR. >>>>>> That's the whole point of the bitwise semantics. It's only the C >>>>>> frontend >>>>>> that needs to be careful to impose its stricter semantics. >>>>>> >>>>>> Richard. >>>>>> >>>>> >>>>> Ok, I see the last difference in the approaches we envision. >>>>> I am assuming, that frontend does not put != 0, but the later >>>>> optimisations (veclower in my case) check every mask in VEC_COND_EXPR >>>>> and does the same functionality as you describe. So the philosophical >>>>> question why it is better to first add and then remove, rather than >>>>> just add if needed? >>>> >>>> Well, it's "better be right than sorry". Thus, default to the >>>> conservatively correct >>>> way and let optimizers "optimize" it. >>> >>> How can we get sorry, it is impossible to skip the vcond during the >>> optimisation, but whatever, it is not really so important when to add. >>> Currently I have a bigger problem, see below. >>>> >>>>> In all the rest I think we agreed. >>>> >>>> Fine. >>>> >>>> Thanks, >>>> Richard. >>>> >>>>> >>>>> Artem. >>>>> >>>> >>> >>> I found out that I cannot really gimplify correctly the vcond<a >b , >>> c, d> expression when a > b is vcond<a > b, -1, 0>. The problem is >>> that gimplifier pulls a > b always as a separate expression during the >>> gimplification, and I don't think that we can avoid it. So what >>> happens is: >>> >>> vcond <a > b , c , d> >>> transformed to >>> x = b > c; >>> x1 = vcond <x , -1, 0> >>> vcond <x1, c, d> >>> >>> and so on, infinitely long. >> >> Sounds like a bug that is possible to fix. >> >>> In order to fix the problem we need whether to introduce a new code >>> like VEC_COMP_LT, VEC_COMP_GT, and so on >>> whether a builtin function which we would lower >>> whether stick back to the idea of hook. >>> >>> Anyway, representing a >b using vcond does not work. >> >> Well, sure it will work, it just needs some work appearantly. >> >>> What would be your thinking here? >> >> Do you have a patch that exposes this problem? I can have a look >> tomorrow. >> >> Richard. >> >>> >>> Thanks, >>> Artem. >>> >> >