On Wed, 24 Oct 2012, Mike Stump wrote:
> Well, I suspect the OpenCL community had a ton of people sweat over the
> details and take into consideration the realities and the needs of
> people. I'd like to believe they had more people in on this and that
> this was a compromise for someone's vector unit.

Intel's, apparently...

> The problem is, what if your vector unit produces 0, 1, to be compatible
> with C? Suddenly, spilling this onto ? is annoying, both because it
> doesn't match hardware, nor the expected semantics of a person that just
> knows C. Maybe we run a poll of vector units that prefer -1 or prefer 1
> for true, and then decide. SSE has CMPPS, which likes the -1.
> Altivec's vec_cmpgt says true is all bits 1. Gosh, I guess we can stop
> there. Neon, for fun VCGE is defined to set to all ones for true.
> OpenCL it is.

We already decided that comparisons return -1. Most processors agree on
that, except SPARC, which doesn't return a vector at all. The question is
about the selection instructions.
{-2,-1,0,1} ? {x,x,x,x} : {y,y,y,y}
OpenCL says this should be {x,x,y,y}. We are considering making it
{x,x,y,x} instead. Hardware selection instructions vary a lot. OpenCL
follows x86, what we are considering matches Power IIRC, and ARM only has
a bitwise selection (I only quickly glanced at all of these, I may have
read them wrong).
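
To make the three candidate semantics concrete, here is a rough scalar
sketch in plain C. It is only an illustration, not what any compiler or
target actually emits: the function names are made up for this mail, the
element type is assumed to be a 32-bit int, and the "bitwise" variant is
my reading of a generic mask-based select.

```c
#include <stdint.h>

/* OpenCL-style: an element is "true" when its most significant (sign) bit is set. */
static int32_t select_msb(int32_t c, int32_t x, int32_t y)
{
    return c < 0 ? x : y;
}

/* C-style: an element is "true" when it is nonzero. */
static int32_t select_nonzero(int32_t c, int32_t x, int32_t y)
{
    return c != 0 ? x : y;
}

/* Bitwise: each bit of the condition picks the matching bit of x or y. */
static int32_t select_bitwise(int32_t c, int32_t x, int32_t y)
{
    uint32_t m = (uint32_t)c;
    return (int32_t)((m & (uint32_t)x) | (~m & (uint32_t)y));
}

/* Apply one of the scalar rules above to each of four lanes (hypothetical helper). */
static void select4(int32_t (*sel)(int32_t, int32_t, int32_t),
                    const int32_t c[4], const int32_t x[4],
                    const int32_t y[4], int32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = sel(c[i], x[i], y[i]);
}
```

With the condition {-2,-1,0,1}, select_msb picks x for the first two
lanes only, while select_nonzero also picks x for the last lane. The
three variants agree whenever the condition comes from a comparison
(all lanes 0 or -1), so the choice only matters for hand-written masks.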
I am fine with both alternatives, but the choice is important...
--
Marc Glisse