On 8/22/2012 7:19 PM, bearophile wrote:
Some time ago I suggested adding support for vector comparisons to
D, because this is sometimes useful and modern SIMD units have
hardware support for such operations:


I think that code is semantically equivalent to:

void main() {
     double[] a = [1.0, 1.0, -1.0, 1.0, 0.0, -1.0];
     double[] b = [10,   20,   30,  40,  50,   60];
     double[] c = [1,     2,    3,   4,   5,    6];
     foreach (i; 0 .. a.length)
         if (a[i] > 0)
             b[i] += c[i];
}


After that code b is:
[11, 22, 30, 44, 50, 60]


This means the contents of the 'then' branch of the vectorized
comparison are executed only for the items of b and c where the
comparison is true.

This looks useful. Is it possible to implement this in D, and do you
like it?

Well, right now the binary operators == != >= <= > and < are required to return bool instead of allowing a user-defined type, which prevents a lot of the sugar you would want to make the code nice to write. Without the sugar the code ends up like this:

foreach(i; 0 .. a.length)
{
    float4 mask = greaterThan(a[i], float4(0,0,0,0));
    b[i] = select(mask, b[i] + c[i], b[i]);
}

In GPU shader land this expression is at least simpler to write:

foreach(i; 0 .. a.length)
{
    b[i] = (a[i] > 0) ? (b[i] + c[i]) : b[i];
}


All of these implementations are equivalent and remove the branch from the code flow, which is pretty nice for the CPU pipeline. In SIMD the comparisons write their masks into a register which you can use immediately. On modern (SSE4.1) CPUs the select is a single instruction; on older ones it takes three: (mask & A) | (~mask & B), but it's all better than a real branch.
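
As a rough illustration of the three-operation select, here is a minimal sketch in plain D. It uses ordinary 4-element arrays rather than real SIMD registers, and the Lanes and Mask aliases and the bodies of greaterThan and select are assumptions made for this example, not an existing library API:

```d
// Hypothetical stand-ins for a 4-lane SIMD register and its compare mask.
alias Lanes = float[4];
alias Mask = uint[4];

// Per-lane compare: all bits set where x > y, zero otherwise.
Mask greaterThan(const Lanes x, const Lanes y)
{
    Mask m;
    foreach (i; 0 .. 4)
        m[i] = x[i] > y[i] ? 0xFFFF_FFFF : 0;
    return m;
}

// Branch-free select on the raw bits: (mask & a) | (~mask & b).
Lanes select(const Mask mask, const Lanes a, const Lanes b)
{
    Lanes r;
    foreach (i; 0 .. 4)
    {
        uint aBits = *cast(const(uint)*)&a[i];
        uint bBits = *cast(const(uint)*)&b[i];
        uint rBits = (mask[i] & aBits) | (~mask[i] & bBits);
        r[i] = *cast(float*)&rBits;
    }
    return r;
}

void main()
{
    Lanes a = [1.0f, -1.0f, 0.0f, 2.0f];
    Lanes b = [10, 20, 30, 40];
    Lanes c = [1, 2, 3, 4];
    Lanes zero = [0, 0, 0, 0];

    Lanes sum;
    sum[] = b[] + c[];                 // computed for every lane, no branch
    Lanes r = select(greaterThan(a, zero), sum, b);

    Lanes expected = [11, 20, 30, 44]; // only lanes where a > 0 took the sum
    assert(r == expected);
}
```

Note that b + c is computed for every lane and the mask only decides which result is kept; that is the price of removing the branch.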

If you have a large amount of code needing a branch, you can take the mask generated by the compare, extract it into a CPU register, and test whether it is zero, nonzero, or has specific bits set. A float4 comparison ends up generating 4 bits, so the code with a real branch is like:

if (any(a[i] > 0))
{
    // do stuff if any of a[i] are greater than zero
}
if (all(a[i] > 0))
{
    // do stuff if all of a[i] are greater than zero
}
if ((getMask(a[i] > 0) & 0x7) == 0x7)
{
    // do stuff if the first three elements are greater than zero
}
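
For illustration, the 4-bit mask tests could be sketched in plain D as below. Since the comparison operators cannot return a user-defined type, the compare is written as a function here; greaterMask, anyLane and allLanes are hypothetical names chosen for this sketch, and the arrays stand in for a real float4:

```d
// Hypothetical stand-in for a 4-lane SIMD register.
alias Lanes = float[4];

// Pack one bit per lane: bit i is set when x[i] > y[i].
uint greaterMask(const Lanes x, const Lanes y)
{
    uint bits = 0;
    foreach (i; 0 .. 4)
        if (x[i] > y[i])
            bits |= 1u << i;
    return bits;
}

// True if at least one lane passed the comparison.
bool anyLane(uint bits) { return bits != 0; }

// True only if all four lanes passed the comparison.
bool allLanes(uint bits) { return bits == 0xF; }

void main()
{
    Lanes zero = [0, 0, 0, 0];
    Lanes v = [1.0f, 2.0f, 3.0f, -1.0f];

    uint bits = greaterMask(v, zero);  // lanes 0, 1 and 2 pass
    assert(bits == 0x7);
    assert(anyLane(bits));
    assert(!allLanes(bits));
    assert((bits & 0x7) == 0x7);       // first three lanes are greater than zero
}
```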

