Re: getting an unpleasant lack of symmetry for a cross product with avx2 target only.

Matt Pharr Mon, 12 Sep 2016 09:20:23 -0700

I'm sorry, but I'm not quite following what you're expecting versus what
you're seeing. For example, if I compile that with --target=avx2, I see a
series of three multiplies followed by FMA instructions, which seems about
as good as it gets.

Is it that, if you have code that computes both Cross(v0, v1) and
Cross(v1, v0), you're expecting the compiler to generate code for just one
cross product and then negate its result for the second one? In that case,
I suspect the issue is the fused multiply-add (FMA) instruction in AVX2,
which ispc uses by default. An FMA doesn't necessarily give precisely the
same result as a separate multiply and add: the FMA rounds to floating
point only once, at the end, while the MUL, ADD pair rounds twice. With one
product rounded and the other folded into the FMA, swapping the operands can
change which product gets rounded, so the two results need not be exact
negations of each other. This in turn may inhibit the optimization you're
expecting.
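
To make the rounding difference described above concrete, here is a rough
sketch in plain C (not ispc). Everything in it is an assumption made for
illustration: the helper names, the test vectors, and in particular the
choice of which product ends up inside the FMA. The fused path is emulated
by forming one product exactly in double (a float times a float always fits)
rather than by calling a real FMA instruction, so it rounds twice at the end
and could differ from hardware in rare cases. C99's fmaf() from <math.h>
would be the direct equivalent.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { float x, y, z; } Vec3;

/* a*b - c*d with both products rounded to float before the subtract,
   as a MUL, MUL, SUB sequence would do. The volatile temporaries keep
   the compiler from fusing the operations on its own. */
float mul_sub_separate(float a, float b, float c, float d)
{
    volatile float ab = a * b;
    volatile float cd = c * d;
    return ab - cd;
}

/* a*b - c*d with c*d rounded to float first and a*b kept exact, which
   mimics a MUL followed by an FMA (emulated, see the note above). */
float mul_sub_fused(float a, float b, float c, float d)
{
    volatile float cd = c * d;
    return (float)((double)a * (double)b - (double)cd);
}

Vec3 cross_separate(Vec3 v1, Vec3 v2)
{
    Vec3 v;
    v.x = mul_sub_separate(v1.y, v2.z, v2.y, v1.z);
    v.y = mul_sub_separate(v1.z, v2.x, v2.z, v1.x);
    v.z = mul_sub_separate(v1.x, v2.y, v2.x, v1.y);
    return v;
}

Vec3 cross_fused(Vec3 v1, Vec3 v2)
{
    Vec3 v;
    v.x = mul_sub_fused(v1.y, v2.z, v2.y, v1.z);
    v.y = mul_sub_fused(v1.z, v2.x, v2.z, v1.x);
    v.z = mul_sub_fused(v1.x, v2.y, v2.x, v1.y);
    return v;
}

/* True when p is exactly the componentwise negation of q. */
bool is_negation(Vec3 p, Vec3 q)
{
    return p.x == -q.x && p.y == -q.y && p.z == -q.z;
}

void demo(void)
{
    /* The x component uses 1.1f*1.3f on both sides, and that product
       is not exactly representable in float. */
    Vec3 a = { 1.0f, 1.1f, 1.1f };
    Vec3 b = { 2.0f, 1.3f, 1.3f };

    /* Separate rounding: the swap only flips the sign. */
    assert(is_negation(cross_separate(a, b), cross_separate(b, a)));

    /* Fused: x is the rounding error of the product in both orders,
       so the two results are equal there instead of opposite. */
    assert(!is_negation(cross_fused(a, b), cross_fused(b, a)));
}
```

Calling demo() should pass both checks. In the fused version the x
component comes out as the same small nonzero value for Cross(a, b) and
Cross(b, a), which is the kind of asymmetry being reported.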

Using --opt=fast-math might help with that case (but I'm not sure). There's
also --opt=disable-fma, which may help (but may not be what you want in
general...)
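
For comparison, the input-ordering workaround mentioned in the quoted
message below might look roughly like this in plain C (again, not ispc).
This is only a guess at the idea: the names cross_raw, bits_greater and
cross_ordered are made up here, and negating the result after a swap is an
assumption about how the sign is restored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

typedef struct { float x, y, z; } Vec3;

/* The cross product as written in the original kernel. */
Vec3 cross_raw(Vec3 v1, Vec3 v2)
{
    Vec3 v;
    v.x = v1.y * v2.z - v2.y * v1.z;
    v.y = v1.z * v2.x - v2.z * v1.x;
    v.z = v1.x * v2.y - v2.x * v1.y;
    return v;
}

/* Lexicographic "greater than" on the raw 96 bits of two vectors. */
bool bits_greater(Vec3 a, Vec3 b)
{
    uint32_t ua[3], ub[3];
    memcpy(ua, &a, sizeof ua);
    memcpy(ub, &b, sizeof ub);
    for (int i = 0; i < 3; ++i)
        if (ua[i] != ub[i])
            return ua[i] > ub[i];
    return false;
}

/* Always evaluate the product with the operands in one canonical
   order, then flip the sign if they had to be swapped. Negation is
   exact, so the two call orders give exactly opposite results. */
Vec3 cross_ordered(Vec3 v0, Vec3 v1)
{
    bool swapped = bits_greater(v0, v1);
    if (swapped) {
        Vec3 t = v0;
        v0 = v1;
        v1 = t;
    }
    Vec3 r = cross_raw(v0, v1);
    if (swapped) {
        r.x = -r.x;
        r.y = -r.y;
        r.z = -r.z;
    }
    return r;
}
```

Because both call orders funnel into the same cross_raw(v0, v1)
evaluation, the result no longer depends on whether the compiler fuses the
arithmetic. The cost is the extra compare and swap on every call, and two
bit-identical inputs are not covered by the ordering at all.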

Thanks,
Matt

On Mon, Sep 12, 2016 at 8:36 AM, Morten Mikkelsen <[email protected]>
wrote:

> I have this basic cross product declared like this in my kernel:
>
> inline const Vec3 Cross(const Vec3 v1, const Vec3 v2)
> {
>     Vec3 v;
>     v.x = v1.y*v2.z - v2.y*v1.z;
>     v.y = v1.z*v2.x - v2.z*v1.x;
>     v.z = v1.x*v2.y - v2.x*v1.y;
>     return v;
> }
>
> Though I get that floating point operations are, generally speaking,
> order-dependent, I was expecting, given the simplicity of a cross product,
> that we'd still find Cross(v0,v1) = -Cross(v1,v0). Yet when I build for the
> avx2 target this is not what I'm seeing. For all other targets (SSE2
> through AVX1) it works, in the sense that the above property holds.
>
> Right now I'm working around it by doing a sort of 96-bit check for v0>v1
> and, if so, swapping the inputs, but it's adding overhead. So I wanted to
> ask whether you guys would consider this a bug on the ispc side or not. Or
> is it simply that I can't reasonably expect Cross(v0,v1) = -Cross(v1,v0)
> from the generated code?
>
>
> Thank you,
>
> Morten S. Mikkelsen
>
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "Intel SPMD Program Compiler Users" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

