http://llvm.org/bugs/show_bug.cgi?id=6209
Summary: Keep Results Into SIMD Registers
Product: new-bugs
Version: unspecified
Platform: PC
OS/Version: All
Status: NEW
Keywords: code-quality
Severity: normal
Priority: P2
Component: new bugs
AssignedTo: [email protected]
ReportedBy: [email protected]
CC: [email protected]
An article by Gustavo Oliveira:
http://www.gamasutra.com/view/feature/4248/designing_fast_crossplatform_simd_.php?page=3
See the part: 2. Keep Results Into SIMD Registers
It advises against writing code like this (note that deltalength is a
float):
Vec4& x2 = m_x[i2];
Vec4 delta = x2-x1;
float deltalength = Sqrt(Dot(delta,delta));
float diff = (deltalength-restlength)/deltalength;
x1 += delta*half*diff;
x2 -= delta*half*diff;
He says this is expensive because the compiler has to generate code that
moves data back and forth between the SIMD and FPU registers. He suggests
writing it like this instead, assuming the "Dot" function above replicates
its result across the components of the SIMD register, with the "w"
component zeroed out. The expensive conversions are then no longer necessary:
Vec4& x2 = m_x[i2];
Vec4 delta = x2-x1;
Vec4 deltalength = Sqrt(Dot(delta,delta));
Vec4 diff = (deltalength-restlength)/deltalength;
x1 += delta*half*diff;
x2 -= delta*half*diff;
Maybe LLVM could perform a similar optimization in some cases (I don't think
LLVM does this already, but I may be wrong).
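
To make the two variants above concrete, here is a minimal sketch using SSE
intrinsics. It is not the article's code: Vec4 is assumed to be a plain
__m128, Dot is assumed to broadcast the sum into all four lanes (with the
"w" lane of its inputs already zero), and RelaxScalar / RelaxVector are
hypothetical names chosen for illustration.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (x86 only)
#include <cmath>

// Assumption: Vec4 is just one 128-bit SSE register (x, y, z, w).
typedef __m128 Vec4;

// Dot product replicated into all four lanes.
// Assumes the w lane of both inputs is zero.
static inline Vec4 Dot(Vec4 a, Vec4 b) {
    Vec4 m = _mm_mul_ps(a, b);  // (x, y, z, w)
    // Add neighbouring pairs: (x+y, x+y, z+w, z+w)
    Vec4 s = _mm_add_ps(m, _mm_shuffle_ps(m, m, _MM_SHUFFLE(2, 3, 0, 1)));
    // Add the two halves: every lane now holds x+y+z+w
    return _mm_add_ps(s, _mm_shuffle_ps(s, s, _MM_SHUFFLE(1, 0, 3, 2)));
}

// Variant 1: the length is extracted into a float, then broadcast back.
static inline void RelaxScalar(Vec4& x1, Vec4& x2, float restlength) {
    Vec4 delta = _mm_sub_ps(x2, x1);
    float deltalength = std::sqrt(_mm_cvtss_f32(Dot(delta, delta)));
    float diff = (deltalength - restlength) / deltalength;
    Vec4 step = _mm_mul_ps(delta, _mm_set1_ps(0.5f * diff));
    x1 = _mm_add_ps(x1, step);
    x2 = _mm_sub_ps(x2, step);
}

// Variant 2: every intermediate value stays in a SIMD register.
static inline void RelaxVector(Vec4& x1, Vec4& x2, Vec4 restlength) {
    Vec4 delta = _mm_sub_ps(x2, x1);
    Vec4 deltalength = _mm_sqrt_ps(Dot(delta, delta));
    Vec4 diff = _mm_div_ps(_mm_sub_ps(deltalength, restlength), deltalength);
    Vec4 step = _mm_mul_ps(_mm_mul_ps(delta, _mm_set1_ps(0.5f)), diff);
    x1 = _mm_add_ps(x1, step);
    x2 = _mm_sub_ps(x2, step);
}
```

The request would be for the optimizer to turn something shaped like
RelaxScalar into something shaped like RelaxVector when the scalar is only
ever broadcast back into a vector. Note that on x86-64 scalar floats already
live in XMM registers, so the cost there is likely the extract and broadcast
rather than a move between register files; the penalty the article describes
should matter more on targets with separate FPU and SIMD register files.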
--------------------
Further down, the article has another interesting section:
3. Re-Arrange Data to Be Friendly to SIMD operations
But this looks like a harder optimization for a C compiler to perform.
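
As a rough illustration of why data layout matters here, the sketch below
contrasts an array-of-structures layout with a structure-of-arrays layout.
This is one common reading of "re-arranging data" and not necessarily the
exact transformation the article describes; PointAoS, PointsSoA and both
function names are hypothetical.

```cpp
#include <xmmintrin.h>  // SSE intrinsics (x86 only)
#include <cstddef>

// Array-of-structures: one point per element, w is padding.
struct PointAoS { float x, y, z, w; };

// Structure-of-arrays (hypothetical layout): each component is contiguous.
struct PointsSoA {
    const float* x;
    const float* y;
    const float* z;
};

// AoS: one squared length per iteration, computed with scalar arithmetic.
void SquaredLengthsAoS(const PointAoS* p, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = p[i].x * p[i].x + p[i].y * p[i].y + p[i].z * p[i].z;
}

// SoA: four squared lengths per iteration, with no shuffles needed.
// Assumes n is a multiple of 4.
void SquaredLengthsSoA(const PointsSoA& p, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; i += 4) {
        __m128 x = _mm_loadu_ps(p.x + i);
        __m128 y = _mm_loadu_ps(p.y + i);
        __m128 z = _mm_loadu_ps(p.z + i);
        __m128 r = _mm_add_ps(_mm_add_ps(_mm_mul_ps(x, x), _mm_mul_ps(y, y)),
                              _mm_mul_ps(z, z));
        _mm_storeu_ps(out + i, r);
    }
}
```

Going from the first layout to the second changes the in-memory
representation of the program's data structures, which is why it seems hard
for a compiler to do automatically.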