On 25.02.2013 06:03, Nozomi Kodama wrote:
+ out.u.m[2][i] = v.z / signed_det;
+ out.u.m[3][i] = v.w / signed_det;
}
*pout = out;
While you are at it, you could also fix the indentation of the out* lines,
"}", "*pout = out;" and "return pout;".
> signed_det = (i % 2)? -det: det;
Couldn't you just use something like "det = -det;" instead of the
modulo? This should be a little bit faster.
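To illustrate what I mean, here is a tiny standalone sketch. The two helper names are made up for this mail and are not part of the patch; they only show that flipping the sign after each iteration yields the same sequence as the modulo test.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: pick the sign with the modulo, as in the patch. */
void signs_by_modulo(float det, float out[4])
{
    int i;

    assert(out != NULL);
    for (i = 0; i < 4; i++)
        out[i] = (i % 2) ? -det : det;
}

/* Hypothetical helper: negate det in place at the end of each iteration. */
void signs_by_negation(float det, float out[4])
{
    int i;

    assert(out != NULL);
    for (i = 0; i < 4; i++)
    {
        out[i] = det;
        det = -det;
    }
}
```

Both fill the array with det, -det, det, -det. Whether the second form is measurably faster will depend on the compiler, so it is worth timing before relying on it.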
I did some small speed tests, with the following results. You could also
avoid a lot of variable assignments, such as *pout = out, by using 4 vecs
instead. This should save ~48 assignments and should also improve the speed
a bit more (~10%). Native is still 40% faster than that, though.
With the changes above, it should look like:
    int i;
    D3DXVECTOR4 v, vec[4];
    FLOAT det;
    ...
    for (i = 0; i < 4; i++)
    {
        vec[i].x = pm->u.m[i][0];
        vec[i].y = pm->u.m[i][1];
        vec[i].z = pm->u.m[i][2];
        vec[i].w = pm->u.m[i][3];
    }
    for (i = 0; i < 4; i++)
    {
        switch (i)
        {
            case 0: D3DXVec4Cross(&v, &vec[1], &vec[2], &vec[3]); break;
            case 1: D3DXVec4Cross(&v, &vec[0], &vec[2], &vec[3]); break;
            case 2: D3DXVec4Cross(&v, &vec[0], &vec[1], &vec[3]); break;
            case 3: D3DXVec4Cross(&v, &vec[0], &vec[1], &vec[2]); break;
        }
        pout->u.m[0][i] = v.x / det;
        pout->u.m[1][i] = v.y / det;
        pout->u.m[2][i] = v.z / det;
        pout->u.m[3][i] = v.w / det;
        det = -det;
    }
    return pout;
Maybe we could reuse some calculations from the D3DXVec4Cross function ...
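For experimenting outside the tree, here is a rough standalone sketch of the same loop. Everything in it is an assumption made for illustration: struct vec4, struct mat4, vec4_cross and mat4_inverse are stand-ins I made up so that it builds without the D3DX headers, the determinant is taken as row 0 dotted with the first cross product, and returning NULL for a singular matrix is just my choice here. As one guess at what could be reused, the cross product computes each of its six 2x2 sub-determinants only once. I have not timed this version.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for D3DXVECTOR4 and D3DXMATRIX (hypothetical, for this sketch only). */
struct vec4 { float x, y, z, w; };
struct mat4 { float m[4][4]; };

/* 4D cross product: component j is the cofactor of e_j in det[e; a; b; c]. */
void vec4_cross(struct vec4 *out, const struct vec4 *a,
                const struct vec4 *b, const struct vec4 *c)
{
    struct vec4 r;
    /* The six 2x2 sub-determinants of b and c, each computed once. */
    float xy = b->x * c->y - c->x * b->y;
    float xz = b->x * c->z - c->x * b->z;
    float xw = b->x * c->w - c->x * b->w;
    float yz = b->y * c->z - c->y * b->z;
    float yw = b->y * c->w - c->y * b->w;
    float zw = b->z * c->w - c->z * b->w;

    r.x = a->y * zw - a->z * yw + a->w * yz;
    r.y = -(a->x * zw - a->z * xw + a->w * xz);
    r.z = a->x * yw - a->y * xw + a->w * xy;
    r.w = -(a->x * yz - a->y * xz + a->z * xy);
    *out = r; /* written last, so out may alias an input */
}

/* Returns out, or NULL if the matrix is singular. out may alias m,
 * because all four rows are copied before anything is written. */
struct mat4 *mat4_inverse(struct mat4 *out, const struct mat4 *m)
{
    struct vec4 v, vec[4];
    float det;
    int i;

    assert(out != NULL && m != NULL);

    for (i = 0; i < 4; i++)
    {
        vec[i].x = m->m[i][0];
        vec[i].y = m->m[i][1];
        vec[i].z = m->m[i][2];
        vec[i].w = m->m[i][3];
    }

    /* Cofactor expansion along row 0: det = row0 . cross(row1, row2, row3). */
    vec4_cross(&v, &vec[1], &vec[2], &vec[3]);
    det = vec[0].x * v.x + vec[0].y * v.y + vec[0].z * v.z + vec[0].w * v.w;
    if (det == 0.0f) return NULL;

    for (i = 0; i < 4; i++)
    {
        switch (i)
        {
            case 0: vec4_cross(&v, &vec[1], &vec[2], &vec[3]); break;
            case 1: vec4_cross(&v, &vec[0], &vec[2], &vec[3]); break;
            case 2: vec4_cross(&v, &vec[0], &vec[1], &vec[3]); break;
            case 3: vec4_cross(&v, &vec[0], &vec[1], &vec[2]); break;
        }
        /* Column i of the inverse is the cofactor row i divided by det. */
        out->m[0][i] = v.x / det;
        out->m[1][i] = v.y / det;
        out->m[2][i] = v.z / det;
        out->m[3][i] = v.w / det;
        det = -det; /* the cofactor sign alternates with the row index */
    }
    return out;
}
```

A quick sanity check is to invert a scale-plus-translation matrix, where the expected inverse is easy to write down by hand, and compare the entries.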
Cheers
Rico