On 25.02.2013 06:03, Nozomi Kodama wrote:
+    out.u.m[2][i] = v.z / signed_det;
+    out.u.m[3][i] = v.w / signed_det;
     }

     *pout = out;

While you are at it, you could also fix the indentation of the out* lines, the "}", "*pout = out;" and "return pout;".

> signed_det = (i % 2)? -det: det;
Couldn't you just use something like "det = -det;" instead of the modulo? This should be a little bit faster.
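
To illustrate the point, here is a minimal sketch showing that flipping the sign each iteration produces the same sequence as the modulo form. The helper names signs_modulo and signs_flip are made up for this example and are not part of the patch.

```c
/* Hypothetical helper: the modulo form quoted from the patch. */
static void signs_modulo(float det, float signs[4])
{
    int i;

    for (i = 0; i < 4; i++)
        signs[i] = (i % 2) ? -det : det;
}

/* Hypothetical helper: same sequence, negating det after each iteration. */
static void signs_flip(float det, float signs[4])
{
    int i;

    for (i = 0; i < 4; i++)
    {
        signs[i] = det;
        det = -det;
    }
}
```

Both fill signs[] with det, -det, det, -det; the second form just replaces the per-iteration modulo and branch with a negation. Whether that is measurably faster would have to be checked with the compiler in use.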


I ran a few small speed tests. You could also avoid the many variable assignments, such as *pout = out, by using 4 vecs instead. This should save ~48 assignments and improve the speed a bit further (~10%). Native is still 40% faster than that, though.

With the change above it should look like:
int i;
D3DXVECTOR4 v, vec[4];
FLOAT det;
...
for (i = 0; i < 4; i++)
{
    vec[i].x = pm->u.m[i][0];
    vec[i].y = pm->u.m[i][1];
    vec[i].z = pm->u.m[i][2];
    vec[i].w = pm->u.m[i][3];
}

for (i = 0; i < 4; i++)
{
    switch (i)
    {
        case 0: D3DXVec4Cross(&v, &vec[1], &vec[2], &vec[3]); break;
        case 1: D3DXVec4Cross(&v, &vec[0], &vec[2], &vec[3]); break;
        case 2: D3DXVec4Cross(&v, &vec[0], &vec[1], &vec[3]); break;
        case 3: D3DXVec4Cross(&v, &vec[0], &vec[1], &vec[2]); break;
    }
    pout->u.m[0][i] = v.x / det;
    pout->u.m[1][i] = v.y / det;
    pout->u.m[2][i] = v.z / det;
    pout->u.m[3][i] = v.w / det;
    det = -det;
}
return pout;

Maybe we could reuse some calculations from the D3DXVec4Cross function ...
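
One possible reading of that idea, as a sketch only: the determinant is the dot product of row 0 with the first cross product (cofactor expansion along row 0), so it would not need to be computed separately. The types and names below (vec4, mat4, vec4_cross, mat4_inverse) are stand-ins invented for this example so that it is self-contained; they are not the D3DX types or functions, and this is not meant as the final patch.

```c
#include <stddef.h>

/* Stand-in types for illustration; not the real D3DXVECTOR4 / D3DXMATRIX. */
typedef struct { float x, y, z, w; } vec4;
typedef struct { float m[4][4]; } mat4;

/* 4D cross product: the cofactors of the first row of a 4x4 matrix
 * whose remaining rows are a, b and c. */
static vec4 vec4_cross(const vec4 *a, const vec4 *b, const vec4 *c)
{
    vec4 r;

    r.x =   a->y * (b->z * c->w - c->z * b->w) - a->z * (b->y * c->w - c->y * b->w)
          + a->w * (b->y * c->z - c->y * b->z);
    r.y = -(a->x * (b->z * c->w - c->z * b->w) - a->z * (b->x * c->w - c->x * b->w)
          + a->w * (b->x * c->z - c->x * b->z));
    r.z =   a->x * (b->y * c->w - c->y * b->w) - a->y * (b->x * c->w - c->x * b->w)
          + a->w * (b->x * c->y - c->x * b->y);
    r.w = -(a->x * (b->y * c->z - c->y * b->z) - a->y * (b->x * c->z - c->x * b->z)
          + a->z * (b->x * c->y - c->x * b->y));
    return r;
}

/* Returns out, or NULL if the matrix is singular. out may alias in,
 * because all rows are copied before anything is written. */
static mat4 *mat4_inverse(mat4 *out, const mat4 *in)
{
    vec4 row[4], v[4];
    float det;
    int i;

    for (i = 0; i < 4; i++)
    {
        row[i].x = in->m[i][0];
        row[i].y = in->m[i][1];
        row[i].z = in->m[i][2];
        row[i].w = in->m[i][3];
    }

    /* The first cross product also yields the determinant. */
    v[0] = vec4_cross(&row[1], &row[2], &row[3]);
    det = row[0].x * v[0].x + row[0].y * v[0].y + row[0].z * v[0].z + row[0].w * v[0].w;
    if (det == 0.0f) return NULL;

    v[1] = vec4_cross(&row[0], &row[2], &row[3]);
    v[2] = vec4_cross(&row[0], &row[1], &row[3]);
    v[3] = vec4_cross(&row[0], &row[1], &row[2]);

    for (i = 0; i < 4; i++)
    {
        out->m[0][i] = v[i].x / det;
        out->m[1][i] = v[i].y / det;
        out->m[2][i] = v[i].z / det;
        out->m[3][i] = v[i].w / det;
        det = -det; /* alternate the sign instead of using i % 2 */
    }
    return out;
}
```

Unrolling the four cross products also removes the switch inside the loop. The 2x2 sub-determinants inside vec4_cross are still recomputed across the four calls, so there may be more to share there.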

Cheers
Rico

