Re: d3dx9: Avoid expensive computations

Rico Schüller Tue, 26 Feb 2013 14:01:57 -0800

Hi Nozomi,

this is pretty fast. Here are some numbers (run time on my machine, so they might not be that representative):


before: 43s
previous patch: 27s
this patch: 21s
native: 16s

So from the speed point of view, it's a lot closer to native than the other versions.

Though, I would split this into 2 patches, one for D3DXMatrixDeterminant and one for D3DXMatrixInverse. I think it's a nice step forward. Though we might test the speed of an SSE version and may use it later ...
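
For reference, one common way to cut the multiplication count in a 4x4 determinant is to compute the twelve 2x2 sub-determinants once and combine them. The following is only a minimal sketch of that idea, not the code from the patch; `struct mat4x4` and `det4x4` are hypothetical names, and a plain row-major float layout is assumed.

```c
/* Hypothetical stand-in for a 4x4 float matrix (row-major); not D3DXMATRIX. */
struct mat4x4
{
    float m[4][4];
};

/* Determinant via Laplace expansion along the first two rows:
 * six 2x2 minors from rows 0-1 are paired with the six complementary
 * 2x2 minors from rows 2-3, so every product is computed only once. */
static float det4x4(const struct mat4x4 *a)
{
    const float (*m)[4] = a->m;

    /* 2x2 minors of rows 0 and 1 */
    float s0 = m[0][0] * m[1][1] - m[1][0] * m[0][1];
    float s1 = m[0][0] * m[1][2] - m[1][0] * m[0][2];
    float s2 = m[0][0] * m[1][3] - m[1][0] * m[0][3];
    float s3 = m[0][1] * m[1][2] - m[1][1] * m[0][2];
    float s4 = m[0][1] * m[1][3] - m[1][1] * m[0][3];
    float s5 = m[0][2] * m[1][3] - m[1][2] * m[0][3];

    /* 2x2 minors of rows 2 and 3 */
    float c5 = m[2][2] * m[3][3] - m[3][2] * m[2][3];
    float c4 = m[2][1] * m[3][3] - m[3][1] * m[2][3];
    float c3 = m[2][1] * m[3][2] - m[3][1] * m[2][2];
    float c2 = m[2][0] * m[3][3] - m[3][0] * m[2][3];
    float c1 = m[2][0] * m[3][2] - m[3][0] * m[2][2];
    float c0 = m[2][0] * m[3][1] - m[3][0] * m[2][1];

    return s0 * c5 - s1 * c4 + s2 * c3 + s3 * c2 - s4 * c1 + s5 * c0;
}
```

The same twelve minors can be reused to build the adjugate for the inverse, which is one reason splitting determinant and inverse into separate patches still lets both share the approach.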


Are there any other opinions?

Cheers
Rico


On 25.02.2013 12:34, Nozomi Kodama wrote:

Rico,

can you give this patch a try?
If it is only slightly slower than native, we could merge it for now.


Anyway, if the application is well coded, this function should not be
called often. Usually an application uses transformation matrices that
are a lot easier to invert.

Nozomi
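
To illustrate the point about matrices that are easier to invert: a rigid transformation (rotation plus translation, no scaling) can be inverted by transposing the 3x3 rotation part and rotating the negated translation, with no determinant or division at all. This is a minimal sketch under that assumption; `struct xform4` and the helper names are hypothetical, and the row-vector convention used by D3D (translation in the fourth row) is assumed.

```c
/* Hypothetical 4x4 matrix, row-major, row-vector convention:
 * rows 0-2 hold the rotation R, row 3 holds the translation t. */
struct xform4
{
    float m[4][4];
};

/* Inverse of a rigid transform: rotation becomes R^T,
 * translation becomes -t * R^T. Only valid when R is orthonormal. */
static void invert_rigid(struct xform4 *out, const struct xform4 *in)
{
    struct xform4 r;
    int i, j;

    for (i = 0; i < 3; i++)
    {
        for (j = 0; j < 3; j++)
            r.m[i][j] = in->m[j][i];      /* transpose the rotation */
        r.m[i][3] = 0.0f;
    }
    for (j = 0; j < 3; j++)
        r.m[3][j] = -(in->m[3][0] * in->m[j][0]
                    + in->m[3][1] * in->m[j][1]
                    + in->m[3][2] * in->m[j][2]);
    r.m[3][3] = 1.0f;
    *out = r;                             /* temp copy allows out == in */
}

/* out = a * b, used here only to check the result. */
static void xform4_mul(struct xform4 *out, const struct xform4 *a,
                       const struct xform4 *b)
{
    struct xform4 r;
    int i, j, k;

    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
        {
            r.m[i][j] = 0.0f;
            for (k = 0; k < 4; k++)
                r.m[i][j] += a->m[i][k] * b->m[k][j];
        }
    *out = r;
}

/* Returns 1 if every element is within eps of the identity matrix. */
static int xform4_is_identity(const struct xform4 *a, float eps)
{
    int i, j;

    for (i = 0; i < 4; i++)
        for (j = 0; j < 4; j++)
        {
            float d = a->m[i][j] - (i == j ? 1.0f : 0.0f);
            if (d < 0.0f) d = -d;
            if (d > eps) return 0;
        }
    return 1;
}
```

Multiplying a rigid matrix by the result of `invert_rigid` and passing the product to `xform4_is_identity` is a quick sanity check. For matrices with scaling or projection this shortcut does not apply and the general inverse is still needed.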


------------------------------------------------------------------------
*From:* Henri Verbeet <hverb...@gmail.com>
*To:* Rico Schüller <kgbric...@web.de>
*Cc:* wine-devel@winehq.org; Nozomi Kodama <nozomi.kod...@yahoo.com>
*Sent:* Monday, 25 February 2013, 0:08
*Subject:* Re: d3dx9: Avoid expensive computations

On 25 February 2013 10:24, Rico Schüller <kgbric...@web.de> wrote:
 > I did some small tests for speed with the following results. You may also
 > avoid such a lot of variable assignments like *pout = out and you may use 4
 > vecs instead. This should save ~48 assignments and it should also improve
 > the speed a bit more (~10%). Though, native is still 40% faster than that.
 >
I'd somewhat expect native to use SSE versions of this kind of thing
when the CPU supports those instructions. You also generally want to
pay attention to the order in which you access memory, although
perhaps it doesn't matter so much here because an entire matrix should
be able to fit in a single cacheline, provided it's properly aligned.
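
The cacheline remark can be made concrete: a 4x4 float matrix is 64 bytes, which matches a 64-byte cache line only if the matrix starts on a 64-byte boundary. The sketch below assumes a 64-byte line size (common on x86, but not guaranteed) and C11 `alignas`; `struct aligned_mat4` and `fits_in_one_cacheline` are hypothetical names, not part of d3dx9.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 4x4 float matrix forced onto a 64-byte boundary.
 * The 64-byte cache line size is an assumption, not a guarantee. */
struct aligned_mat4
{
    alignas(64) float m[4][4];
};

/* Returns 1 if the object [p, p + size) lies within a single cache line
 * of the given line size, 0 if it straddles a line boundary. */
static int fits_in_one_cacheline(const void *p, size_t size, size_t line)
{
    uintptr_t first = (uintptr_t)p / line;
    uintptr_t last = ((uintptr_t)p + size - 1) / line;

    return first == last;
}
```

A matrix that is only 4-byte or 16-byte aligned can straddle two lines, in which case the access order starts to matter more than it does in the aligned case.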
