On 29 June 2013 18:57, Jonathan Dunlap <jad...@gmail.com> wrote:
> I've updated the project with your suggestions at
> http://dpaste.dzfl.pl/fce2d93b but still get the same performance. Vectors
> defined in the benchmark function body, no function calling overhead, etc.
> See some of my comments below btw:
>
>> First of all, calcSIMD and calcScalar are virtual functions so they can't
>> be inlined, which prevents any further optimization.
>
> For the dlang docs: Member functions which are private or package are never
> virtual, and hence cannot be overridden.
>
>> So my guess is that the first four multiplications and the second four
>> multiplications in calcScalar are done in parallel. ... The reason it's
>> faster is that gdc replaces multiplication by 2 with addition and omits
>> multiplication by 1.
>
> I've changed the multiplies of 2 and 1 to 2.1 and 1.01 respectively. Still
> no performance difference between the two for me.

s/class/struct/

-- 
Iain Buclaw

*(p < e ? p++ : p) = (c & 0x0f) + '0';