Re: GDC vs dmd speed
On 14/10/2013 22:22, Walter Bright wrote:
> On 10/14/2013 12:24 PM, Spacen Jasset wrote:
>> dmd32 v2.063.2
>> with flags: ["-O", "-release", "-noboundscheck", "-inline"]
>> gdc 4.6 (0.29.1-4.6.4-1ubuntu4) Which I assume might be v2.020?
>> with flags: ["-O2"]
>
> dmd uses the x87 for 32 bit code for floating point, while gdc uses the SIMD instructions, which are faster. For 64 bit code, dmd uses SIMD instructions too.

Thanks Walter. I shall find a 64 bit system at some point to compare.
Re: GDC vs dmd speed
On 14/10/2013 22:06, bearophile wrote:
> Spacen Jasset:
>
>> const float pi = 3.14159265f;
>> float dx = cast(float)(Clock.currSystemTick.length % (TickDuration.ticksPerSec * 10)) / (TickDuration.ticksPerSec * 10);
>> float xRot = sin(dx * pi * 2) * 0.4f + pi / 2;
>> float yRot = cos(dx * pi * 2) * 0.4f;
>> float yCos = cos(yRot);
>> float ySin = sin(yRot);
>> float xCos = cos(xRot);
>> float xSin = sin(xRot);
>> float ox = 32.5f + dx * 64;
>> float oy = 32.5f;
>> float oz = 32.5f;
>> for (int x = 0; x < width; ++x) {
>>     float ___xd = cast(float)(x - width / 2) / height;
>>     for (int y = 0; y < height; ++y) {
>>         float __yd = cast(float)(y - height / 2) / height;
>>         float __zd = 1;
>
> The performance difference between the DMD and GDC compile is kind of expected for FP-heavy code. Also try the new LDC2 compiler (ldmd2 for the same compilation switches) that sometimes is better than GDC.
>
> More comments:
> - There is a PI in std.math (but it's not a float);
> - Add immutable/const to every variable that doesn't need to change. This is a good habit, like washing your hands before eating;
> - "for (int x = 0; x < width; ++x)" ==> "foreach (immutable x; 0 .. width)";
> - I suggest avoiding many leading/trailing underscores in identifier names;
> - It's probably worth replacing all those "float" with another name, like "FP", and then defining "alias FP = float;" at the beginning. That way you can see how much performance you lose/gain using floats/doubles. In many cases in my code there is no difference, but floats are less precise. Floats can be useful when you have many of them, in a struct or array. Floats can also be useful when you call certain numerical functions that compute their result by approximation, but on some CPUs sin/cos are not among those functions.
>
> Bye,
> bearophile

Thank you. I may take up some of those suggestions. It was a direct port of some C++, hence the style.
Re: GDC vs dmd speed
On Monday, 14 October 2013 at 19:24:27 UTC, Spacen Jasset wrote:
> gdc 4.6 (0.29.1-4.6.4-1ubuntu4) Which I assume might be v2.020?
> with flags: ["-O2"]

That's a really old gdc. If you can, upgrade to ubuntu 13.10 and you'll get a more up-to-date version.

Alternatively, build from source: http://gdcproject.org/wiki/Installation/General

It'll take an age to run the compilation, but it's not hard to do.
Re: GDC vs dmd speed
On Monday, 14 October 2013 at 19:24:27 UTC, Spacen Jasset wrote:
> Hello,
>
> Whilst porting some C++ code I have discovered that the compiled output from the gdc compiler seems to be 47% quicker than the dmd compiler.

Here are a few more data points from microbenchmarks of simple functions (Project Euler). They support the observation that the fastest code is produced by LDC, then GDC, with DMD the slowest (disclaimer: my microbenchmark is not a guarantee of your code's performance, etc.).

Tested on Xubuntu 13.04 64-bit, Core i5 3450S 2.8GHz.

Test 1:

    // 454ns  LDC 0.11.0: ldmd2 -m64 -O -noboundscheck -inline -release
    // 830ns  GDC 4.8.1: gdc -m64 -march=native -fno-bounds-check -frename-registers -frelease -O3
    // 1115ns DMD64 2.063.2: dmd -O -noboundscheck -inline -release
    int e28_0(int N = 1002) {
        int diagNumber = 1;
        int sum = diagNumber;
        for (int width = 2; width < N; width += 2)
            for (int j = 0; j < 4; ++j) {
                diagNumber += width;
                sum += diagNumber;
            }
        return sum;
    }

Test 2:

    // 118ms LDC 0.11.0: ldmd2 -m64 -O -noboundscheck -inline -release
    // 125ms GDC 4.8.1: gdc -m64 -march=native -fno-bounds-check -frename-registers -frelease -O3
    // 161ms DMD64 2.063.2: dmd -O -noboundscheck -inline -release
    bool isPalindrome(string s) { return equal(s, s.retro); }

    int e4(int N = 1000) {
        int nMax = 0;
        foreach (uint i; 1..N)
            foreach (uint j; i..N)
                if (isPalindrome(to!string(i*j)) && i*j > nMax)
                    nMax = i*j;
        return nMax;
    }

Test 3:

    // 585us LDC 0.11.0: ldmd2 -m64 -O -noboundscheck -inline -release
    // 667us GDC 4.8.1: gdc -m64 -march=native -fno-bounds-check -frename-registers -frelease -O3
    // 853us DMD64 2.063.2: dmd -O -noboundscheck -inline -release
    int e67_0(string fileName = r"C:\Euler\data\e67.txt") {
        // Read triangle numbers from file.
        int[][] cell;
        foreach (line; splitLines(cast(char[]) read(fileName))) {
            int[] row;
            foreach (token; std.array.splitter(line))
                row ~= [to!int(token)];
            cell ~= row;
        }
        // Compute maximum value partial paths ending at each cell.
        foreach (y; 1..cell.length) {
            cell[y][0] += cell[y-1][0];
            foreach (x; 1..y)
                cell[y][x] += max(cell[y-1][x-1], cell[y-1][x]);
            cell[y][y] += cell[y-1][y-1];
        }
        // Return the maximum value terminal path.
        return cell[$-1].reduce!max;
    }

Here is the code speed relative to LDC, averaged over these three tests (a larger number is slower):

    LDC 1.00
    GDC 1.34
    DMD 1.76
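For readers who want to check the summary figures: they are consistent with taking, for each test, the compiler's time divided by the LDC time, and then the arithmetic mean of the three ratios. That averaging method is an inference from the numbers, not something the post states. A minimal sketch, with `meanRatio` as a hypothetical helper name:

```d
import std.stdio : writefln;

/// Arithmetic mean of the per-test ratios times[i] / baseline[i].
/// (Hypothetical helper; the averaging method is an assumption.)
double meanRatio(const double[] times, const double[] baseline)
{
    assert(times.length == baseline.length && times.length > 0);
    double total = 0;
    foreach (i; 0 .. times.length)
        total += times[i] / baseline[i];
    return total / times.length;
}

void main()
{
    // Timings as reported above: test 1 in ns, test 2 in ms, test 3 in us.
    // The units cancel because each ratio is taken within one test.
    immutable double[] ldc = [454, 118, 585];
    immutable double[] gdc = [830, 125, 667];
    immutable double[] dmd = [1115, 161, 853];

    writefln("LDC %.2f", meanRatio(ldc, ldc));
    writefln("GDC %.2f", meanRatio(gdc, ldc));
    writefln("DMD %.2f", meanRatio(dmd, ldc));
}
```

Note that the first test dominates the average: its GDC and DMD ratios are far larger than those of the other two tests, so a geometric mean would give smaller numbers.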
Re: GDC vs dmd speed
On 10/14/2013 12:24 PM, Spacen Jasset wrote:
> dmd32 v2.063.2
> with flags: ["-O", "-release", "-noboundscheck", "-inline"]
> gdc 4.6 (0.29.1-4.6.4-1ubuntu4) Which I assume might be v2.020?
> with flags: ["-O2"]

dmd uses the x87 for 32 bit code for floating point, while gdc uses the SIMD instructions, which are faster. For 64 bit code, dmd uses SIMD instructions too.
Re: GDC vs dmd speed
Spacen Jasset:

> const float pi = 3.14159265f;
> float dx = cast(float)(Clock.currSystemTick.length % (TickDuration.ticksPerSec * 10)) / (TickDuration.ticksPerSec * 10);
> float xRot = sin(dx * pi * 2) * 0.4f + pi / 2;
> float yRot = cos(dx * pi * 2) * 0.4f;
> float yCos = cos(yRot);
> float ySin = sin(yRot);
> float xCos = cos(xRot);
> float xSin = sin(xRot);
> float ox = 32.5f + dx * 64;
> float oy = 32.5f;
> float oz = 32.5f;
> for (int x = 0; x < width; ++x) {
>     float ___xd = cast(float)(x - width / 2) / height;
>     for (int y = 0; y < height; ++y) {
>         float __yd = cast(float)(y - height / 2) / height;
>         float __zd = 1;

The performance difference between the DMD and GDC compile is kind of expected for FP-heavy code. Also try the new LDC2 compiler (ldmd2 for the same compilation switches) that sometimes is better than GDC.

More comments:
- There is a PI in std.math (but it's not a float);
- Add immutable/const to every variable that doesn't need to change. This is a good habit, like washing your hands before eating;
- "for (int x = 0; x < width; ++x)" ==> "foreach (immutable x; 0 .. width)";
- I suggest avoiding many leading/trailing underscores in identifier names;
- It's probably worth replacing all those "float" with another name, like "FP", and then defining "alias FP = float;" at the beginning. That way you can see how much performance you lose/gain using floats/doubles. In many cases in my code there is no difference, but floats are less precise. Floats can be useful when you have many of them, in a struct or array. Floats can also be useful when you call certain numerical functions that compute their result by approximation, but on some CPUs sin/cos are not among those functions.

Bye,
bearophile
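The suggestions above might look roughly like this when applied together. This is only a sketch: `accumulate` and its loop body are hypothetical stand-ins for the original per-pixel code, which is not shown in full in the thread, and `dx` is passed in rather than read from the clock so the function is deterministic.

```d
import std.math : PI, cos, sin;
import std.stdio : writefln;

// Change this one alias to `double` to compare speed and precision.
alias FP = float;

/// Hypothetical stand-in for the rendering loop in the quoted snippet.
FP accumulate(int width, int height, FP dx)
{
    // std.math.PI is a `real`, so it is narrowed to FP explicitly.
    immutable FP pi = cast(FP) PI;
    immutable FP xRot = sin(dx * pi * 2) * 0.4f + pi / 2;
    immutable FP yRot = cos(dx * pi * 2) * 0.4f;
    immutable FP xCos = cos(xRot);
    immutable FP ySin = sin(yRot);

    FP total = 0;
    foreach (immutable x; 0 .. width)
    {
        immutable FP xd = cast(FP)(x - width / 2) / height;
        foreach (immutable y; 0 .. height)
        {
            immutable FP yd = cast(FP)(y - height / 2) / height;
            total += xd * xCos + yd * ySin; // placeholder for the real work
        }
    }
    return total;
}

void main()
{
    writefln("%s result: %s", FP.stringof, accumulate(64, 64, 0.25f));
}
```

Because every floating-point declaration goes through `FP`, timing the float and double variants only requires editing the alias and recompiling.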
Re: GDC vs dmd speed
> Although I could provide a complete test if anyone is interested. Is this an expected result and/or is there something I could change to make the compilers perform similarly.

Maybe you could do something to make the code compiled with DMD perform better, but it is not unusual for GDC to produce significantly faster code than DMD.