Re: From a C++/JS benchmark

2011-08-08 Thread Eric Poggel (JoeCoder)
On 8/8/2011 3:02 PM, bearophile wrote: Eric Poggel (JoeCoder): determinism can be very important when it comes to reducing network traffic. If you can achieve it, then you can make sure all players have the same game state and then only send user input commands over the network. It seems a h

Re: From a C++/JS benchmark

2011-08-08 Thread bearophile
Eric Poggel (JoeCoder): > determinism can be very important when it comes to > reducing network traffic. If you can achieve it, then you can make sure > all players have the same game state and then only send user input > commands over the network. It seems a hard thing to obtain, but I agree
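
The lockstep idea quoted above (identical state on every machine, only user input on the wire) can be sketched roughly as follows. This is a minimal illustration, not code from the thread: `Input`, `GameState`, `step`, `replay` and `check_in_sync` are hypothetical names, and the update uses only integer arithmetic so that no floating-point rounding is involved.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-tick player input; this is all that would cross the network.
enum class Input : std::uint8_t { None, Left, Right };

// Hypothetical game state, kept in integers to sidestep FP rounding differences.
struct GameState {
    std::int32_t x = 0;      // player position
    std::uint32_t tick = 0;  // number of simulation steps applied so far
};

// Deterministic update: the new state depends only on the old state and the input.
void step(GameState& s, Input in) {
    if (in == Input::Left) {
        s.x -= 1;
    } else if (in == Input::Right) {
        s.x += 1;
    }
    s.tick += 1;
}

// Each peer rebuilds the state locally from the same ordered input stream.
GameState replay(const std::vector<Input>& inputs) {
    GameState s;
    for (Input in : inputs) {
        step(s, in);
    }
    return s;
}

bool same_state(const GameState& a, const GameState& b) {
    return a.x == b.x && a.tick == b.tick;
}

// Desync check a peer might run when comparing its state against another peer's.
void check_in_sync(const GameState& local, const GameState& remote) {
    assert(same_state(local, remote));
}
```

Two peers that call `replay` on the same input sequence end up with equal states. Once `step` uses floating point, that guarantee additionally depends on every build rounding identically, which is where switches such as -ffast-math become a concern.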

Re: From a C++/JS benchmark

2011-08-07 Thread Eric Poggel (JoeCoder)
On 8/6/2011 8:34 PM, bearophile wrote: Walter: On 8/6/2011 4:46 PM, bearophile wrote: Walter is not a lover of that -ffast-math switch. No, I am not. Few understand the subtleties of IEEE arithmetic, and breaking IEEE conformance is something very, very few should even consider. I have rea

Re: From a C++/JS benchmark

2011-08-07 Thread Trass3r
Anyways, I've tweaked the GDC codegen, and program speed meets that of C++ now (on my system). Implementation: http://ideone.com/0j0L1 Command-line: gdc -O3 -mfpmath=sse -ffast-math -march=native -frelease g++ bench.cc -O3 -mfpmath=sse -ffast-math -march=native Best times: G++-32bit: 114

Re: From a C++/JS benchmark

2011-08-06 Thread bearophile
Walter: > On 8/6/2011 4:46 PM, bearophile wrote: > > Walter is not a lover of that -ffast-math switch. > > No, I am not. Few understand the subtleties of IEEE arithmetic, and breaking > IEEE conformance is something very, very few should even consider. I have read several papers about FP arithm

Re: From a C++/JS benchmark

2011-08-06 Thread Walter Bright
On 8/6/2011 4:46 PM, bearophile wrote: Walter is not a lover of that -ffast-math switch. No, I am not. Few understand the subtleties of IEEE arithmetic, and breaking IEEE conformance is something very, very few should even consider.
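
One of the subtleties in question: IEEE addition is not associative, while -ffast-math allows the compiler to reassociate sums. A minimal sketch, with illustrative values that are not taken from the benchmark, and assuming a build without -ffast-math:

```cpp
#include <cassert>

// Adds left to right: (a + b) + c.
double sum_left(double a, double b, double c) { return (a + b) + c; }

// Adds right to left: a + (b + c).
double sum_right(double a, double b, double c) { return a + (b + c); }

// Illustrative values only: 1.0 is far below the rounding step of 1e20,
// so it is lost when it is added to -1e20 first.
void demo_reassociation() {
    // volatile keeps the operands from being folded away at compile time.
    volatile double a = 1e20, b = -1e20, c = 1.0;
    assert(sum_left(a, b, c) == 1.0);   // (1e20 - 1e20) + 1 is exactly 1
    assert(sum_right(a, b, c) == 0.0);  // 1e20 + (-1e20 + 1) rounds to 0
}
```

Under strict IEEE semantics the two functions return different results for the same operands. With reassociation enabled the compiler may pick either grouping, so the result can change between compilers and optimization levels.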

Re: From a C++/JS benchmark

2011-08-06 Thread bearophile
Iain Buclaw: > Anyways, I've tweaked the GDC codegen, and program speed meets that of C++ > now (on > my system). Are you willing to explain your changes (and maybe give a link to the changes)? Maybe Walter is interested for DMD too. > Command-line: > gdc -O3 -mfpmath=sse -ffast-math -march=n

Re: From a C++/JS benchmark

2011-08-06 Thread bearophile
Walter: > A dynamic array is two values being passed, a pointer is one. I know, but I think there are many optimization opportunities. An example: private void foo(int[] a2) {} void main() { int[100] a1; foo(a1); } In code like that I think a D compiler is free to compile like this, b

Re: From a C++/JS benchmark

2011-08-06 Thread Iain Buclaw
== Quote from bearophile (bearophileh...@lycos.com)'s article > Iain Buclaw: > > 1) using pointers over dynamic arrays. (5% speedup) > > 2) removing the calls to CalVector4's constructor (5.7% speedup) > With DMD I have seen 180k -> 190k vertices/sec replacing this: > struct CalVector4 { > floa

Re: From a C++/JS benchmark

2011-08-06 Thread Walter Bright
On 8/6/2011 3:19 PM, bearophile wrote: I don't know why passing pointers gives some more performance here, compared to passing dynamic arrays (but I have seen the same behaviour in other D programs of mine). A dynamic array is two values being passed, a pointer is one.
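
A rough C++ picture of that difference, as a sketch only: `IntSlice` is a hypothetical stand-in modeled on a D dynamic array (a length plus a pointer), and the two functions are illustrative names, not code from the benchmark.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a D dynamic array: a length and a pointer.
struct IntSlice {
    std::size_t length;
    int* ptr;
};

// The slice can never be smaller than its two members put together.
static_assert(sizeof(IntSlice) >= sizeof(std::size_t) + sizeof(int*),
              "a slice carries a length and a pointer");

// Passing the slice by value hands the callee two values.
int first_via_slice(IntSlice s) {
    assert(s.length > 0);  // the length is available for a bounds check
    return s.ptr[0];
}

// Passing a raw pointer hands the callee one value, and no length to check.
int first_via_pointer(const int* p) {
    return p[0];
}
```

The pointer version has one argument fewer to pass and keep in registers, which is consistent with the small speedup reported in the thread, at the cost of losing the length that bounds checks rely on.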

Re: From a C++/JS benchmark

2011-08-06 Thread bearophile
Iain Buclaw: > 1) using pointers over dynamic arrays. (5% speedup) > 2) removing the calls to CalVector4's constructor (5.7% speedup) With DMD I have seen 180k -> 190k vertices/sec replacing this: struct CalVector4 { float X, Y, Z, W; this(float x, float y, float z, float w = 0.0f) {

Re: From a C++/JS benchmark

2011-08-06 Thread Iain Buclaw
== Quote from bearophile (bearophileh...@lycos.com)'s article > Iain Buclaw: > Are you using GDC2-64 bit on Linux? GDC2-32 bit on Linux. > > Three things that helped improve performance in a minor way for me: > > 1) using pointers over dynamic arrays. (5% speedup) > > 2) removing the calls to Ca

Re: From a C++/JS benchmark

2011-08-06 Thread bearophile
Iain Buclaw: Are you using GDC2-64 bit on Linux? > Three things that helped improve performance in a minor way for me: > 1) using pointers over dynamic arrays. (5% speedup) > 2) removing the calls to CalVector4's constructor (5.7% speedup) > 3) using core.stdc.time over std.datetime. (1.6% speedu

Re: From a C++/JS benchmark

2011-08-06 Thread Iain Buclaw
== Quote from bearophile (bearophileh...@lycos.com)'s article > Iain Buclaw: > > I will look into this later from my workstation. > The remaining thing to look at is just the small performance difference > between the D-GDC version and the C++-G++ version. > Bye, > bearophile Three things that he

Re: From a C++/JS benchmark

2011-08-06 Thread bearophile
Iain Buclaw: > I will look into this later from my workstation. The remaining thing to look at is just the small performance difference between the D-GDC version and the C++-G++ version. Bye, bearophile

Re: From a C++/JS benchmark

2011-08-06 Thread Iain Buclaw
== Quote from bearophile (bearophileh...@lycos.com)'s article > Trass3r: > > C++ no SIMD: > > Skinned vertices per second: 4242 > > > ... > > D gdc: > > Skinned vertices per second: 2345 > Are you able and willing to show me the asm produced by gdc? There's a problem there. > Bye, > bearoph

Re: From a C++/JS benchmark

2011-08-05 Thread Trass3r
I'd like to know why the GCC back-end is able to produce a more efficient binary from the C++ code (compared to the D code), but now the problem is not large, as before. I attached both asm versions ;) [Attachments: cppver.s, dver.s]

Re: From a C++/JS benchmark

2011-08-05 Thread bearophile
Trass3r: > >> C++ no SIMD: > >> Skinned vertices per second: 4242 >... > D gdc with added -frelease -fno-bounds-check: > Skinned vertices per second: 3771 I'd like to know why the GCC back-end is able to produce a more efficient binary from the C++ code (compared to the D code), but now

Re: From a C++/JS benchmark

2011-08-05 Thread Trass3r
On 04.08.2011 at 04:07, Trass3r wrote: C++: Skinned vertices per second: 4866 C++ no SIMD: Skinned vertices per second: 4242 D dmd: Skinned vertices per second: 159046 D gdc: Skinned vertices per second: 2345 D ldc: Skinned vertices per second: 3791 ldc2 -O3 -release

Re: From a C++/JS benchmark

2011-08-05 Thread Trass3r
If you want to go on with this exploration, then I suggest you to find a way to disable bound tests. Ok, now I get up to 3293 skinned vertices per second. Still a bit worse than LDC.

Re: From a C++/JS benchmark

2011-08-05 Thread bearophile
Trass3r: > > are you willing and able to show me the asm before it gets assembled? > > (with gcc you do it with the -S switch). (I also suggest to use only the > > C standard library, with time() and printf() to produce a smaller asm > > output: http://codepad.org/12EUo16J ). You are a pers

Re: From a C++/JS benchmark

2011-08-05 Thread Don
Adam Ruppe wrote: But what's the purpose of those callq? They seem to call the successive asm instruct I find AT&T syntax to be almost impossible to read, but it looks like they are comparing the instruction pointer for some reason. call works by pushing the instruction pointer on the stack, t

Re: From a C++/JS benchmark

2011-08-04 Thread Adam Ruppe
> But what's the purpose of those callq? They seem to call the > successive asm instruct I find AT&T syntax to be almost impossible to read, but it looks like they are comparing the instruction pointer for some reason. call works by pushing the instruction pointer on the stack, then jumping to th

Re: From a C++/JS benchmark

2011-08-04 Thread bearophile
> Trass3r: >> are you able and willing to show me the asm produced by gdc? There's a >> problem there. > [attach bla.rar] In the bla.rar attachment there's the unstripped Linux binary, so to read the asm I have used the objdump disassembler. But are you willing and able to show me the asm before it

Re: From a C++/JS benchmark

2011-08-04 Thread Trass3r
Are you able and willing to show me the asm produced by gdc? There's a problem there. [Attachment: bla.rar]

Re: From a C++/JS benchmark

2011-08-04 Thread Adam Ruppe
Marco Leise wrote: > I thought he was referring to the processor being able to handle > 64-bit ints more efficiently in 64-bit operation mode on a 64-bit OS > with 64-bit executables. I was thinking a little of both but this is the main thing. My suspicion was that Java might have been using a 64

Re: From a C++/JS benchmark

2011-08-03 Thread Marco Leise
On 03.08.2011 at 21:52, David Nadlinger wrote: On 8/3/11 9:48 PM, Adam D. Ruppe wrote: System: Windows XP, Core 2 Duo E6850 Is this Windows XP 32 bit or 64 bit? That will probably make a difference on the longs I'd expect. It doesn't, long is 32-bit wide on Windows x86_64 too (LLP64).

Re: From a C++/JS benchmark

2011-08-03 Thread bearophile
Trass3r: > C++ no SIMD: > Skinned vertices per second: 4242 > ... > D gdc: > Skinned vertices per second: 2345 Are you able and willing to show me the asm produced by gdc? There's a problem there. Bye, bearophile

Re: From a C++/JS benchmark

2011-08-03 Thread Trass3r
C++: Skinned vertices per second: 4866 C++ no SIMD: Skinned vertices per second: 4242 D dmd: Skinned vertices per second: 159046 D gdc: Skinned vertices per second: 2345 D ldc: Skinned vertices per second: 3791 ldc2 -O3 -release -enable-inlining dver.d

Re: From a C++/JS benchmark

2011-08-03 Thread Trass3r
C++: Skinned vertices per second: 4866 C++ no SIMD: Skinned vertices per second: 4242 D dmd: Skinned vertices per second: 159046 D gdc: Skinned vertices per second: 2345 Compilers: gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) g++ -s -O3 -mfpmath=sse -ffast-math -march=nativ

Re: From a C++/JS benchmark

2011-08-03 Thread bearophile
Trass3r: > I'm afraid not. dmd's backend isn't good at floating point calculations. Studying the asm a bit, it's not hard to find the cause, because this benchmark is quite pure (synthetic, although I think it comes from real-world code). This is what G++ generates from the C++ code without intri

Re: From a C++/JS benchmark

2011-08-03 Thread Trass3r
Looks like a spiteful joke... In other words: WTF?! JavaScript is about 10 times faster than D in floating point calculations!? Please, tell me that I'm mistaken. I'm afraid not. dmd's backend isn't good at floating point calculations.

Re: From a C++/JS benchmark

2011-08-03 Thread bearophile
Denis Shelomovskij: > (tests from bearophile's message, C++ test is "skinning_test_no_simd.cpp"). For a more realistic test I suggest you to time the C++ version that uses the intrinsics too (only for float). > Looks like a spiteful joke... In other words: WTF?! JavaScript is about > 10 times

Re: From a C++/JS benchmark

2011-08-03 Thread Denis Shelomovskij
03.08.2011 22:48, Adam D. Ruppe wrote: System: Windows XP, Core 2 Duo E6850 Is this Windows XP 32 bit or 64 bit? That will probably make a difference on the longs I'd expect. I meant Windows XP 32 bit (5.1 (Build 2600: Service Pack 3)) (according to what is "Windows XP" in wikipedia)

Re: From a C++/JS benchmark

2011-08-03 Thread Adam D. Ruppe
> System: Windows XP, Core 2 Duo E6850 Is this Windows XP 32 bit or 64 bit? That will probably make a difference on the longs I'd expect.

Re: From a C++/JS benchmark

2011-08-03 Thread David Nadlinger
On 8/3/11 9:48 PM, Adam D. Ruppe wrote: System: Windows XP, Core 2 Duo E6850 Is this Windows XP 32 bit or 64 bit? That will probably make a difference on the longs I'd expect. It doesn't, long is 32-bit wide on Windows x86_64 too (LLP64). David
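
The data-model point can be checked directly. A small sketch (the helper names are illustrative): `long` is 32 bits wide under LLP64 (Windows, including x86_64) and 64 bits under LP64 (typical 64-bit Linux), whereas `long long` is at least 64 bits everywhere and `std::int64_t` is exactly 64 bits where it exists.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Width in bits of the platform's `long`: 32 under LLP64, 64 under LP64.
constexpr int long_bits() { return static_cast<int>(sizeof(long) * CHAR_BIT); }

// Width in bits of `long long`: the standard guarantees at least 64.
constexpr int long_long_bits() { return static_cast<int>(sizeof(long long) * CHAR_BIT); }

// Width in bits of the fixed-width type, for a like-for-like comparison
// with the 64-bit `long` of Java, C# and D.
constexpr int int64_bits() { return static_cast<int>(sizeof(std::int64_t) * CHAR_BIT); }

static_assert(long_long_bits() >= 64, "long long is at least 64 bits");
static_assert(int64_bits() == 64, "std::int64_t is exactly 64 bits");

// Runtime sanity check of the only guarantee the standard gives for `long`.
void check_long_width() {
    assert(long_bits() >= 32);
}
```

For the benchmark this means a C++ `long` column measured on Windows is really a second 32-bit run; using `long long` or `std::int64_t` gives the 64-bit comparison.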

Re: From a C++/JS benchmark

2011-08-03 Thread Denis Shelomovskij
03.08.2011 22:15, Ziad Hatahet: I believe that "long" in this case is 32 bits in C++, and 64-bits in the remaining languages, hence the same result for int and long in C++. Try with "long long" maybe? :) -- Ziad 2011/8/3 Denis Shelomovskij: 03.08.2011

Re: From a C++/JS benchmark

2011-08-03 Thread Ziad Hatahet
I believe that "long" in this case is 32 bits in C++, and 64-bits in the remaining languages, hence the same result for int and long in C++. Try with "long long" maybe? :) -- Ziad 2011/8/3 Denis Shelomovskij > 03.08.2011 18:20, bearophile: > > The benchmark info: >> http://chadaustin.me/2011

Re: From a C++/JS benchmark

2011-08-03 Thread Denis Shelomovskij
03.08.2011 18:20, bearophile: The benchmark info: http://chadaustin.me/2011/01/digging-into-javascript-performance/ The code, in C++, JS, Java, C#: https://github.com/chadaustin/Web-Benchmarks/ The C++/JS/Java code runs on a single core. D2 version translated from the C# version (the C++ versio

From a C++/JS benchmark

2011-08-03 Thread bearophile
The benchmark info: http://chadaustin.me/2011/01/digging-into-javascript-performance/ The code, in C++, JS, Java, C#: https://github.com/chadaustin/Web-Benchmarks/ The C++/JS/Java code runs on a single core. D2 version translated from the C# version (the C++ version uses struct inheritance!): ht