Re: Estimation of π using Leibniz series
> Are you on Mac OS X by chance? We're calling c stdlib function for pow and
> they may be significantly faster.

Yes, but that wouldn't explain the faster Julia code (and if it did, it wouldn't explain the difference between Julia and Nim that others are observing).
Re: Estimation of π using Leibniz series
Indeed, this could be a library issue: I get ~4.5x faster results if I link against musl instead of glibc.
Re: Estimation of π using Leibniz series
Maybe you have some special Nim settings in your nim.cfg, Jehan? For me it's ~3.5 seconds on an i7 6700k with both gcc-5 and clang. Or are you using an older Nim release? I'm running the current devel branch.

**Are you on Mac OS X by chance? We're calling the C stdlib function for pow, and it may be significantly faster there. Optimizing pow(-1.0, ...) seems pretty reasonable.**

Compiling for x86 instead of x86-64 also makes this twice as slow here.
Re: Estimation of π using Leibniz series
What baffles me is how slow this seems to be for everyone. I get about 0.7 seconds (plus or minus some random noise) for both Nim and Julia, with clang, gcc-5, and gcc-7 alike. Julia is a tick slower (around 0.78 seconds for Julia vs. 0.72 seconds for Nim), but nothing that breaks even the one-second barrier. That's on a 2.5 GHz Core i7 in a mid-2014 Mac, so hardly a particularly powerful system. This is with the code copied and pasted from the first post in this thread, with no modifications applied.
Re: Estimation of π using Leibniz series
@zolern: Ah, OK, then I didn't understand you correctly that time.
Re: Estimation of π using Leibniz series
@LeuGim: I mean that LLVM/clang's pow is much faster than gcc's pow, not just in this particular case of pow(-1, n) but in general.
Re: Estimation of π using Leibniz series
This is no longer Nim's problem, but just for fun (short of importing intrinsics):

```nim
{.passc:"-march=native".}
import times, math

proc leibniz0(terms: int): float =
  var res = 0.0
  for n in 0..terms:
    res += pow(-1,float(n))/(2*float(n)+1)
  4*res

proc leibniz1(terms: int): float =
  var res = 0.0
  for n in 0..terms:
    if (n and 1) == 0:
      res += 1/(2*float(n)+1)
    else:
      res += -1/(2*float(n)+1)
  4*res

proc leibniz2[N:static[int]](terms: int): float =
  const
    L = 1 shl N
    L2 = L shl 1
    T = 10
  var t: array[L*T, float]
  let r = (terms shr N) div T
  if (terms mod (L*T)) != 0: quit 1
  var res = 1/float(2*terms+1)
  for n in 0..
```
Re: Estimation of π using Leibniz series
@zolern: My point was exactly that the library function should NOT be considered badly implemented for not optimizing this case. Such optimizations come at a cost: additional runtime checks, and at the very least the effort and time their developers spend implementing them. Not every possible optimization can be done, so library and compiler developers have to choose the more probable and more sane cases among all possible ones (say, -2 could also be optimized to instantly return 4, and -2.5 to instantly return -6.25, ..., but there's an infinity of numbers). And this particular case (using POW for 1, -1, ...) is not one of those worth both the runtime checks and the library developers' effort. If the programmer cares about efficiency (and readability too!) even a little, why would they write it this way? So what is the point in optimizing it? GCC does better in this case.

Yet for such special cases (if the programmer just likes writing it this way), special just-in-time optimizations can be made, like

```nim
proc pow(x: static[float], y: float): float = (when x == -1.0: float([1,-1][y.int mod 2]) else: math.pow(x, y))
```

(faster still with a template), or with term rewriting, something like

```nim
template optPowMinusOne{pow(-1.0, x)}(x: float): float = float([1,-1][y.int mod 2])
```

(this didn't work for me though; maybe someone can point out what's wrong with it).
Re: Estimation of π using Leibniz series
@wiffel: It is true: the latest LLVM and Clang updates claim, among other things, 5x faster pow execution. Nim can do nothing about this.

@LeuGim: Yes, in this particular case POW is not the best choice, but it is somewhat worrying and unpleasant when your code depends on a library function that turns out to be unexpectedly badly implemented.
Re: Estimation of π using Leibniz series
Using exponentiation just to alternate 1, -1, 1, -1, ... is pointless (apart from mathematical formulas on paper) and should not be done, so it doesn't matter whether some compiler optimizes it.
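To illustrate the point about not needing exponentiation at all, here is a minimal C sketch that carries the sign in a variable and flips it on every iteration. The function name `leibniz_running_sign` is made up for this example, and the loop bound is inclusive to mirror Nim's `0..terms`; it is one possible way to write it, not code from the thread.

```c
#include <assert.h>

/* Leibniz partial sum: 4 * sum over n = 0..terms of (-1)^n / (2n + 1).
   The sign is kept in a variable and negated each iteration, so no call
   to pow is needed. The name is hypothetical, chosen for this sketch. */
double leibniz_running_sign(long terms)
{
    assert(terms >= 0);
    double res = 0.0;
    double sign = 1.0;                      /* (-1)^0 */
    for (long n = 0; n <= terms; ++n) {     /* inclusive, like Nim's 0..terms */
        res += sign / (2.0 * (double)n + 1.0);
        sign = -sign;                       /* the next term has the opposite sign */
    }
    return 4.0 * res;
}
```

Calling `leibniz_running_sign(100000000)` would correspond to the benchmark used in this thread; timings will of course depend on compiler and flags.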
Re: Estimation of π using Leibniz series
@zolern: I'm wondering too why the original Nim version is that slow.

Using the Windows version of Nim on my computer (i7-6650, 3.60GHz, Windows 10 Pro 64-bit) is already faster at running my version of the program (see below) than running it under _windows/bash/ubuntu_ (what I did before). The same is true for the _Julia_ version.

Using _clang_ instead of _gcc_ makes it almost 3x faster. Since _Julia_ is using _LLVM_, probably the _clang_ version uses the same (faster?) library functions. I'm not sure ...

**clang version on windows/mingw**:

```
nim c -r -d:release --cc:clang pi.nim
...
Elapsed time: 2.055
Pi: 3.141592663589326
```

**gcc version on windows/mingw**:

```
nim c -r -d:release pi.nim
...
Elapsed time: 5.909
Pi: 3.141592663589326
```

**pi.nim**:

```nim
import times, math

proc `/`(a, b: int): float = float(a) / float(b)

proc leibniz(terms: int): float =
  for i in 0 .. terms:
    result += ((-1)^i) / (2*i+1)
  result *= 4.0

let
  t0 = cpuTime()
  pi = leibniz(100_000_000)
  tt = cpuTime() - t0

echo("Elapsed time: ", tt)
echo("Pi: ", pi)
```
Re: Estimation of π using Leibniz series
I am still confused that Nim's pow is so unexpectedly slow: Nim just calls the C library function pow from `<math.h>`, wtf?

Anyway, the last edition (without pow) is just fast & furious. And Nim is awesome, no doubt!
Re: Estimation of π using Leibniz series
Our **lovely Nim** is outstanding!
Re: Estimation of π using Leibniz series
I am pretty sure that Julia's POW checks whether the first argument is -1 and optimizes that case with something like MOD. You can check it: I suppose that the Julia code modified to use MOD will take pretty much the same time as the code with POW.
Re: Estimation of π using Leibniz series
@zolern, thank you. Your code rocks. However, we are cheating Julia, because her code uses **POW** instead of **MOD**. I will modify and rerun the .jl script just to check the result.
Re: Estimation of π using Leibniz series
Well, my 10 cents:

```nim
import times, math

proc leibniz(terms: int): float =
  var res = 0.0
  for n in 0..terms:
    res = res + (if n mod 2 == 0: 1.0 else: -1.0) / float(2 * n + 1)
  return 4*res

let t0 = cpuTime()
echo(leibniz(100_000_000))
let t1 = cpuTime()
echo "Elapsed time: ", $(t1 - t0)
```

* With -d:release compile option: 0.381 seconds
* Without -d:release: 2.711 seconds

Original "pow" version:

* With -d:release: 7.253 seconds
* Without -d:release: 10.697 seconds
Re: Estimation of π using Leibniz series
Julia compiles with `-march=native` by default so try passing `--passC:"-march=native"` to Nim. I have this in my `~/.config/nim.cfg` along with `--passC:"-flto"` (for release mode).
Re: Estimation of π using Leibniz series
It seems that Julia's JIT is aware of the SSE extensions and uses them. GCC or MSVC should be emitting them too, but that depends on the C code.
Re: Estimation of π using Leibniz series
Thank you all.

@wiffel/Nibbler, in my Julia test, I was using Pro version 0.6.1 64 bit running on Windows 10.

Now, running the same .jl code on Windows 8 64 bit (i5 CPU 650 @ 3.20GHz, 3193 MHz, 2 Cores, 4 Processors), I got the following:

```
  2.763225 seconds (1.77 k allocations: 95.291 KiB)
Pi: 3.141592663589326
```

After JIT compilation:

```
  1.863196 seconds (1.69 k allocations: 88.809 KiB)
Pi: 3.141592663589326
```
Re: Estimation of π using Leibniz series
I compiled approximately the same version in C with gcc optimisations on, and found the execution time to be roughly comparable between Nim and C. Could Julia's JIT be doing some sort of optimisation that shortcuts the full code somehow?

With the Nim version:

```nim
import times, math

proc leibniz(terms: int): float =
  var res = 0.0
  for n in 0..terms:
    res += pow(-1.0,float(n))/(2.0*float(n)+1.0)
  return 4*res

let t0 = cpuTime()
echo(leibniz(100_000_000))
let t1 = cpuTime()
echo "Elapsed time: ", $(t1 - t0)
```

I got these output times:

```
C:\projects\Nim>nim_version
3.141592663589326
Elapsed time: 6.541

C:\projects\Nim>nim_version
3.141592663589326
Elapsed time: 6.676

C:\projects\Nim>nim_version
3.141592663589326
Elapsed time: 6.594
```

While with the same C version:

```c
#include <stdio.h>
#include <math.h>
#include <time.h>

double leibniz(int terms) {
    double res = 0.0;
    for (int i = 0; i < terms; ++i) {
        res += pow(-1.0, (double)i) / (2.0 * (double)i + 1.0);
    }
    return 4*res;
}

int main() {
    clock_t start = clock();
    double x = leibniz(100000000);
    printf("%.15f\n", x);
    printf("Time elapsed: %f\n", ((double)clock() - start) / CLOCKS_PER_SEC);
}
```

The times taken were (EDIT: used -Ofast instead and got faster times):

```
C:\projects\c>c_version
3.141592643589326
Time elapsed: 6.206000

C:\projects\c>c_version
3.141592643589326
Time elapsed: 6.204000

C:\projects\c>c_version
3.141592643589326
Time elapsed: 6.217000
```

I realise I actually got a slightly different decimal number with C, but to be honest I am not a C programmer, so I am sure I did something wrong in the formatting.
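One possible explanation for the differing digit, offered as an assumption rather than something confirmed in the thread: the C loop runs while `i < terms`, whereas Nim's `for n in 0..terms` includes `terms` itself, so the Nim version sums one extra term. The sketch below makes the two bounds explicit. All three function names are hypothetical, and a parity test stands in for `pow(-1.0, n)` to keep the example short.

```c
#include <assert.h>

/* Sum of the first `count` Leibniz terms (n = 0 .. count-1), times 4.
   Hypothetical helper for this sketch; the sign comes from the parity of n. */
double leibniz_first_terms(long count)
{
    assert(count >= 0);
    double res = 0.0;
    for (long n = 0; n < count; ++n) {
        double term = 1.0 / (2.0 * (double)n + 1.0);
        res += (n % 2 == 0) ? term : -term;
    }
    return 4.0 * res;
}

/* Exclusive upper bound, as in the C program above: n < terms. */
double leibniz_exclusive(long terms)
{
    return leibniz_first_terms(terms);
}

/* Inclusive upper bound, as in Nim's `for n in 0..terms`: one more term. */
double leibniz_inclusive(long terms)
{
    return leibniz_first_terms(terms + 1);
}
```

For an even `terms`, the exclusive sum ends on a negative term and lands below π, while the inclusive sum adds one more positive term and lands above it. With `terms = 100000000` the two would be expected to sit roughly 1e-8 below and above π respectively, which matches the pattern of the two printed values.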
Re: Estimation of π using Leibniz series
@alfrednewman I tried to replicate your test. On my computer, the following (slightly modified) version of the _nim_ program and the _julia_ program have the same runtime. Whatever I do, I fail to run the _julia_ version in less than 2 seconds (as you had). Are you sure that test went OK?

```nim
import times, math

proc `/`(a, b: int): float = float(a) / float(b)

proc leibniz(terms: int): float =
  for i in 0..terms:
    result += (-1)^i / (2*i+1)
  result *= 4.0

let
  t0 = cpuTime()
  pi = leibniz(100_000_000)
  tt = cpuTime() - t0

echo("Elapsed time: ", tt)
echo("Pi: ", pi)
```

gives

```
>> nim c -d:release pi.nim
>> time ./pi
Elapsed time: 5.671875
Pi: 3.141592663589326

real 0m5.706s
user 0m5.672s
sys  0m0.000s
```

and

```julia
function leibniz(terms)
    res = 0.0
    for i in 0:terms
        res += (-1.0)^i/(2.0*i+1.0)
    end
    return res * 4.0
end

println("Pi: ", @time leibniz(100_000_000))
```

gives

```
>> time julia pi.jl
  5.770856 seconds (4.48 k allocations: 226.962 KB)
Pi: 3.141592663589326

real 0m6.538s
user 0m6.594s
sys  0m0.234s
```
Re: Estimation of π using Leibniz series
At the very least, something like this: `res += float([1,-1][n mod 2])/(2.0*float(n)+1.0)`.
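For readers comparing against the C version discussed in this thread, the same lookup idea can be sketched in C. This is an illustrative translation under my own naming (`leibniz_lookup` is hypothetical), with an inclusive bound to match Nim's `0..terms`.

```c
#include <assert.h>

/* Leibniz partial sum where the sign is read from a two-entry table
   indexed by the parity of n, mirroring [1,-1][n mod 2] from the Nim line.
   The function name is hypothetical. */
double leibniz_lookup(long terms)
{
    static const double sign[2] = { 1.0, -1.0 };
    assert(terms >= 0);
    double res = 0.0;
    for (long n = 0; n <= terms; ++n) {     /* inclusive, like Nim's 0..terms */
        res += sign[n & 1] / (2.0 * (double)n + 1.0);
    }
    return 4.0 * res;
}
```

Since `n` is never negative here, `n & 1` and `n % 2` select the same table entry.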
Re: Estimation of π using Leibniz series
Calling `pow` just to change the sign may be the reason for the slowdown.
Estimation of π using Leibniz series
Hello,

How can I optimize the speed of the following proc?

```nim
import times, math

proc leibniz(terms: int): float =
  var res = 0.0
  for n in 0..terms:
    res += pow(-1.0,float(n))/(2.0*float(n)+1.0)
  return 4*res

let t0 = cpuTime()
echo(leibniz(100_000_000))
let t1 = cpuTime()
echo "Elapsed time: ", $(t1 - t0)
```

I get the following result on my computer:

```
3.141592663589326
Elapsed time: 8.23
```

This result is almost 5x faster than my CPython counterpart, but on the other hand it is around 6x slower than Julia, given the following code:

```julia
function leibniz(terms)
    res = 0.0
    for i in 0:terms
        res += (-1.0)^i/(2.0*i+1.0)
    end
    return res *= 4.0
end

println("Pi: ", @time leibniz(100_000_000))
```

```
  1.374829 seconds (1.72 k allocations: 90.561 KiB)
Pi: 3.141592663589326
```