Re: [Pdl-general] [Pdl-devel] benchmarks

2021-10-29 Thread Karl Glazebrook via pdl-general
Thanks Ed - I will experiment with -O3. Indeed it would be good to make the basic ops as optimised as possible.
Karl
> On 27 Oct 2021, at 3:08 am, Ed . wrote:
>
> If you’re using a typical consumer computer, you’ll get limitations of memory
> bandwidth, which it seems will limit simple

Re: [Pdl-general] [Pdl-devel] benchmarks

2021-10-26 Thread Ed .
If you’re using a typical consumer computer, you’ll get limitations of memory bandwidth, which it seems will limit simple calculations on large amounts of data. It would probably be worth ensuring one’s installation of PDL is compiled with -O3 just in case; -O2 (the usual default) enables
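
To put a rough number on the memory-bandwidth point, here is a back-of-the-envelope sketch in Python. It is not from the thread: the 20 GB/s figure is a hypothetical bandwidth chosen purely for illustration, the function names are made up, and cache effects and write-allocate traffic are ignored. It only counts the bytes an elementwise product of doubles has to move per output element.

```python
# Rough upper bound for an elementwise product A = B * C on doubles,
# assuming memory traffic is the only cost.
# ASSUMPTION: the bandwidth used below is hypothetical, not a measurement.

BYTES_PER_DOUBLE = 8


def bytes_moved_per_element(reads=2, writes=1, itemsize=BYTES_PER_DOUBLE):
    """Bytes crossing the memory bus per output element (read B, read C, write A)."""
    return (reads + writes) * itemsize


def max_elements_per_second(bandwidth_bytes_per_s, reads=2, writes=1):
    """Throughput ceiling if the computation is purely bandwidth-bound."""
    return bandwidth_bytes_per_s / bytes_moved_per_element(reads, writes)


if __name__ == "__main__":
    assumed_bandwidth = 20e9  # hypothetical 20 GB/s
    print(bytes_moved_per_element(), "bytes per element")
    print(f"at most {max_elements_per_second(assumed_bandwidth):.3g} elements/s")
```

If the ceiling this gives for a real machine is close to what a single core already achieves, neither -O3 nor extra threads would be expected to help much for such simple operations.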

Re: [Pdl-general] [Pdl-devel] benchmarks

2021-10-26 Thread Karl Glazebrook via pdl-general
This thread is interesting. I was wondering if anyone has ever seen speedups of 2x or better with PDL_AUTOPTHREAD_TARG > 2? I find it tends to max out at around 1.5-1.7x whatever I set. I know about overhead etc. but kind of feel for some of the basic stuff (e.g. A=B*C for large arrays with
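
One possible way to read a plateau like this is Amdahl's law, sketched below in Python. This framing is an assumption added here, not something stated in the thread, and a memory-bandwidth ceiling could explain the same numbers equally well. The worker count of 4 is hypothetical, as are the function names.

```python
# Amdahl's law: speedup with `workers` threads when only a fraction of the
# run time can be parallelised.
# ASSUMPTION: this model is one interpretation of the plateau, not the
# thread's own explanation.


def amdahl_speedup(parallel_fraction, workers):
    """Predicted speedup for a given parallel fraction and worker count."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def implied_parallel_fraction(speedup, workers):
    """Invert Amdahl's law: which parallel fraction yields this speedup?"""
    if workers <= 1:
        raise ValueError("workers must be greater than 1")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / workers)


if __name__ == "__main__":
    workers = 4  # hypothetical thread count
    for observed in (1.5, 1.7):
        p = implied_parallel_fraction(observed, workers)
        print(f"speedup {observed} with {workers} workers -> parallel fraction {p:.2f}")
```

Under this model a speedup that stays near 1.5-1.7x however many threads are requested would mean a large share of the run time is not being parallelised at all.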

Re: [Pdl-general] [Pdl-devel] benchmarks

2021-10-03 Thread Ed .
Thank you for the independent measurement!
From: Luis Mochan
Sent: 03 October 2021 15:03
To: pdl-de...@lists.sourceforge.net; perldl
Subject: Re: [Pdl-devel] benchmarks
Now I have run

Re: [Pdl-general] [Pdl-devel] benchmarks

2021-10-03 Thread Luis Mochan
Now I have run the C benchmark and Ed's. My results are:
| Program | # iterations | time (s) | speed (K/s) | factor |
|---------+--------------+----------+-------------+--------|
| ansi c  | 150e6        | 133      | 1127.8195   | 1.     |
| perl    | 1.5e6
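
The speed column in the ANSI C row is consistent with iterations divided by time, expressed in thousands per second. A small Python check of that arithmetic follows. Reading the factor column as speed relative to ANSI C is an assumption here, since the remaining rows are cut off, and the function names are illustrative only.

```python
# Reproduce the "speed (K/s)" value for the ANSI C row of the table.


def speed_k_per_s(iterations, seconds):
    """Iterations per second, in thousands."""
    return iterations / seconds / 1e3


def factor_vs_reference(speed, reference_speed):
    """ASSUMPTION: 'factor' is taken to mean speed relative to the ANSI C run."""
    return speed / reference_speed


if __name__ == "__main__":
    ansi_c = speed_k_per_s(150e6, 133)
    print(round(ansi_c, 4))                     # 1127.8195
    print(factor_vs_reference(ansi_c, ansi_c))  # 1.0
```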

Re: [Pdl-general] [Pdl-devel] benchmarks

2021-10-02 Thread Luis Mochan
I made my own version of the ray-tracing program (as I tried to understand it). I didn't use pp_def, only Perl and ordinary PDL. I used ndarrays indexed by the three wavelengths and the entrance height, $size of them (I tested $size=1 and 10), so the calculations could have been

Re: [Pdl-general] [Pdl-devel] benchmarks

2021-10-02 Thread Ed .
I have now updated the PDL version, which now shows speed comparable to C, and indeed with the pthreading it’s even faster (which isn’t fun to do in pure C, but incredibly trivial in PDL): https://github.com/Fourmilab/floating_point_benchmarks/pull/1
From: Ed .
Sent:

Re: [Pdl-general] [Pdl-devel] benchmarks

2021-10-01 Thread Ed .
Hi Boyd, The problem as stated is difficult to do vectorised because it features such small data. It occurred to me today that the way to make PDL shine is simply to replace the for-loop with iterations (which isn’t really the PDL way in any case) by taking my @input and multiplying all the

Re: [Pdl-general] [Pdl-devel] benchmarks

2021-10-01 Thread Boyd Duffee
I was thinking that the problem was better suited to using a vector approach, but I haven't been able to find the original article or reference to the technique and I haven't sat down and worked it out from Wyld's loops. I also haven't seen any output from Walker's benchmark scripts. I ended up

Re: [Pdl-general] [Pdl-devel] benchmarks

2021-09-30 Thread Ed .
The pull request now also shows my attempt with a PP function. It’s about twice as slow as pure-Perl (rather than 70 times with the naïve version). The way this would benefit performance-wise would be to trace rays through many more surfaces than 4, or (as mentioned below) with a large number

Re: [Pdl-general] [Pdl-devel] benchmarks

2021-09-30 Thread Ed .
Hi Luis, That was very interesting in the part about Raku, and I think I agree with the author’s comments about that language. I have made a (so-far very naïve) PDL version, visible at https://github.com/Fourmilab/floating_point_benchmarks/pull/1/files?diff=unified=1. This caused some