Thanks Ed - I will experiment with -O3. Indeed it would be good to make the basic ops as optimised as possible.
Karl

> On 27 Oct 2021, at 3:08 am, Ed . <ej...@hotmail.com> wrote:
>
> If you’re using a typical consumer computer, you’ll get limitations of memory
> bandwidth, which it seems will limit simple calculations on large amounts of
> data. It would probably be worth ensuring one’s installation of PDL is
> compiled with -O3 just in case; -O2 (the usual default) enables vectorisation
> on clang, but not on GCC which only does so on -O3.
>
> I just did a bit more experimenting with very latest PDL on a MacBook with 6
> cores/12 hyperthreads (which apparently defaults to -O3). For comparison,
> normal Perl takes about 28ms for 1000 iterations, so C will be about 1ms.
> Best performance was with PDL_AUTOPTHREAD_SIZE=0 PDL_AUTOPTHREAD_TARG=10 (11
> was about 1.5x as long), where 1000 iterations took about 0.31ms, or a bit
> over 3x quicker than C, and comparable with the JavaScript (which I suspect
> benefits from using GPU or maybe just multicore).
>
> This 2019 presentation
> (https://indico.cern.ch/event/814979/contributions/3401203/attachments/1831468/3115808/VectorParallelismMultiCoreProc.pdf)
> discusses the various issues in making parallel processing go Really Fast.
> For me, a key takeaway is that the problem is generally quite hard, and it’s
> wise to use e.g. BLAS, where all the possible optimisations have been wrung
> out. PDL could benefit from that by parsing the “Code” etc sections, and
> inserting BLAS calls. Similarly, we should probably start using LAPACK in
> core, like GNU Octave etc do. An interesting possibility would be to use the
> “Matriplex” library for vectorising operations on many smallish matrices (it
> even generates code using Perl).
>
> It also mentions Amdahl’s Law, which gives limits to parallelism speedups
> (fundamentally, the non-parallelisable bits impose limits, including
> main-memory access).
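
To make the Amdahl's Law point a little more concrete, here is a minimal Python sketch of the standard formula and its inverse. The function names are made up for illustration, and the 1.6x figure in the example is only a sample value inside the 1.5-1.7x range mentioned in this thread, not a new measurement.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup under Amdahl's Law.

    parallel_fraction: share of the single-threaded run time that can be
    spread across workers (0.0 to 1.0). The rest stays serial.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def parallel_fraction_from_speedup(speedup: float, workers: int) -> float:
    """Invert Amdahl's Law: the parallel fraction implied by a measured speedup."""
    if workers < 2:
        raise ValueError("need at least 2 workers to infer a parallel fraction")
    if speedup <= 0.0:
        raise ValueError("speedup must be positive")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / workers)


if __name__ == "__main__":
    # Illustrative value only: 1.6x observed with 4 threads.
    p = parallel_fraction_from_speedup(1.6, 4)
    print(f"implied parallel fraction: {p:.2f}")
    for n in (2, 4, 8, 16):
        print(f"{n:2d} workers -> {amdahl_speedup(p, n):.2f}x")
    # With unlimited workers the speedup tends to 1 / (1 - p).
    print(f"upper bound: {1.0 / (1.0 - p):.2f}x")
```

Under this simple model, a 1.6x speedup on 4 threads would correspond to only about half of the run time being parallelisable, which caps the speedup at 2x however many threads are added. The model treats memory-bandwidth stalls as part of the serial share, which is a simplification.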
>
> From: Karl Glazebrook <karlglazebr...@mac.com>
> Sent: 26 October 2021 08:57
> To: Ed . <ej...@hotmail.com>
> Cc: Luis Mochan <moc...@icf.unam.mx>; pdl-de...@lists.sourceforge.net;
> perldl <pdl-general@lists.sourceforge.net>
> Subject: Re: [Pdl-devel] benchmarks
>
> This thread is interesting.
>
> I was wondering if anyone has ever seen speedups of 2x or better with
> PDL_AUTOPTHREAD_TARG > 2? I find it tends to max out at around 1.5-1.7x
> whatever I set.
>
> I know about overhead etc. but kind of feel that for some of the basic stuff
> (e.g. A=B*C for large arrays with big chunks) I should see 4x for
> PDL_AUTOPTHREAD_TARG=4, and never do.
>
> The various numbers in the tests reported by Ed show <2x.
>
> Nice getting faster than C!
>
> Karl
>
>
> On 4 Oct 2021, at 1:05 am, Ed . <ej...@hotmail.com> wrote:
>
> Thank you for the independent measurement!
>
> From: Luis Mochan <moc...@icf.unam.mx>
> Sent: 03 October 2021 15:03
> To: pdl-de...@lists.sourceforge.net; perldl <pdl-general@lists.sourceforge.net>
> Subject: Re: [Pdl-devel] benchmarks
>
> Now I have run the C benchmark and Ed's. My results are:
>
> | Program      | # iterations | time (s) | speed (K/s) | factor |
> |--------------+--------------+----------+-------------+--------|
> | ansi c       |        150e6 |      133 |   1127.8195 |     1. |
> | perl         |        1.5e6 |       56 |   26.785714 |   42.1 |
> | my pdl       |         15e6 |       67 |   223.88060 |    5.0 |
> | Ed's pdl     |         15e6 |       16 |       937.5 |    1.2 |
> | Ed's 4 cores |         15e6 |       11 |   1363.6364 |    0.8 |
>
> So, as Ed wrote, just by setting an environment variable,
> perl+pdl+pp_def can be made faster than C.
>
>
> On Sat, Oct 02, 2021 at 07:03:50PM -0500, Luis Mochan wrote:
> > I made my own version of the ray-tracing program (as I tried to
> > understand it). I didn't use pp_def, only Perl and ordinary PDL. I used
> > ...
>
> --
>                                                                   o
> W. Luis Mochán,                      | tel:(52)(777)329-1734     /<(*)
> Instituto de Ciencias Físicas, UNAM  | fax:(52)(777)317-5388     `>/   /\
> Av. Universidad s/n CP 62210         |                           (*)/\/  \
> Cuernavaca, Morelos, México          | moc...@fis.unam.mx   /\_/\__/
> GPG: 791EB9EB, C949 3F81 6D9B 1191 9A16 C2DF 5F0A C52B 791E B9EB
>
>
> _______________________________________________
> pdl-devel mailing list
> pdl-de...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/pdl-devel

_______________________________________________ pdl-general mailing list pdl-general@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/pdl-general