Re: [Rd] boneheaded BLAS questions
On 17 March 2021 at 22:53, Ben Bolker wrote:
| Thanks. I know it's supposed to Just Work (and I definitely
| appreciate all the work that's gone into making it Just Work 99% of the
| time!).

And for what it is worth, the aforementioned 'switching from within'
solution uses FlexiBLAS (not BLIS as I had said in the previous email), and
was described, as applied to R, here:

  https://www.enchufa2.es/archives/switch-blas-lapack-without-leaving-your-r-session.html

That won't help with what you tried on your Debian-based system though, and
I would (in the near term) stick with that.

Dirk

--
https://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
Re: [Rd] boneheaded BLAS questions
Ben,

a possibly useful project related to this is

  https://github.com/staticfloat/libblastrampoline

which is what Dirk referred to as the Julia state of the art. It is actually
much more complex than it sounds because of differences in naming and ABI
between BLAS implementations, so simple switches don't work. Dirk has the
luxury of having control over what he compiles, but that is not always the
case (as with Accelerate or MKL) ...

Cheers,
Simon

> On 18/03/2021, at 3:53 PM, Ben Bolker wrote:
>
> Thanks. I know it's supposed to Just Work (and I definitely appreciate all
> the work that's gone into making it Just Work 99% of the time!).
>
> I tried --with-lapack, no joy.
> Will try to decipher the rules file tomorrow ...
>
> cheers
> Ben
>
> On 3/17/21 10:25 PM, Dirk Eddelbuettel wrote:
>> Ben,
>>
>> This stuff has worked unchanged since the 1990s when we had a _really_ far
>> sighted fellow in Debian come up with the 'switch the links' scheme which was
>> (and is) subsequently deployed by many numerical applications within Debian,
>> R and e.g. Octave included.
>>
>> And I used this ability to switch over a decade ago in a never-quite-finished
>> paper which resulted in a package as well as a vignette as paper draft on
>> CRAN: gcbd [1] It used the ability to switch between implementations to time
>> and compare and benchmark the various BLAS and LAPACK libraries -- which was
>> then motivated by a comparison with GPUs. (The actual code / package is
>> stale-ish as some of the underlying packages have gone, e.g. the GPU one --
>> but the mechanics you are after still work the exact same way on Debian and
>> derivatives including Ubuntu and PopOS.)
>>
>> (As a complete aside, the state of the art here is now one level up in
>> libraries based on flame/blis (a riff on blas) which can do a similar logical
>> switch _at runtime_ (rather than by flipping softlinks and restarting the
>> app). Julia and some other languages use that, I think Fedora may have it in
>> its R build as well. Inaki may know more...)
>>
>> That said, from the top of my head, I think your error may just be with the
>> second R compilation -- I always (i.e. for the Debian package) use both
>>
>>    --with-blas --with-lapack
>>
>> and not just --with-blas. And what I do there is public: if you know where
>> to look you can see the exact invocation of the Debian build of the R
>> package (which Ubuntu and Pop and ... then shadow) [2]
>>
>> Hth, Dirk
>>
>> [1] https://cran.r-project.org/package=gcbd
>> [2] https://sources.debian.org/src/r-base/4.0.4-1/debian/rules/
>>     (and I apologise for how messy this still is)
Re: [Rd] boneheaded BLAS questions
Thanks. I know it's supposed to Just Work (and I definitely appreciate all
the work that's gone into making it Just Work 99% of the time!).

I tried --with-lapack, no joy.
Will try to decipher the rules file tomorrow ...

cheers
Ben

On 3/17/21 10:25 PM, Dirk Eddelbuettel wrote:
> Ben,
>
> This stuff has worked unchanged since the 1990s when we had a _really_ far
> sighted fellow in Debian come up with the 'switch the links' scheme which was
> (and is) subsequently deployed by many numerical applications within Debian,
> R and e.g. Octave included.
>
> And I used this ability to switch over a decade ago in a never-quite-finished
> paper which resulted in a package as well as a vignette as paper draft on
> CRAN: gcbd [1] It used the ability to switch between implementations to time
> and compare and benchmark the various BLAS and LAPACK libraries -- which was
> then motivated by a comparison with GPUs. (The actual code / package is
> stale-ish as some of the underlying packages have gone, e.g. the GPU one --
> but the mechanics you are after still work the exact same way on Debian and
> derivatives including Ubuntu and PopOS.)
>
> (As a complete aside, the state of the art here is now one level up in
> libraries based on flame/blis (a riff on blas) which can do a similar logical
> switch _at runtime_ (rather than by flipping softlinks and restarting the
> app). Julia and some other languages use that, I think Fedora may have it in
> its R build as well. Inaki may know more...)
>
> That said, from the top of my head, I think your error may just be with the
> second R compilation -- I always (i.e. for the Debian package) use both
>
>    --with-blas --with-lapack
>
> and not just --with-blas. And what I do there is public: if you know where
> to look you can see the exact invocation of the Debian build of the R
> package (which Ubuntu and Pop and ... then shadow) [2]
>
> Hth, Dirk
>
> [1] https://cran.r-project.org/package=gcbd
> [2] https://sources.debian.org/src/r-base/4.0.4-1/debian/rules/
>     (and I apologise for how messy this still is)
Re: [Rd] boneheaded BLAS questions
Ben,

This stuff has worked unchanged since the 1990s when we had a _really_ far
sighted fellow in Debian come up with the 'switch the links' scheme which was
(and is) subsequently deployed by many numerical applications within Debian,
R and e.g. Octave included.

And I used this ability to switch over a decade ago in a never-quite-finished
paper which resulted in a package as well as a vignette as paper draft on
CRAN: gcbd [1] It used the ability to switch between implementations to time
and compare and benchmark the various BLAS and LAPACK libraries -- which was
then motivated by a comparison with GPUs. (The actual code / package is
stale-ish as some of the underlying packages have gone, e.g. the GPU one --
but the mechanics you are after still work the exact same way on Debian and
derivatives including Ubuntu and PopOS.)

(As a complete aside, the state of the art here is now one level up in
libraries based on flame/blis (a riff on blas) which can do a similar logical
switch _at runtime_ (rather than by flipping softlinks and restarting the
app). Julia and some other languages use that, I think Fedora may have it in
its R build as well. Inaki may know more...)

That said, from the top of my head, I think your error may just be with the
second R compilation -- I always (i.e. for the Debian package) use both

   --with-blas --with-lapack

and not just --with-blas. And what I do there is public: if you know where
to look you can see the exact invocation of the Debian build of the R
package (which Ubuntu and Pop and ... then shadow) [2]

Hth, Dirk

[1] https://cran.r-project.org/package=gcbd
[2] https://sources.debian.org/src/r-base/4.0.4-1/debian/rules/
    (and I apologise for how messy this still is)

--
https://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org
[Rd] boneheaded BLAS questions
I've been going around in circles trying to get BLAS-switching working on a
current r-devel; I'm sure I'm doing something dumb. Any ideas about what I
might be doing wrong, or suggestions for further diagnosis, would be welcome!

tl;dr I am compiling R-devel with (to the best of my knowledge) options set
to allow BLAS-switching, but getting "undefined symbol" errors.

Latest R-devel (via SVN), PopOS!/Ubuntu 20.10

I have read Dirk E's post:

  https://github.com/eddelbuettel/mkl4deb

I have attempted to read the relevant section of R Installation &
Administration several times:

  https://cran.r-project.org/doc/manuals/r-release/R-admin.html#BLAS
  https://wiki.debian.org/DebianScience/LinearAlgebraLibraries

I have installed MKL and OpenBLAS on my system via 'apt install'
(libopenblas-dev, libopenblas-base, and TWO versions of intel-mkl-64bit)

When I build R without BLAS everything is OK:

  rm -Rf r-build; mkdir r-build; cd r-build; ../r-devel/configure --without-blas --enable-R-shlib --enable-BLAS-shlib; make -j 6

  Matrix products: default
  BLAS:   /usr/local/lib/R/lib/libRblas.so
  LAPACK: /usr/local/lib/R/lib/libRlapack.so

When I look at my BLAS alternatives I don't see anything obviously wrong:

  sudo update-alternatives --config libblas.so.3-x86_64-linux-gnu

  There are 3 choices for the alternative libblas.so.3-x86_64-linux-gnu
  (providing /usr/lib/x86_64-linux-gnu/libblas.so.3).

    Selection  Path                                                     Priority  Status
  * 0          /opt/intel/mkl/lib/intel64/libmkl_rt.so                  150       auto mode
    1          /opt/intel/mkl/lib/intel64/libmkl_rt.so                  150       manual mode
    2          /usr/lib/x86_64-linux-gnu/blas/libblas.so.3              10        manual mode
    3          /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3  100       manual mode

When I rebuild R with --with-blas:

  rm -Rf r-build; mkdir r-build; cd r-build; ../r-devel/configure --with-blas --enable-R-shlib --enable-BLAS-shlib; make -j 6

I end up with this:

  gcc -I../../../r-devel/src/extra -I/usr/include/tirpc -I. -I../../src/include -I../../../r-devel/src/include -I/usr/local/include -I../../../r-devel/src/nmath -DHAVE_CONFIG_H -fopenmp -fpic -g -O2 -c ../../../r-devel/src/main/Rmain.c -o Rmain.o
  gcc -Wl,--export-dynamic -fopenmp -L"../../lib" -L/usr/local/lib -o R.bin Rmain.o -lR -lRblas
  /usr/bin/ld: ../../lib/libR.so: undefined reference to `zgemm_'
  /usr/bin/ld: ../../lib/libR.so: undefined reference to `daxpy_'
  /usr/bin/ld: ../../lib/libR.so: undefined reference to `dgemv_'
  /usr/bin/ld: ../../lib/libR.so: undefined reference to `dscal_'

If ===

  intel-mkl-64bit-2018.2-046/all,now 2018.2-046 amd64 [installed]
  intel-mkl-64bit-2020.4-912/all,now 2020.4-912 amd64 [installed]
  <... lots more intel-mkl stuff>
  libblas-dev/groovy,now 3.9.0-3ubuntu1 amd64 [installed,automatic]
  libblas3/groovy,now 3.9.0-3ubuntu1 amd64 [installed,automatic]
  libgraphblas3/groovy,now 1:5.8.1+dfsg-2 amd64 [installed,automatic]
  libgslcblas0/groovy,now 2.6+dfsg-2 amd64 [installed,automatic]
  libopenblas-base/groovy,now 0.3.10+ds-3ubuntu1 amd64 [installed]
  libopenblas-dev/groovy,now 0.3.10+ds-3ubuntu1 amd64 [installed]
  libopenblas-pthread-dev/groovy,now 0.3.10+ds-3ubuntu1 amd64 [installed,automatic]
  libopenblas0-pthread/groovy,now 0.3.10+ds-3ubuntu1 amd64 [installed,automatic]
  libopenblas0/groovy,now 0.3.10+ds-3ubuntu1 amd64 [installed]
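For the "further diagnosis" part of the question, one possible check (a
minimal sketch, not something taken from the thread) is to ask a working R
build which BLAS and LAPACK shared libraries it actually resolved. It only
uses base R functions, extSoftVersion() and La_library(), both available
since R 3.4.0; the 500 x 500 matrix size and the variable names are
arbitrary choices for illustration.

```r
# Report which BLAS and LAPACK shared libraries this R session is using.
# On some platforms either value may be an empty string.
blas_path   <- extSoftVersion()[["BLAS"]]
lapack_path <- La_library()

cat("BLAS:  ", blas_path, "\n")
cat("LAPACK:", lapack_path, "\n")

# Crude timing of a BLAS-bound operation, useful for comparing sessions
# before and after switching the system alternative.
set.seed(1)
m <- matrix(rnorm(500 * 500), nrow = 500, ncol = 500)
elapsed <- system.time(p <- crossprod(m))[["elapsed"]]
cat("crossprod() on a 500 x 500 matrix took", elapsed, "seconds\n")
```

If the reported paths still point at libRblas.so after flipping the
alternative, the build is using R's internal BLAS rather than the system
one.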
Re: [Rd] reshape documentation
Comments in line

On 13/03/2021 09:50, SOEIRO Thomas wrote:
> Dear list,
>
> I have some questions/suggestions about reshape.
>
> 1) I think a good amount of the popularity of base::reshape alternatives
> is due to the complexity of the reshape documentation. It is quite hard
> (at least it is for me) to figure out which arguments are needed for
> "long to wide" and "wide to long" respectively, because reshapeWide and
> reshapeLong are documented together.
> - Do you agree with this?
> - Would you consider a proposal to modify the documentation?
> - If yes, what approach do you suggest? e.g. split in two pages?

The current documentation is much clearer than it was when I first started
using R but we should always strive for more. I would suggest leaving the
documentation in one place but it might be helpful to add which direction is
relevant for each parameter by placing (to wide) or (to long) as
appropriate. I think having completely separate lists is not needed.

> 2) I do not think the documentation indicates that we can use the varying
> argument to rename variables in reshapeWide.
> - Is this worth documenting?
> - Is the construct list(c()) really needed?

Yes, because you may have more than one set of variables which need to
correspond to a single variable in long format. So in your example, if you
also had 11 variables for the temperature as well as the concentration, each
would need specifying as a separate vector in the list.

Michael

> reshape(Indometh, v.names = "conc", idvar = "Subject",
>         timevar = "time", direction = "wide",
>         varying = list(c("conc_0.25hr", "conc_0.5hr", "conc_0.75hr",
>                          "conc_1hr", "conc_1.25hr", "conc_2hr", "conc_3hr",
>                          "conc_4hr", "conc_5hr", "conc_6hr", "conc_8hr")))
>
> Thanks,
> Thomas

--
Michael
http://www.dewey.myzen.co.uk/home.html
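Michael's point about list(c()) can be illustrated with a small sketch. The
data frame below is hypothetical (its name, columns and values are made up
for illustration and do not come from the thread); it has two measured
variables, so varying needs one character vector per variable, each with one
name per time point.

```r
# Hypothetical long-format data: two subjects, three time points,
# two measured variables (conc and temp).
long <- data.frame(
  id   = rep(1:2, each = 3),
  time = rep(c(1, 2, 3), times = 2),
  conc = c(1.5, 0.9, 0.4, 2.0, 1.2, 0.6),
  temp = c(36.6, 36.8, 37.0, 36.5, 36.9, 37.1)
)

# One vector of new column names per entry of v.names, in the same order
# as v.names; within each vector, one name per time point.
wide <- reshape(
  long,
  v.names   = c("conc", "temp"),
  idvar     = "id",
  timevar   = "time",
  direction = "wide",
  varying   = list(
    c("conc_t1", "conc_t2", "conc_t3"),
    c("temp_t1", "temp_t2", "temp_t3")
  )
)
print(wide)
```

With a single measured variable the list has just one element, which is the
list(c(...)) form used in the Indometh call above.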
Re: [Rd] Faster sorting algorithm...
Thank you Neal. This is interesting. I will have a look at pqR.

Indeed radix only does C collation; I believe that is why it is not the
default choice for character ordering and sorting.

Not sure, but I believe it can help address the following bugzilla item:

  https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17400

On the same topic of collation, there is an experimental sorting function
"psort" in package kit that might help address this issue.

  > library(kit)
  Attaching kit 0.0.7 (OPENMP enabled using 1 thread)
  > x <- c("b","A","B","a","\xe4")
  > Encoding(x) <- "latin1"
  > identical(psort(x, c.locale=FALSE), sort(x))
  [1] TRUE
  > identical(psort(x, c.locale=TRUE), sort(x, method="radix"))
  [1] TRUE

Coming back to the topic of fsort, I have just finished the implementation
for double, integer, factor and logical. The implementation takes into
account NA, Inf.. values. Values can be sorted in decreasing or increasing
order. Comparing benchmarks with the current implementation in data.table,
it is currently over 30% faster. There might be bugs, but I am sure
performance can be further improved as I did not really try hard.

If there is interest in both the implementation and cross-community sharing,
please let me know.

Best regards,
Morgan

On Wed, 17 Mar 2021, 00:37 Radford Neal, wrote:

> Those interested in faster sorting may want to look at the merge sort
> implemented in pqR (see pqR-project.org). It's often used as the
> default, because it is stable, and does different collations, while
> being faster than shell sort (except for small vectors).
>
> Here are examples, with timings, for pqR-2020-07-23 and R-4.0.2,
> compiled identically:
>
> pqR-2020-07-23 in C locale:
>
> > set.seed(1)
> > N <- 1000000
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
>    user  system elapsed
>   1.332   0.000   1.334
> > print(system.time (or <- order(x,method="radix")))
>    user  system elapsed
>   0.092   0.004   0.096
> > print(system.time (om <- order(x,method="merge")))
>    user  system elapsed
>   0.363   0.000   0.363
> > print(identical(os,or))
> [1] TRUE
> > print(identical(os,om))
> [1] TRUE
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 1 2
> > print(order(x,method="radix"))
> [1] 1 2
> > print(order(x,method="merge"))
> [1] 1 2
>
> R-4.0.2 in C locale:
>
> > set.seed(1)
> > N <- 1000000
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
>    user  system elapsed
>   2.381   0.004   2.387
> > print(system.time (or <- order(x,method="radix")))
>    user  system elapsed
>   0.138   0.000   0.137
> > #print(system.time (om <- order(x,method="merge")))
> > print(identical(os,or))
> [1] TRUE
> > #print(identical(os,om))
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 1 2
> > print(order(x,method="radix"))
> [1] 1 2
> > #print(order(x,method="merge"))
>
> pqR-2020-07-23 in fr_CA.utf8 locale:
>
> > set.seed(1)
> > N <- 1000000
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
> utilisateur     système      écoulé
>       2.960       0.000       2.962
> > print(system.time (or <- order(x,method="radix")))
> utilisateur     système      écoulé
>       0.083       0.008       0.092
> > print(system.time (om <- order(x,method="merge")))
> utilisateur     système      écoulé
>       1.143       0.000       1.142
> > print(identical(os,or))
> [1] TRUE
> > print(identical(os,om))
> [1] TRUE
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 2 1
> > print(order(x,method="radix"))
> [1] 1 2
> > print(order(x,method="merge"))
> [1] 2 1
>
> R-4.0.2 in fr_CA.utf8 locale:
>
> > set.seed(1)
> > N <- 1000000
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
> utilisateur     système      écoulé
>       4.222       0.016       4.239
> > print(system.time (or <- order(x,method="radix")))
> utilisateur     système      écoulé
>       0.136       0.000       0.137
> > #print(system.time (om <- order(x,method="merge")))
> > print(identical(os,or))
> [1] TRUE
> > #print(identical(os,om))
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 2 1
> > print(order(x,method="radix"))
> [1] 1 2
> > #print(order(x,method="merge"))
>
> pqR is faster in all the tests, but more relevant to this discussion
> is that the "merge" method is substantially faster than "shell" for
> these character vectors with a million elements, while retaining the
> stability and collation properties of "shell" (whereas "radix" only
> does C collation).
>
> It would probably not be too hard to take the merge sort code from pqR
> and use it in R core's implementation.
>
> The merge sort in pqR doesn't exploit parallelism at the moment, but
> merge so
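The collation point in this message can be checked in any R session with a
short sketch. It reuses the two strings from the transcripts and only the
"radix" and "shell" methods that exist in base R (method = "merge" is
specific to pqR and is not used here); the variable names are illustrative.

```r
# Two strings whose relative order differs between C collation and
# many natural-language locales.
x <- c("a", "~")

# "radix" always collates as in the C locale, whatever LC_COLLATE is.
ord_radix <- order(x, method = "radix")
print(ord_radix)

# "shell" follows the session's collation locale, so its result can vary
# from one machine (or locale setting) to another.
print(Sys.getlocale("LC_COLLATE"))
ord_shell <- order(x, method = "shell")
print(ord_shell)

# TRUE in the C locale; may be FALSE elsewhere (e.g. fr_CA.utf8 above).
print(identical(ord_radix, ord_shell))
```

Only the radix result is the same everywhere; the last line is therefore
printed rather than assumed.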
Re: [Rd] quantile() names
Getting back to this after 3 months:

> Martin Maechler
>     on Wed, 16 Dec 2020 11:13:32 +0100 writes:

> Gabriel Becker
>     on Mon, 14 Dec 2020 13:23:00 -0800 writes:

>> Hi Edgar, I certainly don't think quantile(x, .975) should
>> return 980, as that is a completely wrong answer.

>> I do agree that it seems like the name is a bit
>> offputting. I'm not sure how deep in the machinery you'd
>> have to go to get digits to have no effect on the names (I
>> don't have time to dig in right this second).

>> On the other hand, though, if we're going to make the
>> names not respect digits entirely, what do we do when
>> someone does quantile(x, 1/3)? That'd be a bad time had by
>> all without digits coming to the rescue, i think.

>> Best, ~G

> and now we read more replies on this topic without anyone looking at
> the pure R source code which is pretty simple and easy.
> Instead, people do experiments and take time to muse about their findings..

> Honestly, I'm disappointed: I've always thought that if you
> *write* on R-devel, you should be able to figure out a few
> things yourself before that..

> It's not rocket science to see/know that you need to quickly look at
> the quantile.default() method function and then to note
> that it's format_perc(.) which is used to create the names.

> Almost surely, I've been a bit involved in creating parts of
> this and probably am responsible for the current default
> behavior.
>
> (sounds of digging) ...
>
> --> Yes:
>
> r837 | maechler | 1998-03-05 12:20:37 +0100 (Thu, 05. Mar 1998) | 2 Zeilen
> Geänderte Pfade:
>    M /trunk/src/library/base/R/quantile
>    M /trunk/src/library/base/man/quantile.Rd
>
> fixed names(.) construction
>
> With this diff (my 'svn-diffB -c837 quantile') :
>
> Index: quantile
> ===
> 21c21,23
> <     names(qs) <- paste(round(100 * probs), "%", sep = "")
> ---
> >     names(qs) <- paste(formatC(100 * probs, format= "fg", wid=1,
> >                                dig= max(2,.Options$digits)),
> >                        "%", sep = "")

> so this was before this was modularized into the format_perc()
> utility and quite a while before R 1.0.0

> Now, 22.8 years later, I do think that indeed it was not
> necessarily the best idea to make the names() construction depend on the
> 'digits' option entirely and just protect it by using at least 2 digits.

> What I think is better is to
> 1) provide an optional argument 'digits = 7'
>    back compatible w/ default getOption("digits")
> 2) when used, check that it is at least '1'

> But then some scripts / examples of some people *will* change
> ..., e.g., because they preferred to have a global setting of digits=5
> so I'm guessing it may make more people unhappy than other
> people happy if we change this now, after close to 23 years .. ??

> Martin

I had more thoughts about this, and noticed that not one example or test in
base R plus Recommended packages was changed, so I've now committed the
above change.

NEWS entry

  • The names of quantile()'s result no longer depend on the global
    getOption("digits"), but quantile() gets a new optional argument
    digits = 7 instead.

Martin

--
Martin Maechler
ETH Zurich and R Core team

>> On Mon, Dec 14, 2020 at 11:55 AM Merkle, Edgar C. wrote:
>>> All,
>>>
>>> Consider the code below
>>>
>>> options(digits=2)
>>> x <- 1:1000
>>> quantile(x, .975)
>>>
>>> The value returned is 975 (the 97.5th percentile), but
>>> the name has been shortened to "98%" due to the digits
>>> option. Is this intended? I would have expected the name
>>> to also be "97.5%" here. Alternatively, the returned
>>> value might be 980 in order to match the name of "98%".
>>>
>>> Best, Ed
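The behaviour described in the NEWS entry can be tried out with a short
sketch. It assumes an R version that already includes the committed change
(the digits argument of quantile()); on older versions the names still
follow the global option, so the second printed name may differ there. The
variable names are illustrative only.

```r
x <- 1:1000

# Default: names are formatted with digits = 7.
q_default <- quantile(x, 0.975)
print(q_default)

# With the change, lowering the global option only affects how the value
# is printed, not the name attached to it.
old <- options(digits = 2)
q_global <- quantile(x, 0.975)
print(names(q_global))
options(old)

# Shorter names now have to be requested explicitly via the new argument.
q_short <- quantile(x, 0.975, digits = 2)
print(names(q_short))
```

Saving and restoring the option with old <- options(...) keeps the sketch
from changing the rest of the session.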