Re: [Rd] boneheaded BLAS questions

2021-03-17 Thread Dirk Eddelbuettel


On 17 March 2021 at 22:53, Ben Bolker wrote:
|Thanks.  I know it's supposed to Just Work (and I definitely 
| appreciate all the work that's gone into making it Just Work 99% of the 
| time!).

And for what it is worth, the aforementioned 'switching from within' solution
uses FlexiBLAS (not BLIS as I had said in the previous email), and is
described, as applied to R, here:

  
https://www.enchufa2.es/archives/switch-blas-lapack-without-leaving-your-r-session.html

That won't help with what you tried on your Debian-based system though, and I
would (in the near term) keep trying that.

Dirk

-- 
https://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] boneheaded BLAS questions

2021-03-17 Thread Simon Urbanek
Ben,

A possibly useful project related to this is
https://github.com/staticfloat/libblastrampoline
which is what Dirk referred to as the Julia state of the art. It is actually much 
more complex than it sounds because of differences in naming and ABI between 
BLAS implementations, so simple switches don't work.
Dirk has the luxury of having control over what he compiles, but that is not 
always the case (like with Accelerate or MKL) ... 

Cheers,
Simon





Re: [Rd] boneheaded BLAS questions

2021-03-17 Thread Ben Bolker
  Thanks.  I know it's supposed to Just Work (and I definitely 
appreciate all the work that's gone into making it Just Work 99% of the 
time!).


  I tried --with-lapack, no joy.
  Will try to decipher the rules file tomorrow ...

  cheers
   Ben







Re: [Rd] boneheaded BLAS questions

2021-03-17 Thread Dirk Eddelbuettel


Ben,

This stuff has worked unchanged since the 1990s when we had a _really_
far-sighted fellow in Debian come up with the 'switch the links' scheme which was
(and is) subsequently deployed by many numerical applications within Debian,
R and e.g. Octave included.

And I used this ability to switch over a decade ago in a never-quite-finished
paper which resulted in a package, with a vignette serving as the paper draft,
on CRAN: gcbd [1]. It used the ability to switch between implementations to
time, compare and benchmark the various BLAS and LAPACK libraries -- which was
then motivated by a comparison with GPUs. (The actual code / package is
stale-ish as some of the underlying packages, e.g. the GPU one, have gone --
but the mechanics you are after still work the exact same way on Debian and
derivatives including Ubuntu and PopOS.)
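
The kind of timing comparison described above can be sketched in a few lines
of base R. This is only an illustration, not code from gcbd: the helper name
time_crossprod and the matrix size are assumptions made for the example. The
idea is to run it once, switch the BLAS alternative, restart R and run it again.

```r
# Hypothetical helper (not part of gcbd): time one BLAS-heavy operation.
time_crossprod <- function(n = 500, reps = 3) {
  m <- matrix(rnorm(n * n), nrow = n)
  # crossprod() is normally handed to whichever BLAS R is linked against
  timings <- replicate(reps, system.time(crossprod(m))[["elapsed"]])
  median(timings)
}

set.seed(1)
elapsed <- time_crossprod()
cat("median elapsed seconds:", elapsed, "\n")
```

The absolute number means little on its own; it is the ratio between runs
under different BLAS libraries that is of interest.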

(As a complete aside, the state of the art here is now one level up in
libraries based on flame/blis (a riff on blas) which can do a similar logical
switch _at runtime_ (rather than by flipping softlinks and restarting the
app). Julia and some other languages use that; I think Fedora may have it in
its R build as well. Inaki may know more...)

That said, off the top of my head, I think your error may just be with the
second R compilation -- I always (i.e. for the Debian package) use both
  --with-blas --with-lapack
and not just --with-blas. And what I do there is public: if you know where to
look you can see the exact invocation of the Debian build of the R package
(which Ubuntu and Pop and ... then shadow) [2]

Hth, Dirk

[1] https://cran.r-project.org/package=gcbd
[2] https://sources.debian.org/src/r-base/4.0.4-1/debian/rules/
(and I apologise for how messy this still is)
-- 
https://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



[Rd] boneheaded BLAS questions

2021-03-17 Thread Ben Bolker
  I've been going around in circles trying to get BLAS-switching 
working on a current r-devel; I'm sure I'm doing something dumb.  Any 
ideas about what I might be doing wrong, or suggestions for further 
diagnosis, would be welcome!


  tl;dr  I am compiling R-devel with (to the best of my knowledge) 
options set to allow BLAS-switching, but getting "undefined symbol" errors.


 

  Latest R-devel (via SVN), PopOS!/Ubuntu 20.10

  I have read Dirk E's post: https://github.com/eddelbuettel/mkl4deb
  I have attempted to read the relevant section of R Installation & 
Administration several times: 
https://cran.r-project.org/doc/manuals/r-release/R-admin.html#BLAS

  https://wiki.debian.org/DebianScience/LinearAlgebraLibraries


  I have installed MKL and OpenBLAS on my system via 'apt install' 
(libopenblas-dev, libopenblas-base, and TWO versions of intel-mkl-64bit)


  When I build R without BLAS everything is OK;

	rm -Rf r-build; mkdir r-build; cd r-build
	../r-devel/configure --without-blas --enable-R-shlib --enable-BLAS-shlib
	make -j 6



Matrix products: default
BLAS:   /usr/local/lib/R/lib/libRblas.so
LAPACK: /usr/local/lib/R/lib/libRlapack.so
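
For checking which libraries a running session has actually picked up,
without reading the whole sessionInfo() output, something like the following
should work on R 3.4.0 or later. It is only a sketch; the variable names are
made up for the example, and the BLAS entry can be an empty string when R
cannot determine it.

```r
# Paths of the BLAS and LAPACK shared objects used by this R session
blas_lib   <- extSoftVersion()[["BLAS"]]  # may be "" if unknown
lapack_lib <- La_library()

cat("BLAS:  ", blas_lib, "\n")
cat("LAPACK:", lapack_lib, "\n")
```

After switching the alternative and restarting R, the BLAS path should
resolve to a different file if the switch took effect.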


   When I look at my BLAS alternatives I don't see anything obviously 
wrong:



sudo update-alternatives --config libblas.so.3-x86_64-linux-gnu
There are 3 choices for the alternative libblas.so.3-x86_64-linux-gnu
(providing /usr/lib/x86_64-linux-gnu/libblas.so.3).

  Selection  Path                                                     Priority  Status
* 0          /opt/intel/mkl/lib/intel64/libmkl_rt.so                  150       auto mode
  1          /opt/intel/mkl/lib/intel64/libmkl_rt.so                  150       manual mode
  2          /usr/lib/x86_64-linux-gnu/blas/libblas.so.3              10        manual mode
  3          /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3  100       manual mode



  When I rebuild R with --with-blas:

	rm -Rf r-build; mkdir r-build; cd r-build
	../r-devel/configure --with-blas --enable-R-shlib --enable-BLAS-shlib
	make -j 6


 I end up with this:

gcc -I../../../r-devel/src/extra -I/usr/include/tirpc -I. 
-I../../src/include -I../../../r-devel/src/include  -I/usr/local/include 
-I../../../r-devel/src/nmath -DHAVE_CONFIG_H   -fopenmp -fpic  -g -O2 
-c ../../../r-devel/src/main/Rmain.c -o Rmain.o
gcc -Wl,--export-dynamic -fopenmp  -L"../../lib" -L/usr/local/lib -o 
R.bin Rmain.o  -lR -lRblas



/usr/bin/ld: ../../lib/libR.so: undefined reference to `zgemm_'
/usr/bin/ld: ../../lib/libR.so: undefined reference to `daxpy_'
/usr/bin/ld: ../../lib/libR.so: undefined reference to `dgemv_'
/usr/bin/ld: ../../lib/libR.so: undefined reference to `dscal_'


   If

===
intel-mkl-64bit-2018.2-046/all,now 2018.2-046 amd64 [installed]
intel-mkl-64bit-2020.4-912/all,now 2020.4-912 amd64 [installed]

<... lots more intel-mkl stuff>

libblas-dev/groovy,now 3.9.0-3ubuntu1 amd64 [installed,automatic]
libblas3/groovy,now 3.9.0-3ubuntu1 amd64 [installed,automatic]
libgraphblas3/groovy,now 1:5.8.1+dfsg-2 amd64 [installed,automatic]
libgslcblas0/groovy,now 2.6+dfsg-2 amd64 [installed,automatic]
libopenblas-base/groovy,now 0.3.10+ds-3ubuntu1 amd64 [installed]
libopenblas-dev/groovy,now 0.3.10+ds-3ubuntu1 amd64 [installed]
libopenblas-pthread-dev/groovy,now 0.3.10+ds-3ubuntu1 amd64 [installed,automatic]
libopenblas0-pthread/groovy,now 0.3.10+ds-3ubuntu1 amd64 [installed,automatic]

libopenblas0/groovy,now 0.3.10+ds-3ubuntu1 amd64 [installed]



Re: [Rd] reshape documentation

2021-03-17 Thread Michael Dewey

Comments in line

On 13/03/2021 09:50, SOEIRO Thomas wrote:

Dear list,

I have some questions/suggestions about reshape.

1) I think a good amount of the popularity of base::reshape alternatives is
due to the complexity of the reshape documentation. It is quite hard (at least
for me) to figure out which arguments are needed for "long to wide" and
"wide to long" respectively, because reshapeWide and reshapeLong are
documented together.
- Do you agree with this?
- Would you consider a proposal to modify the documentation?
- If yes, what approach do you suggest? e.g. split in two pages?


The current documentation is much clearer than it was when I first 
started using R but we should always strive for more.


I would suggest leaving the documentation in one place, but it might be 
helpful to indicate which direction is relevant for each parameter by adding 
(to wide) or (to long) as appropriate. I do not think completely 
separate lists are needed.


  
2) I do not think the documentation indicates that we can use the varying argument to rename variables in reshapeWide.

- Is this worth documenting?
- Is the construct list(c()) really needed?


Yes, because you may have more than one set of variables which need to 
correspond to a single variable in long format. So in your example, if 
you also had 11 variables for the temperature as well as the 
concentration, each set would need specifying as a separate vector in the list.
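
A small made-up illustration of that point (the data frame and its column
names are invented for the example, they are not from Indometh): with two
sets of repeated measurements, varying takes one character vector per set,
and v.names supplies the matching long-format names.

```r
# Toy wide data: two subjects, two time points, two measured variables
wide <- data.frame(
  id     = 1:2,
  conc_1 = c(0.5, 0.7),   conc_2 = c(0.4, 0.6),
  temp_1 = c(36.5, 37.0), temp_2 = c(36.8, 37.2)
)

# One vector per set of columns that collapses to a single long variable
long <- reshape(wide,
                direction = "long",
                idvar     = "id",
                varying   = list(c("conc_1", "conc_2"),
                                 c("temp_1", "temp_2")),
                v.names   = c("conc", "temp"),
                timevar   = "time",
                times     = 1:2)
print(long)
```

With only one set of columns, the list has a single element, which is where
the list(c(...)) form comes from.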


Michael



reshape(Indometh,
        v.names = "conc",
        idvar = "Subject",
        timevar = "time",
        direction = "wide",
        varying = list(c("conc_0.25hr",
                         "conc_0.5hr",
                         "conc_0.75hr",
                         "conc_1hr",
                         "conc_1.25hr",
                         "conc_2hr",
                         "conc_3hr",
                         "conc_4hr",
                         "conc_5hr",
                         "conc_6hr",
                         "conc_8hr")))

Thanks,

Thomas



--
Michael
http://www.dewey.myzen.co.uk/home.html



Re: [Rd] Faster sorting algorithm...

2021-03-17 Thread Morgan Morgan
Thank you Neal. This is interesting. I will have a look at pqR.
Indeed radix only does C collation, I believe that is why it is not the
default choice for character ordering and sorting.
Not sure but I believe it can help address the following bugzilla item:
https://bugs.r-project.org/bugzilla/show_bug.cgi?id=17400
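
To make the collation point concrete, here is a minimal base R sketch (the
variable names are just for the example). The radix result is the same in
every locale; the "shell" result follows the session's collation rules, so
no output is shown for it.

```r
x <- c("b", "A", "B", "a")

# method = "radix" always uses C-locale ordering: upper case before lower case
radix_sorted <- sort(x, method = "radix")
print(radix_sorted)   # "A" "B" "a" "b"

# method = "shell" respects the current locale's collation
locale_sorted <- sort(x, method = "shell")
print(locale_sorted)
print(identical(radix_sorted, locale_sorted))
```

In a C locale the two agree; in most other locales they typically do not,
which is why radix cannot simply become the default for character vectors.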

On the same topic of collation, there is an experimental sorting function
"psort" in package kit that might help address this issue.

> library(kit)
Attaching kit 0.0.7 (OPENMP enabled using 1 thread)
> x <- c("b","A","B","a","\xe4")
> Encoding(x) <- "latin1"
> identical(psort(x, c.locale=FALSE), sort(x))
[1] TRUE
> identical(psort(x, c.locale=TRUE), sort(x, method="radix"))
[1] TRUE

Coming back to the topic of fsort, I have just finished the implementation
for double, integer, factor and logical.
The implementation takes into account NA, Inf, etc. Values can be
sorted in decreasing or increasing order.
In benchmarks against the current implementation in data.table, it is
currently over 30% faster.
There might be bugs, but I am sure performance can be further improved as I
did not really try hard.
If there is interest in both the implementation and cross-community
sharing, please let me know.

Best regards,
Morgan

On Wed, 17 Mar 2021, 00:37 Radford Neal,  wrote:

> Those interested in faster sorting may want to look at the merge sort
> implemented in pqR (see pqR-project.org).  It's often used as the
> default, because it is stable, and does different collations, while
> being faster than shell sort (except for small vectors).
>
> Here are examples, with timings, for pqR-2020-07-23 and R-4.0.2,
> compiled identically:
>
> -
> pqR-2020-07-23 in C locale:
>
> > set.seed(1)
> > N <- 1000000
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
>user  system elapsed
>   1.332   0.000   1.334
> > print(system.time (or <- order(x,method="radix")))
>user  system elapsed
>   0.092   0.004   0.096
> > print(system.time (om <- order(x,method="merge")))
>user  system elapsed
>   0.363   0.000   0.363
> > print(identical(os,or))
> [1] TRUE
> > print(identical(os,om))
> [1] TRUE
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 1 2
> > print(order(x,method="radix"))
> [1] 1 2
> > print(order(x,method="merge"))
> [1] 1 2
>
> -
> R-4.0.2 in C locale:
>
> > set.seed(1)
> > N <- 1000000
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
>user  system elapsed
>   2.381   0.004   2.387
> > print(system.time (or <- order(x,method="radix")))
>user  system elapsed
>   0.138   0.000   0.137
> > #print(system.time (om <- order(x,method="merge")))
> > print(identical(os,or))
> [1] TRUE
> > #print(identical(os,om))
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 1 2
> > print(order(x,method="radix"))
> [1] 1 2
> > #print(order(x,method="merge"))
>
> 
> pqR-2020-07-23 in fr_CA.utf8 locale:
>
> > set.seed(1)
> > N <- 1000000
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
> utilisateur système  écoulé
>   2.960   0.000   2.962
> > print(system.time (or <- order(x,method="radix")))
> utilisateur système  écoulé
>   0.083   0.008   0.092
> > print(system.time (om <- order(x,method="merge")))
> utilisateur système  écoulé
>   1.143   0.000   1.142
> > print(identical(os,or))
> [1] TRUE
> > print(identical(os,om))
> [1] TRUE
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 2 1
> > print(order(x,method="radix"))
> [1] 1 2
> > print(order(x,method="merge"))
> [1] 2 1
>
> 
> R-4.0.2 in fr_CA.utf8 locale:
>
> > set.seed(1)
> > N <- 1000000
> > x <- as.character (sample(N,N,replace=TRUE))
> > print(system.time (os <- order(x,method="shell")))
> utilisateur système  écoulé
>   4.222   0.016   4.239
> > print(system.time (or <- order(x,method="radix")))
> utilisateur système  écoulé
>   0.136   0.000   0.137
> > #print(system.time (om <- order(x,method="merge")))
> > print(identical(os,or))
> [1] TRUE
> > #print(identical(os,om))
> >
> > x <- c("a","~")
> > print(order(x,method="shell"))
> [1] 2 1
> > print(order(x,method="radix"))
> [1] 1 2
> > #print(order(x,method="merge"))
>
>
> pqR is faster in all the tests, but more relevant to this discussion
> is that the "merge" method is substantially faster than "shell" for
> these character vectors with a million elements, while retaining the
> stability and collation properties of "shell" (whereas "radix" only
> does C collation).
>
> It would probably not be too hard to take the merge sort code from pqR
> and use it in R core's implementation.
>
> The merge sort in pqR doesn't exploit parallelism at the moment, but
> merge so

Re: [Rd] quantile() names

2021-03-17 Thread Martin Maechler
Getting back to this after 3 months :

> Martin Maechler 
> on Wed, 16 Dec 2020 11:13:32 +0100 writes:

> Gabriel Becker 
> on Mon, 14 Dec 2020 13:23:00 -0800 writes:

>> Hi Edgar, I certainly don't think quantile(x, .975) should
>> return 980, as that is a completely wrong answer.

>> I do agree that it seems like the name is a bit
>> offputting. I'm not sure how deep in the machinery you'd
>> have to go to get digits to no effect on the names (I
>> don't have time to dig in right this second).

>> On the other hand, though, if we're going to make the
>> names not respect digits entirely, what do we do when
>> someone does quantile(x, 1/3)? That'd be a bad time had by
>> all without digits coming to the rescue, i think.

>> Best, ~G

> and now we read more replies on this topic without anyone looking at
> the pure R source code which is pretty simple and easy.
> Instead, people do experiments and take time to muse about their
> findings..

> Honestly, I'm disappointed: I've always thought that if you
> *write* on R-devel, you should be able to figure out a few
> things yourself before that..

> It's not rocket science to see/know that you need to quickly look at
> the quantile.default() method function and then to note 
> that it's  format_perc(.) which is used to create the names.

> Almost surely, I've been a bit involved in creating parts of
> this and probably am responsible for the current default
> behavior.

> 
> (sounds of digging) ...
> 
> 
> 
> 
> 
> 

--> Yes:

> 
> r837 | maechler | 1998-03-05 12:20:37 +0100 (Thu, 05. Mar 1998) | 2 Zeilen
> Geänderte Pfade:
> M /trunk/src/library/base/R/quantile
> M /trunk/src/library/base/man/quantile.Rd

> fixed names(.) construction
> 

> With this diff  (my 'svn-diffB -c837 quantile') :
> Index: quantile
> ===
> 21c21,23
> < names(qs) <- paste(round(100 * probs), "%", sep = "")
> ---
>>names(qs) <- paste(formatC(100 * probs, format= "fg", wid=1,
>>  dig= max(2,.Options$digits)),
>> "%", sep = "")

> -
> so this was before this was modularized into the format_perc()
> utility and quite a while before R 1.0.0 

> Now, 22.8 years later, I do think that indeed it was not
> necessarily the best idea to make the names() construction depend  on the
> 'digits' option entirely and just protect it by using at least 2 digits.

> What I think is better is to

> 1) provide an optional argument   'digits = 7'
> back compatible w/ default getOption("digits")

> 2) when used, check that it is at least '1'

> But then some scripts / examples of some people *will* change
> ..., e.g., because they preferred to have a global setting of digits=5

> so I'm guessing it may make more people unhappy than other
> people happy if we change this now, after close to 23 years  .. ??

> Martin

I had more thoughts about this, and noticed that not one example
or test in base R  plus Recommended packages was changed, so
I've now committed the above change.

NEWS entry

• The names of quantile()'s result no longer depend on the global
  getOption("digits"), but quantile() gets a new optional argument
  digits = 7 instead.
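
To illustrate what changes, here is a rough sketch that mimics the names
construction from the diff above using formatC() directly. It is not the
actual format_perc() source; make_names is a hypothetical helper written
only for this example.

```r
# Hypothetical helper imitating how the percentage names are built
make_names <- function(probs, digits) {
  paste0(formatC(100 * probs, format = "fg", width = 1, digits = digits), "%")
}

probs <- c(0.975, 1/3)

# Old behaviour under options(digits = 2): names followed the global option
old_names <- make_names(probs, digits = 2)
# New behaviour: names use quantile()'s own 'digits' argument, default 7
new_names <- make_names(probs, digits = 7)

print(old_names)
print(new_names)
```

The first element shows the case from the original report: the name is "98%"
with two digits and "97.5%" with seven.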

Martin


--
Martin Maechler
ETH Zurich  and  R Core team


>> On Mon, Dec 14, 2020 at 11:55 AM Merkle, Edgar
>> C.  wrote:

>>> All,
>>> 
>>> Consider the code below
>>> 
>>> options(digits=2)
>>> x <- 1:1000 
>>> quantile(x, .975)

>>> The value returned is 975 (the 97.5th percentile), but
>>> the name has been shortened to "98%" due to the digits
>>> option. Is this intended? I would have expected the name
>>> to also be "97.5%" here. Alternatively, the returned
>>> value might be 980 in order to match the name of "98%".
>>> 
>>> Best, Ed
>>>
