Re: [R-pkg-devel] separate Functions: and Datasets: indices?

2018-05-18 Thread Ben Bolker

  Figured it out.  MASS has a custom INDEX file ...

--
1.1.4 The INDEX file

The optional file INDEX contains a line for each sufficiently
interesting object in the package, giving its name and a description
(functions such as print methods not usually called explicitly might not
be included). Normally this file is missing and the corresponding
information is automatically generated from the documentation sources
(using tools::Rdindex()) when installing from source.

The file is part of the information given by library(help = pkgname).

Rather than editing this file, it is preferable to put customized
information about the package into an overview help page (see
Documenting packages) and/or a vignette (see Writing package vignettes).
---
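
For reference, the automatically generated listing described above can be reproduced by calling tools::Rdindex() directly; a hand-written INDEX file simply replaces that generated listing. A minimal sketch, assuming a throwaway Rd file (the topic `foo` and its title are made up for illustration and are not taken from MASS):

```r
# Hypothetical one-topic Rd file, written to a temporary location.
rd_file <- file.path(tempdir(), "foo.Rd")
writeLines(c(
  "\\name{foo}",
  "\\alias{foo}",
  "\\title{A Toy Function}",
  "\\description{Does nothing; exists only to be indexed.}"
), rd_file)

# Rdindex() writes one "name  title" entry per Rd file. With the default
# outFile = "" the entries go to stdout, so capture them for inspection.
index_lines <- capture.output(tools::Rdindex(rd_file))
print(index_lines)
```

The captured lines have the same name-plus-title layout that library(help = pkgname) shows.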


On 2018-05-18 03:44 PM, Ben Bolker wrote:
> 
>   I notice that when I say  help(package="MASS") I get separate indices
> for functions and data sets.  AFAICT this doesn't seem to occur in other
> packages that have both functions and data sets (e.g. mgcv, lattice,
> lme4), despite the tags \docType{data} and \keyword{datasets} being
> used in the relevant .Rd files; I don't see any other obvious magic in
> the .Rd files for MASS (in fact, they don't use the \docType{} tag at
> all) ...
> 
>   Before I go spelunking, does anyone have any guesses/ideas/information
> about why MASS is special (or if it is)?
> 
>   cheers
> Ben Bolker
>

__
R-package-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-package-devel


Re: [R-pkg-devel] [FORGED] separate Functions: and Datasets: indices?

2018-05-18 Thread Rolf Turner


On 19/05/18 07:44, Ben Bolker wrote:



>   I notice that when I say  help(package="MASS") I get separate indices
> for functions and data sets.  AFAICT this doesn't seem to occur in other
> packages that have both functions and data sets (e.g. mgcv, lattice,
> lme4), despite the tags \docType{data} and \keyword{datasets} being
> used in the relevant .Rd files; I don't see any other obvious magic in
> the .Rd files for MASS (in fact, they don't use the \docType{} tag at
> all) ...
>
>   Before I go spelunking, does anyone have any guesses/ideas/information
> about why MASS is special (or if it is)?


Doesn't happen to me.  (I am still using R 3.4.  Is this an R 3.5
phenomenon?) I get a single alphabetical listing of functions and data
sets, intermingled.  E.g.:



-- A --

abbey             Determinations of Nickel Content
accdeaths         Accidental Deaths in the US 1973-1978
addterm           Try All One-Term Additions to a Model
addterm.default   Try All One-Term Additions to a Model
addterm.glm       Try All One-Term Additions to a Model
addterm.lm        Try All One-Term Additions to a Model
Aids2             Australian AIDS Survival Data
Animals           Brain and Body Weights for 28 Species

...
...

My session info:


 > sessionInfo()
R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.4 LTS

Matrix products: default
BLAS: /usr/local/lib64/R/lib/libRblas.so
LAPACK: /usr/local/lib64/R/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_NZ.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_NZ.UTF-8        LC_COLLATE=en_NZ.UTF-8
 [5] LC_MONETARY=en_NZ.UTF-8    LC_MESSAGES=en_NZ.UTF-8
 [7] LC_PAPER=en_NZ.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_NZ.UTF-8 LC_IDENTIFICATION=C


attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 


other attached packages:
[1] misc_0.0-16

loaded via a namespace (and not attached):
 [1] Brobdingnag_1.2-4    splines_3.4.4        gtools_3.5.0
 [4] StanHeaders_2.16.0-1 threejs_0.3.1        shiny_1.0.3
 [7] assertthat_0.2.0     stats4_3.4.4         pillar_1.0.1




cheers,

Rolf

--
Technical Editor ANZJS
Department of Statistics
University of Auckland
Phone: +64-9-373-7599 ext. 88276



Re: [R-pkg-devel] questions about promptPackage(); -package.Rd; help(package="")

2018-05-18 Thread Dirk Eddelbuettel

Ben,

Been meaning to write a short blog post about it as it also affects two (old)
packages of mine.  My favourite is to just rely on the Rd macros to the
fullest, and I generally just hand-edit it -- no promptPackage() use.

See eg this side-by-side diff of the first of the two I need to update; the
other may follow tomorrow.

https://github.com/eddelbuettel/inline/commit/51d1ed2fbb5493b0cbc76d9bdf22beec7fe42ec9?diff=split#diff-c6be7fcd65260038f24d97252a80fef7

This should give you the old one on the left, and the new one -- which is
essentially just references to DESCRIPTION -- on the right.  As it is short, I
include the new one (indented by three spaces):

   \name{inline-package}
   \alias{inline-package}
   \alias{inline}
   \docType{package}
   \title{\packageTitle{inline}}
   \description{\packageDescription{inline}}
   \seealso{\code{\link{cfunction}}, \code{\link{cxxfunction}}}
   \author{\packageAuthor{inline}}
   \section{Maintainer}{\packageMaintainer{inline}}
   \keyword{package}
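
As a rough illustration of the metadata those macros refer to, the same DESCRIPTION fields can be read from an installed package with utils::packageDescription(). This sketch uses the stats package only because it ships with every R installation; it is not a claim about how the Rd macros are implemented internally.

```r
# Read the DESCRIPTION metadata of an installed package.
desc <- utils::packageDescription("stats")

# The fields that \packageTitle{}, \packageDescription{} and
# \packageAuthor{} refer to in a -package.Rd file.
pkg_title  <- desc[["Title"]]
pkg_text   <- desc[["Description"]]
pkg_author <- desc[["Author"]]

cat(pkg_title, "\n")
```

Because the Rd page only references these fields, updating DESCRIPTION is enough to keep the overview page current.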

Hth,  Dirk

-- 
http://dirk.eddelbuettel.com | @eddelbuettel | e...@debian.org



[R-pkg-devel] separate Functions: and Datasets: indices?

2018-05-18 Thread Ben Bolker

  I notice that when I say  help(package="MASS") I get separate indices
for functions and data sets.  AFAICT this doesn't seem to occur in other
packages that have both functions and data sets (e.g. mgcv, lattice,
lme4), despite the tags \docType{data} and \keyword{datasets} being
used in the relevant .Rd files; I don't see any other obvious magic in
the .Rd files for MASS (in fact, they don't use the \docType{} tag at
all) ...

  Before I go spelunking, does anyone have any guesses/ideas/information
about why MASS is special (or if it is)?

  cheers
Ben Bolker



[R-pkg-devel] questions about promptPackage(); -package.Rd; help(package="")

2018-05-18 Thread Ben Bolker

   I was advised to update the metadata in the "*-package.Rd" man page
of an old CRAN package (emdbook), which had gotten out of date.  As
suggested, I used promptPackage() to build a new emdbook-package.Rd file.
I was surprised/confused that it essentially dumped all
of the DESCRIPTION meta-data (using macros, not raw data) into the .Rd
file. The default information specified seems to replicate the results
of help(package="pkgname"). If I hadn't double-checked I might have
accidentally ended up pushing a pkgname-package.Rd file that had a lot of
what I'd consider junk in it ...  Except for providing twice as many
places to find information [help(package="pkgname") and
help("pkgname-package")], which might help users who only knew about one
of these routes, I don't understand the benefit of these defaults.

  (Excerpts of *Writing R Extensions* section 2.1.4 "Documenting
packages" appended below, for reference.)

- Is it just that it's hard to prescribe/automatically write a "short
overview" statement for the package author?
- Is all this information intended as a reminder to the package author
of what's in the package, and they're expected to throw most of it out
when editing the file?  (When the information appears in the .Rd file as
e.g. \packageIndices{emdbook}, it's not actually that helpful as a
reminder ...)
- Or is the expectation that much of the DESCRIPTION metadata will also
be presented in pkgname-package.Rd ?

When I look at mature packages that have "pkgname-package" pages (e.g.
mgcv, vegan, lattice) they seem to incorporate almost *none* of the
DESCRIPTION meta-information -- perhaps just \packageDescription{} and
\packageAuthor{}.

  Part of my confusion may stem from the package in question being very
small, so there's not very much one could say in pkgname-package.Rd that's
not already said by the Title: and Description: fields in DESCRIPTION ...

  I ended up making a minimal emdbook-package.Rd file, but I wonder if
that's what's intended/what other developers would suggest here.

  cheers
Ben Bolker

-
> Packages may have an overview help page with an \alias
pkgname-package, e.g. ‘utils-package’ for the utils package, when
package?pkgname will open that help page.  ...

> ... Otherwise [if final=FALSE] (the default) comments will be inserted
giving suggestions for content.

> Apart from the mandatory \name and \title and the pkgname-package
alias, the only requirement for the package overview page is that it
include a \docType{package} statement. All other content is optional. We
suggest that it should be a short overview, to give a reader unfamiliar
with the package enough information to get started.



Re: [R-pkg-devel] File link does not exist : how to get the correct one?

2018-05-18 Thread Martin Maechler
> Duncan Murdoch 
> on Fri, 18 May 2018 11:42:53 -0400 writes:

> On 18/05/2018 11:37 AM, Duncan Murdoch wrote:
>> On 18/05/2018 11:29 AM, Duncan Murdoch wrote:
>>> On 18/05/2018 11:06 AM, Joris Meys wrote:
>>>> Hi all,
>>>>
>>>> The latest changes in R cause a lot of Rd warnings about file links that
>>>> don't exist and are treated as a topic. One example is
>>>>
>>>> \code{\link[stats]{fitted}}
>>>>
>>>> Now if I look at ?fitted , the name of the page (top left corner) is given
>>>> as "fitted". So I would expect that the code above should just work fine,
>>>> but it generates the warning.
>>>>
>>>> How can one get these names without having to browse through the directory
>>>> with html files?
>>>
>>> You could ask for HTML help on fitted, but don't use the class print method:
>> 
>> Sorry, forgot to edit that out:  it doesn't need to be HTML help, any
>> format would do.

> And this should be my last message on the topic:  a nicer solution is 
> simply to use basename():

>> basename(?fitted)
> [1] "fitted.values"

> Duncan Murdoch

Wow ... Awesome!

I had no idea about such an elegant solution to this
problem...

Martin



Re: [R-pkg-devel] SIAM Wilkinson prize

2018-05-18 Thread Avraham Adler
Seems as if it is restricted to people with PhDs.

Avi

On Fri, May 18, 2018 at 1:13 PM J C Nash  wrote:

> It occurs to me that there could be packages developed by early career R
> developers that might fit
> this prize which is considered quite prestigious (not to mention the cash)
> in the numerical methods community.
> It is also likely that people may not be aware of the award in the R
> community.
>
> Cheers, JN

[R-pkg-devel] SIAM Wilkinson prize

2018-05-18 Thread J C Nash
It occurs to me that there could be packages developed by early career R 
developers that might fit
this prize which is considered quite prestigious (not to mention the cash) in 
the numerical methods community.
It is also likely that people may not be aware of the award in the R community.

Cheers, JN



 Forwarded Message 
Subject:[SIAM-OPT] June 1 Entry Deadline - James H. Wilkinson Prize for 
Numerical Software
Date:   Thu, 17 May 2018 14:22:41 +
From:   SIAM Prize Program 
CC: Optimization SIAG mailing list 



James H. Wilkinson Prize for Numerical Software

The deadline is June 1 for entries for the James H. Wilkinson Prize for
Numerical Software. We are looking for submissions of high-quality numerical
software from early career teams. If you or your team are developing numerical
software for scientific computing, act as a nominator and enter your software
for the prize. To submit an entry, you first need to create an account at the
SIAM Prize Portal. Click on the “Submit” button above to start the process.

The James H. Wilkinson Prize for Numerical Software is awarded every four years 
to the authors of an outstanding piece
of numerical software. The prize is awarded for an entry that best addresses 
all phases of the preparation of
high-quality numerical software. It is intended to recognize innovative 
software in scientific computing and to
encourage researchers in the earlier stages of their career.

SIAM will award the Wilkinson Prize for Numerical Software at the SIAM 
Conference on Computational Science and
Engineering (CSE19). The award will consist of $3,000 and a plaque. As part of 
the award, the recipient(s) will be
expected to present a lecture at the conference.





*Eligibility Criteria:*

Selection will be based on: clarity of the software implementation and 
documentation, importance of the application(s)
addressed by the software; portability, reliability, efficiency, and usability 
of the software implementation; clarity
and depth of analysis of the algorithms and the software in the accompanying 
paper; and quality of the test software.

Candidates must have worked in mathematics or science for at most 12 years 
(full time equivalent) after receiving their
PhD as of January 1 of the award year, allowing for breaks in continuity. The 
prize committee can make exceptions, if in
their opinion the candidate is at an equivalent stage in their career.

For the 2019 award, a candidate must have received their PhD no earlier than 
January 1, 2007.



*Entry Deadline:*

*June 1, 2018*



*Required Materials:*

· CVs of the authors of the software, at most two pages per author (PDF)

· A two-page summary of the main features of the algorithm and software 
implementation (PDF)

· A paper describing the algorithm and the software implementation (PDF)

· Open source software written in a widely available high-level 
programming language. The software should be
submitted in a gzipped .tar archive with a README file describing the contents 
of the archive. Each submission should
include documentation, examples of the use of the software, a test program, and 
scripts for executing the test programs.





*Previous recipients:*



Previous recipients of the James H. Wilkinson Prize for Numerical Software are:



*2015* Patrick Farrell, Simon Funke, David Ham, and Marie Rognes for dolfin-adjoint
*2011* Andreas Waechter and Carl Laird for IPOPT
*2007* Wolfgang Bangerth, Guido Kanschat, and Ralf Hartmann for deal.II
*2003* Jonathan Shewchuk for Triangle
*1999* Matteo Frigo and Steven Johnson for FFTW
*1995* Chris Bischof and Alan Carle for ADIFOR 2.0
*1991* Linda Petzold for DASSL





*Selection Committee:*

Jorge Moré (Chair), Argonne National Laboratory
Sven Hammarling, Numerical Algorithms Group Ltd and University of Manchester
Michael Heroux, Sandia National Laboratories
Randall J. LeVeque, University of Washington
Katherine Yelick, Lawrence Berkeley National Laboratory




Re: [R-pkg-devel] File link does not exist : how to get the correct one?

2018-05-18 Thread Duncan Murdoch

On 18/05/2018 11:37 AM, Duncan Murdoch wrote:
> On 18/05/2018 11:29 AM, Duncan Murdoch wrote:
>> On 18/05/2018 11:06 AM, Joris Meys wrote:
>>> Hi all,
>>>
>>> The latest changes in R cause a lot of Rd warnings about file links that
>>> don't exist and are treated as a topic. One example is
>>>
>>> \code{\link[stats]{fitted}}
>>>
>>> Now if I look at ?fitted , the name of the page (top left corner) is given
>>> as "fitted". So I would expect that the code above should just work fine,
>>> but it generates the warning.
>>>
>>> How can one get these names without having to browse through the directory
>>> with html files?
>>
>> You could ask for HTML help on fitted, but don't use the class print method:
>
> Sorry, forgot to edit that out:  it doesn't need to be HTML help, any
> format would do.


And this should be my last message on the topic:  a nicer solution is 
simply to use basename():


> basename(?fitted)
[1] "fitted.values"
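
The file name found this way can then be used in an explicit file link of the form \link[pkg:file]{topic}. A minimal sketch, assuming the stats package and the topic "fitted" from the example above; the variable names are only illustrative:

```r
# help() returns the path of the matching help file; its basename is the
# Rd file name, which can differ from the topic name.
rd_name <- basename(utils::help("fitted", package = "stats"))
print(rd_name)

# Build an explicit file link for use in an Rd file.
link <- sprintf("\\code{\\link[%s:%s]{%s}}", "stats", rd_name, "fitted")
cat(link, "\n")
```

Passing the package explicitly avoids picking up a same-named topic from another attached package.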

Duncan Murdoch



>>   > unclass(?fitted)
>> [1]
>> "/Library/Frameworks/R.framework/Versions/3.4/Resources/library/stats/help/fitted.values"
>> attr(,"call")
>> help(topic = "fitted", package = NULL)
>> attr(,"topic")
>> [1] "fitted"
>> attr(,"tried_all_packages")
>> [1] FALSE
>> attr(,"type")
>> [1] "html"
>>
>> The first line says that the name should be "fitted.values".
>>
>> Duncan Murdoch



Re: [R-pkg-devel] File link does not exist : how to get the correct one?

2018-05-18 Thread Duncan Murdoch

On 18/05/2018 11:29 AM, Duncan Murdoch wrote:
> On 18/05/2018 11:06 AM, Joris Meys wrote:
>> Hi all,
>>
>> The latest changes in R cause a lot of Rd warnings about file links that
>> don't exist and are treated as a topic. One example is
>>
>> \code{\link[stats]{fitted}}
>>
>> Now if I look at ?fitted , the name of the page (top left corner) is given
>> as "fitted". So I would expect that the code above should just work fine,
>> but it generates the warning.
>>
>> How can one get these names without having to browse through the directory
>> with html files?
>
> You could ask for HTML help on fitted, but don't use the class print method:


Sorry, forgot to edit that out:  it doesn't need to be HTML help, any 
format would do.


Duncan Murdoch



>   > unclass(?fitted)
> [1]
> "/Library/Frameworks/R.framework/Versions/3.4/Resources/library/stats/help/fitted.values"
> attr(,"call")
> help(topic = "fitted", package = NULL)
> attr(,"topic")
> [1] "fitted"
> attr(,"tried_all_packages")
> [1] FALSE
> attr(,"type")
> [1] "html"
>
> The first line says that the name should be "fitted.values".
>
> Duncan Murdoch



Re: [R-pkg-devel] File link does not exist : how to get the correct one?

2018-05-18 Thread Duncan Murdoch

On 18/05/2018 11:06 AM, Joris Meys wrote:
> Hi all,
>
> The latest changes in R cause a lot of Rd warnings about file links that
> don't exist and are treated as a topic. One example is
>
> \code{\link[stats]{fitted}}
>
> Now if I look at ?fitted , the name of the page (top left corner) is given
> as "fitted". So I would expect that the code above should just work fine,
> but it generates the warning.
>
> How can one get these names without having to browse through the directory
> with html files?


You could ask for HTML help on fitted, but don't use the class print method:

> unclass(?fitted)
[1] 
"/Library/Frameworks/R.framework/Versions/3.4/Resources/library/stats/help/fitted.values"

attr(,"call")
help(topic = "fitted", package = NULL)
attr(,"topic")
[1] "fitted"
attr(,"tried_all_packages")
[1] FALSE
attr(,"type")
[1] "html"

The first line says that the name should be "fitted.values".

Duncan Murdoch



[R-pkg-devel] File link does not exist : how to get the correct one?

2018-05-18 Thread Joris Meys
Hi all,

The latest changes in R cause a lot of Rd warnings about file links that
don't exist and are treated as a topic. One example is

\code{\link[stats]{fitted}}

Now if I look at ?fitted , the name of the page (top left corner) is given
as "fitted". So I would expect that the code above should just work fine,
but it generates the warning.

How can one get these names without having to browse through the directory
with html files?
Cheers
Joris

-- 
Joris Meys
Statistical consultant

Department of Data Analysis and Mathematical Modelling
Ghent University
Coupure Links 653, B-9000 Gent (Belgium)


tel: +32 (0)9 264 61 79
---
Biowiskundedagen 2017-2018
http://www.biowiskundedagen.ugent.be/

---
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php




Re: [R-pkg-devel] mvrnorm, eigen, tests, and R CMD check

2018-05-18 Thread Jari Oksanen
I am afraid that these suggestions may not work. There are more choices than 
Win32 and Win64, including several flavours of BLAS/Lapack which probably are 
involved if you evaluate eigenvalues, and also differences in hardware, 
compilers and phase of the moon.  If there are several equal eigenvalues, any 
solution of axes is arbitrary and it can be made stable for testing only by 
chance. If you have M equal eigenvalues, you should try to find a test that the 
M-dimensional (sub)space is approximately correct irrespective of random 
orientation of axes in this subspace.

Cheers, Jari Oksanen

On 18 May 2018, at 00:06 am, Kevin Coombes wrote:

Yes; but I have been running around all day without time to sit down and
try them. The suggestions make sense, and I'm looking forward to
implementing them.

On Thu, May 17, 2018, 3:55 PM Ben Bolker wrote:

There have been various comments in this thread (by me, and I think
Duncan Murdoch) about how you can identify the platform you're running
on (some combination of .Platform and/or R.Version()) and use it to
write conditional statements so that your tests will only be compared
with reference values that were generated on the same platform ... did
those get through?  Did they make sense?
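
A rough sketch of what such a platform-conditional comparison might look like. The key format, the `reference` list, and the helper `check_against_reference()` are all hypothetical, made up for illustration; only .Platform, R.version and .Machine come from base R.

```r
# Identify the current platform from values base R exposes.
platform_key <- paste(
  .Platform$OS.type,               # "unix" or "windows"
  R.version$arch,                  # e.g. "x86_64"
  8L * .Machine$sizeof.pointer,    # pointer width in bits: 32 or 64
  sep = "-"
)

# Hypothetical store of reference results, one entry per platform key.
reference <- list()
reference[[platform_key]] <- c(4, 1, 1, 0, 0)

# Compare only when a reference exists for the given platform key.
check_against_reference <- function(values, key = platform_key, tol = 1e-8) {
  ref <- reference[[key]]
  if (is.null(ref)) return(NA)     # no reference stored: skip the comparison
  isTRUE(all.equal(values, ref, tolerance = tol))
}

check_against_reference(c(4, 1, 1, 0, 0))
```

A key like this says nothing about which BLAS/LAPACK build is in use, so it cannot separate every source of variation raised elsewhere in this thread.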

On Thu, May 17, 2018 at 3:30 PM, Kevin Coombes wrote:
Yes; I'm pretty sure that it is exactly the repeated eigenvalues that are
the issue. The matrices I am using are all nonsingular, and the various
algorithms have no problem computing the eigenvalues correctly (up to
numerical errors that I can bound and thus account for on tests by rounding
appropriately). But an eigenvalue of multiplicity M has an M-dimensional
eigenspace with no preferred basis. So, any M-dimensional (unitary) change
of basis is permitted. That's what gives rise to the lack of reproducibility
across architectures. The choice of basis appears to use different
heuristics on 32-bit Windows than on 64-bit Windows or Linux machines. As a
result, I can't include the tests I'd like as part of a CRAN submission.

On Thu, May 17, 2018, 2:29 PM William Dunlap wrote:

Your explanation needs to be a bit more general in the case of identical
eigenvalues - each distinct eigenvalue has an associated subspace, whose
dimension is the number of repeats of that eigenvalue and the eigenvectors for
that eigenvalue are an orthonormal basis for that subspace.  (With no
repeated eigenvalues this gives your 'unique up to sign'.)

E.g., for the following 5x5 matrix with two eigenvalues of 1 and two of 0

x <- tcrossprod( cbind(c(1,0,0,0,1),c(0,1,0,0,1),c(0,0,1,0,1)) )
x
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    0    0    0    1
[2,]    0    1    0    0    1
[3,]    0    0    1    0    1
[4,]    0    0    0    0    0
[5,]    1    1    1    0    3
the following give valid but different (by more than sign) eigen vectors

e1 <- structure(list(values = c(4, 1, 0.999, 0,
-2.22044607159862e-16
), vectors = structure(c(-0.288675134594813, -0.288675134594813,
-0.288675134594813, 0, -0.866025403784439, 0, 0.707106781186547,
-0.707106781186547, 0, 0, 0.816496580927726, -0.408248290463863,
-0.408248290463863, 0, -6.10622663543836e-16, 0, 0, 0, -1, 0,
-0.5, -0.5, -0.5, 0, 0.5), .Dim = c(5L, 5L))), .Names = c("values",
"vectors"), class = "eigen")
e2 <- structure(list(values = c(4, 1, 1, 0, -2.29037708937563e-16),
   vectors = structure(c(0.288675134594813, 0.288675134594813,
   0.288675134594813, 0, 0.866025403784438, -0.784437556312061,
   0.588415847923579, 0.196021708388481, 0, 4.46410900710223e-17,
   0.22654886208902, 0.566068420404321, -0.79261728249334, 0,
   -1.11244069540181e-16, 0, 0, 0, -1, 0, -0.5, -0.5, -0.5,
   0, 0.5), .Dim = c(5L, 5L))), .Names = c("values", "vectors"
), class = "eigen")

I.e.,
all.equal(crossprod(e1$vectors), diag(5), tol=0)
[1] "Mean relative difference: 1.407255e-15"
all.equal(crossprod(e2$vectors), diag(5), tol=0)
[1] "Mean relative difference: 3.856478e-15"
all.equal(e1$vectors %*% diag(e1$values) %*% t(e1$vectors), x, tol=0)
[1] "Mean relative difference: 1.110223e-15"
all.equal(e2$vectors %*% diag(e2$values) %*% t(e2$vectors), x, tol=0)
[1] "Mean relative difference: 9.069735e-16"

e1$vectors
           [,1]       [,2]          [,3] [,4] [,5]
[1,] -0.2886751  0.0000000  8.164966e-01    0 -0.5
[2,] -0.2886751  0.7071068 -4.082483e-01    0 -0.5
[3,] -0.2886751 -0.7071068 -4.082483e-01    0 -0.5
[4,]  0.0000000  0.0000000  0.000000e+00   -1  0.0
[5,] -0.8660254  0.0000000 -6.106227e-16    0  0.5
e2$vectors
          [,1]          [,2]          [,3] [,4] [,5]
[1,] 0.2886751 -7.844376e-01  2.265489e-01    0 -0.5
[2,] 0.2886751  5.884158e-01  5.660684e-01    0 -0.5
[3,] 0.2886751  1.960217e-01 -7.926173e-01    0 -0.5
[4,] 0.0000000  0.000000e+00  0.000000e+00   -1  0.0
[5,] 0.8660254  4.464109e-17 -1.112441e-16    0  0.5





Bill Dunlap
TIBCO Software
wdunlap tibco.com

Re: [R-pkg-devel] mvrnorm, eigen, tests, and R CMD check

2018-05-18 Thread Facundo Muñoz
In my opinion, the underlying problem is that you are checking whether
the test reproduces exactly your pre-computed solution, while there
actually exist other valid answers.

I believe you want to check whether the sub-spaces are the same, not
whether the bases are identical (which can depend on platform, linear
algebra library, etc.)
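
One way to compare sub-spaces rather than bases is to compare the orthogonal projectors V V^T, which do not change when the basis of an eigenspace is rotated. A minimal sketch using the 5x5 matrix quoted below; the rotation angle and the helper name `projector()` are arbitrary choices for illustration:

```r
# Matrix from the thread: eigenvalues 4, 1, 1, 0, 0.
x <- tcrossprod(cbind(c(1, 0, 0, 0, 1), c(0, 1, 0, 0, 1), c(0, 0, 1, 0, 1)))
e <- eigen(x, symmetric = TRUE)

# Columns spanning the eigenspace of the repeated eigenvalue 1.
idx <- which(abs(e$values - 1) < 1e-8)
V1 <- e$vectors[, idx, drop = FALSE]

# A second, equally valid orthonormal basis of the same eigenspace,
# obtained by rotating the first one within the subspace.
theta <- 0.7
rot <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
V2 <- V1 %*% rot

# The orthogonal projector onto span(V) is invariant to the choice of basis.
projector <- function(V) tcrossprod(V)

same_basis    <- isTRUE(all.equal(V1, V2))
same_subspace <- isTRUE(all.equal(projector(V1), projector(V2)))
c(same_basis = same_basis, same_subspace = same_subspace)
```

A test written against the projector passes for either basis, while a test on the raw eigenvectors does not.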

ƒacu.-



On 05/17/2018 09:30 PM, Kevin Coombes wrote:
> Yes; I'm pretty sure that it is exactly the repeated eigenvalues that are
> the issue. The matrices I am using are all nonsingular, and the various
> algorithms have no problem computing the eigenvalues correctly (up to
> numerical errors that I can bound and thus account for on tests by rounding
> appropriately). But an eigenvalue of multiplicity M has an M-dimensional
> eigenspace with no preferred basis. So, any M-dimensional  (unitary) change
> of basis is permitted. That's what gives rise to the lack of reproducibility
> across architectures. The choice of basis appears to use different
> heuristics on 32-bit windows than on 64-bit Windows or Linux machines. As a
> result, I can't include the tests I'd like as part of a CRAN submission.
>
> On Thu, May 17, 2018, 2:29 PM William Dunlap  wrote:
>
>> Your explanation needs to be a bit more general in the case of identical
>> eigenvalues - each distinct eigenvalue has an associated subspace, whose
>> dimension is the number of repeats of that eigenvalue and the eigenvectors for
>> that eigenvalue are an orthonormal basis for that subspace.  (With no
>> repeated eigenvalues this gives your 'unique up to sign'.)
>>
>> E.g., for the following 5x5 matrix with two eigenvalues of 1 and two of 0
>>
>>   > x <- tcrossprod( cbind(c(1,0,0,0,1),c(0,1,0,0,1),c(0,0,1,0,1)) )
>>   > x
>>        [,1] [,2] [,3] [,4] [,5]
>>   [1,]    1    0    0    0    1
>>   [2,]    0    1    0    0    1
>>   [3,]    0    0    1    0    1
>>   [4,]    0    0    0    0    0
>>   [5,]    1    1    1    0    3
>> the following give valid but different (by more than sign) eigen vectors
>>
>> e1 <- structure(list(values = c(4, 1, 0.999, 0,
>> -2.22044607159862e-16
>> ), vectors = structure(c(-0.288675134594813, -0.288675134594813,
>> -0.288675134594813, 0, -0.866025403784439, 0, 0.707106781186547,
>> -0.707106781186547, 0, 0, 0.816496580927726, -0.408248290463863,
>> -0.408248290463863, 0, -6.10622663543836e-16, 0, 0, 0, -1, 0,
>> -0.5, -0.5, -0.5, 0, 0.5), .Dim = c(5L, 5L))), .Names = c("values",
>> "vectors"), class = "eigen")
>> e2 <- structure(list(values = c(4, 1, 1, 0, -2.29037708937563e-16),
>> vectors = structure(c(0.288675134594813, 0.288675134594813,
>> 0.288675134594813, 0, 0.866025403784438, -0.784437556312061,
>> 0.588415847923579, 0.196021708388481, 0, 4.46410900710223e-17,
>> 0.22654886208902, 0.566068420404321, -0.79261728249334, 0,
>> -1.11244069540181e-16, 0, 0, 0, -1, 0, -0.5, -0.5, -0.5,
>> 0, 0.5), .Dim = c(5L, 5L))), .Names = c("values", "vectors"
>> ), class = "eigen")
>>
>> I.e.,
>>> all.equal(crossprod(e1$vectors), diag(5), tol=0)
>> [1] "Mean relative difference: 1.407255e-15"
>>> all.equal(crossprod(e2$vectors), diag(5), tol=0)
>> [1] "Mean relative difference: 3.856478e-15"
>>> all.equal(e1$vectors %*% diag(e1$values) %*% t(e1$vectors), x, tol=0)
>> [1] "Mean relative difference: 1.110223e-15"
>>> all.equal(e2$vectors %*% diag(e2$values) %*% t(e2$vectors), x, tol=0)
>> [1] "Mean relative difference: 9.069735e-16"
>>
>>> e1$vectors
>>            [,1]       [,2]          [,3] [,4] [,5]
>> [1,] -0.2886751  0.0000000  8.164966e-01    0 -0.5
>> [2,] -0.2886751  0.7071068 -4.082483e-01    0 -0.5
>> [3,] -0.2886751 -0.7071068 -4.082483e-01    0 -0.5
>> [4,]  0.0000000  0.0000000  0.000000e+00   -1  0.0
>> [5,] -0.8660254  0.0000000 -6.106227e-16    0  0.5
>>> e2$vectors
>>           [,1]          [,2]          [,3] [,4] [,5]
>> [1,] 0.2886751 -7.844376e-01  2.265489e-01    0 -0.5
>> [2,] 0.2886751  5.884158e-01  5.660684e-01    0 -0.5
>> [3,] 0.2886751  1.960217e-01 -7.926173e-01    0 -0.5
>> [4,] 0.0000000  0.000000e+00  0.000000e+00   -1  0.0
>> [5,] 0.8660254  4.464109e-17 -1.112441e-16    0  0.5
>>
>>
>>
>>
>>
>> Bill Dunlap
>> TIBCO Software
>> wdunlap tibco.com
>>
>> On Thu, May 17, 2018 at 10:14 AM, Martin Maechler <
>> maech...@stat.math.ethz.ch> wrote:
>>
 Duncan Murdoch 
 on Thu, 17 May 2018 12:13:01 -0400 writes:
>>> > On 17/05/2018 11:53 AM, Martin Maechler wrote:
>>> >>> Kevin Coombes ... on Thu, 17
>>> >>> May 2018 11:21:23 -0400 writes:
>>>
>>> >>[..]
>>>
>>> >> > [3] Should the documentation (man page) for "eigen" or
>>> >> > "mvrnorm" include a warning that the results can change
>>> >> > from machine to machine (or between things like 32-bit and
>>> >> > 64-bit R on the same machine) because of difference in
>>> >> > linear algebra modules? (Possibly including the statement
>>> >> > that "set.seed" won't save you.)
>>>
>>> >> The problem 

Re: [R-pkg-devel] mvrnorm, eigen, tests, and R CMD check

2018-05-18 Thread Martin Maechler
> William Dunlap 
> on Thu, 17 May 2018 11:28:50 -0700 writes:

> Your explanation needs to be a bit more general in the
> case of identical eigenvalues - each distinct eigenvalue
> has an associated subspace, whose dimension is the number
> repeats of that eigenvalue and the eigenvectors for that
> eigenvalue are an orthonormal basis for that subspace.
> (With no repeated eigenvalues this gives your 'unique up
> to sign'.)

Thank you, Bill, notably for the concrete example of non-trivial
eigenspaces (per eigenvalue).
Note I did say

  "... such that at least for the good cases where all eigenspaces
   are 1-dimensional, ..."

knowing well that only in that case it "is easy".
I have a gut feeling but may be wrong that such simplistic post
processing may also help (to get cross-platform reproducibility)
in the case of MASS::mvrnorm() where repeated eigenvalues will
be common in practice.
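
For the easy case of 1-dimensional eigenspaces, one possible form of such post-processing is to fix the sign of each eigenvector by a convention. The sketch below is only an assumption about what that could look like: the rule "make the largest-magnitude entry positive" and the helper `normalize_signs()` are illustrative, not something proposed in this thread.

```r
# Flip each column so that its largest-magnitude entry is positive.
# This convention is one arbitrary choice among several.
normalize_signs <- function(V) {
  sgn <- apply(V, 2, function(v) sign(v[which.max(abs(v))]))
  sgn[sgn == 0] <- 1                # leave all-zero columns untouched
  sweep(V, 2, sgn, "*")
}

# Small symmetric matrix with two distinct eigenvalues.
m <- matrix(c(2, 1, 1, 3), 2, 2)
V <- eigen(m, symmetric = TRUE)$vectors

# Another valid answer: the first eigenvector with its sign flipped.
flipped <- V %*% diag(c(-1, 1))

same_after <- isTRUE(all.equal(normalize_signs(V), normalize_signs(flipped)))
same_after
```

This removes only the sign ambiguity; it does nothing for repeated eigenvalues, where the basis of the eigenspace can differ by more than sign.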

Martin
