Re: [Rd] Inconsistent rank in qr()

2018-01-22 Thread Martin Maechler
> Serguei Sokol 
> on Mon, 22 Jan 2018 17:57:47 +0100 writes:

> On 22/01/2018 at 17:40, Keith O'Hara wrote:
>> This behavior is noted in the qr documentation, no?
>> 
>> rank - the rank of x as computed by the decomposition(*): always full
>> rank in the LAPACK case.
> For me, a "full rank matrix" is a matrix whose rank is indeed
> min(nrow(A), ncol(A)),
> but here the meaning of "always full rank" is somewhat confusing. Does it
> mean that only full-rank matrices must be submitted to qr() when LAPACK=TRUE?
> Maybe there is a jargon in which "full rank" is a synonym of
> min(nrow(A), ncol(A)) for any matrix,
> but the fix to stick with the commonly accepted definition of rank (i.e. the
> number of linearly independent columns in A) is so easy. Why exclude the
> LAPACK case from it (even if properly documented)?

Because 99.5% of callers of qr() never look at '$rank',
so why should we compute it every time qr() is called?

==> Matrix::rankMatrix() does use "qr" as one of its several methods.

--

As wiser people than me have said (I'm paraphrasing, as I can't find a nice
citation):

  While the rank of a matrix is a very well defined concept in
  mathematics (theory), its practical computation on a finite
  precision computer is much more challenging.

The ?rankMatrix  help page (package Matrix, part of your R)
   https://stat.ethz.ch/R-manual/R-devel/library/Matrix/html/rankMatrix.html
starts with the following 'Description' 

  Compute ‘the’ matrix rank, a well-defined functional in theory(*), somewhat
  ambiguous in practice. We provide several methods, the default corresponding to
  Matlab's definition.

  (*) The rank of an n x m matrix A, rk(A), is the maximal number of linearly
  independent columns (or rows); hence rk(A) <= min(n,m).
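
To make the Description concrete, here is a small sketch applied to the 2 x 2 matrix from the original post. It assumes the recommended package Matrix is installed. The variable names are only illustrative, and method = "qr" is just one of the alternatives listed on the help page, not a recommendation.

```r
# Sketch: numerical rank via Matrix::rankMatrix().
# Assumption: package 'Matrix' is installed (it is a recommended package).
library(Matrix)

a <- diag(2)
a[2, 2] <- 0                                # rank-deficient: true rank is 1

r_default <- rankMatrix(a)                  # default method
r_qr      <- rankMatrix(a, method = "qr")   # QR-based method

# The results carry attributes (method, tolerance); strip them to compare.
as.integer(r_default)
as.integer(r_qr)
```

Both calls estimate the rank numerically with a tolerance, which is exactly the "somewhat ambiguous in practice" part.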


>>> On Jan 22, 2018, at 11:21 AM, Serguei Sokol wrote:
>>> 
>>> Hi,
>>> 
>>> I have noticed different rank values calculated by qr() depending on
>>> the LAPACK parameter. When it is FALSE (default), a true rank is estimated
>>> and returned.
>>> Unfortunately, when LAPACK is set to TRUE, min(nrow(A), ncol(A)) is
>>> returned, which is only occasionally the true rank.
>>> 
>>> Wouldn't it be more consistent to replace the rank in the latter case
>>> by something based on the following pseudocode?
>>> 
>>> d=abs(diag(qr))
>>> rank=sum(d >= d[1]*tol)
>>> 
>>> Here, we rely on the fact that column pivoting is activated in the called
>>> LAPACK routine (dgeqp3)
>>> and that the diagonal terms of the qr matrix are put in decreasing order
>>> (according to their absolute values).
>>> 
>>> Serguei.
>>> 
>>> How to reproduce:
>>> 
>>> a=diag(2)
>>> a[2,2]=0
>>> qaf=qr(a, LAPACK=FALSE)
>>> qaf$rank # shows 1. OK it's the true rank value
>>> qat=qr(a, LAPACK=TRUE)
>>> qat$rank #shows 2. Bad, it's not the expected value.
>>> 

> -- 
> Serguei Sokol
> Ingenieur de recherche INRA

> Cellule mathématique
> LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
> 135 Avenue de Rangueil
> 31077 Toulouse Cedex 04

> tel: +33 5 6155 9849
> email: so...@insa-toulouse.fr
> http://www.lisbp.fr

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] Inconsistent rank in qr()

2018-01-22 Thread Keith O'Hara
I agree the result is a little confusing, but the behavior is in line with the 
documentation and so not ‘unexpected’ as such...

I don’t think this is a matter of semantics, more of a ‘return the rank when we 
have it for free’ situation: when A is real-valued, qr(A, LAPACK=FALSE) calls a 
modified version of LINPACK’s dqrdc which computes the rank internally; see 
dqrdc2.f, output ‘k’. LAPACK’s dgeqp3 doesn’t return the rank as is.

Just my 2 cents on a potential fix: Matrix::rankMatrix follows the logic of 
your code, and I think this would be simpler to implement than modifying 
Lapack.c in two places (around lines 657 and 1175).

Keith


> On Jan 22, 2018, at 11:57 AM, Serguei Sokol  wrote:
> 
> On 22/01/2018 at 17:40, Keith O'Hara wrote:
>> This behavior is noted in the qr documentation, no?
>> 
>> rank - the rank of x as computed by the decomposition(*): always full rank 
>> in the LAPACK case.
> For me, a "full rank matrix" is a matrix whose rank is indeed
> min(nrow(A), ncol(A)),
> but here the meaning of "always full rank" is somewhat confusing. Does it
> mean that only full-rank matrices must be submitted to qr() when LAPACK=TRUE?
> Maybe there is a jargon in which "full rank" is a synonym of
> min(nrow(A), ncol(A)) for any matrix,
> but the fix to stick with the commonly accepted definition of rank (i.e. the
> number of linearly independent columns in A) is so easy. Why exclude the
> LAPACK case from it (even if properly documented)?
> 
>> 
>> 
>> 
>>> On Jan 22, 2018, at 11:21 AM, Serguei Sokol  wrote:
>>> 
>>> Hi,
>>> 
>>> I have noticed different rank values calculated by qr() depending on
>>> LAPACK parameter. When it is FALSE (default) a true rank is estimated and 
>>> returned.
>>> Unfortunately, when LAPACK is set to TRUE, the min(nrow(A), ncol(A)) is 
>>> returned
>>> which is only occasionally a true rank.
>>> 
>>> Would not it be more consistent to replace the rank in the latter case by 
>>> something
>>> based on the following pseudo code ?
>>> 
>>> d=abs(diag(qr))
>>> rank=sum(d >= d[1]*tol)
>>> 
>>> Here, we rely on the fact column pivoting is activated in the called lapack 
>>> routine (dgeqp3)
>>> and diagonal term in qr matrix are put in decreasing order (according to 
>>> their absolute values).
>>> 
>>> Serguei.
>>> 
>>> How to reproduce:
>>> 
>>> a=diag(2)
>>> a[2,2]=0
>>> qaf=qr(a, LAPACK=FALSE)
>>> qaf$rank # shows 1. OK it's the true rank value
>>> qat=qr(a, LAPACK=TRUE)
>>> qat$rank #shows 2. Bad, it's not the expected value.
>>> 
>> 
> 
> -- 
> Serguei Sokol
> Ingenieur de recherche INRA
> 
> Cellule mathématique
> LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
> 135 Avenue de Rangueil
> 31077 Toulouse Cedex 04
> 
> tel: +33 5 6155 9849
> email: so...@insa-toulouse.fr 
> http://www.lisbp.fr 


Re: [Rd] as.character(list(NA))

2018-01-22 Thread Hervé Pagès

On 01/22/2018 01:02 PM, William Dunlap wrote:
I tend to avoid using as. functions on lists, since they act oddly 
in several ways.
E.g., if the list "L" consists entirely of scalar elements then 
as.numeric(L) acts like
as.numeric(unlist(L)) but if any element is not a scalar there is an 
error.


FWIW, personally I see this as a nice feature and use as.numeric(L)
instead of as.numeric(unlist(L)) in places where I'd rather fail than
get something that is not parallel to the input.

H.


  as.character()
does not seem to make a distinction between the all-scalar and 
not-all-scalar cases

but does various things with NA's of various types.

Bill Dunlap
TIBCO Software
wdunlap tibco.com 



On Mon, Jan 22, 2018 at 11:14 AM, Robert McGehee <rmcge...@walleyetrading.net> wrote:


Also perhaps a surprise that the behavior depends on the mode of the NA.

 > is.na(as.character(list(NA_real_)))
[1] FALSE
 > is.na(as.character(list(NA_character_)))
[1] TRUE

Does this mean deparse() preserves NA-ness for NA_character_ but not
NA_real_?


-----Original Message-----
From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Hervé Pagès
Sent: Monday, January 22, 2018 2:01 PM
To: William Dunlap <wdun...@tibco.com>; Patrick Perry <ppe...@stern.nyu.edu>
Cc: r-devel@r-project.org
Subject: Re: [Rd] as.character(list(NA))

On 01/20/2018 08:24 AM, William Dunlap via R-devel wrote:
 > I believe that for a list as.character() applies deparse() to each element
 > of the list.  deparse() does not preserve NA-ness, as it is intended to
 > make text that the parser can read.
 >
 >> str(as.character(list(Na=NA, LglVec=c(TRUE,NA),
 > Function=function(x){x+1})))
 >   chr [1:3] "NA" "c(TRUE, NA)" "function (x) \n{\n    x + 1\n}"
 >

This really comes as a surprise though since coercion to all the
other atomic types (except raw) preserves the NAs.

And also as.character(unlist(list(NA))) preserves them.

H.

 >
 > Bill Dunlap
 > TIBCO Software
 > wdunlap tibco.com


 >
 > On Sat, Jan 20, 2018 at 7:43 AM, Patrick Perry <ppe...@stern.nyu.edu> wrote:
 >
 >> As of R Under development (unstable) (2018-01-19 r74138):
 >>
 >>> as.character(list(NA))
 >> [1] "NA"
 >>
 >>> is.na(as.character(list(NA)))
 >> [1] FALSE
 >>


 >>
 >



Re: [Rd] as.character(list(NA))

2018-01-22 Thread William Dunlap via R-devel
I tend to avoid using as. functions on lists, since they act oddly in
several ways.
E.g., if the list "L" consists entirely of scalar elements, then
as.numeric(L) acts like as.numeric(unlist(L)), but if any element is not a
scalar there is an error.  as.character() does not seem to make a distinction
between the all-scalar and not-all-scalar cases, but does various things with
NAs of various types.
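
A small sketch of the as.numeric() behaviour described above; the list names L_ok and L_bad are made up for illustration.

```r
# All elements are length-1 ("scalar"): as.numeric() works element-wise.
L_ok <- list(1, 2L, TRUE)
as.numeric(L_ok)

# One element has length 2: as.numeric() signals an error,
# while going through unlist() silently flattens the list.
L_bad <- list(1, 2:3)
res <- tryCatch(as.numeric(L_bad), error = function(e) "error")
res
as.numeric(unlist(L_bad))
```

The error in the second case is what makes as.numeric(L) useful when the result must stay parallel to the input.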

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Mon, Jan 22, 2018 at 11:14 AM, Robert McGehee <rmcge...@walleyetrading.net> wrote:

> Also perhaps a surprise that the behavior depends on the mode of the NA.
>
> > is.na(as.character(list(NA_real_)))
> [1] FALSE
> > is.na(as.character(list(NA_character_)))
> [1] TRUE
>
> Does this mean deparse() preserves NA-ness for NA_character_ but not
> NA_real_?
>
>
> -Original Message-
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Hervé
> Pagès
> Sent: Monday, January 22, 2018 2:01 PM
> To: William Dunlap ; Patrick Perry <
> ppe...@stern.nyu.edu>
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] as.character(list(NA))
>
> On 01/20/2018 08:24 AM, William Dunlap via R-devel wrote:
> > I believe that for a list as.character() applies deparse()  to each
> element
> > of the list.  deparse() does not preserve NA-ness, as it is intended to
> > make text that the parser can read.
> >
> >> str(as.character(list(Na=NA, LglVec=c(TRUE,NA),
> > Function=function(x){x+1})))
> >   chr [1:3] "NA" "c(TRUE, NA)" "function (x) \n{\nx + 1\n}"
> >
>
> This really comes as a surprise though since coercion to all the
> other atomic types (except raw) preserve the NAs.
>
> And also as.character(unlist(list(NA))) preserves them.
>
> H.
>
> >
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com
> >
> > On Sat, Jan 20, 2018 at 7:43 AM, Patrick Perry 
> wrote:
> >
> >> As of R Under development (unstable) (2018-01-19 r74138):
> >>
> >>> as.character(list(NA))
> >> [1] "NA"
> >>
> >>> is.na(as.character(list(NA)))
> >> [1] FALSE
> >>
> >>
> >
> >
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fredhutch.org
> Phone:  (206) 667-5791
> Fax:(206) 667-1319
>

Re: [Rd] as.character(list(NA))

2018-01-22 Thread Robert McGehee
Also perhaps a surprise that the behavior depends on the mode of the NA. 

> is.na(as.character(list(NA_real_)))
[1] FALSE
> is.na(as.character(list(NA_character_)))
[1] TRUE

Does this mean deparse() preserves NA-ness for NA_character_ but not NA_real_?


-----Original Message-----
From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Hervé Pagès
Sent: Monday, January 22, 2018 2:01 PM
To: William Dunlap; Patrick Perry
Cc: r-devel@r-project.org
Subject: Re: [Rd] as.character(list(NA))

On 01/20/2018 08:24 AM, William Dunlap via R-devel wrote:
> I believe that for a list as.character() applies deparse()  to each element
> of the list.  deparse() does not preserve NA-ness, as it is intended to
> make text that the parser can read.
> 
>> str(as.character(list(Na=NA, LglVec=c(TRUE,NA),
> Function=function(x){x+1})))
>   chr [1:3] "NA" "c(TRUE, NA)" "function (x) \n{\nx + 1\n}"
> 

This really comes as a surprise though since coercion to all the
other atomic types (except raw) preserves the NAs.

And also as.character(unlist(list(NA))) preserves them.

H.

> 
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
> 
> On Sat, Jan 20, 2018 at 7:43 AM, Patrick Perry  wrote:
> 
>> As of R Under development (unstable) (2018-01-19 r74138):
>>
>>> as.character(list(NA))
>> [1] "NA"
>>
>>> is.na(as.character(list(NA)))
>> [1] FALSE
>>
>>
> 
> 

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


Re: [Rd] as.character(list(NA))

2018-01-22 Thread Hervé Pagès

On 01/20/2018 08:24 AM, William Dunlap via R-devel wrote:

I believe that for a list as.character() applies deparse()  to each element
of the list.  deparse() does not preserve NA-ness, as it is intended to
make text that the parser can read.


> str(as.character(list(Na=NA, LglVec=c(TRUE,NA),
+                       Function=function(x){x+1})))
  chr [1:3] "NA" "c(TRUE, NA)" "function (x) \n{\n    x + 1\n}"



This really comes as a surprise though since coercion to all the
other atomic types (except raw) preserves the NAs.

And also as.character(unlist(list(NA))) preserves them.
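
The contrast can be reproduced with a few lines. This is only a sketch: the FALSE/TRUE values noted for the list cases are the ones reported in this thread for R-devel r74138 and may differ in other versions, and the variable names are illustrative.

```r
x <- list(NA)

via_list   <- as.character(x)           # element-wise conversion of the list
via_unlist <- as.character(unlist(x))   # flatten to a logical vector first

is.na(via_list)     # reported as FALSE in this thread
is.na(via_unlist)   # TRUE: NA survives coercion of an atomic vector

# The type of the NA matters, as reported elsewhere in this thread:
is.na(as.character(list(NA_real_)))        # reported as FALSE
is.na(as.character(list(NA_character_)))   # reported as TRUE
```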

H.



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sat, Jan 20, 2018 at 7:43 AM, Patrick Perry  wrote:


As of R Under development (unstable) (2018-01-19 r74138):


> as.character(list(NA))
[1] "NA"
> is.na(as.character(list(NA)))
[1] FALSE




--
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319


Re: [Rd] Inconsistent rank in qr()

2018-01-22 Thread Serguei Sokol

On 22/01/2018 at 17:40, Keith O'Hara wrote:

This behavior is noted in the qr documentation, no?

rank - the rank of x as computed by the decomposition(*): always full rank in 
the LAPACK case.

For me, a "full rank matrix" is a matrix whose rank is indeed
min(nrow(A), ncol(A)),
but here the meaning of "always full rank" is somewhat confusing. Does it
mean that only full-rank matrices must be submitted to qr() when LAPACK=TRUE?
Maybe there is a jargon in which "full rank" is a synonym of
min(nrow(A), ncol(A)) for any matrix,
but the fix to stick with the commonly accepted definition of rank (i.e. the
number of linearly independent columns in A) is so easy. Why exclude the
LAPACK case from it (even if properly documented)?






On Jan 22, 2018, at 11:21 AM, Serguei Sokol  wrote:

Hi,

I have noticed different rank values calculated by qr() depending on
LAPACK parameter. When it is FALSE (default) a true rank is estimated and 
returned.
Unfortunately, when LAPACK is set to TRUE, the min(nrow(A), ncol(A)) is returned
which is only occasionally a true rank.

Would not it be more consistent to replace the rank in the latter case by 
something
based on the following pseudo code ?

d=abs(diag(qr))
rank=sum(d >= d[1]*tol)

Here, we rely on the fact column pivoting is activated in the called lapack 
routine (dgeqp3)
and diagonal term in qr matrix are put in decreasing order (according to their 
absolute values).

Serguei.

How to reproduce:

a=diag(2)
a[2,2]=0
qaf=qr(a, LAPACK=FALSE)
qaf$rank # shows 1. OK it's the true rank value
qat=qr(a, LAPACK=TRUE)
qat$rank #shows 2. Bad, it's not the expected value.





--
Serguei Sokol
Ingenieur de recherche INRA

Cellule mathématique
LISBP, INSA/INRA UMR 792, INSA/CNRS UMR 5504
135 Avenue de Rangueil
31077 Toulouse Cedex 04

tel: +33 5 6155 9849
email: so...@insa-toulouse.fr
http://www.lisbp.fr


Re: [Rd] Inconsistent rank in qr()

2018-01-22 Thread Keith O'Hara
This behavior is noted in the qr documentation, no?

rank - the rank of x as computed by the decomposition(*): always full rank in 
the LAPACK case.



> On Jan 22, 2018, at 11:21 AM, Serguei Sokol  wrote:
> 
> Hi,
> 
> I have noticed different rank values calculated by qr() depending on
> LAPACK parameter. When it is FALSE (default) a true rank is estimated and 
> returned.
> Unfortunately, when LAPACK is set to TRUE, the min(nrow(A), ncol(A)) is 
> returned
> which is only occasionally a true rank.
> 
> Would not it be more consistent to replace the rank in the latter case by 
> something
> based on the following pseudo code ?
> 
> d=abs(diag(qr))
> rank=sum(d >= d[1]*tol)
> 
> Here, we rely on the fact column pivoting is activated in the called lapack 
> routine (dgeqp3)
> and diagonal term in qr matrix are put in decreasing order (according to 
> their absolute values).
> 
> Serguei.
> 
> How to reproduce:
> 
> a=diag(2)
> a[2,2]=0
> qaf=qr(a, LAPACK=FALSE)
> qaf$rank # shows 1. OK it's the true rank value
> qat=qr(a, LAPACK=TRUE)
> qat$rank #shows 2. Bad, it's not the expected value.
> 


[Rd] Inconsistent rank in qr()

2018-01-22 Thread Serguei Sokol

Hi,

I have noticed different rank values calculated by qr() depending on
the LAPACK parameter. When it is FALSE (default), a true rank is estimated and
returned.
Unfortunately, when LAPACK is set to TRUE, min(nrow(A), ncol(A)) is returned,
which is only occasionally the true rank.

Wouldn't it be more consistent to replace the rank in the latter case by
something
based on the following pseudocode?

d=abs(diag(qr))
rank=sum(d >= d[1]*tol)

Here, we rely on the fact that column pivoting is activated in the called LAPACK
routine (dgeqp3)
and that the diagonal terms of the qr matrix are put in decreasing order
(according to their absolute values).

Serguei.

How to reproduce:

a=diag(2)
a[2,2]=0
qaf=qr(a, LAPACK=FALSE)
qaf$rank # shows 1. OK it's the true rank value
qat=qr(a, LAPACK=TRUE)
qat$rank #shows 2. Bad, it's not the expected value.
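
The pseudocode and the reproduction above can be combined into a runnable sketch. The helper name qr_rank_lapack and the default tol = 1e-7 (chosen to mirror the default tolerance of qr()) are assumptions made for illustration; this is not what qr() itself does.

```r
# Hypothetical helper (not part of base R): estimate the rank from the
# diagonal of R in a column-pivoted QR, as proposed in the pseudocode.
qr_rank_lapack <- function(a, tol = 1e-7) {
  q <- qr(a, LAPACK = TRUE)
  # With column pivoting, |R[i, i]| is non-increasing in i.
  d <- abs(diag(q$qr))
  if (length(d) == 0L || d[1L] == 0) return(0L)
  sum(d >= d[1L] * tol)
}

a <- diag(2)
a[2, 2] <- 0
qr_rank_lapack(a)            # rank estimate from the LAPACK decomposition
qr(a, LAPACK = FALSE)$rank   # rank reported by the LINPACK-based default
```

The early return handles the all-zero matrix, where d[1] * tol would otherwise make every diagonal entry pass the test.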



Re: [Rd] Better error message in loadNamespace

2018-01-22 Thread Thomas Lin Pedersen


> On 22 Jan 2018, at 16.21, Martin Maechler  wrote:
> 
>> Thomas Lin Pedersen <thomas...@gmail.com>
>>on Mon, 22 Jan 2018 14:32:27 +0100 writes:
> 
>> Hi I’ve just spent a bit of time debugging an error
>> arising in `loadNamespace`. The bottom line is that the
>> `vI` object is assigned within an `if` block but expected
>> to exist for all of the remaining code. In some cases
>> where the package library has been corrupted or when it
>> resides on a network drive with bad connection this can
>> lead to error messages complaining about `vI` object not
>> existing. Debugging through the error is difficult, both
>> because `loadNamespace` is called recursively through the
>> dependency graph and the error can arise at any depth. And
>> because the recursive calls are wrapped in `try` so the
>> code breaks some distance from the point where the error
>> occurred.
> 
>> I will suggest mitigating this by adding an `else` clause
>> to the `if` block where `vI` gets assigned that warns
>> about potential corruption of the library and names the
>> package that caused the error.
> 
> Not sure this is desirable... in general even though it may well
> be desirable in your use case...
> 
> You will be aware that this is an important function that may be
> called many times, e.g., notably even at R startup time and so
> must be very robust [hence the many try* settings] and must use
> messages/warnings that are suppressible etc etc.

I absolutely agree that even small changes in execution speed 
would be bad in such a low-level function. Still, the else clause 
would only get called in the event an error should be thrown so I
can’t envision any performance regression.

> 
> On reading the source, I tend to agree with you that it looks
> odd there is no  else  clause to that if(), but then there may
> be subtle good reasons for that we don't see now.

If there are reasons for the current construct, then they should of
course be taken into account. As far as I can tell, every code
branch that follows the if statement ends up referencing vI, but
lazy evaluation might result in those expressions never being
evaluated, so they might be valid calls in some circumstances..?
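
As a toy illustration of that lazy-evaluation point (the function and object names are made up, and this is unrelated to the actual loadNamespace code): an argument that is never used is never evaluated, so a reference to a missing object can go unnoticed.

```r
# Function arguments are promises: they are evaluated only when used.
ignores_arg <- function(x) "x was never evaluated"
uses_arg    <- function(x) x

ignores_arg(no_such_object)   # no error: the promise is never forced

res <- tryCatch(uses_arg(no_such_object),
                error = function(e) conditionMessage(e))
res                           # the "object not found" message
```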

Anyway, I can accept the argument that changing it might break
things in unexpected ways :-)

> 
>> I can open a bug report if you wish, but I would require a
>> bugzilla account for that. Otherwise you’re also welcome
>> to take it from here.
> 
> I'll do that for you in any case.

thanks

best
Thomas

> 
> Martin Maechler
> ETH Zurich
> 
> 
>> With best wishes Thomas Lin Pedersen



Re: [Rd] Better error message in loadNamespace

2018-01-22 Thread Martin Maechler
> Thomas Lin Pedersen 
> on Mon, 22 Jan 2018 14:32:27 +0100 writes:

> Hi I’ve just spent a bit of time debugging an error
> arising in `loadNamespace`. The bottom line is that the
> `vI` object is assigned within an `if` block but expected
> to exist for all of the remaining code. In some cases
> where the package library has been corrupted or when it
> resides on a network drive with bad connection this can
> lead to error messages complaining about `vI` object not
> existing. Debugging through the error is difficult, both
> because `loadNamespace` is called recursively through the
> dependency graph and the error can arise at any depth. And
> because the recursive calls are wrapped in `try` so the
> code breaks some distance from the point where the error
> occurred.

> I will suggest mitigating this by adding an `else` clause
> to the `if` block where `vI` gets assigned that warns
> about potential corruption of the library and names the
> package that caused the error.

Not sure this is desirable... in general even though it may well
be desirable in your use case...

You will be aware that this is an important function that may be
called many times, e.g., notably even at R startup time and so
must be very robust [hence the many try* settings] and must use
messages/warnings that are suppressible etc etc.

On reading the source, I tend to agree with you that it looks
odd there is no  else  clause to that if(), but then there may
be subtle good reasons for that we don't see now.

> I can open a bug report if you wish, but I would require a
> bugzilla account for that. Otherwise you’re also welcome
> to take it from here.

I'll do that for you in any case.

Martin Maechler
ETH Zurich


> With best wishes Thomas Lin Pedersen


Re: [Rd] How to address the following: CRAN packages not using Suggests conditionally

2018-01-22 Thread Martin Morgan

On 01/22/2018 08:40 AM, Ulrich Bodenhofer wrote:
Thanks a lot, Iñaki, this is a perfect solution! I already implemented 
it and it works great. I'll wait for 2 more days before I submit the 
revised package to CRAN, in order to give others a chance to comment on it.


It's very easy for 'pictures of code' (unevaluated code chunks in 
vignettes) to drift from the actual implementation. So I'd really 
encourage your conditional evaluation to be as narrow as possible -- 
during CRAN or even CRAN fedora checks. Certainly trying to use 
uninstalled Suggest'ed packages in vignettes should provide an error 
message that is informative to users. Presumably the developer or user 
intends actually to execute the code, and needs to struggle through 
whatever issues come up. I'm not sure whether my comments are consistent 
with Writing R Extensions or not.


There is a fundamental tension between the CRAN and Bioconductor release 
models. The Bioconductor 'devel' package repositories and nightly builds 
are meant to be a place where new features and breaking changes can be 
introduced and problems resolved before being exposed to general users 
as a stable 'release' branch, once every six months. This means that the 
Bioconductor devel branch periodically (as recently and I suspect over 
the next several days) contains considerable carnage that propagates to 
CRAN devel builds, creating additional work for CRAN maintainers.


Martin Morgan
Bioconductor



Best regards,
Ulrich


On 01/22/2018 10:16 AM, Iñaki Úcar wrote:
Re-sending, since I forgot to include the list, sorry. I'm including 
r-package-devel too this time, as it seems more appropriate for this 
list.



On 22 Jan 2018 at 10:11, "Iñaki Úcar" wrote:




    On 22 Jan 2018 at 8:12, "Ulrich Bodenhofer" <bodenho...@bioinf.jku.at> wrote:


    Dear colleagues, dear members of the R Core Team,

    This was an issue raised by Prof. Brian Ripley and sent
    privately to all developers of CRAN packages that suggest
    Bioconductor packages (see original message below). As
    mentioned in my message enclosed below, it was easy for me to
    fix the error in examples (new version not submitted to CRAN
    yet), but it might turn into a major effort for the warnings
    raised by the package vignette. Since I have not gotten any
    advice yet, I take the liberty to post it here on this list -
    hoping that we reach a conclusion here how to deal with this
    matter.


    Just disable code chunk evaluation if suggested packages are
    missing (see [1]). As explained by Prof. Ripley, it will only
    affect Fedora checks on r-devel, i.e., your users will still see
    fully evaluated vignettes on CRAN.

    [1] https://www.enchufa2.es/archives/suggests-and-vignettes.html
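
A minimal sketch of what such a setup could look like, assuming a knitr-based vignette (for other vignette engines the mechanism differs). The variable name has_kebabs is illustrative, and 'kebabs' is simply the suggested package named in this thread.

```r
# Sketch for a vignette setup chunk (assumption: the vignette uses knitr).
has_kebabs <- requireNamespace("kebabs", quietly = TRUE)

# Guarded so that the snippet also runs outside a vignette build.
if (requireNamespace("knitr", quietly = TRUE)) {
  # Evaluate chunks only when the suggested package is available.
  knitr::opts_chunk$set(eval = has_kebabs)
}
```

Setting the option globally switches off every chunk when the package is missing; using eval = has_kebabs only in the headers of the chunks that need the package keeps the conditional narrower.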
    

    Iñaki


    Thanks in advance for your kind assistance,
    Ulrich Bodenhofer



    -------- Forwarded Message --------
    Subject:        Re: CRAN packages not using Suggests conditionally
    Date:   Mon, 15 Jan 2018 08:44:40 +0100
    From:   Ulrich Bodenhofer <bodenho...@bioinf.jku.at>
    To:     Prof Brian Ripley <rip...@stats.ox.ac.uk>
    CC:     [...stripped for the sake of privacy ...]



    Dear Prof. Ripley,

    Thank you very much for bringing this important issue to my attention. I
    am the maintainer of the 'apcluster' package. My package refers to
    'Biostrings' in an example section of a help page (a quite insignificant
    one, by the way), which creates errors on some platforms. It also refers
    to 'kebabs' in the package vignette, which leads to warnings.

    I could fix the first, more severe, problem quite easily, (1) since it
    is relatively easy to wrap an entire examples section in a conditional,
    and (2), as I have mentioned, it is not a particularly important help page.
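
Wrapping an examples section in a conditional could look roughly like the following. The helper with_suggested is hypothetical (it is not part of any package); it relies on lazy evaluation so that the example code is only run when the suggested package can be loaded.

```r
# Hypothetical helper: run 'code' only if the suggested package is available.
with_suggested <- function(pkg, code) {
  if (requireNamespace(pkg, quietly = TRUE)) {
    force(code)          # evaluate the example code now
    invisible(TRUE)
  } else {
    message("Package '", pkg, "' is not installed; skipping this example.")
    invisible(FALSE)     # 'code' is never evaluated in this branch
  }
}

# Usage in an examples section; the body runs only if Biostrings is present.
with_suggested("Biostrings", {
  s <- Biostrings::DNAString("ACGT")
  print(s)
})
```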

    Regarding the vignette, I want to ask for your advice now, since the
    situation appears more complicated to me. While it is, of course, only
    one code chunk that loads the 'kebabs' package, five more code chunks
    depend on the package (more specifically, the data objects created by a
    method implemented in the package), with quite some text in between. So
    the handling of the conditional loading of the package would propagate
    to multiple code chunks and also affect the validity of the explanations
    in between. I would see the following options:

    1. Remove the entire section of the vignette. That would be a pity,
    since I could no longer point the users to an otherwise interesting
    application of my package.
    2. Replace the code chunks by static LaTeX code suc

[Rd] Better error message in loadNamespace

2018-01-22 Thread Thomas Lin Pedersen
Hi

I’ve just spent a bit of time debugging an error arising in `loadNamespace`. 
The bottom line is that the `vI` object is assigned within an `if` block but 
expected to exist for all of the remaining code. In some cases where the 
package library has been corrupted or when it resides on a network drive with 
a bad connection, this can lead to error messages complaining about the `vI` 
object not existing. Debugging through the error is difficult, both because 
`loadNamespace` is called recursively through the dependency graph, so the 
error can arise at any depth, and because the recursive calls are wrapped in 
`try`, so the code breaks some distance from the point where the error occurred.

I would suggest mitigating this by adding an `else` clause to the `if` block 
where `vI` gets assigned that warns about potential corruption of the library 
and names the package that caused the error.
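
To illustrate the pattern I mean, here is a deliberately simplified sketch. It is NOT the actual `loadNamespace` source: the function `read_version_info`, the argument `pkg_info` and the condition in the `if` are all made up for illustration.

```r
# Hypothetical illustration of the pattern; NOT the loadNamespace() source.
# 'read_version_info' and 'pkg_info' are made-up names.
read_version_info <- function(package, pkg_info) {
  if (!is.null(pkg_info)) {
    vI <- pkg_info$imports
  } else {
    # Without this branch, the first later use of 'vI' would fail with
    # "object 'vI' not found", far away from the real cause.
    stop(sprintf(
      "cannot read metadata for package '%s'; the library may be corrupted",
      package), call. = FALSE)
  }
  vI
}

read_version_info("somePkg", list(imports = list()))
msg <- tryCatch(read_version_info("somePkg", NULL),
                error = function(e) conditionMessage(e))
msg   # names the offending package
```

The `else` branch only runs in the failure case, so the normal path is unchanged.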

I can open a bug report if you wish, but I would require a bugzilla account for 
that. Otherwise you’re also welcome to take it from here.

With best wishes
Thomas Lin Pedersen
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

Re: [Rd] How to address the following: CRAN packages not using Suggests conditionally

2018-01-22 Thread Ulrich Bodenhofer
Thanks a lot, Iñaki, this is a perfect solution! I already implemented 
it and it works great. I'll wait for 2 more days before I submit the 
revised package to CRAN - in order to give others a chance to comment on it.


Best regards,
Ulrich


On 01/22/2018 10:16 AM, Iñaki Úcar wrote:
Re-sending, since I forgot to include the list, sorry. I'm including 
r-package-devel too this time, as it seems more appropriate for this 
list.



On 22 Jan 2018 10:11, "Iñaki Úcar" wrote:




On 22 Jan 2018 8:12, "Ulrich Bodenhofer" <bodenho...@bioinf.jku.at> wrote:

Dear colleagues, dear members of the R Core Team,

This was an issue raised by Prof. Brian Ripley and sent
privately to all developers of CRAN packages that suggest
Bioconductor packages (see original message below). As
mentioned in my message enclosed below, it was easy for me to
fix the error in examples (new version not submitted to CRAN
yet), but it might turn into a major effort for the warnings
raised by the package vignette. Since I have not gotten any
advice yet, I take the liberty to post it here on this list -
hoping that we reach a conclusion here how to deal with this
matter.


Just disable code chunk evaluation if suggested packages are
missing (see [1]). As explained by Prof. Ripley, it will only
affect Fedora checks on r-devel, i.e., your users will still see
fully evaluated vignettes on CRAN.

[1] https://www.enchufa2.es/archives/suggests-and-vignettes.html


Iñaki
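
A minimal sketch of what disabling chunk evaluation could look like, assuming the vignette is built with knitr (the thread does not say which vignette engine 'apcluster' uses). The helper name `suggests_available` and the `required` vector are made up for this illustration; the linked post [1] remains the reference for the actual recipe.

```r
# Suggested packages that the vignette's code chunks rely on
# (here only 'kebabs', as mentioned in the thread).
required <- c("kebabs")

# Hypothetical helper: TRUE only if every listed package can be loaded.
suggests_available <- function(pkgs) {
  all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))
}

# In a setup chunk at the top of the vignette: switch off evaluation of all
# later chunks when a suggested package is missing. The extra knitr check
# only keeps this sketch runnable outside a vignette build.
if (!suggests_available(required) &&
    requireNamespace("knitr", quietly = TRUE)) {
  knitr::opts_chunk$set(eval = FALSE)
}
```

With this in place, the chunks are still shown in the output but are not run when 'kebabs' is unavailable, so the surrounding text does not need to be rewritten.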


Thanks in advance for your kind assistance,
Ulrich Bodenhofer



-------- Forwarded Message --------
Subject:        Re: CRAN packages not using Suggests conditionally
Date:   Mon, 15 Jan 2018 08:44:40 +0100
From:   Ulrich Bodenhofer <bodenho...@bioinf.jku.at>
To:     Prof Brian Ripley <rip...@stats.ox.ac.uk>
CC:     [...stripped for the sake of privacy ...]



Dear Prof. Ripley,

Thank you very much for bringing this important issue to my attention. I
am the maintainer of the 'apcluster' package. My package refers to
'Biostrings' in an example section of a help page (a quite insignificant
one, by the way), which creates errors on some platforms. It also refers
to 'kebabs' in the package vignette, which leads to warnings.

I could fix the first, more severe, problem quite easily, (1) since it
is relatively easy to wrap an entire examples section in a conditional,
and (2), as I have mentioned, it is not a particularly important help page.
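
For readers unfamiliar with the idiom, wrapping an example in a conditional might look roughly as follows. This is a sketch only: the sequences are invented, and it is not the actual 'apcluster' help page example.

```r
# Run the Biostrings-dependent part of an example only when the package is
# installed; otherwise skip it with a note instead of failing the check.
if (requireNamespace("Biostrings", quietly = TRUE)) {
  # Made-up input data, for illustration only.
  seqs <- Biostrings::DNAStringSet(c("ACGT", "ACGG", "TTGA"))
  print(seqs)
  example_ran <- TRUE
} else {
  message("Package 'Biostrings' is not available; skipping this example.")
  example_ran <- FALSE
}
```

The `example_ran` flag is not needed in a real help page; it is only there to make the two branches visible.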

Regarding the vignette, I want to ask for your advice now, since the
situation appears more complicated to me. While it is, of course, only
one code chunk that loads the 'kebabs' package, five more code chunks
depend on the package (more specifically, the data objects created by a
method implemented in the package) - with quite some text in between. So
the handling of the conditional loading of the package would propagate
to multiple code chunks and also affect the validity of the explanations
in between. I would see the following options:

1. Remove the entire section of the vignette. That would be a pity,
since I can no longer point the users to an otherwise interesting
application of my package.
2. Replace the code chunks by static LaTeX code such that it appears in
the PDF as if there were code chunks that had been run. This sort of
undermines the philosophy of vignettes and also creates extra effort for
me to maintain the vignette.
3. Use the functionality of 'kernlab' instead of 'kebabs' if the latter
is not available. This would be technically possible, but (1) the code
in the vignette will look much more complicated to the user and (2)
'kernlab' does not implement the necessary functionality fully correctly
and also has much longer run times. Needless to say, the issue with
conditional loading will then simply propagate to 'kernlab'.

Which of the three solutions would you prefer? Do you see any fourth
alternative? Or would you tolerate the warnings on some platforms
arising from the non-availability of packages suggested by the package
vignette?

Thanks for your time and best regards,
Ulrich Bodenhofer

P.S.: @all: I hope it is acceptable that I replied to all. I thought the
discussion would be interesting for some of you having similar issues.

Re: [Rd] How to address the following: CRAN packages not using Suggests conditionally

2018-01-22 Thread Iñaki Úcar
Re-sending, since I forgot to include the list, sorry. I'm including
r-package-devel too this time, as it seems more appropriate for this list.


On 22 Jan 2018 10:11, "Iñaki Úcar" wrote:

>
>
> On 22 Jan 2018 8:12, "Ulrich Bodenhofer" wrote:
>
> Dear colleagues, dear members of the R Core Team,
>
> This was an issue raised by Prof. Brian Ripley and sent privately to all
> developers of CRAN packages that suggest Bioconductor packages (see
> original message below). As mentioned in my message enclosed below, it was
> easy for me to fix the error in examples (new version not submitted to CRAN
> yet), but it might turn into a major effort for the warnings raised by the
> package vignette. Since I have not gotten any advice yet, I take the
> liberty to post it here on this list - hoping that we reach a conclusion
> here how to deal with this matter.
>
>
> Just disable code chunk evaluation if suggested packages are missing (see
> [1]). As explained by Prof. Ripley, it will only affect Fedora checks on
> r-devel, i.e., your users will still see fully evaluated vignettes on CRAN.
>
> [1] https://www.enchufa2.es/archives/suggests-and-vignettes.html
>
> Iñaki
>
>
> Thanks in advance for your kind assistance,
> Ulrich Bodenhofer
>
>
>
> -------- Forwarded Message --------
> Subject: Re: CRAN packages not using Suggests conditionally
> Date:    Mon, 15 Jan 2018 08:44:40 +0100
> From:    Ulrich Bodenhofer
> To:      Prof Brian Ripley
> CC:      [...stripped for the sake of privacy ...]
>
>
>
> Dear Prof. Ripley,
>
> Thank you very much for bringing this important issue to my attention. I
> am the maintainer of the 'apcluster' package. My package refers to
> 'Biostrings' in an example section of a help page (a quite insignificant
> one, by the way), which creates errors on some platforms. It also refers
> to 'kebabs' in the package vignette, which leads to warnings.
>
> I could fix the first, more severe, problem quite easily, (1) since it
> is relatively easy to wrap an entire examples section in a conditional,
> and (2), as I have mentioned, it is not a particularly important help page.
>
> Regarding the vignette, I want to ask for your advice now, since the
> situation appears more complicated to me. While it is, of course, only
> one code chunk that loads the 'kebabs' package, five more code chunks
> depend on the package (more specifically, the data objects created by a
> method implemented in the package) - with quite some text in between. So
> the handling of the conditional loading of the package would propagate
> to multiple code chunks and also affect the validity of the explanations
> in between. I would see the following options:
>
> 1. Remove the entire section of the vignette. That would be a pity,
> since I can no longer point the users to an otherwise interesting
> application of my package.
> 2. Replace the code chunks by static LaTeX code such that it appears in
> the PDF as if there were code chunks that had been run. This sort of
> undermines the philosophy of vignettes and also creates extra effort for
> me to maintain the vignette.
> 3. Use the functionality of 'kernlab' instead of 'kebabs' if the latter
> is not available. This would be technically possible, but (1) the code
> in the vignette will look much more complicated to the user and (2)
> 'kernlab' does not implement the necessary functionality fully correctly
> and also has much longer run times. Needless to say, the issue with
> conditional loading will then simply propagate to 'kernlab'.
>
> Which of the three solutions would you prefer? Do you see any fourth
> alternative? Or would you tolerate the warnings on some platforms
> arising from the non-availability of packages suggested by the package
> vignette?
>
> Thanks for your time and best regards,
> Ulrich Bodenhofer
>
> P.S.: @all: I hope it is acceptable that I replied to all. I thought the
> discussion would be interesting for some of you having similar issues.
>
>
>
> On 01/14/2018 09:20 AM, Prof Brian Ripley wrote:
>
>> as required by §1.1.3.1 of the manual.
>>
>> The Bioconductor branch used by R-devel has been very unstable recently,
>> and it has been decided not to use it for the Fedora checks on R-devel. As
>> you can see from the CRAN results pages (at least at the time of writing),
>> packages
>>
>> ACMEeqtl BoSSA CNVassoc CorShrink GRANBase GenCAT GiANT NMF PlackettLuce
>> ProFit ProFound RNAseqNet SIBERG antaresRead apcluster cherry clValid coloc
>> colorhcplot entropart filematrix fuzzyforest fuzzyjoin glycanr hexbin loon
>> nscancor ordinalgmifs penalized phangorn propr shiftR switchr tcgsaseq
>> tileHMM tmod
>>
>> then give ERRORs or (new) WARNINGs on their checks.  Please correct ASAP,
>> and by Feb 20 to safely retain the package on CRAN.
>>
>>

_