Re: [Rd] capturing multiple warnings in tryCatch()

2021-12-03 Thread Fox, John
Dear Adrian,

For consistency, you might want to put toreturn$value <- output$value inside of 
if (capture) {}. In any event, it makes sense for me to wait for the modified 
admisc::tryCatchWEM to find its way to CRAN rather than to maintain my own 
version of the function.

Thanks for this,
 John

On 2021-12-03, 6:27 PM, "R-devel on behalf of Adrian Dușa" 
 wrote:

Dear John,

The logical argument capture is already in production use by other
packages, but I think this is easily solved by:

if (!is.null(output$value) & output$visible) {
    if (capture) {
        toreturn$output <- capture.output(output$value)
    }
    toreturn$value <- output$value
}

so that value is always part of the return list, if visible.
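Sketched as a standalone helper (build_return() is an illustrative name for this branch of the logic, not part of admisc), the combined behaviour looks like this:

```r
# Sketch of the combined branch: `capture` controls the extra $output
# field, while $value is returned whenever the result is visible.
build_return <- function(output, capture) {
    toreturn <- list()
    if (!is.null(output$value) && output$visible) {
        if (capture) {
            toreturn$output <- capture.output(output$value)
        }
        toreturn$value <- output$value
    }
    toreturn
}

out <- withVisible(1:3)
build_return(out, capture = TRUE)$output   # the printed form, "[1] 1 2 3"
build_return(out, capture = FALSE)$value   # the object itself, 1:3
```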

This is a very good suggestion, and I've already incorporated it into this
function.

All the best,
Adrian

On Fri, 3 Dec 2021 at 21:42, Fox, John  wrote:

> Dear Adrian,
>
> Here's my slightly modified version of your function, which serves my
> purpose:
>
> --- snip ---
>
> tryCatchWEM <- function (expr, capture = TRUE) {
>     toreturn <- list()
>     output <- withVisible(withCallingHandlers(
>         tryCatch(expr,
>             error = function(e) {
>                 toreturn$error <<- e$message
>                 NULL
>             }),
>         warning = function(w) {
>             toreturn$warning <<- c(toreturn$warning, w$message)
>             invokeRestart("muffleWarning")
>         },
>         message = function(m) {
>             toreturn$message <<- paste(toreturn$message, m$message, sep = "")
>             invokeRestart("muffleMessage")
>         }))
>     if (capture & output$visible) {
>         if (!is.null(output$value)) {
>             toreturn$result <- output$value
>         }
>     }
>     if (length(toreturn) > 0) {
>         return(toreturn)
>     }
> }
>
> --- snip ---
>
> The two small modifications are to change the default of capture to TRUE
> and to return output$value rather than capture.output(output$value). So a
> suggestion would be to modify the capture argument to, say,
> capture = c("no", "output", "value"), and then something like
>
> . . .
> capture <- match.arg(capture)
> . . .
> if (capture == "output") {
>     toreturn$output <- capture.output(output$value)
> } else if (capture == "value") {
>     toreturn$value <- output$value
> }
> . . .
>
> Best,
>  John
>
> On 2021-12-03, 1:56 PM, "R-devel on behalf of Adrian Dușa" <
> r-devel-boun...@r-project.org on behalf of dusa.adr...@gmail.com> wrote:
>
> On Fri, 3 Dec 2021 at 00:37, Fox, John  wrote:
>
> > Dear Henrik, Simon, and Adrian,
> >
> > As it turns out Adrian's admisc::tryCatchWEM() *almost* does what I
> want,
> > which is both to capture all messages and the result of the
> expression
> > (rather than the visible representation of the result). I was easily
> able
> > to modify tryCatchWEM() to return the result.
> >
>
> Glad it helps.
> I would be happy to improve the function, should you send a reprex
> with the
> desired final result.
>
> Best wishes,
> Adrian
>
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



Re: [Rd] capturing multiple warnings in tryCatch()

2021-12-03 Thread Fox, John
Dear Adrian,

Here's my slightly modified version of your function, which serves my purpose:

--- snip ---

tryCatchWEM <- function (expr, capture = TRUE) {
    toreturn <- list()
    output <- withVisible(withCallingHandlers(
        tryCatch(expr,
            error = function(e) {
                toreturn$error <<- e$message
                NULL
            }),
        warning = function(w) {
            toreturn$warning <<- c(toreturn$warning, w$message)
            invokeRestart("muffleWarning")
        },
        message = function(m) {
            toreturn$message <<- paste(toreturn$message, m$message, sep = "")
            invokeRestart("muffleMessage")
        }))
    if (capture & output$visible) {
        if (!is.null(output$value)) {
            toreturn$result <- output$value
        }
    }
    if (length(toreturn) > 0) {
        return(toreturn)
    }
}

--- snip ---
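Run on an expression that signals two warnings and a message before returning a value, the function collects everything in one list. A sketch (the definition is repeated, lightly condensed, so the example is self-contained):

```r
# Condensed copy of tryCatchWEM() from the message above, followed by a
# call that raises two warnings, a message, and a visible value.
tryCatchWEM <- function(expr, capture = TRUE) {
    toreturn <- list()
    output <- withVisible(withCallingHandlers(
        tryCatch(expr, error = function(e) {
            toreturn$error <<- e$message
            NULL
        }),
        warning = function(w) {
            toreturn$warning <<- c(toreturn$warning, w$message)
            invokeRestart("muffleWarning")
        },
        message = function(m) {
            toreturn$message <<- paste(toreturn$message, m$message, sep = "")
            invokeRestart("muffleMessage")
        }))
    if (capture && output$visible && !is.null(output$value)) {
        toreturn$result <- output$value
    }
    if (length(toreturn) > 0) toreturn
}

res <- tryCatchWEM({
    warning("warning 1")
    message("a message")
    warning("warning 2")
    42
})
# res$warning is c("warning 1", "warning 2"); res$message is "a message\n";
# res$result is 42 -- all conditions plus the value, in one pass.
```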

The two small modifications are to change the default of capture to TRUE and to 
return output$value rather than capture.output(output$value). So a suggestion 
would be to modify the capture argument to, say, capture=c("no", "output", 
"value") and then something like

. . .
capture <- match.arg(capture)
. . .
if (capture == "output") {
    toreturn$output <- capture.output(output$value)
} else if (capture == "value") {
    toreturn$value <- output$value
}
. . .
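A minimal self-contained sketch of that interface (capture_result() is an illustrative stand-in for the relevant part of tryCatchWEM, not the actual admisc API):

```r
# match.arg() picks "no" (the first element) when `capture` is not
# supplied, and validates any user-supplied value against the three
# alternatives, raising an error for anything else.
capture_result <- function(value, capture = c("no", "output", "value")) {
    capture <- match.arg(capture)
    toreturn <- list()
    if (capture == "output") {
        toreturn$output <- capture.output(value)  # printed representation
    } else if (capture == "value") {
        toreturn$value <- value                   # the object itself
    }
    toreturn
}

capture_result(1:3, "output")$output  # "[1] 1 2 3"
capture_result(1:3, "value")$value    # the vector 1:3 itself
length(capture_result(1:3))           # 0 -- default "no" captures nothing
```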

Best,
 John

On 2021-12-03, 1:56 PM, "R-devel on behalf of Adrian Dușa" 
 wrote:

On Fri, 3 Dec 2021 at 00:37, Fox, John  wrote:

> Dear Henrik, Simon, and Adrian,
>
> As it turns out Adrian's admisc::tryCatchWEM() *almost* does what I want,
> which is both to capture all messages and the result of the expression
> (rather than the visible representation of the result). I was easily able
> to modify tryCatchWEM() to return the result.
>

Glad it helps.
I would be happy to improve the function, should you send a reprex with the
desired final result.

Best wishes,
Adrian


__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel



Re: [Rd] capturing multiple warnings in tryCatch()

2021-12-03 Thread Fox, John
Dear Rui,

Thanks for this. Simon referred to demo(error.catching), and Adrian's version 
returns the printed representation of the result along with messages.

Best,
 John

On 2021-12-03, 11:35 AM, "Rui Barradas"  wrote:

Hello,

I remembered having seen a function tryCatch.W.E and after an online 
search, found where.
It was in an R-help post and in demo(error.catching). The question by
Marius Hofert [1] was answered, among others, by Martin Maechler [2],
whose answer included the function tryCatch.W.E.

These posts refer to an old thread from 2004 [3], with an answer by
Luke Tierney [4]. The function withWarnings posted by Luke returns all
warning messages in a list, as seen below.
I repost the function here to keep this self-contained.



withWarnings <- function (expr) {
    warnings <- character()
    retval <- withCallingHandlers(expr, warning = function(ex) {
        warnings <<- c(warnings, conditionMessage(ex))
        invokeRestart("muffleWarning")
    })
    list(Value = retval, Warnings = warnings)
}
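Here foo() is the two-warning example from John's original post, repeated so the transcript below is reproducible:

```r
# foo() as defined in John's original post: signals two warnings in turn.
foo <- function() {
    warning("warning 1")
    warning("warning 2")
}
```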
withWarnings(foo())
#$Value
#[1] "warning 2"
#
#$Warnings
#[1] "warning 1" "warning 2"



Function tryCatch.W.E is now part of contributed package simsalapar [5], 
with credits to Marius and Martin given in its documentation.


[1] https://stat.ethz.ch/pipermail/r-help/2010-December/262185.html
[2] https://stat.ethz.ch/pipermail/r-help/2010-December/262626.html
[3] https://stat.ethz.ch/pipermail/r-help/2004-June/052092.html
[4] https://stat.ethz.ch/pipermail/r-help/2004-June/052132.html
[5] https://CRAN.R-project.org/package=simsalapar


    Hope this helps,

Rui Barradas



Às 22:37 de 02/12/21, Fox, John escreveu:
> Dear Henrik, Simon, and Adrian,
> 
> As it turns out Adrian's admisc::tryCatchWEM() *almost* does what I want, 
which is both to capture all messages and the result of the expression (rather 
than the visible representation of the result). I was easily able to modify 
tryCatchWEM() to return the result.
> 
> Henrik: I was aware that tryCatch() doesn't return the final result of 
the expression, and I was previously re-executing the expression to capture the 
result, but only getting the first warning message, along with the result.
> 
> Thanks for responding to my question and providing viable solutions,
>   John
> 
> On 2021-12-02, 5:19 PM, "Henrik Bengtsson"  
wrote:
> 
>  Simon's suggestion with withCallingHandlers() is the correct way.
>  Also, note that if you use tryCatch() to catch warnings, you're
>  *interrupting* the evaluation of the expression of interest, e.g.
> 
>  > res <- tryCatch({ message("hey"); warning("boom"); 
message("there"); 42 }, warning = function(w) { message("Warning caught: ", 
conditionMessage(w)); 3.14 })
>  hey
>  Warning caught: boom
>  > res
>  [1] 3.14
> 
>  Note how it never completes your expression.
> 
>  /Henrik
> 
>  On Thu, Dec 2, 2021 at 1:14 PM Simon Urbanek
>   wrote:
>  >
>  >
>  > Adapted from demo(error.catching):
>  >
>  > > W=list()
>  > > withCallingHandlers(foo(), warning=function(w) { W <<- c(W, 
list(w)); invokeRestart("muffleWarning") })
>  > > str(W)
>  > List of 2
>  >  $ :List of 2
>  >   ..$ message: chr "warning 1"
>  >   ..$ call   : language foo()
>  >   ..- attr(*, "class")= chr [1:3] "simpleWarning" "warning" 
"condition"
>  >  $ :List of 2
>  >   ..$ message: chr "warning 2"
>  >   ..$ call   : language foo()
>  >   ..- attr(*, "class")= chr [1:3] "simpleWarning" "warning" 
"condition"
>  >
>  > Cheers,
>  > Simon
>  >
>  >
>  > > On Dec 3, 2021, at 10:02 AM, Fox, John  wrote:
>  > >
>  > > Dear R-devel list members,
>  > >
>  > > Is it possible to capture more than one warning message using 
tryCatch()? The answer may be in ?conditions, but, if it is, I can't locate it.
>  > >
>  > > For example, in the following only the first warning message is 
captured and reported:
>  > >
>  > >> foo <- function(){
>  > > +   warni

Re: [Rd] capturing multiple warnings in tryCatch()

2021-12-02 Thread Fox, John
Dear Henrik, Simon, and Adrian,

As it turns out Adrian's admisc::tryCatchWEM() *almost* does what I want, which 
is both to capture all messages and the result of the expression (rather than 
the visible representation of the result). I was easily able to modify 
tryCatchWEM() to return the result.

Henrik: I was aware that tryCatch() doesn't return the final result of the 
expression, and I was previously re-executing the expression to capture the 
result, but only getting the first warning message, along with the result.

Thanks for responding to my question and providing viable solutions,
 John

On 2021-12-02, 5:19 PM, "Henrik Bengtsson"  wrote:

Simon's suggestion with withCallingHandlers() is the correct way.
Also, note that if you use tryCatch() to catch warnings, you're
*interrupting* the evaluation of the expression of interest, e.g.

> res <- tryCatch({ message("hey"); warning("boom"); message("there"); 42 },
+     warning = function(w) { message("Warning caught: ", conditionMessage(w)); 3.14 })
hey
Warning caught: boom
> res
[1] 3.14

Note how it never completes your expression.
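By contrast (a sketch of the same expression), a calling handler lets the expression run to completion while still reacting to each warning:

```r
# withCallingHandlers() resumes evaluation after each warning is handled,
# so the full expression completes and its value is returned.
res <- withCallingHandlers(
    { message("hey"); warning("boom"); message("there"); 42 },
    warning = function(w) {
        message("Warning caught: ", conditionMessage(w))
        invokeRestart("muffleWarning")
    }
)
res  # 42 -- both messages were emitted and the expression completed
```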

/Henrik

On Thu, Dec 2, 2021 at 1:14 PM Simon Urbanek
 wrote:
>
>
> Adapted from demo(error.catching):
>
> > W=list()
> > withCallingHandlers(foo(), warning=function(w) { W <<- c(W, list(w)); 
invokeRestart("muffleWarning") })
> > str(W)
> List of 2
>  $ :List of 2
>   ..$ message: chr "warning 1"
>   ..$ call   : language foo()
>   ..- attr(*, "class")= chr [1:3] "simpleWarning" "warning" "condition"
>  $ :List of 2
>   ..$ message: chr "warning 2"
>   ..$ call   : language foo()
>   ..- attr(*, "class")= chr [1:3] "simpleWarning" "warning" "condition"
>
> Cheers,
> Simon
>
>
> > On Dec 3, 2021, at 10:02 AM, Fox, John  wrote:
> >
> > Dear R-devel list members,
> >
> > Is it possible to capture more than one warning message using 
tryCatch()? The answer may be in ?conditions, but, if it is, I can't locate it.
> >
> > For example, in the following only the first warning message is 
captured and reported:
> >
> >> foo <- function(){
> > +   warning("warning 1")
> > +   warning("warning 2")
> > + }
> >
> >> foo()
> > Warning messages:
> > 1: In foo() : warning 1
> > 2: In foo() : warning 2
> >
> >> bar <- function(){
> > +   tryCatch(foo(), warning=function(w) print(w))
> > + }
> >
> >> bar()
> > 
> >
> > Is there a way to capture "warning 2" as well?
> >
> > Any help would be appreciated.
> >
> > John
> >
> > --
> > John Fox, Professor Emeritus
> > McMaster University
> > Hamilton, Ontario, Canada
> > Web: http://socserv.mcmaster.ca/jfox/
> >
> >
> >
> > __
> > R-devel@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-devel
> >
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



[Rd] capturing multiple warnings in tryCatch()

2021-12-02 Thread Fox, John
Dear R-devel list members,

Is it possible to capture more than one warning message using tryCatch()? The 
answer may be in ?conditions, but, if it is, I can't locate it.

For example, in the following only the first warning message is captured and 
reported:

> foo <- function(){
+   warning("warning 1")
+   warning("warning 2")
+ }

> foo()
Warning messages:
1: In foo() : warning 1
2: In foo() : warning 2

> bar <- function(){
+   tryCatch(foo(), warning=function(w) print(w))
+ }

> bar()


Is there a way to capture "warning 2" as well?

Any help would be appreciated.

John

-- 
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox/
 
 

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] tcltk image reading problem (on a mac?): [tcl] encountered an unsupported criticial chunk type "eXIf"

2020-06-11 Thread Fox, John
Dear Simon,

> On Jun 11, 2020, at 9:00 PM, Simon Urbanek  
> wrote:
> 
> Wayne,
> 
> that one is unrelated, but interesting - you can fix it with 
> 
> sudo install_name_tool -change \
>  /usr/local/lib:/opt/X11/lib/libtk8.6.dylib \
>  /usr/local/lib/libtk8.6.dylib \
>  /usr/local/bin/wish8.6 
> 
> There is a bug in tcltk with IDs on the libraries, which I have worked around
> for R but not for wish.
> 
> Back to the original question - do you have any example of a file that
> doesn't work so I could test? Exif chunks are fairly rare in PNG and are a
> fairly late extension, so I couldn't find any examples.

The code in Wayne's original message (copied below) generated the offending 
file:

library(tcltk)

fname <- "Rplot.png"
png(filename = fname, width = 500, height = 500)
hist(rnorm(20))
dev.off()

tkimage.create("photo", file = fname)
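To confirm whether a generated file actually carries an eXIf chunk, one can list the PNG chunk types directly. A sketch in base R (list_png_chunks() is a hypothetical helper, not part of tcltk or base R; whether "eXIf" appears depends on the libpng the graphics device uses):

```r
# Walk a PNG file's chunk layout: an 8-byte signature, then a sequence of
# length (4 bytes, big-endian) / type (4 bytes) / data / CRC (4 bytes).
list_png_chunks <- function(path) {
    con <- file(path, "rb")
    on.exit(close(con))
    readBin(con, "raw", 8)  # skip the PNG signature
    chunks <- character()
    repeat {
        len <- readBin(con, "integer", 1, size = 4, endian = "big")
        if (length(len) == 0) break
        type <- rawToChar(readBin(con, "raw", 4))
        chunks <- c(chunks, type)
        readBin(con, "raw", len + 4)  # skip chunk data and CRC
        if (type == "IEND") break
    }
    chunks
}

fname <- tempfile(fileext = ".png")
png(filename = fname, width = 500, height = 500)
hist(rnorm(20))
dev.off()
list_png_chunks(fname)  # e.g. "IHDR" ... "IEND"; look for "eXIf" here
```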

Best,
 John


> 
> Thanks,
> Simon
> 
> 
>> On 12/06/2020, at 12:24 PM, Wayne Oldford  wrote:
>> 
>> I don't know what has changed with Catalina
>> 
>> But I just tried my tk console from the shell command tkcon 
>> And got the following error. 
>> Here is my shell:
>> 
>> $ tkcon 
>> 
>> dyld: Library not loaded: /usr/local/lib:/opt/X11/lib/libtk8.6.dylib
>> Referenced from: /usr/local/bin/wish
>> Reason: image not found
>>   Abort trap: 6
>> 
>> 
>> I don't know whether this is a red herring or not, but the 
>> Console fails to boot.
>> 
>> John does it work for you?
>> 
>> Not sure whether Python has the same trouble.  Kind of old info at 
>> https://www.python.org/download/mac/tcltk/ 
>> 
>> 
>> 
>> 
>> -Original Message-
>> From: "Fox, John" 
>> Date: Thursday, June 11, 2020 at 7:54 PM
>> To: Wayne Oldford 
>> Cc: Peter Dalgaard , "r-devel@r-project.org" 
>> 
>> Subject: Re: [Rd]  tcltk image reading problem (on a mac?): [tcl] 
>> encountered an unsupported criticial chunk type "eXIf"
>> 
>>   Dear Wayne and Peter,
>> 
>>   FWIW, I observe exactly the same problem in Catalina. The error and my 
>> session info:
>> 
>>    snip 
>> 
>>> tkimage.create("photo", file = fname)
>>   Error in structure(.External(.C_dotTclObjv, objv), class = "tclObj") : 
>> [tcl] encountered an unsupported criticial chunk type "eXIf".
>> 
>>> sessionInfo()
>>   R version 4.0.0 (2020-04-24)
>>   Platform: x86_64-apple-darwin17.0 (64-bit)
>>   Running under: macOS Catalina 10.15.5
>> 
>>   Matrix products: default
>>   BLAS:   
>> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
>>   LAPACK: 
>> /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
>> 
>>   locale:
>>   [1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8
>> 
>>   attached base packages:
>>   [1] tcltk stats graphics  grDevices utils datasets  methods   
>> base 
>> 
>>   loaded via a namespace (and not attached):
>>   [1] compiler_4.0.0 tools_4.0.0   
>> 
>>    snip 
>> 
>>   This is from RStudio but I see the same thing in the R.app.
>> 
>>   I hope this is of some help,
>>John
>> 
>>-
>> John Fox, Professor Emeritus
>> McMaster University
>> Hamilton, Ontario, Canada
>> Web: http://socserv.mcmaster.ca/jfox
>> 
>>> On Jun 11, 2020, at 6:43 PM, Wayne Oldford  wrote:
>>> 
>>> Yes.
>>> I seem to be picking up
>>>   8.6
>>> I should have noted that.
>>> 
>>> Used to work for me too in Mojave.
>>> I have the sneaky feeling that Catalina is the problem.
>>> 
>>> R. W. Oldford
>>> 
>>> https://math.uwaterloo.ca/~rwoldfor
>>> 
>>> 
>>> From: Peter Dalgaard 
>>> Sent: Thursday, June 11, 2020 5:56:15 PM
>>> To: Wayne Oldford 
>>> Cc: r-devel@r-project.org 
>>> Subject: Re: [Rd] tcltk image reading problem (on a mac?): [tcl] 
>>> encountered an unsupported criticial chunk type "eXIf"
>>> 
>>> Happy enough for me on Mojave.
>>> 
>>> On the off chance that you are picking up an old Tcl, do you see this?
>>> 
>>>> tcl("info","tclversion")
>>>  8

Re: [Rd] tcltk image reading problem (on a mac?): [tcl] encountered an unsupported criticial chunk type "eXIf"

2020-06-11 Thread Fox, John
Dear Wayne and Peter,

FWIW, I observe exactly the same problem in Catalina. The error and my session 
info:

 snip 

> tkimage.create("photo", file = fname)
Error in structure(.External(.C_dotTclObjv, objv), class = "tclObj") : 
  [tcl] encountered an unsupported criticial chunk type "eXIf".

> sessionInfo()
R version 4.0.0 (2020-04-24)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Catalina 10.15.5

Matrix products: default
BLAS:   
/System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: 
/Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib

locale:
[1] en_CA.UTF-8/en_CA.UTF-8/en_CA.UTF-8/C/en_CA.UTF-8/en_CA.UTF-8

attached base packages:
[1] tcltk stats graphics  grDevices utils datasets  methods   base  
   

loaded via a namespace (and not attached):
[1] compiler_4.0.0 tools_4.0.0   

 snip 

This is from RStudio but I see the same thing in the R.app.

I hope this is of some help,
 John

 -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Jun 11, 2020, at 6:43 PM, Wayne Oldford  wrote:
> 
> Yes.
> I seem to be picking up
>   8.6
> I should have noted that.
> 
> Used to work for me too in Mojave.
> I have the sneaky feeling that Catalina is the problem.
> 
> R. W. Oldford
> 
> https://math.uwaterloo.ca/~rwoldfor
> 
> 
> From: Peter Dalgaard 
> Sent: Thursday, June 11, 2020 5:56:15 PM
> To: Wayne Oldford 
> Cc: r-devel@r-project.org 
> Subject: Re: [Rd] tcltk image reading problem (on a mac?): [tcl] encountered 
> an unsupported criticial chunk type "eXIf"
> 
> Happy enough for me on Mojave.
> 
> On the off chance that you are picking up an old Tcl, do you see this?
> 
>> tcl("info","tclversion")
>  8.6
> 
> 
> -pd
> 
>> On 11 Jun 2020, at 23:04 , Wayne Oldford  wrote:
>> 
>> Hello everyone
>> 
>> I am not sure when this appeared
>> (sometime post R 3.5.0 and after I switched to Mac OS Catalina).
>> 
>> I do not think it happens on all platforms (e.g. seems to work on windows).
>> 
>> But it seems that
>> 
>> tkimage.create()
>> 
>> no longer works on a Mac for all png files.
>> 
>> 
>> (It does work for *some* old png files I have on disk but I have not been 
>> able to determine what is different about the ones that work)
>> 
>> Any help would be appreciated.
>> 
>> - Wayne
>> 
>> 
>> R.W. Oldford
>> math.uwaterloo.ca/~rwoldfor
>> 
>> 
>> 
>> 
>>> library(tcltk)
>> 
>>> fname <- "Rplot.png"
>>> png(filename = fname, width = 500, height = 500)
>>> hist(rnorm(20))
>>> dev.off()
>> 
>>> tkimage.create("photo", file = fname)
>> 
>> Error in structure(.External(.C_dotTclObjv, objv), class = "tclObj") :
>> [tcl] encountered an unsupported criticial chunk type "eXIf".
>> 
>> 
>> __
>> 
>>> R.version
>>  _
>> platform   x86_64-apple-darwin17.0
>> arch   x86_64
>> os darwin17.0
>> system x86_64, darwin17.0
>> status
>> major  4
>> minor  0.0
>> year   2020
>> month  04
>> day24
>> svn rev78286
>> language   R
>> version.string R version 4.0.0 (2020-04-24)
>> nickname   Arbor Day
>> 
>> ___
>> 
>> macOS Catalina V 10.15.5
>> 
>> ___
>> 
>> 
>> 
>> __
>> R-devel@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel



 


Re: [Rd] [R] R 4.0.2 scheduled for June 22

2020-06-09 Thread Fox, John
Dear Peter,

Thank you very much for this.

To clarify slightly, the bug affects not just the Rcmdr package but use of the 
tcltk package on Windows more generally.

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Jun 9, 2020, at 5:28 PM, Peter Dalgaard via R-help  
> wrote:
> 
> Unfortunately, a memory allocation bug prevented the R Commander package from
> working on Windows. This is fixed in R-patched, but we cannot have this not 
> working in the official release when IT departments start installing for the 
> Fall semester, so we need to issue a new release.
> 
> Full schedule is available on developer.r-project.org.
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> ___
> r-annou...@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-announce
> 
> __
> r-h...@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [External] use of the tcltk package crashes R 4.0.1 for Windows

2020-06-08 Thread Fox, John
Dear Jeroen,

With the caveat that I've tested only a few of the Rcmdr dialogs (a full test 
takes hours and must be done manually), everything seems to be working fine 
again.

Thank you for addressing this problem so quickly.

John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Jun 8, 2020, at 3:36 AM, Jeroen Ooms  wrote:
> 
> On Mon, Jun 8, 2020 at 12:03 AM  wrote:
>> 
>> I've committed the change to use Free instead of free in tcltk.c and
>> sys-std.c (r78652 for R-devel, r78653 for R-patched).
> 
> Thank you! I can confirm that the example from above no longer crashes
> in R--patched.
> 
> John, can you confirm that everything seems to work now in Rcmd with
> today's R-patched build from CRAN?
> https://cran.r-project.org/bin/windows/base/rpatched.html
> 
> Hopefully Peter will be able to tag a 4.0.2 hotfix release based on
> 4.0.1 + above patch, without going through the full release
> procedure...

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] [External] Re: use of the tcltk package crashes R 4.0.1 for Windows

2020-06-07 Thread Fox, John
Hi,

Does it make sense to withdraw the Windows R 4.0.1 binary until the issue is 
resolved? 

Best,
 John

> -Original Message-
> From: luke-tier...@uiowa.edu 
> Sent: Sunday, June 7, 2020 11:54 AM
> To: peter dalgaard 
> Cc: Jeroen Ooms ; Fox, John ; r-
> de...@r-project.org
> Subject: Re: [External] Re: [Rd] use of the tcltk package crashes R 4.0.1
> for Windows
> 
> On Sun, 7 Jun 2020, peter dalgaard wrote:
> 
> > So this wasn't tested for a month?
> >
> > Anyways, Free() is just free() with a check that we're not freeing a
> > null pointer, followed by setting the pointer to NULL. At that point
> > of tcltk.c, we have
> >
> >    for (objc = i = 0; i < length(avec); i++) {
> >        const char *s;
> >        char *tmp;
> >        if (!isNull(nm) && strlen(s = translateChar(STRING_ELT(nm, i)))) {
> >            //  tmp = calloc(strlen(s)+2, sizeof(char));
> >            tmp = Calloc(strlen(s)+2, char);
> >            *tmp = '-';
> >            strcpy(tmp+1, s);
> >            objv[objc++] = Tcl_NewStringObj(tmp, -1);
> >            free(tmp);
> >        }
> >        if (!isNull(t = VECTOR_ELT(avec, i)))
> >            objv[objc++] = (Tcl_Obj *) R_ExternalPtrAddr(t);
> >    }
> >
> > and I can't see how tmp can be NULL at the free(), nor can I see it
> mattering if it is not set to NULL (notice that it goes out of scope with
> the for loop).
> 
> Right. And the calloc->Calloc change doesn't look like an issue either
> -- just checking for a NULL.
> 
> If the crash is happening in free() then that most likely means corrupted
> malloc data structures. Unfortunately that could be happening anywhere.
> 
> Best bet to narrow this down is for someone with a good Windows setup who
> can reproduce this to bisect the svn commits and see at what commit this
> started happening. Unfortunately my office Windows machine isn't
> responding and it will probably take some time to get that fixed.
> 
> Best,
> 
> luke
> 
> >
> > -pd
> >
> >
> >> On 7 Jun 2020, at 16:00 , Jeroen Ooms  wrote:
> >>
> >> On Sun, Jun 7, 2020 at 3:13 AM Fox, John  wrote:
> >>>
> >>> Hi,
> >>>
> >>> The following code, from the examples in ?TkWidgets , immediately
> crashes R 4.0.1 for Windows:
> >>>
> >>> - snip 
> >>> library("tcltk")
> >>> tt <- tktoplevel()
> >>> label.widget <- tklabel(tt, text = "Hello, World!")
> >>> button.widget <- tkbutton(tt, text = "Push",
> >>>     command = function() cat("OW!\n"))
> >>> tkpack(label.widget, button.widget) # geometry manager
> >>> - snip 
> >>
> >>
> >> I can reproduce this. The backtrace shows the crash happens in
> >> dotTclObjv  [/src/library/tcltk/src/tcltk.c@243 ]. This looks like a
> >> bug that was introduced by commit 78408/78409 about a month ago. I
> >> think the problem is that this commit changes 'calloc' to 'Calloc'
> >> without changing the corresponding 'free' to 'Free'.
> >>
> >> This has nothing to do with the Windows build or installation.
> >> Nothing has changed in the windows build procedure between 4.0.0 and
> 4.0.1.
> >>
> >> __
> >> R-devel@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-devel
> >
> >
> 
> --
> Luke Tierney
> Ralph E. Wareham Professor of Mathematical Sciences
> University of Iowa  Phone: 319-335-3386
> Department of Statistics andFax:   319-335-3017
> Actuarial Science
> 241 Schaeffer Hall  email:   luke-tier...@uiowa.edu
> Iowa City, IA 52242 WWW:  http://www.stat.uiowa.edu

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] use of the tcltk package crashes R 4.0.1 for Windows

2020-06-07 Thread Fox, John
Dear Jeroen,

Thank you for tracking down the source of the problem.

You probably saw that Peter Dalgaard reported that the tcltk package apparently 
is working fine in R 4.0.1 on macOS. I haven't confirmed that myself because 
the Mac binary for R 4.0.1 isn't yet on CRAN.

Best,
 John

> On Jun 7, 2020, at 10:00 AM, Jeroen Ooms  wrote:
> 
> On Sun, Jun 7, 2020 at 3:13 AM Fox, John  wrote:
>> 
>> Hi,
>> 
>> The following code, from the examples in ?TkWidgets , immediately crashes R 
>> 4.0.1 for Windows:
>> 
>> - snip 
>> library("tcltk")
>> tt <- tktoplevel()
>> label.widget <- tklabel(tt, text = "Hello, World!")
>> button.widget <- tkbutton(tt, text = "Push",
>> command = function()cat("OW!\n"))
>> tkpack(label.widget, button.widget) # geometry manager
>> - snip 
> 
> 
> I can reproduce this. The backtrace shows the crash happens in
> dotTclObjv  [/src/library/tcltk/src/tcltk.c@243 ]. This looks like a
> bug that was introduced by commit 78408/78409 about a month ago. I
> think the problem is that this commit changes 'calloc' to 'Calloc'
> without changing the corresponding 'free' to 'Free'.
> 
> This has nothing to do with the Windows build or installation. Nothing
> has changed in the windows build procedure between 4.0.0 and 4.0.1.

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] use of the tcltk package crashes R 4.0.1 for Windows

2020-06-07 Thread Fox, John
Dear Peter,

First, thank you for following up on this problem.

Unless I somehow missed it (I just confirmed this), the R 4.0.1 Windows
installer *doesn't* ask to install support files for Tcl/Tk.

Nor am I the only one to notice this problem. I was made aware of it when
several Rcmdr users wrote to me yesterday to say that the package was crashing
R when it loads.

Finally, even if Tcl/Tk support is now a non-default option in R for Windows, R 
shouldn't crash if Tcl/Tk isn't installed.
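As a defensive pattern (a sketch, not the actual Rcmdr code), a package can test for Tcl/Tk support before touching any widgets, so a build without Tcl/Tk degrades gracefully instead of crashing:

```r
# capabilities("tcltk") reports whether this R build was compiled with
# Tcl/Tk support; requireNamespace() checks that the package loads.
has_tcltk <- capabilities("tcltk") &&
    requireNamespace("tcltk", quietly = TRUE)

if (has_tcltk) {
    # safe to go on and call tcltk::tktoplevel(), tcltk::tkbutton(), etc.
    message("Tcl/Tk support is available")
} else {
    message("Tcl/Tk support is not available in this R build")
}
```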

Best,
 John

> On Jun 7, 2020, at 2:44 AM, peter dalgaard  wrote:
> 
> John,
> 
> The Windows installation instructions document has the following. So, one 
> obvious question is whether you did select it. (I haven't installed on 
> WIndows for ages, so I don't know whether this was changed recently or even 
> whether the selection is on or off by default).
> 
> -pd
> 
> Using package tcltk
> ===
> 
> The package tcltk supports building graphical interfaces with Tcl/Tk.
> "Support Files for Package tcltk" needs to be selected from the
> installer for this to work; alternatively you can use an existing
> installation of Tcl/Tk 8.6.x by following the instructions in the
> rw-FAQ.
> 
> 
> 
> 
>> On 7 Jun 2020, at 08:27 , peter dalgaard  wrote:
>> 
>> Not happening on Mac, so likely a Windows build issue.
>> 
>> (There's no 4.0.1 CRAN package yet, and no nightly build of 4.0.1 Patched, 
>> but the only thing changed in the sources since r78644 is the VERSION file.)
>> 
>> -pd
>> 
>>> On 7 Jun 2020, at 03:13 , Fox, John  wrote:
>>> 
>>> Hi,
>>> 
>>> The following code, from the examples in ?TkWidgets , immediately crashes R 
>>> 4.0.1 for Windows:
>>> 
>>> - snip 
>>> library("tcltk")
>>> tt <- tktoplevel()
>>> label.widget <- tklabel(tt, text = "Hello, World!")
>>> button.widget <- tkbutton(tt, text = "Push", 
>>>  command = function()cat("OW!\n"))
>>> tkpack(label.widget, button.widget) # geometry manager
>>> - snip 
>>> 
>>> Session info (prior to the crash):
>>> 
>>> - snip 
>>>> sessionInfo()
>>> R version 4.0.1 (2020-06-06)
>>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>> Running under: Windows 10 x64 (build 18363)
>>> 
>>> Matrix products: default
>>> 
>>> locale:
>>> [1] LC_COLLATE=English_United States.1252 
>>> [2] LC_CTYPE=English_United States.1252   
>>> [3] LC_MONETARY=English_United States.1252
>>> [4] LC_NUMERIC=C  
>>> [5] LC_TIME=English_United States.1252
>>> 
>>> attached base packages:
>>> [1] tcltk stats graphics  grDevices utils datasets  methods  
>>> [8] base 
>>> 
>>> loaded via a namespace (and not attached):
>>> [1] compiler_4.0.1 tools_4.0.1   
>>> - snip 
>>> 
>>> I observe this behaviour both in the Rgui and when I run R in a terminal. I 
>>> think the problem is general to the use of the tcltk package.
>>> 
>>> Best,
>>> John
>>> 
>>> -
>>> John Fox
>>> Professor Emeritus
>>> McMaster University
>>> Hamilton, Ontario, Canada
>>> Web: https://socialsciences.mcmaster.ca/jfox/
>>> 
>>> __
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>> 
>> -- 
>> Peter Dalgaard, Professor,
>> Center for Statistics, Copenhagen Business School
>> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
>> Phone: (+45)38153501
>> Office: A 4.23
>> Email: pd@cbs.dk  Priv: pda...@gmail.com
>> 
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] use of the tcltk package crashes R 4.0.1 for Windows

2020-06-06 Thread Fox, John
Hi,

The following code, from the examples in ?TkWidgets , immediately crashes R 
4.0.1 for Windows:

--- snip ---
library("tcltk")
tt <- tktoplevel()
label.widget <- tklabel(tt, text = "Hello, World!")
button.widget <- tkbutton(tt, text = "Push", 
 command = function()cat("OW!\n"))
tkpack(label.widget, button.widget) # geometry manager
--- snip ---

Session info (prior to the crash):

--- snip ---
> sessionInfo()
R version 4.0.1 (2020-06-06)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 18363)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C  
[5] LC_TIME=English_United States.1252

attached base packages:
[1] tcltk stats graphics  grDevices utils datasets  methods  
[8] base 

loaded via a namespace (and not attached):
[1] compiler_4.0.1 tools_4.0.1   
--- snip ---

I observe this behaviour both in the Rgui and when I run R in a terminal. I 
think the problem is general to the use of the tcltk package.

Best,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Tcl/Tk Tktable package missing from R 4.0.0 on Windows

2020-05-04 Thread Fox, John
Dear Jeroen,

Thank you very much for doing this.

I understand from other (off-list) correspondence that there is an intention to 
remove Tktable from the macOS distribution of R when Tcl/Tk is updated there, 
and so a more permanent solution is probably to provide Tktable via an R 
package. I plan to pursue that possibility.

Best,
 John

> On May 4, 2020, at 3:01 AM, Jeroen Ooms  wrote:
> 
> On Sun, May 3, 2020 at 6:15 PM Fox, John  wrote:
>> 
>> Dear R-devel list members,
>> 
>> The Tktable package for Tcl/Tk is apparently missing from the Windows 
>> distribution of R 4.0.0. I (actually a user of the Rcmdr package) discovered 
>> this when trying to use the new-data-set dialog in the Rcmdr package, 
>> producing the error, "Tcl package 'Tktable' must be installed first."
>> 
>> I believe the Tktable package has been part of the R distribution for Windows 
>> since R version 2.9.0, and is still present in the macOS distribution of R 4.0.0.
>> 
>> I apologize for not discovering this problem prior to the release of R 
>> 4.0.0. I did test the Rcmdr package under R 4.0.0 on both Windows and macOS, 
>> but not every menu item and dialog on each platform.
>> 
>> Does anyone have more information about this problem?
> 
> Sorry this was my mistake. I somehow missed tktable when updating the
> tcltk bundle for rtools40. This didn't show up in any package check
> either.
> 
> I've added it back now, it should work again in today's builds r-devel
> and r-patched. Thanks for catching this.
> 
> Jeroen

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] Tcl/Tk Tktable package missing from R 4.0.0 on Windows

2020-05-03 Thread Fox, John
Dear R-devel list members,

The Tktable package for Tcl/Tk is apparently missing from the Windows 
distribution of R 4.0.0. I (actually a user of the Rcmdr package) discovered 
this when trying to use the new-data-set dialog in the Rcmdr package, producing 
the error, "Tcl package 'Tktable' must be installed first." 

I believe the Tktable package has been part of the R distribution for Windows 
since R version 2.9.0, and is still present in the macOS distribution of R 4.0.0.

I apologize for not discovering this problem prior to the release of R 4.0.0. I 
did test the Rcmdr package under R 4.0.0 on both Windows and macOS, but not 
every menu item and dialog on each platform.

Does anyone have more information about this problem? 

Thank you,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?

2019-09-17 Thread Fox, John
Dear Herve,

Sorry, I should have said "matrices" rather than "data frames" -- brief() has 
methods for both.

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Sep 17, 2019, at 8:29 AM, Fox, John  wrote:
> 
> Dear Herve,
> 
> The brief() generic function in the car package does something very similar 
> to that for data frames (and has methods for other classes of objects as 
> well).
> 
> Best,
> John
> 
>  -
>  John Fox, Professor Emeritus
>  McMaster University
>  Hamilton, Ontario, Canada
>  Web: http://socserv.mcmaster.ca/jfox
> 
>> On Sep 17, 2019, at 2:52 AM, Pages, Herve  wrote:
>> 
>> Hi,
>> 
>> Alternatively, how about a new glance() generic that would do something 
>> like this:
>> 
>>> library(DelayedArray)
>>> glance <- DelayedArray:::show_compact_array
>> 
>>> M <- matrix(rnorm(1e6), nrow = 1000L, ncol = 2000L)
>>> glance(M)
>> <1000 x 2000> matrix object of type "double":
>>             [,1]        [,2]        [,3] ...     [,1999]     [,2000]
>>   [1,]  -0.8854896   1.8010288   1.3051341   . -0.4473593  0.4684985
>>   [2,]  -0.8563415  -0.7102768  -0.9309155   . -1.8743504  0.4300557
>>   [3,]   1.0558159  -0.5956583   1.2689806   .  2.7292249  0.2608300
>>   [4,]   0.7547356   0.1465714   0.1798959   . -0.1778017  1.3417423
>>   [5,]   0.8037360  -2.7081809   0.9766657   . -0.9902788  0.1741957
>>...   .   .   .   .  .  .
>> [996,]  0.67220752  0.07804320 -0.38743454   .  0.4438639 -0.8130713
>> [997,] -0.67349962 -1.15292067 -0.54505567   .  0.4630923 -1.6287694
>> [998,]  0.03374595 -1.68061325 -0.88458368   . -0.2890962  0.2552267
>> [999,]  0.47861492  1.25530912  0.19436708   . -0.5193121 -1.1695501
>> [1000,]  1.52819218  2.23253275 -1.22051720   . -1.0342430 -0.1703396
>> 
>>> A <- array(rnorm(1e6), c(50, 20, 10, 100))
>>> glance(A)
>> <50 x 20 x 10 x 100> array object of type "double":
>> ,,1,1
>>[,1]   [,2]   [,3] ...  [,19]  [,20]
>> [1,] 0.78319619 0.82258390 0.09122269   .  1.7288189  0.7968574
>> [2,] 2.80687459 0.63709640 0.80844430   . -0.3963161 -1.2768284
>>  ...  .  .  .   .  .  .
>> [49,] -1.0696320 -0.1698111  2.0082890   .  0.4488292  0.5215745
>> [50,] -0.7012526 -2.0818229  0.7750518   .  0.3189076  0.1437394
>> 
>> ...
>> 
>> ,,10,100
>>[,1]   [,2]   [,3] ...  [,19]  [,20]
>> [1,]  0.5360649  0.5491561 -0.4098350   .  0.7647435  0.5640699
>> [2,]  0.7924093 -0.7395815 -1.3792913   .  0.1980287 -0.2897026
>>  ...  .  .  .   .  .  .
>> [49,]  0.6266209  0.3778512  1.4995778   . -0.3820651 -1.4241691
>> [50,]  1.9218715  3.5475949  0.5963763   .  0.4005210  0.4385623
>> 
>> H.
>> 
>> 
>> On 9/16/19 00:54, Michael Chirico wrote:
>>> Awesome. Gabe, since you already have a workshopped version, would you like
>>> to proceed? Feel free to ping me to review the patch once it's posted.
>>> 
>>> On Mon, Sep 16, 2019 at 3:26 PM Martin Maechler 
>>> wrote:
>>> 
>>>>>>>>> Michael Chirico
>>>>>>>>>on Sun, 15 Sep 2019 20:52:34 +0800 writes:
>>>> 
>>>>> Finally read in detail your response Gabe. Looks great,
>>>>> and I agree it's quite intuitive, as well as agree against
>>>>> non-recycling.
>>>> 
>>>>> Once the length(n) == length(dim(x)) behavior is enabled,
>>>>> I don't think there's any need/desire to have head() do
>>>>> x[1:6,1:6] anymore. head(x, c(6, 6)) is quite clear for
>>>>> those familiar with head(x, 6), it would seem to me.
>>>> 
>>>>> Mike C
>>>> 
>>>> Thank you, Gabe, and Michael.
>>>> I did like Gabe's proposal already back in July but was
>>>> busy and/or vacationing then ...
>>>> 
>>>> If you submit this with a patch (that includes changes to both
>>>> *.R and *.Rd , including some example) as "wishlist" item to R's
>>>> bugzilla, I'm willing/happy to check and commit this to R-devel.
>>>> 
>>>> Martin
>>>> 
>>>> 
>>>>> On Sat, Jul 13, 2019 at 

Re: [Rd] head.matrix can return 1000s of columns -- limit to n or add new argument?

2019-09-17 Thread Fox, John
Dear Herve,

The brief() generic function in the car package does something very similar to 
that for data frames (and has methods for other classes of objects as well).

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Sep 17, 2019, at 2:52 AM, Pages, Herve  wrote:
> 
> Hi,
> 
> Alternatively, how about a new glance() generic that would do something 
> like this:
> 
>> library(DelayedArray)
>> glance <- DelayedArray:::show_compact_array
> 
>> M <- matrix(rnorm(1e6), nrow = 1000L, ncol = 2000L)
>> glance(M)
> <1000 x 2000> matrix object of type "double":
>             [,1]        [,2]        [,3] ...     [,1999]     [,2000]
>[1,]  -0.8854896   1.8010288   1.3051341   . -0.4473593  0.4684985
>[2,]  -0.8563415  -0.7102768  -0.9309155   . -1.8743504  0.4300557
>[3,]   1.0558159  -0.5956583   1.2689806   .  2.7292249  0.2608300
>[4,]   0.7547356   0.1465714   0.1798959   . -0.1778017  1.3417423
>[5,]   0.8037360  -2.7081809   0.9766657   . -0.9902788  0.1741957
> ...   .   .   .   .  .  .
>  [996,]  0.67220752  0.07804320 -0.38743454   .  0.4438639 -0.8130713
>  [997,] -0.67349962 -1.15292067 -0.54505567   .  0.4630923 -1.6287694
>  [998,]  0.03374595 -1.68061325 -0.88458368   . -0.2890962  0.2552267
>  [999,]  0.47861492  1.25530912  0.19436708   . -0.5193121 -1.1695501
> [1000,]  1.52819218  2.23253275 -1.22051720   . -1.0342430 -0.1703396
> 
>> A <- array(rnorm(1e6), c(50, 20, 10, 100))
>> glance(A)
> <50 x 20 x 10 x 100> array object of type "double":
> ,,1,1
> [,1]   [,2]   [,3] ...  [,19]  [,20]
>  [1,] 0.78319619 0.82258390 0.09122269   .  1.7288189  0.7968574
>  [2,] 2.80687459 0.63709640 0.80844430   . -0.3963161 -1.2768284
>   ...  .  .  .   .  .  .
> [49,] -1.0696320 -0.1698111  2.0082890   .  0.4488292  0.5215745
> [50,] -0.7012526 -2.0818229  0.7750518   .  0.3189076  0.1437394
> 
> ...
> 
> ,,10,100
> [,1]   [,2]   [,3] ...  [,19]  [,20]
>  [1,]  0.5360649  0.5491561 -0.4098350   .  0.7647435  0.5640699
>  [2,]  0.7924093 -0.7395815 -1.3792913   .  0.1980287 -0.2897026
>   ...  .  .  .   .  .  .
> [49,]  0.6266209  0.3778512  1.4995778   . -0.3820651 -1.4241691
> [50,]  1.9218715  3.5475949  0.5963763   .  0.4005210  0.4385623
> 
> H.
> 
> 
> On 9/16/19 00:54, Michael Chirico wrote:
>> Awesome. Gabe, since you already have a workshopped version, would you like
>> to proceed? Feel free to ping me to review the patch once it's posted.
>> 
>> On Mon, Sep 16, 2019 at 3:26 PM Martin Maechler 
>> wrote:
>> 
>>>>>>>> Michael Chirico
>>>>>>>>     on Sun, 15 Sep 2019 20:52:34 +0800 writes:
>>> 
>>>> Finally read in detail your response Gabe. Looks great,
>>>> and I agree it's quite intuitive, as well as agree against
>>>> non-recycling.
>>> 
>>>> Once the length(n) == length(dim(x)) behavior is enabled,
>>>> I don't think there's any need/desire to have head() do
>>>> x[1:6,1:6] anymore. head(x, c(6, 6)) is quite clear for
>>>> those familiar with head(x, 6), it would seem to me.
>>> 
>>>> Mike C
>>> 
>>> Thank you, Gabe, and Michael.
>>> I did like Gabe's proposal already back in July but was
>>> busy and/or vacationing then ...
>>> 
>>> If you submit this with a patch (that includes changes to both
>>> *.R and *.Rd , including some example) as "wishlist" item to R's
>>> bugzilla, I'm willing/happy to check and commit this to R-devel.
>>> 
>>> Martin
>>> 
>>> 
>>>> On Sat, Jul 13, 2019 at 8:35 AM Gabriel Becker
>>>>  wrote:
>>> 
> Hi Michael and Abby,
> 
> So one thing that could happen that would be backwards
> compatible (with the exception of something that was an
> error no longer being an error) is head and tail could
> take vectors of length (dim(x)) rather than integers of
> length for n, with the default being n=6 being equivalent
> to n = c(6, dim(x)[2], <...>, dim(x)[k]), at least for
> the deprecation cycle, if not permanently. It not
> recycling would be unexpected based on the behavior of
> many R functions but would preserve the current behavior
> while granting more fine-grained control to users that
> feel they need it.
> 
> A rapidly thrown-together prototype of such a method for
> the head of a matrix case is as follows:
> 
> head2 = function(x, n = 6L, ...) {
>     indvecs = lapply(seq_along(dim(x)), function(i) {
>         if (length(n) >= i) {
>             ni = n[i]
>         } else {
>             ni = dim(x)[i]
>         }
>         if (ni < 0L)
>             ni = max(nrow(x) + ni, 0L)
>         else ni = min(ni, dim(x)[i])
>         seq_len(ni)
>     })
>     lstargs = c(list(x), indvecs, drop = FALSE)
>     do.call("[", lstargs)
> }
> 
> 
>> mat = matrix(1:100, 10, 10)
> 
>> head(mat)
> 
> [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
> 
> [1,] 1 1
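For readers following along, the proposal can be sketched standalone (the name head_mat is hypothetical and only matrices are handled; unspecified entries of n mean the full extent of that dimension, as in the prototype quoted above):

```r
## Hypothetical sketch of head() with a per-dimension n, matrices only
head_mat <- function(x, n = 6L) {
    stopifnot(is.matrix(x))
    d <- dim(x)
    ni <- vapply(seq_along(d), function(i) {
        k <- if (length(n) >= i) n[i] else d[i]  # unspecified dims: keep all
        if (k < 0) max(d[i] + k, 0) else min(k, d[i])
    }, numeric(1L))
    x[seq_len(ni[1L]), seq_len(ni[2L]), drop = FALSE]
}

mat <- matrix(1:100, 10, 10)
dim(head_mat(mat))           # 6 rows, all 10 columns
dim(head_mat(mat, c(6, 6)))  # a 6 x 6 corner
```

Negative entries work as in head(), counting from the end of that dimension.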

Re: [Rd] inconsistent handling of factor, character, and logical predictors in lm()

2019-08-31 Thread Fox, John
Dear Bill,

Thanks for pointing this difference out -- I was unaware of it.

I think that the difference occurs in model.matrix.default(), which coerces 
character variables but not logical variables to factors. Later it treats both 
factors and logical variables as "factors" in that it applies contrasts to 
both, but unused factor levels are dropped while an unused logical level is not.

I don't see why logical variables shouldn't be treated just as character 
variables are currently, both with respect to single levels (whether this is 
considered an error or as collinear with the intercept and thus gets an NA 
coefficient) and with respect to $xlevels.
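A minimal sketch that makes the asymmetry visible (the toy data frame d is mine, not from the thread):

```r
## Toy data: one single-valued logical and one single-valued character
d <- data.frame(y  = rnorm(3),
                lg = c(TRUE, TRUE, TRUE),
                ch = c("a", "a", "a"),
                stringsAsFactors = FALSE)

## Logical: both levels are kept, so the lgTRUE column is built anyway
## (all 1s, collinear with the intercept; lm() would report NA for it)
model.matrix(~ lg, d)

## Character: coerced to a one-level factor, and contrasts fail
try(model.matrix(~ ch, d))
```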

Best,
 John

> On Aug 31, 2019, at 1:21 PM, William Dunlap via R-devel 
>  wrote:
> 
>> Functions like lm() treat logical predictors as factors, *not* as
>> numerical variables.
> 
> Not quite.  A factor with all elements the same causes lm() to give an
> error while a logical of all TRUEs or all FALSEs just omits it from the
> model (it gets a coefficient of NA).  This is a fairly common situation
> when you fit models to subsets of a big data.frame.  This is an argument
> for fixing the single-valued-factor problem, which would become more
> noticeable if logicals were treated as factors.
> 
>> d <- data.frame(Age=c(2,4,6,8,10), Weight=c(878, 890, 930, 800, 750),
> Diseased=c(FALSE,FALSE,FALSE,TRUE,TRUE))
>> coef(lm(data=d, Weight ~ Age + Diseased))
> (Intercept)  Age DiseasedTRUE
>877.7333   5.4000-151.
>> coef(lm(data=d, Weight ~ Age + factor(Diseased)))
> (Intercept)  Age factor(Diseased)TRUE
>877.7333   5.4000-151.
>> coef(lm(data=d, Weight ~ Age + Diseased, subset=Age<7))
> (Intercept)  Age DiseasedTRUE
>847.  13.   NA
>> coef(lm(data=d, Weight ~ Age + factor(Diseased), subset=Age<7))
> Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>  contrasts can be applied only to factors with 2 or more levels
>> coef(lm(data=d, Weight ~ Age + factor(Diseased, levels=c(FALSE,TRUE)),
> subset=Age<7))
> Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
>  contrasts can be applied only to factors with 2 or more levels
> 
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
> 
> 
> On Sat, Aug 31, 2019 at 8:54 AM Fox, John  wrote:
> 
>> Dear Abby,
>> 
>>> On Aug 30, 2019, at 8:20 PM, Abby Spurdle  wrote:
>>> 
>>>> I think that it would be better to handle factors, character
>>>> predictors, and logical predictors consistently.
>>> 
>>> "logical predictors" can be regarded as categorical or continuous (i.e.
>>> 0 or 1).
>>> And the model matrix should be the same, either way.
>> 
>> I think that you're mistaking a coincidence for a principle. The
>> coincidence is that FALSE/TRUE coerces to 0/1 and sorts to FALSE, TRUE.
>> Functions like lm() treat logical predictors as factors, *not* as numerical
>> variables.
>> 
>> That one would get the same coefficient in either case is a consequence of
>> the coincidence and the fact that the default contrasts for unordered
>> factors are contr.treatment(). For example, if you changed the contrasts
>> option, you'd get a different estimate (though of course a model with the
>> same fit to the data and an equivalent interpretation):
>> 
>> --- snip ---
>> 
>>> options(contrasts=c("contr.sum", "contr.poly"))
>>> m3 <- lm(Sepal.Length ~ Sepal.Width + I(Species == "setosa"), data=iris)
>>> m3
>> 
>> Call:
>> lm(formula = Sepal.Length ~ Sepal.Width + I(Species == "setosa"),
>>data = iris)
>> 
>> Coefficients:
>>(Intercept)  Sepal.Width  I(Species == "setosa")1
>> 2.6672   0.9418   0.8898
>> 
>>> head(model.matrix(m3))
>>  (Intercept) Sepal.Width I(Species == "setosa")1
>> 1   1 3.5  -1
>> 2   1 3.0  -1
>> 3   1 3.2  -1
>> 4   1 3.1  -1
>> 5   1 3.6  -1
>> 6   1 3.9  -1
>>> tail(model.matrix(m3))
>>(Intercept) Sepal.Width I(Species == "setosa")1
>> 145   1 3.3   1
>> 146   1 3.0   1
>> 147 

Re: [Rd] inconsistent handling of factor, character, and logical predictors in lm()

2019-08-31 Thread Fox, John
Dear Abby,

> On Aug 30, 2019, at 8:20 PM, Abby Spurdle  wrote:
> 
>> I think that it would be better to handle factors, character predictors, and 
>> logical predictors consistently.
> 
> "logical predictors" can be regarded as categorical or continuous (i.e. 0 or 
> 1).
> And the model matrix should be the same, either way.

I think that you're mistaking a coincidence for a principle. The coincidence is 
that FALSE/TRUE coerces to 0/1 and sorts to FALSE, TRUE. Functions like lm() 
treat logical predictors as factors, *not* as numerical variables. 

That one would get the same coefficient in either case is a consequence of the 
coincidence and the fact that the default contrasts for unordered factors are 
contr.treatment(). For example, if you changed the contrasts option, you'd get 
a different estimate (though of course a model with the same fit to the data 
and an equivalent interpretation):

--- snip ---

> options(contrasts=c("contr.sum", "contr.poly"))
> m3 <- lm(Sepal.Length ~ Sepal.Width + I(Species == "setosa"), data=iris)
> m3

Call:
lm(formula = Sepal.Length ~ Sepal.Width + I(Species == "setosa"), 
data = iris)

Coefficients:
(Intercept)  Sepal.Width  I(Species == "setosa")1  
 2.6672   0.9418   0.8898  

> head(model.matrix(m3))
  (Intercept) Sepal.Width I(Species == "setosa")1
1   1 3.5  -1
2   1 3.0  -1
3   1 3.2  -1
4   1 3.1  -1
5   1 3.6  -1
6   1 3.9  -1
> tail(model.matrix(m3))
(Intercept) Sepal.Width I(Species == "setosa")1
145   1 3.3   1
146   1 3.0   1
147   1 2.5   1
148   1 3.0   1
149   1 3.4   1
150   1 3.0   1

> lm(Sepal.Length ~ Sepal.Width + as.numeric(Species == "setosa"), data=iris)

Call:
lm(formula = Sepal.Length ~ Sepal.Width + as.numeric(Species == 
"setosa"), data = iris)

Coefficients:
(Intercept)  Sepal.Width  as.numeric(Species == "setosa")  
     3.5571       0.9418                          -1.7797  

> -2*coef(m3)[3]
I(Species == "setosa")1 
  -1.779657 

--- snip ---


> 
> I think the first question to be asked is, which is the best approach, 
> categorical or continuous?
> The continuous approach seems simpler and more efficient to me, but
> output from the categorical approach may be more intuitive, for some
> people.

I think that this misses the point I was trying to make: lm() et al. treat 
logical variables as factors, not as numerical predictors. One could argue 
about what's the better approach but not about what lm() does. BTW, I prefer 
treating a logical predictor as a factor because the predictor is essentially 
categorical.

> 
> I note that the use factors and characters, doesn't necessarily
> produce consistent output, for $xlevels.
> (Because factors can have their levels re-ordered).

Again, this misses the point: Both factors and character predictors produce 
elements in $xlevels; logical predictors do not, even though they are treated 
in the model as factors. That factors have levels that aren't necessarily 
ordered alphabetically is a reason that I prefer using factors to using 
character predictors, but this has nothing to do with the point I was trying to 
make about $xlevels.

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] inconsistent handling of factor, character, and logical predictors in lm()

2019-08-30 Thread Fox, John
Dear R-devel list members,

I've discovered an inconsistency in how lm() and similar functions handle 
logical predictors as opposed to factor or character predictors. An "lm" object 
for a model that includes factor or character predictors includes the levels of 
a factor or unique values of a character predictor in the $xlevels component of 
the object, but not the FALSE/TRUE values for a logical predictor even though 
the latter is treated as a factor in the fit.

For example:

--- snip ---

> m1 <- lm(Sepal.Length ~ Sepal.Width + Species, data=iris)
> m1$xlevels
$Species
[1] "setosa" "versicolor" "virginica" 
 
> m2 <- lm(Sepal.Length ~ Sepal.Width + as.character(Species), data=iris)
> m2$xlevels
$`as.character(Species)`
[1] "setosa" "versicolor" "virginica" 

> m3 <- lm(Sepal.Length ~ Sepal.Width + I(Species == "setosa"), data=iris)
> m3$xlevels
named list()

> m3

Call:
lm(formula = Sepal.Length ~ Sepal.Width + I(Species == "setosa"), 
data = iris)

Coefficients:
(Intercept)  Sepal.Width  I(Species == "setosa")TRUE  
     3.5571       0.9418                     -1.7797  

--- snip ---

I believe that the culprit is .getXlevels(), which makes provision for factor 
and character predictors but not for logical predictors:

--- snip ---

> .getXlevels
function (Terms, m) 
{
    xvars <- vapply(attr(Terms, "variables"), deparse2, "")[-1L]
    if ((yvar <- attr(Terms, "response")) > 0) 
        xvars <- xvars[-yvar]
    if (length(xvars)) {
        xlev <- lapply(m[xvars], function(x)
            if (is.factor(x)) levels(x)
            else if (is.character(x)) levels(as.factor(x)))
        xlev[!vapply(xlev, is.null, NA)]
    }
}

--- snip ---

It would be simple to modify the last test in .getXlevels to 

else if (is.character(x) || is.logical(x))

which would cause .getXlevels() to return c("FALSE", "TRUE") (assuming both 
values are present in the data). I'd find that sufficient, but alternatively 
there could be a separate test for logical predictors that returns c(FALSE, 
TRUE).
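A self-contained sketch of the helper with that change applied (the name .getXlevels2 is mine and deparse2() is simplified; the real function is unexported from the stats namespace):

```r
.getXlevels2 <- function(Terms, m) {
    deparse2 <- function(x)  # simplified stand-in for stats' internal deparse2()
        paste(deparse(x, width.cutoff = 500L), collapse = " ")
    xvars <- vapply(attr(Terms, "variables"), deparse2, "")[-1L]
    if ((yvar <- attr(Terms, "response")) > 0)
        xvars <- xvars[-yvar]
    if (length(xvars)) {
        xlev <- lapply(m[xvars], function(x)
            if (is.factor(x)) levels(x)
            else if (is.character(x) || is.logical(x))  # logicals now included
                levels(as.factor(x)))
        xlev[!vapply(xlev, is.null, NA)]
    }
}

m3 <- lm(Sepal.Length ~ Sepal.Width + I(Species == "setosa"), data = iris)
.getXlevels2(terms(m3), model.frame(m3))  # the logical term now contributes levels
```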

I discovered this issue when a function in the effects package failed for a 
model with a logical predictor. Although it's possible to program around the 
problem, I think that it would be better to handle factors, character 
predictors, and logical predictors consistently.

Best,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Could generic functions check different S3 methods for an object when one of them produces an error?

2019-06-18 Thread Fox, John
Dear Iago,

The R S3 object system works as expected here, using the first available method 
processing the class vector from left to right. The problem is that the broom 
package doesn't export the confint.geeglm() method but rather reserves it for 
internal use. I can't think why the package authors chose to do that but you 
could ask them. The following therefore works (following on with your example):

> confint.geeglm <- broom:::confint.geeglm
> confint(geefit)
                    lwr         upr
(Intercept) 3607.595981 5204.790222
Frost         -2.723317    6.097207
Murder       -84.166856   38.838155
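The left-to-right rule itself is easy to verify with a toy sketch (the empty object and one-line method below are hypothetical):

```r
## S3 dispatch walks the class vector from left to right
obj <- structure(list(), class = c("geeglm", "gee", "glm", "lm"))
confint.geeglm <- function(object, parm, level = 0.95, ...) "geeglm method"
confint(obj)  # dispatches on "geeglm", the leftmost class with a method
```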

I hope this helps,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Jun 17, 2019, at 3:56 AM, IAGO GINÉ VÁZQUEZ  wrote:
> 
> Hi,
> 
> 
> Let's say one has an object with multiple classes, and a generic function to 
> apply to it has associated S3 methods for more than one of those classes. 
> Further, the method it chooses (I do not know how; some order in the class 
> vector?) is not the suitable one and it produces an error. Would there be 
> some way to make the generic function choose the correct method, or, in 
> case it produces an error for any method chosen, to try another one?
> 
> 
> For example (commented in detail 
> here):
> 
> 
> # object with multiple classes: the output of function `geepack::geeglm`. The 
> output of `class(object)`:
> 
> ```
> 
> [1] "geeglm" "gee" "glm" "lm"
> 
> ```
> 
> The generic function: `stats::confint`.
> 
> The S3 method chosen: `confint.glm`. It produces an error. The correct method 
> in this case would be `broom:::confint.geeglm`.
> 
> 
> Thank you!
> 
> Iago
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] default for 'signif.stars'

2019-03-28 Thread Fox, John
Dear all,

I agree with both Russ and Terry that the significance stars option should 
default to FALSE. Here's what Sandy Weisberg and I say about significance 
stars in the current edition of the R Companion to Applied Regression:

'If you find the “statistical-significance” asterisks that R prints to 
the right of the p-values annoying, as we do, you can suppress them, as we will 
in the remainder of the R Companion, by entering the command: 
options(show.signif.stars=FALSE).'

This is a rare case in which I find myself disagreeing with Martin, whose 
arguments are almost invariably careful and considered. In particular, the 
crude discretization of p-values into several categories seems a poor 
visualization to me, and in any event "scanning" many p-values quickly, which 
is the use-case that Martin cites, avoids serious issues of simultaneous 
inference.

Best,
 John

> -----Original Message-----
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of 
> Therneau, Terry M., Ph.D. via R-devel
> Sent: Thursday, March 28, 2019 9:28 AM
> To: r-devel@r-project.org
> Subject: Re: [Rd] default for 'signif.stars'
> 
> The addition of significant stars was, in my opinion, one of the worst 
> defaults ever added to R.   I would be delighted to see it removed, or 
> at least change the default.  It is one of the few overrides that I 
> have argued to add to our site- wide defaults file.
> 
> My bias comes from 30+ years in a medical statistics career where 
> fighting the disease of "dichotomania" has been an eternal struggle.  
> Continuous covariates are split in two, nuanced risk scores are 
> thresholded, decisions become yes/no, ...  Adding stars to output
> is, to me, simply a gateway drug to this pernicious addiction.  We shouldn't
> encourage it.
> 
> Wrt Abe's rant about the Nature article:  I've read the article and 
> found it to be well reasoned, and I can't say the same about the rant.   
> The issue in biomedical science is that the p-value has fallen victim to 
> Goodhart's law:
> "When a measure becomes a target, it ceases to be a good measure."  
> The article argues, and I would agree, that the .05 yes/no decision 
> rule is currently doing more harm than good in biomedical research.   
> What to do instead of this is a tough question, but it is fairly clear 
> that the current plan isn't working.   I have seen many cases of two 
> papers which both found a risk increase of 1.9 for something where one 
> paper claimed "smoking gun" and the other "completely exonerated".   
> Do YOU want to take a drug with 2x risk and a p= 0.2 'proof' that it 
> is okay?   Of course, if there is too much to do and too little time, 
> people will find a way to create a shortcut yes/no rule no matter what 
> we preach.   (We statisticians will do it
> too.)
> 
> Terry T.
> 
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] model.matrix.default() silently ignores bad contrasts.arg

2019-02-22 Thread Fox, John
Dear Martin and Ben,

I agree that a warning is a good idea (and perhaps that wasn't clear in my 
response to Ben's post). 

Also, it would be nice to correct the omission in the help file, which as far 
as I could see doesn't mention that a contrast-generating function (as opposed 
to its quoted name) can be an element of the contrasts.arg list.

Best,
 John

> -----Original Message-----
> From: Martin Maechler [mailto:maech...@stat.math.ethz.ch]
> Sent: Friday, February 22, 2019 11:50 AM
> To: Ben Bolker 
> Cc: Fox, John ; r-devel@r-project.org
> Subject: Re: [Rd] model.matrix.default() silently ignores bad contrasts.arg
> 
> >>>>> Ben Bolker
> >>>>> on Thu, 21 Feb 2019 08:18:51 -0500 writes:
> 
> > On Thu, Feb 21, 2019 at 7:49 AM Fox, John  wrote:
> >>
> >> Dear Ben,
> >>
> >> Perhaps I'm missing the point, but contrasts.arg is documented to be a
> >> list. From ?model.matrix: "contrasts.arg: A list, whose entries are values
> >> (numeric matrices or character strings naming functions) to be used as
> >> replacement values for the contrasts replacement function and whose
> >> names are the names of columns of data containing factors."
> 
> > I absolutely agree that this is not a bug/behaves as documented (I
> > could have said that more clearly).  It's just that (for reasons I
> > attempted to explain) this is a really easy mistake to make.
> 
> >> This isn't entirely accurate because a function also works as a named
> >> element of the list (in addition to a character string naming a function and a
> >> contrast matrix), as your example demonstrates, but nowhere that I'm
> >> aware of is it suggested that a non-list should work.
> >>
> >> It certainly would be an improvement if specifying contrast.arg as a
> >> non-list generated an error or warning message, and it at least arguably
> >> would be convenient to allow a general contrast specification such as
> >> contrasts.arg="contr.sum", but I don't see a bug here.
> 
> > I agree.  That's what my patch does (throws a warning message if
> > contrasts.arg is non-NULL and not a list).
> 
> I currently do think this is a good idea... "even though" I'm 99% sure that
> this will make work for package maintainers and others whose code may
> suddenly show warnings.
> I hope they would know better than suppressWarnings(.) ...
> 
> I see a version of the patch using old style indentation which makes the diff
> even "considerably" smaller -- no need to submit this differently, though --
> and I plan to test that a bit, and commit eventually to R-devel, possibly in
> 5 days or so.
> 
> Thank you Ben for the suggestion and patch !
> Martin
> 
> > cheers
> > Ben Bolker
> 
> >> Best,
> >> John
> >>
> >> -
> >> John Fox, Professor Emeritus
> >> McMaster University
> >> Hamilton, Ontario, Canada
> >> Web: http://socserv.mcmaster.ca/jfox
> >>
> >> > On Feb 20, 2019, at 7:14 PM, Ben Bolker  wrote:
> >> >
> >> > An lme4 user pointed out <https://github.com/lme4/lme4/issues/491> that
> >> > passing contrasts as a string or symbol to [g]lmer (which would work if
> >> > we were using `contrasts<-` to set contrasts on a factor variable) is
> >> > *silently ignored*. This goes back to model.matrix(), and seems bad
> >> > (this is a very easy mistake to make, because of the multitude of ways
> >> > to specify contrasts for factors in R - e.g. options(contrasts=...);
> >> > setting contrasts on the specific factors; passing contrasts as a list
> >> > to the model function ... )
> >> >
> >> > The relevant code is here:
> >> >
> >> > https://github.com/wch/r-source/blob/trunk/src/library/stats/R/models.R#L578-L603
> >> >
> >> > The following code shows the problem: a plain-vanilla model.matrix()
> >> > call with no contrasts argument, followed by two wrong contrasts
> >> > arguments, followed by a correct contrasts argument.
> >> >
> >> > data(cbpp, package="lme4")
> >> > mf1 <- model.matrix(~period, data=cbpp)
> >> > mf2 <- model.matrix(~period, contrasts.arg="contr.sum", data=cbpp)

Re: [Rd] Return/print standard error in t.test()

2019-02-21 Thread Fox, John
Dear Thomas,

it is, unfortunately, not that simple. t.test() returns an object of class 
"htest" and not all such objects have standard errors. I'm not entirely sure 
what the point is since it's easy to compute the standard error of the 
difference from the information in the object (adapting an example from 
?t.test):

> (res <- t.test(1:10, y = c(7:20)))

Welch Two Sample t-test

data:  1:10 and c(7:20)
t = -5.4349, df = 21.982, p-value = 1.855e-05
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -11.052802  -4.947198
sample estimates:
mean of x mean of y 
  5.5  13.5 

> as.vector(abs(diff(res$estimate)/res$statistic)) # SE
[1] 1.47196
> class(res)
[1] "htest"

and if you really want to print the SE as a matter of course, you could always 
write your own wrapper for t.test() that returns an object of class, say, 
"t.test" for which you can provide a print() method. Much of the advantage of 
working in a statistical computing environment like R (or Stata, for that 
matter) is that you can make things work the way you like.
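[Editor's note: a minimal sketch of such a wrapper. The class name "t_test", the extra stderr field, and the print layout are illustrative choices, not part of stats; the SE formula only applies to the two-sample case, and newer versions of R may already store a stderr component.]

```r
## Sketch: wrap t.test() so the standard error of the difference is
## stored on the result and shown when it is printed.
t_test <- function(...) {
  res <- t.test(...)
  ## two-sample case: recover the SE from the "htest" components,
  ## exactly as in the hand computation above
  res$stderr <- as.vector(abs(diff(res$estimate) / res$statistic))
  class(res) <- c("t_test", class(res))
  res
}

print.t_test <- function(x, ...) {
  NextMethod()                         # the usual "htest" printout
  cat("standard error of difference:", format(x$stderr), "\n")
  invisible(x)
}

res <- t_test(1:10, y = c(7:20))
res$stderr  # 1.47196, matching the computation shown above
```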

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Feb 21, 2019, at 3:57 PM, Thomas J. Leeper  wrote:
> 
> A recent thread on Twitter [1] by a Stata user highlighted that t.test()
> does not return or print the standard error of the mean difference, despite
> it being calculated by the function.
> 
> I know this isn’t the kind of change that’s likely to be made but could we
> at least return the SE even if the print() method isn’t updated? Or,
> better, update the print() method to display this as well?
> 
> Best,
> Thomas
> 
> [1]
> https://twitter.com/amandayagan/status/1098314654470819840?s=21
> -- 
> 
> Thomas J. Leeper
> http://www.thomasleeper.com
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] model.matrix.default() silently ignores bad contrasts.arg

2019-02-21 Thread Fox, John
Dear Ben,

Perhaps I'm missing the point, but contrasts.arg is documented to be a list. 
From ?model.matrix: "contrasts.arg: A list, whose entries are values (numeric 
matrices or character strings naming functions) to be used as replacement 
values for the contrasts replacement function and whose names are the names of 
columns of data containing factors." 

This isn't entirely accurate because a function also works as a named element 
of the list (in addition to a character string naming a function and a contrast 
matrix), as your example demonstrates, but nowhere that I'm aware of is it 
suggested that a non-list should work.

It certainly would be an improvement if specifying contrast.arg as a non-list 
generated an error or warning message, and it at least arguably would be 
convenient to allow a general contrast specification such as 
contrasts.arg="contr.sum", but I don't see a bug here.

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Feb 20, 2019, at 7:14 PM, Ben Bolker  wrote:
> 
> An lme4 user pointed out <https://github.com/lme4/lme4/issues/491> that
> passing contrasts as a string or symbol to [g]lmer (which would work if
> we were using `contrasts<-` to set contrasts on a factor variable) is
> *silently ignored*. This goes back to model.matrix(), and seems bad
> (this is a very easy mistake to make, because of the multitude of ways
> to specify contrasts for factors in R  - e.g. options(contrasts=...);
> setting contrasts on the specific factors; passing contrasts as a list
> to the model function ... )
> 
> The relevant code is here:
> 
> https://github.com/wch/r-source/blob/trunk/src/library/stats/R/models.R#L578-L603
> 
> The following code shows the problem: a plain-vanilla model.matrix()
> call with no contrasts argument, followed by two wrong contrasts
> arguments, followed by a correct contrasts argument.
> 
> data(cbpp, package="lme4")
> mf1 <- model.matrix(~period, data=cbpp)
> mf2 <- model.matrix(~period, contrasts.arg="contr.sum", data=cbpp)
> all.equal(mf1,mf2) ## TRUE
> mf3 <- model.matrix(~period, contrasts.arg=contr.sum, data=cbpp)
> all.equal(mf1,mf3)  ## TRUE
> mf4 <- model.matrix(~period, contrasts.arg=list(period=contr.sum),
> data=cbpp)
> isTRUE(all.equal(mf1,mf4))  ## FALSE
> 
> 
>  I've attached a potential patch for this, which is IMO the mildest
> possible case (if contrasts.arg is non-NULL and not a list, it produces
> a warning).  I haven't been able to test it because of some mysterious
> issues I'm having with re-making R properly ...
> 
>  Thoughts?  Should I submit this as a bug report/patch?
> 
>  cheers
>   Ben Bolker
> 
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
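
[Editor's note: the check Ben proposes could look roughly like the following. This is a sketch of the idea as a standalone helper, not the actual patch to model.matrix.default(); the function name and warning wording are illustrative.]

```r
## Sketch of the proposed guard: warn when contrasts.arg is supplied
## but is not a list, instead of silently ignoring it.
check_contrasts_arg <- function(contrasts.arg) {
  if (!is.null(contrasts.arg) && !is.list(contrasts.arg))
    warning("non-list contrasts argument ignored")
  invisible(contrasts.arg)
}

check_contrasts_arg("contr.sum")               # warns: would be silently ignored
check_contrasts_arg(list(period = contr.sum))  # silent: a valid specification
```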


Re: [Rd] formula(model.frame(..)) is misleading

2018-12-21 Thread Fox, John
Dear Martin,

Since no one else has picked up on this, I’ll take a crack at it: 

The proposal is to define the S3 class of model-frame objects as 
c(“model.frame”, “data.frame”) (not the formal class of these objects, even 
though this feature was coincidentally introduced in S4). That’s unlikely to do 
harm, since model frames would still “inherit” data.frame methods. 

It's possible that some packages rely on current data.frame methods that are 
eventually superseded by specific model.frame methods or do something peculiar 
with the class of model frames, so as far as I can see, one can’t know whether 
problems will arise before trying it.
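
For what it's worth, the formula recorded in the terms attribute can be recovered directly from a model frame even now; a sketch using the variables from Bill's example quoted below:

```r
## formula() on a model frame rebuilds a formula from the columns,
## but the terms attribute still records the formula actually used.
d <- data.frame(A = log(1:6), B = LETTERS[rep(1:2, c(2, 4))],
                C = 1/(1:6), D = rep(letters[25:26], c(4, 2)), Y = 1:6)
m0 <- model.frame(data = d, Y ~ A:B)

formula(m0)         # Y ~ A + B  (reconstructed from the columns)
formula(terms(m0))  # Y ~ A:B    (the formula that was supplied)
```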

I hope that helps,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Dec 21, 2018, at 2:51 AM, Martin Maechler  
> wrote:
> 
>> William Dunlap via R-devel 
>>on Thu, 20 Dec 2018 15:09:56 -0800 writes:
> 
>> When formula() is applied to the output of model.frame()
>> it ignores the formula in the model.frame's 'terms'
>> attribute:
> 
>>> d <- data.frame(A=log(1:6), B=LETTERS[rep(1:2,c(2,4))], C=1/(1:6),
>> +D=rep(letters[25:26],c(4,2)), Y=1:6)
>>> m0 <- model.frame(data=d, Y ~ A:B)
>>> formula(m0)
>>  Y ~ A + B
>>> `attributes<-`(terms(m0), value=NULL)
>>  Y ~ A:B
> 
>> This is in part because model.frame()'s output has class
>> "data.frame" instead of c("model.frame","data.frame"), as
>> SV4 did, so there are no methods for model.frames.
> 
>> Is there a reason that model.frame() returns a data.frame
>> with extra attributes but no special class or is it just
>> an oversight?
> 
> My guess is "oversight" || "well let's keep it simple"
> Do you (all readers) see a situation where it could harm now (with
> the 20'000 packages on CRAN+Bioc+...) to do as SV4 (S version 4) has been
> doing?
> 
> I'd be sympathetic to class()ing it.
> Martin
> 
>> Bill Dunlap TIBCO Software wdunlap tibco.com
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] logical variables in models

2018-12-19 Thread Fox, John
Dear R-devel list members,

This is an observation about how logical variables in models are handled, 
followed by questions.

As a general matter, character variables and logical variables are treated as 
if they were factors when they appear on the RHS of a model formula; for 
example:

- - - - snip- - - - -

> set.seed(123)
> c <- sample(letters[1:3], 10, replace=TRUE)
> f <- as.factor(sample(LETTERS[1:3], 10, replace=TRUE))
> L <- sample(c(TRUE, FALSE), 10, replace=TRUE)
> y <- rnorm(10)
> options(contrasts=c("contr.sum", "contr.poly"))
> mod <- lm(y ~ c + f + L)
> model.matrix(mod)
   (Intercept) c1 c2 f1 f2 L1
1            1  1  0 -1 -1  1
2            1 -1 -1  0  1  1
3            1  0  1 -1 -1  1
4            1 -1 -1  0  1  1
5            1 -1 -1  1  0  1
6            1  1  0 -1 -1  1
7            1  0  1  1  0  1
8            1 -1 -1  1  0  1
9            1  0  1  1  0 -1
10           1  0  1 -1 -1 -1
attr(,"assign")
[1] 0 1 1 2 2 3
attr(,"contrasts")
attr(,"contrasts")$c
[1] "contr.sum"

attr(,"contrasts")$f
[1] "contr.sum"

attr(,"contrasts")$L
[1] "contr.sum"

- - - - snip- - - - -

But logical variables don’t appear in the $xlevels component of the objects 
created by lm() and similar functions:

- - - - snip- - - - -

> mod$xlevels
$c
[1] "a" "b" "c"

$f
[1] "A" "B" "C"

- - - - snip- - - - -

Why the discrepancy? It’s true that the level-set (i.e., TRUE, FALSE) for a
logical “factor” is known, but examining the $xlevels component is a simple way
to detect variables treated as factors in the model. For example, I’d argue
that .getXlevels() returns misleading information:

- - - - snip- - - - -

> .getXlevels(terms(mod), model.frame(mod))
$c
[1] "a" "b" "c"

$f
[1] "A" "B" "C"

- - - - snip- - - - -

An alternative for detecting “factors” is to examine the 'contrasts' attribute 
of the model matrix, although that doesn’t produce levels:

- - - - snip- - - - -

> names(attr(model.matrix(mod), "contrasts"))
[1] "c" "f" "L"

- - - - snip- - - - -
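
[Editor's note: one further possibility, sketched below. The dataClasses attribute of the terms object records how each variable was treated, including logicals; the filter shown is an illustrative choice of class names.]

```r
## Reconstruct the model from the message above, then inspect dataClasses.
set.seed(123)
c <- sample(letters[1:3], 10, replace = TRUE)
f <- as.factor(sample(LETTERS[1:3], 10, replace = TRUE))
L <- sample(c(TRUE, FALSE), 10, replace = TRUE)
y <- rnorm(10)
options(contrasts = c("contr.sum", "contr.poly"))
mod <- lm(y ~ c + f + L)

## attr(, "dataClasses") distinguishes the logical predictor that
## $xlevels misses (one entry per model-frame variable)
cls <- attr(terms(mod), "dataClasses")
cls

## so variables treated as factors could be detected with, e.g.:
names(cls)[cls %in% c("factor", "ordered", "character", "logical")]
## "c" "f" "L"
```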

Is there an argument against making the treatment of logical variables
consistent with that of factors and character variables? Comments?

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Documentation examples for lm and glm

2018-12-17 Thread Fox, John
Dear Heinz,

> On Dec 17, 2018, at 10:19 AM, Heinz Tuechler  wrote:
> 
> Dear All,
> 
> do you think that use of a data argument is best practice in the example 
> below?

No, but it is *normally* or *usually* the best option, in my opinion.

Best,
 John

> 
> regards,
> 
> Heinz
> 
> ### trivial example
> plotwithline <- function(x, y) {
>plot(x, y)
>abline(lm(y~x)) ## data argument?
> }
> 
> set.seed(25)
> df0 <- data.frame(x=rnorm(20), y=rnorm(20))
> 
> plotwithline(df0[['x']], df0[['y']])
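
[Editor's note: a variant of Heinz's function that takes a data argument, sketched below; the function name and string-based interface are illustrative choices, so that both plot() and lm() evaluate in the supplied data frame.]

```r
## Sketch: pass the data frame itself, so lm() carries a proper data
## argument and the variables never leave the frame.
plotwithline2 <- function(data, xvar, yvar) {
  plot(data[[xvar]], data[[yvar]], xlab = xvar, ylab = yvar)
  ## reformulate("x", response = "y") builds y ~ x
  abline(lm(reformulate(xvar, response = yvar), data = data))
}

set.seed(25)
df0 <- data.frame(x = rnorm(20), y = rnorm(20))
plotwithline2(df0, "x", "y")
```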
> 
> 
> 
> Fox, John wrote/hat geschrieben on/am 17.12.2018 15:21:
>> Dear Martin,
>> 
>> I think that everyone agrees that it’s generally preferable to use the data 
>> argument to lm() and I have nothing significant to add to the substance of 
>> the discussion, but I think that it’s a mistake not to add to the current 
>> examples, for the following reasons:
>> 
>> (1) Relegating examples using the data argument to “see also” doesn’t 
>> suggest that using the argument is a best practice. Most users won’t bother 
>> to click the links.
>> 
>> (2) In my opinion, a new initial example using the data argument would more 
>> clearly suggest that this is normally the best option.
>> 
>> (3) I think that it would also be desirable to add a remark to the 
>> explanation of the data argument, something like, “Although the argument is 
>> optional, it's generally preferable to specify it explicitly.” And similarly 
>> on the help page for glm().
>> 
>> My two (or three) cents.
>> 
>> John
>> 
>>  -
>>  John Fox, Professor Emeritus
>>  McMaster University
>>  Hamilton, Ontario, Canada
>>  Web: http://socserv.mcmaster.ca/jfox
>> 
>>> On Dec 17, 2018, at 3:05 AM, Martin Maechler  
>>> wrote:
>>> 
>>>>>>>> David Hugh-Jones
>>>>>>>>   on Sat, 15 Dec 2018 08:47:28 +0100 writes:
>>> 
>>>> I would argue examples should encourage good
>>>> practice. Beginners ought to learn to keep data in data
>>>> frames and not to overuse attach().
>>> 
>>> Note there's no attach() there in any of these examples!
>>> 
>>>> otherwise at their own risk, but they have less need of
>>>> explicit examples.
>>> 
>>> The glm examples are nice insofar as they show both uses.
>>> 
>>> I agree the lm() example(s) are  "didactically misleading" by
>>> not using data frames at all.
>>> 
>>> I disagree that only data frame examples should be shown.
>>> If  lm()  is one of the first R functions a beginneR must use --
>>> because they are in a basic stats class, say --  it may be
>>> *better* didactically to focus on lm()  in the very first
>>> example, and use data frames in a next one ...
>>>  and instead of next one, we have the pretty clear comment
>>> 
>>> ### less simple examples in "See Also" above
>>> 
>>> I'm not convinced (but you can try more) we should change those
>>> examples or add more there.
>>> 
>>> Martin
>>> 
>>>> On Fri, 14 Dec 2018 at 14:51, S Ellison
>>>>  wrote:
>>> 
>>>>> FWIW, before all the examples are changed to data frame
>>>>> variants, I think there's fairly good reason to have at
>>>>> least _one_ example that does _not_ place variables in a
>>>>> data frame.
>>>>> 
>>>>> The data argument in lm() is optional. And there is more
>>>>> than one way to manage data in a project. I personally
>>>>> don't much like lots of stray variables lurking about,
>>>>> but if those are the only variables out there and we can
>>>>> be sure they aren't affected by other code, it's hardly
>>>>> essential to create a data frame to hold something you
>>>>> already have.  Also, attach() is still part of R, for
>>>>> those folk who have a data frame but want to reference
>>>>> the contents across a wider range of functions without
>>>>> using with() a lot. lm() can reasonably omit the data
>>>>> argument there, too.
>>>>> 
>>>>> So while there are good reasons to use data frames, there
>>>>> are also good reasons to provide examples that don't.
>>>

Re: [Rd] Documentation examples for lm and glm

2018-12-17 Thread Fox, John
Dear Steve,

Since this relates as well to the message I posted a couple of minutes before 
yours, I agree that it’s possible to phrase “best practices” too categorically. 
In the current case, I believe that it’s reasonable to say that specifying the 
data argument is “generally” or “usually” the best option. That doesn’t rule 
out exceptions.

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Dec 17, 2018, at 7:49 AM, S Ellison  wrote:
> 
> 
> 
>> From: Thomas Yee [mailto:t@auckland.ac.nz]
>> 
>> Thanks for the discussion. I do feel quite strongly that
>> the variables should always be a part of a data frame. 
> 
> This seems pretty much a decision for R core, and I think it's useful to have 
> raised the issue.
> 
> But I, er, feel strongly that strong feelings and 'always' are unsafe in a 
> best practice argument. 
> 
> First, other folk with different use-cases or work practice may see 'best 
> practice' quite differently. So I would pretty much always expect exceptions.
> 
> Second, for examples of capability, there are too many exceptions in this 
> instance. For example:
> glm() can take a two-column matrix as a single response variable. 
> lm() can take a matrix as a response variable. 
> lm() can take a complete data frame as a predictor (see ?stackloss)
> 
> None of these work naturally if everything is in a data frame, and some won’t 
> work at all.
> 
> Steve E
> 
> 
> 
> 
> ***
> This email and any attachments are confidential. Any use, copying or
> disclosure other than by the intended recipient is unauthorised. If 
> you have received this message in error, please notify the sender 
> immediately via +44(0)20 8943 7000 or notify postmas...@lgcgroup.com 
> and delete this message and any copies from your computer and network. 
> LGC Limited. Registered in England 2991879. 
> Registered office: Queens Road, Teddington, Middlesex, TW11 0LY, UK
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Documentation examples for lm and glm

2018-12-17 Thread Fox, John
Dear Martin,

I think that everyone agrees that it’s generally preferable to use the data 
argument to lm() and I have nothing significant to add to the substance of the 
discussion, but I think that it’s a mistake not to add to the current examples, 
for the following reasons:

(1) Relegating examples using the data argument to “see also” doesn’t suggest 
that using the argument is a best practice. Most users won’t bother to click 
the links.

(2) In my opinion, a new initial example using the data argument would more 
clearly suggest that this is normally the best option.

(3) I think that it would also be desirable to add a remark to the explanation 
of the data argument, something like, “Although the argument is optional, it's 
generally preferable to specify it explicitly.” And similarly on the help page 
for glm().

My two (or three) cents.

John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Dec 17, 2018, at 3:05 AM, Martin Maechler  
> wrote:
> 
>> David Hugh-Jones 
>>on Sat, 15 Dec 2018 08:47:28 +0100 writes:
> 
>> I would argue examples should encourage good
>> practice. Beginners ought to learn to keep data in data
>> frames and not to overuse attach(). 
> 
> Note there's no attach() there in any of these examples!
> 
>> otherwise at their own risk, but they have less need of
>> explicit examples.
> 
> The glm examples are nice insofar as they show both uses.
> 
> I agree the lm() example(s) are  "didactically misleading" by
> not using data frames at all.
> 
> I disagree that only data frame examples should be shown.
> If  lm()  is one of the first R functions a beginneR must use --
> because they are in a basic stats class, say --  it may be
> *better* didactically to focus on lm()  in the very first
> example, and use data frames in a next one ...
>  and instead of next one, we have the pretty clear comment
> 
>  ### less simple examples in "See Also" above
> 
> I'm not convinced (but you can try more) we should change those
> examples or add more there.
> 
> Martin
> 
>> On Fri, 14 Dec 2018 at 14:51, S Ellison
>>  wrote:
> 
>>> FWIW, before all the examples are changed to data frame
>>> variants, I think there's fairly good reason to have at
>>> least _one_ example that does _not_ place variables in a
>>> data frame.
>>> 
>>> The data argument in lm() is optional. And there is more
>>> than one way to manage data in a project. I personally
>>> don't much like lots of stray variables lurking about,
>>> but if those are the only variables out there and we can
>>> be sure they aren't affected by other code, it's hardly
>>> essential to create a data frame to hold something you
>>> already have.  Also, attach() is still part of R, for
>>> those folk who have a data frame but want to reference
>>> the contents across a wider range of functions without
>>> using with() a lot. lm() can reasonably omit the data
>>> argument there, too.
>>> 
>>> So while there are good reasons to use data frames, there
>>> are also good reasons to provide examples that don't.
>>> 
>>> Steve Ellison
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Ben Bolker
>>>> Sent: 13 December 2018 20:36
>>>> To: r-devel@r-project.org
>>>> Subject: Re: [Rd] Documentation examples for lm and glm
>>>> 
>>>> Agree.  Or just create the data frame with those variables in it
>>>> directly ...
>>>> 
>>>> On 2018-12-13 3:26 p.m., Thomas Yee wrote:
>>>>> Hello,
>>>>> 
>>>>> something that has been on my mind for a decade or two has
>>>>> been the examples for lm() and glm(). They encourage poor style
>>>>> because of mismanagement of data frames. Also, having the
>>>>> variables in a data frame means that predict()
>>>>> is more likely to work properly.
>>>>> 
>>>>> For lm(), the variables should be put into a data frame.
>>>>> As 2 vectors are assigned first in the general workspace they
>>>>> should be deleted afterwards.
>>>>> 
>>>>> For the glm(), the data frame d.AD is constructed but not used. Also,
>>>>> its 3 components were assigned first in the general workspace, so they
>>>>> float around dangerously afterwards like in the lm() example.
>>>>> 
>>>>> Rather than attached improved .Rd files here, they are put at
>>>>> www.stat.auckland.ac.nz/~yee/Rdfiles
>>>>> You are welcome to use them!
>>>>> 
>>>>> Best,
>>>>> 
>>>>> Thomas
>>>>> 
>>>>> __
>>>>> R-devel@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>> 
>>>> __
>>>> R-devel@r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> 
>>> 
>>> 
>>> ***
>>> This email and any attachments are confidential. Any
>>> u...{{dropped:1

Re: [Rd] New vcov(*, complete=TRUE) etc -- coef() vs coef()

2017-11-07 Thread Fox, John
Dear Martin,

I think that your plan makes sense. It's too bad that aov() behaved differently 
in this respect from lm(), and thus created more work, but it's not a bad 
thing that the difference is now explicit and documented.

I expect that that other problems like this will surface, particularly with 
contributed packages (and I know that you're aware that this has already 
happened with the car package). That is, packages that made provision for 
aliased coefficients based on the old behaviour of coef() and vcov() will now 
have to adapt to the new, more consistent behaviour.

Best,
 John

> -Original Message-
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Martin
> Maechler
> Sent: Tuesday, November 7, 2017 4:48 PM
> To: r-devel@r-project.org
> Cc: Martin Maechler 
> Subject: [Rd] New vcov(*, complete=TRUE) etc -- coef() vs coef()
> 
> >>>>> Martin Maechler 
> >>>>> on Thu, 2 Nov 2017 21:59:00 +0100 writes:
> 
> >>>>> Fox, John 
> >>>>> on Thu, 14 Sep 2017 13:46:44 + writes:
> 
> >> Dear Martin, I made three points which likely got lost
> >> because of the way I presented them:
> 
> >> (1) Singularity is an unusual situation and should be
> >> made more prominent. It typically reflects a problem with
> >> the data or the specification of the model. That's not to
> >> say that it *never* makes sense to allow singular fits
> >> (as in the situations you mentions).
> 
> >> I'd favour setting singular.ok=FALSE as the default, but
> >> in the absence of that a warning or at least a note. A
> >> compromise would be to have a singular.ok option() that
> >> would be FALSE out of the box.
> 
> >> Any changes would have to be made very carefully so as
> >> not to create chaos.
> 
> > I for one, am too reluctant to want to change the default
> > there.
> 
> >> That goes for the points below as well.
> 
> >> (2) coef() and vcov() behave inconsistently, which can be
> >> problematic because one often uses them together in code.
> 
> > indeed; and I had agreed on that.  As of today, in R-devel
> > only they now behave compatibly.  NEWS entry
> 
> > • The “default” ("lm" etc) methods of vcov() have
> > gained new optional argument complete = TRUE which makes
> > the vcov() methods more consistent with the coef() methods
> > in the case of singular designs.  The former behavior is
> > now achieved by vcov(*, complete=FALSE).
> 
> 
> >> (3) As you noticed in your second message, lm() has a
> >> singular.ok argument and glm() doesn't.
> 
> > and that has been amended even earlier (a bit more than a
> > month ago) in R-devel svn rev 73380 with NEWS entry
> 
> > • glm() and glm.fit get the same singular.ok=TRUE
> > argument that lm() has had forever.  As a consequence, in
> > glm(*, method = ), user specified methods need
> > to accept a singular.ok argument as well.
> 
> >> I'll take a look at the code for glm() with an eye
> >> towards creating a patch, but I'm a bit reluctant to mess
> >> with the code for something as important as glm().
> 
> > and as a matter of fact you did send me +- the R code part
> > of that change.
> 
> > My current plan is to also add the 'complete = TRUE'
> > option to the "basic" coef() methods, such that you also
> > have consistent coef(*, complete=FALSE) and vcov(*,
> > complete=FALSE) behaviors.
> 
> and indeed I had added the above a bit later.
> 
> However, to my surprise, I have now found that we have a
> coef.aov() method -- completely undocumented -- which behaves *differently*:
> 
> whereas the default coef() method, which is called for lm(..) results, gives
> *all* coefficients, and gives NA for "aliased" ones, the aov method *drops* the
> NA
> coefficients  and has done so "forever"  (I've checked R version 1.1.1 of 
> April 14,
> 2000).
> 
> vcov() on the other hand has not had a special "aov" method, but treats aov()
> and lm() results the same... which means that in R-devel the vcov() method for
> an aov() object  uses 'complete=TRUE' and gives NA rows and columns for the
> aliased coefficients, whereas  coef.aov()  removes all the NAs  and  gives 
> only
> the
> "non-aliased" ones.
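
[Editor's note: a sketch of the consistency the change aims at, using the singular fit from earlier in this thread. It assumes a version of R in which vcov() has gained the complete argument (R-devel at the time of writing, later R >= 3.5.0).]

```r
## A singular fit: the third term is a linear combination of the first two,
## so one coefficient is aliased.
mod <- lm(Employed ~ GNP + Population + I(GNP + Population), data = longley)

coef(mod)                    # default method: contains an NA for the aliased term
vcov(mod)                    # complete = TRUE: NA row/column retained, matching coef()
vcov(mod, complete = FALSE)  # former behaviour: aliased term dropped
```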

Re: [Rd] vcov and survival

2017-11-02 Thread Fox, John
Dear Martin,

Thank you for taking care of this.

Best,
 John

> -Original Message-
> From: Martin Maechler [mailto:maech...@stat.math.ethz.ch]
> Sent: Thursday, November 2, 2017 4:59 PM
> To: Fox, John 
> Cc: Martin Maechler ; Therneau, Terry M.,
> Ph.D. ; r-devel@r-project.org
> Subject: RE: [Rd] vcov and survival
> 
> >>>>> Fox, John 
> >>>>> on Thu, 14 Sep 2017 13:46:44 + writes:
> 
> > Dear Martin, I made three points which likely got lost
> > because of the way I presented them:
> 
> > (1) Singularity is an unusual situation and should be made
> > more prominent. It typically reflects a problem with the
> > data or the specification of the model. That's not to say
> > that it *never* makes sense to allow singular fits (as in
> > the situations you mentions).
> 
> > I'd favour setting singular.ok=FALSE as the default, but
> > in the absence of that a warning or at least a note. A
> > compromise would be to have a singular.ok option() that
> > would be FALSE out of the box.
> 
> > Any changes would have to be made very carefully so as not
> > to create chaos.
> 
> I for one, am too reluctant to want to change the default there.
> 
> >  That goes for the points below as well.
> 
> > (2) coef() and vcov() behave inconsistently, which can be
> > problematic because one often uses them together in code.
> 
> indeed; and I had agreed on that.
> As of today, in R-devel only they now behave compatibly.
> NEWS entry
> 
> • The “default” ("lm" etc) methods of vcov() have gained new
>   optional argument complete = TRUE which makes the vcov() methods
>   more consistent with the coef() methods in the case of singular
>   designs.  The former behavior is now achieved by vcov(*,
>   complete=FALSE).
> 
> 
> > (3) As you noticed in your second message, lm() has a
> > singular.ok argument and glm() doesn't.
> 
> and that has been amended even earlier (a bit more than a month
> ago) in R-devel svn rev 73380 with  NEWS  entry
> 
> • glm() and glm.fit get the same singular.ok=TRUE argument that
>   lm() has had forever.  As a consequence, in glm(*, method =
>   ), user specified methods need to accept a singular.ok
>   argument as well.
> 
> > I'll take a look at the code for glm() with an eye towards
> > creating a patch, but I'm a bit reluctant to mess with the
> > code for something as important as glm().
> 
> and as a matter of fact you did send me +- the R code part of that change.
> 
> My current plan is to also add the  'complete = TRUE' option to the "basic"
> coef() methods, such that you also have consistent coef(*, complete=FALSE)
> and vcov(*, complete=FALSE)  behaviors.
> 
> Thank you and Terry (and others?) for bringing up the issues and discussing
> them thoroughly!
> 
> Best,
> Martin.
> 
> 
> > Best, John
> 
> 
> 
> >> -Original Message- From: Martin Maechler
> >> [mailto:maech...@stat.math.ethz.ch] Sent: Thursday,
> >> September 14, 2017 4:23 AM To: Martin Maechler
> >>  Cc: Fox, John
> >> ; Therneau, Terry M., Ph.D.
> >> ; r-devel@r-project.org Subject: Re:
> >> [Rd] vcov and survival
> >>
> >> >>>>> Martin Maechler  >>>>>
> >> on Thu, 14 Sep 2017 10:13:02 +0200 writes:
> >>
> >> >>>>> Fox, John  >>>>> on Wed, 13 Sep
> >> 2017 22:45:07 + writes:
> >>
> >> >> Dear Terry,
> >> >> Even the behaviour of lm() and glm() isn't entirely consistent. In both
> >> >> cases, singularity results in NA coefficients by default, and these are
> >> >> reported in the model summary and coefficient vector, but not in the
> >> >> coefficient covariance matrix:
> >>
> >> >> 
> >>
> >> >>> mod.lm <- lm(Employed ~ GNP + Population + I(GNP + Population),
> >> >> +  data=longley)
> >> >>> summary(mod.lm)
> >>
> >> >> Call:
> >> >> lm(formula = Employed ~ GNP + Population + I(GNP + Population),
> >> >> data = longley)
> >>
> >> >> Residuals:
> >> >>      Min       1Q   Median       3Q      Max
> >> >> -0.80899 -0.33282 -0.02329  0.25895  1.08800
> >>
> >> >> Coefficie

Re: [Rd] vcov and survival

2017-09-14 Thread Fox, John
Dear Terry,

It's not surprising that different modeling functions behave differently in 
this respect because there's no articulated standard. 

Please see my response to Martin for my take on the singular.ok argument. For a 
highly sophisticated user like you, singular.ok=TRUE isn't problematic -- 
you're not going to fail to notice an NA in the coefficient vector -- but I've 
seen students, e.g., doing exactly that. In principle having a singular.ok 
option defaulting to FALSE would satisfy everyone, but would probably break too 
much existing code.

Best,
 John

> -Original Message-
> From: Therneau, Terry M., Ph.D. [mailto:thern...@mayo.edu]
> Sent: Thursday, September 14, 2017 8:41 AM
> To: Martin Maechler 
> Cc: Fox, John ; Therneau, Terry M., Ph.D.
> ; r-devel@r-project.org
> Subject: Re: [Rd] vcov and survival
> 
> Thanks all for your comments.  No one said "all the other vcov methods do
> ", so I took some time this AM to look at several listed in the vcov help 
> page.
> Here is the code for the first few examples: data2 is constructed 
> specifically to
> create an NA coef midway in the list.
> 
> data1 <- data.frame(y = c(1,2,10,50, 5, 4, 8, 40, 60, 20, 21, 22,
>3,5,12,52, 7, 8,16, 48, 58, 28, 20,5),
>  x1 = factor(letters[rep(1:3, length=24)]),
>  x2 = factor(LETTERS[rep(1:4, length=24)]),
>  x3 = factor(rep(1:7, length=24)))
> data2 <- subset(data1, x1 !='a' | x2 != 'C')
> 
> fit1 <- lm(y ~ x1*x2, data2)
> table(is.na(coef(fit1)))
> dim(vcov(fit1))
> 
> fit2 <- glm(y ~ x1*x2, data=data2, poisson)
> table(is.na(coef(fit2)))
> dim(vcov(fit2))
> 
> fit3 <- lme(y ~ x1*x2, random= ~1|x3, data2)
> 
> 1. lm, mlm, glm, negbin objects all have an NA in coef(fit); and remove NA
> columns from the vcov object.
> 
> 2. I expected polr to return a generalized inverse of the Hessian since 
> vcov.polr
> has a call to ginv(object$Hessian), but it shortcuts earlier with a message
>   "design appears to be rank-deficient, so dropping some coefs"
> The undetermined coef appears in neither coef() nor vcov().
> 
> 3. rlm declares that it does not work with singular data.
> 
> 4. multinom returns values for all coefficients and a full variance matrix.
> However, the returned variance is rank-deficient.  It is essentially a 
> g-inverse of
> the Hessian.
> 
> 5. coxph and survreg report an NA coef, and return a generalized inverse of 
> the
> Hessian matrix.  The g-inverse was chosen to be a particularly easy one in 
> that
> you can spot redundant colums via a row/col of zeros.
> 
> 6. nlme fails with a singularity error.  I didn't check out gls.
> 
> So my original question of whether I should make coxph consistent with the
> others has no answer, the 'others' are not consistent.
> 
> In response to two other points:
>   >> In my opinion singularity should at least produce a warning...
> I was one of those who lobbied heavily to change the singular.ok=FALSE
> default of lm to TRUE.  Data is messy, I have work to do, and don't need a
> package constantly harping at me.
> 
> In the same vein, stuffing NA into the vcov result is more pure, but would
> cause a lot of hassle.  I'm not sure that it is worth it.
> 
> For now, coxph will stay as is.
> But again, thanks to all for comments and I'll look forward to any more
> discussion.
> 
> Terry T.
> 
> 
> On 09/14/2017 03:23 AM, Martin Maechler wrote:
> >>>>>> Martin Maechler 
> >>>>>>  on Thu, 14 Sep 2017 10:13:02 +0200 writes:
> >
> >>>>>> Fox, John 
> >>>>>>  on Wed, 13 Sep 2017 22:45:07 + writes:
> >
> >  >> Dear Terry,
> >  >> Even the behaviour of lm() and glm() isn't entirely consistent. In 
> > both
> cases, singularity results in NA coefficients by default, and these are 
> reported
> in the model summary and coefficient vector, but not in the coefficient
> covariance matrix:
> >
> >  >> 
> >
> >  >>> mod.lm <- lm(Employed ~ GNP + Population + I(GNP + Population),
> >  >> +  data=longley)
> >  >>> summary(mod.lm)
> >
> >  >> Call:
> >  >> lm(formula = Employed ~ GNP + Population + I(GNP + Population),
> >  >> data = longley)
> >
> >  >> Residuals:
> >  >> Min   1Q   Median   3Q  Max
> >  >> -0.80899 -0.33282 -0.02329  0.25895  1.08800

Re: [Rd] vcov and survival

2017-09-14 Thread Fox, John
Dear Martin,

I made three points which likely got lost because of the way I presented them:

(1) Singularity is an unusual situation and should be made more prominent. It 
typically reflects a problem with the data or the specification of the model. 
That's not to say that it *never* makes sense to allow singular fits (as in the 
situations you mention). 

I'd favour setting singular.ok=FALSE as the default, but in the absence of that 
a warning or at least a note. A compromise would be to have a singular.ok 
option() that would be FALSE out of the box. 
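The options() compromise could look something like the following sketch; the option name and the wrapper are hypothetical, and nothing like this exists in base R:

```r
## Hypothetical: consult a global option, FALSE out of the box,
## as the default for singular.ok.
lm_strict <- function(formula, data, ...,
                      singular.ok = getOption("lm.singular.ok", FALSE)) {
  lm(formula, data = data, singular.ok = singular.ok, ...)
}

## With the option unset, a rank-deficient fit would stop with
## "singular fit encountered" instead of silently dropping a coefficient:
##   lm_strict(Employed ~ GNP + Population + I(GNP + Population), data = longley)
## unless the user opts back in with options(lm.singular.ok = TRUE).
```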

Any changes would have to be made very carefully so as not to create chaos. 
That goes for the points below as well.

(2) coef() and vcov() behave inconsistently, which can be problematic because 
one often uses them together in code. 

(3) As you noticed in your second message, lm() has a singular.ok argument and 
glm() doesn't.

I'll take a look at the code for glm() with an eye towards creating a patch, 
but I'm a bit reluctant to mess with the code for something as important as 
glm().

Best,
 John



> -Original Message-
> From: Martin Maechler [mailto:maech...@stat.math.ethz.ch]
> Sent: Thursday, September 14, 2017 4:23 AM
> To: Martin Maechler 
> Cc: Fox, John ; Therneau, Terry M., Ph.D.
> ; r-devel@r-project.org
> Subject: Re: [Rd] vcov and survival
> 
> >>>>> Martin Maechler 
> >>>>> on Thu, 14 Sep 2017 10:13:02 +0200 writes:
> 
> >>>>> Fox, John 
> >>>>> on Wed, 13 Sep 2017 22:45:07 + writes:
> 
> >> Dear Terry,
> >> Even the behaviour of lm() and glm() isn't entirely consistent. In both
> cases, singularity results in NA coefficients by default, and these are 
> reported
> in the model summary and coefficient vector, but not in the coefficient
> covariance matrix:
> 
> >> 
> 
> >>> mod.lm <- lm(Employed ~ GNP + Population + I(GNP + Population),
> >> +  data=longley)
> >>> summary(mod.lm)
> 
> >> Call:
> >> lm(formula = Employed ~ GNP + Population + I(GNP + Population),
> >> data = longley)
> 
> >> Residuals:
> >> Min   1Q   Median   3Q  Max
> >> -0.80899 -0.33282 -0.02329  0.25895  1.08800
> 
> >> Coefficients: (1 not defined because of singularities)
> >> Estimate Std. Error t value Pr(>|t|)
> >> (Intercept) 88.93880   13.78503   6.452 2.16e-05 ***
> >> GNP  0.063170.01065   5.933 4.96e-05 ***
> >> Population  -0.409740.15214  -2.693   0.0184 *
> >> I(GNP + Population)   NA NA  NA   NA
> >> ---
> >> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> >> Residual standard error: 0.5459 on 13 degrees of freedom
> >> Multiple R-squared:  0.9791,   Adjusted R-squared:  0.9758
> >> F-statistic: 303.9 on 2 and 13 DF,  p-value: 1.221e-11
> 
> >>> vcov(mod.lm)
> >> (Intercept)   GNP Population
> >> (Intercept) 190.0269691  0.1445617813 -2.0954381
> >> GNP   0.1445618  0.0001133631 -0.0016054
> >> Population   -2.0954381 -0.0016053999  0.0231456
> >>> coef(mod.lm)
> >> (Intercept) GNP  Population I(GNP + Population)
> >> 88.93879831  0.06317244 -0.40974292  NA
> >>>
> >>> mod.glm <- glm(Employed ~ GNP + Population + I(GNP + Population),
> >> +   data=longley)
> >>> summary(mod.glm)
> 
> >> Call:
> >> glm(formula = Employed ~ GNP + Population + I(GNP + Population),
> >> data = longley)
> 
> >> Deviance Residuals:
> >> Min1QMedian3Q   Max
> >> -0.80899  -0.33282  -0.02329   0.25895   1.08800
> 
> >> Coefficients: (1 not defined because of singularities)
> >> Estimate Std. Error t value Pr(>|t|)
> >> (Intercept) 88.93880   13.78503   6.452 2.16e-05 ***
> >> GNP  0.063170.01065   5.933 4.96e-05 ***
> >> Population  -0.409740.15214  -2.693   0.0184 *
> >> I(GNP + Population)   NA NA  NA   NA
> >> ---
> >> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> 
> >> (Dispersion parameter for gaussian family taken to be 0.2980278)

Re: [Rd] vcov and survival

2017-09-13 Thread Fox, John
Dear Terry,

Even the behaviour of lm() and glm() isn't entirely consistent. In both cases, 
singularity results in NA coefficients by default, and these are reported in 
the model summary and coefficient vector, but not in the coefficient covariance 
matrix:



> mod.lm <- lm(Employed ~ GNP + Population + I(GNP + Population), 
+  data=longley)
> summary(mod.lm)

Call:
lm(formula = Employed ~ GNP + Population + I(GNP + Population), 
data = longley)

Residuals:
 Min   1Q   Median   3Q  Max 
-0.80899 -0.33282 -0.02329  0.25895  1.08800 

Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 88.93880   13.78503   6.452 2.16e-05 ***
GNP  0.063170.01065   5.933 4.96e-05 ***
Population  -0.409740.15214  -2.693   0.0184 *  
I(GNP + Population)   NA NA  NA   NA
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.5459 on 13 degrees of freedom
Multiple R-squared:  0.9791,Adjusted R-squared:  0.9758 
F-statistic: 303.9 on 2 and 13 DF,  p-value: 1.221e-11

> vcov(mod.lm)
(Intercept)   GNP Population
(Intercept) 190.0269691  0.1445617813 -2.0954381
GNP   0.1445618  0.0001133631 -0.0016054
Population   -2.0954381 -0.0016053999  0.0231456
> coef(mod.lm)
(Intercept) GNP  Population I(GNP + Population) 
88.93879831  0.06317244 -0.40974292  NA 
> 
> mod.glm <- glm(Employed ~ GNP + Population + I(GNP + Population), 
+   data=longley)
> summary(mod.glm)

Call:
glm(formula = Employed ~ GNP + Population + I(GNP + Population), 
data = longley)

Deviance Residuals: 
 Min1QMedian3Q   Max  
-0.80899  -0.33282  -0.02329   0.25895   1.08800  

Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value Pr(>|t|)
(Intercept) 88.93880   13.78503   6.452 2.16e-05 ***
GNP  0.063170.01065   5.933 4.96e-05 ***
Population  -0.409740.15214  -2.693   0.0184 *  
I(GNP + Population)   NA NA  NA   NA
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 0.2980278)

Null deviance: 185.0088  on 15  degrees of freedom
Residual deviance:   3.8744  on 13  degrees of freedom
AIC: 30.715

Number of Fisher Scoring iterations: 2

> coef(mod.glm)
(Intercept) GNP  Population I(GNP + Population) 
88.93879831  0.06317244 -0.40974292  NA 
> vcov(mod.glm)
(Intercept)   GNP Population
(Intercept) 190.0269691  0.1445617813 -2.0954381
GNP   0.1445618  0.0001133631 -0.0016054
Population   -2.0954381 -0.0016053999  0.0231456



Moreover, lm() has a singular.ok argument that defaults to TRUE, but glm() 
doesn't have this argument:



> mod.lm <- lm(Employed ~ GNP + Population + I(GNP + Population), 
+  data=longley, singular.ok=FALSE)
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : 
  singular fit encountered



In my opinion, singularity should at least produce a warning, both in calls to 
lm() and glm(), and in summary() output. Even better, again in my opinion, 
would be to produce an error by default in this situation, but doing so would 
likely break too much existing code. 

I prefer NA to 0 for the redundant coefficients because it at least suggests 
that the decision about what to exclude is arbitrary, and of course simply 
excluding coefficients isn't the only way to proceed. 

Finally, the differences in behaviour between coef() and vcov() and between 
lm() and glm() aren't really sensible.

Maybe there's some reason for all this that escapes me.

Best,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socserv.mcmaster.ca/jfox




> -Original Message-
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of
> Therneau, Terry M., Ph.D.
> Sent: Wednesday, September 13, 2017 6:19 PM
> To: r-devel@r-project.org
> Subject: [Rd] vcov and survival
> 
> I have just noticed a difference in behavior between coxph and lm/glm:
> if one or more of the coefficients from the fit is NA, then lm and glm
> omit that row/column from the variance matrix; while coxph retains it
> but sets the values to zero.
> 
>Is this something that should be "fixed", i.e., made to agree? I
> suspect that doing so will break other packages, but then NA coefs are
> rather rare so perhaps not.
> 
> Terry Therneau
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/l

Re: [Rd] lm() gives different results to lm.ridge() and SPSS

2017-05-05 Thread Fox, John
Dear Nick,


On 2017-05-05, 9:40 AM, "R-devel on behalf of Nick Brown"
 wrote:

>>I conjecture that something in the vicinity of
>> res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
>>scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
>>summary(res) 
>> would reproduce the SPSS Beta values.
>
>Yes, that works. Thanks!

That you have to work hard in R to match the SPSS results isn’t such a bad
thing when you factor in the observation that standardizing the
interaction regressor, ZMEAN_PA * ZDIVERSITY_PA, separately from each of
its components, ZMEAN_PA and ZDIVERSITY_PA, is nonsense.
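A quick way to see why (simulated data, not the emodiversity dataset): the product of two standardized variables is not itself standardized, so standardizing the product separately fits a genuinely different model than the usual interaction term.

```r
set.seed(1)
z1 <- as.vector(scale(rnorm(100)))  # sd 1 by construction
z2 <- as.vector(scale(rnorm(100)))  # sd 1 by construction
sd(z1 * z2)          # generally not 1
sd(scale(z1 * z2))   # exactly 1, but this rescaled column is no
                     # longer the product of the two regressors
```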

Best,
 John

-
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox/


> 
>
>- Original Message -
>
>From: "peter dalgaard" 
>To: "Viechtbauer Wolfgang (SP)"
>, "Nick Brown"
>
>Cc: r-devel@r-project.org
>Sent: Friday, 5 May, 2017 3:33:29 PM
>Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS
>
>Thanks, I was getting to try this, but got side tracked by actual work...
>
>Your analysis reproduces the SPSS unscaled estimates. It still remains to
>figure out how Nick got
>
>> 
>coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
>
>(Intercept) ZMEAN_PA ZDIVERSITY_PA ZMEAN_PA:ZDIVERSITY_PA
>0.07342198 -0.39650356 -0.36569488 -0.09435788
>
>
>which does not match your output. I suspect that ZMEAN_PA and
>ZDIVERSITY_PA were scaled for this analysis (but the interaction term
>still obviously is not). I conjecture that something in the vicinity of
>
>res <- lm(DEPRESSION ~ scale(ZMEAN_PA) + scale(ZDIVERSITY_PA) +
>scale(ZMEAN_PA * ZDIVERSITY_PA), data=dat)
>summary(res) 
>
>would reproduce the SPSS Beta values.
>
>
>> On 5 May 2017, at 14:43 , Viechtbauer Wolfgang (SP)
>> wrote:
>> 
>> I had no problems running regression models in SPSS and R that yielded
>>the same results for these data.
>> 
>> The difference you are observing is from fitting different models. In
>>R, you fitted: 
>> 
>> res <- lm(DEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=dat)
>> summary(res) 
>> 
>> The interaction term is the product of ZMEAN_PA and ZDIVERSITY_PA. This
>>is not a standardized variable itself and not the same as "ZINTER_PA_C"
>>in the png you showed, which is not a variable in the dataset, but can
>>be created with: 
>> 
>> dat$ZINTER_PA_C <- with(dat, scale(ZMEAN_PA * ZDIVERSITY_PA))
>> 
>> If you want the same results as in SPSS, then you need to fit:
>> 
>> res <- lm(DEPRESSION ~ ZMEAN_PA + ZDIVERSITY_PA + ZINTER_PA_C,
>>data=dat) 
>> summary(res) 
>> 
>> This yields: 
>> 
>> Coefficients: 
>> Estimate Std. Error t value Pr(>|t|)
>> (Intercept) 6.41041 0.01722 372.21 <2e-16 ***
>> ZMEAN_PA -1.62726 0.04200 -38.74 <2e-16 ***
>> ZDIVERSITY_PA -1.50082 0.07447 -20.15 <2e-16 ***
>> ZINTER_PA_C -0.58955 0.05288 -11.15 <2e-16 ***
>> --- 
>> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>> 
>> Exactly the same as in the png.
>> 
>> Peter already mentioned this as a possible reason for the discrepancy:
>>https://stat.ethz.ch/pipermail/r-devel/2017-May/074191.html ("Is it
>>perhaps the case that x1 and x2 have already been scaled to have
>>standard deviation 1? In that case, x1*x2 won't be.")
>> 
>> Best, 
>> Wolfgang 
>> 
>> -Original Message-
>> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Nick
>>Brown 
>> Sent: Friday, May 05, 2017 10:40
>> To: peter dalgaard
>> Cc: r-devel@r-project.org
>> Subject: Re: [Rd] lm() gives different results to lm.ridge() and SPSS
>> 
>> Hi, 
>> 
>> Here is (I hope) all the relevant output from R.
>> 
>>> mean(s1$ZDEPRESSION, na.rm=T)
>> [1] -1.041546e-16
>>> mean(s1$ZDIVERSITY_PA, na.rm=T)
>> [1] -9.660583e-16
>>> mean(s1$ZMEAN_PA, na.rm=T)
>> [1] -5.430282e-15
>>> lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1)$coef
>>   ZMEAN_PA  ZDIVERSITY_PA  ZMEAN_PA:ZDIVERSITY_PA
>> -0.3962254     -0.3636026              -0.1425772
>>
>> ## This is what I thought was the problem originally. :-)
>> 
>> 
>>> coefficients(lm(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
>> (Intercept)    ZMEAN_PA  ZDIVERSITY_PA  ZMEAN_PA:ZDIVERSITY_PA
>>  0.07342198 -0.39650356    -0.36569488             -0.09435788
>>> coefficients(lm.ridge(ZDEPRESSION ~ ZMEAN_PA * ZDIVERSITY_PA, data=s1))
>>                ZMEAN_PA  ZDIVERSITY_PA  ZMEAN_PA:ZDIVERSITY_PA
>>  0.07342198 -0.39650356    -0.36569488             -0.09435788
>>
>> The equivalent from SPSS is attached. The unstandardized coefficients in
>> SPSS look nothing like those in R. The standardized coefficients in SPSS
>> match the lm.ridge()$coef numbers very closely indeed, suggesting that
>> the same algorithm may be in use.
>> 
>> I have put the dataset file, which is the untouched original I received
>>from the authors, in this Dropbox folder:
>>https://www.dropbox.com/sh/xsebjy55ius1ysb/AADwYUyV1bl6-iAw7ACuF1_La?dl=0
>>. You can read it into R with this code (one variable needs to be
>>standardized and centered; everything else is already in the file):
>> 
>> s1 <- read.csv("Emodiversity_Study1.

Re: [Rd] Strange behavior when using progress bar (Fwd: Re: [R] The code itself disappears after starting to execute the for loop)

2016-12-07 Thread Fox, John
Dear Martin and Jon,

I can reproduce this problem in the Windows GUI, where I observed it using 
Jon's program after 75 iterations. I didn't observe the problem in a Windows 
terminal or under RStudio, letting the program run for more than 200 iterations 
in each case.

My system and session info:

- snip -

> Sys.info()
 sysname  release  version nodename 
   "Windows" "10 x64""build 14393" "JOHN-CARBON-X1" 
 machinelogin user   effective_user 
    "x86-64"   "John Fox"   "John Fox"   "John Fox" 

> sessionInfo()
R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 14393)

locale:
[1] LC_COLLATE=English_Canada.1252  LC_CTYPE=English_Canada.1252   
[3] LC_MONETARY=English_Canada.1252 LC_NUMERIC=C   
[5] LC_TIME=English_Canada.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

- snip -

I hope this helps,
 John

-
John Fox, Professor
McMaster University
Hamilton, Ontario
Canada L8S 4M4
Web: socserv.mcmaster.ca/jfox


> -Original Message-
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Martin
> Maechler
> Sent: December 7, 2016 5:58 AM
> To: Jon Skoien 
> Cc: r-devel@r-project.org
> Subject: Re: [Rd] Strange behavior when using progress bar (Fwd: Re: [R] The
> code itself disappears after starting to execute the for loop)
> 
> >>>>> Jon Skoien 
> >>>>> on Wed, 7 Dec 2016 11:04:04 +0100 writes:
> 
> > I would like to ask once more if this is reproducible also for others?
> > If yes, should I submit it as a bug-report?
> 
> > Best,
> > Jon
> 
> Please  Windows users .. this is possibly only for you!
> 
> Note that I do *not* see problems on Linux (in ESS; did not try RStudio).
> 
> Please also indicate in which form you are running R.
> Here it does depend if this is inside RStudio, ESS, the "Windows GUI", the
> "Windows terminal", ...
> 
> Martin Maechler,
> ETH Zurich
> 
> 
> > On 11/28/2016 11:26 AM, Jon Skoien wrote:
> >> I first answered to the email below in r-help, but as I did not see
> >> any response, and it looks like a bug/unwanted behavior, I am also
> >> posting here. I have observed this in RGui, whereas it seems not to
> >> happen in RStudio.
> >>
> >> Similar to OP, I sometimes have a problem with functions using the
> >> progress bar. Frequently, the console is cleared after x iterations
> >> when the progress bar is called in a function which is wrapped in a
> >> loop. In the example below, this happened for me every ~44th
> >> iteration. Interestingly, it seems that reduction of the sleep times
> >> in this function increases the number of iterations before clearing.
> >> In my real application, where the progress bar is used in a much
> >> slower function, the console is cleared every 2-3 iteration, which
> >> means that I cannot scroll back to check the output.
> 
>  testit <- function(x = sort(runif(20)), ...) {
>pb <- txtProgressBar(...)
>for(i in c(0, x, 1)) {Sys.sleep(0.2); setTxtProgressBar(pb, i)}
>Sys.sleep(1)
>close(pb)
>  }
> 
>  it <- 0
>  while (TRUE) {testit(style = 3); it <- it + 1; print(paste("done", it))}
> 
> >> Is this only a problem for a few, or is it reproducible? Any hints to
> >> what the problem could be, or if it can be fixed? I have seen this in
> >> some versions of R, and could also reproduce in 3.3.2.
> 
> "some versions of R" ... all on Windows ?
> 
> >>
> >> Best wishes,
> >> Jon
> >>
> >> R version 3.3.2 (2016-10-31)
> >> Platform: x86_64-w64-mingw32/x64 (64-bit)
> >> Running under: Windows 8.1 x64 (build 9600)
> >>
> >> locale:
> >> [1] LC_COLLATE=English_United States.1252
> >> [2] LC_CTYPE=English_United States.1252
> >> [3] LC_MONETARY=English_United States.1252
> >> [4] LC_NUMERIC=C
> >> [5] LC_TIME=English_United States.1252
> >>
> >> attached base packages:
> >> [1] stats graphics  grDevices utils datasets  methods base
> 
> [.]
> 
> > Jon Olav Skøien
> > Joint Research Centre - European Commission
> > Institute for Space, Security & Migration
> > Disaster Risk Management Unit
> 
> > Via E. Fermi 2749, TP 122,  I-21027 Ispra (VA), ITALY
> 
> > jon.sko...@jrc.ec.europa.eu
> > Tel:  +39 0332 789205
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] improve 'package not installed' load errors?

2016-10-24 Thread Fox, John
Dear Kevin,

As others have mentioned, it's my sense that this kind of error has become more 
frequent -- at least I see students who encounter these errors more frequently. 
I agree that a less cryptic error message might help.
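A rough sketch of the kind of up-front check Kevin proposes follows; the function name and message wording are invented here, and the real fix would of course live inside library()/loadNamespace() rather than in a wrapper:

```r
## Hypothetical wrapper: check a package's declared Depends/Imports
## before loading, and report missing ones directly instead of
## exposing the loadNamespace() call to the user.
library_checked <- function(pkg) {
  desc <- read.dcf(system.file("DESCRIPTION", package = pkg),
                   fields = c("Depends", "Imports"))
  deps <- unlist(strsplit(desc[!is.na(desc)], ",[[:space:]]*"))
  deps <- setdiff(sub("[[:space:]]*\\(.*\\)", "", trimws(deps)),  # drop version constraints
                  c("R", ""))
  missing <- deps[!vapply(deps, requireNamespace, logical(1), quietly = TRUE)]
  if (length(missing) > 0)
    stop(sprintf("'%s' depends on package(s) %s, which are not installed;\n  consider install.packages(c(%s))",
                 pkg, paste(sQuote(missing), collapse = ", "),
                 paste(dQuote(missing), collapse = ", ")),
         call. = FALSE)
  library(pkg, character.only = TRUE)
}
```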

Best,
 John
--
John Fox, Professor
McMaster University
Hamilton, Ontario, Canada
Web: socserv.mcmaster.ca/jfox



> -Original Message-
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Kevin
> Ushey
> Sent: Monday, October 24, 2016 1:51 PM
> To: R-devel 
> Subject: [Rd] improve 'package not installed' load errors?
> 
> Hi R-devel,
> 
> One of the more common issues that new R users see, and become stumped
> by, is error messages during package load of the form:
> 
> > library(ggplot2)
> Error in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()),
> versionCheck = vI[[j]]) :
>   there is no package called 'Rcpp'
> Error: package or namespace load failed for 'ggplot2'
> 
> Typically, error messages of this form are caused simply by one or more
> dependent packages (in this case, 'Rcpp') not being installed or
> available on the current library paths. (A side question, which I do not
> know the answer to, is how users get themselves into this state.)
> 
> I believe it would be helpful for new users if the error message
> reported here was a bit more direct, e.g.
> 
> > library(ggplot2)
> Error: 'ggplot2' depends on package 'Rcpp', but 'Rcpp' is not installed
> consider installing 'Rcpp' with install.packages("Rcpp")
> 
> In other words, it might be helpful to avoid printing the
> 'loadNamespace()' call on error (since it's mostly just scary /
> uninformative), and check up-front that the package is installed before
> attempting to call 'loadNamespace()'. I'm sure a number of novice users
> will still just throw their hands up in the air and say "I don't know
> what to do", but I think this would help steer a number of users in the
> right direction.
> 
> (The prescription to suggest installing a package from CRAN if available
> might be a step too far, but I think making it more clear that the error
> is due to a missing dependent package would help.)
> 
> Any thoughts?
> Kevin
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] suggested addition to model.matrix

2016-10-03 Thread Fox, John
Dear Spencer,

I don't think that the problem of "converting a data frame into a model matrix" 
is well-defined, because there isn't a unique mapping from one to the other. 

In your example, you build the model matrix for the additive formula ~ a + b 
from the data frame containing a and b, using "treatment" contrasts, but 
there are other possible formulas (e.g., ~ a*b) and contrasts [e.g., 
model.matrix(~ a + b, dd, contrasts.arg=list(a=contr.sum, b=contr.helmert))].
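To make the non-uniqueness concrete with the dd example from Spencer's message, the same data frame yields several different model matrices depending on the formula and the contrasts:

```r
dd <- data.frame(a = gl(3, 4), b = gl(4, 1, 12))
m1 <- model.matrix(~ a + b, dd)   # additive, default treatment contrasts
m2 <- model.matrix(~ a * b, dd)   # adds interaction columns
m3 <- model.matrix(~ a + b, dd,   # same formula, different contrasts
                   contrasts.arg = list(a = "contr.sum", b = "contr.helmert"))
c(ncol(m1), ncol(m2), ncol(m3))   # 6, 12, 6 -- and m1, m3 differ entry-wise
```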

So I think that the current approach is sensible -- to require both a data 
frame and a formula.

Best,
 John

> -Original Message-
> From: R-devel [mailto:r-devel-boun...@r-project.org] On Behalf Of Spencer
> Graves
> Sent: October 3, 2016 7:59 PM
> To: r-devel@r-project.org
> Subject: [Rd] suggested addition to model.matrix
> 
> Hello, All:
> 
> 
>What's the simplest way to convert a data.frame into a model.matrix?
> 
> 
>One way is given by the following example, modified from the examples 
> in
> help(model.matrix):
> 
> 
> dd <- data.frame(a = gl(3,4), b = gl(4,1,12))
> ab <- model.matrix(~ a + b, dd)
> ab0 <- model.matrix(~., dd)
> all.equal(ab, ab0)
> 
> 
>What do you think about replacing "model.matrix(~ a + b, dd)" in
> the current help(model.matrix) with this 3-line expansion?
> 
> 
>I suggest this, because I spent a few hours today trying to
> convert a data.frame into a model.matrix before finding this.
> 
> 
>Also, what do you think about adding something like the following
> to the stats package:
> 
> 
> model.matrix.data.frame <- function(object, ...){
>  model.matrix(~., object, ...)
> }
> 
> 
>And then extend the above example as follows:
> 
> ab. <- model.matrix(dd)
> all.equal(ab, ab.)
> 
> 
>Thanks,
>Spencer Graves
> 
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel