Re: [R] Percentile bootstrap for the median : error message

2022-01-08 Thread Fox, John
Dear Sacha,

Here's your corrected and cleaned-up code:

> library(boot)
> set.seed(123)
> s <- rnorm(10,0,1)
> (m <- median(s)) 
[1] 0.000946463

> med <- function(d,i) {
+   median(d[i, ])
+ }

> set.seed(456)
> N <- 100
> n<-5

> out <- replicate(N, {
+   dat <- data.frame(sample(s,size=n))
+   boot.out <- boot(data = dat, statistic = med, R = 1)
+   boot.ci(boot.out, type = "perc")$perc[, 4:5]  
+ })

> mean(out[1, ] < m & m < out[2, ])
[1] 0.94

A couple of comments:

(1) I moved the definition of med() outside of the call to replicate() so the 
function doesn't get defined repeatedly -- something that, if I'm not mistaken, 
you were advised to do the last time you asked a very similar question.

(2) Maybe it's time to polish your debugging skills. I quickly found the error 
in the call to boot.ci() by calling browser() before the line dat <- 
data.frame(sample(s,size=n)) and stepping through the commands.
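
For instance, a minimal sketch of that approach (placing browser() at the top of
the replicate() block; not the only way to do it):

out <- replicate(N, {
  browser()  # execution pauses here at a Browse> prompt
  dat <- data.frame(sample(s, size = n))
  boot.out <- boot(data = dat, statistic = med, R = 1)
  # at the Browse> prompt, run the commands one at a time and inspect, e.g.,
  # str(boot.ci(boot.out, type = "perc")) to see which elements actually exist
  boot.ci(boot.out, type = "perc")$perc[, 4:5]
})

Typing n at the Browse> prompt executes the next command; Q quits the browser.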

I hope this helps,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Jan 8, 2022, at 12:04 PM, varin sacha via R-help  
> wrote:
> 
> Dear R-experts,
> 
> Below is my R code for the percentile bootstrap confidence intervals, which 
> produces an error message. 
> Is there a way to make my R code work?
> Many thanks for your help and time.
> 
> 
> library(boot)
> 
> s=rnorm(10,0,1)
> (m<-median(s)) 
> 
> N <- 100
> n<-5
> out <- replicate(N, {
> 
> dat<-data.frame(sample(s,size=n))
> med<-function(d,i) {
> median(d[i, ])
> }
> 
>   boot.out <- boot(data = dat, statistic = med, R = 1)
> 
>   boot.ci(boot.out, type = "per")$per[, 4:5]
> 
> })
> 
> mean(out[1, ] < m & m < out[2, ])
> 
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Speed up studentized confidence intervals ?

2021-12-29 Thread Fox, John
Dear varin sacha,

You didn't correctly adapt the code to the median. The outer call to mean() in 
the last line shouldn't be replaced with median() -- it computes the proportion 
of intervals that include the population median.

As well, you can't rely on the asymptotics of the bootstrap for a nonlinear 
statistic like the median with an n as small as 5, as your example, properly 
implemented (and with the code slightly cleaned up), illustrates:

> library(boot)
> set.seed(123)
> s <- rgamma(n=10, shape=2, rate=5)
> (m <- median(s))
[1] 0.3364465
> N <- 1000
> n <- 5
> set.seed(321)
> out <- replicate(N, {
+   dat <- data.frame(sample(s, size=n))
+   med <- function(d, i) {
+ median(d[i, ])
+   }
+   boot.out <- boot(data = dat, statistic = med, R = 1)
+   boot.ci(boot.out, type = "bca")$bca[, 4:5]
+ })
> #coverage probability
> mean(out[1, ] < m & m < out[2, ])
[1] 0.758


You do get the expected coverage, however, for a larger sample, here with n = 
100:

> N <- 1000
> n <- 100
> set.seed(321)
> out <- replicate(N, {
+   dat <- data.frame(sample(s, size=n))
+   med <- function(d, i) {
+ median(d[i, ])
+   }
+   boot.out <- boot(data = dat, statistic = med, R = 1)
+   boot.ci(boot.out, type = "bca")$bca[, 4:5]
+ })
> #coverage probability
> mean(out[1, ] < m & m < out[2, ])
[1] 0.952

I hope this helps,
 John

-- 
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox/
 
 


On 2021-12-29, 2:09 PM, "R-help on behalf of varin sacha via R-help" 
 wrote:

Dear David,
Dear Rui,

Many thanks for your response. It perfectly works for the mean. Now I have 
a problem with my R code for the median, because I always get a coverage 
probability of 1 (100%), which is very strange. Indeed, an interval whose lower 
limit is the smallest value in the sample and whose upper limit is the largest 
value has a 1/32 + 1/32 = 1/16 probability of non-coverage, so the confidence of 
such an interval is 15/16 rather than 1 (100%). I therefore suspect that the 
confidence interval I use for the median is not correctly defined for n=5 
observations, and probably contains all the observations in the sample. What is 
wrong with my R code?
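
A quick check of that figure: with n=5, the chance that all five sampled values 
fall on one given side of the population median is (1/2)^5, so

2 * 0.5^5   # non-coverage = 0.0625 = 1/16, i.e. coverage 15/16 = 0.9375

which matches the 15/16 above.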


library(boot)

s=rgamma(n=10,shape=2,rate=5)
median(s)

N <- 100
out <- replicate(N, {
a<- sample(s,size=5)
median(a) 

dat<-data.frame(a)
med<-function(d,i) {
temp<-d[i,]
median(temp)
}

  boot.out <- boot(data = dat, statistic = med, R = 1)
  boot.ci(boot.out, type = "bca")$bca[, 4:5]
})

#coverage probability
median(out[1, ] < median(s) & median(s) < out[2, ])





On Thursday, 23 December 2021 at 14:10:36 UTC+1, Rui Barradas 
 wrote: 





Hello,

The code is running very slowly because you are recreating the function 
in the replicate() loop and because you are creating a data.frame also 
in the loop.

And because in the bootstrap statistic function med() you are computing 
the variance of yet another loop. This is probably statistically wrong 
but like David says, without a problem description it's hard to say.

Also, why compute variances if they are never used?

Here is complete code executing in much less than 2 hours. Note that 
it passes the vector a directly to med(), not a df with just one column.


library(boot)

set.seed(2021)
s <- sample(178:798, 10, replace = TRUE)
mean(s)

med <- function(d, i) {
  temp <- d[i]
  f <- mean(temp)
  g <- var(temp)
  c(Mean = f, Var = g)
}

N <- 1000
out <- replicate(N, {
  a <- sample(s, size = 5)
  boot.out <- boot(data = a, statistic = med, R = 1)
  boot.ci(boot.out, type = "stud")$stud[, 4:5]
})
mean(out[1, ] < mean(s) & mean(s) < out[2, ])
#[1] 0.952



Hope this helps,

Rui Barradas

At 11:45 on 19/12/21, varin sacha via R-help wrote:
> Dear R-experts,
> 
> Below is my R code, which works but really, really slowly! I need 2 hours 
with my computer to finally get an answer! Is there a way to improve my R code 
to speed it up? At least to save 1 hour ;=)
> 
> Many thanks
> 
> 
> library(boot)
> 
> s<- sample(178:798, 10, replace=TRUE)
> mean(s)
> 
> N <- 1000
> out <- replicate(N, {
> a<- sample(s,size=5)
> mean(a)
> dat<-data.frame(a)
> 
> med<-function(d,i) {
> temp<-d[i,]
> f<-mean(temp)
> g<-var(replicate(50,mean(sample(temp,replace=TRUE))))
> return(c(f,g))
> 
> }
> 
>boot.out <- boot(data = dat, statistic = med, R = 1)
>boot.ci(boot.out, type = "stud")$stud[, 4:5]
> })
> mean(out[1,] < mean(s) & mean(s) < out[2,])
> ##

Re: [R] Adding SORT to UNIQUE

2021-12-21 Thread Fox, John
Dear Jeff,

I haven't investigated your claim systematically, but out of curiosity, I did 
try extending my previous example, admittedly arbitrarily. In doing so, I 
assumed that you intended col in the first case to be the column subscript, not 
the row subscript. Here's what I got (on a newish M1 MacBook Pro):

> system.time(
+   for ( col in colnames( D ) ) {
+ idx <- sample(1e6, 1000)
+ D[ idx, col ] <- idx
+   }
+ )
   user  system elapsed 
  0.913   6.545  43.737 

> system.time(
+   for ( col in colnames( D ) ) {
+ idx <- sample(1e6, 1000)
+ D[[ col ]][ idx ] <- idx
+   }
+ )
   user  system elapsed 
  0.876   6.828  52.033

Best,
 John

On 2021-12-21, 1:04 PM, "R-help on behalf of Jeff Newmiller" 
 wrote:

When your brain is wired to treat a data frame like a matrix, then you 
think things like

for ( col in colnames( D ) ) {
  idx <- expr
  D[ col, idx ] <- otherexpr
}

are reasonable, when

for ( col in colnames( D ) ) {
  idx <- expr
  D[[ col ]][ idx ] <- otherexpr
}

does actually run significantly faster.


    On December 21, 2021 9:28:52 AM PST, "Fox, John"  wrote:
>Dear Jeff,
>
>On 2021-12-21, 11:59 AM, "R-help on behalf of Jeff Newmiller" 
 wrote:
>
>Intuitive, perhaps, but noticeably slower. 
>
>I think that in most applications, one wouldn't notice the difference; for 
example:
>
>> D <- data.frame(matrix(rnorm(1000*1e6), 1e6, 1000))
>
>> microbenchmark(D[, 1])
>Unit: microseconds
>   expr   min    lq    mean median     uq    max neval
> D[, 1] 3.321 3.362 3.98561  3.444 3.5875 51.291   100
>
>> microbenchmark(D[[1]])
>Unit: microseconds
>   expr   min    lq    mean median     uq    max neval
> D[[1]] 1.722 1.763 1.99137  1.804 1.8655 17.876   100
>
>Best,
> John
>
>
>And it doesn't work on tibbles by design. Data frames are lists of 
columns.
>
>
>On December 21, 2021 8:38:35 AM PST, Duncan Murdoch 
 wrote:
>>On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:
>>> On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:
>>>> Thanks for the reply.
>>>>
>>>> sort(unique(Data[1]))
>>>> Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
>>>> decreasing)) :
>>>>  undefined columns selected
>>> 
>>> That's the wrong syntax:  Data[1] is not "column one of Data".  Use
>>> Data[[1]] for that, so
>>> 
>>> sort(unique(Data[[1]]))
>>
>>Actually, I'd probably recommend
>>
>>   sort(unique(Data[, 1]))
>>
>>instead.  This treats Data as a matrix rather than as a list. 
>>Dataframes are lists that look like matrices, but to me the matrix 
>>aspect is usually more intuitive.
>>
>>Duncan Murdoch
>>
>>> 
>>> I think Rui already pointed out the typo in the quoted text below...
>>> 
>>> Duncan Murdoch
>>> 
>>>>
>>>> The recommended syntax did not work, as listed above.
>>>>
>>>> What I want is the sort of distinct column output. Again, the 
column may
>>>> be text or numbers. This is a huge analysis effort with data 
coming at
>>>> me from many different sources.
>>>>
>>>>
>>>> *Stephen Dawson, DSL*
>>>> /Executive Strategy Consultant/
>>>> Business & Technology
>>>> +1 (865) 804-3454
>>>> http://www.shdawson.com <http://www.shdawson.com>
>>>>
>>>>
>>>> On 12/21/21 11:07 AM, Duncan Murdoch wrote:
>>>>> On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:
>>>>>> Thanks everyone for the replies.
>>>>>>
>>>>>> It is clear one either needs to write a function or put the 
unique
>>>>>> entries into another dataframe.
>>>>>>
>>>>>> It seems odd R cannot sort a list of unique column entries with 
ease.
>>>>>> Python and SQL can do it with ease.
>>>>>
>>>>> I've

Re: [R] Adding SORT to UNIQUE

2021-12-21 Thread Fox, John
Dear Jeff,

On 2021-12-21, 11:59 AM, "R-help on behalf of Jeff Newmiller" 
 wrote:

Intuitive, perhaps, but noticeably slower. 

I think that in most applications, one wouldn't notice the difference; for 
example:

> D <- data.frame(matrix(rnorm(1000*1e6), 1e6, 1000))

> microbenchmark(D[, 1])
Unit: microseconds
   expr   min    lq    mean median     uq    max neval
 D[, 1] 3.321 3.362 3.98561  3.444 3.5875 51.291   100

> microbenchmark(D[[1]])
Unit: microseconds
   expr   min    lq    mean median     uq    max neval
 D[[1]] 1.722 1.763 1.99137  1.804 1.8655 17.876   100

Best,
 John


And it doesn't work on tibbles by design. Data frames are lists of columns.


On December 21, 2021 8:38:35 AM PST, Duncan Murdoch 
 wrote:
>On 21/12/2021 11:31 a.m., Duncan Murdoch wrote:
>> On 21/12/2021 11:20 a.m., Stephen H. Dawson, DSL wrote:
>>> Thanks for the reply.
>>>
>>> sort(unique(Data[1]))
>>> Error in `[.data.frame`(x, order(x, na.last = na.last, decreasing =
>>> decreasing)) :
>>>  undefined columns selected
>> 
>> That's the wrong syntax:  Data[1] is not "column one of Data".  Use
>> Data[[1]] for that, so
>> 
>> sort(unique(Data[[1]]))
>
>Actually, I'd probably recommend
>
>   sort(unique(Data[, 1]))
>
>instead.  This treats Data as a matrix rather than as a list. 
>Dataframes are lists that look like matrices, but to me the matrix 
>aspect is usually more intuitive.
>
>Duncan Murdoch
>
>> 
>> I think Rui already pointed out the typo in the quoted text below...
>> 
>> Duncan Murdoch
>> 
>>>
>>> The recommended syntax did not work, as listed above.
>>>
>>> What I want is the sort of distinct column output. Again, the column may
>>> be text or numbers. This is a huge analysis effort with data coming at
>>> me from many different sources.
>>>
>>>
>>> *Stephen Dawson, DSL*
>>> /Executive Strategy Consultant/
>>> Business & Technology
>>> +1 (865) 804-3454
>>> http://www.shdawson.com 
>>>
>>>
>>> On 12/21/21 11:07 AM, Duncan Murdoch wrote:
 On 21/12/2021 10:16 a.m., Stephen H. Dawson, DSL via R-help wrote:
> Thanks everyone for the replies.
>
> It is clear one either needs to write a function or put the unique
> entries into another dataframe.
>
> It seems odd R cannot sort a list of unique column entries with ease.
> Python and SQL can do it with ease.

 I've seen several responses that looked pretty simple.  It's hard to
 beat sort(unique(x)), though there's a fair bit of confusion about
 what you actually want.  Maybe you should post an example of the code
 you'd use in Python?

 Duncan Murdoch

>
> QUESTION
> Is there a simpler means other than the unique function to 
capture
> distinct column entries, then sort that list?
>
>
> *Stephen Dawson, DSL*
> /Executive Strategy Consultant/
> Business & Technology
> +1 (865) 804-3454
> http://www.shdawson.com 
>
>
> On 12/20/21 5:53 PM, Rui Barradas wrote:
>> Hello,
>>
>> Inline.
>>
>> At 21:18 on 20/12/21, Stephen H. Dawson, DSL via R-help wrote:
>>> Thanks.
>>>
>>> sort(unique(Data[[1]]))
>>>
>>> This syntax provides row numbers, not column values.
>>
>> This is not right.
>> The syntax Data[1] extracts a sub-data.frame, the syntax Data[[1]]
>> extracts the column vector.
>>
>> As for my previous answer, it was not addressing the question, I
>> misinterpreted it as being a question on how to sort by numeric order
>> when the data is not numeric. Here is a, hopefully, complete answer.
>> Still with package stringr.
>>
>>
>> cols_to_sort <- 1:4
>>
>> Data2 <- lapply(Data[cols_to_sort], \(x){
>>  stringr::str_sort(unique(x), numeric = TRUE)
>> })
>>
>>
>> Or using Avi's suggestion of writing a function to do all the work 
and
>> simplify the lapply loop later,
>>
>>
>> unisort2 <- function(vec, ...) stringr::str_sort(unique(vec), ...)
>> Data2 <- lapply(Data[cols_to_sort], unisort2, numeric = TRUE)
>>
>>
>> Hope this helps,
>>
>> Rui Barradas
>>
>>
>>>
>>> *Stephen Dawson, DSL*
>>> /Executive Strategy Consultant/
>>> Business & Technology
>>> +1 (865) 804-3454
>>> http://www.shdawson.com 
>>>
>>>
>>> On 12/20/21 11:58 AM, Stephen H. Dawson, DSL via R-help wrote:

Re: [R] Character (1a, 1b) to numeric

2020-07-10 Thread Fox, John
Dear Bert,

Wouldn't you know it, but your contribution arrived just after I pressed "send" 
on my last message? So here's how your solution compares:

> microbenchmark(John = John <- xn[x], 
+Rich = Rich <- xn[match(x, xc)], 
+Jeff = Jeff <- {
+   n <- as.integer( sub( "[a-i]$", "", x ) )
+   d <- match( sub( "^\\d+", "", x ), letters[1:9] )
+   d[ is.na( d ) ] <- 0
+   n + d / 10
+},
+David = David <- as.numeric(gsub("a", ".3", 
+  gsub("b", ".5", 
+   gsub("c", ".7", x,
+Bert = Bert <- {
+   nums <- sub("[[:alpha:]]+","",x)  
+   alph <- sub("\\d+","",x)  
+   as.numeric(nums) + ifelse(alph == "",0, vals[alph])
+},
+times=1000L
+)
Unit: microseconds
  expr       min         lq       mean     median         uq       max neval cld
  John   261.739   373.9765   599.9411   536.571   569.3750  14489.48  1000 a   
  Rich   250.697   372.4450   542.3208   520.383   554.7215  10682.73  1000 a   
  Jeff 10879.223 13477.7665 15647.7856 15549.255 17516.7420 146155.28  1000  b  
 David 14337.510 18375.0100 20325.8796 20187.174 22161.0195  32575.31  1000d
  Bert 12344.506 15753.2510 18024.2757 17702.838 19973.0465  32043.80  1000   c 
> all.equal(John, Rich)
[1] TRUE
> all.equal(John, David)
[1] "names for target but not for current"
> all.equal(John, Jeff)
[1] "names for target but not for current" "Mean relative difference: 
0.1498243" 
> all.equal(John, Bert)
[1] "names for target but not for current"

To make the comparison fair, I moved the parts of the solutions that don't 
depend on the length of the data outside the benchmark. Your solution does have 
the virtue of providing the right answer.
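
Concretely, that setup amounts to something like the following (run once before 
calling microbenchmark(); pieced together from the posted solutions):

names(xn) <- xc                          # for the named-vector lookup
codes <- letters[1:3]                    # from Bert's solution
vals  <- setNames(c(.3, .5, .7), codes)
set.seed(123)
x <- sample(xc, 1e4, replace = TRUE)     # the test "data"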

Best,
 John

> On Jul 10, 2020, at 3:54 PM, Bert Gunter  wrote:
> 
> ... and continuing with this cute little thread...
> 
> I found the OP's specification a little imprecise -- are your values always a 
> string that begins with "some sort" of numeric value followed by "some sort" 
> of alpha code? That is, could the numeric value be several digits and the 
> alpha code several letters? Probably not, and the existing solutions you have 
> been provided are almost certainly all you need. But for fun, assuming this 
> more general specification, here is a general way to split your alphanumeric 
> codes up into numeric and alpha parts and then convert by using a couple of 
> sub() 's.
> 
> > set.seed(131)
> > xc <- sample(c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c"), 15, replace = 
> > TRUE) 
> > nums <- sub("[[:alpha:]]+","",xc)  ## extract numeric part
> > alph <- sub("\\d+","",xc)   ## extract alpha part
> > codes <- letters[1:3] ## whatever alpha codes are used
> > vals <- setNames(c(.3,.5,.7), codes) ## whatever numeric values to convert 
> > codes to
> > xnew <- as.numeric(nums) + ifelse(alph == "",0, vals[alph])
> > data.frame (xc = xc, xnew = xnew)
>xc xnew
> 1  1a  1.3
> 2   2  2.0
> 3  1c  1.7
> 4  1c  1.7
> 5  1b  1.5
> 6  1a  1.3
> 7   2  2.0
> 8   2  2.0
> 9  1a  1.3
> 10 1a  1.3
> 11 2c  2.7
> 12 1b  1.5
> 13 1b  1.5
> 14  1  1.0
> 15 1c  1.7
> 
> Echoing others, no claim for optimality in any sense.
> 
> Cheers,
> Bert
> 
> 
> On Fri, Jul 10, 2020 at 12:28 PM David Carlson  wrote:
> Here is a different approach:
> 
> xc <-  c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> xn <- as.numeric(gsub("a", ".3", gsub("b", ".5", gsub("c", ".7", xc))))
> xn
> # [1] 1.0 1.3 1.5 1.7 2.0 2.3 2.5 2.7
> 
> David L Carlson
> Professor Emeritus of Anthropology
> Texas A&M University
> 
> On Fri, Jul 10, 2020 at 1:10 PM Fox, John  wrote:
> 
> > Dear Jean-Louis,
> >
> > There must be many ways to do this. Here's one simple way (with no claim
> > of optimality!):
> >
> > > xc <-  c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> > > xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
> >

Re: [R] Character (1a, 1b) to numeric

2020-07-10 Thread Fox, John
Hi,

We've had several solutions, and I was curious about their relative efficiency. 
Here's a test with a moderately large data vector:

> library("microbenchmark")
> set.seed(123) # for reproducibility
> x <- sample(xc, 1e4, replace=TRUE) # "data"
> microbenchmark(John = John <- xn[x], 
+Rich = Rich <- xn[match(x, xc)], 
+Jeff = Jeff <- {
+ n <- as.integer( sub( "[a-i]$", "", x ) )
+ d <- match( sub( "^\\d+", "", x ), letters[1:9] )
+ d[ is.na( d ) ] <- 0
+ n + d / 10
+ },
+David = David <- as.numeric(gsub("a", ".3", 
+  gsub("b", ".5", 
+   gsub("c", ".7", x,
+times=1000L
+)
Unit: microseconds
  expr       min        lq       mean     median        uq       max neval cld
  John   228.816   345.371   513.5614   503.5965   533.0635  10829.08  1000 a  
  Rich   217.395   343.035   534.2074   489.0075   518.3260  15388.96  1000 a  
  Jeff 10325.471 13070.737 15387.2545 15397.9790 17204.0115 153486.94  1000  b 
 David 14256.673 18148.492 20185.7156 20170.3635 22067.6690  34998.95  1000   c
> all.equal(John, Rich)
[1] TRUE
> all.equal(John, David)
[1] "names for target but not for current"
> all.equal(John, Jeff)
[1] "names for target but not for current" "Mean relative difference: 
0.1498243" 

Of course, efficiency isn't the only consideration, and aesthetically (and no 
doubt subjectively) I prefer Rich Heiberger's solution. OTOH, Jeff's solution 
is more general in that it generates the correspondence between letters and 
numbers. The argument for Jeff's solution would, however, be stronger if it 
gave the desired answer.
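
To see where Jeff's more general rule and the desired coding diverge, take a 
single value (this assumes the names(xn) <- xc setup from my earlier message):

x1 <- "1a"
1 + match(sub("^\\d+", "", x1), letters[1:9])/10  # Jeff's rule maps letters to tenths: 1.1
xn[x1]                                            # the desired coding gives 1.3

which accounts for the mean relative difference reported by all.equal().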

Best,
 John

> On Jul 10, 2020, at 3:28 PM, David Carlson  wrote:
> 
> Here is a different approach:
> 
> xc <-  c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> xn <- as.numeric(gsub("a", ".3", gsub("b", ".5", gsub("c", ".7", xc))))
> xn
> # [1] 1.0 1.3 1.5 1.7 2.0 2.3 2.5 2.7
> 
> David L Carlson
> Professor Emeritus of Anthropology
> Texas A&M University
> 
> On Fri, Jul 10, 2020 at 1:10 PM Fox, John  wrote:
> Dear Jean-Louis,
> 
> There must be many ways to do this. Here's one simple way (with no claim of 
> optimality!):
> 
> > xc <-  c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> > xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
> > 
> > set.seed(123) # for reproducibility
> > x <- sample(xc, 20, replace=TRUE) # "data"
> > 
> > names(xn) <- xc
> > z <- xn[x]
> > 
> > data.frame(z, x)
>  z  x
> 1  2.5 2b
> 2  2.5 2b
> 3  1.5 1b
> 4  2.3 2a
> 5  1.5 1b
> 6  1.3 1a
> 7  1.3 1a
> 8  2.3 2a
> 9  1.5 1b
> 10 2.0  2
> 11 1.7 1c
> 12 2.3 2a
> 13 2.3 2a
> 14 1.0  1
> 15 1.3 1a
> 16 1.5 1b
> 17 2.7 2c
> 18 2.0  2
> 19 1.5 1b
> 20 1.5 1b
> 
> I hope this helps,
>  John
> 
>   -
>   John Fox, Professor Emeritus
>   McMaster University
>   Hamilton, Ontario, Canada
>   Web: http://socserv.mcmaster.ca/jfox
> 
> > On Jul 10, 2020, at 1:50 PM, Jean-Louis Abitbol  wrote:
> > 
> > Dear All
> > 
> > I have a character vector,  representing histology stages, such as for 
> > example:
> > xc <-  c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> > 
> > and this goes on to 3, 3a etc in various order for each patient. I do have 
> > of course a pre-established  classification available which does change 
> > according to the histology criteria under assessment.
> > 
> > I would want to convert xc, for plotting reasons, to a numeric vector such 
> > as
> > 
> > xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
> > 
> > Unfortunately I have no clue on how to do that.
> > 
> > Thanks for any help and apologies if I am missing the obvious way to do it.
> > 
> > JL
> > -- 
> > Verif30042020
> > 
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help

Re: [R] Character (1a, 1b) to numeric

2020-07-10 Thread Fox, John
Dear Jean-Louis,

There must be many ways to do this. Here's one simple way (with no claim of 
optimality!):

> xc <-  c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
> 
> set.seed(123) # for reproducibility
> x <- sample(xc, 20, replace=TRUE) # "data"
> 
> names(xn) <- xc
> z <- xn[x]
> 
> data.frame(z, x)
 z  x
1  2.5 2b
2  2.5 2b
3  1.5 1b
4  2.3 2a
5  1.5 1b
6  1.3 1a
7  1.3 1a
8  2.3 2a
9  1.5 1b
10 2.0  2
11 1.7 1c
12 2.3 2a
13 2.3 2a
14 1.0  1
15 1.3 1a
16 1.5 1b
17 2.7 2c
18 2.0  2
19 1.5 1b
20 1.5 1b

I hope this helps,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Jul 10, 2020, at 1:50 PM, Jean-Louis Abitbol  wrote:
> 
> Dear All
> 
> I have a character vector,  representing histology stages, such as for 
> example:
> xc <-  c("1", "1a", "1b", "1c", "2", "2a", "2b", "2c")
> 
> and this goes on to 3, 3a etc in various order for each patient. I do have of 
> course a pre-established  classification available which does change 
> according to the histology criteria under assessment.
> 
> I would want to convert xc, for plotting reasons, to a numeric vector such as
> 
> xn <- c(1, 1.3, 1.5, 1.7, 2, 2.3, 2.5, 2.7)
> 
> Unfortunately I have no clue on how to do that.
> 
> Thanks for any help and apologies if I am missing the obvious way to do it.
> 
> JL
> -- 
> Verif30042020
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot shows exponential values incompatible with data

2020-07-10 Thread Fox, John
Dear Jim,

As I pointed out yesterday, setting ylim as you suggest still results in 
"0e+00" as the smallest tick mark, as it should for evenly spaced ticks. 

Best,
 John

> On Jul 10, 2020, at 12:13 AM, Jim Lemon  wrote:
> 
> Hi Luigi,
> This is a result of the "pretty" function that calculates hopefully
> good looking axis ticks automatically. You can always specify
> ylim=c(1.0E09,max(Y)) if you want.
> 
> Jim
> 
> On Thu, Jul 9, 2020 at 10:59 PM Luigi Marongiu  
> wrote:
>> 
>> Hello,
>> I have these vectors:
>> ```
>> X <- 1:7
>> Y <- c(1438443863, 3910100650, 10628760108, 28891979048, 78536576706,
>> 213484643920, 580311678200)
>> plot(Y~X)
>> ```
>> The y-axis starts at 0e0, but the first value is 1.4 billion. Why does the
>> axis not start at 1e9?
>> 
>> 
>> 
>> --
>> Best regards,
>> Luigi
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot shows exponential values incompatible with data

2020-07-09 Thread Fox, John
Dear Bernard,

> On Jul 9, 2020, at 10:25 AM, Bernard Comcast  
> wrote:
> 
> Use the xlim option in the plot function?

I think you mean ylim, but as you'll find out when you try it, you still 
(reasonably) get an evenly spaced tick mark at 0:

plot(Y ~ X, ylim=c(1e9, 6e11))

The "right" thing to do with exponential values is to plot on a log scale or 
(as Rui reasonably suggested) use a logged axis.
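
For example, either of

plot(log10(Y) ~ X)      # log the data
plot(Y ~ X, log = "y")  # or keep the raw data and use a logarithmic y axis

keeps the first point well separated from the axis origin.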

Best,
 John

> 
> Bernard
> Sent from my iPhone so please excuse the spelling!"
> 
>> On Jul 9, 2020, at 10:06 AM, Luigi Marongiu  wrote:
>> 
>> Thank you,
>> but why does it not work on a linear scale? With the log scale, I know it
>> works but I am not looking for it; is there a way to force a linear
>> scale?
>> Regards
>> Luigi
>> 
>>> On Thu, Jul 9, 2020 at 3:44 PM Fox, John  wrote:
>>> 
>>> Dear Luigi,
>>> 
>>>>> On Jul 9, 2020, at 8:59 AM, Luigi Marongiu  
>>>>> wrote:
>>>> 
>>>> Hello,
>>>> I have these vectors:
>>>> ```
>>>> X <- 1:7
>>>> Y <- c(1438443863, 3910100650, 10628760108, 28891979048, 78536576706,
>>>> 213484643920, 580311678200)
>>>> plot(Y~X)
>>>> ```
>>>> The y-axis starts at 0e0, but the first value is 1.4 billion. Why does the
>>>> axis not start at 1e9?
>>> 
>>> Because you're plotting on a linear, not log, scale, and 0*10^11 = 0.
>>> 
>>>> round(Y/1e11)
>>> [1] 0 0 0 0 1 2 6
>>> 
>>> Then try plot(log(Y) ~ X).
>>> 
>>> I hope this helps,
>>> John
>>> 
>>> -
>>> John Fox, Professor Emeritus
>>> McMaster University
>>> Hamilton, Ontario, Canada
>>> Web: http://socserv.mcmaster.ca/jfox
>>>> 
>>>> 
>>>> 
>>>> --
>>>> Best regards,
>>>> Luigi
>>>> 
>>>> __
>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide 
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>> 
>> 
>> -- 
>> Best regards,
>> Luigi
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot shows exponential values incompatible with data

2020-07-09 Thread Fox, John
Dear Luigi,

> On Jul 9, 2020, at 9:59 AM, Luigi Marongiu  wrote:
> 
> Thank you,
> but why does it not work on a linear scale? With the log scale, I know it
> works but I am not looking for it; is there a way to force a linear
> scale?

The scale *is* linear and the choice of tick marks, which are evenly spaced, is 
reasonable, given that 10^9 is 2 orders of magnitude smaller than 10^11. That 
is, on a linear scale with this range, 10^9 isn't much larger than 0.

If you really want a tick at 10^9, then you can just put one there:

plot(Y~X, axes=FALSE, frame=TRUE)
axis(1)
axis(2, at=c(1e9, (1:6)*1e11))

But now the ticks aren't evenly spaced (though they appear to be because, as I 
mentioned, 10^9 is "close" to 0).

Best,
 John

> Regards
> Luigi
> 
> On Thu, Jul 9, 2020 at 3:44 PM Fox, John  wrote:
>> 
>> Dear Luigi,
>> 
>>> On Jul 9, 2020, at 8:59 AM, Luigi Marongiu  wrote:
>>> 
>>> Hello,
>>> I have these vectors:
>>> ```
>>> X <- 1:7
>>> Y <- c(1438443863, 3910100650, 10628760108, 28891979048, 78536576706,
>>> 213484643920, 580311678200)
>>> plot(Y~X)
>>> ```
>>> The y-axis starts at 0e0, but the first value is 1.4 billion. Why does the
>>> axis not start at 1e9?
>> 
>> Because you're plotting on a linear, not log, scale, and 0*10^11 = 0.
>> 
>>> round(Y/1e11)
>> [1] 0 0 0 0 1 2 6
>> 
>> Then try plot(log(Y) ~ X).
>> 
>> I hope this helps,
>> John
>> 
>>  -
>>  John Fox, Professor Emeritus
>>  McMaster University
>>  Hamilton, Ontario, Canada
>>  Web: http://socserv.mcmaster.ca/jfox
>>> 
>>> 
>>> 
>>> --
>>> Best regards,
>>> Luigi
>>> 
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
> 
> -- 
> Best regards,
> Luigi
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] plot shows exponential values incompatible with data

2020-07-09 Thread Fox, John
Dear Luigi,

> On Jul 9, 2020, at 8:59 AM, Luigi Marongiu  wrote:
> 
> Hello,
> I have these vectors:
> ```
> X <- 1:7
> Y <- c(1438443863, 3910100650, 10628760108, 28891979048, 78536576706,
> 213484643920, 580311678200)
> plot(Y~X)
> ```
> The y-axis starts at 0e0, but the first value is 1.4 billion. Why does the
> axis not start at 1e9?

Because you're plotting on a linear, not log, scale, and 0*10^11 = 0. 

> round(Y/1e11)
[1] 0 0 0 0 1 2 6

Then try plot(log(Y) ~ X).

I hope this helps,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox
> 
> 
> 
> -- 
> Best regards,
> Luigi
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [Rd] R 4.0.2 scheduled for June 22

2020-06-09 Thread Fox, John
Dear Peter,

Thank you very much for this.

To clarify slightly, the bug affects not just the Rcmdr package but use of the 
tcltk package on Windows more generally.

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Jun 9, 2020, at 5:28 PM, Peter Dalgaard via R-help  
> wrote:
> 
> Unfortunately, a memory allocation bug prevented the R Commander package from 
> working on Windows. This is fixed in R-patched, but we cannot have this not 
> working in the official release when IT departments start installing for the 
> Fall semester, so we need to issue a new release.
> 
> Full schedule is available on developer.r-project.org.
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> __
> r-de...@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
> 
> ___
> r-annou...@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-announce
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

___
r-annou...@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-announce

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R 4.0.1 crashes with R Commander

2020-06-08 Thread Fox, John
Dear Paulo,

This is due to a known bug in R 4.0.1 for Windows that is general to Tcl/Tk. 
The bug should be fixed in the current patched version of R 4.0.1 for Windows, 
so you could use that or just go back to R 4.0.0. 

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Jun 7, 2020, at 1:53 PM, Paulo Figueiredo  wrote:
> 
> Hi again,
> 
> as an update, I tried to open R Commander under R 32 bits and it worked, but 
> not with R Studio choosing the 32 bit R.
> 
> Thus, trying to load R Commander under R Studio (32 and 64 bits) or R 64 bits 
> crashes the programmes. It only loads under 32 bit R 4.0.1.
> 
> Appreciate any help.
> 
> Cheers
> 
> 
> -- 
> This email has been checked for viruses by Avast antivirus software.
> https://www.avast.com/antivirus
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] "effects" package with "lme4"

2020-05-15 Thread Fox, John
Dear Axel,

> On May 15, 2020, at 7:39 PM, Axel Urbiz  wrote:
> 
> Dear John,
> 
> Thank you for your response. 
> 
> My apologies as I’m only recently getting exposed to mixed models. My 
> understanding is that the model specified below also has random intercepts 
> and slopes, as they vary by Subject. `coef(fm1)` shows this. I was looking to 
> plot the fitted splines by Subject.
> 
> Sorry if my interpretation is incorrect.

There's no need to apologize. 

The functions in the effects package compute and graph fixed effects for mixed 
models. You could compute and graph the BLUP for the fitted spline for each 
subject (see below) but it's not what the effects package does.

I think that the following does what you want:

---- snip ----

subjects <- levels(sleepstudy$Subject)
fits <- matrix(0, length(subjects), 10)
rownames(fits) <- subjects
for (subject in subjects){
  fits[subject, ] <- predict(fm1, newdata=data.frame(Subject=subject, Days=0:9))
}
plot(c(0, 10), range(fits), type="n", xlab="Days", ylab="Reaction")
for (subject in subjects) lines(0:9, fits[subject, ])

---- snip ----

Best,
 John
 
> 
> Best,
> Axel.
> 
> 
> 
>> On May 15, 2020, at 5:51 PM, Fox, John  wrote:
>> 
>> Dear Axel,
>> 
>> There's only one fixed effect in the model, ns(Days, 3), so I don't know what 
>> you expected.
>> 
>> Best,
>> John
>> --
>> John Fox, Professor Emeritus
>> McMaster University
>> Hamilton, Ontario, Canada
>> Web: socialsciences.mcmaster.ca/jfox/
>> 
>> 
>> 
>>> -Original Message-
>>> From: Axel Urbiz 
>>> Sent: Friday, May 15, 2020 5:33 PM
>>> To: Fox, John ; R-help@r-project.org
>>> Subject: "effects" package with "lme4"
>>> 
>>> Hello John and others,
>>> 
>>> I’d appreciate your help as I’m trying to plot the effect of predictor
>>> “Days” on Reaction by Subject. I’m only getting one plot in the example
>>> below.
>>> 
>>> ### Start example
>>> 
>>> library(lme4)
>>> library(splines)
>>> data("sleepstudy")
>>> 
>>> fm1 <- lmer(Reaction ~ ns(Days, 3) + (ns(Days, 3) | Subject), sleepstudy)
>>> coef(fm1)
>>> plot(allEffects(fm1))
>>> 
>>> ### End example
>>> 
>>> Thanks,
>>> Axel.
> 

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] "effects" package with "lme4"

2020-05-15 Thread Fox, John
Dear Axel,

There's only one fixed effect in the model, ns(Days, 3), so I don't know what you 
expected.
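
A quick way to see that (using lme4's accessors) is to look at the two parts of 
the model separately:

fixef(fm1)                 # the fixed effects: intercept plus the ns(Days, 3) terms
head(ranef(fm1)$Subject)   # per-subject deviations, which are not fixed effects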

Best,
 John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: Axel Urbiz 
> Sent: Friday, May 15, 2020 5:33 PM
> To: Fox, John ; R-help@r-project.org
> Subject: "effects" package with "lme4"
> 
> Hello John and others,
> 
> I’d appreciate your help as I’m trying to plot the effect of predictor
> “Days” on Reaction by Subject. I’m only getting one plot in the example
> below.
> 
> ### Start example
> 
> library(lme4)
> library(splines)
> data("sleepstudy")
> 
> fm1 <- lmer(Reaction ~ ns(Days, 3) + (ns(Days, 3) | Subject), sleepstudy)
> coef(fm1)
> plot(allEffects(fm1))
> 
> ### End example
> 
> Thanks,
> Axel.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Loop inside dplyr::mutate

2020-05-09 Thread Fox, John
Dear Axel,

Assuming that you're not wedded to using mutate():

> D1 <- 1 - as.matrix(sim_data_wide[, 2:11])
> D2 <- matrix(0, 10, 10)
> colnames(D2) <- paste0("PC_", 1:10)
> for (i in 1:10) D2[, i] <- 1 - apply(D1[, 1:i, drop=FALSE], 1, prod)
> all.equal(D2, as.matrix(sim_data_wide[, 22:31]))
[1] TRUE 
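
Equivalently, since each PC_i is one minus a running product of the (1 - P_j) 
terms, cumprod() gives the same result without the explicit loop (just a sketch 
of the same computation):

D2 <- 1 - t(apply(D1, 1, cumprod))  # apply() returns the cumulative products as columns, hence t()
colnames(D2) <- paste0("PC_", 1:10)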

I hope this helps,
 John

> On May 9, 2020, at 7:45 PM, Axel Urbiz  wrote:
> 
> Hello, 
> 
> Is there a less verbose approach to obtaining the PC_i variables inside the 
> mutate?
> 
> library(tidyverse)
> sim_data <- data.frame(borrower_id = sort(rep(1:10, 20)),
>   quarter = rep(1:20, 10),
>   pd = runif(length(rep(1:20, 10)))) # conditional probs
> 
> sim_data_wide <- tidyr::spread(sim_data, quarter, pd)  
> colnames(sim_data_wide)[-1] <- paste0("P_", colnames(sim_data_wide)[-1])
> 
> # Compute cumulative probs
> sim_data_wide <- sim_data_wide %>%
>  mutate(PC_1 = P_1,
> PC_2 = 1-(1-P_1)*(1-P_2),
> PC_3 = 1-(1-P_1)*(1-P_2)*(1-P_3),
> PC_4 = 1-(1-P_1)*(1-P_2)*(1-P_3)*(1-P_4),
> PC_5 = 1-(1-P_1)*(1-P_2)*(1-P_3)*(1-P_4)*(1-P_5),
> PC_6 = 
> 1-(1-P_1)*(1-P_2)*(1-P_3)*(1-P_4)*(1-P_5)*(1-P_6),
> PC_7 = 
> 1-(1-P_1)*(1-P_2)*(1-P_3)*(1-P_4)*(1-P_5)*(1-P_6)*(1-P_7),
> PC_8 = 
> 1-(1-P_1)*(1-P_2)*(1-P_3)*(1-P_4)*(1-P_5)*(1-P_6)*(1-P_7)*(1-P_8),
> PC_9 = 
> 1-(1-P_1)*(1-P_2)*(1-P_3)*(1-P_4)*(1-P_5)*(1-P_6)*(1-P_7)*(1-P_8)*(1-P_9),
> PC_10 = 
> 1-(1-P_1)*(1-P_2)*(1-P_3)*(1-P_4)*(1-P_5)*(1-P_6)*(1-P_7)*(1-P_8)*(1-P_9)*(1-P_10)
>)
> 
> 
> Thanks,
> Axel.
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] possible issue with scatterplot function in car package

2020-05-03 Thread Fox, John
Dear Yousri,

The problem with scatterplot() is now fixed in the development version 3.0-8 of 
the car package on R-Forge, which eventually will be submitted to CRAN. Until 
then, you can install the package via install.packages("car", 
repos="http://R-Forge.R-project.org";)

Thanks again for the bug report,
 John

> On May 3, 2020, at 4:37 AM, Yousri Fanous  wrote:
> 
> Thank you Professor John for your answer.
> 
> As you rightly said I am not using the ch in my example report as it has no 
> bearing on the issue.
> However it is the ch that led me to find the issue.
> I was trying to label each point with its corresponding aa$ch value.
> I used this code:
> 
> scatterplot(aa$x,aa$y,smooth = FALSE, grid = FALSE, frame = FALSE,regLine=F)
> text(aa$x,aa$y, labels=aa$ch,font=1 ,cex=.9,pos=3)
> 
> The annotation was correct for 4 points but not for the (2,5) point.
> I figured it is because it is close to the margin of the plot hence as a 
> quick solution I modified xlim to shift the point away from the margin.
> This worked for the annotation but eventually led to the issue I described.
> 
> Thank you so much for your time
> 
> Yousri Fanous
> 
> Software Developer
> IBM CANADA
> 
> On Sat, May 2, 2020 at 11:47 PM Fox, John  wrote:
> 
> Dear Yousri,
> 
> Yes, this is clearly a bug, and almost surely a long-standing one. We'll fix 
> it in the next release of the car package.
> 
> BTW, stringsAsFactors defaults to FALSE in R 4.0.0 (and you don't use the ch 
> variable in the example). Also, although it has no bearing on the bug, I'd 
> generally prefer
> 
> scatterplot(y ~ x, data=aa, smooth=FALSE, grid=FALSE, 
> frame=FALSE, regLine=FALSE, xlim=c(0, 8))
> 
> Thank you for the bug report,
>  John
> 
>   -
>   John Fox, Professor Emeritus
>   McMaster University
>   Hamilton, Ontario, Canada
>   Web: http://socserv.mcmaster.ca/jfox
> 
> > On May 2, 2020, at 7:30 PM, Yousri Fanous  wrote:
> > 
> > library (car)
> > 
> > aa <- data.frame(x=c(2,  5, 6, 7, 8),
> > +  y=c(5,  10, 9, 12, 11),
> > + ch=c("N",  "Q", "R", "S", "T"),
> > + stringsAsFactors=FALSE)
> > 
> > scatterplot(aa$x,aa$y,smooth = FALSE, grid = FALSE, frame = FALSE,regLine=F)
> > 
> > Both x and y boxplots are correct
> > and in particular the median of the x box is at 6 which is confirmed
> > 
> >> median(aa$x)
> > [1] 6
> > 
> > Now I do only one addition to the scatterplot: I add xlim
> >> scatterplot(aa$x,aa$y,smooth = FALSE, grid = FALSE, frame =
> > FALSE,regLine=F,xlim=c(0,8))
> > 
> > This causes the boxplot on x-axis to be in error:
> > 1) the lower whisker starts now from zero
> > 2) the median is between 4 and 6 and no longer at 6 as before
> > 
> >> sessionInfo()
> > R version 3.6.3 (2020-02-29)
> > [1] car_3.0-7
> > 
> >   [[alternative HTML version deleted]]
> > 
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] possible issue with scatterplot function in car package

2020-05-02 Thread Fox, John


Dear Yousri,

Yes, this is clearly a bug, and almost surely a long-standing one. We'll fix it 
in the next release of the car package.

BTW, stringsAsFactors defaults to FALSE in R 4.0.0 (and you don't use the ch 
variable in the example). Also, although it has no bearing on the bug, I'd 
generally prefer

scatterplot(y ~ x, data=aa, smooth=FALSE, grid=FALSE, 
frame=FALSE, regLine=FALSE, xlim=c(0, 8))

Thank you for the bug report,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On May 2, 2020, at 7:30 PM, Yousri Fanous  wrote:
> 
> library (car)
> 
> aa <- data.frame(x=c(2,  5, 6, 7, 8),
> +  y=c(5,  10, 9, 12, 11),
> + ch=c("N",  "Q", "R", "S", "T"),
> + stringsAsFactors=FALSE)
> 
> scatterplot(aa$x,aa$y,smooth = FALSE, grid = FALSE, frame = FALSE,regLine=F)
> 
> Both x and y boxplots are correct
> and in particular the median of the x box is at 6 which is confirmed
> 
>> median(aa$x)
> [1] 6
> 
> Now I do only one addition to the scatterplot: I add xlim
>> scatterplot(aa$x,aa$y,smooth = FALSE, grid = FALSE, frame =
> FALSE,regLine=F,xlim=c(0,8))
> 
> This causes the boxplot on x-axis to be in error:
> 1) the lower whisker starts now from zero
> 2) the median is between 4 and 6 and no longer at 6 as before
> 
>> sessionInfo()
> R version 3.6.3 (2020-02-29)
> [1] car_3.0-7
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rtools required

2020-04-29 Thread Fox, John
Dear Steven,

It's possible that Windows will hide .Renviron, but it's generally a good idea, 
in my opinion, in Folder Options > View to click "Show hidden files" and 
uncheck "hide extensions". Then .Renviron should show up (once you've created 
it).
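
Alternatively, you can locate and edit the file from within R itself, which 
sidesteps the Explorer settings entirely (a small sketch using base functions):

path.expand("~/.Renviron")   # the full path R will use for the file
file.edit("~/.Renviron")     # open it in a text editor; save it to create the file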

Best,
 John

> -Original Message-
> From: Bert Gunter 
> Sent: Wednesday, April 29, 2020 5:50 PM
> To: Steven 
> Cc: Fox, John ; R-help Mailing List  project.org>
> Subject: Re: [R] Rtools required
> 
> Type
> ?.Renviron
> ?R.home
> ?"environment variables"
> 
> at the R prompt to get what I think should be the info you need (or at
> least useful info).
> 
> 
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> 
> On Wed, Apr 29, 2020 at 2:37 PM Steven  wrote:
> >
> > Thanks John. Where is file .Renviron located? It must be a hidden file.
> > I cannot find it.
> >
> > On 2020/4/28 下午 08:29, Fox, John wrote:
> > > Dear Steven,
> > >
> > > Did you follow the instruction on the Rtools webpage to add
> > >
> > >   PATH="${RTOOLS40_HOME}\usr\bin;${PATH}"
> > >
> > > to your .Renviron file?
> > >
> > > I hope this helps,
> > >   John
> > >
> > >-
> > >John Fox, Professor Emeritus
> > >McMaster University
> > >Hamilton, Ontario, Canada
> > >Web: http::/socserv.mcmaster.ca/jfox
> > >
> > >> On Apr 28, 2020, at 4:38 AM, Steven  wrote:
> > >>
> > >> Dear All
> > >>
> > >> I updated to R-4.0.0. and also installed the latest Rtools 4.0 (to
> > >> now the new default folder c:\rtools40). While compiling a package
> > >> (binary) I received the following warning message saying Rtools is
> > >> required. Any clues? Thanks.
> > >>
> > >> Steven Yen
> > >>
> > >> WARNING: Rtools is required to build R packages but is not
> > >> currently installed. Please download and install the appropriate
> > >> version of Rtools before proceeding:
> > >> https://cran.rstudio.com/bin/windows/Rtools/
> > >>
> > >>
> > >>  [[alternative HTML version deleted]]
> > >>
> > >> __
> > >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >> https://stat.ethz.ch/mailman/listinfo/r-help
> > >> PLEASE do read the posting guide
> > >> http://www.R-project.org/posting-guide.html
> > >> and provide commented, minimal, self-contained, reproducible code.
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rtools required

2020-04-28 Thread Fox, John
Dear Steven,

> -Original Message-
> From: Steven 
> Sent: Tuesday, April 28, 2020 9:50 AM
> To: Fox, John 
> Cc: R-help Mailing List 
> Subject: Re: [R] Rtools required
> 
> Hello John,
> 
> Perhaps you can help me. I am an idiot. I visited the Rtools web page and
> learned to run the following lines in R. Still I am getting the same warning
> message.
> 
>  > writeLines('PATH="${RTOOLS40_HOME}\\usr\\bin;${PATH}"', con =
> "~/.Renviron")
>  > Sys.which("make")
>    make
> "C:\\rtools40\\usr\\bin\\make.exe

The first command writes the modification to your path in the .Renviron file in 
your home directory, which is read at the start of each R session. I 
assume that you executed the second command in a fresh session, and it 
indicates that the Rtools are indeed accessible. 

Given that, I don't know why you're still having a problem, assuming that you 
tried to build the package in a fresh session *after* you created .Renviron.

Sorry I can't be of more help,
 John

> 
> On 2020/4/28 下午 08:29, Fox, John wrote:
> > Dear Steven,
> >
> > Did you follow the instruction on the Rtools webpage to add
> >
> > PATH="${RTOOLS40_HOME}\usr\bin;${PATH}"
> >
> > to your .Renviron file?
> >
> > I hope this helps,
> >   John
> >
> >-
> >John Fox, Professor Emeritus
> >McMaster University
> >Hamilton, Ontario, Canada
> >Web: http::/socserv.mcmaster.ca/jfox
> >
> >> On Apr 28, 2020, at 4:38 AM, Steven  wrote:
> >>
> >> Dear All
> >>
> >> I updated to R-4.0.0. and also installed the latest Rtools 4.0 (to
> >> now the new default folder c:\rtools40). While compiling a package
> >> (binary) I received the follow marning message saying Rtools is
> >> required. Any clues? Thanks.
> >>
> >> Steven Yen
> >>
> >> WARNING: Rtools is required to build R packages but is not currently
> >> installed. Please download and install the appropriate version of
> >> Rtools before proceeding:
> >> https://cran.rstudio.com/bin/windows/Rtools/
> >>
> >>
> >>[[alternative HTML version deleted]]
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rtools required

2020-04-28 Thread Fox, John
Dear Steven,

Did you follow the instruction on the Rtools webpage to add 

PATH="${RTOOLS40_HOME}\usr\bin;${PATH}"

to your .Renviron file?

I hope this helps,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Apr 28, 2020, at 4:38 AM, Steven  wrote:
> 
> Dear All
> 
> I updated to R-4.0.0 and also installed the latest Rtools 4.0 (now in 
> the new default folder c:\rtools40). While compiling a package (binary) 
> I received the following warning message saying Rtools is required. Any 
> clues? Thanks.
> 
> Steven Yen
> 
> WARNING: Rtools is required to build R packages but is not currently 
> installed. Please download and install the appropriate version of Rtools 
> before proceeding: https://cran.rstudio.com/bin/windows/Rtools/
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] deciphering help for `attach`

2020-04-27 Thread Fox, John
Dear Bert,

> -Original Message-
> From: Bert Gunter 
> Sent: Monday, April 27, 2020 11:26 AM
> To: Fox, John 
> Cc: edwar...@psu.ac.th; r-help@r-project.org
> Subject: Re: [R] deciphering help for `attach`
> 
> What is the use case for attach? As the Help says, I find that with() or
> sometimes within() handles the situations where I would use it.

I don't believe that I was making a general case for using attach(), and in 
fact was arguing to the contrary that its use can produce confusion. I was 
simply trying to clarify what the help page says, which was the focus of the 
original question.

I've only once encountered a situation where I wanted to attach an environment 
(not a data frame) to the search path.

Best,
 John

> 
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> 
> On Mon, Apr 27, 2020 at 7:58 AM Fox, John  wrote:
> >
> > Dear Edward,
> >
> > Paragraph 4 in the help page goes on to say, "Rather, a new environment
> is created on the search path and the elements of a list (including
> columns of a data frame) or objects in a save file or an environment are
> copied into the new environment."
> >
> > That seems reasonably clear to me. Here's an example that also
> illustrates how attach can lead to confusion:
> >
> > --- snip 
> >
> > > str(cars)
> > 'data.frame':   50 obs. of  2 variables:
> >  $ speed: num  4 4 7 7 8 9 10 10 10 11 ...
> >  $ dist : num  2 10 4 22 16 10 18 26 34 17 ...
> >
> > > attach(cars)
> >
> > > search()
> >  [1] ".GlobalEnv""cars"  "tools:rstudio"
> "package:stats"
> >  [5] "package:graphics"  "package:grDevices" "package:utils"
> "package:datasets"
> >  [9] "package:methods"   "Autoloads" "package:base"
> >
> > > objects()
> > character(0)
> >
> > > objects(pos=2)
> > [1] "dist"  "speed"
> >
> > > str(get("dist", pos=2))
> >  num [1:50] 2 10 4 22 16 10 18 26 34 17 ...
> >
> > > dist <- 1:10
> >
> > > head(dist) # shadows dist in copy of cars
> > [1] 1 2 3 4 5 6
> >
> > > head(get("dist", pos=2))
> > [1]  2 10  4 22 16 10
> >
> > > assign("dist", 10:1, pos=2) # changes dist in objects copied from
> > > cars
> >
> > > head(get("dist", pos=2))
> > [1] 10  9  8  7  6  5
> >
> > --- snip 
> >
> > Paragraph 5 also seems clear to me. Here's an example:
> >
> > --- snip 
> >
> > > attach(NULL)
> >
> > > search()
> >  [1] ".GlobalEnv""NULL"  "cars"
> "tools:rstudio"
> >  [5] "package:stats" "package:graphics"  "package:grDevices"
> "package:utils"
> >  [9] "package:datasets"  "package:methods"   "Autoloads"
> "package:base"
> >
> > > assign("x", 10, pos=2)
> >
> > > x
> > [1] 10
> >
> > --- snip 
> >
> > Now that may beg the question of why one would want to do something like
> this, which isn't addressed in the help file, but a fair comment is that
> if you don't need to store objects in an environment that's accessible on
> the path, why worry about it? After all, no one is forcing you to use this
> trick.
> >
> > Finally, I too recommended that students use attach() when I first
> started teaching with R, until I noticed that they frequently tied
> themselves into knots by attaching different versions of the same data
> during a session, producing confusion about where the data were coming
> from and what version they were using. It's not hard in R to avoid the use
> of attach(). Of course, attach() is still part of the language, so you,
> and your students, are free to continue using it if you wish, and perhaps
> your students avoid the problems that mine often created.
> >
> > Best,
> >  John
> >
> >   -
> >   John Fox, Professor Emeritus
> >   McMaster University
> >   Hamilton, Ontario, Canada
> >   Web: http://socserv.mcmaster.ca/jfox
> >
> > > On Apr 27, 2020, at 9:26 AM, Edward McNeil  wrote:

Re: [R] deciphering help for `attach`

2020-04-27 Thread Fox, John
Dear Edward,

Paragraph 4 in the help page goes on to say, "Rather, a new environment is 
created on the search path and the elements of a list (including columns of a 
data frame) or objects in a save file or an environment are copied into the new 
environment."

That seems reasonably clear to me. Here's an example that also illustrates how 
attach can lead to confusion:

--- snip 

> str(cars)
'data.frame':   50 obs. of  2 variables:
 $ speed: num  4 4 7 7 8 9 10 10 10 11 ...
 $ dist : num  2 10 4 22 16 10 18 26 34 17 ...

> attach(cars)

> search()
 [1] ".GlobalEnv"        "cars"              "tools:rstudio"     "package:stats"    
 [5] "package:graphics"  "package:grDevices" "package:utils"     "package:datasets" 
 [9] "package:methods"   "Autoloads"         "package:base"     

> objects()
character(0)

> objects(pos=2)
[1] "dist"  "speed"

> str(get("dist", pos=2))
 num [1:50] 2 10 4 22 16 10 18 26 34 17 ...

> dist <- 1:10

> head(dist) # shadows dist in copy of cars
[1] 1 2 3 4 5 6

> head(get("dist", pos=2))
[1]  2 10  4 22 16 10

> assign("dist", 10:1, pos=2) # changes dist in objects copied from cars

> head(get("dist", pos=2))
[1] 10  9  8  7  6  5

--- snip 

Paragraph 5 also seems clear to me. Here's an example:

--- snip 

> attach(NULL)

> search()
 [1] ".GlobalEnv"        "NULL"              "cars"              "tools:rstudio"    
 [5] "package:stats"     "package:graphics"  "package:grDevices" "package:utils"    
 [9] "package:datasets"  "package:methods"   "Autoloads"         "package:base"     

> assign("x", 10, pos=2)

> x
[1] 10

--- snip 

Now that may beg the question of why one would want to do something like this, 
which isn't addressed in the help file, but a fair comment is that if you don't 
need to store objects in an environment that's accessible on the path, why 
worry about it? After all, no one is forcing you to use this trick.
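
A minimal sketch of the same trick with an explicit name for the new environment (the 
name "scratch" is arbitrary), including how to clean up afterwards:

> attach(NULL, name="scratch")   # an empty environment at position 2 of the search path
> assign("x", 10, pos="scratch") # pos can be given as the name as well as the position
> get("x", pos="scratch")
[1] 10
> detach("scratch")              # remove it from the search path when you're done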

Finally, I too recommended that students use attach() when I first started 
teaching with R, until I noticed that they frequently tied themselves into 
knots by attaching different versions of the same data during a session, 
producing confusion about where the data were coming from and what version they 
were using. It's not hard in R to avoid the use of attach(). Of course, 
attach() is still part of the language, so you, and your students, are free to 
continue using it if you wish, and perhaps your students avoid the problems 
that mine often created.

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Apr 27, 2020, at 9:26 AM, Edward McNeil  wrote:
> 
> Dear Petr,
> Thanks for your quick reply. Much appreciated. However, you haven't really 
> answered
> either of my questions, although I don't quite understand your reference to 
> La Gioconda.
> 
> In any case, despite your strong recommendation not to use `attach`, I am 
> going to keep
> using it, as I have done successfully for the past 16 years, and keep 
> teaching it, until
> it either kills me or disappears from R. Unfortunately I have to teach R to 
> students and
> I don't like it when they ask me "tricky" questions to which I have no 
> answer. ;)
> -- 
> Edward McNeil
> 
> On Mon, April 27, 2020 8:00 pm, PIKAL Petr wrote:
> Hi.
> 
> I strongly recommend not using attach. I agree that the statements you mention are
> rather contradictory, and others could probably give you a more insightful answer. You
> could think of it this way: by attaching some data, you create something like a copy of
> the original data in your system, with the feature that you can use the column names
> directly. If you change something in the data after attachment, you change only the
> attached version and not the original.
> 
> It is as if you took a picture of the Gioconda and used some creativity to add a
> moustache to the picture. Under no circumstances does the moustache propagate to the
> original painting in the Louvre. Do not perform any tricks; preferably, do not perform
> attach.
> 
> Cheers
> Petr
> 
>> -Original Message-
>> From: R-help  On Behalf Of Edward McNeil
>> Sent: Monday, April 27, 2020 2:07 PM
>> To: r-help@r-project.org
>> Subject: [R] deciphering help for `attach`
>> 
>> Hi,
>> I have two related questions.
>> 
>> 1. In the help page for `attach` under "Details" it says in paragraph 3:
>> "By default the database is attached ..."
>> 
>> But then paragraph 4 starts: "The database is not actually attached."
>> 
>> Could somebody explain this contradiction? Is the data(base) attached or
>> not?
>> 
>> 2. What is meant by the 5th paragraph: "One useful ‘trick’ is to use what =
>> NULL (or equivalently a length-zero list) to create a new environment on the
>> search path into which objects can be assigned by `assign` ... "?
>> 
>> I don't understand what this "trick" is or why a "trick" needs to be 
>> performed
>> here.
>> 

Re: [R] Correct way to cite R and RStudio in a manuscript

2020-04-15 Thread Fox, John
Dear John,

For R, see citation().
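
For example (the boot package below is just an illustration):

> citation()             # how to cite R itself
> toBibtex(citation())   # the same reference as a BibTeX entry
> citation("boot")       # the citation for a contributed package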

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Apr 15, 2020, at 10:16 AM, Sorkin, John  wrote:
> 
> What is the proper way to cite R and Rstudio in a manuscript?
> John
> 
> 
> John David Sorkin M.D., Ph.D.
> 
> Professor of Medicine
> 
> Chief, Biostatistics and Informatics
> 
> University of Maryland School of Medicine Division of Gerontology and 
> Geriatric Medicine
> 
> Baltimore VA Medical Center
> 
> 10 North Greene Street
> 
> GRECC (BT/18/GR)
> 
> Baltimore, MD 21201-1524
> 
> (Phone) 410-605-7119
> 
> (Fax) 410-605-7913 (Please call phone number above prior to faxing) 
> 
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem launching Rcmdr

2020-03-16 Thread Fox, John
Dear Brian,

> On Mar 16, 2020, at 8:23 PM, Brian Grossman  wrote:
> 
> John,
> 
> It appears that the new version of R   3.6.3 does not have this issue. I will 
> be doing more testing and will let you know if we have any issues.

I'm glad that the Rcmdr is now working normally for you -- thanks for letting 
me know.

Since no one else has reported a similar problem under R 3.6.2, I suspect that 
your installation was somehow broken or strangely configured. I guess that 
we'll never know what the source of the problem was.

Best,
 John

> 
> Thank you for your assistance,
> 
> Brian
> 
> On Tue, Mar 10, 2020 at 8:46 AM Fox, John  wrote:
> Dear Brian,
> 
> Normally I'd expect that a workspace saved from a previous session and loaded 
> at the start of the current session would cause this kind of anomalous 
> behaviour, but that doesn't explain why the Rcmdr starts up properly in a 
> second (concurrent?) session, nor why it doesn't start up properly when R is 
> run with the --vanilla switch.
> 
> Can you report the result of sessionInfo() at the start of a session?
> 
> If all else fails, you could try uninstalling and reinstalling R and packages.
> 
> Best,
>  John
> 
>   -
>   John Fox, Professor Emeritus
>   McMaster University
>   Hamilton, Ontario, Canada
> Web: http://socserv.mcmaster.ca/jfox
> 
> > On Mar 9, 2020, at 3:25 PM, Brian Grossman  wrote:
> > 
> > I'm having a problem with launching Rcmdr. When I try to launch it the
> > first time through R using the command library(Rcmdr) it will go through
> > the process of launching and get to the point where it says
> > 
> > "Registered S3 methods overwritten by 'lme4':
> >  method  from
> >  cooks.distance.influence.merMod car
> >  influence.merModcar
> >  dfbeta.influence.merMod car
> >  dfbetas.influence.merModcar
> > lattice theme set by effectsTheme()
> > See ?effectsTheme for details."
> > 
> > and then it just hangs there and never launches Rcmdr. If you launch
> > another instance of R and run the same command it will complete and launch
> > Rcmdr successfully. I have tried launching R with R.exe --vanilla with the
> > same results.
> > 
> > The system information is Windows 10 version 1903, i5 8500 processor, 8GB
> > RAM, 256Gb  SSD. R version 3.6.2 Platform: x86_64-w64-mingw32/x64 (64-bit)
> > 
> > Hopefully I haven't left out any important information. Thank you for any
> > suggestions.
> > 
> >   [[alternative HTML version deleted]]
> > 
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> 
> 
> -- 
> Brian Grossman, (he/him/his) Desktop Support Specialist
> College of Literature, Science, and the Arts | University of Michigan 
> LSA Technology Services | G240 Angell Hall | 435 S. State Street | Ann Arbor, 
> MI I 48109
> Desk: 734.764.0774 | Cell: 734.260.1017 | For Immediate Assistance: 
> 734.615.0100 | Email: gross...@umich.edu
> 
> Submit a support request anytime using our Guided Web Form
>  
> 
> 

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem launching Rcmdr

2020-03-11 Thread Fox, John
Dear Brian,

I'm afraid that I have no idea about what's going on, particularly why your 
second attempt to start the Rcmdr works, whether after a first attempt, or now 
after using tcltk. 

Did you follow up on Peter's suggestion that your antivirus software may be the 
source of the problem?

Best,
 John

> On Mar 11, 2020, at 4:42 PM, Brian Grossman  wrote:
> 
> I did some testing and was able to run the tcltk commands without any 
> problems. It brought up the graph normally when I hit the submit button and 
> when I changed the values and hit submit it would change the graph 
> accordingly.The interesting thing is that after the tkdestroy(tt) command I 
> tried running library(Rcmdr) and it launched successfully. I'm not sure what 
> that means, but hopefully someone else will.
> 
> Thank you,
> 
> Brian
> 
> On Wed, Mar 11, 2020 at 1:02 PM Fox, John  wrote:
> Dear Bernhard, Peter, and Brian,
> 
> Thanks, Peter, for the suggestion. If you're right then the problem is with 
> tcltk more generally and not directly with the Rcmdr package.
> 
> Brian: That's easily checked and might clarify what's going on: Load the 
> tcltk package via library(tcltk), and then try the following example from 
> ?TkWidgetcmds:
> 
>  snip --
> 
> tt <- tktoplevel()
> tkpack(txt.w <- tktext(tt))
> tkinsert(txt.w, "0.0", "plot(1:10)")
> eval.txt <- function()
>    eval(parse(text = tclvalue(tkget(txt.w, "0.0", "end"))))
> tkpack(but.w <- tkbutton(tt, text = "Submit", command = eval.txt))
> 
> ## Try pressing the button, edit the text and when finished:
> 
> tkdestroy(tt)
> 
>  snip --
> 
> Best,
>  John
> 
> 
> > -Original Message-
> > From: Pfaff, Bernhard Dr. 
> > Sent: Wednesday, March 11, 2020 9:18 AM
> > To: Peter Dalgaard ; Fox, John 
> > Cc: r-help@r-project.org
> > Subject: AW: Re: [R] Problem launching Rcmdr
> > 
> > Good catch, Peter; Cylance might be the culprit - at least I encountered
> > problems by compiling C++ sources and/or building packages with interfaced
> > routines and here a memory checker kicked in.
> > Maybe something akin is happening by launching Rcmdr (tcl/tk)?
> > 
> > -Original Message-
> > From: R-help  On Behalf Of Peter Dalgaard
> > Sent: Wednesday, March 11, 2020 10:29
> > To: Fox, John 
> > Cc: r-help@r-project.org
> > Subject: [EXT] Re: [R] Problem launching Rcmdr
> > 
> > Any chance that a virus checker is interfering?
> > 
> > -pd
> > 
> > > On 10 Mar 2020, at 23:43 , Fox, John  wrote:
> > >
> > > Dear Brian,
> > >
> > > (Please keep r-help in the loop so that if someone else has this
> > > problem they'll have something to refer to.)
> > >
> > > Your session at start-up seems completely clean, so I'm at a loss to
> > understand what the problem is. I, and I assume very many other people,
> > are using the Rcmdr with essentially the same Windows setup. What's
> > particularly hard for me to understand is that you're able to start the 
> > Rcmdr in
> > a second R session. Does the first R session have to remain open for this to
> > work?
> > >
> > > A next step is to reinstall packages, starting with the Rcmdr package, if 
> > > you
> > haven't already tried that, and eventually to reinstall R, including 
> > deleting the
> > R package library. BTW, I usually prefer to install R in c:\R\ rather than 
> > under
> > Program Files so that the system library is used for packages that I
> > subsequently install, although it should work perfectly fine to install 
> > packages
> > into a personal library.
> > >
> > > Best,
> > > John
> > >
> > >> -Original Message-
> > >> From: Brian Grossman 
> > >> Sent: Tuesday, March 10, 2020 5:07 PM
> > >> To: Fox, John 
> > >> Subject: Re: [R] Problem launching Rcmdr
> > >>
> > >> John,
> > >>
> > >> Thanks for the reply. Here is the output from running sessionInfo()
> > >> right after opening R.
> > >>
> > >>> sessionInfo()
> > >> R version 3.6.2 (2019-12-12)
> > >> Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10
> > >> x64 (build 18362)
> > >>
> > >> Matrix products: default
> > >>
> > >> locale:
> > >> [1] LC_COLLAT

Re: [R] Problem launching Rcmdr

2020-03-11 Thread Fox, John
Dear Bernhard, Peter, and Brian,

Thanks, Peter, for the suggestion. If you're right then the problem is with 
tcltk more generally and not directly with the Rcmdr package.

Brian: That's easily checked and might clarify what's going on: Load the tcltk 
package via library(tcltk), and then try the following example from 
?TkWidgetcmds:

 snip --

tt <- tktoplevel()
tkpack(txt.w <- tktext(tt))
tkinsert(txt.w, "0.0", "plot(1:10)")
eval.txt <- function()
   eval(parse(text = tclvalue(tkget(txt.w, "0.0", "end"))))
tkpack(but.w <- tkbutton(tt, text = "Submit", command = eval.txt))

## Try pressing the button, edit the text and when finished:

tkdestroy(tt)

 snip --

Best,
 John


> -Original Message-
> From: Pfaff, Bernhard Dr. 
> Sent: Wednesday, March 11, 2020 9:18 AM
> To: Peter Dalgaard ; Fox, John 
> Cc: r-help@r-project.org
> Subject: AW: Re: [R] Problem launching Rcmdr
> 
> Good catch, Peter; Cylance might be the culprit - at least I encountered
> problems by compiling C++ sources and/or building packages with interfaced
> routines and here a memory checker kicked in.
> Maybe something akin is happening by launching Rcmdr (tcl/tk)?
> 
> -Original Message-
> From: R-help  On Behalf Of Peter Dalgaard
> Sent: Wednesday, March 11, 2020 10:29
> To: Fox, John 
> Cc: r-help@r-project.org
> Subject: [EXT] Re: [R] Problem launching Rcmdr
> 
> Any chance that a virus checker is interfering?
> 
> -pd
> 
> > On 10 Mar 2020, at 23:43 , Fox, John  wrote:
> >
> > Dear Brian,
> >
> > (Please keep r-help in the loop so that if someone else has this
> > problem they'll have something to refer to.)
> >
> > Your session at start-up seems completely clean, so I'm at a loss to
> understand what the problem is. I, and I assume very many other people,
> are using the Rcmdr with essentially the same Windows setup. What's
> particularly hard for me to understand is that you're able to start the Rcmdr 
> in
> a second R session. Does the first R session have to remain open for this to
> work?
> >
> > A next step is to reinstall packages, starting with the Rcmdr package, if 
> > you
> haven't already tried that, and eventually to reinstall R, including deleting 
> the
> R package library. BTW, I usually prefer to install R in c:\R\ rather than 
> under
> Program Files so that the system library is used for packages that I
> subsequently install, although it should work perfectly fine to install 
> packages
> into a personal library.
> >
> > Best,
> > John
> >
> >> -Original Message-
> >> From: Brian Grossman 
> >> Sent: Tuesday, March 10, 2020 5:07 PM
> >> To: Fox, John 
> >> Subject: Re: [R] Problem launching Rcmdr
> >>
> >> John,
> >>
> >> Thanks for the reply. Here is the output from running sessionInfo()
> >> right after opening R.
> >>
> >>> sessionInfo()
> >> R version 3.6.2 (2019-12-12)
> >> Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10
> >> x64 (build 18362)
> >>
> >> Matrix products: default
> >>
> >> locale:
> >> [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United
> >> States.1252 [3] LC_MONETARY=English_United States.1252 [4]
> >> LC_NUMERIC=C [5] LC_TIME=English_United States.1252
> >>
> >> attached base packages:
> >> [1] stats graphics  grDevices utils datasets  methods   base
> >>
> >> loaded via a namespace (and not attached):
> >> [1] compiler_3.6.2
> >>
> >>
> >> Brian
> >>
> >> On Tue, Mar 10, 2020 at 8:46 AM Fox, John  >> <mailto:j...@mcmaster.ca> > wrote:
> >>
> >>
> >>Dear Brian,
> >>
> >>Normally I'd expect that a workspace saved from a previous session
> >> and loaded at the start of the current session would cause this kind
> >> of anomalous behaviour, but that doesn't explain why the Rcmdr starts
> >> up properly in a second (concurrent?) session, nor why it doesn't
> >> start up properly when R is run with the --vanilla switch.
> >>
> >>Can you report the result of sessionInfo() at the start of a session?
> >>
> >>If all else fails, you could try uninstalling and reinstalling R and
> >> packages.
> >>
> >>Best,
> >> John
> >>
> >>  ---

Re: [R] Problem launching Rcmdr

2020-03-10 Thread Fox, John
Dear Brian,

(Please keep r-help in the loop so that if someone else has this problem 
they'll have something to refer to.)

Your session at start-up seems completely clean, so I'm at a loss to understand 
what the problem is. I, and I assume very many other people, are using the 
Rcmdr with essentially the same Windows setup. What's particularly hard for me 
to understand is that you're able to start the Rcmdr in a second R session. 
Does the first R session have to remain open for this to work?

A next step is to reinstall packages, starting with the Rcmdr package, if you 
haven't already tried that, and eventually to reinstall R, including deleting 
the R package library. BTW, I usually prefer to install R in c:\R\ rather than 
under Program Files so that the system library is used for packages that I 
subsequently install, although it should work perfectly fine to install 
packages into a personal library.

Best,
 John

> -Original Message-
> From: Brian Grossman 
> Sent: Tuesday, March 10, 2020 5:07 PM
> To: Fox, John 
> Subject: Re: [R] Problem launching Rcmdr
> 
> John,
> 
> Thanks for the reply. Here is the output from running sessionInfo() right 
> after
> opening R.
> 
> > sessionInfo()
> R version 3.6.2 (2019-12-12)
> Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64
> (build 18362)
> 
> Matrix products: default
> 
> locale:
> [1] LC_COLLATE=English_United States.1252 [2] LC_CTYPE=English_United
> States.1252 [3] LC_MONETARY=English_United States.1252 [4]
> LC_NUMERIC=C [5] LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
> 
> loaded via a namespace (and not attached):
> [1] compiler_3.6.2
> 
> 
> Brian
> 
> On Tue, Mar 10, 2020 at 8:46 AM Fox, John  <mailto:j...@mcmaster.ca> > wrote:
> 
> 
>   Dear Brian,
> 
>   Normally I'd expect that a workspace saved from a previous session
> and loaded at the start of the current session would cause this kind of
> anomalous behaviour, but that doesn't explain why the Rcmdr starts up
> properly in a second (concurrent?) session, nor why it doesn't start up
> properly when R is run with the --vanilla switch.
> 
>   Can you report the result of sessionInfo() at the start of a session?
> 
>   If all else fails, you could try uninstalling and reinstalling R and
> packages.
> 
>   Best,
>John
> 
> -
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> Web: http://socserv.mcmaster.ca/jfox
> <http://socserv.mcmaster.ca/jfox>
> 
>   > On Mar 9, 2020, at 3:25 PM, Brian Grossman
> mailto:gross...@umich.edu> > wrote:
>   >
>   > I'm having a problem with launching Rcmdr. When I try to launch it
> the
>   > first time through R using the command library(Rcmdr) it will go
> through
>   > the process of launching and get to the point where it says
>   >
>   > "Registered S3 methods overwritten by 'lme4':
>   >  method  from
>   >  cooks.distance.influence.merMod car
>   >  influence.merModcar
>   >  dfbeta.influence.merMod car
>   >  dfbetas.influence.merModcar
>   > lattice theme set by effectsTheme()
>   > See ?effectsTheme for details."
>   >
>   > and then it just hangs there and never launches Rcmdr. If you
> launch
>   > another instance of R and run the same command it will complete
> and launch
>   > Rcmdr successfully. I have tried launching R with R.exe --vanilla with
> the
>   > same results.
>   >
>   > The system information is Windows 10 version 1903, i5 8500
> processor, 8GB
>   > RAM, 256Gb  SSD. R version 3.6.2 Platform: x86_64-w64-
> mingw32/x64 (64-bit)
>   >
>   > Hopefully I haven't left out any important information. Thank you
> for any
>   > suggestions.
>   >
>   >   [[alternative HTML version deleted]]
>   >
>   > __
>   > R-help@r-project.org <mailto:R-help@r-project.org>  mailing list --
> To UNSUBSCRIBE and more, see
>   > https://stat.ethz.ch/mailman/listinfo/r-help
>   > PLEASE do read the posting guide http://www.R-
> project.org/posting-guide.html
>   > and provide commented, minimal, self-contained, reproducible
> code.
> 
> 
> 

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem launching Rcmdr

2020-03-10 Thread Fox, John
Dear Brian,

Normally I'd expect that a workspace saved from a previous session and loaded 
at the start of the current session would cause this kind of anomalous 
behaviour, but that doesn't explain why the Rcmdr starts up properly in a 
second (concurrent?) session, nor why it doesn't start up properly when R is 
run with the --vanilla switch.

Can you report the result of sessionInfo() at the start of a session?

If all else fails, you could try uninstalling and reinstalling R and packages.

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Mar 9, 2020, at 3:25 PM, Brian Grossman  wrote:
> 
> I'm having a problem with launching Rcmdr. When I try to launch it the
> first time through R using the command library(Rcmdr) it will go through
> the process of launching and get to the point where it says
> 
> "Registered S3 methods overwritten by 'lme4':
>  method  from
>  cooks.distance.influence.merMod car
>  influence.merModcar
>  dfbeta.influence.merMod car
>  dfbetas.influence.merModcar
> lattice theme set by effectsTheme()
> See ?effectsTheme for details."
> 
> and then it just hangs there and never launches Rcmdr. If you launch
> another instance of R and run the same command it will complete and launch
> Rcmdr successfully. I have tried launching R with R.exe --vanilla with the
> same results.
> 
> The system information is Windows 10 version 1903, i5 8500 processor, 8GB
> RAM, 256Gb  SSD. R version 3.6.2 Platform: x86_64-w64-mingw32/x64 (64-bit)
> 
> Hopefully I haven't left out any important information. Thank you for any
> suggestions.
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] issue with Rcmdr

2020-01-06 Thread Fox, John
Dear Toufik,

> -Original Message-
> From: Toufik Zahaf 
> Sent: Monday, January 6, 2020 4:07 PM
> To: Jeff Newmiller 
> Cc: r-help@r-project.org; Fox, John ; tzahaf
> 
> Subject: Re: [R] issue with Rcmdr
> 
> Dears
> 
> Thanks a lot , I understand that may be Rcmdr is not ‘adapted’ to run with R
> 3.5.3 so I will try to update R version to 3.6.2 or use the oldest version of 
> R I
> used in the past.

The Rcmdr worked perfectly fine with R 3.5.3 about a year ago but it's possible 
that you've installed incompatible versions of some packages. As a general 
matter, keeping R up-to-date isn't a bad idea.

Best,
 John

> 
> I will let you know the outcome
> 
> Best regards
> Toufik
> 
> > On Jan 6, 2020, at 20:37, Jeff Newmiller  wrote:
> >
> > That version of R happens to be "current" if using MRAN... which has some
> benefits (MKL comes pre-configured) and some... ah... "philosophical" ideas
> about stability (uses checkpoint package out of the box... which freezes
> packages at 2019-04-15 UTC unless actions are taken to use a different
> snapshot [1]). The OP may be "stuck in time" at Rcmdr 2.5-2 if they have not
> invoked checkpoint or adjusted the "repos" option... and if the latter then 
> they
> may be encountering package incompatibilities with R 3.5.3 that should have
> been caught with a minimum R-version spec in the Rcmdr package.
> >
> > [1] https://mran.microsoft.com/documents/rro/reproducibility
> >
> >> On January 6, 2020 10:20:04 AM PST, "Fox, John" 
> wrote:
> >> Dear Toufik,
> >>
> >> You've already had a suggestion to check whether RcmdrMisc is
> >> installed. It should have been installed automatically when you
> >> installed the Rcmdr package.
> >>
> >> If RcmdrMisc is installed, see whether you can load it directly via
> >> the command library(RcmdrMisc). If that too fails, you could try
> >> reinstalling RcmdrMisc via install.packages("RcmdrMisc").
> >>
> >> Finally, you're using an old version of R. You might try installing
> >> the current version, which is 3.6.2. Then install the Rcmdr package
> >> by install.packages("Rcmdr").
> >>
> >> I hope this helps,
> >> John
> >>
> >> -
> >> John Fox
> >> Professor Emeritus
> >> McMaster University
> >> Hamilton, Ontario, Canada
> >> Web: https://socialsciences.mcmaster.ca/jfox/
> >>
> >>
> >>
> >>> -Original Message-
> >>> From: R-help  On Behalf Of tzahaf
> >>> Sent: Monday, January 6, 2020 8:16 AM
> >>> To: r-help@r-project.org
> >>> Subject: [R] issue with Rcmdr
> >>>
> >>> Dear
> >>>
> >>> I have a problem when trying to use Rcmdr.  This is the msg I
> >> receive:
> >>>
> >>>
> >>> package ‘Rcmdr’ successfully unpacked and MD5 sums checked
> >>>
> >>> The downloaded binary packages are in
> >>>
> >>>
> C:\Users\toufiz00\AppData\Local\Temp\RtmpgXuxDP\downloaded_packages
> >>>> local({pkg <- select.list(sort(.packages(all.available =
> >>>> TRUE)),graphics=TRUE)
> >>> + if(nchar(pkg)) library(pkg, character.only=TRUE)})
> >>> Loading required package: RcmdrMisc
> >>> Error: package or namespace load failed for ‘RcmdrMisc’ in
> >> rbind(info,
> >>> getNamespaceInfo(env, "S3methods")):
> >>>  number of columns of matrices must match (see arg 2)
> >>> Error: package ‘RcmdrMisc’ could not be loaded
> >>>>
> >>>
> >>> I am running R 3.5.3 on WIN 10
> >>>
> >>> thanks for your help
> >>>
> >>> Best
> >>>
> >>> Toufik
> >>>
> >>> __
> >>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide http://www.R-project.org/posting-
> >>> guide.html and provide commented, minimal, self-contained,
> >>> reproducible code.
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >
> > --
> > Sent from my phone. Please excuse my brevity.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] issue with Rcmdr

2020-01-06 Thread Fox, John
Dear Toufik,

You've already had a suggestion to check whether RcmdrMisc is installed. It 
should have been installed automatically when you installed the Rcmdr package. 

If RcmdrMisc is installed, see whether you can load it directly via the command 
library(RcmdrMisc). If that too fails, you could try reinstalling RcmdrMisc via 
install.packages("RcmdrMisc"). 

Finally, you're using an old version of R. You might try installing the current 
version, which is 3.6.2. Then install the Rcmdr package by 
install.packages("Rcmdr").
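
In code, the sequence suggested above looks something like this (best run in a fresh R 
session):

> install.packages("RcmdrMisc")   # reinstall the package that fails to load
> library(RcmdrMisc)              # check that it now loads on its own
> install.packages("Rcmdr")       # after updating R, reinstall the Rcmdr as well
> library(Rcmdr)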

I hope this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help  On Behalf Of tzahaf
> Sent: Monday, January 6, 2020 8:16 AM
> To: r-help@r-project.org
> Subject: [R] issue with Rcmdr
> 
> Dear
> 
> I have a problem when trying to use Rcmdr.  This is the msg I receive:
> 
> 
> package ‘Rcmdr’ successfully unpacked and MD5 sums checked
> 
> The downloaded binary packages are in
> 
> C:\Users\toufiz00\AppData\Local\Temp\RtmpgXuxDP\downloaded_packages
> > local({pkg <- select.list(sort(.packages(all.available =
> > TRUE)),graphics=TRUE)
> + if(nchar(pkg)) library(pkg, character.only=TRUE)})
> Loading required package: RcmdrMisc
> Error: package or namespace load failed for ‘RcmdrMisc’ in rbind(info,
> getNamespaceInfo(env, "S3methods")):
>   number of columns of matrices must match (see arg 2)
> Error: package ‘RcmdrMisc’ could not be loaded
> >
> 
> I am running R 3.5.3 on WIN 10
> 
> thanks for your help
> 
> Best
> 
> Toufik
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Where is the SD in output of glm with Gaussian distribution

2019-12-09 Thread Fox, John
Dear Bert,

It's perhaps a bit pedantic to point it out, but the dispersion is estimated 
from the Pearson statistic (sum of squared residuals or weighted squared 
residuals) not from the residual deviance. You can see this in the code for 
summary.glm().
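
A small numerical illustration (the quasi-Poisson fit below uses toy data, invented only 
to show the difference; for the Gaussian example in this thread the two estimates happen 
to coincide):

> set.seed(123)
> x <- rnorm(100)
> ycount <- rpois(100, exp(1 + x))                    # toy counts
> m <- glm(ycount ~ x, family = quasipoisson)
> summary(m)$dispersion                               # what summary.glm() reports
> sum(residuals(m, type="pearson")^2)/df.residual(m)  # Pearson statistic over residual df: the same
> deviance(m)/df.residual(m)                          # deviance-based estimate: not the same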

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Dec 9, 2019, at 10:45 AM, Bert Gunter  wrote:
> 
> In addition, as John's included output shows, only 1 parameter, the
> intercept, is fit. As he also said, the sd is estimated from the residual
> deviance -- it is not a model parameter.
> 
> Suggest you spend some time with a glm tutorial/text.
> 
> Bert
> 
> On Mon, Dec 9, 2019 at 7:17 AM Marc Girondot via R-help <
> r-help@r-project.org> wrote:
> 
>> Let do a simple glm:
>> 
>>> y=rnorm(100)
>>> gnul <- glm(y ~ 1)
>>> gnul$coefficients
>> (Intercept)
>>   0.1399966
>> 
>> The logLik shows the fit of two parameters (DF=2) (intercept) and sd
>> 
>>> logLik(gnul)
>> 'log Lik.' -138.7902 (df=2)
>> 
>> But where is the sd term in the glm object?
>> 
>> If I do the same with optim, I can have its value
>> 
>>> dnormx <- function(x, data) {1E9*-sum(dnorm(data, mean=x["mean"],
>> sd=x["sd"], log = TRUE))}
>>> parg <- c(mean=0, sd=1)
>>> o0 <- optim(par = parg, fn=dnormx, data=y, method="BFGS")
>>> o0$value/1E9
>> [1] 138.7902
>>> o0$par
>>  meansd
>> 
>> 0.1399966 0.9694405
>> 
>> But I would like have the value in the glm.
>> 
>> (and in the meantime, I don't understand why gnul$df.residual returned
>> 99... for me it should be 98=100 - number of observations) -1 (for mean)
>> - 1 (for sd); but it is statistical question... I have asked it in
>> crossvalidated [no answer still] !)
>> 
>> Thanks
>> 
>> Marc
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Where is the SD in output of glm with Gaussian distribution

2019-12-09 Thread Fox, John
Dear Marc,

For your simple model, the standard deviation of y is the square-root of the 
estimated dispersion parameter:

> set.seed(123)
> y <- rnorm(100)
> gnul <- glm(y ~ 1)
> summary(gnul)

Call:
glm(formula = y ~ 1)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-2.39957  -0.58426  -0.02865   0.60141   2.09693  

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.09041    0.09128    0.99    0.324

(Dispersion parameter for gaussian family taken to be 0.8332328)

Null deviance: 82.49  on 99  degrees of freedom
Residual deviance: 82.49  on 99  degrees of freedom
AIC: 268.54

Number of Fisher Scoring iterations: 2

> sqrt(0.8332328)
[1] 0.9128159
> mean(y)
[1] 0.09040591
> sd(y)
[1] 0.9128159
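
The same quantity can be extracted programmatically rather than read off the printout:

> sqrt(summary(gnul)$dispersion)
[1] 0.9128159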

I hope this helps,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Dec 9, 2019, at 10:16 AM, Marc Girondot via R-help  
> wrote:
> 
> Let do a simple glm:
> 
> > y=rnorm(100)
> > gnul <- glm(y ~ 1)
> > gnul$coefficients
> (Intercept)
>   0.1399966
> 
> The logLik shows the fit of two parameters (DF=2) (intercept) and sd
> 
> > logLik(gnul)
> 'log Lik.' -138.7902 (df=2)
> 
> But where is the sd term in the glm object?
> 
> If I do the same with optim, I can have its value
> 
> > dnormx <- function(x, data) {1E9*-sum(dnorm(data, mean=x["mean"], 
> > sd=x["sd"], log = TRUE))}
> > parg <- c(mean=0, sd=1)
> > o0 <- optim(par = parg, fn=dnormx, data=y, method="BFGS")
> > o0$value/1E9
> [1] 138.7902
> > o0$par
>      mean        sd 
> 0.1399966 0.9694405 
> 
> But I would like have the value in the glm.
> 
> (and in the meantime, I don't understand why gnul$df.residual returned 99... 
> for me it should be 98=100 - number of observations) -1 (for mean) - 1 (for 
> sd); but it is statistical question... I have asked it in crossvalidated [no 
> answer still] !)
> 
> Thanks
> 
> Marc
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R commander (Rcmdr) won't start [SOLVED]

2019-12-08 Thread Fox, John
Dear Gabriele,

> On Dec 8, 2019, at 3:35 AM, gabriele pallotti  wrote:
> 
> Dear John, 
> thank you for your prompt reply. An inexperienced user like me tends to see 
> the .Rdata folder like the document folder for other programs, and, as one 
> doesn't have to delete the document folder when updating Libreoffice, tends 
> to think the same for R. 

The potential problem with .RData is that it's not just a file that saves the 
contents of the R workspace at the end of a session, but that the saved 
workspace is then loaded at the start of a subsequent session. This can create 
problems (again, not just for the Rcmdr) in that subsequent session, and not 
only when R is updated.
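
If you want to check for a stale saved workspace from within R, something like the 
following will do it (starting R with --no-restore-data, or with --vanilla as mentioned 
elsewhere in these threads, also skips loading it):

> file.exists(".RData")     # is there a saved workspace in the working directory?
> # file.remove(".RData")   # remove (or first rename) the file if you don't want it reloaded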

> But let me take the opportunity to express a huge thank you for your work on 
> Rcmdr. For people like me, who do statistical analyses just a few times a 
> year, it is a very precious resource, which can also serve as an introduction 
> to command-line R. As an applied linguist with a background in semiotics, I 
> would have plenty of reasons to explain why a graphic interface is such a 
> good thing. As a taxpayer, I appreciate the fact that it saves us money, 
> as it allows students and researchers in the humanities, like me, to do some 
> basic statistics without buying SPSS licences. 
> I don't want to open a debate here on the pros and cons of graphical 
> interfaces. Let me just say you're doing an excellent service to the 
> community, both with your package and your replies to this forum, which shows 
> you're a really kind person and deserve all our appreciation. 

Thank you for your very kind remarks.

Best,
 John

> Best wishes
> Gabriele Pallotti (Italy)
> 
> 
> 
> 
> Il giorno ven 6 dic 2019 alle ore 22:58 Fox, John  ha 
> scritto:
> Dear Gabriele,
> 
> I'm glad that you were able to solve your problem. I spent a bit of time 
> today updating my R from 3.6.0 to 3.6.1 and updating all R packages on 
> Ubuntu, and, for what is now an obvious reason, I was unable to duplicate the 
> problem.
> 
> Saving the .Rhistory file is benign but saving the R workspace at the end of 
> a session in .RData can be problematic, and not just for the Rcmdr. You'll 
> notice that while R makes saving the workspace the default (presumably to 
> avoid inadvertent data loss, and in my opinion not a good default choice), 
> the Rcmdr doesn't offer to save the R workspace when you select "File > Exit 
> > From Commander and R" from the Rcmdr menus.
> 
> Best,
>  John
> 
>   -
>   John Fox, Professor Emeritus
>   McMaster University
>   Hamilton, Ontario, Canada
> Web: http://socserv.mcmaster.ca/jfox
> 
> > On Dec 6, 2019, at 1:38 PM, gabriele pallotti  
> > wrote:
> > 
> > I managed to get Rcmdr working simply by deleting the .Rdata and .Rhistory
> > file from the work directory. It is rather weird, as I thought they only
> > contained data and settings, but probably some of these belonged to the
> > older version of R/Rcmdr and were not compatible with the new version.
> > I'll keep the old data files in a separate folder and try to open them as a
> > workspace after launching Rcmdr, but I won't do it now as I need Rcmdr to
> > work in the next days and I don't want to take any risks...
> > 
> >   [[alternative HTML version deleted]]
> > 
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R commander (Rcmdr) won't start [SOLVED]

2019-12-06 Thread Fox, John
Dear Gabriele,

I'm glad that you were able to solve your problem. I spent a bit of time today 
updating my R from 3.6.0 to 3.6.1 and updating all R packages on Ubuntu, and, 
for what is now an obvious reason, I was unable to duplicate the problem.

Saving the .Rhistory file is benign but saving the R workspace at the end of a 
session in .RData can be problematic, and not just for the Rcmdr. You'll notice 
that while R makes saving the workspace the default (presumably to avoid 
inadvertent data loss, and in my opinion not a good default choice), the Rcmdr 
doesn't offer to save the R workspace when you select "File > Exit > From 
Commander and R" from the Rcmdr menus.

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Dec 6, 2019, at 1:38 PM, gabriele pallotti  wrote:
> 
> I managed to get Rcmdr working simply by deleting the .Rdata and .Rhistory
> file from the work directory. It is rather weird, as I thought they only
> contained data and settings, but probably some of these belonged to the
> older version of R/Rcmdr and were not compatible with the new version.
> I'll keep the old data files in a separate folder and try to open them as a
> workspace after launching Rcmdr, but I won't do it now as I need Rcmdr to
> work in the next days and I don't want to take any risks...
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Orthogonal polynomials used by R

2019-11-28 Thread Fox, John
Dear Ashim,

Please see my brief remarks below:

> On Nov 28, 2019, at 11:02 AM, Ashim Kapoor  wrote:
> 
> On Thu, Nov 28, 2019 at 7:38 PM Fox, John  wrote:
> 
>> Dear Ashim,
>> 
>> I'm afraid that much of what you say here is confused.
>> 
>> First, because poly(x) and poly(x, raw=TRUE) produce the same fitted
>> values (as I previously explained), they also produce the same residuals,
>> and consequently the same CV criteria. From the point of view of CV,
>> there's therefore no reason to prefer orthogonal polynomials. And you still
>> don't explain why you want to interpret the coefficients of the polynomial.
>> 
> 
> The trend in the variable that I am trying to create an ARIMA model for is
> given by poly(x,4). That is why I wished to know what these polynomials
> look like.

The polynomial "looks" exactly the same whether you use raw or 
orthogonal regressors as a basis for it. That is, the two bases represent 
exactly the same regression surface (i.e., curve in the case of one x). To see 
what the fitted polynomial looks like, graph it. But I've now made essentially 
this point three times, so if it's not clear I regret the unclarity but I don't 
really have anything to add.

For other points, see below.

> 
> I used  :
> 
> trend <- predict(lm(gdp~poly(x,4)),newdata = data.frame(
> x=94:103),interval="confidence")
> 
> and I was able to (numerically) extrapolate the poly(x,4) trend, although,
> I think it would be interesting to know what polynomials I was dealing with
> in this case. Just some intuition as to if the linear / quadratic / cubic /
> fourth order polynomial trend is dominating. I don't know how I would
> interpret them, but it would be fun to know.

I'm not sure how you intend to interpret the coefficients, say of the raw 
polynomial. Their magnitudes shouldn't be compared because the size of the 
powers of x grows with the powers. 

BTW, it's very risky to use high-order polynomials for extrapolation beyond the 
observed range of x, even if the model fits well within the observed range of 
x, and of course raw and orthogonal polynomials produce exactly the same 
(problematic) extrapolations (although those produced by raw polynomials may be 
subject to more rounding error). To be clear, I'm not arguing that one should 
in general use raw polynomials in preference to orthogonal polynomials, just 
that the former have generally interpretable coefficients and the latter don't.
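
To check that numerically, a small sketch (m and m1 are the poly(x, 4) and 
poly(x, 4, raw=TRUE) fits from my first reply in this thread):

> new <- data.frame(x=94:103)   # beyond the observed range of x
> all.equal(predict(m, newdata=new), predict(m1, newdata=new))  # TRUE, up to rounding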

> 
> Please allow me to show you a trick. I read this on the internet, here :-
> 
> https://www.datasciencecentral.com/profiles/blogs/deep-dive-into-polynomial-regression-and-overfitting
> 
> Please see the LAST comment by Scott Stelljes where he suggests using an
> orthogonal polynomial basis. He does not elaborate buttoleaves the reader to
> work out the details.

This blog focuses on the numerical stability of raw versus orthogonal 
polynomials. If by "stepwise" you mean adding successive powers to the model, 
you'll get exactly the same sequence of fits with raw as with orthogonal 
polynomials, as I've now explained several times.

> 
> Here is what I think of this. Take a big number say 20 and take a variable
> in which we are trying to find the order of the polynomial in the trend.
> Like this :-
> 
>> summary(lm(gdp ~ poly(x,20)))
> 
> Call:
> lm(formula = gdp ~ poly(x, 20))
> 
> Residuals:
>      Min       1Q   Median       3Q      Max 
> -1235661  -367798   -80453   240360  1450906 
> 
> Coefficients:
>                Estimate Std. Error t value Pr(>|t|)    
> (Intercept)    17601482      66934 262.968  < 2e-16 ***
> poly(x, 20)1  125679081     645487 194.704  < 2e-16 ***
> poly(x, 20)2   43108747     645487  66.785  < 2e-16 ***
> poly(x, 20)3    3605839     645487   5.586 3.89e-07 ***
> poly(x, 20)4   -2977277     645487  -4.612 1.69e-05 ***
> poly(x, 20)5    1085732     645487   1.682   0.0969 .  
> poly(x, 20)6    1124125     645487   1.742   0.0859 .  
> poly(x, 20)7    -108676     645487  -0.168   0.8668    
> poly(x, 20)8    -976915     645487  -1.513   0.1345    
> poly(x, 20)9   -1635444     645487  -2.534   0.0135 *  
> poly(x, 20)10   -715019     645487  -1.108   0.2717    
> poly(x, 20)11    347102     645487   0.538   0.5924    
> poly(x, 20)12   -176728     645487  -0.274   0.7850    
> poly(x, 20)13   -634151     645487  -0.982   0.3292    
> poly(x, 20)14   -537725     645487  -0.833   0.4076    
> poly(x, 20)15    -58674     645487  -0.091   0.9278    
> poly(x, 20)16    -67030     645487  -0.104   0.9176    
> poly(x, 20)17   -809443     645487  -1.254   0.2139    
> poly(x, 20)18   -668879     645487  -1.036   0.3036    
> poly(x, 20)19   -302384     645487  -0.468   0.6409    

Re: [R] Orthogonal polynomials used by R

2019-11-28 Thread Fox, John
Dear Ashim,

I'm afraid that much of what you say here is confused.

First, because poly(x) and poly(x, raw=TRUE) produce the same fitted values (as 
I previously explained), they also produce the same residuals, and consequently 
the same CV criteria. From the point of view of CV, there's therefore no reason 
to prefer orthogonal polynomials. And you still don't explain why you want to 
interpret the coefficients of the polynomial.
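
To see the first point concretely, here is a sketch using the objects from your own code 
quoted below (mydf, gdp, x), resetting the seed so that both calls use the same folds; 
the two criteria agree up to rounding because the two fits are identical:

> library(boot)
> set.seed(1)
> cv.glm(mydf, glm(gdp ~ poly(x, 4), data=mydf), K=10)$delta[1]
> set.seed(1)
> cv.glm(mydf, glm(gdp ~ poly(x, 4, raw=TRUE), data=mydf), K=10)$delta[1]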

Second, the model formula gdp~1+x+x^2 and other similar formulas in your 
message don't do what you think. Like + and *, the ^ operator has special 
meaning on the right-hand side of an R model formula. See ?formula and perhaps 
read something about statistical models in R. For example:

> x <- 1:93
> y <- 1 + x + x^2 + x^3 + x^4 + rnorm(93)
> (m <- lm(y ~ x + x^2))

Call:
lm(formula = y ~ x + x^2)

Coefficients:
(Intercept)            x  
  -15781393       667147  

While gdp ~ x + I(x^2) would work, a better way to fit a raw quadratic is as 
gdp ~ poly(x, 2, raw=TRUE), as I suggested in my earlier message.
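
Schematically, with gdp and x as in your messages (a sketch, not run on your data):

> lm(gdp ~ x + x^2)                 # ^ is a formula operator here: this fits the same model as gdp ~ x
> lm(gdp ~ x + I(x^2))              # I() protects the arithmetic meaning of ^
> lm(gdp ~ poly(x, 2, raw=TRUE))    # the equivalent raw quadratic, as suggested above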

Finally, as to what you should do, I generally try to avoid statistical 
consulting by email. If you can find competent statistical help locally, such 
as at a nearby university, I'd recommend talking to someone about the purpose 
of your research and the nature of your data. If that's not possible, then 
others have suggested where you might find help, but to get useful advice 
you'll have to provide much more information about your research.

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http::/socserv.mcmaster.ca/jfox

> On Nov 28, 2019, at 12:46 AM, Ashim Kapoor  wrote:
> 
> Dear Peter and John,
> 
> Many thanks for your prompt replies. 
> 
> Here is what I was trying to do.  I was trying to build a statistical model 
> of a given time series using Box Jenkins methodology. The series has 93 data 
> points. Before I analyse the ACF and PACF, I am required to de-trend the 
> series. The series seems to have an upward trend. I wanted to find out what 
> order polynomial should I fit the series 
> without overfitting.  For this I want to use orthogonal polynomials(I think 
> someone on the internet was talking about preventing overfitting by using 
> orthogonal polynomials) . This seems to me as a poor man's cross validation. 
> 
> So my plan is to keep increasing the degree of the orthogonal polynomials 
> till the coefficient of the last orthogonal polynomial becomes insignificant.
> 
> Note : If I do NOT use orthogonal polynomials, I will overfit the data set 
> and I don't think that is a good way to detect the true order of the 
> polynomial.
> 
> Also now that I have detrended the series and built an ARIMA model of the 
> residuals, now I want to forecast. For this I need to use the original 
> polynomials and their coefficients.
> 
> I hope I was clear and that my methodology is ok.
> 
> I have another query here :-
> 
> Note : If I used cross-validation to determine the order of the polynomial, I 
> don't get a clear answer.
> 
> See here :-
> library(boot)
> mydf = data.frame(cbind(gdp,x))
> d<-(c(
> cv.glm(data = mydf,glm(gdp~x),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~poly(x,2)),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~poly(x,3)),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~poly(x,4)),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~poly(x,5)),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~poly(x,6)),K=10)$delta[1]))
> print(d)
> ## [1] 2.178574e+13 7.303031e+11 5.994783e+11 4.943586e+11 4.596648e+11
> ## [6] 4.980159e+11
> 
> # Here it chooses 5. (but 4 and 5 are kind of similar).
> 
> 
> d1 <- (c(
> cv.glm(data = mydf,glm(gdp~1+x),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~1+x+x^2),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3+x^4),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3+x^4+x^5),K=10)$delta[1],
> cv.glm(data = mydf,glm(gdp~1+x+x^2+x^3+x^4+x^5+x^6),K=10)$delta[1]))
> 
> print(d1)
> ## [1] 2.149647e+13 2.253999e+13 2.182175e+13 2.177170e+13 2.198675e+13
> ## [6] 2.145754e+13
> 
> # here it chooses 1 or 6
> 
> Query : Why does it choose 1? Notice : Is this just round off noise / noise 
> due to sampling error created by Cross Validation when it creates the K 
> folds? Is this due to the ill conditioned model matrix?
> 
> Best Regards,
> Ashim.
> 
> 
> 
> 
> 
> On Wed, Nov 27, 2019 at 10:41 PM Fox, John  wrote:
> Dear Ashim,
> 
> Orthogonal polynomials are used because they tend to produce more accurate 
> numerical computations, not because their coefficients are interpretable,

Re: [R] Orthogonal polynomials used by R

2019-11-27 Thread Fox, John
Dear Ashim,

Orthogonal polynomials are used because they tend to produce more accurate 
numerical computations, not because their coefficients are interpretable, so I 
wonder why you're interested in the coefficients. 

The regressors produced are orthogonal to the constant regressor and are 
orthogonal to each other (and in fact are orthonormal), as it's simple to 
demonstrate:

--- snip ---

> x <- 1:93
> y <- 1 + x + x^2 + x^3 + x^4 + rnorm(93)
> (m <- lm(y ~ poly(x, 4)))

Call:
lm(formula = y ~ poly(x, 4))

Coefficients:
(Intercept)  poly(x, 4)1  poly(x, 4)2  poly(x, 4)3  poly(x, 4)4  
   15574516    172715069     94769949     27683528      3429259  

> X <- model.matrix(m)
> head(X)
  (Intercept) poly(x, 4)1 poly(x, 4)2 poly(x, 4)3 poly(x, 4)4
1   1  -0.1776843   0.2245083  -0.2572066  0.27935949
2   1  -0.1738216   0.2098665  -0.2236579  0.21862917
3   1  -0.1699589   0.1955464  -0.1919525  0.16390514
4   1  -0.1660962   0.1815482  -0.1620496  0.11487597
5   1  -0.1622335   0.1678717  -0.1339080  0.07123722
6   1  -0.1583708   0.1545171  -0.1074869  0.03269145

> zapsmall(crossprod(X))# X'X
(Intercept) poly(x, 4)1 poly(x, 4)2 poly(x, 4)3 poly(x, 4)4
(Intercept)  93   0   0   0   0
poly(x, 4)1   0   1   0   0   0
poly(x, 4)2   0   0   1   0   0
poly(x, 4)3   0   0   0   1   0
poly(x, 4)4   0   0   0   0   1

--- snip ---

If for some not immediately obvious reason you're interested in the regression 
coefficients, why not just use a "raw" polynomial:

--- snip ---

> (m1 <- lm(y ~ poly(x, 4, raw=TRUE)))

Call:
lm(formula = y ~ poly(x, 4, raw = TRUE))

Coefficients:
            (Intercept)  poly(x, 4, raw = TRUE)1  poly(x, 4, raw = TRUE)2  
                 1.5640                   0.8985                   1.0037  
poly(x, 4, raw = TRUE)3  poly(x, 4, raw = TRUE)4  
                 1.0000                   1.0000  

--- snip ---

These coefficients are simply interpretable but the model matrix is more poorly 
conditioned:

--- snip ---

> head(X1)
  (Intercept) poly(x, 4, raw = TRUE)1 poly(x, 4, raw = TRUE)2 poly(x, 4, raw = TRUE)3
1           1                       1                       1                       1
2           1                       2                       4                       8
3           1                       3                       9                      27
4           1                       4                      16                      64
5           1                       5                      25                     125
6           1                       6                      36                     216
  poly(x, 4, raw = TRUE)4
1                       1
2                      16
3                      81
4                     256
5                     625
6                    1296
> round(cor(X1[, -1]), 2)
                        poly(x, 4, raw = TRUE)1 poly(x, 4, raw = TRUE)2 poly(x, 4, raw = TRUE)3 poly(x, 4, raw = TRUE)4
poly(x, 4, raw = TRUE)1                    1.00                    0.97                    0.92                    0.87
poly(x, 4, raw = TRUE)2                    0.97                    1.00                    0.99                    0.96
poly(x, 4, raw = TRUE)3                    0.92                    0.99                    1.00                    0.99
poly(x, 4, raw = TRUE)4                    0.87                    0.96                    0.99                    1.00

--- snip ---

The two parametrizations are equivalent, however, in that they represent the 
same regression surface, and so, e.g., produce the same fitted values:

--- snip ---

> all.equal(fitted(m), fitted(m1))
[1] TRUE

--- snip ---

Because one is usually not interested in the individual coefficients of a 
polynomial there usually isn't a reason to prefer one parametrization to the 
other on the grounds of interpretability, so why do you need to interpret the 
regression equation?

I hope this helps,
 John

  - 
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Nov 27, 2019, at 10:17 AM, Ashim Kapoor  wrote:
> 
> Dear Petr,
> 
> Many thanks for the quick response.
> 
> I also read this:-
> https://en.wikipedia.org/wiki/Discrete_orthogonal_polynomials
> 
> Also I read  in ?poly:-
> The orthogonal polynomial is summarized by the coefficients, which
> can be used to evaluate it via the three-term recursion given in
> Kennedy & Gentle (1980, pp. 343-4), and used in the ‘predict’ part
> of the code.
> 
> I do

Re: [R] Help Installing Rtools

2019-09-06 Thread Fox, John
Dear Harold,

Have you checked that the Rtools directory is on the Windows path? If not, you 
could try rerunning the Rtools installer and allowing it to modify the path, or 
simply add the Rtools directory to the path yourself.
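
To check and (for the current session) adjust the path from within R, something like the 
following should work; C:/Rtools/bin is the usual default location for Rtools 3.5, so 
adjust it if you installed Rtools elsewhere:

> Sys.which("make")    # an empty string means make isn't on the path
> Sys.setenv(PATH=paste("C:/Rtools/bin", Sys.getenv("PATH"), sep=";"))
> Sys.which("make")    # should now point to Rtools' make.exe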

I hope that this helps,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Sep 5, 2019, at 9:43 AM, Doran, Harold  wrote:
> 
> I've done the following steps, but am unable to source in the Cpp files. 
> Details of my session and sessionInfo are below. Am I missing a package, or a 
> critical step? I found one answer regarding the 'make" error on stackoverflow 
> suggesting the problem is resolved by grabbing the more recent version of 
> Rtools, which I have done.
> 
> Thanks.
> 
>> install.packages('installr')
>> library(installr)
>> install.Rtools()
> 
> This step installed Rtools35.exe correctly as far as I can tell. Then,
> 
>> library(devtools)
> Loading required package: usethis
>> library(Rcpp)
>> sourceCpp("test.cpp")
> Warning message:
> In system(cmd) : 'make' not found
> Error in sourceCpp("test.cpp") :
>  Error 1 occurred building shared library.
> 
> WARNING: The tools required to build C++ code for R were not found.
> 
> Please download and install the appropriate version of Rtools:
> 
> http://cran.r-project.org/bin/windows/Rtools/
> 
>> sessionInfo()
> R version 3.6.1 (2019-07-05)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 10 x64 (build 18362)
> 
> Matrix products: default
> 
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
> 
> other attached packages:
> [1] Rcpp_1.0.2  devtools_2.1.0  usethis_1.5.1   htmltab_0.7.1
> [5] installr_0.22.0 stringr_1.4.0
> 
> loaded via a namespace (and not attached):
> [1] magrittr_1.5  pkgload_1.0.2 R6_2.4.0  rlang_0.3.4
> [5] httr_1.4.1tools_3.6.1   pkgbuild_1.0.5sessioninfo_1.1.1
> [9] cli_1.1.0 withr_2.1.2   remotes_2.1.0 rprojroot_1.3-2
> [13] assertthat_0.2.1  digest_0.6.19 crayon_1.3.4  processx_3.4.1
> [17] callr_3.3.1   fs_1.3.1  ps_1.3.0  testthat_2.2.1
> [21] curl_4.0  memoise_1.1.0 glue_1.3.1stringi_1.4.3
> [25] compiler_3.6.1backports_1.1.4   desc_1.2.0prettyunits_1.0.2
> [29] XML_3.98-1.20
>> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [effects] allEffects does not accept integer value for xlevels

2019-09-04 Thread Fox, John
Dear Rolf,

Thanks for trying to help. The bug wasn't in AnalyzeModel().

There was a bug in Effect.lm(), Effect.multinom(), and Effect.polr() in how 
xlevels=n (e.g., xlevels=4) was handled, now fixed in the development version 
of the effects package on R-Forge, from which it can be installed via 
install.packages("effects", repos="http://R-Forge.R-project.org";). Despite my 
implication to the contrary, xlevels=n works properly in predictorEffect() in 
the version of effects currently on CRAN.

I'll wait for a decent interval before updating effects again on CRAN.

Best,
 John

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rolf
> Turner
> Sent: Wednesday, September 4, 2019 5:39 PM
> To: Fox, John 
> Cc: r-help@r-project.org; sa...@umn.edu
> Subject: Re: [R] [effects] allEffects does not accept integer value for
> xlevels
> 
> 
> I'm obviously not understanding something here, but it seems to me that
> the conjecture
> 
> >>> It appears to me that the cause is buried in effects:::Analyze.model
> >>>
> >>> in or close to the the lines
> >>>
> >>> if (is.numeric(xlevels) & length(xlevels) == 1L) {
> >>> levs <- xlevels
> >>> for (name in focal.predictors) xlevels[[name]] <- levs
> >>>}
> >>>
> >>>
> >>>
> >>> where xlevels -- while not being a list in this case -- is
> >>> subscripted by xlevels[[name]].
> 
> is not correct.  There is no problem with using [[...]] to extract entries
> from vectors.  E.g.:
> 
> x <- 1:3
> names(x) <- c("mung","gorp","clyde")
> x[["gorp"]]
> 
> produces
> 
> [1] 2
> 
> cheers,
> 
> Rolf
> 
> On 5/09/19 2:19 AM, Fox, John wrote:
> > Dear Gerrit,
> >
> > Yes, that appears to be a bug in Effect() -- too bad that it wasn't
> discovered earlier because a new version of the package was submitted
> yesterday, but thank you for the bug report.
> >
> > We'll fix the bug, but until then a work-around is to specify the
> > number of levels for each numeric predictor, as in
> >
> > allEffects(mod.cowles, xlevels=list(neuroticism=4, extraversion=4))
> >
> > I used 4 levels here to verify that this works correctly, since 5 is the
> default.
> >
> > As well, although unrelated to this bug, you might take a look at
> predictorEffects(), which we recommend in preference to allEffects().
> >
> > Best,
> >   John
> >
> > --
> > John Fox, Professor Emeritus
> > McMaster University
> > Hamilton, Ontario, Canada
> > Web: socialsciences.mcmaster.ca/jfox/
> >
> >
> >
> >> -Original Message-
> >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> >> Gerrit Eichner
> >> Sent: Wednesday, September 4, 2019 9:25 AM
> >> To: r-help@r-project.org
> >> Subject: [R] [effects] allEffects does not accept integer value for
> >> xlevels
> >>
> >> Dear list,
> >>
> >> citing from allEffects' help page (of package effects 4.1-2):
> >> "If xlevels=n is an integer, then each numeric predictor is
> >> represented by n equally spaced values rounded to 'nice' numbers."
> >>
> >>
> >> However, adapting the first example from allEffects' help page throws
> >> an an error:
> >>
> >> mod.cowles <- glm(volunteer ~ sex + neuroticism*extraversion,
> >> data=Cowles, family=binomial)
> >> allEffects(mod.cowles,
> >> xlevels=5) Error in xlevels[[name]] : subscript out of bounds
> >>
> >>
> >> It appears to me that the cause is buried in effects:::Analyze.model
> >>
> >> in or close to the the lines
> >>
> >> if (is.numeric(xlevels) & length(xlevels) == 1L) {
> >> levs <- xlevels
> >> for (name in focal.predictors) xlevels[[name]] <- levs
> >>}
> >>
> >>
> >>
> >> where xlevels -- while not being a list in this case -- is
> >> subscripted by xlevels[[name]].
> >>
> >> Is anyone aware of a workaround (without having to specify all
> >> numeric predictors of the used model explicitly in a list and giving
> >> it to xlevels when calling allEffects), and without having to write
> >> my own Analyze.model function? ;-)
> >>
> >>
> >>Thx in adva

Re: [R] [effects] allEffects does not accept integer value for xlevels

2019-09-04 Thread Fox, John
Dear Gerrit,

Yes, that appears to be a bug in Effect() -- too bad that it wasn't discovered 
earlier because a new version of the package was submitted yesterday, but thank 
you for the bug report.

We'll fix the bug, but until then a work-around is to specify the number of 
levels for each numeric predictor, as in

allEffects(mod.cowles, xlevels=list(neuroticism=4, extraversion=4))

I used 4 levels here to verify that this works correctly, since 5 is the 
default.

As well, although unrelated to this bug, you might take a look at 
predictorEffects(), which we recommend in preference to allEffects().
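
For example, a minimal sketch with the mod.cowles fit from your message:

library(effects)
eff <- predictorEffects(mod.cowles, ~ neuroticism + extraversion)
plot(eff)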

Best,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Gerrit
> Eichner
> Sent: Wednesday, September 4, 2019 9:25 AM
> To: r-help@r-project.org
> Subject: [R] [effects] allEffects does not accept integer value for
> xlevels
> 
> Dear list,
> 
> citing from allEffects' help page (of package effects 4.1-2):
> "If xlevels=n is an integer, then each numeric predictor is represented by
> n equally spaced values rounded to 'nice' numbers."
> 
> 
> However, adapting the first example from allEffects' help page throws an
> an error:
> 
> mod.cowles <- glm(volunteer ~ sex + neuroticism*extraversion,
>data=Cowles, family=binomial) allEffects(mod.cowles,
> xlevels=5) Error in xlevels[[name]] : subscript out of bounds
> 
> 
> It appears to me that the cause is buried in effects:::Analyze.model
> 
> in or close to the the lines
> 
> if (is.numeric(xlevels) & length(xlevels) == 1L) {
>levs <- xlevels
>for (name in focal.predictors) xlevels[[name]] <- levs
>   }
> 
> 
> 
> where xlevels -- while not being a list in this case -- is subscripted by
> xlevels[[name]].
> 
> Is anyone aware of a workaround (without having to specify all numeric
> predictors of the used model explicitly in a list and giving it to xlevels
> when calling allEffects), and without having to write my own Analyze.model
> function? ;-)
> 
> 
>   Thx in advance and best regards  --  Gerrit
> 
> 
> 
> PS:  sessionInfo()
> 
> R version 3.6.1 (2019-07-05)
> Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 10 x64
> (build 18362)
> 
> Matrix products: default
> 
> Random number generation:
>   RNG: Mersenne-Twister
>   Normal:  Inversion
>   Sample:  Rounding
> 
> locale:
> [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252 [3]
> LC_MONETARY=German_Germany.1252 LC_NUMERIC=C [5]
> LC_TIME=German_Germany.1252
> 
> attached base packages:
> [1] stats graphics  grDevices utils datasets  methods   base
> 
> other attached packages:
> [1] effects_4.1-2 carData_3.0-2
> 
> loaded via a namespace (and not attached):
>   [1] Rcpp_1.0.2lattice_0.20-38   MASS_7.3-51.4 grid_3.6.1
> 
>   [5] DBI_1.0.0 nlme_3.1-141  survey_3.36
> estimability_1.3
>   [9] minqa_1.2.4   nloptr_1.2.1  Matrix_1.2-17 boot_1.3-23
> 
> [13] splines_3.6.1 lme4_1.1-21.9001  survival_2.44-1.1 compiler_3.6.1
> [17] colorspace_1.4-1  mitools_2.4   nnet_7.3-12
> 
> 
> -
> Dr. Gerrit Eichner   Mathematical Institute, Room 212
> gerrit.eich...@math.uni-giessen.de   Justus-Liebig-University Giessen
> Tel: +49-(0)641-99-32104  Arndtstr. 2, 35392 Giessen, Germany
> http://www.uni-giessen.de/eichner
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Increasing number of observations worsen the regression model

2019-05-26 Thread Fox, John
Dear Raffaele,

Using your code, with one modification -- setting the seed for R's random 
number generator to make the result reproducible -- I get:

> set.seed(12345)

. . .

> lmMod <- lm(yvar~xvar)
> print(summary(lmMod))

Call:
lm(formula = yvar ~ xvar)

Residuals:
Min  1Q  Median  3Q Max 
-4.0293 -0.6732  0.0021  0.6749  4.2883 

Coefficients:
 Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.0057713  0.0057529   174.8   <2e-16 ***
xvar2.889  0.0009998  2000.4   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.9964 on 29998 degrees of freedom
Multiple R-squared:  0.9926,Adjusted R-squared:  0.9926 
F-statistic: 4.002e+06 on 1 and 29998 DF,  p-value: < 2.2e-16

which is more or less what one would expect.

My guess: you've saved your R workspace from a previous session, and it is then 
loaded at the start of your R session; something in the saved workspace is 
affecting the result, although frankly I can't think what that might be.
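
A couple of quick ways to check that guess (just a sketch):

ls()                   # objects already present before you ran the script?
file.exists(".RData")  # TRUE if a saved workspace is restored at startup
# starting R with --vanilla (or --no-restore) avoids loading a saved workspace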

I hope this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Raffa
> Sent: Saturday, May 25, 2019 8:38 AM
> To: r-help@r-project.org
> Subject: [R] Increasing number of observations worsen the regression model
> 
> I have the following code:
> 
> ```
> 
> rm(list=ls())
> N = 30000
> xvar <- runif(N, -10, 10)
> e <- rnorm(N, mean=0, sd=1)
> yvar <- 1 + 2*xvar + e
> plot(xvar,yvar)
> lmMod <- lm(yvar~xvar)
> print(summary(lmMod))
> domain <- seq(min(xvar), max(xvar))    # define a vector of x values to feed into model
> lines(domain, predict(lmMod, newdata = data.frame(xvar=domain)))    # add regression line, using `predict` to generate y-values
> 
> ```
> 
> I expected the coefficients to be something similar to [1,2]. Instead R keeps
> throwing at me random numbers that are not statistically significant and don't
> fit the model, and I have 20k observations. For example
> 
> ```
> 
> Call:
> lm(formula = yvar ~ xvar)
> 
> Residuals:
>      Min  1Q  Median  3Q Max
> -21.384  -8.908   1.016  10.972  23.663
> 
> Coefficients:
>   Estimate Std. Error t value Pr(>|t|)
> (Intercept) 0.0007145  0.0670316   0.011    0.991
> xvar    0.0168271  0.0116420   1.445    0.148
> 
> Residual standard error: 11.61 on 29998 degrees of freedom
> Multiple R-squared:  7.038e-05,    Adjusted R-squared:  3.705e-05
> F-statistic: 2.112 on 1 and 29998 DF,  p-value: 0.1462
> 
> ```
> 
> 
> The strange thing is that the code works perfectly for N=200 or N=2000.
> It's only for larger N that this happens (for example, N=20000). I have
> tried to ask for example in CrossValidated
>  observations-worsen-the-regression-model>
> but the code works for them. Any help?
> 
> I am running R 3.6.0 on Kubuntu 19.04
> 
> Best regards
> 
> Raffaele
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Memmory issue

2019-05-06 Thread Fox, John
Dear Ravindra,

My guess is that you're using the 32-bit version of R for Windows rather than 
the 64-bit version. Read question 2.9 in the R FAQ for Windows  
.

If this is the case, the solution is to use the 64-bit version of R, which 
normally would be installed along with the 32-bit version.
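
You can confirm which build is running from within R itself, for example:

.Machine$sizeof.pointer   # 8 under 64-bit R, 4 under 32-bit
R.version$arch            # "x86_64" for the 64-bit build
memory.limit()            # Windows only: reports the current limit in MB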

I hope that this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ravindra
> Bhingarde
> Sent: Sunday, May 5, 2019 9:42 AM
> To: r-help@r-project.org
> Cc: ps_choug...@rediffmail.com
> Subject: [R] Memmory issue
> 
> Dear sir
> 
> We are using R 3.6.0 with an Intel Core i7 9th-generation processor and
> 16 GB of memory, but the R application utilizes only 3 GB of RAM. Kindly give a
> solution for the same.
> 
> Ravindra
> 9011096015
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm fails on some large input

2019-04-18 Thread Fox, John
Dear Dingyuan Wang,

But your question was answered clearly earlier in this thread (I forget by 
whom), showing that lm() provides the solution to the regression of x on y if 
the criterion for singularity is tightened:

> lm(x ~ y)

Call:
lm(formula = x ~ y)

Coefficients:
(Intercept)y  
  94.73   NA  

> lm(x ~ y, tol=1e-10)

Call:
lm(formula = x ~ y, tol = 1e-10)

Coefficients:
(Intercept)y  
 -2.403e+091.595e+00  

Best,
 John

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Dingyuan
> Wang
> Sent: Thursday, April 18, 2019 12:36 PM
> To: Michael Dewey ; r-help@r-project.org
> Subject: Re: [R] lm fails on some large input
> 
> I just want to make a line out of timestamps vs some coordinates, so y~x or
> x~y doesn't matter.
> 
> Yes, I know the answer. When trying R, I'm surprised that R can't solve that
> either. I first noticed that PostgreSQL can't solve it, and found that they 
> fixed
> that in pg 12.
> 
> https://www.postgresql.org/message-
> id/153313051300.1397.9594490737341194671%40wrigleys.postgresql.org
> 
> Therefore I come to ask whether someone knows how to fix this in R, or whether
> I must submit it as a bug.
> 
> 2019/4/18 23:24, Michael Dewey:
> > Perhaps subtract 1506705766 from y?
> >
> > Saying some other software does it well implies you know what the
> > _correct_ answer is here but I would question what that means with
> > this sort of data-set.
> >
> > On 17/04/2019 07:26, Dingyuan Wang wrote:
> >> Hi,
> >>
> >> This input doesn't have any interesting properties except y is unix
> >> time. Spreadsheets can do this well.
> >> Is this a bug that lm can't do x ~ y?
> >>
> >> R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"
> >> Copyright (C) 2018 The R Foundation for Statistical Computing
> >> Platform: x86_64-pc-linux-gnu (64-bit)
> >>
> >>  > x = c(79.744, 123.904, 87.29601, 116.352, 67.71201, 72.96001,
> >> 101.632, 108.928, 94.08)
> >>  > y = c(1506705739.385, 1506705766.895, 1506705746.293,
> >> 1506705761.873, 1506705734.743, 1506705735.351, 1506705756.26,
> >> 1506705761.307, 1506705747.372)
> >>  > m = lm(x ~ y)
> >>  > summary(m)
> >>
> >> Call:
> >> lm(formula = x ~ y)
> >>
> >> Residuals:
> >>   Min   1Q   Median   3Q  Max
> >> -27.0222 -14.9902  -0.6542  14.1938  29.1698
> >>
> >> Coefficients: (1 not defined because of singularities)
> >>  Estimate Std. Error t value Pr(>|t|)
> >> (Intercept)   94.734  6.511   14.55 4.88e-07 *** y
> >> NA NA  NA   NA
> >> ---
> >> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >>
> >> Residual standard error: 19.53 on 8 degrees of freedom
> >>
> >>  > summary(lm(y ~ x))
> >>
> >> Call:
> >> lm(formula = y ~ x)
> >>
> >> Residuals:
> >>  Min  1Q  Median  3Q Max
> >> -2.1687 -1.3345 -0.9466  1.3826  2.6551
> >>
> >> Coefficients:
> >>   Estimate Std. Error   t value Pr(>|t|)
> >> (Intercept) 1.507e+09  3.294e+00 4.574e+08  < 2e-16 *** x
> >> 6.136e-01  3.413e-02 1.798e+01 4.07e-07 ***
> >> ---
> >> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >>
> >> Residual standard error: 1.885 on 7 degrees of freedom Multiple
> >> R-squared:  0.9788,    Adjusted R-squared:  0.9758
> >> F-statistic: 323.3 on 1 and 7 DF,  p-value: 4.068e-07
> >>
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >> ---
> >> This email has been checked for viruses by AVG.
> >> https://www.avg.com
> >>
> >>
> >
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] lm fails on some large input

2019-04-18 Thread Fox, John
Dear Peter,

> -Original Message-
> From: peter dalgaard [mailto:pda...@gmail.com]
> Sent: Thursday, April 18, 2019 12:23 PM
> To: Fox, John 
> Cc: Michael Dewey ; Dingyuan Wang
> ; r-help@r-project.org
> Subject: Re: [R] lm fails on some large input
> 
> Um, you need to reverse y and x there. The question was about lm(y ~ x)
> 

Good catch! I missed that in the original posting, and lm() does indeed produce 
the LS solution for the regression of y on x. And, as I'd have expected, the 
naïve approach also fails for the regression of x on y:

> Y <- cbind(1, y)
> b <- solve(t(Y) %*% Y) %*% t(Y) %*% x
Error in solve.default(t(Y) %*% Y) : 
  system is computationally singular: reciprocal condition number = 6.19587e-35

resolving the mystery.

Thanks,
 John

> > X <- cbind(1, y)
> > solve(crossprod(X))
> Error in solve.default(crossprod(X)) :
>   system is computationally singular: reciprocal condition number = 6.19587e-
> 35
> 
> Actually, lm can QR perfectly OK, but it gets caught by its singularity 
> detection:
> 
> > qr <- qr(X, tol=1e-10)
> > qr # without the tol bit, you get same thing but $rank == 1
> $qr
>  y
>  [1,] -3.000 -4.520117e+09
>  [2,]  0.333 -3.426530e+01
>  [3,]  0.333 -2.947103e-02
>  [4,]  0.333  4.252164e-01
>  [5,]  0.333 -3.665468e-01
>  [6,]  0.333 -3.488029e-01
>  [7,]  0.333  2.614064e-01
>  [8,]  0.333  4.086982e-01
>  [9,]  0.333  2.018556e-03
> 
> $rank
> [1] 2
> 
> $qraux
> [1] 1.33 1.571779
> 
> $pivot
> [1] 1 2
> 
> attr(,"class")
> [1] "qr"
> > x = c(79.744, 123.904, 87.29601, 116.352, 67.71201, 72.96001, 101.632,
> > 108.928, 94.08)
> > qr.coef(qr,x)
>   y
> -2.403345e+09  1.595099e+00
> 
> > lm(x~y)
> 
> Call:
> lm(formula = x ~ y)
> 
> Coefficients:
> (Intercept)y
>   94.73   NA
> 
> > lm(x~y, tol=1e-10)
> 
> Call:
> lm(formula = x ~ y, tol = 1e-10)
> 
> Coefficients:
> (Intercept)y
>  -2.403e+091.595e+00
> 
> > lm(x~I(y-mean(y)))
> 
> Call:
> lm(formula = x ~ I(y - mean(y)))
> 
> Coefficients:
>(Intercept)  I(y - mean(y))
> 94.734   1.595
> 
> 
> > On 18 Apr 2019, at 17:56 , Fox, John  wrote:
> >
> > Dear Michael and Dingyuan Wang,
> >
> >> -Original Message-
> >> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> >> Michael Dewey
> >> Sent: Thursday, April 18, 2019 11:25 AM
> >> To: Dingyuan Wang ; r-help@r-project.org
> >> Subject: Re: [R] lm fails on some large input
> >>
> >> Perhaps subtract 1506705766 from y?
> >>
> >> Saying some other software does it well implies you know what the
> >> _correct_ answer is here but I would question what that means with
> >> this sort of data- set.
> >
> > It's rather an interesting problem, though, because the naïve computation of
> the LS solution works:
> >
> > plot(x, y)
> > X <- cbind(1, x)
> > b <- solve(t(X) %*% X) %*% t(X) %*% y
> > b
> > abline(b)
> >
> > That surprised me, because I expected that lm() computation, using the QR
> decomposition, would be more numerically stable.
> >
> > Best,
> > John
> >
> > -
> > John Fox
> > Professor Emeritus
> > McMaster University
> > Hamilton, Ontario, Canada
> > Web: https://socialsciences.mcmaster.ca/jfox/
> >
> >
> >
> >>
> >> On 17/04/2019 07:26, Dingyuan Wang wrote:
> >>> Hi,
> >>>
> >>> This input doesn't have any interesting properties except y is unix
> >>> time. Spreadsheets can do this well.
> >>> Is this a bug that lm can't do x ~ y?
> >>>
> >>> R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"
> >>> Copyright (C) 2018 The R Foundation for Statistical Computing
> >>> Platform: x86_64-pc-linux-gnu (64-bit)
> >>>
> >>>> x = c(79.744, 123.904, 87.29601, 116.352, 67.71201, 72.96001,
> >>> 101.632, 108.928, 94.08)  > y = c(1506705739.385, 1506705766.895,
> >>> 1506705746.293, 1506705761.873, 1506705734.743, 1506705735.351,
> >>> 1506705756.26, 1506705761.307,
> >>> 1506705747.372)
> >>>> m = lm(x ~ y)
> >>>> summary(m)
> >>>
> >>> Call:
> >>> lm(formula 

Re: [R] lm fails on some large input

2019-04-18 Thread Fox, John
Dear Michael and Dingyuan Wang,

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Michael
> Dewey
> Sent: Thursday, April 18, 2019 11:25 AM
> To: Dingyuan Wang ; r-help@r-project.org
> Subject: Re: [R] lm fails on some large input
> 
> Perhaps subtract 1506705766 from y?
> 
> Saying some other software does it well implies you know what the _correct_
> answer is here but I would question what that means with this sort of data-
> set.

It's rather an interesting problem, though, because the naïve computation of 
the LS solution works:

plot(x, y)
X <- cbind(1, x)
b <- solve(t(X) %*% X) %*% t(X) %*% y
b
abline(b)

That surprised me, because I expected that lm() computation, using the QR 
decomposition, would be more numerically stable.
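
A sketch of where the difference seems to come from, using the x and y from the 
original post: the cross-product matrix of cbind(1, y) is catastrophically 
ill-conditioned, while centring y (Michael's suggestion) tames it without 
changing the slope:

Y <- cbind(1, y)
kappa(crossprod(Y))                       # enormous
kappa(crossprod(cbind(1, y - mean(y))))   # modest after centring
coef(lm(x ~ I(y - mean(y))))              # same slope as lm(x ~ y, tol=1e-10) elsewhere in the thread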

Best,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/



> 
> On 17/04/2019 07:26, Dingyuan Wang wrote:
> > Hi,
> >
> > This input doesn't have any interesting properties except y is unix
> > time. Spreadsheets can do this well.
> > Is this a bug that lm can't do x ~ y?
> >
> > R version 3.5.2 (2018-12-20) -- "Eggshell Igloo"
> > Copyright (C) 2018 The R Foundation for Statistical Computing
> > Platform: x86_64-pc-linux-gnu (64-bit)
> >
> >  > x = c(79.744, 123.904, 87.29601, 116.352, 67.71201, 72.96001,
> > 101.632, 108.928, 94.08)  > y = c(1506705739.385, 1506705766.895,
> > 1506705746.293, 1506705761.873, 1506705734.743, 1506705735.351,
> > 1506705756.26, 1506705761.307,
> > 1506705747.372)
> >  > m = lm(x ~ y)
> >  > summary(m)
> >
> > Call:
> > lm(formula = x ~ y)
> >
> > Residuals:
> >   Min   1Q   Median   3Q  Max
> > -27.0222 -14.9902  -0.6542  14.1938  29.1698
> >
> > Coefficients: (1 not defined because of singularities)
> >      Estimate Std. Error t value Pr(>|t|)
> > (Intercept)   94.734  6.511   14.55 4.88e-07 *** y
> > NA NA  NA   NA
> > ---
> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> > Residual standard error: 19.53 on 8 degrees of freedom
> >
> >  > summary(lm(y ~ x))
> >
> > Call:
> > lm(formula = y ~ x)
> >
> > Residuals:
> >      Min  1Q  Median  3Q Max
> > -2.1687 -1.3345 -0.9466  1.3826  2.6551
> >
> > Coefficients:
> >   Estimate Std. Error   t value Pr(>|t|)
> > (Intercept) 1.507e+09  3.294e+00 4.574e+08  < 2e-16 *** x
> > 6.136e-01  3.413e-02 1.798e+01 4.07e-07 ***
> > ---
> > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> > Residual standard error: 1.885 on 7 degrees of freedom Multiple
> > R-squared:  0.9788,    Adjusted R-squared:  0.9758
> > F-statistic: 323.3 on 1 and 7 DF,  p-value: 4.068e-07
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> >
> > ---
> > This email has been checked for viruses by AVG.
> > https://www.avg.com
> >
> >
> 
> --
> Michael
> http://www.dewey.myzen.co.uk/home.html
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] isSingular for lm?

2019-04-05 Thread Fox, John
Hi Peter,


> On Apr 5, 2019, at 7:22 AM, peter dalgaard  wrote:
> 
> Can't you just check for NA coefficients?
> 
>> y <- rnorm(10) ; x <- rep(0,10)
>> coef(lm(y~x))
> (Intercept)   x 
> -0.0962404  NA 
> 
> so 
> 
>> any(is.na(coef(lm(y~x
> [1] TRUE
> 
> I have a vague recollection that at some point there might have been dragons 
> lurking in there (? - NA coefs silently removed), but I can't see a problem 
> with it presently.

I think that the problem you recall related to vcov.lm() and not coef(). 
vcov.lm() silently removed NAs, which motivated the introduction of the 
complete argument, defaulting to TRUE. AFAIK, coef() always included NAs in the 
coefficient vector for a singular fit.
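
Following Peter's suggestion, a one-line helper is probably all that's needed 
-- a sketch (the function name is invented):

is_singular_lm <- function(object) any(is.na(coef(object)))
y <- rnorm(10); x <- rep(0, 10)
is_singular_lm(lm(y ~ x))   # TRUE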

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http::/socserv.mcmaster.ca/jfox

> 
> -pd
> 
>> On 5 Apr 2019, at 12:14 , Witold E Wolski  wrote:
>> 
>> lme4 has a function isSingular to check if the fitted model is Singular,
>> 
>> Although lm has the parameter singular.ok = TRUE by defualt, I could
>> not find a function to check if the fitted model is singular.
>> 
>> What would be the correct way to implement such a function for and lm object?
>> Check if df.residuals == 0
>> 
>> Thanks
>> Witek
>> 
>> 
>> 
>> 
>> 
>> -- 
>> Witold Eryk Wolski
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> -- 
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School
> Solbjerg Plads 3, 2000 Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Potential Issue with lm.influence

2019-04-03 Thread Fox, John
Dear Eric,

I'm afraid that your argument doesn't make sense to me. As you saw when you 
tried

fit3 <- update(fit, subset = !(Name %in% c("Jupiter ", "Saturn ")))

glm.nb() effectively wasn't able to estimate the theta parameter of the 
negative binomial model. So why would it be better to base deletion diagnostics 
on actually refitting the model?

The lesson to me here is that if you fit a sufficiently unreasonable model to 
data, the computations may break down. Other than drawing attention to the NaN 
with an explicit warning, I don't see what more could usefully be done.
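
For what it's worth, the exact deletion fits you describe are easy to do by 
brute force -- a sketch using the moon_data data frame from your earlier 
message; note that the refit that also drops Saturn is essentially fit3 above 
and runs into the same numerical trouble:

md <- subset(moon_data, Name != "Jupiter ")
theta_loo <- sapply(seq_len(nrow(md)), function(i)
  tryCatch(MASS::glm.nb(Moons ~ Volume, data = md[-i, ])$theta,
           error = function(e) NA))
theta_loo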

Best,
 John

> On Apr 2, 2019, at 9:08 PM, Eric Bridgeford  wrote:
> 
> Hey John,
> 
> I am aware they are high leverage points, and that the model is not the
> best for them. The purpose of this dataset was to explore high leverage
> points, and diagnostic statistics through which one would identify them.
> 
> What I am saying is that the current behavior of the function seems a
> little non-specific to me; the influence for this problem is
> finite/computable manually by fitting n models to n-1 points (manually
> holding out each point individually to obtain the loo-variance, and
> computing the influence in the non-approximate way).
> 
> I am just suggesting that it seems the function could be improved by, say,
> throwing specific warnings when NaNs may arise. Ie, "Your have points that
> are very high leverage. The approximation technique is not numerically
> stable for these points and the results should be used with caution"
> etc...; I am sure there are other also pre-hoc approaches to diagnose other
> ways in which this function could fail). The approximation technique not
> behaving well for points that are ultra high leverage just seems peculiar
> that that would return an NaN with no other recommendations/advice/specific
> warnings, especially since the influence is frequently used to diagnosing
> this specific issue.
> 
> Alternatively, one could afford an optional argument type="manual" that
> computes the held-out variance manually rather than the approximate
> fashion, and add a comment to use this in the help menu when you have high
> leverage points (this is what I ended up doing to obtain the true influence
> and the externally studentized residual).
> 
> I just think some more specificity could be of use for future users, to
> make the R:stats community even better :) Does that make sense?
> 
> Sincerely,
> Eric
> 
> On Tue, Apr 2, 2019 at 7:53 PM Fox, John  wrote:
> 
>> Dear Eric,
>> 
>> Have you looked at your data? -- for example:
>> 
>>plot(log(Moons) ~ Volume, data = moon_data)
>>text(log(Moons) ~ Volume, data = moon_data, labels=Name, adj=1,
>> subset = Volume > 400)
>> 
>> The negative-binomial model doesn't look reasonable, does it?
>> 
>> After you eliminate Jupiter there's one very high leverage point left,
>> Saturn. Computing studentized residuals entails an approximation to
>> deleting that as well from the model, so try fitting
>> 
>>fit3 <- update(fit, subset = !(Name %in% c("Jupiter ", "Saturn ")))
>>summary(fit3)
>> 
>> which runs into numeric difficulties.
>> 
>> Then look at:
>> 
>>plot(log(Moons) ~ Volume, data = moon_data, subset = Volume < 400)
>> 
>> Finally, try
>> 
>>plot(log(Moons) ~ log(Volume), data = moon_data)
>>fit4 <- update(fit2, . ~ log(Volume))
>>rstudent(fit4)
>> 
>> I hope this helps,
>> John
>> 
>> -
>> John Fox
>> Professor Emeritus
>> McMaster University
>> Hamilton, Ontario, Canada
>> Web: https://socialsciences.mcmaster.ca/jfox/
>> 
>> 
>> 
>> 
>>> -Original Message-
>>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Eric
>>> Bridgeford
>>> Sent: Tuesday, April 2, 2019 5:01 PM
>>> To: Bert Gunter 
>>> Cc: R-help 
>>> Subject: Re: [R] Fwd: Potential Issue with lm.influence
>>> 
>>> I agree the influence documentation suggests NaNs may result; however, as
>>> these can be manually computed and are, indeed, finite/existing (ie,
>>> computing the held-out influence by manually training n models for n
>> points
>>> to obtain n leave one out influence measures), I don't possibly see how
>> the
>>> function SHOULD return NaN, and given that it is returning NaN, that
>>> suggests to me that there should be either a) 

Re: [R] Potential Issue with lm.influence

2019-04-03 Thread Fox, John
Hi Peter,

Yes, that's another reflection of the degree to which Jupiter and Saturn are 
out of line with the data for the other planet when you fit the very 
unreasonable negative binomial model with Volume untransformed.

Best,
 John

> On Apr 3, 2019, at 5:36 AM, peter dalgaard  wrote:
> 
> Yes, also notice that 
> 
>> predict(fit3, new=moon_data, type="resp")
>   123456 
> 1.060694e+00 1.102008e+00 1.109695e+00 1.065515e+00 1.057896e+00 1.892312e+29 
>   789   10   11   12 
> 3.531271e+17 2.295015e+01 1.739889e+01 1.058165e+00 1.058041e+00 1.057957e+00 
>  13 
> 1.058217e+00 
> 
> 
> so the model of fit3 predicts that Jupiter and Saturn should have several 
> bazillions of moons each!
> 
> -pd
> 
> 
> 
>> On 3 Apr 2019, at 01:53 , Fox, John  wrote:
>> 
>> Dear Eric,
>> 
>> Have you looked at your data? -- for example:
>> 
>>  plot(log(Moons) ~ Volume, data = moon_data)
>>  text(log(Moons) ~ Volume, data = moon_data, labels=Name, adj=1, subset 
>> = Volume > 400)
>> 
>> The negative-binomial model doesn't look reasonable, does it?
>> 
>> After you eliminate Jupiter there's one very high leverage point left, 
>> Saturn. Computing studentized residuals entails an approximation to deleting 
>> that as well from the model, so try fitting
>> 
>>  fit3 <- update(fit, subset = !(Name %in% c("Jupiter ", "Saturn ")))
>>  summary(fit3)
>> 
>> which runs into numeric difficulties.
>> 
>> Then look at:
>> 
>>  plot(log(Moons) ~ Volume, data = moon_data, subset = Volume < 400)
>> 
>> Finally, try
>> 
>>  plot(log(Moons) ~ log(Volume), data = moon_data)
>>  fit4 <- update(fit2, . ~ log(Volume))
>>  rstudent(fit4)
>> 
>> I hope this helps,
>> John
>> 
>> -
>> John Fox
>> Professor Emeritus
>> McMaster University
>> Hamilton, Ontario, Canada
>> Web: https://socialsciences.mcmaster.ca/jfox/
>> 
>> 
>> 
>> 
>>> -Original Message-
>>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Eric
>>> Bridgeford
>>> Sent: Tuesday, April 2, 2019 5:01 PM
>>> To: Bert Gunter 
>>> Cc: R-help 
>>> Subject: Re: [R] Fwd: Potential Issue with lm.influence
>>> 
>>> I agree the influence documentation suggests NaNs may result; however, as
>>> these can be manually computed and are, indeed, finite/existing (ie,
>>> computing the held-out influence by manually training n models for n points
>>> to obtain n leave one out influence measures), I don't possibly see how the
>>> function SHOULD return NaN, and given that it is returning NaN, that
>>> suggests to me that there should be either a) Providing an alternative
>>> method to compute them that (may be slower) that returns the correct
>>> results in the even that lm.influence does not return a good approximation
>>> (ie, a command line argument for type="approx" that does the
>>> approximation strategy employed currently, or an alternative type="direct"
>>> or something like that that computes them manually), or b) a heuristic to
>>> suggest why NaNs might result from one's particular inputs/what can be
>>> done to fix it (if the approximation strategy is the source of the problem) 
>>> or
>>> what the issue is with the data that will cause NaNs. Hence I was looking to
>>> start a discussion around the specific strategy employed to compute the
>>> elements.
>>> 
>>> Below is the code:
>>> moon_data <- structure(list(Name = structure(c(8L, 13L, 2L, 7L, 1L, 5L, 11L,
>>>  12L, 9L, 10L, 4L, 6L, 3L), 
>>> .Label = c("Ceres ", "Earth",
>>> "Eris ",
>>> 
>>>"Haumea ", "Jupiter ", "Makemake ", "Mars ", "Mercury ", "Neptune ",
>>> 
>>>"Pluto ", "Saturn ", "Uranus ", "Venus "), class = "factor"),
>>>   Distance = c(0.39, 0.72, 1, 1.52, 2.75, 5.2, 
>>> 9.54, 19.22,
>>>30.06, 39.5, 43.35, 45.8, 67.7), 
>>> Diameter = c(0.382, 0.949,

Re: [R] Fwd: Potential Issue with lm.influence

2019-04-02 Thread Fox, John
Dear Eric,

Have you looked at your data? -- for example:

plot(log(Moons) ~ Volume, data = moon_data)
text(log(Moons) ~ Volume, data = moon_data, labels=Name, adj=1, subset 
= Volume > 400)

The negative-binomial model doesn't look reasonable, does it?

After you eliminate Jupiter there's one very high leverage point left, Saturn. 
Computing studentized residuals entails an approximation to deleting that as 
well from the model, so try fitting

fit3 <- update(fit, subset = !(Name %in% c("Jupiter ", "Saturn ")))
summary(fit3)

which runs into numeric difficulties.

Then look at:

plot(log(Moons) ~ Volume, data = moon_data, subset = Volume < 400)

Finally, try

plot(log(Moons) ~ log(Volume), data = moon_data)
fit4 <- update(fit2, . ~ log(Volume))
rstudent(fit4)

I hope this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/




> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Eric
> Bridgeford
> Sent: Tuesday, April 2, 2019 5:01 PM
> To: Bert Gunter 
> Cc: R-help 
> Subject: Re: [R] Fwd: Potential Issue with lm.influence
> 
> I agree the influence documentation suggests NaNs may result; however, as
> these can be manually computed and are, indeed, finite/existing (ie,
> computing the held-out influence by manually training n models for n points
> to obtain n leave one out influence measures), I don't possibly see how the
> function SHOULD return NaN, and given that it is returning NaN, that
> suggests to me that there should be either a) Providing an alternative
> method to compute them that (may be slower) that returns the correct
> results in the even that lm.influence does not return a good approximation
> (ie, a command line argument for type="approx" that does the
> approximation strategy employed currently, or an alternative type="direct"
> or something like that that computes them manually), or b) a heuristic to
> suggest why NaNs might result from one's particular inputs/what can be
> done to fix it (if the approximation strategy is the source of the problem) or
> what the issue is with the data that will cause NaNs. Hence I was looking to
> start a discussion around the specific strategy employed to compute the
> elements.
> 
> Below is the code:
> moon_data <- structure(list(Name = structure(c(8L, 13L, 2L, 7L, 1L, 5L, 11L,
>12L, 9L, 10L, 4L, 6L, 3L), 
> .Label = c("Ceres ", "Earth",
> "Eris ",
> 
>  "Haumea ", "Jupiter ", "Makemake ", "Mars ", "Mercury ", "Neptune ",
> 
>  "Pluto ", "Saturn ", "Uranus ", "Venus "), class = "factor"),
> Distance = c(0.39, 0.72, 1, 1.52, 2.75, 5.2, 
> 9.54, 19.22,
>  30.06, 39.5, 43.35, 45.8, 67.7), 
> Diameter = c(0.382, 0.949,
> 
>1, 0.532, 0.08, 11.209, 9.449, 4.007, 3.883, 0.18, 0.15,
> 
>0.12, 0.19), Mass = c(0.06, 0.82, 1, 0.11, 2e-04, 317.8,
> 
>  95.2, 14.6, 17.2, 0.0022, 7e-04, 7e-04, 
> 0.0025), Moons = c(0L,
> 
> 
> 0L, 1L, 2L, 0L, 64L, 62L, 27L, 13L, 4L, 2L, 0L, 1L), Volume =
> c(0.0291869497930152,
> 
> 
> 
> 0.447504348276571, 0.523598775598299, 0.0788376225681443,
> 
> 
> 
> 0.000268082573106329, 737.393372232996, 441.729261571372,
> 
> 
> 
> 33.6865588825666, 30.6549628355953, 0.00305362805928928,
> 
> 
> 
> 0.00176714586764426, 0.00090477868423386, 0.00359136400182873
> 
> 
> )), row.names = c(NA, -13L), class = "data.frame")
> 
> fit <- glm.nb(Moons ~ Volume, data = moon_data)
> rstudent(fit)
> 
> fit2 <- update(fit, subset = Name != "Jupiter ")
> rstudent(fit2)
> 
> influence(fit2)$sigma
> 
> #12345789
>  10   11   12   13
> # 1.077945 1.077813 1.165025 1.181685 1.077954  NaN 1.044454 1.152110
> 1.187586 1.181696 1.077954 1.165147
> 
> Sincerely,
> Eric
> 
> On Tue, Apr 2, 2019 at 4:38 PM Bert Gunter 
> wrote:
> 
> > Also, I suggest you read ?influence which may explain the source of
> > your NaN's .
> >
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Tue, Apr 2, 2019 at 1:29 PM Bert Gunter 
> wrote:
> >
> >> I told you already: **Include code inline **
> >>
> >> See ?dput for how to include a text version of objects, such as data
> >> frames, inline.
> >>
> >> Otherwise, I believe .txt text files are not stripped if you insist
> >> on
> >> *attaching* data or code. Others may have better advice.
> >>
> >>
> >> Bert Gunter
> >>
> >> "The trouble with having an open mind is that people keep coming
> >> along and stic

Re: [R] Multilevel models

2019-03-04 Thread Fox, John
Dear Saul,

The most commonly used mixed-effects modeling software in R, in the lme4 and nlme 
packages, uses the Laird-Ware form of the model, which isn't explicitly 
hierarchical. That is, higher-level variables are simply invariant within 
groups and appear in the model formula in the same manner as individual-level 
variables. So there's no problem -- just specify the model as you normally 
would.
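
For concreteness, a minimal sketch -- all of the variable and data-frame names 
(satisfaction, productivity, department, firms) are invented stand-ins for your 
data. The leave-one-out department mean varies across employees, but it simply 
enters the fixed-effects formula like any other predictor, and the department 
random intercept accounts for the non-independence within departments:

library(lme4)
# leave-one-out department mean of satisfaction for each employee:
firms$peer_sat <- with(firms,
  (ave(satisfaction, department, FUN = sum) - satisfaction) /
    (ave(satisfaction, department, FUN = length) - 1))
fit <- lmer(productivity ~ peer_sat + (1 | department), data = firms)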

By the way, you're more likely to get responses about mixed models if you post 
to the R-sig-mixed-models list 
 rather than to the 
more general R-help list.

I hope this helps,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http::/socserv.mcmaster.ca/jfox

> On Mar 3, 2019, at 5:19 AM, Saul Weaver  wrote:
> 
> Hello,
> 
> I have data with workers within departments. I am interested in testing the
> effects of peers' satisfaction on employees' productivity. To assess peer
> satisfaction, I calculate, for each employee, the average satisfaction of
> the employees' peers within the department. In other words, I calculate the
> average satisfaction in the department, while excluding the focal employee.
> I'm not sure about the level of this variable, because on the one hand, it
> is unique for each employee, but on the other hand, the values of this
> variable across employees are not independent of each other. How would I
> account for this issue in R?
> 
> Thank you,
> 
> S Weaver
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Obtaining values of estimates from a regression; How do I get values from a list?

2019-02-22 Thread Fox, John
Dear John,

This seems to be more complicated than it needs to be. One normally uses coef() 
to extract coefficients from a model object. Thus

> coef(fitchange)
(Intercept) pre 
-54.1010158   0.6557661 
> coef(fitchange)[2]
  pre 
0.6557661 
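
Indexing by name also works and is a little more robust to changes in the model 
formula:

coef(fitchange)["pre"]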

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http::/socserv.mcmaster.ca/jfox

> On Feb 22, 2019, at 3:26 AM, Sorkin, John  wrote:
> 
> Problem solved:
> 
> summary(fitchange)$coefficients[2,1]
> 
> 
> 
> John David Sorkin M.D., Ph.D.
> Professor of Medicine
> Chief, Biostatistics and Informatics
> University of Maryland School of Medicine Division of Gerontology and 
> Geriatric Medicine
> Baltimore VA Medical Center
> 10 North Greene Street
> GRECC (BT/18/GR)
> Baltimore, MD 21201-1524
> (Phone) 410-605-7119
> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
> 
> 
> 
> From: R-help  on behalf of Ivan Calandra 
> 
> Sent: Friday, February 22, 2019 2:58:47 AM
> To: r-help@r-project.org
> Subject: Re: [R] Obtaining values of estimates from a regression; How do I 
> get values from a list?
> 
> I find that the str() function is really helpful to understand how an object 
> is
> structured, and therefore how to extract part(s) of it.
> 
> Try for example:
> str(zz)
> and it might help you understand why zz$coefficients[2,1] is what you were
> looking for.
> 
> HTH
> Ivan
> 
> --
> Dr. Ivan Calandra
> TraCEr, laboratory for Traceology and Controlled Experiments
> MONREPOS Archaeological Research Centre and
> Museum for Human Behavioural Evolution
> Schloss Monrepos
> 56567 Neuwied, Germany
> +49 (0) 2631 9772-243
> https://www.researchgate.net/profile/Ivan_Calandra
> 
> On February 22, 2019 at 8:50 AM Eric Berger  wrote:
>> You have some choices
>> 
>> fitchange$coefficients[2]
>> 
>> zz$coefficients[2,1]
>> 
>> Note that class(zz$coefficients) shows that it is a matrix.
>> 
>> HTH,
>> Eric
>> 
>> 
>> On Fri, Feb 22, 2019 at 9:45 AM Sorkin, John 
>> wrote:
>> 
>>> I am trying to obtain the coefficients from a regression (performed using
>>> lm). I would like to get the value for the slope (i.e. estimate) for pre
>>> from the following regression:
>>> 
>>> 
>>> fitchange <- lm(post-pre~pre,data=mydata2)
>>> 
>>> 
>>> I have tried the following without any success:
>>> 
>>> 
>>> zz <- summary(fitchange)["coefficients"]
>>> class(zz)
>>> print(zz)
>>> zz[[2,1]]
>>> zz[2,1]
>>> zz["pre","Estimate"]
>>> 
>>> 
>>> I clearly don't know how to select elements from the list returned by the
>>> summary function.
>>> 
>>> 
>>> A reproducible version of my code follows:
>>> 
>>> 
>>> mydata2 <-structure(list(pre = c(71.3302299440613, 86.2703384845455,
>>> 120.941698468568,
>>> 80.9020778388552, 84.9927752038908,
>>> 77.9108032451793, 111.007107108483,
>>> 93.288442414475, 126.097826796255,
>>> 111.63734644637),
>>> post = c(45.9294556667686,
>>> 114.661937978585, 138.501558726477,
>>> 55.355775963925, 97.7906200355594,
>>> 71.1008233796004, 149.308274695789,
>>> 122.828428213951, 143.690814568562,
>>> 116.607579975539)), class = "data.frame",
>>> row.names = c(NA, -10L))
>>> 
>>> fitchange <- lm(post-pre~pre,data=mydata2)
>>> zz <- summary(fitchange)["coefficients"]
>>> class(zz)
>>> print(zz)
>>> zz[[2,1]]
>>> zz[2,1]
>>> zz["pre","Estimate"]
>>> 
>>> 
>>> Any help you can offer would be appreciated.
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> John David Sorkin M.D., Ph.D.
>>> Professor of Medicine
>>> Chief, Biostatistics and Informatics
>>> University of Maryland School of Medicine Division of Gerontology and
>>> Geriatric Medicine
>>> Baltimore VA Medical Center
>>> 10 North Greene Street
>>> GRECC (BT/18/GR)
>>> Baltimore, MD 21201-1524
>>> (Phone) 410-605-7119
>>> (Fax) 410-605-7913 (Please call phone number above prior to faxing)
>>> 
>>> 
>>> [[alternative HTML version deleted]]
>>> 
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>> 
>> 
>> [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>[[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Re: [R] adding a hex sticker to a package

2019-01-21 Thread Fox, John
Dear Terry,

I added a hex sticker to the effects package, and there will be one in the next 
versions of the car and Rcmdr packages. I put a pdf with the hex sticker in 
install/docs, and display it with the function effectsHexsticker(); see 
?effectsHexsticker. I imagine that there are other ways to do this as well.
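
For what it's worth, a minimal sketch of that general approach -- the file 
location under inst/ and the function name here are invented for illustration, 
not how any existing package does it:

showHexsticker <- function(package = "survival") {
  f <- system.file("figures", "hexsticker.pdf", package = package)
  if (nzchar(f)) utils::browseURL(f) else warning("no sticker file found")
}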

Best,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Therneau,
> Terry M., Ph.D. via R-help
> Sent: Monday, January 21, 2019 12:52 PM
> To: R-help 
> Subject: [R] adding a hex sticker to a package
> 
> I've created a hex sticker for survival.  How should that be added to the
> package directory?   It's temporarily in man/figures on the github page.
> 
> Terry T.
> 
> (Actually, the idea was from Ryan Lennon. I liked it, and we found someone
> with actual graphical skills to execute it. )
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] importing data error question

2019-01-18 Thread Fox, John
Dear Jihee,

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of ???
> Sent: Wednesday, January 16, 2019 7:02 PM
> To: Fox, John 
> Cc: r-help@r-project.org
> Subject: Re: [R] importing data error question
> 
> Thanks for your help!
> 
> I was having trouble finding out how to use English...
> 
> Even though I tried to use the English language, I couldn't change the
> language of R Commander (it is still Korean).
> 
> Sorry, but do you know how to change the language of "R Commander"? I have
> no idea why it doesn't change.

But the screenshots you sent in previous messages *did* show the Rcmdr in 
English, so you apparently successfully changed the language, I assume via the 
command Sys.setenv(LANGUAGE="en") that I suggested.
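
For the record, the sequence is (in a fresh R session, before the package 
loads):

Sys.setenv(LANGUAGE = "en")
library(Rcmdr)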

John

> 
> Best,
> 
> Jihee
> 
> From:  "Fox, John" 
> 
> Sent: Thursday, January 17, 2019 1:59:03 AM
> 
> To:"우지희" 
> 
> Cc:"r-help@r-project.org" 
> 
> Subject:Re: [R] importing data error question
> 
>   Dear jihee,
> 
>  I've looked into this problem further, using my Mac where it's easier to
> temporarily change languages and character sets than on Windows, and I
> discovered the following:
> 
>  I was able to duplicate your problem with importing Excel files when working
> in Korean. There's a similar problem with the import SAS b7dat files but not
> with the other file-import dialogs.
> 
>  I observed a similar problem when working in Chinese (LANG="zh") but not in
> simplified Chinese (zh_CN) or Japanese (ja), so the problem isn't simply with
> non-Latin character sets. There is no problem in English, Spanish (es), or
> French (fr), and I didn't check the other languages into which the Rcmdr is
> translated.
> 
>  I think that the problem originates in the Korean and Chinese translation 
> files
> and I'll contact the translators to see whether they can fix it.
> 
>  Thank you for reporting this issue.
> 
>  John
> 
>  > On Jan 14, 2019, at 11:36 PM, Fox, John  wrote:
>  >
>  > Dear jihee,
>  >
>  >> On Jan 14, 2019, at 9:00 PM, 우지희  wrote:
>  >>
>  >> You said previously that you were using a Mac, so I'm surprised that you
> now say that you're using Windows. I don't have a Windows 7 system, but I
> can confirm that importing from Excel files works perfectly fine under
> Windows 10, as I just verified, and I'd be surprised if the Windows version
> matters.
>  >>
>  >> --> no, I never said i was using a Mac.
>  >
>  > Sorry, I guess I got that from the error message you originally reported,
> which was "Error in structure(.External(.C_dotTclObjv, objv), class = 
> "tclObj") :
> [tcl] bad Macintosh file type "“*”"." I've never seen that error and it seems
> peculiar that it would occur on a Windows system.
>  >
>  >>
>  >> You still haven't reported the versions of R, the Rcmdr package, and the
> other packages that you're using. The easiest way to do this is to show the
> output of the sessionInfo() command.
>  >>
>  >> --> sessionInfo()
>  >> R version 3.5.2 (2018-12-20)
>  >> Platform: x86_64-w64-mingw32/x64 (64-bit)  >> Running under: Windows
> 7 x64 (build 7601) Service Pack 1  >>  >> Matrix products: default  >>  >>
> locale:
>  >> [1] LC_COLLATE=Korean_Korea.949 LC_CTYPE=Korean_Korea.949  >> [3]
> LC_MONETARY=Korean_Korea.949 LC_NUMERIC=C  >> [5]
> LC_TIME=Korean_Korea.949  >>  >> attached base packages:
>  >> [1] tcltk splines stats graphics grDevices utils datasets methods  >> [9] 
> base
> >>  >> other attached packages:
>  >> [1] RcmdrPlugin.SensoMineR_1.11-01 RcmdrPlugin.FactoMineR_1.6-0  >>
> [3] Rcmdr_2.5-1 effects_4.1-0  >> [5] RcmdrMisc_2.5-1 sandwich_2.5-0  >> [7]
> car_3.0-2 carData_3.0-2  >> [9] SensoMineR_1.23 FactoMineR_1.41  >>  >>
> loaded via a namespace (and not attached):
>  >> [1] gtools_3.8.1 Formula_1.2-3 latticeExtra_0.6-28  >> [4] 
> cellranger_1.1.0
> pillar_1.3.1 backports_1.1.3  >> [7] lattice_0.20-38 digest_0.6.18
> RColorBrewer_1.1-2  >> [10] checkmate_1.8.5 minqa_1.2.4 colorspace_1.3-2
> >> [13] survey_3.35 htmltools_0.3.6 Matrix_1.2-15  >> [16] plyr_1.8.4
> pkgconfig_2.0.2 haven_2.0.0  >> [19] scales_1.0.0 openxlsx_4.1.0 rio_0.5.16
> >> [22] lme4_1.1-19 htmlTable_1.13.1 tibble_1.4.2  >> [25] relimp_1.0-5
> ggplot2_3.1.0 nnet_7.3-12  >> [28] lazyeval_0.2.1 survival_2.4

Re: [R] importing data error question

2019-01-18 Thread Fox, John
Dear Jihee,

> On Jan 17, 2019, at 7:00 PM, 우지희  wrote:
> 
> Dear John,
>  
> (1) I noticed that you loaded the FactoMineR and SensoMineR plug-ins. Try 
> again without loading these plug-ins.
> not worked :(
>  
> 

OK. I don't understand why that doesn't work. There is likely some peculiarity 
in your system, but I have no idea what it is, and I can't think what else I 
might do without access to your computer.

>  
>  
> (2) Download and try reading the plain-text data file from 
> <https://socialsciences.mcmaster.ca/jfox/Courses/R/ICPSR/Prestige.xlsx>, 
> using "Data > Import data > From text file, clipboard, or URL"; you can take 
> all of the defaults in the resulting dialog box.
>  
> I think importing is working, but I can't view the data set. It says ERROR:
> DATA FRAME TOO WIDE
>  
> 

You tried to read the Excel file Prestige.xlsx as if it were a plain-text file, 
which produces nonsense. This was my fault: I sent the wrong link; the correct 
file is at 
<https://socialsciences.mcmaster.ca/jfox/Courses/R/ICPSR/Prestige.txt>.
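
Equivalently, outside the menus -- a sketch that assumes the file is the plain, 
whitespace-delimited table with a header row that the import dialog expects:

Prestige <- read.table(
  "https://socialsciences.mcmaster.ca/jfox/Courses/R/ICPSR/Prestige.txt",
  header = TRUE)
head(Prestige)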

Best,
 John

>  
>  
>  
> I have no idea either. ;(
> I might give up for now.
>  
>  
> Thanks again!
>  
> Best,
> Jihee
>  
>  
>  
> From: "Fox, John" 
> Sent: Friday, January 18, 2019 12:02:42 AM
> To:"우지희" 
> Cc:"" 
> Subject:Re: [R] importing data error question
>  
>  
> Dear Jihee,
> 
> Your latest attempt has gotten farther than the previous one but has produced 
> a different error. The command to read the data set was generated properly. 
> You can see whether the data set was in fact read by typing prestige (the 
> name you gave to the data set) at the > command prompt in the R console. 
> Assuming that the data set was read, an error occurred when the Rcmdr tried 
> to make it the active data set. I'm afraid that I don't understand how this 
> could happen because this procedure works correctly for me and for others. 
> The underlying code is invoked whenever the Rcmdr reads a data set.
> 
> I suggest that you try two additional things:
> 
> (1) I noticed that you loaded the FactoMineR and SensoMineR plug-ins. Try 
> again without loading these plug-ins.
> 
> (2) Download and try reading the plain-text data file from 
> <https://socialsciences.mcmaster.ca/jfox/Courses/R/ICPSR/Prestige.xlsx>, 
> using "Data > Import data > From text file, clipboard, or URL"; you can take 
> all of the defaults in the resulting dialog box.
> 
> If neither of these works then I'm afraid that I'm out of ideas. There's 
> something peculiar about your R installation that I can't detect.
> 
> Best,
> John
> 
> 
> 
> > On Jan 17, 2019, at 12:24 AM, 우지희  wrote:
> > 
> > Dear John,
> >  
> > I tried with your file. R commander could read the file but there's still 
> > no active dataset
> >  
> > Anyway I'll send my file, too
> >  
> > Jihee
> >  
> > <528c421a382d426895f6446b32fbc6f0.png>
> >  
> > From: "Fox, John" 
> > Sent: Thursday, January 17, 2019 2:09:52 PM
> > To:"우지희" 
> > Cc:"" 
> > Subject:Re: [R] importing data error question
> >  
> >  
> > Dear Jihee,
> > 
> > This appears to be a different problem. You were apparently able to access 
> > the spreadsheet file, but the R Commander didn't find a suitable worksheet 
> > in it.
> > 
> > Try downloading and reading the file at 
> > <https://socialsciences.mcmaster.ca/jfox/Courses/R/ICPSR/Prestige.xlsx>. If 
> > that works, send me privately (i.e., directly) your Excel spreadsheet file 
> > and I'll take a look at it.
> > 
> > Best,
> > John
> > 
> > > On Jan 16, 2019, at 9:49 PM, 우지희  wrote:
> > > 
> > > Dear John,
> > >  
> > > now i can use english thank you very much!!
> > >  
> > > um.. but nothing's changed... with that {r} message at R Markdown.
> > >  
> > > There's no dataset.
> > >  
> > > i tried both .xls and .xlsx .
> > >  
> > >  
> > > Jihee
> > >  
> > >  
> > >  
> > > 
> > >  
> > >  
> > >  
> > > From: "Fox, John" 
> > > Sent: Thursday, January 17, 2019 10:59:44 AM
> > > To:"우지희" 
> > > Cc:"" 
> > > Subject:Re: [R] importing data error question
> > >  
> > >  
> > > Dear Jihee,
> > > 
> > &g

Re: [R] importing data error question

2019-01-17 Thread Fox, John
Dear Jihee,

Your latest attempt has gotten farther than the previous one but has produced a 
different error. The command to read the data set was generated properly. You 
can see whether the data set was in fact read by typing prestige (the name you 
gave to the data set) at the > command prompt in the R console. Assuming that 
the data set was read, an error occurred when the Rcmdr tried to make it the 
active data set. I'm afraid that I don't understand how this could happen 
because this procedure works correctly for me and for others. The underlying 
code is invoked whenever the Rcmdr reads a data set.

I suggest that you try two additional things:

(1) I noticed that you loaded the FactoMineR and SensoMineR plug-ins. Try again 
without loading these plug-ins.

(2) Download and try reading the plain-text data file from 
<https://socialsciences.mcmaster.ca/jfox/Courses/R/ICPSR/Prestige.xlsx>, using 
"Data > Import data > From text file, clipboard, or URL"; you can take all of 
the defaults in the resulting dialog box.

If neither of these works then I'm afraid that I'm out of ideas. There's 
something peculiar about your R installation that I can't detect.

Best,
 John



> On Jan 17, 2019, at 12:24 AM, 우지희  wrote:
> 
> Dear John,
>  
> I tried with your file. R commander could read the file but there's still no 
> active dataset
>  
> Anyway I'll send my file, too
>  
> Jihee
>  
> <528c421a382d426895f6446b32fbc6f0.png>
>  
> From: "Fox, John" 
> Sent: Thursday, January 17, 2019 2:09:52 PM
> To:"우지희" 
> Cc:"" 
> Subject:Re: [R] importing data error question
>  
>  
> Dear Jihee,
> 
> This appears to be a different problem. You were apparently able to access 
> the spreadsheet file, but the R Commander didn't find a suitable worksheet in 
> it.
> 
> Try downloading and reading the file at 
> <https://socialsciences.mcmaster.ca/jfox/Courses/R/ICPSR/Prestige.xlsx>. If 
> that works, send me privately (i.e., directly) your Excel spreadsheet file 
> and I'll take a look at it.
> 
> Best,
> John
> 
> > On Jan 16, 2019, at 9:49 PM, 우지희  wrote:
> > 
> > Dear John,
> >  
> > now i can use english thank you very much!!
> >  
> > um.. but nothing's changed... with that {r} message at R Markdown.
> >  
> > There's no dataset.
> >  
> > i tried both .xls and .xlsx .
> >  
> >  
> > Jihee
> >  
> >  
> >  
> > 
> >  
> >  
> >  
> > From: "Fox, John" 
> > Sent: Thursday, January 17, 2019 10:59:44 AM
> > To:"우지희" 
> > Cc:"" 
> > Subject:Re: [R] importing data error question
> >  
> >  
> > Dear Jihee,
> > 
> > Probably the easiest way to change the language to English temporarily in R 
> > is to enter the command
> > 
> > Sys.setenv(LANGUAGE="en")
> > 
> > at the R command prompt prior to loading the Rcmdr package.
> > 
> > I hope that this helps,
> > John
> > 
> > 
> > > On Jan 16, 2019, at 7:02 PM, 우지희  wrote:
> > > 
> > > Thanks for your help!
> > >  
> > > I was having trouble with finding how to use english...
> > >  
> > > Even though I try to use english language, I couldn't change language of 
> > > R commander. (it is still korean)
> > >  
> > > Sorry but.. do you know how to change language of "R commander"? I have 
> > > no idea why it doesn't change.
> > >  
> > > Best,
> > > Jihee
> > >  
> > > From: "Fox, John" 
> > > Sent: Thursday, January 17, 2019 1:59:03 AM
> > > To:"우지희" 
> > > Cc:"r-help@r-project.org" 
> > > Subject:Re: [R] importing data error question
> > >  
> > >  
> > > Dear jihee,
> > > 
> > > I've looked into this problem further, using my Mac where it's easier to 
> > > temporarily change languages and character sets than on Windows, and I 
> > > discovered the following:
> > > 
> > > I was able to duplicate your problem with importing Excel files when 
> > > working in Korean. There's a similar problem with the import SAS b7dat 
> > > files but not with the other file-import dialogs.
> > > 
> > > I observed a similar problem when working in Chinese (LANG="zh") but not 
> > > in simplified Chinese (zh_CN) or Japanese (ja), so the problem isn't 
> > > simply with non-Latin 

Re: [R] importing data error question

2019-01-16 Thread Fox, John
Dear Jihee,

This appears to be a different problem. You were  apparently able to access the 
spreadsheet file, but the R Commander didn't find a suitable worksheet in it.

Try downloading and reading the file at 
<https://socialsciences.mcmaster.ca/jfox/Courses/R/ICPSR/Prestige.xlsx>. If 
that works, send me privately (i.e., directly) your Excel spreadsheet file and 
I'll take a look at it.

Best,
 John

> On Jan 16, 2019, at 9:49 PM, 우지희  wrote:
> 
> Dear John,
>  
> now i can use english thank you very much!!
>  
> um.. but nothing's changed... with that {r} message at R Markdown.
>  
> There's no dataset.
>  
> i tried both .xls and .xlsx .
>  
>  
> Jihee
>  
>  
>  
> 
>  
>  
>  
> From: "Fox, John" 
> Sent: Thursday, January 17, 2019 10:59:44 AM
> To:"우지희" 
> Cc:"" 
> Subject:Re: [R] importing data error question
>  
>  
> Dear Jihee,
> 
> Probably the easiest way to change the language to English temporarily in R 
> is to enter the command
> 
> Sys.setenv(LANGUAGE="en")
> 
> at the R command prompt prior to loading the Rcmdr package.
> 
> I hope that this helps,
> John
> 
> 
> > On Jan 16, 2019, at 7:02 PM, 우지희  wrote:
> > 
> > Thanks for your help!
> >  
> > I was having trouble with finding how to use english...
> >  
> > Even though I try to use english language, I couldn't change language of R 
> > commander. (it is still korean)
> >  
> > Sorry but.. do you know how to change language of "R commander"? I have no 
> > idea why it doesn't change.
> >  
> > Best,
> > Jihee
> >  
> > From: "Fox, John" 
> > Sent: Thursday, January 17, 2019 1:59:03 AM
> > To:"우지희" 
> > Cc:"r-help@r-project.org" 
> > Subject:Re: [R] importing data error question
> >  
> >  
> > Dear jihee,
> > 
> > I've looked into this problem further, using my Mac where it's easier to 
> > temporarily change languages and character sets than on Windows, and I 
> > discovered the following:
> > 
> > I was able to duplicate your problem with importing Excel files when 
> > working in Korean. There's a similar problem with the import SAS b7dat 
> > files but not with the other file-import dialogs.
> > 
> > I observed a similar problem when working in Chinese (LANG="zh") but not in 
> > simplified Chinese (zh_CN) or Japanese (ja), so the problem isn't simply 
> > with non-Latin character sets. There is no problem in English, Spanish 
> > (es), or French (fr), and I didn't check the other languages into which the 
> > Rcmdr is translated.
> > 
> > I think that the problem originates in the Korean and Chinese translation 
> > files and I'll contact the translators to see whether they can fix it.
> > 
> > Thank you for reporting this issue.
> > 
> > John
> > 
> > > On Jan 14, 2019, at 11:36 PM, Fox, John  wrote:
> > > 
> > > Dear jihee,
> > > 
> > >> On Jan 14, 2019, at 9:00 PM, 우지희  wrote:
> > >> 
> > >> You said previously that you were using a Mac, so I'm surprised that you 
> > >> now say that you're using Windows. I don't have a Windows 7 system, but 
> > >> I can confirm that importing from Excel files works perfectly fine under 
> > >> Windows 10, as I just verified, and I'd be surprised if the Windows 
> > >> version matters. 
> > >> 
> > >> --> no, I never said i was using a Mac. 
> > > 
> > > Sorry, I guess I got that from the error message you originally reported, 
> > > which was "Error in structure(.External(.C_dotTclObjv, objv), class = 
> > > "tclObj") : [tcl] bad Macintosh file type "“*”"." I've never seen that 
> > > error and it seems peculiar that it would occur on a Windows system.
> > > 
> > >> 
> > >> You still haven't reported the versions of R, the Rcmdr package, and the 
> > >> other packages that you're using. The easiest way to do this is to show 
> > >> the output of the sessionInfo() command. 
> > >> 
> > >> --> sessionInfo()
> > >> R version 3.5.2 (2018-12-20)
> > >> Platform: x86_64-w64-mingw32/x64 (64-bit)
> > >> Running under: Windows 7 x64 (build 7601) Service Pack 1
> > >> 
> > >> Matrix products: default
> > >> 
> > >> locale:
> > &g

Re: [R] importing data error question

2019-01-16 Thread Fox, John
Dear Jihee,

Probably the easiest way to change the language to English temporarily in R is 
to enter the command

Sys.setenv(LANGUAGE="en")

at the R command prompt prior to loading the Rcmdr package.
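
For example, the start of a fresh session would then look something like this 
(assuming the Rcmdr package is already installed):

Sys.setenv(LANGUAGE="en")  # set this before the Rcmdr is loaded
library(Rcmdr)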

I hope that this helps,
 John


> On Jan 16, 2019, at 7:02 PM, 우지희  wrote:
> 
> Thanks for your help!
>  
> I was having trouble with finding how to use english...
>  
> Even though I try to use english language, I couldn't change language of R 
> commander. (it is still korean)
>  
> Sorry but.. do you know how to change language of "R commander"? I have no 
> idea why it doesn't change.
>  
> Best,
> Jihee
>  
> From: "Fox, John" 
> Sent: Thursday, January 17, 2019 1:59:03 AM
> To:"우지희" 
> Cc:"r-help@r-project.org" 
> Subject:Re: [R] importing data error question
>  
>  
> Dear jihee,
> 
> I've looked into this problem further, using my Mac where it's easier to 
> temporarily change languages and character sets than on Windows, and I 
> discovered the following:
> 
> I was able to duplicate your problem with importing Excel files when working 
> in Korean. There's a similar problem with the import SAS b7dat files but not 
> with the other file-import dialogs.
> 
> I observed a similar problem when working in Chinese (LANG="zh") but not in 
> simplified Chinese (zh_CN) or Japanese (ja), so the problem isn't simply with 
> non-Latin character sets. There is no problem in English, Spanish (es), or 
> French (fr), and I didn't check the other languages into which the Rcmdr is 
> translated.
> 
> I think that the problem originates in the Korean and Chinese translation 
> files and I'll contact the translators to see whether they can fix it.
> 
> Thank you for reporting this issue.
> 
> John
> 
> > On Jan 14, 2019, at 11:36 PM, Fox, John  wrote:
> > 
> > Dear jihee,
> > 
> >> On Jan 14, 2019, at 9:00 PM, 우지희  wrote:
> >> 
> >> You said previously that you were using a Mac, so I'm surprised that you 
> >> now say that you're using Windows. I don't have a Windows 7 system, but I 
> >> can confirm that importing from Excel files works perfectly fine under 
> >> Windows 10, as I just verified, and I'd be surprised if the Windows 
> >> version matters. 
> >> 
> >> --> no, I never said i was using a Mac. 
> > 
> > Sorry, I guess I got that from the error message you originally reported, 
> > which was "Error in structure(.External(.C_dotTclObjv, objv), class = 
> > "tclObj") : [tcl] bad Macintosh file type "“*”"." I've never seen that 
> > error and it seems peculiar that it would occur on a Windows system.
> > 
> >> 
> >> You still haven't reported the versions of R, the Rcmdr package, and the 
> >> other packages that you're using. The easiest way to do this is to show 
> >> the output of the sessionInfo() command. 
> >> 
> >> --> sessionInfo()
> >> R version 3.5.2 (2018-12-20)
> >> Platform: x86_64-w64-mingw32/x64 (64-bit)
> >> Running under: Windows 7 x64 (build 7601) Service Pack 1
> >> 
> >> Matrix products: default
> >> 
> >> locale:
> >> [1] LC_COLLATE=Korean_Korea.949 LC_CTYPE=Korean_Korea.949  
> >> [3] LC_MONETARY=Korean_Korea.949 LC_NUMERIC=C  
> >> [5] LC_TIME=Korean_Korea.949  
> >> 
> >> attached base packages:
> >> [1] tcltk splines stats graphics grDevices utils datasets methods  
> >> [9] base  
> >> 
> >> other attached packages:
> >> [1] RcmdrPlugin.SensoMineR_1.11-01 RcmdrPlugin.FactoMineR_1.6-0  
> >> [3] Rcmdr_2.5-1 effects_4.1-0  
> >> [5] RcmdrMisc_2.5-1 sandwich_2.5-0  
> >> [7] car_3.0-2 carData_3.0-2  
> >> [9] SensoMineR_1.23 FactoMineR_1.41  
> >> 
> >> loaded via a namespace (and not attached):
> >> [1] gtools_3.8.1 Formula_1.2-3 latticeExtra_0.6-28 
> >> [4] cellranger_1.1.0 pillar_1.3.1 backports_1.1.3  
> >> [7] lattice_0.20-38 digest_0.6.18 RColorBrewer_1.1-2  
> >> [10] checkmate_1.8.5 minqa_1.2.4 colorspace_1.3-2  
> >> [13] survey_3.35 htmltools_0.3.6 Matrix_1.2-15  
> >> [16] plyr_1.8.4 pkgconfig_2.0.2 haven_2.0.0  
> >> [19] scales_1.0.0 openxlsx_4.1.0 rio_0.5.16  
> >> [22] lme4_1.1-19 htmlTable_1.13.1 tibble_1.4.2  
> >> [25] relimp_1.0-5 ggplot2_3.1.0 nnet_7.3-12  
> >> [28] lazyeval_0.2.1 survival_2.43-3 magrittr_1.5  
> >> [31] crayon_1.3.4 readxl_1.2.0 nlm

Re: [R] importing data error question

2019-01-16 Thread Fox, John
Dear jihee,

I've looked into this problem further, using my Mac where it's easier to 
temporarily change languages and character sets than on Windows, and I 
discovered the following:

I was able to duplicate your problem with importing Excel files when working in 
Korean. There's a similar problem with the import SAS b7dat files but not with 
the other file-import dialogs.

I observed a similar problem when working in Chinese (LANG="zh") but not in 
simplified Chinese (zh_CN) or Japanese (ja), so the problem isn't simply with 
non-Latin character sets. There is no problem in English, Spanish (es), or 
French (fr), and I didn't check the other languages into which the Rcmdr is 
translated.

I think that the problem originates in the Korean and Chinese translation files 
and I'll contact the translators to see whether they can fix it.

Thank you for reporting this issue.

John

> On Jan 14, 2019, at 11:36 PM, Fox, John  wrote:
> 
> Dear jihee,
> 
>> On Jan 14, 2019, at 9:00 PM, 우지희  wrote:
>> 
>> You said previously that you were using a Mac, so I'm surprised that you now 
>> say that you're using Windows. I don't have a Windows 7 system, but I can 
>> confirm that importing from Excel files works perfectly fine under Windows 
>> 10, as I just verified, and I'd be surprised if the Windows version matters. 
>> 
>> --> no, I never said i was using a Mac. 
> 
> Sorry, I guess I got that from the error message you originally reported, 
> which was "Error in structure(.External(.C_dotTclObjv, objv), class = 
> "tclObj") : [tcl] bad Macintosh file type "“*”"." I've never seen that error 
> and it seems peculiar that it would occur on a Windows system.
> 
>> 
>> You still haven't reported the versions of R, the Rcmdr package, and the 
>> other packages that you're using. The easiest way to do this is to show the 
>> output of the sessionInfo() command. 
>> 
>> --> sessionInfo()
>> R version 3.5.2 (2018-12-20)
>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>> Running under: Windows 7 x64 (build 7601) Service Pack 1
>> 
>> Matrix products: default
>> 
>> locale:
>> [1] LC_COLLATE=Korean_Korea.949  LC_CTYPE=Korean_Korea.949   
>> [3] LC_MONETARY=Korean_Korea.949 LC_NUMERIC=C
>> [5] LC_TIME=Korean_Korea.949
>> 
>> attached base packages:
>> [1] tcltk splines   stats graphics  grDevices utils datasets  
>> methods  
>> [9] base 
>> 
>> other attached packages:
>> [1] RcmdrPlugin.SensoMineR_1.11-01 RcmdrPlugin.FactoMineR_1.6-0  
>> [3] Rcmdr_2.5-1effects_4.1-0 
>> [5] RcmdrMisc_2.5-1sandwich_2.5-0
>> [7] car_3.0-2  carData_3.0-2 
>> [9] SensoMineR_1.23FactoMineR_1.41   
>> 
>> loaded via a namespace (and not attached):
>> [1] gtools_3.8.1 Formula_1.2-3latticeExtra_0.6-28 
>> [4] cellranger_1.1.0 pillar_1.3.1 backports_1.1.3 
>> [7] lattice_0.20-38  digest_0.6.18RColorBrewer_1.1-2  
>> [10] checkmate_1.8.5  minqa_1.2.4  colorspace_1.3-2
>> [13] survey_3.35  htmltools_0.3.6  Matrix_1.2-15   
>> [16] plyr_1.8.4   pkgconfig_2.0.2  haven_2.0.0 
>> [19] scales_1.0.0 openxlsx_4.1.0   rio_0.5.16  
>> [22] lme4_1.1-19  htmlTable_1.13.1 tibble_1.4.2
>> [25] relimp_1.0-5 ggplot2_3.1.0nnet_7.3-12 
>> [28] lazyeval_0.2.1   survival_2.43-3  magrittr_1.5
>> [31] crayon_1.3.4 readxl_1.2.0 nlme_3.1-137
>> [34] MASS_7.3-51.1forcats_0.3.0foreign_0.8-71  
>> [37] class_7.3-14 tools_3.5.2  data.table_1.11.8   
>> [40] hms_0.4.2tcltk2_1.2-11stringr_1.3.1   
>> [43] munsell_0.5.0cluster_2.0.7-1  zip_1.0.0   
>> [46] flashClust_1.01-2compiler_3.5.2   e1071_1.7-0 
>> [49] rlang_0.3.1  grid_3.5.2   nloptr_1.2.1
>> [52] rstudioapi_0.9.0 htmlwidgets_1.3  leaps_3.0   
>> [55] base64enc_0.1-3  gtable_0.2.0 abind_1.4-5 
>> [58] curl_3.2 reshape2_1.4.3   AlgDesign_1.1-7.3   
>> [61] gridExtra_2.3zoo_1.8-4knitr_1.21  
>> [64] nortest_1.0-4Hmisc_4.1-1  KernSmooth_2.23-15  
>> [67] stringi_1.2.4Rcpp_1.0.0   rpart_4.1-13
>> [70] acepack_1.4.1scatt

Re: [R] importing data error question

2019-01-14 Thread Fox, John
Dear jihee,

> On Jan 14, 2019, at 9:00 PM, 우지희  wrote:
> 
> You said previously that you were using a Mac, so I'm surprised that you now 
> say that you're using Windows. I don't have a Windows 7 system, but I can 
> confirm that importing from Excel files works perfectly fine under Windows 
> 10, as I just verified, and I'd be surprised if the Windows version matters. 
> 
> --> no, I never said i was using a Mac. 

Sorry, I guess I got that from the error message you originally reported, which 
was "Error in structure(.External(.C_dotTclObjv, objv), class = "tclObj") : 
[tcl] bad Macintosh file type "“*”"." I've never seen that error and it seems 
peculiar that it would occur on a Windows system.

> 
> You still haven't reported the versions of R, the Rcmdr package, and the 
> other packages that you're using. The easiest way to do this is to show the 
> output of the sessionInfo() command. 
> 
> --> sessionInfo()
> R version 3.5.2 (2018-12-20)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 7 x64 (build 7601) Service Pack 1
> 
> Matrix products: default
> 
> locale:
> [1] LC_COLLATE=Korean_Korea.949  LC_CTYPE=Korean_Korea.949   
> [3] LC_MONETARY=Korean_Korea.949 LC_NUMERIC=C
> [5] LC_TIME=Korean_Korea.949
> 
> attached base packages:
> [1] tcltk splines   stats graphics  grDevices utils datasets  
> methods  
> [9] base 
> 
> other attached packages:
>  [1] RcmdrPlugin.SensoMineR_1.11-01 RcmdrPlugin.FactoMineR_1.6-0  
>  [3] Rcmdr_2.5-1effects_4.1-0 
>  [5] RcmdrMisc_2.5-1sandwich_2.5-0
>  [7] car_3.0-2  carData_3.0-2 
>  [9] SensoMineR_1.23FactoMineR_1.41   
> 
> loaded via a namespace (and not attached):
>  [1] gtools_3.8.1 Formula_1.2-3latticeExtra_0.6-28 
>  [4] cellranger_1.1.0 pillar_1.3.1 backports_1.1.3 
>  [7] lattice_0.20-38  digest_0.6.18RColorBrewer_1.1-2  
> [10] checkmate_1.8.5  minqa_1.2.4  colorspace_1.3-2
> [13] survey_3.35  htmltools_0.3.6  Matrix_1.2-15   
> [16] plyr_1.8.4   pkgconfig_2.0.2  haven_2.0.0 
> [19] scales_1.0.0 openxlsx_4.1.0   rio_0.5.16  
> [22] lme4_1.1-19  htmlTable_1.13.1 tibble_1.4.2
> [25] relimp_1.0-5 ggplot2_3.1.0nnet_7.3-12 
> [28] lazyeval_0.2.1   survival_2.43-3  magrittr_1.5
> [31] crayon_1.3.4 readxl_1.2.0 nlme_3.1-137
> [34] MASS_7.3-51.1forcats_0.3.0foreign_0.8-71  
> [37] class_7.3-14 tools_3.5.2  data.table_1.11.8   
> [40] hms_0.4.2tcltk2_1.2-11stringr_1.3.1   
> [43] munsell_0.5.0cluster_2.0.7-1  zip_1.0.0   
> [46] flashClust_1.01-2compiler_3.5.2   e1071_1.7-0 
> [49] rlang_0.3.1  grid_3.5.2   nloptr_1.2.1
> [52] rstudioapi_0.9.0 htmlwidgets_1.3  leaps_3.0   
> [55] base64enc_0.1-3  gtable_0.2.0 abind_1.4-5 
> [58] curl_3.2 reshape2_1.4.3   AlgDesign_1.1-7.3   
> [61] gridExtra_2.3zoo_1.8-4knitr_1.21  
> [64] nortest_1.0-4Hmisc_4.1-1  KernSmooth_2.23-15  
> [67] stringi_1.2.4Rcpp_1.0.0   rpart_4.1-13
> [70] acepack_1.4.1scatterplot3d_0.3-41 xfun_0.4
> 
> This was the status that I tried to import Excel data. 

These packages seem up-to-date.

> 
> Also, have you tried importing an Excel file in the Rcmdr *without* the two 
> plug-in packages loaded, as I suggested in my original response?  
> 
> --> I tried without plug-in packages, but It didn't work. 

OK, so you tried the setup that works for me and, I assume from the lack of 
similar error reports, for others.

> 
> It occurs to me that the problem may be produced by using the Rcmdr under R 
> with a non-Latin character set, but if that were the case I would have expected the 
> problem to have surfaced earlier. Did you try reading another kind of file, 
> such as a plain-text data file? 
> 
> --> I don't know what is plain-text data file. 

A plain-text data file could, e.g., be created from an Excel file by exporting 
a worksheet as a .csv (comma-separated-values) file; you could read this into 
the Rcmdr via Data > Import data > from text file, specifying the field 
separator as commas.
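
For example, if the worksheet were exported to a (hypothetical) file named 
Prestige.csv, the rough command-line equivalent of that dialog would be

Prestige <- read.csv("Prestige.csv", header=TRUE)

although it's easier to let the Rcmdr dialog generate the command for you.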

> 
> i'll try R with English. 

I'm curious to see what happens.

Best,
 John

> 
> From:  "Fox, John"  
> 
> Sent: Mo

Re: [R] importing data error question

2019-01-14 Thread Fox, John
Dear jihee,

> On Jan 13, 2019, at 9:28 PM, 우지희  wrote:
> 
>  
>  
> From: "우지희" 
> Sent: Monday, January 14, 2019 9:40:26 AM
> To:"Fox, John" 
> Subject:Re: [R] importing data error question
>  
>  
> Thanks for your replies.
>  
> I'm using windows 7, I loaded FactoMineR,

You said previously that you were using a Mac, so I'm surprised that you now 
say that you're using Windows. I don't have a Windows 7 system, but I can 
confirm that importing from Excel files works perfectly fine under Windows 10, 
as I just verified, and I'd be surprised if the Windows version matters.

> SensoMineR and then Rcmdr. (Downloaded FacroMineR, SensoMineR, Rcmdr, 
> Rcmdrplugin.FactomineR, Rcmdrplugin.SensomineR and other required packages 
> that downloaded automatically)
> This problem occurred when I select Data > Import data > From Excel file.
> I checked FactoMineR and SensoMineR packages are loaded and using..

You still haven't reported the versions of R, the Rcmdr package, and the other 
packages that you're using. The easiest way to do this is to show the output of 
the sessionInfo() command.

Also, have you tried importing an Excel file in the Rcmdr *without* the two 
plug-in packages loaded, as I suggested in my original response? 

It occurs to me that the problem may be produced by using the Rcmdr under R 
with a non-Latin character set, but if that were the case I would have expected the 
problem to have surfaced earlier. Did you try reading another kind of file, 
such as a plain-text data file?

Best,
 John

>  
>  
>  
> From: "Fox, John" 
> Sent: Friday, January 11, 2019 10:48:38 PM
> To:"PIKAL Petr" 
> Cc:"우지희" ; "r-help@R-project.org" 
> Subject:Re: [R] importing data error question
>  
>  
> Dear Petr and jihee,
> 
> The Rcmdr can import Excel files, and as I just verified, it can do so on a 
> Mac listing files of all types (*) in the open-file dialog box (which is the 
> default). 
> 
> So, as Petr suggests, more information is required to help you, including the 
> versions of macOS, R, and all packages you have loaded. In particular, does 
> the problem occur when you try to read the Excel file *without* FactoMineR 
> and SensoMineR loaded? Also, when does the problem occur -- immediately when 
> you select Data > Import data > From Excel file, or at some other point?
> 
> Best,
> John
> 
> -
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> Web: http://socserv.mcmaster.ca/jfox
> 
> > On Jan 11, 2019, at 5:07 AM, PIKAL Petr  wrote:
> > 
> > Hi
> > 
> > I do not use Rcmdr but from documentation it seems to me that it does not 
> > have much to do with importing data from Excel.
> > 
> > So without some additional info from your side (at least used commands) you 
> > hardly get any reasonable answer.
> > 
> > Cheers
> > Petr
> > 
> >> -Original Message-
> >> From: R-help  On Behalf Of ???
> >> Sent: Friday, January 11, 2019 9:14 AM
> >> To: r-help@R-project.org
> >> Subject: [R] importing data error question
> >> 
> >> Hi I'm jihee and I have a question about error...
> >> 
> >> I'm using R 3.5.2 and tried to use Rcmdr package.
> >> 
> >> and using FactoMineR and SensoMineR to analyze sensory data through PCA
> >> 
> >> but i can't import excel data with Rcmdr.
> >> 
> >> it has this messege :
> >> 
> >> Error in structure(.External(.C_dotTclObjv, objv), class = "tclObj") :
> >> [tcl] bad Macintosh file type "“*”"
> >> 
> >> what is wrong with my R??? T_T
> >> 
> >> Thanks for your help.
> >> 
> >> jihee.
> >> [[alternative HTML version deleted]]
> >> 
> >> __
> >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide 
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >

Re: [R] importing data error question

2019-01-11 Thread Fox, John
Dear Petr and jihee,

The Rcmdr can import Excel files, and as I just verified, it can do so on a Mac 
listing files of all types (*) in the open-file dialog box (which is the 
default). 

So, as Petr suggests, more information is required to help you, including the 
versions of macOS, R, and all packages you have loaded. In particular, does the 
problem occur when you try to read the Excel file *without* FactoMineR and 
SensoMineR loaded? Also, when does the problem occur -- immediately when you 
select Data > Import data > From Excel file, or at some other point?

Best,
 John

  -
  John Fox, Professor Emeritus
  McMaster University
  Hamilton, Ontario, Canada
  Web: http://socserv.mcmaster.ca/jfox

> On Jan 11, 2019, at 5:07 AM, PIKAL Petr  wrote:
> 
> Hi
> 
> I do not use Rcmdr but from documentation it seems to me that it does not 
> have much to do with importing data from Excel.
> 
> So without some additional info from your side (at least used commands) you 
> hardly get any reasonable answer.
> 
> Cheers
> Petr
> 
>> -Original Message-
>> From: R-help  On Behalf Of ???
>> Sent: Friday, January 11, 2019 9:14 AM
>> To: r-help@R-project.org
>> Subject: [R] importing data error question
>> 
>> Hi I'm jihee and I have a question about error...
>> 
>> I'm using R 3.5.2 and tried to use Rcmdr package.
>> 
>> and using FactoMineR and SensoMineR to analyze sensory data through PCA
>> 
>> but i can't import excel data with Rcmdr.
>> 
>> it has this messege :
>> 
>> Error in structure(.External(.C_dotTclObjv, objv), class = "tclObj") :
>>   [tcl] bad Macintosh file type "“*”"
>> 
>> what is wrong with my R??? T_T
>> 
>> Thanks for your help.
>> 
>> jihee.
>> [[alternative HTML version deleted]]
>> 
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Strange degrees of freedom and SS from car::Anova with type II SS?

2018-12-05 Thread Fox, John
Dear R.,

The problem you constructed is too ill-conditioned for the method that Anova() 
uses to compute type-II sums of squares and the associated degrees of freedom, 
with an immense condition number of the coefficient covariance matrix:

> library(car)
Loading required package: carData

> mod <- lm(prestige ~ women * type * income * education, data=Prestige)
> e <- eigen(vcov(mod))$values
> max(e)/min(e)
[1] 2.776205e+17

Simply centering the numerical predictors reduces the condition number by a 
factor of 10^3, which allows Anova() to work, even though the problem is still 
extremely ill-conditioned:

> Prestige.c <- within(Prestige, {
+   income <- income - mean(income)
+   education <- education - mean(education)
+   women <- women - mean(women)
+ })
> mod.c <- lm(prestige ~ women * type * income * education, data=Prestige.c)
> e.c <- eigen(vcov(mod.c))$values
> max(e)/min(e)
[1] 2.776205e+17

> Anova(mod.c)
Anova Table (Type II tests)

Response: prestige
 Sum Sq Df F valuePr(>F)
women167.29  1  4.9516 0.0291142 *  
type 744.30  2 11.0150 6.494e-05 ***
income   789.00  1 23.3529 7.112e-06 ***
education699.54  1 20.7050 2.057e-05 ***
women:type   140.32  2  2.0766 0.1326023
women:income  33.14  1  0.9807 0.3252424
type:income  653.40  2  9.6697 0.0001859 ***
women:education   30.36  1  0.8986 0.3462316
type:education 0.72  2  0.0107 0.9893462
income:education   7.88  1  0.2331 0.6306681
women:type:income136.80  2  2.0245 0.1393087
women:type:education 140.18  2  2.0745 0.1328633
women:income:education   100.42  1  2.9722 0.032 .  
type:income:education 82.02  2  1.2138 0.3029069
women:type:income:education2.05  2  0.0303 0.9701334
Residuals   2500.16 74  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> mod.c.2 <- update(mod.c, . ~ . - women:type:income:education)
> sum(residuals(mod.c.2)^2) - sum(residuals(mod.c)^2)
[1] 2.049735

Beyond demonstrating that the algorithm that Anova() uses can be made to fail 
if the coefficient covariance matrix is sufficiently ill-conditioned, 
I’m not sure what the point of this is. I suppose that we could try to detect 
this condition, which falls in the small region between where lm() detects a 
singularity and the projections used by Anova() break down.

Best,
 John

-
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox

> On Dec 5, 2018, at 7:33 PM, Ramon Diaz-Uriarte  wrote:
> 
> 
> Dear All,
> 
> I do not understand the degrees of freedom returned by car::Anova under
> some models. They seem to be too many (e.g., numerical variables getting
> more than 1 df, factors getting more df than levels there are).
> 
> This is a reproducible example:
> 
> library(car)
> data(Prestige)
> 
> ## Make sure no issues from NAs in comparisons of SS below
> prestige_nona <- na.omit(Prestige)
> 
> Anova(lm(prestige ~ women * type * income * education,
> data = prestige_nona))
> 
> ## Notice how women, a numerical variable, has 3 df
> ## and type (factor with 3 levels) has 4 df.
> 
> 
> ## In contrast this seems to get the df right:
> Anova(lm(prestige ~ women * type * income * education,
> data = prestige_nona), type = "III")
> 
> ## And also gives the df I'd expect
> anova(lm(prestige ~ women * type * income * education,
> data = prestige_nona))
> 
> 
> 
> ## Type II SS for women in the above model I do not understand either.
> m_1 <- lm(prestige ~ type * income * education, data = prestige_nona)
> m_2 <- lm(prestige ~ type * income * education + women, data = prestige_nona)
> ## Does not match women SS
> sum(residuals(m_1)^2) - sum(residuals(m_2)^2)
> 
> ## See [1] below for examples where they match.
> 
> 
> Looking at the code, I do not understand what the call from
> linearHypothesis returns here (specially compared to other models), and the
> problem seems to be in the return from ConjComp, possibly due to the the
> vcov of the model? (But this is over my head).
> 
> 
> I understand this is not a reasonable model to fit, and there are possibly
> serious collinearity problems. But I was surprised by the dfs in the
> absence of any warning of something gone wrong. So I think there is
> something very basic I do not understand.
> 
> 
> 
> Thanks,
> 
> 
> R.
> 
> 
> [1] In contrast, in other models I see what I'd expect. For example:
> 
> ## 1 df for women, 2 for type
> Anova(lm(prestige ~ type * income * women, data = prestige_nona))
> m_1 <- lm(prestige ~ type * income, data = prestige_nona)
> m_2 <- lm(prestige ~ type * income + women, data = prestige_nona)
> ## Type II SS for women
> sum(residua

Re: [R] Question Mixed-Design Anova in R

2018-11-23 Thread Fox, John
Dear Lisa,

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of peter
> dalgaard
> Sent: Friday, November 23, 2018 10:16 AM
> To: Lisa van der Burgh <40760...@student.eur.nl>
> Cc: r-help@R-project.org
> Subject: Re: [R] Question Mixed-Design Anova in R
> 
> You seem to be bringing in a ton of stuff without looking at features in base
> R...
> 
> Check
> 
> help(mauchly.test)
> help(anova.mlm)
> 
> and examples therein. There are also options in the "car" package.

With respect to the latter, see in particular the O'Brien-Kaiser example in 
?Anova.
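
In outline, the setup there looks roughly like this (a sketch from memory -- 
see ?Anova for the complete, worked example):

library(car)  # the OBrienKaiser data are in the carData package
phase <- factor(rep(c("pretest", "posttest", "followup"), each=5),
                levels=c("pretest", "posttest", "followup"))
hour <- ordered(rep(1:5, 3))
idata <- data.frame(phase, hour)
mod.ok <- lm(cbind(pre.1, pre.2, pre.3, pre.4, pre.5,
                   post.1, post.2, post.3, post.4, post.5,
                   fup.1, fup.2, fup.3, fup.4, fup.5) ~ treatment*gender,
             data=OBrienKaiser)
av.ok <- Anova(mod.ok, idata=idata, idesign=~phase*hour, type="III")
summary(av.ok, multivariate=FALSE)  # Mauchly tests, GG and HF corrections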

I hope this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/


> 
> -pd
> 
> > On 23 Nov 2018, at 11:43 , Lisa van der Burgh <40760...@student.eur.nl>
> wrote:
> >
> > Hi Everyone,
> >
> >
> >
> > I have a question about Mixed-Design Anova in R. I want to obtain Mauchly s
> test of Sphericity and the Greenhouse-Geisser correction. I have managed to
> do it in SPSS:
> >
> >
> >
> > GLM Measure1 Measure2 Measure3 Measure4 Measure5 Measure6 BY
> Grouping
> >
> >  /WSFACTOR=Measure 6 Polynomial
> >
> >  /METHOD=SSTYPE(3)
> >
> >  /PLOT=PROFILE(Measure*Grouping)
> >
> >  /CRITERIA=ALPHA(.05)
> >
> >  /WSDESIGN=Measure
> >
> >  /DESIGN=Grouping.
> >
> >
> >
> > I have tried to replicate this in R:
> >
> > library("dplyr")
> >
> > library("tidyr")
> >
> > library("ggplot2")
> >
> > library("ez")
> >
> >
> >
> > PatientID <- c(1:10)
> >
> > Measure1 <- c(3,5,7,4,NA,7,4,4,7,2)
> >
> > Measure2 <- c(1,2,5,6,8,9,5,NA,6,7)
> >
> > Measure3 <- c(3,3,5,7,NA,4,5,7,8,1)
> >
> > Measure4 <- c(1,2,5,NA,3,NA,6,7,3,6)
> >
> > Measure5 <- c(2,3,NA,8,3,5,6,3,6,4)
> >
> > Measure6 <- c(1,2,4,6,8,3,5,6,NA,4)
> >
> > Grouping <- c(1,0,1,1,1,0,0,1,1,0)
> >
> > dataframe <- data.frame(PatientID, Measure1, Measure2, Measure3,
> > Measure4, Measure5, Measure6, Grouping)
> >
> > dataframe$Grouping <- as.factor(dataframe$Grouping)
> >
> > dataframe
> >
> >
> >
> > ezPrecis(dataframe)
> >
> > glimpse(dataframe)
> >
> >
> >
> > dataframe %>% count(PatientID)
> >
> >
> >
> > dataframe %>% count(PatientID, Grouping, Measure1, Measure2,
> Measure3,
> > Measure4, Measure5, Measure6) %>%
> >
> >  filter(PatientID %in% c(1:243)) %>%
> >
> >  print(n = 10)
> >
> >
> >
> > # So, we have a mixed design with one between factor (Grouping) and 6
> within factors (Measure 1 to 6).
> >
> >
> >
> > dat_means <- dataframe %>%
> >
> >  group_by(Grouping, Measure1, Measure2, Measure3, Measure4,
> Measure5,
> > Measure6) %>%
> >
> >  summarise(mRT = mean(c(Measure1, Measure2, Measure3, Measure4,
> > Measure5, Measure6))) %>% ungroup()
> >
> > View(dat_means)
> >
> >
> >
> > ggplot(dat_means, aes(c(Measure1, Measure2, Measure3, Measure4,
> > Measure5, Measure6), mRT, colour = Grouping)) +
> >
> >  geom_line(aes(group = Grouping)) +
> >
> >  geom_point(aes(shape = Grouping), size = 3) +
> >
> >  facet_wrap(~group)
> >
> >
> >
> > ANOVA <- ezANOVA(dat, x, PatientID, within = .( c(Measure1, Measure2,
> > Measure3, Measure4, Measure5, Measure6)),
> >
> >between = Grouping, type = 3)
> >
> >
> >
> > print(ANOVA)
> >
> >
> >
> >
> >
> > However, this does not work. I know I am probably doing it completely
> wrong, but I do not know how to solve it. Besides that, I do not know what to
> fill in at the  x .
> >
> > Can somebody help me?
> >
> >
> >
> > Thank you in advance.
> >
> > Lisa
> >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
> 
> --
> Peter Dalgaard, Professor,
> Center for Statistics, Copenhagen Business School Solbjerg Plads 3, 2000
> Frederiksberg, Denmark
> Phone: (+45)38153501
> Office: A 4.23
> Email: pd@cbs.dk  Priv: pda...@gmail.com
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] unique() duplicate() not what i am looking for

2018-11-19 Thread Fox, John
Dear Knut,

Here's one way:

> as.vector((table(Dup) > 1)[Dup])
[1]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE

Someone will probably think of something cleverer.
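
One such alternative, working directly from duplicated():

Dup <- c(1,2,3,4,1,2,3,5)
Dup %in% Dup[duplicated(Dup)]
# [1]  TRUE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE FALSE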

I hope this helps,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Knut
> Krueger
> Sent: Monday, November 19, 2018 9:42 AM
> To: r-help@r-project.org >> r-help mailing list 
> Subject: [R] unique() duplicate() not what i am looking for
> 
> It should be simple but i do not find the right keywords:
> 
> 
> Dup =  c(1,2,3,4,1,2,3,5)
> 
> I need 4,5 as result
> 
> unique(Dup) gives me [1] 4 1 2 3 5
> 
> duplicated(Dup) gives me
> [1] FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE FALSE
> 
> I need
> [1] TRUE TRUE TRUE FALSE  TRUE  TRUE  TRUE FALSE
> 
> 
> Kind regards Knut
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The Suggests field in a DESCRIPTION file.

2018-11-17 Thread Fox, John
Dear Rolf,

"fortunes" needs to be quoted in requireNamespace("fortunes", quietly=TRUE).

I hope this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rolf Turner
> Sent: Saturday, November 17, 2018 5:17 PM
> To: r-help@r-project.org
> Subject: [R] The Suggests field in a DESCRIPTION file.
> 
> 
> I am building a package which contains a function from which I wish to call 
> the
> fortune() function from the fortunes package --- if that package is available.
> 
> I have place the line
> 
> Suggests: fortunes
> 
> in the DESCRIPTION file.
> 
> In my code for the function that I am writing (let's call it "foo") I put
> 
> > fortOK <- requireNamespace(fortunes,quietly=TRUE)
> > if(fortOK) {
> > fortunes::fortune()
> > }
> 
> thinking that I was following all of the prescriptions in "Writing R 
> Extensions".
> Yet when I do R CMD check on the package I get
> 
> > * checking R code for possible problems ... NOTE
> > foo: no visible binding for global variable ‘fortunes’
> > Undefined global functions or variables:
> >   fortunes
> 
> What am I doing wrong?
> 
> Thanks for any insight.
> 
> cheers,
> 
> Rolf Turner
> 
> P. S.  I tried putting an "imports(fortunes)" in the NAMESPACE file, but this 
> just
> made matters worse:
> 
> > * checking package dependencies ... ERROR Namespace dependency not
> > required: ‘fortunes’
> 
> Huh?  What on earth is this actually saying?  I cannot parse this error
> message.
> 
> R. T.
> 
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Line with linearly changing thickness – installation issues

2018-11-11 Thread Fox, John
And here's a simpler, loopless version:

tlines <- function(x, y, thickness, col="black", unit=0.005){
# line of varying thickness
#   x: vector of x coordinates
#   y: vector of y coordinates
#   thickness: units of thickness at each set of coordinates
#   col: line colour
#   unit: unit of thickness as fraction of vertical axis
    if (length(x) != length(y)) stop("x and y are of different lengths")
    if (length(x) != length(thickness)) 
        stop("length of thickness is different from x and y")
    if (length(x) < 2) stop("x and y are too short")
usr <- par("usr")
units <- (usr[4] - usr[3])*unit/2
x <- c(x, rev(x))
y <- c(y + thickness*units, rev(y) - rev(thickness)*units)
polygon(x=x, y=y, col=col, border=col)
}
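
For example, the test case from my earlier message works here as well:

plot(c(-1, 1), c(0, 100), type="n")
tlines(seq(-1, 1, by=0.1), y=(seq(1, 9.5, length=21))^2,
       thickness=seq(1, 20, length=21), col="blue")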

Best,
 John

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Fox, John
> Sent: Sunday, November 11, 2018 1:30 PM
> To: David Winsemius ; Ferri Leberl
> 
> Cc: r-help@r-project.org
> Subject: Re: [R] Line with linearly changing thickness – installation issues
> 
> Dear David and Ferri,
> 
> Here's a simple implementation using polygon() (as David suggested). It's
> much less sophisticated than Paul Murrell's -- in particular, the ends of the 
> line
> are simply vertical (but, with a bit more work, that too could be addressed) 
> --
> and uses standard R graphics rather than grid.
> 
> tline <- function(x, y, thickness, col="black", unit=0.005){
> # line of varying thickness
> #   x: vector of x coordinates
> #   y: vector of y coordinates
> #   thickness: units of thickness at each set of coordinates
> #   col: line colour
> #   unit: unit of thickness as fraction of vertical axis
> tl <- function(x1, x2, y1, y2, start, end){
> polygon(x=c(x1, x1, x2, x2, x1),
> y=c(y1 - start*units, y1 + start*units,
> y2 + end*units, y2 - end*units,
> y1 - start*units),
> col=col, border=col)
> }
> if (length(x) != length(y)) "x and y are of different lengths"
> if (length(x) != length(thickness))
> "length of thickness is different from x and y"
> if (length(x) < 2) "x and y are too short"
> usr <- par("usr")
> units <- (usr[4] - usr[3])*unit/2
> for (i in 2:length(x)){
> tl(x[i - 1], x[i], y[i - 1], y[i], thickness[i - 1], thickness[i])
> }
> }
> 
> # example:
> 
> plot(c(-1, 1), c(0, 100), type="n")
> tline(seq(-1, 1, by=0.1), y=(seq(1, 9.5, length=21))^2,
>   thickness=seq(1, 20, length=21), col="blue")
> 
> I hope this helps,
>  John
> 
> --
> John Fox, Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> Web: socialsciences.mcmaster.ca/jfox/
> 
> 
> 
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of David
> > Winsemius
> > Sent: Sunday, November 11, 2018 12:23 PM
> > To: r-help@r-project.org
> > Subject: Re: [R] Line with linearly changing thickness – installation
> > issues
> >
> > I would have imagined that drawing a polygon would be the way most
> > people would have attempted.
> >
> > Regarding Murrell's package:
> >
> > I thought the package name was "vwline". My attempt to install was
> > unsuccessful>
> >
> >  > devtools::install_github("pmur002/vwline")
> > Error in utils::download.file(url, path, method = download_method(),
> > quiet = quiet,  :
> >    cannot open URL
> > 'https://api.github.com/repos/pmur002/vwline/contents/DESCRIPTION?ref=
> > ma
> > ster'
> >  > install.packages("~/vwline-0.2-1.tar.gz", repo=NULL) Installing
> > package into ‘/home/david/R/x86_64-pc-linux-gnu-library/3.5.1’
> > (as ‘lib’ is unspecified)
> > Warning in untar2(tarfile, files, list, exdir, restore_times) :
> >    skipping pax global extended headers
> > ERROR: cannot extract package from ‘/home/david/vwline-0.2-1.tar.gz’
> > Warning in install.packages :
> >    installation of package ‘/home/david/vwline-0.2-1.tar.gz’ had non-
> > zero exit status
> >
> >
> > Furthermore, I do get the same error from attempting to install
> > pkg:twine from github.
> >
> > -- David
> >
> > Doing this from an Rstudio console running R 3.5.1 in Ubuntu 18.04
> >
> > On 11/11/18 8:30 AM, Ferri Leberl wrote:
> > > Dear All,
> > > T

Re: [R] Line with linearly changing thickness – installation issues

2018-11-11 Thread Fox, John
Dear David and Ferri,

Here's a simple implementation using polygon() (as David suggested). It's much 
less sophisticated than Paul Murrell's -- in particular, the ends of the line 
are simply vertical (but, with a bit more work, that too could be addressed) -- 
and uses standard R graphics rather than grid.

tline <- function(x, y, thickness, col="black", unit=0.005){
# line of varying thickness
#   x: vector of x coordinates
#   y: vector of y coordinates
#   thickness: units of thickness at each set of coordinates
#   col: line colour
#   unit: unit of thickness as fraction of vertical axis
tl <- function(x1, x2, y1, y2, start, end){
polygon(x=c(x1, x1, x2, x2, x1), 
y=c(y1 - start*units, y1 + start*units, 
y2 + end*units, y2 - end*units, 
y1 - start*units),
col=col, border=col)
}
    if (length(x) != length(y)) stop("x and y are of different lengths")
    if (length(x) != length(thickness)) 
        stop("length of thickness is different from x and y")
    if (length(x) < 2) stop("x and y are too short")
usr <- par("usr")
units <- (usr[4] - usr[3])*unit/2
for (i in 2:length(x)){
tl(x[i - 1], x[i], y[i - 1], y[i], thickness[i - 1], thickness[i])
}
}

# example:

plot(c(-1, 1), c(0, 100), type="n")
tline(seq(-1, 1, by=0.1), y=(seq(1, 9.5, length=21))^2, 
  thickness=seq(1, 20, length=21), col="blue")

I hope this helps,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of David
> Winsemius
> Sent: Sunday, November 11, 2018 12:23 PM
> To: r-help@r-project.org
> Subject: Re: [R] Line with linearly changing thickness – installation
> issues
> 
> I would have imagined that drawing a polygon would be the way most
> people would have attempted.
> 
> Regarding Murrell's package:
> 
> I thought the package name was "vwline". My attempt to install was
> unsuccessful>
> 
>  > devtools::install_github("pmur002/vwline")
> Error in utils::download.file(url, path, method = download_method(),
> quiet = quiet,  :
>    cannot open URL
> 'https://api.github.com/repos/pmur002/vwline/contents/DESCRIPTION?ref=ma
> ster'
>  > install.packages("~/vwline-0.2-1.tar.gz", repo=NULL) Installing
> package into ‘/home/david/R/x86_64-pc-linux-gnu-library/3.5.1’
> (as ‘lib’ is unspecified)
> Warning in untar2(tarfile, files, list, exdir, restore_times) :
>    skipping pax global extended headers
> ERROR: cannot extract package from ‘/home/david/vwline-0.2-1.tar.gz’
> Warning in install.packages :
>    installation of package ‘/home/david/vwline-0.2-1.tar.gz’ had non-
> zero exit status
> 
> 
> Furthermore, I do get the same error from attempting to install
> pkg:twine from github.
> 
> -- David
> 
> Doing this from an Rstudio console running R 3.5.1 in Ubuntu 18.04
> 
> On 11/11/18 8:30 AM, Ferri Leberl wrote:
> > Dear All,
> > Thanks to Peter for his hint to the lwline package.
> > As a pitty, I have difficulties to get it installed, as it requires
> https://github.com/Gibbsdavidl/twine which failes for me.
> >
> > install_github("g...@github.com:Gibbsdavidl/twine.git")
> >
> > ends with
> >
> > ** building package indices
> > Error in read.table(zfile, header = TRUE, as.is = FALSE) :
> >more columns than column names
> > ERROR: installing package indices failed
> > * removing ‘/usr/local/lib/R/site-library/twine’
> > Fehler in i.p(...) :
> >(konvertiert von Warnung) installation of package
> > ‘/tmp/RtmpD3exKe/file730c303b4c3/twine_0.1.tar.gz’ had non-zero exit
> > status
> >
> > I found hints like
> > https://community.rstudio.com/t/lazydata-failed-for-for-package/4196
> > and https://stat.ethz.ch/pipermail/r-help/2011-March/272829.html
> > that boil down to problems within the data subdir of the project – but
> I cannot (and should not) edit the project, can I?
> >
> > Can anybody help me solving the problem?
> > Thank you in advance!
> > Yours, Ferri
> >
> >
> >
> > Gesendet: Sonntag, 11. November 2018 um 15:38 Uhr
> > Von: "Peter Dalgaard" 
> > An: "Ferri Leberl" 
> > Cc: r-help@r-project.org
> > Betreff: Re: [R] Line with linearly changing thickness Hmm... I don't
> > recall whether this has been packaged up, but Paul Murrell talked
> about it at useR in Brisbane.
> >
> > https://www.youtube.com/watch?v=L6FawdEA3W0
> >
> > -pd
> >
> >> On 11 Nov 2018, at 11:44 , Ferri Leberl  wrote:
> >>
> >>
> >> Dear All,
> >> I want to depict flows: At point x there is an input of a units. at
> point y, b units arrive.
> >> Obviously, the line thicknes can be manipulated with (a constant)
> cex. But I want the thickness to change linearly from ~a in x to ~b in
> y.
> >> Is there an out of the box solution for this?
> >> Thank you in advance!
> >> Yours, Ferri
> >>
> >> __

Re: [R] Sum of Squares Type I, II, III for ANOVA

2018-11-06 Thread Fox, John
Dear Thanh Tran,

When you start a discussion on r-help, it's polite to keep it there so other 
people can see what transpires. I'm consequently cc'ing this response to the 
r-help list.

The problem with your code is that anova(), as opposed to Anova(), has no type 
argument.

Here's what I get with your data. I hope that the code and output don't get too 
mangled:

> data <- read.csv("Saha research.csv", header=TRUE)

> data <- within(data, {
+ tem <- as.factor(temperature)
+ ac <- as.factor (AC)
+ av <- as.factor(AV)
+ thick <- as.factor(Thickness)
+ })

> library(car)
Loading required package: carData

> options(contrasts = c("contr.sum", "contr.poly"))

> mod <- lm(KIC ~ tem*ac + tem*av + tem*thick + ac*av +ac*thick + av*thick, 
+   data=data)

> anova(mod) # type I (sequential)
Analysis of Variance Table

Response: KIC
   Df  Sum Sq Mean Sq  F valuePr(>F)
tem 2 15.3917  7.6958 427.9926 < 2.2e-16 ***
ac  2  0.1709  0.0854   4.7510 0.0096967 ** 
av  1  1.9097  1.9097 106.2055 < 2.2e-16 ***
thick   2  0.2041  0.1021   5.6756 0.0040359 ** 
tem:ac  4  0.5653  0.1413   7.8598 6.973e-06 ***
tem:av  2  1.7192  0.8596  47.8046 < 2.2e-16 ***
tem:thick   4  0.0728  0.0182   1.0120 0.4024210
ac:av   2  0.3175  0.1588   8.8297 0.0002154 ***
ac:thick4  0.0883  0.0221   1.2280 0.3003570
av:thick2  0.0662  0.0331   1.8421 0.1613058
Residuals 190  3.4164  0.0180   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> Anova(mod) # type II
Anova Table (Type II tests)

Response: KIC
   Sum Sq  Df  F valuePr(>F)
tem   15.3917   2 427.9926 < 2.2e-16 ***
ac 0.1709   2   4.7510 0.0096967 ** 
av 1.9097   1 106.2055 < 2.2e-16 ***
thick  0.2041   2   5.6756 0.0040359 ** 
tem:ac 0.5653   4   7.8598 6.973e-06 ***
tem:av 1.7192   2  47.8046 < 2.2e-16 ***
tem:thick  0.0728   4   1.0120 0.4024210
ac:av  0.3175   2   8.8297 0.0002154 ***
ac:thick   0.0883   4   1.2280 0.3003570
av:thick   0.0662   2   1.8421 0.1613058
Residuals  3.4164 190   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> Anova(mod, type=3) # type III
Anova Table (Type III tests)

Response: KIC
 Sum Sq  Df   F valuePr(>F)
(Intercept) 102.430   1 5696.4740 < 2.2e-16 ***
tem  15.392   2  427.9926 < 2.2e-16 ***
ac0.171   24.7510 0.0096967 ** 
av1.910   1  106.2055 < 2.2e-16 ***
thick 0.204   25.6756 0.0040359 ** 
tem:ac0.565   47.8598 6.973e-06 ***
tem:av1.719   2   47.8046 < 2.2e-16 ***
tem:thick 0.073   41.0120 0.4024210
ac:av 0.318   28.8297 0.0002154 ***
ac:thick  0.088   41.2280 0.3003570
av:thick  0.066   21.8421 0.1613058
Residuals 3.416 190
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

If you have questions about Minitab there's probably another place to ask. It's 
not my opinion that type-III tests are generally preferable to type-II tests. 
Focus, in my opinion, should be on what hypotheses are being tested. If you 
want to see more detail, you could consult the book with which the car package 
is associated: see citation(package="car").

Best,
 John

> -Original Message-
> From: Thanh Tran [mailto:masternha...@gmail.com]
> Sent: Tuesday, November 6, 2018 9:15 PM
> To: Fox, John 
> Subject: Re: [R] Sum of Squares Type I, II, III for ANOVA
> 
> Dear  Prof. John Fox,
> Thank you for your answer. The CSV data was added as the attached file again.
> I try to set the contrasts properly *before* I fit the model but I received a
> problem as follows.
> 
> >  setwd("C:/NHAT/HOC TAP/R/Test/Anova") data = read.csv("Saha
> > research.csv", header =T)
> > attach(data)
> > tem = as.factor(temperature)
> > ac= as.factor (AC)
> >  av = as.factor(AV)
> >  thick = as.factor(Thickness)
> > library(car)
> Loading required package: carData
> > options(contrasts = c("contr.sum", "contr.poly")) mod <- lm(KIC ~
> > tem*ac + tem*av + tem*thick + ac*av +ac*thick + av*thick)
> > anova(mod,type= 3)
> Error: $ operator is invalid for atomic vectors
> 
> 
> Another problem is that in the paper that I read, the authors used MINITAB to
> analyze the ANOVA. The authors use "adjusted sums of squares" to calculate the
> p-value. So which should I use? Type III adjusted SS or Type I sequential SS?
> Minitab help tells me that I would "usually" want to use type III adjusted 
> SS, as
> type I sequential "sums of squares

Re: [R] Sum of Squares Type I, II, III for ANOVA

2018-11-06 Thread Fox, John
Dear Nhat Tran,

One more thing: You could specify the model even more compactly as

  mod <- lm(KIC ~ (tem + ac + av + thick)^2)
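
A quick check (my illustration, not in the original message) that the two
specifications expand to the same ten terms -- four main effects and six
two-way interactions:

f1 <- KIC ~ tem*ac + tem*av + tem*thick + ac*av + ac*thick + av*thick
f2 <- KIC ~ (tem + ac + av + thick)^2
setequal(attr(terms(f1), "term.labels"),
         attr(terms(f2), "term.labels"))   # TRUE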

Best,
 John

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Fox, John
> Sent: Tuesday, November 6, 2018 8:41 PM
> To: Thanh Tran 
> Cc: r-help@r-project.org
> Subject: Re: [R] Sum of Squares Type I, II, III for ANOVA
> 
> Dear Nhat Tran,
> 
> The output that you show is unreadable and as far as I can see, the data 
> aren't
> attached, but perhaps the following will help: First, if you want Anova() to
> compute type III tests, then you have to set the contrasts properly *before*
> you fit the model, not after. Second, you can specify the model much more
> compactly as
> 
>   mod <- lm(KIC ~ tem*ac + tem*av + tem*thick + ac*av +ac*thick + av*thick)
> 
> Finally, as sound general practice, I'd not attach the data, but rather put 
> your
> recoded variables in the data frame and then specify the data argument to
> lm().
> 
> I hope that this helps,
>  John
> 
> -
> John Fox
> Professor Emeritus
> McMaster University
> Hamilton, Ontario, Canada
> Web: https://socialsciences.mcmaster.ca/jfox/
> 
> 
> 
> > -Original Message-
> > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Thanh
> > Tran
> > Sent: Tuesday, November 6, 2018 6:58 PM
> > To: r-help@r-project.org
> > Subject: [R] Sum of Squares Type I, II, III for ANOVA
> >
> > Hi everyone,
> > I'm studying the ANOVA in R and have some questions to share. I
> > investigate the effects of 4 factors (temperature-3 levels, asphalt
> > content-3 levels, air
> > voids-2 levels, and sample thickness-3 levels) on the hardness of
> > asphalt concrete in the tensile test (abbreviated as KIC). These data
> > were taken from an article. The code was written as follows:
> >
> > > data = read.csv("Saha research.csv", header =T)
> > > attach(data)
> > > tem = as.factor(temperature)
> > > ac= as.factor (AC)
> > > av = as.factor(AV)
> > > thick = as.factor(Thickness)
> > > model =
> > lm(KIC~tem+ac+av+thick+tem:ac+tem:av+tem:thick+ac:av+ac:thick+av:thick
> > )
> > > anova(model) #Type I tests
> > > library(car) Loading required package: carData >
> >
> anova(lm(KIC~tem+ac+av+thick+tem:ac+tem:av+tem:thick+ac:av+ac:thick+av
> > :thick),type=2)
> > Error: $ operator is invalid for atomic vectors
> > > options(contrasts = c("contr.sum", "contr.poly"))
> > > Anova(model,type="3") # Type III tests
> > > Anova(model,type="2") # Type II tests
> >
> > With R, three results from Type I, II, and III almost have the same as 
> > follows.
> >
> > Analysis of Variance Table Response: KIC Df Sum Sq Mean Sq F value
> > Pr(>F) tem 2 15.3917 7.6958 427.9926 < 2.2e-16 *** ac 2 0.1709 0.0854
> > 4.7510
> > 0.0096967 ** av 1 1.9097 1.9097 106.2055 < 2.2e-16 *** thick 2 0.2041
> > 0.1021 5.6756 0.0040359 ** tem:ac 4 0.5653 0.1413 7.8598 6.973e-06 ***
> > tem:av 2 1.7192 0.8596 47.8046 < 2.2e-16 *** tem:thick 4 0.0728 0.0182
> > 1.0120 0.4024210 ac:av 2 0.3175 0.1588 8.8297 0.0002154 *** ac:thick 4
> > 0.0883 0.0221 1.2280 0.3003570 av:thick 2 0.0662 0.0331 1.8421
> > 0.1613058 Residuals 190 3.4164 0.0180 --- Signif. codes: 0 ‘***’ 0.001 ‘**’
> 0.01 ‘*’
> > 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> > However, these results are different from the results in the article,
> > especially for the interaction (air voids and sample thickness). The
> > results presented in the article are as follows:
> > Analysis of variance for KIC, using Adjusted SS for tests. Source DF
> > Seq SS Adj MS F-stat P-value Model findings Temperature 2 15.39355
> > 7.69677 426.68
> > <0.01 Significant AC 2 0.95784 0.47892 26.55 <0.01 Significant AV 1
> > 0.57035
> > 0.57035 31.62 <0.01 Significant Thickness 2 0.20269 0.10135 5.62 <0.01
> > Significant Temperature⁄AC 4 1.37762 0.34441 19.09 <0.01 Significant
> > Temperature⁄AV 2 0.8329 0.41645 23.09 <0.01 Significant
> > Temperature⁄thickness 4 0.07135 0.01784 0.99 0.415 Not significant
> > AC⁄AV 2
> > 0.86557 0.43279 23.99 <0.01 Significant AC⁄thickness 4 0.04337 0.01084
> > 0.6
> > 0.662 Not significant AV⁄thickness 2 0.17394 0.08697 4.82 <0.01
> > Significant Error 190 3.42734 0.01804 Total 215 23.91653
> >
> > Therefore, I wonder that whether there is an error in my code o

Re: [R] Sum of Squares Type I, II, III for ANOVA

2018-11-06 Thread Fox, John
Dear Nhat Tran,

The output that you show is unreadable and as far as I can see, the data aren't 
attached, but perhaps the following will help: First, if you want Anova() to 
compute type III tests, then you have to set the contrasts properly *before* 
you fit the model, not after. Second, you can specify the model much more 
compactly as

  mod <- lm(KIC ~ tem*ac + tem*av + tem*thick + ac*av +ac*thick + av*thick)

Finally, as sound general practice, I'd not attach the data, but rather put 
your recoded variables in the data frame and then specify the data argument to 
lm().

I hope that this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Thanh Tran
> Sent: Tuesday, November 6, 2018 6:58 PM
> To: r-help@r-project.org
> Subject: [R] Sum of Squares Type I, II, III for ANOVA
> 
> Hi everyone,
> I'm studying the ANOVA in R and have some questions to share. I investigate
> the effects of 4 factors (temperature-3 levels, asphalt content-3 levels, air
> voids-2 levels, and sample thickness-3 levels) on the hardness of asphalt
> concrete in the tensile test (abbreviated as KIC). These data were taken from
> an article. The code was written as follows:
> 
> > data = read.csv("Saha research.csv", header =T)
> > attach(data)
> > tem = as.factor(temperature)
> > ac= as.factor (AC)
> > av = as.factor(AV)
> > thick = as.factor(Thickness)
> > model =
> lm(KIC~tem+ac+av+thick+tem:ac+tem:av+tem:thick+ac:av+ac:thick+av:thick)
> > anova(model) #Type I tests
> > library(car) Loading required package: carData >
> anova(lm(KIC~tem+ac+av+thick+tem:ac+tem:av+tem:thick+ac:av+ac:thick+av
> :thick),type=2)
> Error: $ operator is invalid for atomic vectors
> > options(contrasts = c("contr.sum", "contr.poly"))
> > Anova(model,type="3") # Type III tests
> > Anova(model,type="2") # Type II tests
> 
> With R, three results from Type I, II, and III almost have the same as 
> follows.
> 
> Analysis of Variance Table Response: KIC Df Sum Sq Mean Sq F value Pr(>F)
> tem 2 15.3917 7.6958 427.9926 < 2.2e-16 *** ac 2 0.1709 0.0854 4.7510
> 0.0096967 ** av 1 1.9097 1.9097 106.2055 < 2.2e-16 *** thick 2 0.2041
> 0.1021 5.6756 0.0040359 ** tem:ac 4 0.5653 0.1413 7.8598 6.973e-06 ***
> tem:av 2 1.7192 0.8596 47.8046 < 2.2e-16 *** tem:thick 4 0.0728 0.0182
> 1.0120 0.4024210 ac:av 2 0.3175 0.1588 8.8297 0.0002154 *** ac:thick 4
> 0.0883 0.0221 1.2280 0.3003570 av:thick 2 0.0662 0.0331 1.8421 0.1613058
> Residuals 190 3.4164 0.0180 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’
> 0.05 ‘.’ 0.1 ‘ ’ 1
> 
> However, these results are different from the results in the article, 
> especially
> for the interaction (air voids and sample thickness). The results presented in
> the article are as follows:
> Analysis of variance for KIC, using Adjusted SS for tests. Source DF Seq SS 
> Adj
> MS F-stat P-value Model findings Temperature 2 15.39355 7.69677 426.68
> <0.01 Significant AC 2 0.95784 0.47892 26.55 <0.01 Significant AV 1 0.57035
> 0.57035 31.62 <0.01 Significant Thickness 2 0.20269 0.10135 5.62 <0.01
> Significant Temperature⁄AC 4 1.37762 0.34441 19.09 <0.01 Significant
> Temperature⁄AV 2 0.8329 0.41645 23.09 <0.01 Significant
> Temperature⁄thickness 4 0.07135 0.01784 0.99 0.415 Not significant AC⁄AV 2
> 0.86557 0.43279 23.99 <0.01 Significant AC⁄thickness 4 0.04337 0.01084 0.6
> 0.662 Not significant AV⁄thickness 2 0.17394 0.08697 4.82 <0.01 Significant
> Error 190 3.42734 0.01804 Total 215 23.91653
> 
> Therefore, I wonder whether there is an error in my code or whether there is
> another type of ANOVA in R. If you could answer my questions, I would be
> most grateful.
> Best regards,
> Nhat Tran
> Ps: I also added a CSV file and the paper for practicing R.
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] "logical indexing, " [was] match() question or needle haystack problem for a data.frame

2018-10-26 Thread Fox, John
Dear Knut,

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Knut
> Krueger
> Sent: Friday, October 26, 2018 2:29 AM
> To: r-help mailing list 
> Subject: Re: [R] "logical indexing, " [was] match() question or needle 
> haystack
> problem for a data.frame
> 
> Am 25.10.18 um 16:13 schrieb peter dalgaard:
> >
> >
> > Yes: x[!(x$A %in% y$B),]
> 
> OK, that's in my opinion a little workaround

Not a work-around but a solution. That is, one can't expect to find an existing 
function for every problem.

> why?:
> 
> There is an
> = and !=

Actually == and !=

> < and >
> 
> 
> means the opposite is available between terms.
> 
> why is there, e.g., no %!in%, %notin%, or !%in%?
> 
> This would be more intuitive.

If you feel strongly about this, it's not hard to supply it:

> `%!in%` <- function(x, y) !(x %in% y)
> (1:4) %in% (2*1:4)
[1] FALSE  TRUE FALSE  TRUE
> (1:4) %!in% (2*1:4)
[1]  TRUE FALSE  TRUE FALSE
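
Applied to the original data-frame question (an added illustration using the x
and y data frames from earlier in this thread), the new operator reads a little
more naturally:

x[x$A %!in% y$B, ]   # rows of x whose A values do not occur in y$B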

Best,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/


> 
> Kind regards Knut
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] new edition of An R Companion to Applied Regression

2018-10-15 Thread Fox, John
Dear r-help list members,

Sandy Weisberg and I would like to announce a new (third) edition of our book 
An R Companion to Applied Regression, which has recently been published by Sage 
Publications. The book provides a broad introduction to R in the general 
context of applied regression analysis, including linear models, generalized 
linear models, and, new to the third edition, mixed-effects models.

The R Companion is associated with two widely used CRAN packages, the car and 
effects packages. In anticipation of the new edition of the R Companion we 
contributed substantially revised versions of these packages to CRAN: version 
3.0-x of the car package and version 4.0-x of the effects package.

More information about the book, including a variety of on-line resources 
(chapter R scripts, on-line appendices, etc.), is available at 
.

Best,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Problem with lm.resid() when weights are provided

2018-09-17 Thread Fox, John
Dear Hamed,

> -Original Message-
> From: Hamed Ha [mailto:hamedhas...@gmail.com]
> Sent: Monday, September 17, 2018 3:56 AM
> To: Fox, John 
> Cc: r-help@r-project.org
> Subject: Re: [R] Problem with lm.resid() when weights are provided
> 
> Hi John,
> 
> 
> Thank you for your reply.
> 
> 
> I see your point, thanks. I checked lm.wfit() and realised that there is a tol
> parameter that is already set to 10^-7. That is not even half the decimal digits
> of machine precision. Furthermore, playing with the tol parameter does not solve the
> problem, as far as I checked.

tol plays a different role in lm.wfit(). It's for the QR decomposition (done in 
C code), I suppose to determine the rank of the weighted model matrix. 
Generally in this kind of context, you'd use something like the square root of 
the machine double epsilon to define a number that's effectively 0, and the 
tolerance used here isn't too far off that -- about an order of magnitude 
larger.
 
I'm not an expert in computer arithmetic or numerical linear algebra, so I 
don't have anything more to say about this.

> 
> 
> I still see this issue as critical and we should report it to the R core team 
> to be
> investigated more. What do you think?

I don't think that it's a critical issue because it isn't sensible to specify 
nonzero weights so close to 0. A simple solution is to change these weights to 
0 in your code calling lm().
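
For example (a sketch of that fix; the cutoff is arbitrary):

df$weight[df$weight < .Machine$double.eps] <- 0
fit <- lm(y ~ x, data = df, weights = weight)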

That said, I suppose that it might be better to make lm.wfit() more robust to 
near-zero weights. If you feel strongly about this, you can file a bug report, 
but I'm not interested in pursuing it.

Best,
 John

> 
> 
> Regards,
> Hamed.
> 
> 
> On Fri, 14 Sep 2018 at 22:46, Fox, John  <mailto:j...@mcmaster.ca> > wrote:
> 
> 
>   Dear Hamed,
> 
>   When you post a question to r-help, generally you should cc
> subsequent messages there as well, as I've done to this response.
> 
>   The algorithm that lm() uses is much more numerically stable than
> inverting the weighted sum-of-squares-and-product matrix. If you want to see
> how the computations are done, look at lm.wfit(), in which the residuals and
> fits are computed as
> 
>   z$residuals <- z$residuals/wts
>   z$fitted.values <- y - z$residuals
> 
>   Zero weights are handled specially, and your tiny weights are thus the
> source of the problem. When you divide by a number less than the machine
> double-epsilon, you can't expect numerically stable results. I suppose that
> lm.wfit() could check for 0 weights to a tolerance rather than exactly.
> 
>   John
> 
>   > -Original Message-
>   > From: Hamed Ha [mailto:hamedhas...@gmail.com
> <mailto:hamedhas...@gmail.com> ]
>   > Sent: Friday, September 14, 2018 5:34 PM
>   > To: Fox, John mailto:j...@mcmaster.ca> >
>   > Subject: Re: [R] Problem with lm.resid() when weights are provided
>   >
>   > Hi John,
>   >
>   > Thank you for your reply.
>   >
>   > I agree that the small weights are the potential source of the
> instability in the
>   > result. I also suspected that there are some failure/bugs in the 
> actual
>   > algorithm that R uses for fitting the model. I remember that at some
> points I
>   > checked the theoretical estimation of the parameters, solve(t(x)
> %*% w %*%
>   > x) %*% t(x) %*% w %*% y, (besides the point that I had to set tol
> parameter in
>   > solve() to a super small value) and realised  that lm() and the
> theoretical
>   > results match together. That is the parameter estimation is right in
> R.
>   > Moreover, I checked the predictions, predict(lm.fit), and it was 
> right.
> Then the
>   > only source of error remained was resid() function. I further checked
> this
>   > function and it is nothing more than calling a sub-element from and
> lm() fit.
>   > Putting all together, I think that there is something wrong/bug/miss-
>   > configuration in the lm() algorithm and I highly recommend the R
> core team to
>   > fix that.
>   >
>   > Please feel free to contact me for more details if required.
>   >
>   > Warm regards,
>   > Hamed.
>   >
>   >
>   >
>   >
>   >
>   >
>   >
>   >
>   >
>   > On Fri, 14 Sep 2018 at 13:35, Fox, John  <mailto:j...@mcmaster.ca>
>   > <mailto:j...@mcmaster.ca <mailto:j...@mcmaster.ca> > > wrote:
>   >
>   &g

Re: [R] Problem with lm.resid() when weights are provided

2018-09-14 Thread Fox, John
Dear Hamed,

When you post a question to r-help, generally you should cc subsequent messages 
there as well, as I've done to this response.

The algorithm that lm() uses is much more numerically stable than inverting the 
weighted sum-of-squares-and-product matrix. If you want to see how the 
computations are done, look at lm.wfit(), in which the residuals and fits are 
computed as 

z$residuals <- z$residuals/wts
z$fitted.values <- y - z$residuals

Zero weights are handled specially, and your tiny weights are thus the source 
of the problem. When you divide by a number less than the machine 
double-epsilon, you can't expect numerically stable results. I suppose that 
lm.wfit() could check for 0 weights to a tolerance rather than exactly.
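
A toy illustration of the instability (my addition, not from the original
exchange): in lm.wfit() the divisor wts is sqrt(weights), so a weighted residual
that is merely rounding error becomes large after the division.

wts <- sqrt(2.117337e-34)   # square root of one of the near-zero weights
r.w <- 1e-16                # a weighted residual that is "zero" up to rounding error
r.w / wts                   # about 7: rounding noise inflated into a visible residual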

John

> -Original Message-
> From: Hamed Ha [mailto:hamedhas...@gmail.com]
> Sent: Friday, September 14, 2018 5:34 PM
> To: Fox, John 
> Subject: Re: [R] Problem with lm.resid() when weights are provided
> 
> Hi John,
> 
> Thank you for your reply.
> 
> I agree that the small weights are the potential source of the instability in 
> the
> result. I also suspected that there are some failure/bugs in the actual
> algorithm that R uses for fitting the model. I remember that at some points I
> checked the theoretical estimation of the parameters, solve(t(x) %*% w %*%
> x) %*% t(x) %*% w %*% y, (besides the point that I had to set tol parameter in
> solve() to a super small value) and realised  that lm() and the theoretical
> results match together. That is the parameter estimation is right in R.
> Moreover, I checked the predictions, predict(lm.fit), and it was right. Then 
> the
> only source of error remained was resid() function. I further checked this
> function and it is nothing more than calling a sub-element from and lm() fit.
> Putting all together, I think that there is something wrong/bug/miss-
> configuration in the lm() algorithm and I highly recommend the R core team to
> fix that.
> 
> Please feel free to contact me for more details if required.
> 
> Warm regards,
> Hamed.
> 
> 
> 
> 
> 
> 
> 
> 
> 
> On Fri, 14 Sep 2018 at 13:35, Fox, John  <mailto:j...@mcmaster.ca> > wrote:
> 
> 
>   Dear Hamed,
> 
>   I don't think that anyone has picked up on this problem.
> 
>   What's peculiar about your weights is that several are 0 within
> rounding error but not exactly 0:
> 
>   > head(df)
>  y  x   weight
>   1  1.5115614  0.5520924 2.117337e-34
>   2 -0.6365313 -0.1259932 2.117337e-34
>   3  0.3778278  0.4209538 4.934135e-31
>   4  3.0379232  1.4031545 2.679495e-24
>   5  1.5364652  0.4607686 2.679495e-24
>   6 -2.3772787 -0.7396358 6.244160e-21
> 
>   I can reproduce the results that you report:
> 
>   > (mod.1 <- lm(y ~ x, data=df))
> 
>   Call:
>   lm(formula = y ~ x, data = df)
> 
>   Coefficients:
>   (Intercept)x
>  -0.04173  2.03790
> 
>   > max(resid(mod.1))
>   [1] 1.14046
>   > (mod.2 <- lm(y ~ x, data=df, weights=weight))
> 
>   Call:
>   lm(formula = y ~ x, data = df, weights = weight)
> 
>   Coefficients:
>   (Intercept)x
>  -0.05786  1.96087
> 
>   > max(resid(mod.2))
>   [1] 36.84939
> 
>   But the problem disappears when the tiny nonzero weight are set to 0:
> 
>   > df2 <- df
>   > df2$weight <- zapsmall(df2$weight)
>   > head(df2)
>  y  x weight
>   1  1.5115614  0.5520924  0
>   2 -0.6365313 -0.1259932  0
>   3  0.3778278  0.4209538  0
>   4  3.0379232  1.4031545  0
>   5  1.5364652  0.4607686  0
>   6 -2.3772787 -0.7396358  0
>   > (mod.3 <- update(mod.2, data=df2))
> 
>   Call:
>   lm(formula = y ~ x, data = df2, weights = weight)
> 
>   Coefficients:
>   (Intercept)x
>  -0.05786  1.96087
> 
>   > max(resid(mod.3))
>   [1] 1.146663
> 
>   I don't know exactly why this happens, but suspect numerical
> instability produced by the near-zero weights, which are smaller than the
> machine double-epsilon
> 
>   > .Machine$double.neg.eps
>   [1] 1.110223e-16
> 
>   The problem also disappears, e.g., if the tiny weight are set to 1e-15
> rather than 0.
> 
>   I hope this helps,
>John
> 
>   -
>   John Fox
>   Professor Emeritus
>   McMaster University
>   Hami

Re: [R] Problem with lm.resid() when weights are provided

2018-09-14 Thread Fox, John
Dear Hamed,

I don't think that anyone has picked up on this problem.

What's peculiar about your weights is that several are 0 within rounding error 
but not exactly 0:

> head(df)
   y  x   weight
1  1.5115614  0.5520924 2.117337e-34
2 -0.6365313 -0.1259932 2.117337e-34
3  0.3778278  0.4209538 4.934135e-31
4  3.0379232  1.4031545 2.679495e-24
5  1.5364652  0.4607686 2.679495e-24
6 -2.3772787 -0.7396358 6.244160e-21

I can reproduce the results that you report:

> (mod.1 <- lm(y ~ x, data=df))

Call:
lm(formula = y ~ x, data = df)

Coefficients:
(Intercept)x  
   -0.04173  2.03790  

> max(resid(mod.1))
[1] 1.14046
> (mod.2 <- lm(y ~ x, data=df, weights=weight))

Call:
lm(formula = y ~ x, data = df, weights = weight)

Coefficients:
(Intercept)x  
   -0.05786  1.96087  

> max(resid(mod.2))
[1] 36.84939

But the problem disappears when the tiny nonzero weights are set to 0:

> df2 <- df
> df2$weight <- zapsmall(df2$weight)
> head(df2)
   y  x weight
1  1.5115614  0.5520924  0
2 -0.6365313 -0.1259932  0
3  0.3778278  0.4209538  0
4  3.0379232  1.4031545  0
5  1.5364652  0.4607686  0
6 -2.3772787 -0.7396358  0
> (mod.3 <- update(mod.2, data=df2))

Call:
lm(formula = y ~ x, data = df2, weights = weight)

Coefficients:
(Intercept)x  
   -0.05786  1.96087  

> max(resid(mod.3))
[1] 1.146663

I don't know exactly why this happens, but suspect numerical instability 
produced by the near-zero weights, which are smaller than the machine 
double-epsilon

> .Machine$double.neg.eps
[1] 1.110223e-16

The problem also disappears, e.g., if the tiny weights are set to 1e-15 rather 
than 0.
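
For instance (an added sketch along those lines; the threshold is arbitrary):

df3 <- df
tiny <- df3$weight > 0 & df3$weight < .Machine$double.eps
df3$weight[tiny] <- 1e-15
max(resid(lm(y ~ x, data = df3, weights = weight)))   # back to a sensible value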

I hope this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Hamed Ha
> Sent: Tuesday, September 11, 2018 8:39 AM
> To: r-help@r-project.org
> Subject: [R] Problem with lm.resid() when weights are provided
> 
> Dear R Help Team.
> 
> I get some weird results when I use the lm function with weight. The issue can
> be reproduced by the example below:
> 
> 
> The input data is (weights are intentionally designed to reflect some
> structures in the data)
> 
> 
> > df
> y x weight
>  1.51156139  0.55209240 2.117337e-34
> -0.63653132 -0.12599316 2.117337e-34
>  0.37782776  0.42095384 4.934135e-31
>  3.03792318  1.40315446 2.679495e-24
>  1.53646523  0.46076858 2.679495e-24
> -2.37727874 -0.73963576 6.244160e-21
>  0.37183065  0.20407468 1.455107e-17
> -1.53917553 -0.95519361 1.455107e-17
>  1.10926675  0.03897129 3.390908e-14
> -0.37786333 -0.17523593 3.390908e-14
>  2.43973603  0.97970095 7.902000e-11
> -0.35432394 -0.03742559 7.902000e-11
>  2.19296613  1.00355263 4.289362e-04
>  0.49845532  0.34816207 4.289362e-04
>  1.25005260  0.76306225 5.00e-01
>  0.84360691  0.45152356 5.00e-01
>  0.29565993  0.53880068 5.00e-01
> -0.54081334 -0.28104525 5.00e-01
>  0.83612836 -0.12885659 9.995711e-01
> -1.42526769 -0.87107631 9.98e-01
>  0.10204789 -0.11649899 1.00e+00
>  1.14292898  0.37249631 1.00e+00
> -3.02942081 -1.28966997 1.00e+00
> -1.37549764 -0.74676145 1.00e+00
> -2.00118016 -0.55182759 1.00e+00
> -4.24441674 -1.94603608 1.00e+00
>  1.17168144  1.00868008 1.00e+00
>  2.64007761  1.26333069 1.00e+00
>  1.98550114  1.18509599 1.00e+00
> -0.58941683 -0.61972416 9.98e-01
> -4.57559611 -2.30914920 9.995711e-01
> -0.82610544 -0.39347576 9.995711e-01
> -0.02768220  0.20076910 9.995711e-01
>  0.78186399  0.25690215 9.995711e-01
> -0.88314153 -0.20200148 5.00e-01
> -4.17076452 -2.03547588 5.00e-01
>  0.93373070  0.54190626 4.289362e-04
> -0.08517734  0.17692491 4.289362e-04
> -4.47546619 -2.14876688 4.289362e-04
> -1.65509103 -0.76898087 4.289362e-04
> -0.39403030 -0.12689705 4.289362e-04
>  0.01203300 -0.18689898 1.841442e-07
> -4.82762639 -2.31391121 1.841442e-07
> -0.72658380 -0.39751171 3.397282e-14
> -2.35886866 -1.01082109 0.00e+00
> -2.03762707 -0.96439902 0.00e+00
>  0.90115123  0.60172286 0.00e+00
>  1.55999194  0.83433953 0.00e+00
>  3.07994058  1.30942776 0.00e+00
>  1.78871462  1.10605530 0.00e+00
> 
> 
> 
> Running simple linear model returns:
> 
> > lm(y~x,data=df)
> 
> Call:
> lm(formula = y ~ x, data = df)
> 
> Coefficients:
> (Intercept)x
>-0.04173  2.03790
> 
> and
> > max(resid(lm(y~x,data=df)))
> [1] 1.14046
> 
> 
> *HOWEVER if I use the weighted model then:*
> 
> lm(formula = y ~ x, data = df, weights = df$weights)
> 
> Coefficients:
> (Intercept)x
>-0.05786  1.96087
> 
> and
> > max(resid(lm(y~x,data=df,weights=df$weights)))
> [1] 60.91888
> 
> 
> as you see, the estimation of the coefficients are nearly the same but the
> resid() function returns 

Re: [R] Marginal effects with plm

2018-09-06 Thread Fox, John
Dear Milu,

I get the same error as you with this example -- I tried a different plm model 
-- which of course is why a reproducible example is a good idea.

Here's where the error is:

--- snip ---

> Ef.hd <- Effect(c("pc", "emp", "unemp"), zz)
Error in UseMethod("droplevels") : 
  no applicable method for 'droplevels' applied to an object of class "NULL"
> traceback()
10: droplevels(index)
9: model.frame.pFormula(formula = log(gsp) ~ log(pcap) + log(pc) + 
   log(emp) + unemp, data = Produc, drop.unused.levels = TRUE)
8: stats::model.frame(formula = log(gsp) ~ log(pcap) + log(pc) + 
   log(emp) + unemp, data = Produc, drop.unused.levels = TRUE)
7: eval(mf, parent.frame())
6: eval(mf, parent.frame())
5: glm(formula = log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp, 
   data = Produc, control = list(epsilon = 1, maxit = 1, trace = FALSE))
4: eval(cl)
3: eval(cl)
2: Effect.default(c("pc", "emp", "unemp"), zz)
1: Effect(c("pc", "emp", "unemp"), zz)

--- snip ---

So the error is in model.frame.pFormula(), which is from the plm package. It 
would probably require substantial effort to get this to work.

Best,
 John


> -Original Message-
> From: Miluji Sb [mailto:miluj...@gmail.com]
> Sent: Thursday, September 6, 2018 8:52 AM
> To: Fox, John 
> Cc: r-help mailing list 
> Subject: Re: [R] Marginal effects with plm
> 
> Dear John,
> 
> Apologies for not providing reproducible example. I just tried with a
> plm example but ran into the same issue;
> 
> library(plm)
> data("Produc", package = "plm")
> zz <- plm(log(gsp) ~ log(pcap) + log(pc) + log(emp) + unemp, data =
> Produc, index = c("state","year"))
> 
> Ef.hd <- Effect(c("pc", "emp", "unemp"), zz)
> 
> Error in UseMethod("droplevels") :
>   no applicable method for 'droplevels' applied to an object of class
> "NULL"
> 
> What am I doing wrong? Thanks again.
> 
> Sincerely,
> 
> Milu
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Marginal effects with plm

2018-09-06 Thread Fox, John
Dear Milu,

Effect() doesn't have a specific plm method so the default method is invoked. 
Before responding to your initial question, I tried Effect() with an example 
from ?plm and it worked.

Without a reproducible example that produces the error that you encountered, 
there's no way to answer your question.

Best,
 John

> -Original Message-
> From: Miluji Sb [mailto:miluj...@gmail.com]
> Sent: Thursday, September 6, 2018 5:37 AM
> To: Fox, John 
> Cc: r-help mailing list 
> Subject: Re: [R] Marginal effects with plm
> 
> Dear John,
> 
> Thank you very much for the solution and the suggestion. I have tried the
> following;
> 
> plm1 <- plm(formula = log(gva_ind) ~  poly(x1, 2, raw=TRUE) +
> heat*debt_dummy + tt, data = df, index=c("region","year"))
> 
> Ef.hd <- Effect(c("heat", "debt_dummy"), plm1)
> 
> 
> But get the following error;  - Error in UseMethod("droplevels") : no 
> applicable
> method for 'droplevels' applied to an object of class "NULL"
> 
> Is this something to do with the way the plm object? Thanks again!
> 
> Sincerely,
> 
> Milu
> 
> On Thu, Sep 6, 2018 at 1:12 AM Fox, John  <mailto:j...@mcmaster.ca> > wrote:
> 
> 
>   Dear Milu,
> 
>   Depending upon what you mean by "marginal effects," you might try
> the effects package. For example, for your model, try
> 
>   (Ef.hd <- Effect(c("heat", "debt_dummy"), plm1))
>   plot(Ef.hd)
> 
>   A couple of comments about the model: I'd prefer to specify the
> formula as log(y) ~ poly(x1, 2) + heat*debt + tt or log(y) ~ poly(x1, 2,
> raw=TRUE) + heat*debt + tt (assuming that debt_dummy is a precoded
> dummy regressor for a factor debt).
> 
>   I hope this helps,
>John
> 
>   --
>   John Fox, Professor Emeritus
>   McMaster University
>   Hamilton, Ontario, Canada
>   Web: socialsciences.mcmaster.ca/jfox/
> <http://socialsciences.mcmaster.ca/jfox/>
> 
> 
> 
>   > -Original Message-
>   > From: R-help [mailto:r-help-boun...@r-project.org <mailto:r-help-
> boun...@r-project.org> ] On Behalf Of Miluji
>   > Sb
>   > Sent: Wednesday, September 5, 2018 6:30 PM
>   > To: r-help mailing list mailto:r-help@r-
> project.org> >
>   > Subject: [R] Marginal effects with plm
>   >
>   > Dear all,
>   >
>   > I am running the following panel regression;
>   >
>   > plm1 <- plm(formula = log(y) ~ x1 + I(x1^2) + heat*debt_dummy + tt,
> data
>   > = df, index=c("region","year"))
>   >
>   > where 'df' is a pdata.frame. I would like to obtain marginal effects 
> of
>   > 'y'
>   > for the variable 'x1'. I have tried the packages 'prediction' and
>   > 'margins'
>   > without luck.
>   >
>   > Is it possible to obtain marginal effects with 'plm'? Any help will be
>   > highly appreciated. Thank you.
>   >
>   > Error in UseMethod("predict") :
>   >   no applicable method for 'predict' applied to an object of class
>   > "c('plm', 'panelmodel')"
>   >
>   > Sincerely,
>   >
>   > Milu
>   >
>   >   [[alternative HTML version deleted]]
>   >
>   > __
>   > R-help@r-project.org <mailto:R-help@r-project.org>  mailing list --
> To UNSUBSCRIBE and more, see
>   > https://stat.ethz.ch/mailman/listinfo/r-help
>   > PLEASE do read the posting guide http://www.R-project.org/posting-
>   > guide.html
>   > and provide commented, minimal, self-contained, reproducible
> code.
> 

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Marginal effects with plm

2018-09-05 Thread Fox, John
Dear Milu,

Depending upon what you mean by "marginal effects," you might try the effects 
package. For example, for your model, try 

(Ef.hd <- Effect(c("heat", "debt_dummy"), plm1))
plot(Ef.hd)

A couple of comments about the model: I'd prefer to specify the formula as 
log(y) ~ poly(x1, 2) + heat*debt + tt or log(y) ~ poly(x1, 2, raw=TRUE) + 
heat*debt + tt (assuming that debt_dummy is a precoded dummy regressor for a 
factor debt).

I hope this helps,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Miluji
> Sb
> Sent: Wednesday, September 5, 2018 6:30 PM
> To: r-help mailing list 
> Subject: [R] Marginal effects with plm
> 
> Dear all,
> 
> I am running the following panel regression;
> 
> plm1 <- plm(formula = log(y) ~ x1 + I(x1^2) + heat*debt_dummy + tt, data
> = df, index=c("region","year"))
> 
> where 'df' is a pdata.frame. I would like to obtain marginal effects of
> 'y'   
> for the variable 'x1'. I have tried the packages 'prediction' and
> 'margins'
> without luck.
> 
> Is it possible to obtain marginal effects with 'plm'? Any help will be
> highly appreciated. Thank you.
> 
> Error in UseMethod("predict") :
>   no applicable method for 'predict' applied to an object of class
> "c('plm', 'panelmodel')"
> 
> Sincerely,
> 
> Milu
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] standardized regression coefficients in GAM

2018-08-30 Thread Fox, John
Dear Dani,

I'll address your questions briefly below. They aren't unique to GAMs.

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of dani
> Sent: Thursday, August 30, 2018 12:47 PM
> To: r-help@r-project.org
> Subject: [R] standardized regression coefficients in GAM
> 
> Hello everyone,
> 
> 
> I was wondering if anyone can help me calculate standardized regression
> coefficients from a GAM model.
> 
> I have some dummy and some continuous covariates in my GAM model. I
> know I could standardize only the continuous covariates and re-run the model
> to get the standardized coefficients. Can anyone help with some R code to
> create the standardized coefficients after obtaining a GAM model based on
> unstandardized coefficients?

Why one would want to do this isn't clear to me, but you can just multiply each 
such coefficient by the standard deviation of the corresponding X and divide by 
the standard deviation of Y.
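
Something like the following (a generic sketch added for illustration; the model
and variable names are hypothetical):

b <- coef(fit)[c("x1", "x2")]                    # unstandardized coefficients
b * sapply(dat[c("x1", "x2")], sd) / sd(dat$y)   # standardized equivalents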

> 
> 
> Also, on a separate note, what do I do with the dummy covariates - should I
> just include them as they are in the model with standardized variables? I do
> not see how I can standardize dummy variables.

Standardizing dummy regressors is nonsense, so don't do it. If there are 
interaction regressors in your model, don't standardize those either.

I hope this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/



> 
> 
> Thank you!
> 
> Best,
> 
> Dani
> 
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Obtaining Complete Dataset with Imputed Values

2018-08-30 Thread Fox, John
Dear Paul and WHP,

My guess: Paul apparently has loaded the mice package after the mi package. 
Both packages have complete() functions, but for objects of different classes 
-- "mids" in the case of mice. Consquently, complete() in the mice package is 
shadowing complete() in the mi package.

The solution is to not load both packages unless you actually need to, to load mi 
after mice (though then other similar problems might surface), or to invoke 
mi::complete() explicitly.
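
For example, using the objects from Paul's message (an added illustration):

library(mi)
datimputations <- mi(dat[2:5], n.iter = 50)
completedat <- mi::complete(datimputations)   # explicitly mi's complete(), even if mice is loaded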

I hope this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/




> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bill Poling
> Sent: Thursday, August 30, 2018 8:42 AM
> To: Paul Bernal ; r-help@r-project.org
> Subject: Re: [R] Obtaining Complete Dataset with Imputed Values
> 
> Good morning Paul.
> 
> I am unfamiliar with the package you are using but I have been working
> through the tutorial for this purpose using finalfit, if that is any help.
> 
> Cheers
> 
> WHP
> 
> http://www.datasurg.net/2018/08/29/five-steps-for-missing-data-with-
> finalfit/
> 
> 
> From: R-help  On Behalf Of Paul Bernal
> Sent: Friday, August 24, 2018 2:57 PM
> To: r-help@r-project.org
> Subject: [R] Obtaining Complete Dataset with Imputed Values
> 
> Dear friends, hope all is well with you,
> 
> I am working with package mi for data inputation. Currently working with R
> version 3.5.0 (64-bit).
> 
> Say my data is defined as dat, and I do the following:
> 
> datimputations <- mi(dat[2:5], n.iter=50) completedat <-
> complete(datimputations)
> 
> After using the complete function, I get the following error message:
> 
> Error in complete(datimputations, m = 1) : 'data' not of class 'mids'
> 
> How can I retrieve the processed dataframe (along with the imputed values)?
> 
> Here is my dput() for you to see
> 
> > dput(head(dat,100))
> structure(list(TransitDate = structure(c(496990800, 499669200, 502261200,
> 504939600, 507618000, 510037200, 512715600, 515307600, 517986000,
> 520578000, 523256400, 525934800, 528526800, 531205200, 533797200,
> 536475600, 539154000, 541573200, 544251600, 546843600, 549522000,
> 552114000, 554792400, 557470800, 560062800, 562741200, 565333200,
> 568011600, 57069, 573195600, 575874000, 578466000, 581144400,
> 583736400, 586414800, 589093200, 591685200, 594363600, 596955600,
> 599634000, 602312400, 604731600, 60741, 610002000, 612680400,
> 615272400, 617950800, 620629200, 623221200, 625899600, 628491600,
> 63117, 633848400, 636267600, 638946000, 641538000, 644216400,
> 646808400, 649486800, 652165200, 654757200, 657435600, 660027600,
> 662706000, 665384400, 667803600, 670482000, 673074000, 675752400,
> 678344400, 681022800, 683701200, 686293200, 688971600, 691563600,
> 694242000, 696920400, 699426000, 702104400, 704696400, 707371200,
> 709963200, 712641600, 71532, 717912000, 720590400, 723182400,
> 725860800, 728539200, 730958400, 733636800, 736232400, 738910800,
> 741502800, 744181200, 746859600, 749451600, 75213, 754722000,
> 757400400), class = c("POSIXct", "POSIXt"), tzone = ""), Transits = c(14L, 
> 14L,
> 13L, 10L, 11L, 14L, 14L, 14L, 16L, 6L, 8L, 6L, 6L, 7L, 7L, 9L, 7L, 9L, 3L, 
> 12L, 7L,
> 8L, 10L, 9L, 10L, 11L, 9L, 9L, 5L, 11L, 12L, 7L, 12L, 10L, 9L, 13L, 7L, 7L, 
> 8L, 4L,
> 4L, 7L, 5L, 7L, 7L, 6L, 9L, 4L, 7L, 9L, 5L, 5L, 10L, 6L, 6L, 13L, 6L, 7L, 
> 10L, 7L, 8L,
> 5L, 6L, 7L, 6L, 9L, 8L, 10L, 9L, 9L, 12L, 5L, 9L, 6L, 7L, 10L, 10L, 9L, 14L, 
> 14L, 15L,
> 14L, 16L, 17L, 18L, 11L, 15L, 14L, 8L, 13L, 10L, 9L, 12L, 8L, 12L, 10L, 11L, 
> 10L,
> 9L, 10L), CargoTons = c(154973L, 129636L, 136884L, 86348L, 109907L,
> 154506L, 144083L, 152794L, 124861L, 60330L, 65221L, 61718L, 53997L,
> 83536L, 63218L, 98222L, 54719L, 98470L, 18263L, 104255L, 62869L, 62523L,
> 75344L, 81476L, 92818L, 87457L, 85231L, 77897L, 57699L, 96989L, 109361L,
> 59799L, 91116L, 82241L, 74251L, 124361L, 68751L, 61719L, 68017L, 37760L,
> 32513L, 56359L, 51333L, 80859L, 75852L, 65760L, 96043L, 38820L, 63202L,
> 102647L, 49104L, 53482L, 121305L, 71795L, 76704L, 146097L, 73047L,
> 68557L, 110642L, 77616L, 97767L, 52059L, 58658L, 66350L, 69303L, 76013L,
> 91909L, 108445L, 94454L, 101249L, 112131L, 56290L, 118342L, 70618L,
> 64783L, 112839L, 120506L, 94243L, 130768L, 133643L, 146321L, 140736L,
> 147234L, 158953L, 189888L, 93819L, 130021L, 130124L, 55088L, 114783L,
> 95184L, 82205L, 80321L, 65422L, 98933L, 93713L, 98417L, 97210L, 88464L,
> 94659L), RcnstPCUMS = c(229914L, 214547L, 215890L, 158695L, 173125L,
> 222533L, 212490L, 222125L, 266913L, 94268L, 112967L, 95480L, 87654L,
> 108996L, 97973L, 139247L, 93817L, 133197L, 40020L, 169749L, 102590L,
> 112121L, 140241L, 122989L, 144592L, 144979L, 123748L, 123249L, 70081L,
> 155218L, 168096L, 104743L, 163384L, 142648L, 129188L, 183170L, 99299L,
> 99873L, 111648L, 55890L, 59183L, 95568L, 72550L, 104562L, 100

Re: [R] graphing repeated curves

2018-08-22 Thread Fox, John
Dear Bert,

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter
> Sent: Wednesday, August 22, 2018 8:38 PM
> To: Jim Lemon 
> Cc: rss@gmail.com; R-help 
> Subject: Re: [R] graphing repeated curves
> 
> I do not think this does what the OP wants -- it does not produce polynomials
> of the form desired.
> 
> John Fox's solution using poly() seems to me to be the right approach, but I

Actually, I didn't do a good job of graphing the polynomials between the 
observed x-values. Here's a better solution:

x <- with(mtcars, seq(min(hp), max(hp), length=500))
plot(mpg ~ hp, data=mtcars)
for (p in 1:6){
m <- lm(mpg ~ poly(hp, p), data=mtcars)
lines(x, predict(m, newdata=data.frame(hp=x)), lty=p, col=p)
}
legend("top", legend=1:6, lty=1:6, col=1:6, title="order", inset=0.02)

Best,
 John

> will show what I think is a considerably simpler way to build up the
> polynomial expressions just as an example of one way to do this sort of thing
> in more general circumstances:
> 
> fm <- vector("character",6)
> fm[1]<- "mpg ~ hp"
> for(i in 2:6)fm[i]<- paste0(fm[i-1]," + I(hp^", i,")") ## yielding:
> > fm
> [1] "mpg ~ hp"
> [2] "mpg ~ hp + I(hp^2)"
> [3] "mpg ~ hp + I(hp^2) + I(hp^3)"
> [4] "mpg ~ hp + I(hp^2) + I(hp^3) + I(hp^4)"
> [5] "mpg ~ hp + I(hp^2) + I(hp^3) + I(hp^4) + I(hp^5)"
> [6] "mpg ~ hp + I(hp^2) + I(hp^3) + I(hp^4) + I(hp^5) + I(hp^6)"
> 
> Although fm is a character vector, the character strings will be automatically
> coerced by lm to formulas (see ?lm), so, e.g.
> 
> results <- lapply(fm, lm,data = mtcars)
> 
> would yield a list of regressions which could then be summarized, plotted or
> whatever (again using lapply). e.g.
> 
> > results[[3]]
> 
> Call:
> FUN(formula = X[[i]], data = ..1)
> 
> Coefficients:
> (Intercept)   hp  I(hp^2)  I(hp^3)
>   4.422e+01   -2.945e-019.115e-04   -8.701e-07
> 
> One could also choose to do the plotting or whatever within the lapply call,
> but I prefer to keep things simple if possible.
> 
> Cheers,
> Bert
> 
> 
> 
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> 
> 
> On Wed, Aug 22, 2018 at 4:43 PM Jim Lemon 
> wrote:
> 
> > Hi Richard,
> > This may be what you want:
> >
> > data(mtcars)
> > m<-list()
> > for(i in 1:6) {
> >  rhterms<-paste(paste0("I(hp^",1:i,")"),sep="+")
> >  lmexp<-paste0("lm(mpg~",rhterms,",mtcars)")
> >  cat(lmexp,"\n")
> >  m[[i]]<-eval(parse(text=lmexp))
> > }
> > plot(mpg~hp,mtcars,type="n")
> > for(i in 1:6) abline(m[[i]],col=i)
> >
> > Jim
> >
> >
> > On Thu, Aug 23, 2018 at 9:07 AM, Richard Sherman 
> > wrote:
> > > Hi all,
> > >
> > > I have a simple graphing question that is not really a graphing
> > question, but a question about repeating a task.
> > >
> > > I’m fiddling with some of McElreath’s Statistical Rethinking, and
> > there’s a graph illustrating extreme overfitting (a number of
> > polynomial terms in x equal to the number of observations), a subject
> > I know well having taught it to grad students for many years.
> > >
> > > The plot I want to reproduce has, in effect:
> > >
> > > m1 <- lm( y ~ x)
> > > m2 <- lm( y ~ x + x^2)
> > >
> > > …etc., through lm( y ~ x + x^2 + x^3 + x^4 + x^5 + x^6 ), followed
> > > by
> > some plot() or lines() or ggplot2() call to render the data and fitted
> > curves.
> > >
> > > Obviously I don’t want to run such regressions for any real purpose,
> > > but
> > I think it might be useful to learn how to do such a thing in R
> > without writing down each lm() call individually. It’s not obvious
> > where I’d want to apply this, but I like learning how to repeat things in a
> compact way.
> > >
> > > So, something like:
> > >
> > > data( mtcars )
> > > d <- mtcars
> > > v <- c( 1 , 2 , 3 , 4 , 5 , 6  )
> > > m1 <- lm( mpg ~ hp  , data = d )
> > >
> > > and then somehow use for() with an index or some flavor of apply()
> > > with
> > the vector v to repeat this process yielding
> > >
> > > m2 <- lm( mpg ~ hp + I( hp ^2 ) , data=d)
> > > m3 <- lm( mpg ~ hp + I( hp^2 ) + I(hp^3) , data=d )
> > >
> > > … and the rest through m6 <- lm( mpg ~ hp + I(hp^2) + I(hp^3) +
> > > I(hp^4)
> > + I(hp^5) + I(hp^6) , data=d )
> > >
> > > But finding a way to index these values including not just each
> > > value
> > but each value+1 , then value+1 and value+2, and so on escapes me.
> > Obviously I don’t want to include index values below zero.
> > >
> > > ===
> > > Richard Sherman
> > > rss@gmail.com
> > >
> > > __
> > > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >
> > __

Re: [R] graphing repeated curves

2018-08-22 Thread Fox, John
Dear Richard,

How about this:

ord <- order(mtcars$hp)
mtcars$hp <- mtcars$hp[ord]
mtcars$mpg <- mtcars$mpg[ord]
plot(mpg ~ hp, data=mtcars)
for (p in 1:6){
m <- lm(mpg ~ poly(hp, p), data=mtcars)
lines(mtcars$hp, fitted(m), lty=p, col=p)
}
legend("topright", legend=1:6, lty=1:6, col=1:6, title="order", inset=0.02)

I hope this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Richard
> Sherman
> Sent: Wednesday, August 22, 2018 7:07 PM
> To: r-help@r-project.org
> Subject: [R] graphing repeated curves
> 
> Hi all,
> 
> I have a simple graphing question that is not really a graphing question, but 
> a
> question about repeating a task.
> 
> I’m fiddling with some of McElreath’s Statistical Rethinking, and there’s a
> graph illustrating extreme overfitting (a number of polynomial terms in x
> equal to the number of observations), a subject I know well having taught it 
> to
> grad students for many years.
> 
> The plot I want to reproduce has, in effect:
> 
> m1 <- lm( y ~ x)
> m2 <- lm( y ~ x + x^2)
> 
> …etc., through lm( y ~ x + x^2 + x^3 + x^4 + x^5 + x^6 ), followed by some
> plot() or lines() or ggplot2() call to render the data and fitted curves.
> 
> Obviously I don’t want to run such regressions for any real purpose, but I 
> think
> it might be useful to learn how to do such a thing in R without writing down
> each lm() call individually. It’s not obvious where I’d want to apply this, 
> but I
> like learning how to repeat things in a compact way.
> 
> So, something like:
> 
> data( mtcars )
> d <- mtcars
> v <- c( 1 , 2 , 3 , 4 , 5 , 6  )
> m1 <- lm( mpg ~ hp  , data = d )
> 
> and then somehow use for() with an index or some flavor of apply() with the
> vector v to repeat this process yielding
> 
> m2 <- lm( mpg ~ hp + I( hp ^2 ) , data=d)
> m3 <- lm( mpg ~ hp + I( hp^2 ) + I(hp^3) , data=d )
> 
> … and the rest through m6 <- lm( mpg ~ hp + I(hp^2) + I(hp^3) + I(hp^4) +
> I(hp^5) + I(hp^6) , data=d )
> 
> But finding a way to index these values including not just each value but each
> value+1 , then value+1 and value+2, and so on escapes me. Obviously I don’t
> want to include index values below zero.
> 
> ===
> Richard Sherman
> rss@gmail.com
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] looking for formula parser that allows coefficients

2018-08-21 Thread Fox, John
Dear Paul,

Is it possible that you're overthinking this? That is, do you really need an R 
model formula or just want to evaluate an arithmetic expression using the 
columns of X?

If the latter, the following approach may work for you:

> evalFormula <- function(X, expr){
+   if (is.null(colnames(X))) colnames(X) <- paste0("x", 1:ncol(X))
+   with(as.data.frame(X), eval(parse(text=expr)))
+ }

> X <- matrix(1:20, 5, 4)
> X
 [,1] [,2] [,3] [,4]
[1,]16   11   16
[2,]27   12   17
[3,]38   13   18
[4,]49   14   19
[5,]5   10   15   20

> evalFormula(X, '2 + 3*x1 + 4*x2 + 5*x3 + 6*x1*x2')
[1] 120 180 252 336 432
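
The same idea can also consume a named coefficient vector like the one you
describe (a further sketch added here, not part of the original reply): paste
the expression together from the names, then evaluate it.

beta <- c("(Intercept)" = 1, "x1" = 2, "x2" = 1, "x3" = 1, "x2:x1" = 0.1)
expr <- paste(beta[1],
              paste(beta[-1], gsub(":", "*", names(beta)[-1]),
                    sep = "*", collapse = " + "),
              sep = " + ")
expr                  # "1 + 2*x1 + 1*x2 + 1*x3 + 0.1*x2*x1"
evalFormula(X, expr)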

I hope that this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Paul
> Johnson
> Sent: Tuesday, August 21, 2018 6:46 PM
> To: R-help 
> Subject: [R] looking for formula parser that allows coefficients
> 
> Can you point me at any packages that allow users to write a formula with
> coefficients?
> 
> I want to write a data simulator that has a matrix X with lots of columns, and
> then users can generate predictive models by entering a formula that uses
> some of the variables, allowing interactions, like
> 
> y ~ 2 + 1.1 * x1 + 3 * x3 + 0.1 * x1:x3 + 0.2 * x2:x2
> 
> Currently, in the rockchalk package, I have a function simulates data
> (genCorrelatedData2), but my interface to enter the beta coefficients is poor.
> I assumed user would always enter 0's as place holder for the unused
> coefficients, and the intercept is always first. The unnamed vector is too
> confusing.  I have them specify:
> 
> c(2, 1.1, 0, 3, 0, 0, 0.2, ...)
> 
> In the documentation I say (ridiculously) it is easy to figure out from the
> examples, but it really isn't.
> The function prints out the equation it thinks you intended; that's minimum
> protection against user error, but still not very good:
> 
> dat <- genCorrelatedData2(N = 10, rho = 0.0,
>   beta = c(1, 2, 1, 1, 0, 0.2, 0, 0, 0),
>   means = c(0,0,0), sds = c(1,1,1), stde = 0) [1] "The equation that 
> was
> calculated was"
> y = 1 + 2*x1 + 1*x2 + 1*x3
>  + 0*x1*x1 + 0.2*x2*x1 + 0*x3*x1
>  + 0*x1*x2 + 0*x2*x2 + 0*x3*x2
>  + 0*x1*x3 + 0*x2*x3 + 0*x3*x3
>  + N(0,0) random error
> 
> But still, it is not very good.
> 
> As I look at this now, I realize expect just the vech, not the whole vector 
> of all
> interaction terms, so it is even more difficult than I thought to get the 
> correct
> input.Hence, I'd like to let the user write a formula.
> 
> The alternative for the user interface is to have named coefficients.
> I can more or less easily allow a named vector for beta
> 
> beta = c("(Intercept)" = 1, "x1" = 2, "x2" = 1, "x3" = 1, "x2:x1" = 0.1)
> 
> I could build a formula from that.  That's not too bad. But I still think it 
> would
> be cool to allow formula input.
> 
> Have you ever seen it done?
> pj
> --
> Paul E. Johnson   http://pj.freefaculty.org
> Director, Center for Research Methods and Data Analysis http://crmda.ku.edu
> 
> To write to me directly, please address me at pauljohn at ku.edu.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Formatting multi-way ANOVA output for spectra analysis

2018-07-25 Thread Fox, John
Dear Robert,

Although you don't say so, it sounds as if you may be using the Anova() 
function in the car package, which is what the R Commander uses for ANOVA. If 
so, in most cases, Anova() returns an object of class c("anova", "data.frame"), 
which can be manipulated as a data frame. To see this, try something like

str(Anova(your.model))

You should be able to extract, manipulate, and graph whatever components of the 
object interest you.
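
For instance, a sketch of the kind of reduction described above (the model name
and threshold here are only placeholders):

a <- Anova(your.model)          # a data frame: Sum Sq, Df, F value, Pr(>F)
p <- a[["Pr(>F)"]]              # one p-value per term (NA for Residuals)
rownames(a)[which(p < 0.05)]    # terms significant at the 5% level

Repeating that for each wavelength gives one line (or one logical value) per
wavelength, which is all you need to keep.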

I hope this helps,
 John

-
John Fox
Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: https://socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Robert D.
> Bowers M.A.
> Sent: Wednesday, July 25, 2018 1:12 PM
> To: r-help@r-project.org
> Subject: [R] Formatting multi-way ANOVA output for spectra analysis
> 
> I've studied R a little bit, although I haven't used it in some time (except 
> via
> RCommander).  I'm working on my dissertation project and have
> spectrometer data that I need to evaluate.  I need to find a way to simplify 
> the
> output from multi-way ANOVA so I can reduce the areas of the spectrum to
> only those where there are significant differences between sites.  (A
> preliminary study on a too-small sample size indicates that certain areas of
> the spectrum can distinguish between sites.  This project is the next step.)
> 
> The dataset is comprised of analyses done on samples from five separate
> locations, with 50 samples taken from each site.  The output of the
> spectrometer per sample is values for 2048 individual wavelengths, in a
> spreadsheet with the wavelength as the first column.  Since I'm doing the
> analysis wavelength-by-wavelength, I've transposed the data and broken the
> data for the project down into smaller spreadsheets (so that R can perform
> ANOVA on each wavelength).
> 
> The problem is, I can do ANOVA now on each wavelength, but I don't need a
> full output table for each... I just need to know if there is significant 
> variation
> between any of the sites at that wavelength, based on 95% confidence level
> (or better).  If I could get some sort of simple chart (or a single line in a
> spreadsheet), that would help to narrow down the areas of the spectrum that I
> need to focus on to evaluate the results of the tests.
> 
> I've been reading information about ANOVA, but have found very little that is
> clear about formatting the output - and I don't need to rehash all of the
> math.  I just need to find out how to hack down the output to just the part I
> need (if possible).  Once that's done, I can decide what wavelengths are
> valuable for future tests and simplify the process.
> 
> Thanks for any help given!
> 
> Bob
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Package 'data.table' in version R-3.5.0 not successfully being installed

2018-04-27 Thread Fox, John
Dear Akhilesh,

I hope that it's clear that the Windows binary I provided for the data.table 
package is a temporary solution, and that the maintainer should fix the package 
so that it passes its own tests. You should be careful using the package in its 
current state.

Best,
 John

> -Original Message-
> From: Akhilesh Singh [mailto:akhileshsingh.i...@gmail.com]
> Sent: Friday, April 27, 2018 4:10 AM
> To: Fox, John 
> Cc: r-help mailing list 
> Subject: Re: [R] Package 'data.table' in version R-3.5.0 not successfully 
> being
> installed
> 
> Dear Dr. John Fox,
> 
> The solution provided by you has worked. I downloaded the binary file of
> the data.table package made available on your website
> <https://socialsciences.mcmaster.ca/jfox/.Pickup/data.table_1.10.4-3.zip
> <https://socialsciences.mcmaster.ca/jfox/.Pickup/data.table_1.10.4-3.zip> >,
> installed it in R and RStudio. And I am happy to report that it has successfully
> worked. Many thanks to you, and to the other members of the R team who have tried
> to help me.
> 
> 
> With regards,
> 
> 
> Dr. A.K. Singh
> 
> On Thu, Apr 26, 2018 at 8:03 PM, Fox, John  <mailto:j...@mcmaster.ca> > wrote:
> 
> 
>   Dear A.K. Singh,
> 
>   As you discovered, the data.table package has an error under R 3.5.0
> that prevents CRAN from distributing a Windows binary for the package. The
> reason that you weren't able to install the package from source is apparently
> that you haven't installed the R package-building tools for Windows. See
> <https://cran.r-project.org/bin/windows/Rtools/>.
> 
>   Because a number of users of my Rcmdr and car packages have
> contacted me with a similar issue, as a temporary work-around I've placed a
> Windows binary for the data.table package on my website at
> <https://socialsciences.mcmaster.ca/jfox/.Pickup/data.table_1.10.4-3.zip>.
> You should be able to install the package from there via the command
> 
> 
> install.packages("https://socialsciences.mcmaster.ca/jfox/.Pickup/data.table_1.10.4-3.zip",
> repos=NULL, type="win.binary")
> 
>   I expect that this problem will go away when the maintainer of the
> data.table package fixes the error.
> 
>   I hope this helps,
>John
> 
>   --
>   John Fox, Professor Emeritus
>   McMaster University
>   Hamilton, Ontario, Canada
>   Web: socialsciences.mcmaster.ca/jfox/
> <http://socialsciences.mcmaster.ca/jfox/>
> 
> 
> 
> 
>   > -Original Message-
>   > From: R-help [mailto:r-help-boun...@r-project.org <mailto:r-help-
> boun...@r-project.org> ] On Behalf Of Akhilesh
>   > Singh
>   > Sent: Thursday, April 26, 2018 8:08 AM
>   > To: r-help mailing list mailto:r-help@r-
> project.org> >
>   > Subject: [R] Package 'data.table' in version R-3.5.0 not successfully
>   > being installed
>   >
>   > Dear Sir,
>   >
>   > I am using R on Windows OS platform. I upgraded my R-system to
> version
>   > R-3.5.0. While upgrading my libraries in R as well as in RStudio, I am
>   > stuck up in the package 'data.table', which is required by many
> other
>   > packages in R-codes in my R-Markdown files.
>   >
>   > I tried to install 'data.table' from "USA-berkely" and "UK-bristol"  
> as
>   > well as "RStudio" mirrors when the following errors are being
> shown:
>   >
>   > From  USA-berkely and UK-bristol mirrors:
>   > =
>   > Package which is only available in source form, and may need
>   >   compilation of C/C++/Fortran: ‘data.table’
>   >   These will not be installed
>   > Warning message:
>   > In download.file(url, destfile = f, quiet = TRUE) :
>   >   InternetOpenUrl failed: ''
>   >
>   > From RStudio mirror:
>   > 
>   > Package which is only available in source form, and may need
> compilation
>   > of
>   >   C/C++/Fortran: ‘data.table’
>   >   These will not be installed
>   >
>   > Afterwards, I consulted google users, I downloaded the source
> package:
>   > "data.table_1.10.4-3.tar.gz" f

Re: [R] Package 'data.table' in version R-3.5.0 not successfully being installed

2018-04-27 Thread Fox, John
Dear Peter,

> -Original Message-
> From: peter dalgaard [mailto:pda...@gmail.com]
> Sent: Friday, April 27, 2018 8:47 AM
> To: Dénes Tóth 
> Cc: Akhilesh Singh ; r-help mailing list  h...@r-project.org>; Fox, John 
> Subject: Re: [R] Package 'data.table' in version R-3.5.0 not successfully 
> being
> installed
> 
> Hmm, looks like that thread has more noise than signal...
> 
> AFAICT, data.table currently fails its self-tests in 3.5.0 on all platforms on the
> CRAN
> builders, so RTools issues are only incidental and it would be better to fix
> data.table in the sources.

Yes. Using Rtools to install the package from source or installing the binary I 
provided is just a band-aid that allows packages with dependencies on 
data.table (direct or indirect) to load.

Best,
 John

> 
> From the looks of it, I wouldn't be surprised if the root cause is the 
> changes to
> POSIXlt methods in 3.5.0, but I haven't actually been digging in to check 
> that.
> 
> - Peter D.
> 
> > On 26 Apr 2018, at 23:32 , Dénes Tóth  wrote:
> >
> > You might find this discussion useful, too:
> > https://github.com/Rdatatable/data.table/issues/2797
> >
> >
> > On 04/26/2018 11:01 PM, Henrik Bengtsson wrote:
> >> If you're installing packages to the default location in your home
> >> account and you didn't remove those library folders, you still have
> >> you R 3.4 package installs there, e.g.
> >>> dir(dirname(.libPaths()[1]), full.names = TRUE)
> >> [1] "/home/hb/R/x86_64-pc-linux-gnu-library/3.4"
> >> [2] "/home/hb/R/x86_64-pc-linux-gnu-library/3.5"
> >> [3] "/home/hb/R/x86_64-pc-linux-gnu-library/3.6"
> >> /Henrik
> >> On Thu, Apr 26, 2018 at 11:41 AM, Akhilesh Singh
> >>  wrote:
> >>> You are right. I do take backups. But, this time I was too sure that
> >>> nothing will go wrong. But, this was over-confidence. I need to take
> >>> more care in future. Thanks anyway.
> >>>
> >>> With regards,
> >>>
> >>> Dr. A.K. Singh
> >>>
> >>> On Thu 26 Apr, 2018, 11:49 PM Duncan Murdoch,
> >>> 
> >>> wrote:
> >>>
> >>>> On 26/04/2018 1:54 PM, Akhilesh Singh wrote:
> >>>>> My thanks to Dr. John Fox and Dr. Duncan Murdoch. But, I have
> >>>>> upgraded all my R-3.4.3 libraries to R-3.5.0, and I have not
> >>>>> backed-up copies of old version. So, I would give a try each to
> >>>>> the solutions suggested by John Fox and Dengan Murdoch.
> >>>>
> >>>> Here is some unsolicited advice:  I would strongly recommend that
> >>>> you make it a higher priority to have backups available.  In my
> >>>> experience computer hardware is becoming quite reliable, but
> >>>> software isn't, and the person next to the keyboard isn't either.
> >>>> (My last desperate need for a backup was due to a hardware failure
> >>>> 2 years ago, but it wasn't the manufacturer's fault:  my laptop
> >>>> accidentally drowned.)
> >>>>
> >>>> Backups can save you a lot of grief in the event of a mistake, or a
> >>>> software or hardware failure.  But even in the case of routine
> >>>> events like software updates that don't go as planned, they can save
> time.
> >>>>
> >>>> Duncan Murdoch
> >>>>
> >>>>
> >>>>>
> >>>>> With regards,
> >>>>>
> >>>>> Dr. A.K. Singh
> >>>>>
> >>>>> On Thu 26 Apr, 2018, 9:44 PM Duncan Murdoch,
> >>>>> mailto:murdoch.dun...@gmail.com>>
> wrote:
> >>>>>
> >>>>> On 26/04/2018 10:33 AM, Fox, John wrote:
> >>>>>  > Dear A.K. Singh,
> >>>>>  >
> >>>>>  > As you discovered, the data.table package has an error under R
> >>>>> 3.5.0 that prevents CRAN from distributing a Windows binary for the
> >>>>> package. The reason that you weren't able to install the package
> >>>>> from source is apparently that you haven't installed the R
> >>>>> package-building tools for Windows. See
> >>>>> <https://cran.r-project.org/bin/windows/Rtools/>.
> >>>>>  >
> >>>>>  > Becaus

Re: [R] Package 'data.table' in version R-3.5.0 not successfully being installed

2018-04-26 Thread Fox, John
Dear Duncan,

I think that your advice to downgrade may make sense if A. K. Singh actually 
needs to use data.table. In the case of the car package, for example, which 
depends on data.table indirectly via the rio package, data.table never gets 
used. As well, the examples and vignettes in data.table appear to work under R 
3.5.0, so it's possible that (much of) the functionality of the package is 
intact.

Best,
 John

> -Original Message-
> From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
> Sent: Thursday, April 26, 2018 12:14 PM
> To: Fox, John ; Akhilesh Singh
> 
> Cc: r-help mailing list 
> Subject: Re: [R] Package 'data.table' in version R-3.5.0 not
> successfully being installed
> 
> On 26/04/2018 10:33 AM, Fox, John wrote:
> > Dear A.K. Singh,
> >
> > As you discovered, the data.table package has an error under R 3.5.0
> that prevents CRAN from distributing a Windows binary for the package.
> The reason that you weren't able to install the package from source is
> apparently that you haven't installed the R package-building tools for
> Windows. See <https://cran.r-project.org/bin/windows/Rtools/>.
> >
> > Because a number of users of my Rcmdr and car packages have contacted
> > me with a similar issue, as a temporary work-around I've placed a
> > Windows binary for the data.table package on my website at
> > <https://socialsciences.mcmaster.ca/jfox/.Pickup/data.table_1.10.4-3.z
> > ip>. You should be able to install the package from there via the
> > command
> >
> >
> > install.packages("https://socialsciences.mcmaster.ca/jfox/.Pickup/data
> > .table_1.10.4-3.zip", repos=NULL, type="win.binary")
> >
> > I expect that this problem will go away when the maintainer of the
> data.table package fixes the error.
> 
> You can see the errors in the package on this web page:
> 
> https://cloud.r-project.org/web/checks/check_results_data.table.html
> 
> Currently it is failing self-tests on all platforms except r-oldrel,
> which is the previous release of R.  I'd recommend backing out of R
> 3.5.0 and going to R 3.4.4 if that's a possibility for you.
> 
> Yet another possibility is to use a version of data.table from Github,
> which is newer than the version on CRAN and may have fixed the errors,
> but that would require an installation from source, which not every
> Windows user is comfortable with.
> 
> Duncan Murdoch

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Package 'data.table' in version R-3.5.0 not successfully being installed

2018-04-26 Thread Fox, John
Dear A.K. Singh,

As you discovered, the data.table package has an error under R 3.5.0 that 
prevents CRAN from distributing a Windows binary for the package. The reason 
that you weren't able to install the package from source is apparently that you 
haven't installed the R package-building tools for Windows. See 
<https://cran.r-project.org/bin/windows/Rtools/>.
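
As a quick check that R can actually see the build tools once they are
installed, something along these lines (base R only; just a sketch) can be run:

    Sys.which("make")   # "" means make is not on the PATH, i.e. Rtools isn't visible
    Sys.which("gcc")    # likewise for the C compiler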

Because a number of users of my Rcmdr and car packages have contacted me with a 
similar issue, as a temporary work-around I've placed a Windows binary for the 
data.table package on my website at 
<https://socialsciences.mcmaster.ca/jfox/.Pickup/data.table_1.10.4-3.zip>. You 
should be able to install the package from there via the command


install.packages("https://socialsciences.mcmaster.ca/jfox/.Pickup/data.table_1.10.4-3.zip";,
 repos=NULL, type="win.binary")

I expect that this problem will go away when the maintainer of the data.table 
package fixes the error.

I hope this helps,
 John

--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Akhilesh
> Singh
> Sent: Thursday, April 26, 2018 8:08 AM
> To: r-help mailing list 
> Subject: [R] Package 'data.table' in version R-3.5.0 not successfully
> being installed
> 
> Dear Sir,
> 
> I am using R on Windows OS platform. I upgraded my R-system to version
> R-3.5.0. While upgrading my libraries in R as well as in RStudio, I am
> stuck up in the package 'data.table', which is required by many other
> packages in R-codes in my R-Markdown files.
> 
> I tried to install 'data.table' from "USA-berkely" and "UK-bristol"  as
> well as "RStudio" mirrors when the following errors are being shown:
> 
> From  USA-berkely and UK-bristol mirrors:
> =
> Package which is only available in source form, and may need
>   compilation of C/C++/Fortran: ‘data.table’
>   These will not be installed
> Warning message:
> In download.file(url, destfile = f, quiet = TRUE) :
>   InternetOpenUrl failed: ''
> 
> From RStudio mirror:
> 
> Package which is only available in source form, and may need compilation
> of
>   C/C++/Fortran: ‘data.table’
>   These will not be installed
> 
> Afterwards, I consulted google users, I downloaded the source package:
> "data.table_1.10.4-3.tar.gz" from CRAN, stored it on desktop, and tried
> following command for installing from source only:
> 
> 
> install.packages("C:\\Users\\Dr. A.K.
> Singh\\Desktop\\data.table_1.10.4-3.tar.gz", repos = NULL,
> type="source")
> 
> This generated following errors messages:
> 
> > install.packages("C:\\Users\\Dr. A.K.
> Singh\\Desktop\\data.table_1.10.4-3.tar.gz", repos = NULL,
> type="source") Installing package into ‘C:/Users/Dr. A.K.
> Singh/Documents/R/win-library/3.5’
> (as ‘lib’ is unspecified)
> * installing *source* package 'data.table' ...
> ** package 'data.table' successfully unpacked and MD5 sums checked
> ** libs
> Warning in system(cmd) : 'make' not found
> ERROR: compilation failed for package 'data.table'
> * removing 'C:/Users/Dr. A.K. Singh/Documents/R/win-
> library/3.5/data.table'
> * restoring previous 'C:/Users/Dr. A.K.
> Singh/Documents/R/win-library/3.5/data.table'
> In R CMD INSTALL
> Warning message:
> In install.packages("C:\\Users\\Dr. A.K.
> Singh\\Desktop\\data.table_1.10.4-3.tar.gz",  :
>   installation of package
> ‘C:/Users/DRAK~1.SIN/Desktop/data.table_1.10.4-3.tar.gz’ had non-zero
> exit status
> >
> 
> 
> Your kind help is requested. I am writing a book using R-Markdown,
> and I am stuck as described above.
> 
> Dr. A.K. Singh
> Professor and Head
> Department of Agricultural Statistics
> Indira Gandhi Krishi Vishwavidyalaya
> Raipur
> Chhattisgarh
> India
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] questions about performing Robust multiple regression using bootstrap

2018-02-26 Thread Fox, John
Dear Faiz,

Bootstrapping R^2 using Boot() is straightforward: Simply write a function that 
returns R^2, possibly in a vector with the regression coefficients, and use it 
as the f argument to Boot(). That will get you, e.g., bootstrapped confidence 
intervals for R^2. (Why you want that is another question.) See the example in 
?Boot that shows how to bootstrap the estimated error variance (without the 
regression coefficients).
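
To make that concrete, here is a minimal sketch; the Prestige data set
(supplied by carData, which car loads) and the model are purely illustrative
stand-ins for the poster's own lm() fit:

    library(car)
    reg <- lm(prestige ~ income + education, data = Prestige)   # illustrative model
    coefs.r2 <- function(model) c(coef(model), R2 = summary(model)$r.squared)
    set.seed(123)
    b <- Boot(reg, f = coefs.r2, R = 999)   # bootstrap coefficients and R^2 together
    summary(b)
    confint(b, type = "perc")               # percentile CIs, including one for R^2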

On the other hand, bootstrap hypothesis tests aren't entirely straightforward 
(and you might ask yourself why you need them when you have bootstrap 
confidence intervals). If memory serves, there's a discussion in the Davison  
and Hinkley reference in ?Boot (I don't have a copy of the book at my current 
location, so I can't check). There's also a brief discussion in Sec. 21.4 of my 
Applied Regression Analysis and Generalized Linear Models, 3rd ed. 

I hope this helps,
 John

-
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/


> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of faiz rasool
> Sent: Monday, February 26, 2018 6:30 AM
> To: R-help@r-project.org
> Subject: [R] questions about performing Robust multiple regression using
> bootstrap
> 
> Dear list,
> 
> I am slightly confused about how I  can do the following in R.
> 
> I want to perform robust multiple regression. I've used the Boot
> function in the car package to find confidence intervals and standard errors.
> In addition to these, I want to find robust estimates for the F test and
> R-square. Finally, I would like to know the significance levels of the
> bootstrap results.
> 
> Below I  explain my question  using commented R code.
> 
> [1] reg=lm(a~b+c+d+e) # perform multiple regression.
> [2] library(car) #load the car package.
> [3] bootstrap=Boot(reg) #perform bootstrap using the Boot function in car
> package.
> [4] summary(bootstrap) #show the results of bootstrap.
> [5] Now I would like to type code that can give me robust estimates of
> R-square, F tests, and significance levels for the coefficients and F.
> 
> 
> Thanks for any help.
> 
> Regards,
> Faiz.
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Parallel assignments and goto

2018-02-14 Thread Fox, John
Dear Thomas,

This looks like a really interesting project, and I don't think that anyone 
responded to your message, though I may be mistaken.

I took at a look at implementing parallel assignment, and came up with:

passign <- function(..., envir=parent.frame()){
    exprs <- list(...)
    vars <- names(exprs)
    exprs <- lapply(exprs, FUN=eval, envir=envir)
    for (i in seq_along(exprs)){
        assign(vars[i], exprs[[i]], envir=envir)
    }
}

For example,

> fun <- function(){
+ a <- 10
+ passign(a=1, b=a + 2, c=3)
+ cat("a =", a, " b =", b, " c =", c, "\n")
+ }
> fun()
a = 1  b = 12  c = 3

This proves to be faster than what you tried, but still much slower than using 
a local variable (or variables) -- see below. I wouldn't be surprised if 
someone can come up with a faster implementation, but I suspect that the 
overhead of function calls will be hard to overcome. BTW, a version of my 
passign() that uses mapply() in place of a for loop (not shown) is even slower.
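
A guess at what such an mapply()-based variant might look like (a
reconstruction for illustration, not the code that was actually timed):

    passign_mapply <- function(..., envir = parent.frame()) {
        exprs <- lapply(list(...), FUN = eval, envir = envir)
        # assign each evaluated value back into the caller's environment
        invisible(mapply(assign, names(exprs), exprs,
                         MoreArgs = list(envir = envir)))
    }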

> factorial_tr_3 <- function (n, acc = 1) {
+ repeat {
+ if (n <= 1) 
+ return(acc)
+ else {
+ passign(n = n - 1, acc = acc * n)
+ next
+ }
+ }
+ }

> microbenchmark::microbenchmark(factorial(100),
+ factorial_tr_1(100),
+ factorial_tr_2(100),
+ factorial_tr_3(100))
Unit: microseconds
                expr       min        lq       mean     median        uq       max neval cld
      factorial(100)    55.009    69.290   100.4507   104.5515   131.174   228.496   100 a  
 factorial_tr_1(100)    10.227    11.637    14.4967    13.7530    15.515    89.565   100 a  
 factorial_tr_2(100) 21523.751 23038.417 24477.1734 24058.3635 25041.988 45814.136   100   c
 factorial_tr_3(100)   806.789   861.797   914.3651   879.9565   925.444  2139.329   100  b 

Best,
 John

-
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/




> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Thomas
> Mailund
> Sent: Sunday, February 11, 2018 10:49 AM
> To: r-help@r-project.org
> Subject: [R] Parallel assignments and goto
> 
> Hi guys,
> 
> I am working on some code for automatically translating recursive functions 
> into
> looping functions to implemented tail-recursion optimisations. See
> https://github.com/mailund/tailr
> 
> As a toy-example, consider the factorial function
> 
> factorial <- function(n, acc = 1) {
> if (n <= 1) acc
> else factorial(n - 1, acc * n)
> }
> 
> I can automatically translate this into the loop-version
> 
> factorial_tr_1 <- function (n, acc = 1) {
>     repeat {
>         if (n <= 1)
>             return(acc)
>         else {
>             .tailr_n <- n - 1
>             .tailr_acc <- acc * n
>             n <- .tailr_n
>             acc <- .tailr_acc
>             next
>         }
>     }
> }
> 
> which will run faster and not have problems with recursion depths. However,
> I’m not entirely happy with this version for two reasons: I am not happy with
> introducing the temporary variables and this rewrite will not work if I try to
> over-scope an evaluation context.
> 
> I have two related questions, one related to parallel assignments — i.e.
> expressions to variables so the expression uses the old variable values and 
> not
> the new values until the assignments are all done — and one related to
> restarting a loop from nested loops or from nested expressions in `with`
> expressions or similar.
> 
> I can implement parallel assignment using something like rlang::env_bind:
> 
> factorial_tr_2 <- function (n, acc = 1) {
>     .tailr_env <- rlang::get_env()
>     repeat {
>         if (n <= 1)
>             return(acc)
>         else {
>             rlang::env_bind(.tailr_env, n = n - 1, acc = acc * n)
>             next
>         }
>     }
> }
> 
> This reduces the number of additional variables I need to one, but is a 
> couple of
> orders of magnitude slower than the first version.
> 
> > microbenchmark::microbenchmark(factorial(100),
> +factorial_tr_1(100),
> +factorial_tr_2(100))
> Unit: microseconds
>                 expr      min       lq       mean    median       uq      max neval
>       factorial(100)   53.978   60.543   77.76203   71.0635   85.947  180.251   100
>  factorial_tr_1(100)    9.022    9.903   11.52563   11.0430   11.984   28.464   100
>  factorial_tr_2(100) 5870.565 6109.905 6534.13607 6320.4830 6756.463 8177.635   100
> 
> 
> Is there another way to do parallel assignments that doesn’t cost this much in
> running time?
> 
> My other problem is the use of `next`. I would like to combine tail-recursion
> optimisation with pattern matching as in https://github.com/mailund/pmatch
> where I can, for example, define a linked list like this:
> 
> devtools::install_github("mailund/pmatch")
> library(pmatch)
> llist := N

Re: [R] effects & lme4: error since original data frame not found WAS effects: error when original data frame is missing

2018-01-17 Thread Fox, John
Dear Gerrit,

This issue is discussed in a vignette in the car package (both for functions in 
the car and effects packages): vignette("embedding", package="car"). The 
solution suggested there is essentially the one that you used.
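
A closely related workaround (sketched here as an illustration, not quoted from
the vignette) is to build the call with do.call(), so that the data frame
itself, rather than the name X, is recorded in the fitted model's call:

    library(lme4)
    library(effects)

    foo <- function() {
      X <- sleepstudy   # sleepstudy ships with lme4
      # do.call() evaluates its arguments, so fm$call$data is the data frame
      # itself and Effect() no longer has to find an object named "X".
      fm <- do.call(lmer, list(formula = Reaction ~ Days + (Days | Subject),
                               data = X))
      Effect("Days", fm)
    }
    foo()

The price is that the entire data frame is embedded in the stored call, which
can be unwieldy for large data sets.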

I hope this helps,
 John

-
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/


> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Gerrit
> Eichner
> Sent: Wednesday, January 17, 2018 9:50 AM
> To: r-help@r-project.org
> Subject: Re: [R] effects & lme4: error since original data frame
> notfoundWASeffects: error when original data frame is missing
> 
> Third "hi" in this regard and for the archives:
> 
> I found a (maybe "dirty") workaround which at least does what I need by
> creating a copy of the required data frame in the .GlobalEnv by means of
> assign:
> 
> foo <- function() {
>assign("X", sleepstudy, pos = 1)
>fm <- lmer(Reaction ~ Days + (Days | Subject), data = X)
>Effect("Days", fm)
> }
> 
> 
>   Hth  --  Gerrit
> 
> -
> Dr. Gerrit Eichner   Mathematical Institute, Room 212
> gerrit.eich...@math.uni-giessen.de   Justus-Liebig-University Giessen
> Tel: +49-(0)641-99-32104  Arndtstr. 2, 35392 Giessen, Germany
> Fax: +49-(0)641-99-32109http://www.uni-giessen.de/eichner
> -
> 
> Am 17.01.2018 um 15:02 schrieb Gerrit Eichner:
> > Hi, again,
> >
> > I have to modify my query since my first (too simple) example doesn't
> > reflect my actual problem. Second try:
> >
> > When asking Effect() inside a function to compute an effect of an
> > lmer-fit which uses a data frame local to the body of the function, as
> > in the following example (simplifying my actual application), I get
> > the "Error in is.data.frame(data) :
> > object 'X' not found":
> >
> >  > foo <- function() {
> > +  X <- sleepstudy
> > +  fm <- lmer(Reaction ~ Days + (Days | Subject), data = X)
> > +  Effect("Days", fm)
> > + }
> >
> >  > foo()
> >
> > Error in is.data.frame(data) : object 'X' not found
> >
> >
> > With lm-objects there is no problem:
> >
> >  > foo2 <- function() {
> > +   X <- sleepstudy
> > +   fm <- lm(Reaction ~ Days, data = X)
> > +   Effect("Days", fm)
> > + }
> >
> >  > foo2()
> >
> > 
> >
> > Any idea how to work around this problem?
> > Once again, thx in advance!
> >
> >   Regards  --  Gerrit
> >
> > PS: > sessionInfo()
> > R version 3.4.2 (2017-09-28)
> > Platform: x86_64-w64-mingw32/x64 (64-bit)
> > Running under: Windows >= 8 x64 (build 9200)
> >
> > Matrix products: default
> >
> > locale:
> > [1] LC_COLLATE=German_Germany.1252  LC_CTYPE=German_Germany.1252
> > [3] LC_MONETARY=German_Germany.1252 LC_NUMERIC=C
> > [5] LC_TIME=German_Germany.1252
> >
> > attached base packages:
> > [1] stats     graphics  grDevices utils     datasets  methods   base
> >
> > other attached packages:
> > [1] effects_4.0-0   carData_3.0-0   lme4_1.1-14     Matrix_1.2-11   car_2.1-5
> > [6] lattice_0.20-35
> >
> > loaded via a namespace (and not attached):
> >  [1] Rcpp_0.12.13       MASS_7.3-47        grid_3.4.2         MatrixModels_0.4-1
> >  [5] nlme_3.1-131       survey_3.32-1      SparseM_1.77       minqa_1.2.4
> >  [9] nloptr_1.0.4       splines_3.4.2      tools_3.4.2        survival_2.41-3
> > [13] pbkrtest_0.4-7     yaml_2.1.14        parallel_3.4.2     compiler_3.4.2
> > [17] colorspace_1.3-2   mgcv_1.8-22        nnet_7.3-12        quantreg_5.33
> >
> > -
> > Dr. Gerrit Eichner   Mathematical Institute, Room 212
> > gerrit.eich...@math.uni-giessen.de   Justus-Liebig-University Giessen
> > Tel: +49-(0)641-99-32104  Arndtstr. 2, 35392 Giessen, Germany
> > Fax: +49-(0)641-99-32109    http://www.uni-giessen.de/eichner
> > -
> >
> > Am 17.01.2018 um 10:55 schrieb Gerrit Eichner:
> >> Hello, everyody,
> >>
> >> when asking, e.g., Effect() to compute the effects of a fitted, e.g.,
> >> linear model after having deleted the data frame from the workspace
> >> for which the model was obtained an error is reported:
> >>
> >>  > myair <- airquality
> >>  > fm <- lm(Ozone ~ Temp, data = myair)
> >>  > rm(myair)
> >>  > Effect("Temp", fm)
> >> Error in eval(model$call$data, envir) : object 'myair' not found
> >>
> >> Has anybody a better "workaround" for this than, e.g., explicitly
> >> saving the fitted model object fm together with its original
> >> environment or just the data needed frame (maybe in a list like
> >> fm.plus.origdata <- list(fm, myair = myair)) to be able to restore
> >> the original environemt (or at least the needed opriginal data
> >> frame) of the time when fm was created?
> >>
> >> Thx for any hint!
> >>
> >>   Regards  --  Gerrit

Re: [R] About levene.test

2018-01-15 Thread Fox, John
Dear Mariano,

See the function leveneTest() in the car package.
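
A minimal illustration, using a data set that ships with base R rather than the
poster's data:

    library(car)                                    # leveneTest() is in car
    leveneTest(count ~ spray, data = InsectSprays)  # centres on the median by default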

I hope that this helps,
 John

-
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Marcelo
> Mariano Silva
> Sent: Monday, January 15, 2018 12:49 PM
> To: r-help@r-project.org
> Subject: [R] About levene.test
> 
> Hi,
> 
> What package(s) must I install  so that I can apply the Levene' test in my 
> data?
> 
> I tried 'lawstat' but dependency ‘VGAM’ is not available for this package.
> 
> 
> I am using Rstudio Version 1.1.383
> 
> 
> Tks
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SpreadLevelPlot for more than one factor

2018-01-14 Thread Fox, John
Dear Ashim,

I’ll address your questions briefly but they’re really not appropriate for
this list, which is for questions about using R, not general statistical
questions. 

(1) The relevant distribution is within cells of the wool x tension
cross-classification because it’s the deviations from the cell means that
are supposed to be normally distributed with equal variance. In the
warpbreaks data there are only 9 cases per cell. If you examine all of
these deviations simultaneously, that’s equivalent to examining the
residuals from the two-way ANOVA model fit to the data.
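
In code, a minimal sketch of that point with the warpbreaks data already used
in this thread:

    fit <- aov(breaks ~ wool * tension, data = warpbreaks)
    # residuals = deviations from the wool x tension cell means, which are the
    # quantities whose equal spread and normality are actually in question
    plot(fitted(fit), residuals(fit))
    qqnorm(residuals(fit)); qqline(residuals(fit))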

(2) Yes, (d) and (e) visualize simple effects, and (a) and (b) visualize
main effects, the latter only because the data are balanced.

Best,
 John

-
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: http://socserv.mcmaster.ca/jfox/




On 2018-01-09, 10:18 AM, "Ashim Kapoor"  wrote:

>Dear Sir,
>
>
>Many thanks for your reply.
>
>
>I have a query.
>
>
>
>I have a whole set of distributions which should be made normal /
>homoscedastic. Take for instance the warpbreaks data set.
>
>
>
>We have the following boxplots for the warpbreaks dataset:
>
>
>a. boxplot(breaks ~ wool)
>
>b. boxplot(breaks ~ tension)
>
>c. boxplot(breaks ~ interaction(wool,tension))
>d. boxplot(breaks ~ wool @ each level of tension)
>e. boxplot(breaks ~ tension @ each level of wool)
>
>
>Now should we not be making a-e normal and homoscedastic? Should we not
>make a giant collection of boxplots from a-e and use the SpreadLevelPlot
>on this entire collection?
>
>
>A second query : (d) and (e) are the distribution of the simple effects
>of factor wool and tension @ each level of the other. Is that correct?
>Are (a) and (b) the distribution of the main effect of wool and tension?
>Please confirm.
>
>
>
>Best Regards,
>Ashim
>
>
>
>
>
>
>
>
>
>On Sun, Jan 7, 2018 at 8:05 PM, Fox, John
> wrote:
>
>Dear Ashim,
>
>Try spreadLevelPlot(breaks ~ interaction(tension, wool), data=warpbreaks)
>.
>
>I hope this helps,
> John
>
>-
>John Fox, Professor Emeritus
>McMaster University
>Hamilton, Ontario, Canada
>Web: socialsciences.mcmaster.ca/jfox/
>
>
>
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ashim
>> Kapoor
>> Sent: Sunday, January 7, 2018 12:08 AM
>> To: r-help@r-project.org
>> Subject: [R] SpreadLevelPlot for more than one factor
>>
>> Dear All,
>>
>> I want a transformation which will make the spread of the response at
>>all
>> combinations of  2 factors the same.
>>
>> See for example :
>>
>> boxplot(breaks ~ tension * wool, warpbreaks)
>>
>> The closest I  can do is :
>>
>> spreadLevelPlot(breaks ~tension , warpbreaks) spreadLevelPlot(breaks ~
>>wool ,
>> warpbreaks)
>>
>> I want to do :
>>
>> spreadLevelPlot(breaks ~tension * wool, warpbreaks)
>>
>> But I get :
>>
>> > spreadLevelPlot(breaks ~tension * wool , warpbreaks)
>> Error in spreadLevelPlot.formula(breaks ~ tension * wool, warpbreaks) :
>>   right-hand side of model has more than one variable
>>
>> What is the corresponding appropriate function for 2 factors ?
>>
>> Many thanks,
>> Ashim
>>
>>   [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>
>
>
>

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] SpreadLevelPlot for more than one factor

2018-01-07 Thread Fox, John
Dear Ashim,

Try spreadLevelPlot(breaks ~ interaction(tension, wool), data=warpbreaks) .

I hope this helps,
 John

-
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socialsciences.mcmaster.ca/jfox/



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Ashim
> Kapoor
> Sent: Sunday, January 7, 2018 12:08 AM
> To: r-help@r-project.org
> Subject: [R] SpreadLevelPlot for more than one factor
> 
> Dear All,
> 
> I want a transformation which will make the spread of the response at  all
> combinations of  2 factors the same.
> 
> See for example :
> 
> boxplot(breaks ~ tension * wool, warpbreaks)
> 
> The closest I  can do is :
> 
> spreadLevelPlot(breaks ~tension , warpbreaks) spreadLevelPlot(breaks ~ wool ,
> warpbreaks)
> 
> I want to do :
> 
> spreadLevelPlot(breaks ~tension * wool, warpbreaks)
> 
> But I get :
> 
> > spreadLevelPlot(breaks ~tension * wool , warpbreaks)
> Error in spreadLevelPlot.formula(breaks ~ tension * wool, warpbreaks) :
>   right-hand side of model has more than one variable
> 
> What is the corresponding appropriate function for 2 factors ?
> 
> Many thanks,
> Ashim
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Primer for working with survey data in R

2017-11-11 Thread Fox, John
Dear Kevin,

In addition to the advice you've received, take a look at the survey package. 
It's not quite what you're asking for, but in fact it's probably more useful, 
in that it provides correct statistical inference for data collected in complex 
surveys. The package is described in an article,  T. Lumley (2004), Analysis of 
complex survey samples, Journal of Statistical Software 9(1): 1-19, and a book, 
T. Lumley, Complex Surveys: A Guide to Analysis Using R, Wiley, 2010, both by 
the package author.
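
To give a flavour of the package, here is a minimal sketch along the lines of
the standard example in its documentation, using the api data that ships with
survey:

    library(survey)
    data(api)   # loads apiclus1 and related example data frames

    # one-stage cluster sample: clusters identified by dnum, sampling weights pw
    dclus1 <- svydesign(id = ~dnum, weights = ~pw, fpc = ~fpc, data = apiclus1)

    svymean(~api00, dclus1)    # design-correct mean and standard error
    svytotal(~enroll, dclus1)  # design-correct total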

I hope that this helps,
 John

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Kevin Taylor
> Sent: Saturday, November 11, 2017 2:57 PM
> To: r-help@r-project.org
> Subject: [R] Primer for working with survey data in R
> 
> I am taking a behavioral stats graduate class and the instructor is using 
> SPSS.
> I'm trying to follow along in R.
> 
> Recently in class we started working with scales and survey data, computing
> Cronbach's Alpha, reversing values for reverse coded items, etc.
> 
> Also, SPSS has some built in functionality for entering the meta-data for your
> survey, e.g. the possible values for items, the text of the question, etc.
> 
> I haven't been able to find any survey guidance for R other than how to run 
> the
> actual calculations (Cronbach's, reversing values).
> 
> Are there tutorials, books, or other primers, that would guide a newbie step 
> by
> step through using R for working with survey data? It would be helpful to see
> how others are doing these things. (Not just how to run the mathematical
> operations but how to work with and manage the data.) Possibly this would be
> in conjunction with some packages such as Likert or Scales.
> 
> TIA.
> 
> --Kevin
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error to run this package

2017-10-31 Thread Fox, John
Dear John and Anima,

I didn't reply earlier because other people got to it before I did and because, 
given the lack of information in the original post, there wasn't anything to 
add.

The car package shouldn't require anything near 2.5 Gb to load. Here's what I 
get under Windows 10 with R 3.4.2:

> memory.size()
[1] 62.08
> library(car)
> memory.size()
[1] 166.09

The units are MB (the ?memory.size help page says "Mb" but I believe that's 
usually for "megabits" and that it isn't what's intended).

Some more information, as has been suggested by other posters, might help.

Best,
 John

-
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socserv.mcmaster.ca/jfox



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of John Kane
> via R-help
> Sent: Tuesday, October 31, 2017 10:46 AM
> To: r-help@r-project.org; Anima Pramanik 
> Subject: Re: [R] error to run this package
> 
> Since we don't know what you were doing when this happened it is a bit
> difficult to guess.
> Please supply a minimal set of code that demonstrates what you were doing
> that gives this error.
> The output of sessionInfo() would also be useful.
> Have a look at http://stackoverflow.com/questions/5963269/how-to-make-a-
> great-r-reproducible-example
> or
> the Reproducibility chapter of Advanced R, for some idea of the type of
> information that would be useful.
> The simplest solution might be to make sure that you have enough memory
> available since the error says, "cannot allocate memory block of size 2.5 Gb".
> 
> 
> On Tuesday, October 31, 2017, 9:07:04 AM EDT, Anima Pramanik
>  wrote:
> 
>  Error: package or namespace load failed for ‘car’ in get(Info[i, 1], envir
> = env):
>  cannot allocate memory block of size 2.5 Gb
> 
> 
> please help me to get a solution of this problem
> 
>     [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] bowed linear approximations

2017-09-26 Thread Fox, John
Dear Rich,

I think that it's generally a bad idea to give statistical (as opposed to 
simply technical) advice by email without knowing the context of the research. 
I think that you'd do well to seek help from a statistician, and not just do 
what I suggest below.

Interpolating the data only makes sense if there's no random component to the 
response (mag in your data). Otherwise, it makes more sense to get 
"predictions" from a statistical model that has an explicit error component for 
the response. In your case, a simple quadratic model in log(freq) seems to fit 
the data reasonably well. 

To see what I mean, try

plot(log(freq), mag)
mod <- lm(mag ~ poly(log(freq), 2))
summary(mod)
points(log(freq), fitted(mod), pch=16)
lines(spline(log(freq), fitted(mod)))

Some basic regression diagnostics suggest that we can do better by taking the 
log of mag as well, producing a closer fit to the data and stabilizing the 
error variance:

plot(log(freq), log(mag))
mod2 <- lm(log(mag) ~ poly(log(freq), 2))
summary(mod2)
points(log(freq), fitted(mod2), pch=16)
lines(spline(log(freq), fitted(mod2)))

I have no idea whether this makes substantive sense in the context of your 
problem.

Best,
 John

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Evans,
> Richard K. (GRC-H000)
> Sent: Tuesday, September 26, 2017 10:01 AM
> To: Eric Berger ; Fox, John 
> Cc: r-help@r-project.org
> Subject: Re: [R] bowed linear approximations
> 
> My apologies for the typos in the code.
> Here is a corrected version you can copy/paste in R to see the issue.
> 
> freq <- c(2, 3, 5, 10, 50, 100, 200, 300, 500, 750, 1000, 1300, 1800, 2450, 
> 2900,
> 3000, 4000, 5000, 6000, 7000, 8200, 9300, 1, 11000, 18000, 26500, 33000,
> 4); mag <- c(1.9893038, 1.5088071, 1.1851947, 0.983, 0.7680123,
> 0.7458169, 0.7069638, 0.6393066, 0.6261539, 0.6263381, 0.7053774,
> 0.6900626, 0.6953527, 0.7843036, 0.9056359, 0.8867276, 0.8937421,
> 0.9492288, 0.9629118, 1.1972268, 1.0010515, 0.9945838, 1.0564356,
> 0.873, 1.167, 1.537, 1.467, 1.317);
> plot(freq, mag, type="b", log="x")
> for (i in 1:200) {
>   xx <- exp(runif(1, log(min(freq)), log(max(freq))))
>   yy <- approx(freq, mag, xout = xx, method = "linear")
>   points(xx, yy$y, col = rgb(1, 0, 0))
> }
> 
> For completeness, I have been puzzling over why the approximated points
> don't lie linearly over the original data set (especially prominent  in the 
> bow
> between freq=10 and 50). Once I realized (and concurred with) why this bow
> exists, I have been struggling with how to make these approximations as
> expected.. In my original post, I think I oversimplified it too much by 
> implying
> that my application was just 2 data points.
> 
> Are your suggestions still valid do you think?
> -Rich
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] bowed linear approximations

2017-09-25 Thread Fox, John
Dear Rich,

Assuming that I understand what you want to do, try adding the following to 
your script (which, by the way, is more complicated that it needs to be):

xx <- 10:50
m <- lm(y ~ x)
yy <- predict(m, data.frame(x=xx))
lines(spline(xx, yy), col="blue")

m <- lm(y ~ log(x))
yy <- predict(m, data.frame(x=xx))
points(xx, yy, col="magenta")

The first set of commands adds a line corresponding to the points that you 
plotted, which if I understand right, is *not* what you want. The second set of 
commands shows how to find points along the diagonal straight line that you 
plotted, given their x-values, which is what I think you want.

If you examine the linear models fit, you'll see that they just interpolate 
between the two points, albeit differently.

I hope this helps,
 John

-
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socserv.mcmaster.ca/jfox




> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Evans,
> Richard K. (GRC-H000)
> Sent: Monday, September 25, 2017 3:28 PM
> To: r-help@r-project.org
> Subject: [R] bowed linear approximations
> 
> Hello,
> 
> Please run the following code snippet and note the resulting plot:
> 
> x <- c(10, 50)
> y <- c(0.983, 0.7680123)
> plot(x,y,type="b",log="x")
> for(i in 1:50){
>  xx <- exp(runif(1,log(min(x)),log(max(x)) ))
>  yy <- approx(x,y,xout=xx, method = "linear")
> "linear")
> points(xx,yy$y)
> }
> 
> notice the "log=x" plot parameter and the resulting "bow" in the linear
> approximation.
> 
> This makes sense when I realized that the plot command is first making the
> plot and then drawing straight lines between points on a log plot AFTER the
> plot is generated and that that's why the line is straight. I get that.
> .. and it also makes sense that the bowed points are a result of the linear
> approximations being made BEFORE plotted in a logarithmic plot. I get that..
> 
> My goal is to make approximations that lie on the line produced on the plot as
> shown, so I realize that what I want to do is NOT linear approximations, but
> maybe "log" approximations?
> However, the approximation methods are only "linear" and "constant" .. there
> isn't a "log" method to approximate with.
> 
> So can anyone tell me how to fix the code such that he approximated points
> DO lie on the line as plotted with the "log=x" plot parameter?
> Oh, and they have to be uniformly distributed along the Log=x axis.. I think
> that's the tricky part.
> 
> Any help and/or insight would be greatly appreciated.
> 
> Thank you!
> -Rich
> 
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Loading Rcmdr

2017-07-25 Thread Fox, John
Dear Jack,

There's not enough information here to know what the problem might be. Please 
see < https://www.r-project.org/help.html>, in particular the section on asking 
for help, and follow the link to the posting guide.

At a minimum, explain what you did and what happened. It's generally 
informative to provide the results of Sys.info() and sessionInfo() .

Best,
 John

-
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
Web: socserv.mcmaster.ca/jfox



> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Jack Talley
> Sent: Monday, July 24, 2017 7:17 PM
> To: r-help@r-project.org
> Subject: [R] Loading Rcmdr
> 
> With the latest version of R (3.4.1) I have not been able to load Rcmdr.
> Advice please.
> 
> Thank you,
> 
> Jack Talley, PhD
> 
>   [[alternative HTML version deleted]]
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-
> guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

