Re: [R] differing behavior of mean(), median() and sd() with na.rm

2018-08-22 Thread Ivan Calandra
Thanks all for the enlightenment. So, it does make sense that mean() produces NaN and median()/sd() NA, from a calculation point of view at least. But I still think it also makes sense that the mean of NA is NA as well, be it only for consistency with other functions. That's just my opinion of

Re: [R] graphing repeated curves

2018-08-22 Thread Richard Sherman
These are great, thanks. I always forget about paste().
===
Richard Sherman
rss@gmail.com
> On Aug 22, 2018, at 17:56, Fox, John wrote:
>
> fm <- vector("character",6)
> fm[1]<- "mpg ~ hp"
> for(i in 2:6)fm[i]<- paste0(fm[i-1]," + I(hp^", i,")")
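
As a minimal sketch of how the quoted loop could be used, the formula strings can be converted with as.formula() and passed to lm(). The names `fits` and `n_coef` are illustrative and do not come from the thread; mtcars ships with R.

```r
# Build the six nested polynomial formulas as strings, as in the quoted loop.
fm <- vector("character", 6)
fm[1] <- "mpg ~ hp"
for (i in 2:6) fm[i] <- paste0(fm[i - 1], " + I(hp^", i, ")")

# Fit one model per formula; as.formula() turns a string into a formula.
# 'fits' is a hypothetical name, not part of the original post.
fits <- lapply(fm, function(f) lm(as.formula(f), data = mtcars))

# Each model has an intercept plus one coefficient per power of hp.
# High raw powers of hp are badly scaled, so some coefficients may be NA.
n_coef <- sapply(fits, function(m) length(coef(m)))
print(fm)
print(n_coef)
```

Raw powers of hp get very large, which is one reason the poly() approach suggested elsewhere in the thread is likely to be numerically safer.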

Re: [R] graphing repeated curves

2018-08-22 Thread Fox, John
Dear Bert, > -Original Message- > From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter > Sent: Wednesday, August 22, 2018 8:38 PM > To: Jim Lemon > Cc: rss@gmail.com; R-help > Subject: Re: [R] graphing repeated curves > > I do not think this does what the OP w

Re: [R] graphing repeated curves

2018-08-22 Thread Bert Gunter
I do not think this does what the OP wants -- it does not produce polynomials of the form desired. John Fox's solution using poly() seems to me to be the right approach, but I will show what I think is a considerably simpler way to build up the polynomial expressions just as an example of one way

Re: [R] graphing repeated curves

2018-08-22 Thread Fox, John
Dear Richard,
How about this:
ord <- order(mtcars$hp)
mtcars$hp <- mtcars$hp[ord]
mtcars$mpg <- mtcars$mpg[ord]
plot(mpg ~ hp, data=mtcars)
for (p in 1:6){
    m <- lm(mpg ~ poly(hp, p), data=mtcars)
    lines(mtcars$hp, fitted(m), lty=p, col=p)
}
legend("topright", legend=1:6, lty=1:6, col=1:6,

Re: [R] graphing repeated curves

2018-08-22 Thread Jim Lemon
Hi Richard,
This may be what you want:
data(mtcars)
m<-list()
for(i in 1:6) {
 rhterms<-paste(paste0("I(hp^",1:i,")"),sep="+")
 lmexp<-paste0("lm(mpg~",rhterms,",mtcars)")
 cat(lmexp,"\n")
 m[[i]]<-eval(parse(text=lmexp))
}
plot(mpg~hp,mtcars,type="n")
for(i in 1:6) abline(m[[i]],col=i)
Jim
On

[R] graphing repeated curves

2018-08-22 Thread Richard Sherman
Hi all, I have a simple graphing question that is not really a graphing question, but a question about repeating a task. I’m fiddling with some of McElreath’s Statistical Rethinking, and there’s a graph illustrating extreme overfitting (a number of polynomial terms in x equal to the number of

Re: [R] R package downloading

2018-08-22 Thread Jeff Newmiller
The Bioconductor project does things their own way. Please use their support channels for help with their packages. https://www.bioconductor.org/help/ On August 22, 2018 2:27:47 PM PDT, Spencer Brackett wrote: >Hello all, > >Once the R package TCGAbiolinks or biocLite(“TCGAbiolinks”) is don

Re: [R] lattice barchart() with two variables

2018-08-22 Thread Rich Shepard
On Wed, 22 Aug 2018, Bert Gunter wrote: groups = Summary.Type, ... in your call will then do the job. As an aside, this is a good example of why you should adhere to this format for data analysis in R. Bert, Progress and retreat. I'm putting this aside for a day or so because I need t

[R] R package downloading

2018-08-22 Thread Spencer Brackett
Hello all, Once the R package TCGAbiolinks or biocLite("TCGAbiolinks") is done unpacking, how do I go about applying any particular analysis with it? Currently, I am unable to access the console and edit any further. Would I therefore have to open up a new script? Spencer Brackett [[a

Re: [R] lattice barchart() with two variables

2018-08-22 Thread Rich Shepard
On Wed, 22 Aug 2018, Bert Gunter wrote: (I know that you said your post may already be "out of date", but ...) Bert, Still reading ?xyplot/?barchart. But ?barchart says: "Formally, if groups is specified, then groups along with subscripts is passed to the panel function, ..." which, as I

Re: [R] lattice barchart() with two variables

2018-08-22 Thread Bert Gunter
(I know that you said your post may already be "out of date", but ...) " Despite additional reading of barchart() examples and help pages I'm still missing how to get grouping working and use the years in the dataframe as labels on the x-axis." But ?barchart says: "Formally, if groups is specif
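
A minimal sketch of the grouping idea, assuming the data are first reshaped to long format. Only the first row of values (1989, 91.17, 93.32) and the column names Year, Med, Max and Summary.Type appear in the thread; the other values and the names `long`, `Height` and `p` are made up for illustration.

```r
library(lattice)

# Hypothetical stand-in for the poster's data frame; only the 1989 row
# matches values shown in the thread, the rest are invented.
stage_heights <- data.frame(
  Year = 1989:1991,
  Med  = c(91.17, 91.00, 90.50),
  Max  = c(93.32, 93.00, 92.50)
)

# Long format: one row per (year, summary type) pair.
long <- data.frame(
  Year         = factor(rep(stage_heights$Year, times = 2)),
  Summary.Type = rep(c("Med", "Max"), each = nrow(stage_heights)),
  Height       = c(stage_heights$Med, stage_heights$Max)
)

# Unquoted formula; groups= draws one bar per summary type within each year.
p <- barchart(Height ~ Year, data = long, groups = Summary.Type,
              auto.key = TRUE)
print(p)
```

Making Year a factor is what should put the actual years, rather than 1 to n, on the x-axis.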

Re: [R] lattice barchart() with two variables

2018-08-22 Thread Rich Shepard
On Wed, 22 Aug 2018, Rich Shepard wrote: Correcting the barchart() command fixed the main issue; getting the second set of bars is still eluding me, but I'll continue working on fixing this. I'll get the years as the x-axis labels rather than year number in sequence from 1 to 29. Despite add

Re: [R] lattice barchart() with two variables

2018-08-22 Thread Rich Shepard
On Wed, 22 Aug 2018, Bert Gunter wrote: See inline. Bert, Will do. Sent a reply before seeing this. More to follow. Thanks, Rich

Re: [R] Monte Carlo on simple regression

2018-08-22 Thread Ogbos Okike
Dear Eric, This is great!! Many thanks for resolving the problem. Ogbos On Wed, Aug 22, 2018 at 5:44 PM Eric Berger wrote: > Hi Ogbos, > I took a closer look at your code. > Here's a modified version (using dummy data) that seems to do what you > want. > Hopefully this will make it clear what

Re: [R] lattice barchart() with two variables

2018-08-22 Thread Bert Gunter
See inline. -- Bert On Wed, Aug 22, 2018 at 9:17 AM Rich Shepard wrote: > On Wed, 22 Aug 2018, Bert Gunter wrote: > > > No reproducible example (see posting guide below) so minimal help. > > Hi Bert, > >I thought the header and six data rows of the dataframe plus the syntax > of > the com

Re: [R] Monte Carlo on simple regression

2018-08-22 Thread Eric Berger
Hi Ogbos,
I took a closer look at your code. Here's a modified version (using dummy data) that seems to do what you want. Hopefully this will make it clear what you need to do.
nn <- 100
lDf <- data.frame(Li=rnorm(nn),CR=rnorm(nn))
fit<-lm(Li~CR, data=lDf)
a<-summary(fit)
N <- nrow(lDf)
C <- 50
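
For orientation, here is one common way to resample a regression slope: a row bootstrap. This is a sketch under assumptions and not necessarily what the truncated code above goes on to do. The seed, the number of resamples `B`, and the names `slopes` and `ci` are all hypothetical.

```r
set.seed(42)                       # hypothetical seed, for reproducibility only
nn  <- 100
lDf <- data.frame(Li = rnorm(nn), CR = rnorm(nn))  # dummy data, as in the post
N   <- nrow(lDf)
B   <- 200                         # number of resamples (an assumption)

slopes <- numeric(B)
for (b in seq_len(B)) {
  # Resample whole rows with replacement so Li and CR stay paired.
  idx <- sample(1:N, size = N, replace = TRUE)
  slopes[b] <- coef(lm(Li ~ CR, data = lDf[idx, ]))[["CR"]]
}

# Percentile interval for the slope from the resampled fits.
ci <- quantile(slopes, c(0.025, 0.975))
print(ci)
```

Resampling rows of the data frame, rather than Li alone, keeps each response matched with its predictor.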

Re: [R] differing behavior of mean(), median() and sd() with na.rm

2018-08-22 Thread Ted Harding
I think that one can usefully look at this question from the point of view of what "NaN" and "NA" are abbreviations for (at any rate, according to the understanding I have adopted for many years -- maybe over-simplified).
NaN: Not a Number
NA: Not Available
So NA is typically used for missing v

Re: [R] Monte Carlo on simple regression

2018-08-22 Thread Eric Berger
You have an extra comma ... it should be Li[sample(1:N, size = S, replace = TRUE)] i.e. no comma after the closing parenthesis On Wed, Aug 22, 2018 at 7:20 PM, Ogbos Okike wrote: > Hello Eric, > Thanks for this. > > I tried it. It went but another problem prevents the code from running. > s

Re: [R] Monte Carlo on simple regression

2018-08-22 Thread Ogbos Okike
Hello Eric, Thanks again. Another line indicated error: source("script.R") Error in eval(predvars, data, env) : numeric 'envir' arg not of length one Thank you for additional assistance. Ogbos On Wed, Aug 22, 2018 at 5:23 PM Eric Berger wrote: > You have an extra comma ... it should be > >

Re: [R] Monte Carlo on simple regression

2018-08-22 Thread Ogbos Okike
Hello Eric, Thanks for this. I tried it. It went but another problem prevents the code from running. source("script.R") Error in Li[sample(1:N, size = S, replace = TRUE), ] : incorrect number of dimensions The error is coming from the line: subsample <- Li[sample(1:N, size=S, replace=TRUE), ]

Re: [R] lattice barchart() with two variables

2018-08-22 Thread Rich Shepard
On Wed, 22 Aug 2018, Bert Gunter wrote: No reproducible example (see posting guide below) so minimal help. Hi Bert, I thought the header and six data rows of the dataframe plus the syntax of the command I used were sufficient. Regardless, here's the dput() output: structure(list(Year = c(1

Re: [R] Monte Carlo on simple regression

2018-08-22 Thread Eric Berger
Li is defined as d1$a which is a vector. You should use N <- length(Li) HTH, Eric On Wed, Aug 22, 2018 at 6:02 PM, Ogbos Okike wrote: > Kind R-users, > I run a simple regression. I am interested in using the Monte Carlo to test > the slope parameter. > Here is what I have done: > d1<-read.tab
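
The difference between indexing a vector and indexing a data frame can be sketched as follows. The data are dummy values standing in for d1$a, and the names `sub_vec`, `df`, `sub_df` and `err` are illustrative only.

```r
set.seed(1)                        # hypothetical seed, for reproducibility only
Li <- rnorm(10)                    # dummy stand-in for d1$a, a plain vector
N  <- length(Li)                   # nrow(Li) is NULL for a vector
S  <- 5

# A vector takes a single index: no comma inside the brackets.
sub_vec <- Li[sample(1:N, size = S, replace = TRUE)]

# A data frame takes [rows, columns]; the trailing comma keeps all columns.
df     <- data.frame(Li = Li, CR = rnorm(10))
sub_df <- df[sample(1:nrow(df), size = S, replace = TRUE), ]

# Using the [rows, ] form on a vector raises the error reported in the thread.
err <- tryCatch(Li[1:2, ], error = function(e) conditionMessage(e))
print(err)
```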

Re: [R] lattice barchart() with two variables

2018-08-22 Thread Bert Gunter
No reproducible example (see posting guide below) so minimal help. Remove the quotes from your formula. Why did you think they should be there? -- see ?formula. Read the relevant portions of ?xyplot carefully (again?). You seemed to have missed: "*Primary variables:* The x and y variables should

Re: [R] differing behavior of mean(), median() and sd() with na.rm

2018-08-22 Thread Marc Schwartz via R-help
Hi, It might even be worthwhile to review this recent thread on R-Devel: https://stat.ethz.ch/pipermail/r-devel/2018-July/076377.html which touches upon a subtly related topic vis-a-vis NaN handling. Regards, Marc Schwartz > On Aug 22, 2018, at 10:55 AM, Bert Gunter wrote: > > ... And FW

[R] Monte Carlo on simple regression

2018-08-22 Thread Ogbos Okike
Kind R-users,
I ran a simple regression. I am interested in using Monte Carlo to test the slope parameter. Here is what I have done:
d1<-read.table("Lightcor",col.names=c("a"))
d2<-read.table("CRcor",col.names=c("a"))
Li<-d1$a
CR<-d2$a
fit<-lm(Li~CR)
a<-summary(fit)
a gives the slope as 88.

[R] lattice barchart() with two variables

2018-08-22 Thread Rich Shepard
I've not before created bar charts, only scatter plots and box plots. Checking in Deepayan's book, searching the web, and looking at ?barchart has not shown me how to get the results I need. The dataframe looks like this:
head(stage_heights)
  Year   Med   Max
1 1989 91.17 93.32
2 1990

Re: [R] differing behavior of mean(), median() and sd() with na.rm

2018-08-22 Thread Bert Gunter
... And FWIW (not much, I agree), note that if z = numeric(0) and sum(z) = 0, then mean(z) = NaN makes sense, as length(z) = 0, so dividing by 0 gives NaN. So you can see the sorts of issues you may need to consider. Bert Gunter "The trouble with having an open mind is that people keep coming alo
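
The arithmetic behind this can be checked directly. A small sketch, using NA_real_ so that the vector left after dropping the NAs is numeric(0); the names `x` and `z` are illustrative.

```r
x <- c(NA_real_, NA_real_, NA_real_)
z <- x[!is.na(x)]                  # what na.rm = TRUE leaves behind: numeric(0)

length(z)                          # 0
sum(z)                             # 0, the empty sum
sum(z) / length(z)                 # 0/0, which is NaN

# mean() agrees with the 0/0 arithmetic, while median() and sd() return NA.
is.nan(mean(x, na.rm = TRUE))      # TRUE
is.nan(median(x, na.rm = TRUE))    # FALSE: the result is NA, not NaN
is.nan(sd(x, na.rm = TRUE))        # FALSE: the result is NA, not NaN
```

Note that is.na() is TRUE for both NA and NaN, so is.nan() is the test that tells them apart.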

Re: [R] differing behavior of mean(), median() and sd() with na.rm

2018-08-22 Thread Bert Gunter
Actually, the dissonance is a bit more basic. After xxx(, na.rm=TRUE) with all NA's in ... you have numeric(0). So what you see is actually:
> z <- numeric(0)
> mean(z)
[1] NaN
> median(z)
[1] NA
> sd(z)
[1] NA
> sum(z)
[1] 0
etc. I imagine that there may be more of these little inconsistenc

Re: [R] differing behavior of mean(), median() and sd() with na.rm

2018-08-22 Thread Duncan Murdoch
On 22/08/2018 10:33 AM, Ivan Calandra wrote: Dear useRs, I have just noticed that when input is only NA with na.rm=TRUE, mean() results in NaN, whereas median() and sd() produce NA. Shouldn't it all be the same? I think NA makes more sense than NaN in that case. The mean can be defined as sum(

[R] differing behavior of mean(), median() and sd() with na.rm

2018-08-22 Thread Ivan Calandra
Dear useRs,
I have just noticed that when input is only NA with na.rm=TRUE, mean() results in NaN, whereas median() and sd() produce NA. Shouldn't it all be the same? I think NA makes more sense than NaN in that case.
x <- c(NA, NA, NA)
mean(x, na.rm=TRUE)
[1] NaN
median(x, na.rm=TRUE)
[1] N

Re: [R] looking for formula parser that allows coefficients

2018-08-22 Thread Gabor Grothendieck
Some string manipulation can convert the formula to a named vector such as the one shown at the end of your post.
library(gsubfn)
# input
fo <- y ~ 2 - 1.1 * x1 + x3 - x1:x3 + 0.2 * x2:x2
pat <- "([+-])? *(\\d\\S*)? *\\*? *([[:alpha:]]\\S*)?"
ch <- format(fo[[3]])
m <- matrix(strapplyc(ch, pat)[