[R] predict.coxph fitted values for failure times
I would like to extract predicted failure times from a coxph model in library(survival). However, none of the prediction options ("lp", "risk", "expected", "terms") seem to bear any relationship to failure time. Perhaps I am asking the wrong question, but can coxph provide predicted failure times? Thanks, Dan Bebber Department of Plant Sciences University of Oxford ___ How much free photo storage do you get? Store your holiday __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] R crashes when spherical autocorrelation specified in nlme
Dear list: R crashes when I specify spatial autocorrelation in nlme: sp3 <- corSpher(c(30,0.75),~x+y|Site, nugget = T) cs3 <- Initialize(sp3, data = sav) sav.nlme1<-nlme(vc.asin ~ SSasymp(canopy, Asym, R0, lrc),data = sav, fixed = Asym + R0 + lrc ~ 1, random = R0 + lrc ~ 1|Site, start = list(fixed = sav.ini), verbose = T, correlation = cs3) There is a longish (~3 s) pause prior to the crash. There is no crash if "correlation = cs3" is omitted. The nugget effect is large (~0.75), so perhaps it is not worth including, but I would still like to know if the crash can be avoided. My system is: R 2.0.1 Windows XP SP2 Dual Intel Xeon 2.8GHz 2Gb RAM Thanks, Dan Bebber Department of Plant Sciences University of Oxford UK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] plot(cox.zph()): customize xlab & ylab
Hello, plot(cox.zph(my.ph),var=1,xlab="Year") gives the error: Error in plot.default(range(xx), yr, type = "n", xlab = "Time", ylab = ylab[i], : formal argument "xlab" matched by multiple actual arguments How can I customize the xlab and ylab for plots of cox.zph? Thanks, Dan Bebber Department of Plant Sciences University of Oxford UK ___ How much free photo storage do you get? Store your holiday __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] plot(cox.zph()): customize xlab & ylab
Dear all, I've modified the plot.cox.zph function to allow customized xlab and ylab (see below). Someone might like to confirm that it works. Thanks for all the assistance. Dan ___ plot.cox.zph <- function (x, resid = TRUE, se = TRUE, df = 4, nsmo = 40, var, xlab="Time",ylab = paste("Beta(t) for", dimnames(yy)[[2]]),...) { xx <- x$x yy <- x$y d <- nrow(yy) df <- max(df) nvar <- ncol(yy) pred.x <- seq(from = min(xx), to = max(xx), length = nsmo) temp <- c(pred.x, xx) lmat <- ns(temp, df = df, intercept = TRUE) pmat <- lmat[1:nsmo, ] xmat <- lmat[-(1:nsmo), ] qmat <- qr(xmat) if (se) { bk <- backsolve(qmat$qr[1:df, 1:df], diag(df)) xtx <- bk %*% t(bk) seval <- d * ((pmat %*% xtx) * pmat) %*% rep(1, df) } if (missing(var)) var <- 1:nvar else { if (is.character(var)) var <- match(var, dimnames(yy)[[2]]) if (any(is.na(var)) || max(var) > nvar || min(var) < 1) stop("Invalid variable requested") } if (x$transform == "log") { xx <- exp(xx) pred.x <- exp(pred.x) } else if (x$transform != "identity") { xtime <- as.numeric(dimnames(yy)[[1]]) apr1 <- approx(xx, xtime, seq(min(xx), max(xx), length = 17)[2 * (1:8)]) temp <- signif(apr1$y, 2) apr2 <- approx(xtime, xx, temp) xaxisval <- apr2$y xaxislab <- rep("", 8) for (i in 1:8) xaxislab[i] <- format(temp[i]) } for (i in var) { y <- yy[, i] yhat <- pmat %*% qr.coef(qmat, y) if (resid) yr <- range(yhat, y) else yr <- range(yhat) if (se) { temp <- 2 * sqrt(x$var[i, i] * seval) yup <- yhat + temp ylow <- yhat - temp yr <- range(yr, yup, ylow) } if (x$transform == "identity") plot(range(xx), yr, type = "n", xlab = xlab, ylab = ylab[i], ...) else if (x$transform == "log") plot(range(xx), yr, type = "n", xlab = xlab, ylab = ylab[i], log = "x", ...) else { plot(range(xx), yr, type = "n", xlab = xlab, ylab = ylab[i], axes = FALSE, ...) axis(1, xaxisval, xaxislab) axis(2) box() } if (resid) points(xx, y) lines(pred.x, yhat) if (se) { lines(pred.x, yup, lty = 2) lines(pred.x, ylow, lty = 2) } } } --- Thomas Lumley <[EMAIL PROTECTED]> wrote: > On Mon, 11 Jul 2005, Adaikalavan Ramasamy wrote: > > > I am not sure if there is an easy way around this. > An ugly hack is to > > make a copy the function "survival:::plot.cox.zph" > and make your > > modified function. But there are others in the > list who might know > > neater solutions. > > If you then send a patch to the package maintainer > it stops being an ugly > hack and turns into an example of collaborative > open-source development :) > > -thomas > > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] bubble.plot() - standardize size of unit circle
Hello, I wrote a wrapper for symbols() that produces a bivariate bubble plot, for use when plot(x,y) hides multiple occurrences of the same x,y combination (e.g. if x,y are integers). Circle area ~ counts per bin, and circle size is controlled by 'scale'. Question: how can I automatically make the smallest circle the same size as a standard plot character, rather than having to approximate it using 'scale'? #Function: bubble.plot<-function(x,y,scale=0.1,xlab=substitute(x),ylab=substitute(y),...){ z<-table(x,y) xx<-rep(as.numeric(rownames(z)),ncol(z)) yy<-sort(rep(as.numeric(colnames(z)),nrow(z))) id<-which(z!=0) symbols(xx[id],yy[id],inches=F,circles=sqrt(z[id])*scale,xlab=xlab,ylab=ylab,...)} #Example: x<-rpois(100,3) y<-x+rpois(100,2) bubble.plot(x,y) ___ How much free photo storage do you get? Store your holiday __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] bubble.plot() - standardize size of unit circle
Thanks- 'sizeplot' didn't come up in any of my searches. Dan --- Jim Lemon <[EMAIL PROTECTED]> wrote: > Dan Bebber wrote: > > Hello, > > > > I wrote a wrapper for symbols() that produces a > > bivariate bubble plot, for use when plot(x,y) > hides > > multiple occurrences of the same x,y combination > (e.g. > > if x,y are integers). > > Circle area ~ counts per bin, and circle size is > > controlled by 'scale'. > > Question: how can I automatically make the > smallest > > circle the same size as a standard plot character, > > rather than having to approximate it using > 'scale'? > > > Ben Bolker's "sizeplot" in the plotrix package does > this using the > standard plotting symbol 1. > > Jim > ___ How much free photo storage do you get? Store your holiday __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] bubble.plot() - standardize size of unit circle
Thanks Martin- I didn't know about the sunflowerplot function. Somehow I prefer the look of bubbles, but I guess you're right about the visual perception. Will have to read up on it. Dan --- Martin Maechler <[EMAIL PROTECTED]> wrote: > Hi, > > >>>>> "Dan" == Dan Bebber <[EMAIL PROTECTED]> > >>>>> on Thu, 21 Jul 2005 12:05:45 +0100 (BST) > writes: > > Dan> Hello, > Dan> I wrote a wrapper for symbols() that > produces a > Dan> bivariate bubble plot, for use when > plot(x,y) hides > Dan> multiple occurrences of the same x,y > combination (e.g. > Dan> if x,y are integers). > Dan> Circle area ~ counts per bin, and circle > size is > Dan> controlled by 'scale'. > > I'm not answering your question, but still, I need > to ask/tell > this: > > Why don't use sunflowerplot() instead? > > The excellent researchers who invented sunflower > plots in the > late 70s early 1980s knew well about "bubble" > alternatives and > much about drawbacks of such bubbles > {mainly the perception laws of areas vs lengths ..} > > That's why they came up with the sunflowers as > improvement .. > See 'References' in help(sunflowerplot) > > Regards, > Martin Maechler, ETH Zurich > > > > > Dan> Question: how can I automatically make the > smallest > Dan> circle the same size as a standard plot > character, > Dan> rather than having to approximate it using > 'scale'? > > Dan> #Function: > Dan> > bubble.plot<-function(x,y,scale=0.1,xlab=substitute(x),ylab=substitute(y),...){ > Dan> z<-table(x,y) > Dan> xx<-rep(as.numeric(rownames(z)),ncol(z)) > Dan> > yy<-sort(rep(as.numeric(colnames(z)),nrow(z))) > Dan> id<-which(z!=0) > Dan> > symbols(xx[id],yy[id],inches=F,circles=sqrt(z[id])*scale,xlab=xlab,ylab=ylab,...)} > > Dan> #Example: > Dan> x<-rpois(100,3) > Dan> y<-x+rpois(100,2) > Dan> bubble.plot(x,y) > > > > Dan> > ___ > > Store your holiday > > Dan> > __ > Dan> R-help@stat.math.ethz.ch mailing list > Dan> > https://stat.ethz.ch/mailman/listinfo/r-help > Dan> PLEASE do read the posting guide! > http://www.R-project.org/posting-guide.html > > > Dan> !DSPAM:42df82ae77251002021314! > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] glm cannot find valid starting values
glm(S ~ -1 + Mdif, family=quasipoisson(link=identity), start=strt, sdat) gives error: > Error in glm.fit(x = X, y = Y, weights = weights, start = start, etastart > = > etastart, : >cannot find valid starting values: please specify some strt is set to be the coefficient for a similar fit glm(S ~ -1 + I(Mdif + 1),... i.e. (Mdif + 1) is a vector similar to Mdif. The error appears to occur when some values of Mdif are negative, though I have not had this problem with simulated datasets. Any solutions greatly appreciated (full details and data below). Dan Bebber Department of Plant Sciences University of Oxford OS: WinXP Pro SP2 and Win ME (tried both) Processor: Dual Intel Xeon and Pentium 4 (tried both) R version: 2.3.0 and 2.3.1 (tried both) #Full details (can be pasted directly into R): #Data: sdat <- data.frame( S = c(0, 0, 0, 0, 28, 0, 1, 7, 0, 0, 39, 2, 0, 0, 40, 0, 0, 0, 0, 0, 0, 15, 0, 0, 3, 0, 0, 0, 2, 0, 3, 0, 30, 0, 20, 0, 1, 0, 0, 1, 21, 0, 0, 4, 14, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 3, 0, 2, 5, 0, 0, 0, 0, 0, 0, 0, 25, 0, 5, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 10, 0, 0, 0, 0, 0, 9, 1, 1, 1, 1, 0, 0, 3, 0, 27, 7, 0, 0, 0, 0, 0, 1, 1, 4, 2, 1, 2, 4, 1, 4, 6, 12, 4, 6, 3, 4, 0, 4, 0, 6, 1, 3, 1, 4, 4, 1, 1, 2, 1, 6, 1, 0, 0, 1, 0, 1, 0, 0, 6, 0, 0, 0, 0, 2, 2, 3, 1, 6, 2, 2, 1, 1, 4, 4, 3, 3, 7, 1, 3, 5, 6, 4, 0, 1, 4, 2, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 4, 0, 0, 1, 0, 2, 0, 0, 1, 2, 0, 4, 0, 2, 0, 3, 0, 2, 2, 0, 4, 0, 2, 0, 1, 2, 3, 0, 0, 0, 0, 3, 1, 0, 0, 0, 3, 2, 1, 2, 1, 1, 2, 0, 0, 3, 3, 1, 0, 1, 1, 2, 0, 1, 0, 1, 0, 1, 3, 2, 0, 0, 2, 1, 0, 1, 0, 0, 0, 0, 0, 0, 4, 2, 4, 0, 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 3, 2, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 4, 1, 0, 0, 1), M = 620+c(0,cumsum(sdat$S[-328]))) #S is the (unknown) number of N individuals that irreversibly change state in a time #interval t. The data provided are a subset of the full data set. #M is the cumulative sum of individuals that have changed state up to t-1. #Assume that the rate of state change is constant (S ~ kM), but the #distribution of S is clustered. #The goal is to estimate N. #N can be estimated by fitting: qpglm <- glm(S ~ M, family = quasipoisson(link = identity), sdat) summary(qpglm) N.est <- -coef(qpglm)[1]/coef(qpglm)[2] N.est #i.e. x-intercept is minus intercept/slope #To estimate confidence limits on N.est, fit models without intercept to #N.est - M + x, where x is an integer. The model will have the lowest deviance #when x = 0. x <- 0 Mdif <- N.est - M + x qpglm2 <- glm(S ~ -1 + Mdif, family = quasipoisson(link = identity), sdat) summary(qpglm2) #Use analysis of deviance to estimate confidence limits on N.est: disp <- summary(qpglm)$dispersion dfres <- df.residual(qpglm) dev.res <- deviance(qpglm) #From MASS4, p. 210, assume that changes in deviance scaled by #dispersion as |x| increases have an F distribution dev.crit <- dev.res+qf(0.95,1,dfres)*disp dev.crit #values of x for which the deviance = dev.crit give approximate 95% confidence limits #on N.est. #The error occurs when x <= -91.7: x <- -91.7 sdat$Mdif <- N.est - sdat$M + x strt <- coef(glm(S ~ -1 + I(Mdif+1), family = quasipoisson(link = identity), sdat)) qpglm2 <- glm(S ~ -1 + Mdif, family = quasipoisson(link = identity), start=strt, sdat) #The problem is that this interferes with optimization to find values of x for which #deviance = dev.crit __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] glm cannot find valid starting values
Brian Ripley wrote: > > BTW, your example cannot be pasted in as 'sdat' self-references. It could > be fixed, but I gave up at that point. > Oh dear, I'm very sorry. I forgot to run rm(list=ls(all=TRUE)) before testing. The corrected code is: #Data: S <- c(0, 0, 0, 0, 28, 0, 1, 7, 0, 0, 39, 2, 0, 0, 40, 0, 0, 0, 0, 0, 0, 15, 0, 0, 3, 0, 0, 0, 2, 0, 3, 0, 30, 0, 20, 0, 1, 0, 0, 1, 21, 0, 0, 4, 14, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 3, 0, 2, 5, 0, 0, 0, 0, 0, 0, 0, 25, 0, 5, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 10, 0, 0, 0, 0, 0, 9, 1, 1, 1, 1, 0, 0, 3, 0, 27, 7, 0, 0, 0, 0, 0, 1, 1, 4, 2, 1, 2, 4, 1, 4, 6, 12, 4, 6, 3, 4, 0, 4, 0, 6, 1, 3, 1, 4, 4, 1, 1, 2, 1, 6, 1, 0, 0, 1, 0, 1, 0, 0, 6, 0, 0, 0, 0, 2, 2, 3, 1, 6, 2, 2, 1, 1, 4, 4, 3, 3, 7, 1, 3, 5, 6, 4, 0, 1, 4, 2, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 4, 0, 0, 1, 0, 2, 0, 0, 1, 2, 0, 4, 0, 2, 0, 3, 0, 2, 2, 0, 4, 0, 2, 0, 1, 2, 3, 0, 0, 0, 0, 3, 1, 0, 0, 0, 3, 2, 1, 2, 1, 1, 2, 0, 0, 3, 3, 1, 0, 1, 1, 2, 0, 1, 0, 1, 0, 1, 3, 2, 0, 0, 2, 1, 0, 1, 0, 0, 0, 0, 0, 0, 4, 2, 4, 0, 2, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 3, 2, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 4, 1, 0, 0, 1) M <- 620+c(0,cumsum(S[-328])) sdat <- data.frame(S,M) #S is the number of N individuals that irreversibly change state in a time #interval t. The data provided are a subset of the full data set. #M is the cumulative sum of individuals that have changed state up to t-1. #Assume that the rate of state change is constant (S ~ kM), but the #distribution of S is clustered. #N can be estimated by fitting: qpglm <- glm(S ~ M, family = quasipoisson(link = identity), sdat) summary(qpglm) N.est <- -coef(qpglm)[1]/coef(qpglm)[2] N.est #i.e. x-intercept is minus intercept/slope #To estimate confidence limits on N.est, fit models without intercept to #N.est - M + x, where x is an integer. The model will have the lowest deviance #when x = 0. x <- 0 Mdif <- N.est - M + x qpglm2 <- glm(S ~ -1 + Mdif, family = quasipoisson(link = identity), sdat) summary(qpglm2) #Use analysis of deviance to estimate confidence limits on N.est: disp <- summary(qpglm)$dispersion dfres <- df.residual(qpglm) dev.res <- deviance(qpglm) #From MASS4, p. 210, assume that changes in deviance scaled by #dispersion as |x| increases have an F distribution dev.crit <- dev.res+qf(0.95,1,dfres)*disp dev.crit #values of x for which the deviance = dev.crit give approximate 95% confidence limits #on N.est. #The error occurs when x <= -91.7: x <- -91.7 sdat$Mdif <- N.est - sdat$M + x strt <- coef(glm(S ~ -1 + I(Mdif+1), family = quasipoisson(link = identity), sdat)) qpglm2 <- glm(S ~ -1 + Mdif, family = quasipoisson(link = identity), start=strt, sdat) __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Greedy triangulation
Hello, does anyone have code that will generate a greedy triangulation (triangulation that uses shortest non-overlapping edges) for a set of points in Euclidean space? Thanks, Dan Bebber ___ Dr. Daniel P. Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB UK Tel. 01865 275060 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Greedy triangulation
Thanks, but no, the Delaunay is different. I have written the following, which interested persons might want to check for accuracy and streamlining: #GREEDY TRIANGULATION # #Pick all lines that are shortest and don't overlap, starting with shortest. # greedy<-function(xy){ #input is a matrix with 2 columns (x,y) require(spatstat) #need the crossing.psp function w<-owin(range(xy[,1]),range(xy[,2]))#set the window for crossing.psp dists<-dist(xy) #calculate Euclidean distances for each line ind<-which(lower.tri(dists),arr.ind=T) #matrix with 2 columns (start point, end point) x1<-xy[ind[,1],1] #x of start point y1<-xy[ind[,1],2] #y of start point x2<-xy[ind[,2],1] #x of end point y2<-xy[ind[,2],2] #y of end point dat<-data.frame(strt=ind[,1],fin=ind[,2], x1=x1,y1=y1,x2=x2,y2=y2,len=as.vector(dists))#put all data for lines together dat<-dat[order(dists),] #order data by line length, shortest first inc<-dat[1,]#include the shortest line in the triangulation dat<-dat[-1,] #keep remaining lines while(nrow(dat)>0){ #while lines remain to be tested, do the following... A<-psp(inc$x1,inc$y1,inc$x2,inc$y2,w) #create the psp object for the lines already included B<-psp(dat$x1[1],dat$y1[1],dat$x2[1],dat$y2[1],w) #create the psp object for the next line to be tested cross<-crossing.psp(A,B) #check for crossing of the test line over the lines already included p.match<-sum(cross$x%in%c(inc$x1,inc$x2)) #check if the crossing occurs because same points are included if(cross$n-p.match==0){inc[nrow(inc)+1,]<-dat[1,]} #if the only crossing are due to shared points, include the line dat<-dat[-1,]} #remove the line from the untested lines return(inc)} #when all lines have been tested, return all included lines # #Plot the greedy triangulation # plot.greedy<-function(xy,gr,...){ plot(xy,asp=1,xlab="x",ylab="y",...) segments(gr$x1,gr$y1,gr$x2,gr$y2)} # #Test it # xy<-matrix(runif(40,0,1),nc=2) gr<-greedy(xy) plot.greedy(xy,gr,main="Greedy Triangulation") #END - Original Message - From: "Greg Snow" <[EMAIL PROTECTED]> To: "Dan Bebber" <[EMAIL PROTECTED]>; Sent: Thursday, September 14, 2006 4:32 PM Subject: RE: [R] Greedy triangulation Does the deldir package do what you want? -- Gregory (Greg) L. Snow Ph.D. Statistical Data Center Intermountain Healthcare [EMAIL PROTECTED] (801) 408-8111 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Statitics Textbook - any recommendation?
MJ Crawley "Statistics: An Introduction Using R", received a good review in Journal of the Royal Statistical Society, Vol 169. Dan Bebber Dr. Daniel P. Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB UK [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Likelihood of multiple random processes
Is there a method in R for estimating the likelihood that two (or more) random variables generated a univariate sample? For a single random variable there is: x <- c(rnorm(20,10,3),rnorm(20,20,5)) plot(density(x)) logLik(fitdistr(x,"normal")) Is there a way of of specifying that more than one normal distribution should be fitted? If so, then I suppose that the AIC of the two models could be calculated. Many thanks, Dan Bebber Department of Plant Sciences University of Oxford __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Likelihood of multiple random processes
> Do you know which observations came from which process (you do in your > example, but it whatever it is supposed to emulate)? No- I need to determine whether more than one process could have been involved. > If not, you want to estimate a mixture distribution. This is covered in > chapter 16 of the book fitdistr() supports as well as in several CRAN > packages. Many thanks- I didn't know they were called mixture models. Page 437 is in front of me now. > > On Thu, 2 Nov 2006, Dan Bebber wrote: > >> Is there a method in R for estimating the likelihood that two (or more) >> random variables generated a univariate sample? >> >> For a single random variable there is: >> x <- c(rnorm(20,10,3),rnorm(20,20,5)) >> plot(density(x)) >> logLik(fitdistr(x,"normal")) >> >> Is there a way of of specifying that more than one normal distribution >> should be fitted? >> If so, then I suppose that the AIC of the two models could be calculated. >> >> Many thanks, >> Dan Bebber >> >> Department of Plant Sciences >> University of Oxford >> >> __ >> R-help@stat.math.ethz.ch mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> > > -- > Brian D. Ripley, [EMAIL PROTECTED] > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ > University of Oxford, Tel: +44 1865 272861 (self) > 1 South Parks Road, +44 1865 272866 (PA) > Oxford OX1 3TG, UKFax: +44 1865 272595 > > > -- > No virus found in this incoming message. > Checked by AVG Free Edition. > Version: 7.1.409 / Virus Database: 268.13.23/513 - Release Date: > 02/11/2006 > > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Colour-coded Editor for R Code
Jon, are you using Windows? If so, try WinEdt with library(RWinEdt). It has all the features you want and need. For Linux/Unix use EMACS. There is a section on CRAN about editors. Cheers, Dan Bebber Department of Plant Sciences University of Oxford __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] tick label rotation in xyplot (lattice)
Is there any way of rotating tick labels in xyplot? Perhaps some command in scales? My y-values are high (1000s) leading to a lot of white space in the plots. Thanks, Dan Bebber Department of Plant Sciences University of Oxford __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] tick label rotation in xyplot (lattice)
Thanks - I'm guilty of not checking the help files thoroughly enough. My apologies. Dan - Original Message - From: "Chuck Cleland" <[EMAIL PROTECTED]> To: "Dan Bebber" <[EMAIL PROTECTED]> Cc: Sent: Thursday, November 30, 2006 1:22 PM Subject: Re: [R] tick label rotation in xyplot (lattice) > Dan Bebber wrote: >> Is there any way of rotating tick labels in xyplot? Perhaps some command >> in >> scales? >> My y-values are high (1000s) leading to a lot of white space in the >> plots. > > Yes, the scales section of ?xyplot mentions a rot argument. Here is > an example: > > df <- expand.grid(1:2, 1992:2002) > names(df) <- c("MSA", "YEAR") > df$IDUPREV <- runif(22) > > library(lattice) > > xyplot(IDUPREV ~ YEAR | MSA, data = df, scales=list(x=list(rot=45))) > >> Thanks, >> Dan Bebber >> >> Department of Plant Sciences >> University of Oxford >> >> __ >> R-help@stat.math.ethz.ch mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide >> http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. >> >> > > -- > Chuck Cleland, Ph.D. > NDRI, Inc. > 71 West 23rd Street, 8th floor > New York, NY 10010 > tel: (212) 845-4495 (Tu, Th) > tel: (732) 512-0171 (M, W, F) > fax: (917) 438-0894 > > > > -- > No virus found in this incoming message. > Checked by AVG Free Edition. > Version: 7.1.409 / Virus Database: 268.15.2/559 - Release Date: 30/11/2006 > > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] par(mfrow, fin) incompatibility?
Hello, I want a 2x1 multi-figure, with each plot 5" square. Test code: x<-rnorm(10,0,1) y<-rnorm(10,0,1) par(pty="s", mfrow=c(2,1), fin=c(5,5)) plot(x,y) plot(y,x) but this does not work (overplots the two figures). Substituting pin for fin works, but is not what I want. Are mfrow and fin incompatible? I am basing my code on Fig. 4.6 in MASS4. Running R 2.2.1 & WinXP. Thanks Dan Bebber Department of Plant Sciences University of Oxford __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] nlme for groupedData with inner and outer factors
Hello, I am having trouble specifying a suitable nlme model. My data structure is described by gd <- groupedData(ppath ~ lcut | exp, outer = ~ bait, inner = ~ weight, data = d) i.e. the response (ppath) of several subjects (sub) was measured at levels of a continuous variable (lcut). Subjects were given either of one level of a factor (bait), and all subjects were measured at two levels of another factor (weight). Therefore bait varies among subjects and weight varies within subjects. The relationship ppath ~ cut for each subject and weight appear to follow a logistic curve, with xmid and scal affected by bait and weight. There is also a random effect of subject on xmid and scal. Any help with formulating the correct model would be greatly appreciated. Many thanks, Dan Bebber Department of Plant Sciences University of Oxford p.s. Part of my data are shown below: sublcut ppath bait weight 1 pv1_ 0.0 1.01 0 2 pv1_ 0.1 0.8277738211 0 3 pv1_ 0.2 0.3801025021 0 4 pv1_ 0.3 0.2091518781 0 5 pv1_ 0.4 0.0769293041 0 6 pv1_ 0.5 0.0656815641 0 7 pv1_ 0.6 0.0206701081 0 8 pv1_ 0.7 0.0128170211 0 9 pv1_ 0.8 0.0086615141 0 10 pv1_ 0.9 0.0115683231 0 11 pv19 0.0 1.01 0 12 pv19 0.1 0.6683902911 0 13 pv19 0.2 0.3433184621 0 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] How to test for significance of random effects?
I may be out of my statistical depth here, but isn't it the case that if one has an experimental design with random effects, one has to include the random effects, even if they appear to be non-significant? AFAIK there are two reasons: one is the possibility of 'restriction errors' that arise by unintentional differences in treatments among groups, so making analysis of among-group variance problematic; the other is that allocations of fixed effects to samples is no longer random and therefore the assumption of random errors is broken. Real statisticians may disagree with this, however. Dan Bebber Department of Plant Sciences University of Oxford Message: 12 Date: Sun, 07 May 2006 14:25:44 -0700 From: Spencer Graves <[EMAIL PROTECTED]> Subject: Re: [R] How to test for significance of random effects? To: Jon Olav Vik <[EMAIL PROTECTED]> Cc: r-help@stat.math.ethz.ch Message-ID: <[EMAIL PROTECTED]> Content-Type: text/plain; charset=ISO-8859-1; format=flowed 1. Ignoring the "complication of logistic regression", the "anova(lme1,lm1)" provides the answer you seek. See sect. 2.4 in Pinheiro and Bates for more detail on the approximations involved and how that answer can be refined using monte carlo. 2. With logistic regression, you want to do essentially the same thing using glm and lmer (in package 'lme4'), except that many of the required functions are not yet part of 'lme4'. Consider the following example: library(lme4) library(mlmRev) (mlmR <- vignette("MlmSoftRev")) #edit(mlmR) # with Rgui #Stangle(mlmR$file) # with ESS # -> then open file MlmSoftRev.R fitBin <- lmer(use ~ urban+age+livch+(1|district), data=Contraception, family=binomial) fitBin0 <- glm(use ~ urban+age+livch, data=Contraception, family=binomial) 2*pchisq(2*as.numeric(logLik(fitBin)- logLik(fitBin0)), 2, lower.tail=FALSE) Note however that this p-value computation is known to be only an approximation; see RSiteSearch("lmer p-values") for other perspectives. More accurate p-values can be obtained using Markov Chain Monte Carlo, via "mcmcsamp". hope this helps, Spencer Graves Jon Olav Vik wrote: > Dear list members, > > I'm interested in showing that within-group statistical dependence is > negligible, so I can use ordinary linear models without including random > effects. However, I can find no mention of testing a model with vs. > without random effects in either Venable & Ripley (2002) or Pinheiro and > Bates (2000). Our in-house statisticians are not familiar with this, > either, so I would greatly appreciate the help of this list. > > Pinheiro & Bates (2000:83) state that random-effect terms can be tested > based on their likelihood ratio, if both models have the same > fixed-effects structure and both are estimated with REML (I must admit I > do not know exactly what REML is, although I do understand the concept of > ML). > > The examples in Pinheiro & Bates 2000 deal with simple vs. complicated > random-effects structures, both fitted with lme and method="REML". > However, to fit a model without random effects I must use lm() or glm(). > Is there a way to tell these functions to use REML? I see that lme() can > use ML, but Pinheiro&Bates (2000) advised against this for some reason. > > lme() does provide a confidence interval for the between-group variance, > but this is constructed so as to never include zero (I guess the interval > is as narrow as possible on log scale, or something). I would be grateful > if anyone could tell me how to test for zero variance between groups. > > If lm1 and lme1 are fitted with lm() and lme() respectively, then > anova(lm1,lme1) gives an error, whereas anova(lme1,lm1) gives an answer > which looks reasonable enough. > > The command logLik() can retrieve either restricted or ordinary > log-likelihoods from a fitted model object, but the likelihoods are then > evaluated at the fitted parameter estimates. I guess these estimates > differ from if the model were estimated using REML? > > My actual application is a logistic regression with two continuous and one > binary predictor, in which I would like to avoid the complications of > using generalized linear mixed models. Here is a simpler example, which is > rather trivial but illustrates the general question: > > Example (run in R 2.2.1): > > library(nlme) > summary(lm1 <- lm(travel~1,data=Rail)) # no random effect > summary(lme1 <- lme(fixed=travel~1,random=~1|Rail,data=Rail)) # random > effect > intervals(lme1) # confidence for random effect > anova(lm1,lme1) > ## Outputs warning message: > # models with response "NULL" removed because > # response differs from model 1 in:
[R] Force coefficients in glm()
Hello, I have a model glm(Y ~ X, family = quasipoisson(link = "identity")) I would like to vary the coefficient for X and observe the effect on the deviance. Is this possible? Many thanks, Dan Bebber Department of Plant Sciences University of Oxford __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] lmer, p-values and all that
Hello, I've just located the illuminating explanation by Douglas Bates on degrees of freedom in mixed models. The take-home message appears to be: don't trust the p-values from lme. Questions: Should I give up hypothesis testing for fixed effects terms in mixed models? Has my time spent reading Pinheiro & Bates been in vain? Is there a publication on this issue? Thanks, Dan Bebber Department of Plant Sciences University of Oxford __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lattice: clipping data, not plot margins
I am plotting subsets of my data, using ylim. This works fine, but the outer margin line widths of the plot are thin, due to clipping. If I include > trellis.par.set(clip=list(panel = "off")) then the outer margin line widths are fine, but the outlying data is visible. Is there any way of achieving both correct margin line widths and clipping of outlying data? Thanks, Dan Bebber info: Windows XP, R 2.4.1., lattice 0.14-16 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Error() term in glm model formula
Hello, My data are numbers of trees in plots sampled in a number of forest stands. Some stands were subjected to a treatment, others not. Several plots were sampled per stand to get a better idea of what the stand means were, but replication is really at the stand level. Therefore I think this is a split-plot design. I would like to know whether the treatment affected the number of trees, so: results<-aov(trees ~ as.factor(treatment) + Error(stand)) However, the frequency distribution of trees is highly skewed, with lots of zeros. I was therefore considering using a generalized linear model, perhaps with Poisson error. However, the glm function does not seem to support adding an Error() term to the model. My question is: is there any way of modelling the experimental design in glm, or should I transform the data as best as I can and stick with aov? Many thanks, Dan Bebber Dr. Daniel P. Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB Tel. 01865 275060 [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Error() term in glm model formula
Hello, with regards to my recent question, the design is more a 'repeated measures in space' design than split plot. Dan Bebber *Original message follows** My data are numbers of trees in plots sampled in a number of forest stands. Some stands were subjected to a treatment, others not. Several plots were sampled per stand to get a better idea of what the stand means were, but replication is really at the stand level. Therefore I think this is a split-plot design. I would like to know whether the treatment affected the number of trees, so: results<-aov(trees ~ as.factor(treatment) + Error(stand)) However, the frequency distribution of trees is highly skewed, with lots of zeros. I was therefore considering using a generalized linear model, perhaps with Poisson error. However, the glm function does not seem to support adding an Error() term to the model. My question is: is there any way of modelling the experimental design in glm, or should I transform the data as best as I can and stick with aov? Many thanks, Dan Bebber Dr. Daniel P. Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB Tel. 01865 275060 [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Error with arima()
Could someone please give a brief explanation, or pointer to an explanation, of the following error: > arima(ts.growth, order = c(1,0,0),include.mean=T) Error in arima(ts.growth, order = c(1, 0, 0), include.mean = T) : non-stationary AR part from CSS and why it does not arise with > arima0(ts.growth, order = c(1,0,0)) Many thanks Dr. Daniel P. Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Random intercept model with time-dependent covariates, results different from SAS
Hello, I have been struggling with a similar problem, i.e. fitting an LME model to unbalanced repeated measures data. I found "Linear Mixed Models" by John Fox (http://socserv2.socsci.mcmaster.ca/jfox/Books/Companion/appendix-mixed-mode ls.pdf) quite helpful. Fox gives examples which are unbalanced, so I guess that balance is not a requirement (assuming Fox is correct). However, the sample sizes are large compared to yours (and mine), which may make a difference. Dan Bebber Dr. Daniel P. Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB Tel. 01865 275060 Web. http://www.forestecology.co.uk/ "Data, data, data!" he cried impatiently. "I can't make bricks without clay" - Sherlock Holmes, The Adventure of the Copper Beeches, 1892 > Message: 24 > Date: Sun, 4 Jul 2004 19:21:32 +1000 > From: Keith Wong <[EMAIL PROTECTED]> > Subject: Re: [R] Random intercept model with time-dependent covariates, results different from AS > To: Prof Brian Ripley <[EMAIL PROTECTED]> > Cc: [EMAIL PROTECTED] > Message-ID: <[EMAIL PROTECTED]> > Content-Type: text/plain; charset=ISO-8859-1 > > Thank you for the very prompt response. I only included a small > part of the > output to make the message brief. I'm sorry it did not provide > enough detail to > answer my question. I have appended the summary() and anova() > outputs to the > two models I fitted in R. > > Quoting Prof Brian Ripley <[EMAIL PROTECTED]>: > > > Looking at the significance of a main effect (group) in the > presence of an > > interaction (time:group) is hard to interpret, and in your case > is I think > > not even interesting. (The `main effect' probably represents difference > > in intercept for the time effect, that is the group difference > at the last > > time. But see the next para.) Note that the two systems are returning > > different denominator dfs. > > > I take your point that the main effect is probably not interesting in the > presence of an interaction. I was checking the results for > consistency to see > if I was doing the right thing. I was not 100% sure that the SAS > code was in > itself correct. > > > At this point you have not told us enough. My guess is that you have > > complete balance with the same number of subjects in each > group. In that > > case the `group' effect is in the between-subjects stratum (as > defined for > > the use of Error in aov, which you could also do), and thus R's 11 df > > would be right (rather than 44, without W and Z). Without balance Type > > III tests get much harder to interpret and the `group' effect > would appear > > in two strata and there is no simple F test in the classical theory. So > > further guessing, SAS may have failed to detect balance and so used the > > wrong test. > > I had not appreciated the need for balance: in actual fact, one > group has 5 > subjects and the other 7. Will this be a problem? Would the R > analysis still be > valid in that case? > > > > The time-dependent covariates muddy the issue more, and I > looked mainly at > > the analyses without them. Again, a crucial fact is not here: do the > > covariates depend on the subjects as well? > > Yes the covariates are measures of blood pressure and pulse, and > they depend on > the subjects as well. > > > The good news is that the results _are_ similar. You do have different > > time behaviour in the two groups. So stop worrying about tests of > > uninteresting hypotheses and concentrate of summarizing that difference. > > > > -- > > Brian D. Ripley, [EMAIL PROTECTED] > > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ > > University of Oxford, Tel: +44 1865 272861 (self) > > 1 South Parks Road, +44 1865 272866 (PA) > > Oxford OX1 3TG, UKFax: +44 1865 272595 > > > Thank you. I was concerned that one or both methods were > incorrect given the > results were inconsistent. Perhaps reassuringly, the parameter > estimates for > the fixed effects in both SAS and R were the same. > > Is the model specification OK for the model with just time, group > and their > interaction? > Is the model specification with the 2 time dependent covariates > appropriate? > > Once again, I'm very grateful for the time you've taken to answer > my questions. > > Keith > > [Output from the 2 models fitted in R follows] > > > g1 = lme(Y ~ time + group + time:group, random = ~ 1 | id, data > = datamod) > > > anova(g1)
[R] Modelling compound logistic growth curves
Motivated by the discovery of 'loglet analysis' (http://phe.rockefeller.edu/LogletLab/) that allows one to decompose growth curves into a series of logistic equations, I attempted to do the same thing in R. SIMULATED DATA Time <- 1:200 pop.size <- SSlogis(Time,10,20,5) + SSlogis(Time,20,100,20) + rnorm(length(Time)) MY ANALYSIS results <- nls(size ~ SSlogis(Time, Asym1, xmid1, scal1) + SSlogis(Time, Asym2, xmid2, scal2), start = list(Asym1=5, xmid1=15, scal1=30, Asym2=25, xmid2=67, scal2=25)) THE RESULT I get the error message: Error in nls(size ~ SSlogis(Time, Asym1, xmid1, scal1) + SSlogis(Time, : step factor 0.000488281 reduced below `minFactor' of 0.000976563 Assistance in doing this analysis would be much appreciated. Dan Bebber Dr. Daniel P. Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB Tel. 01865 275060 Web. http://www.forestecology.co.uk/ "Data, data, data!" he cried impatiently. "I can't make bricks without clay" - Sherlock Holmes, The Adventure of the Copper Beeches, 1892 __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Multiple logistic curves
Dear list, Apologies, I have sent this message before but received no replies so I'm trying again just in case... Motivated by the discovery of 'loglet analysis' (http://phe.rockefeller.edu/LogletLab/) that allows decomposition of growth curves into a series of logistic equations, I attempted to do the same thing in R. #SIMULATED DATA Time <- 1:200 pop.size <- SSlogis(Time,10,20,5) + SSlogis(Time,20,100,20) + rnorm(length(Time)) ts.plot(pop.size) #MY ANALYSIS results <- nls(pop.size ~ SSlogis(Time, Asym1, xmid1, scal1) + SSlogis(Time, Asym2, xmid2, scal2), start = list(Asym1=5, xmid1=15, scal1=30, Asym2=25, xmid2=60, scal2=25)) THE RESULT I get the error message: Error in nls(size ~ SSlogis(Time, Asym1, xmid1, scal1) + SSlogis(Time, : step factor 0.000488281 reduced below `minFactor' of 0.000976563 Any hints in making this analysis work would be greatly appreciated. Dan Bebber Dr. Daniel P. Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB Tel. 01865 275060 Web. http://www.forestecology.co.uk/ __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] spatial autcorrelation in glmmPQL
I am unable to specify error spatial autocorrelation structure in glmmPQL: DATA x and y coordinates for sample points at which presence/absence of seedlings and canopy openness were recorded in different forest stands. QUESTION Does seedling density increase with canopy openness? ANALYSIS Initial analysis using glmmPQL with binomial errors and stand as random effect > result <- glmmPQL(seedlings ~ canopy, random = ~1|stand, family=binomial, data=regen) Semivariogram of residuals from this fit using geoR showed spatial autocorrelation with range 34.9 m and relative nugget of 75%. Therefore I tried to create a corStruct object: > corel <- corSpher(value = c(34.9,0.75), form = ~x+y, nugget=TRUE) > corel <- Initialize(corel, data = regen) And specify the structure in a new glmmPQL > result2 <- glmmPQL(seedlings ~ canopy, random = ~|stand, family = binomial, correlation = corel, data=regen) PROBLEM I get the error message: iteration 1 Error in eval(expr, envir, enclos) : Object "x" not found (Running R 1.9.1 on Windows ME) I appear to be implementing corSpher and Initialize incorrectly- any help greatly appreciated. Dr. Daniel P. Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB Tel. 01865 275060 --- __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] Repeated measures
Hi Sean, I'm not sure I quite understand your question. Am I right in thinking that: state = a binomial dependent variable measure = a continuous predictor If so, perhaps you could try using glmmPQL (Generalized Linear Mixed Models fitted by Penalized Quasi-Likelihood) in library MASS. The model would include random intercepts for each individual, have binomial errors, and some kind of continuous autoregressive error structure (I expect), and would look something like results<-glmmPQL(fixed=state~measure,random=~1|individual, family=binomial, correlation=corCar1(args...),data=your.data) If I've got the wrong end of the stick, my apologies. Dan Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB UK Tel. 01865 275000 -- Message: 11 Date: Wed, 6 Oct 2004 08:07:38 -0400 From: Sean Davis <[EMAIL PROTECTED]> Subject: [R] Repeated measures To: r-help <[EMAIL PROTECTED]> Message-ID: <[EMAIL PROTECTED]> Content-Type: text/plain; charset=US-ASCII; format=flowed I have a data set in which I have 5000 repeated measures on 6 subjects over time (varying intervals, but measurements for all individuals are at the same times). There are two states, a "resting" state (the majority of the time), and a perturbed state. I have a continuous measurement at each time point for each of the individuals. I would like to determine the "state" for each individual at each time point. It looks to me like I should be able to do this with the "hidden" command from the "repeated" package (http://popgen0146uns50.unimaas.nl/~jlindsey/rcode.html), but I have found it a bit confusing to get started. The distributions in the two states are approximately normal with differences in centrality and possibly variance (but I can start by assuming similar variances). Thanks, Sean __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] sample variogram construction
Hi Matt, there are several R packages that will compute the sample variogram for you. Check out GeoR, sgeostat, nlme, spatial. There's no point in recoding the whole lot yourself, unless as a learning excercise. D p.s. For time series autocorrelations, you could use acf in package stats. Message: 9 Date: Mon, 25 Oct 2004 02:02:06 -0400 From: [EMAIL PROTECTED] Subject: [R] sample variogram construction To: [EMAIL PROTECTED] Message-ID: <[EMAIL PROTECTED]> Content-Type: text/plain; charset=US-ASCII Hi Im attempting to build a sample variogram for 300 obersvations of longitudinal data. So what I need to do is compute the half squared differences between pairs of residuals (for instance if a subject has 4 obersvations, this is 4 choose 2 paird differences) for each subject. Also, then I need the corresponding time differences within each individual. So the end result will be a 300 by 2 matrix with columns corresponding to paired difference residuals within subject and time differences within subject. Basically im having trouble coding this kind of matrix in R, if anyone can help me out or give me some tips id appreciate it. Thanks. Stuck in the for loop student __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] sample variogram construction
Matt, In this case you are plotting the variogram of the residuals of the lme object, not of the data themselves. In your model you are assuming a linear relationship between count and time, with different intercepts and slopes for your different individuals. You are also assuming that the residuals exhibit stationarity. The Variogram function *should* work, so I can only suggest that there is something wrong with your data or code. "Mixed-Effects Models in S and S-Plus" by Pinheiro and Bates will probably help. Dan Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB UK Tel. 01865 275000 > -Original Message- > From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] > Sent: 25 October 2004 18:32 > To: [EMAIL PROTECTED]; [EMAIL PROTECTED] > Subject: RE: [R] sample variogram construction > > > Okay thanks! > No I have the following difficulty implementing a variogram > in the nlme package: > > > cd1 <- lme(count ~ time, data=cd4,random= ~ time | id) > > plot(Variogram(cd1, form= ~ time | id, robust=TRUE)) > Error in as.array(X) : attempt to set an attribute on NULL > > Im not sure how to fix this, and apply this function to > a longitudinal data with unequal times? > > Any help would be appreciated > > > Quoting Dan Bebber <[EMAIL PROTECTED]>: > > > Hi Matt, > > > > there are several R packages that will compute the sample variogram > > for you. Check out GeoR, sgeostat, nlme, spatial. There's > no point in > > recoding the whole lot yourself, unless as a learning excercise. > > > > D > > > > p.s. For time series autocorrelations, you could use acf in package > > stats. > > > > Message: 9 > > Date: Mon, 25 Oct 2004 02:02:06 -0400 > > From: [EMAIL PROTECTED] > > Subject: [R] sample variogram construction > > To: [EMAIL PROTECTED] > > Message-ID: <[EMAIL PROTECTED]> > > Content-Type: text/plain; charset=US-ASCII > > > > Hi > > > > Im attempting to build a sample variogram for 300 obersvations of > > longitudinal data. So what I need to do is compute the half squared > > differences between pairs of residuals (for instance if a > subject has > > 4 obersvations, this is 4 choose 2 paird differences) for each > > subject. Also, then I need the corresponding time > differences within > > each individual. So the end result will be a 300 by 2 matrix with > > columns corresponding to paired difference residuals within subject > > and time differences within subject. Basically im having trouble > > coding this kind of matrix in R, if anyone can help me out > or give me > > some tips id appreciate it. > > > > Thanks. > > Stuck in the for loop > > student > > > > > > __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Error with package update
I received the following error when I attempted to update my packages: updating HTML package descriptions Warning messages: 1: DESCRIPTION file of package 'file28862' missing or broken in: packageDescription(p, lib = lib, fields = pkgFlds) 2: number of columns of result not a multiple of vector length (arg 2) in: rbind(retval, c(p, lib, desc)) 3: DESCRIPTION file of package 'file7460' missing or broken in: packageDescription(p, lib = lib, fields = pkgFlds) 4: number of columns of result not a multiple of vector length (arg 2) in: rbind(retval, c(p, lib, desc)) 5: DESCRIPTION file of package 'file28862' missing or broken in: packageDescription(i, fields = "Title", lib.loc = lib) 6: DESCRIPTION file of package 'file7460' missing or broken in: packageDescription(i, fields = "Title", lib.loc = lib) I am running R 2.0 on Windows XP. Any ideas what caused this? Many thanks, Dan Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB UK Tel. 01865 275000 __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] summary.lme() vs. anova.lme()
Dear R list: I modelled changes in a variable (mconc) over time (d) for individuals (replicate) given one of three treatments (treatment) using: mconc.lme <- lme(mconc~treatment*poly(d,2), random=~poly(d,2)|replicate, data=my.data) summary(mconc.lme) shows that the linear coefficient of one of the treatments is significantly different to zero, viz. Value Std.Error DF t-value p-value ... ... ... ... ... treatmentf:poly(d, 2)1 1.3058562 0.5072409 315 2.574430 0.0105 But anova(mconc.lme) gives a non-significant result for the treatment*time interaction, viz. numDF denDF F-value p-value (Intercept) 1 315 159.17267 <.0001 treatment239 0.51364 0.6023 poly(d, 2) 2 315 17.43810 <.0001 treatment:poly(d, 2) 4 315 2.01592 0.0920 Pinheiro & Bates (2000) only discusses anova() for single arguments briefly on p.90. I would like to know whether these results indicate that the significant effect found in summary(mconc.lme) is spurious (perhaps due to multiplicity). Many thanks, Dan Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB UK Tel. 01865 275000 __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
RE: [R] summary.lme() vs. anova.lme()
For anyone following this thread in the future: Following Prof. Ripley's advice, I compared models fitted with ML, with and without treatment as a predictor: > anova(mconc.lme1,mconc.lme2) Model df AIC BIC logLik Test L.Ratio p-value mconc.lme1 1 10 -1366.184 -1327.240 693.0919 mconc.lme2 2 16 -1363.095 -1300.785 697.5475 1 vs 2 8.91124 0.1786 I can't reject the null hypothesis of no effect of treatment. Many thanks. Dan Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB UK Tel. 01865 275000 > -Original Message- > From: Prof Brian Ripley [mailto:[EMAIL PROTECTED] > Sent: 17 November 2004 15:32 > To: Dan Bebber > Cc: [EMAIL PROTECTED] > Subject: Re: [R] summary.lme() vs. anova.lme() > > > On Wed, 17 Nov 2004, Dan Bebber wrote: > > > I modelled changes in a variable (mconc) over time (d) for > individuals > > (replicate) given one of three treatments (treatment) > using: mconc.lme > > <- lme(mconc~treatment*poly(d,2), random=~poly(d,2)|replicate, > > data=my.data) > > > > summary(mconc.lme) shows that the linear coefficient of one of the > > treatments is significantly different to zero, viz. > >Value Std.Error DF t-value p-value > > ... ... ... ... > > ... > > treatmentf:poly(d, 2)1 1.3058562 0.5072409 315 2.574430 0.0105 > > > > But anova(mconc.lme) gives a non-significant result for the > > treatment*time interaction, viz. > > numDF denDF F-value p-value > > (Intercept) 1 315 159.17267 <.0001 > > treatment239 0.51364 0.6023 > > poly(d, 2) 2 315 17.43810 <.0001 > > treatment:poly(d, 2) 4 315 2.01592 0.0920 > > > > Pinheiro & Bates (2000) only discusses anova() for single arguments > > briefly on p.90. I would like to know whether these results > indicate > > that the significant effect found in summary(mconc.lme) is spurious > > (perhaps due to multiplicity). > > Probably yes (but p values of 9% and 1% are not that > different, and in > both cases you are looking at a few p values). > > But since both summary.lme and anova.lme use Wald tests, I > would use a > LRT, using anova on two fits (and I would use ML fits to get > a genuine > LRT but that is perhaps being cautious). > > To Dimitris Rizopoulos: as this is the last term in the > sequential anova, > it is the correct Wald test. > > -- > Brian D. Ripley, [EMAIL PROTECTED] > Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ > University of Oxford, Tel: +44 1865 272861 (self) > 1 South Parks Road, +44 1865 272866 (PA) > Oxford OX1 3TG, UKFax: +44 1865 272595 > __ [EMAIL PROTECTED] mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] R package classification
Dear list, there are now >400 packages available on CRAN. Would it be useful to classify these packages according to what they do (e.g. classification, graphics, spatial statistics), to assist the user in finding the appropriate package for their problem? Or perhaps the search facility is enough. I would attempt such a classification, but my knowledge of statistical methods isn't good enough. Dan Bebber Department of Plant Sciences University of Oxford UK __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] logistic regression
Helene, you should read up about AIC, deviance, and deviance residuals, then look at the summary() for your model. Dan Dr. Daniel P. Bebber Department of Plant Sciences University of Oxford OX1 3RB UK Message: 78 Date: Tue, 8 Feb 2005 11:15:35 +0100 From: [EMAIL PROTECTED] Subject: [R] logistic regression To: r-help@stat.math.ethz.ch Cc: [EMAIL PROTECTED] Message-ID: <[EMAIL PROTECTED]> Content-Type: text/plain; charset=ISO-8859-1 Hi, I'm using glm function to do logistic regression and now I want to know if it exists a kind of R-squared with this function in order to check the model. Thank you. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] RE: glmmPQL questions
Jo, It looks like farm is your level of replication, so you don't need to specify farm as a random factor. A linear model 'lm' with binomial errors (a.k.a. logistic regression) is enough. You only need to specify different error strata if, say, you had sampled each farm several times. Is that what you mean by 'sampling cluster'? BUT, there is very likely some spatial dependence among farms, so you will also need to model this. If you want to constrain the analysis, check out 'subset'. Missing values: you have to remove farms with missing values from the analysis. Look up 'na.omit'. I think perhaps you need to consult a statistician at the Edinburgh stats department to get info on the appropriate analyses, as the R-help list is usually restricted to R-specific questions. There is a massive amount of literature on agricultural epidemiology (esp. following foot & mouth), so read up to see what has been done before. Cheers, Dan Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB > > Message: 4 > Date: Mon, 28 Mar 2005 12:06:25 +0100 > From: JEB Halliday <[EMAIL PROTECTED]> > Subject: [R] glmmPQL questions > To: r-help@stat.math.ethz.ch > Message-ID: <[EMAIL PROTECTED]> > Content-Type: text/plain; charset=ISO-8859-15 > > > I am looking a risk factors for disease in cattle and am > interested in modelling > farm and sampling cluster as random effects (My outcome is > positive or negative > at the level of the farm). I am using R version 2.0.1 on a Mac and have > identified glmmPQL as hopefully the correct function to use. I have run a > couple of models using this but was hoping that you might be able > to answer a > few questions. > > e.g. model<-glmmPQL(farmstatus~cattlenumber,random~1|farm,binomial) > > I am pretty new to both R and stats so if these questions are > very simple and I > am just missing something, suggestions about good texts on GLMM > in R would be > great. > > First up, what is the best way to constrain the model to only > look at certain > levels of a multi-level factor e.g. a categorisation of cattle > number where all > points of high influence > > (as determined using: summary(influence.measures(model)) ) > > are confined to the largest class (D) and I want to run the model > which just > looks at levels A,B and C? (or only months May-September..) > > I was also wondering about the best way to force specified > variables to remain > in the model when running e.g. stepwise selection of interaction terms? > > Finally, is there is a recognised method for dealing with missing > values in > these models? > and as a minor point the models do not run unless i specify the > data= part of > the syntax and as this is apparently an optional piece of > information I was > wondering why this is required when all of my variables are in > the same data > frame (and even when this data frame is attached?) > > Any help would be greatly appreciated > > Jo Halliday > MSc student > University of Edinburgh > [EMAIL PROTECTED] > __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
Re: [R] Dead wood code
Mike, I used R for exactly that purpose, to test a new method for sampling coarse woody debris in silico against existing alternatives. The results are published in: Bebber, D.P. & Thomas, S.C., 2003. Prism sweeps for coarse woody debris. Canadian Journal of Forest Research 33, 17371743. I will put a reprint in the post, and will forward the R code in a separate message. Please cite the article and acknowledge the original code in any publications. Cheers, Dan Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB UK > > Date: Tue, 5 Apr 2005 10:07:00 -0400 > From: "Mike Saunders" > <[EMAIL PROTECTED]> > To: "R Help" > Subject: [R] Dead wood code > > > Is there a package, or does anyone have code they > are willing to share, > that would allow me to simulate sampling of dead > wood pieces across an > area? I am specifically looking for code to > simulate the dead wood > distribution as small line segments across an > extent, and then I will > "sample" the dead wood using different sampling > techniques including > line transects, fixed area plots, etc. > > Thanks, > Mike > > Mike Saunders > Research Assistant > Forest Ecosystem Research Program > Department of Forest Ecosystem Sciences > University of Maine > Orono, ME 04469 > 207-581-2763 (O) > 207-581-4257 (F) > > [[alternative HTML version deleted]] > > __ > R-help@stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! > http://www.R-project.org/posting-guide.html > > > > Send instant messages to your online friends http://uk.messenger.yahoo.com __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
[R] Circular statistics with (direction,size) data
Dear List, I have circular data comprising direction and size for a number of observations. I wish to calculate statistics such as the mean direction and concentration for these data. I note that the CircStats package only deals with angular data (i.e. "Each observation is treated as a unit vector or a point on the unit circle". There appears to be no facility for weighting observations by vector size. My solution would be to create a data set of angles in which my observations are repeated in proportion to the vector size, but this would lead to very large data sets. Does anyone have a better solution to this problem? Many thanks, Dan Bebber Dr. Daniel P. Bebber Department of Plant Sciences University of Oxford South Parks Road Oxford OX1 3RB Tel. 01865 275060 Web. http://www.forestecology.co.uk/ "Data, data, data!" he cried impatiently. "I can't make bricks without clay" - Sherlock Holmes, The Adventure of the Copper Beeches, 1892 [[alternative HTML version deleted]] __ [EMAIL PROTECTED] mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html