[R] Redirecting print output
I see a rich set of graphic device functions to redirect that output. Are there commands to redirect text as well. I have a set of functions that execute many linear regression tests serially and I want to capture this in a file for printing. Thanks, Stan Hopkins [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] OT(slightly) - Tracking extended projects
James MacDonald jmacdon at med.umich.edu writes: Hi all, Most of the analyses I do are short little once-and-done type things that are easily encapsulated in a .Rnw file. However, I sometimes end up with projects that take an extended amount of time. Usually these projects are not easily encapsulated in an .Rnw file, so I have been using a single .R file with lots of comments. The problem with this approach is keeping track of what you have done and what the results were. Once the .R file gets to be a certain size, the comments aren't as useful, and I find it easy to get lost. I have to assume that others have encountered this problem and hopefully have come up with something more elegant. Any suggestions? Best, Jim One possible choice is the intuitively structured approach of Projects in Tinn-R (see http://www.sciviews.org/Tinn-R/). -- D L McArthur dmca at ucla.edu __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] function optimization: reducing the computing time
Dear useRs, I have written a function that implements a Bayesian method to compare a patient's score on two tasks with that of a small control group, as described in Crawford, J. and Garthwaite, P. (2007). Comparison of a single case to a control or normative sample in neuropsychology: Development of a bayesian approach. Cognitive Neuropsychology, 24(4):343–372. The function (see below) return the expected results, but the time needed to compute is quite long (at least for a test that may be routinely used). There is certainly room for improvement. It would really be helpful if some experts of you may have a look ... Thanks a lot. Regards, Matthieu FUNCTION -- The function takes the performance on two tasks and estimate the rarity (the p-value) of the difference between the patient's two scores, in comparison to the difference i the controls subjects. A standardized and an unstandardized version are provided (controlled by the parameter standardized: T vs. F). Also, for congruency with the original publication, both the raw data and summary statistics could be used for the control group. ## # Bayesian (un)Standardized Difference Test ## #from Crawford and Garthwaite (2007) Cognitive Neuropsychology # implemented by Matthieu Dubois, Matthieu.Duboisatpsp.ucl.ac.be #PACKAGE MCMCpack REQUIRED # patient: a vector with the two scores; controls: matrix/data.frame with the raw scores (one column per task) # mean.c, sd.c, r, n: possibility to enter summaries statistics (mean, standard deviation, correlation, group size) # n.simul: number of simulations # two-sided (Boolean): two-sided (T) vs. one-sided (F) Bayesian Credible interval # standardized (Boolean): standardized (T) vs. unstandardized (F) test # values are: $p.value (one_tailed), $confidence.interval crawford.BSDT - function(patient, controls, mean.c=0, sd.c=0 , r=0, n=0, na.rm=F, n.simul=10, two.sided=T, standardized=T) { library(MCMCpack) #if no summaries are entered, they are computed if(missing(n)) { if(!is.data.frame(controls)) controls - as.data.frame(controls) n - dim(controls)[1] mean.c - mean(controls, na.rm=na.rm) sd.c - sd(controls, na.rm=na.rm) na.method - ifelse(na.rm,complete.obs,all.obs) r - cor(controls[,1], controls[,2], na.method) } #variance/covariance matrix s.xx - (sd.c[1]^2) * (n-1) s.yy - (sd.c[2]^2) * (n-1) s.xy - sd.c[1] * sd.c[2] * r * (n-1) A - matrix(c(s.xx, s.xy, s.xy, s.yy), ncol=2) #estimation function if(standardized) { estimation - function(patient, mean.c, n, A) { #estimation of a variance/covariance matrix (sigma) sigma = riwish(n,A) #random obs. from an inverse-Wishart distribution #estimation of the means (mu) z - rnorm(2) T - t(chol(sigma)) #Cholesky decomposition mu - mean.c + T %*% z/sqrt(n) #standardization z.x - (patient[1]-mu[1]) / sqrt(sigma[1,1]) z.y - (patient[2]-mu[2]) / sqrt(sigma[2,2]) rho.xy - sigma[2.2] / sqrt(sigma[1,1]*sigma[2,2]) z.star - (z.x - z.y) / sqrt(2-2*rho.xy) #conditional p-value p - pnorm(z.star) p } } else { estimation - function(patient, mean.c, n, A) { #estimation of a variance/covariance matrix (sigma) sigma = riwish(n,A) #random obs. from an inverse-Wishart distribution #estimation of the means (mu) z - rnorm(2) T - t(chol(sigma)) #Cholesky decomposition mu - mean.c + T %*% z/sqrt(n) num - (patient[1]-mu[1]) - (patient[2] - mu[2]) denom - sqrt(sigma[1,1]+sigma[2,2]-(2*sigma[1,2])) z.star - num/denom #conditional p-value p - pnorm(z.star) p } } #application p - replicate(n.simul, estimation(patient, mean.c, n, A)) #outputs pval - mean(p) CI - if(two.sided) 100*quantile(p,c(0.025,0.975)) else 100*quantile (p,c(0.95)) output -
Re: [R] The 'REP' term in AMMI{agricolae}
The AMMI senso strictu part starts from the corrected data for genotype and environment. In most cases where AMMI is applied (maybe also in the agricolae package, I haven't checked), starts from the interaction effects obtained through a general linear model anova. It should be possible to replace the input by blups from your mixed model. R news 7/1 (p14-19) gave some basic code for AMMI. There you should be able to get ideas how to modify the code. Joris CG Pettersson [EMAIL PROTECTED] e.slu.se To Sent by: r-help@stat.math.ethz.ch [EMAIL PROTECTED] cc at.math.ethz.ch Subject [R] The 'REP' term in 23/07/2007 19:59 AMMI{agricolae} Dear all, W2k, R 2.5.1 I am trying out the AMMI function in the agricolae package, to analyse the dependence of environment for a certain cultivar. The function responds to four basic variables: ENV Environment GEN Genotype REP Replication Y Response My question concerns the 'REP' term. When I normally do an analysis of variance, the replication would refer to repeated observations within the each field trial. In the case I am analysing right now, I have five years of observations for each 'ENV', in such a case it feels natural to use year as the most important replication of the environment- especially as I am interested in long term trends. This approach also seems to work allright. But the field trials are also structured in three main blocks, subdivided into five 'lattice' blocks, a structure that is powerful in the analysis of variance. (I use a random call in lme{nlme}). Is it possible to use the block structure also in the AMMI analysis? If that is possible, how should I code? I have tried to find out in the documentation, but if it is stated there I do not understand it. Thank you /CG -- CG Pettersson, PhD Swedish University of Agricultural Sciences (SLU) Dept. of Crop Production Ecology. Box 7043. SE-750 07 Uppsala, Sweden [EMAIL PROTECTED] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Redirecting print output
Stan Hopkins wrote: I see a rich set of graphic device functions to redirect that output. Are there commands to redirect text as well. I have a set of functions that execute many linear regression tests serially and I want to capture this in a file for printing. Thanks, Stan Hopkins Yes, there are. ?sink You could also run your functions from a batch mode: R your_script.R output.txt or Rscript your_script.R output.txt This, however, will give you a single file, while sink() allows creation of multiple files. capture.output can store the output in an array of character strings. -- View this message in context: http://www.nabble.com/Redirecting-print-output-tf4134131.html#a11758652 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Redirecting print output
Here are two simple ways: === method1 === cat(line1,\n,file=output.txt) cat(line2,\n,file=output.txt,append=TRUE) === method2 === sink(output.txt) cat(line1,\n) cat(line2,\n) out - lm(y~x,data=data.frame(x=1:10,y=(1:10+rnorm(10,0,0.1 print(out) sink() And then there is 'Sweave'. Check out, for instance http://www.stat.umn.edu/~charlie/Sweave/ You can embed R code, figures, and output from print methods into your latex document. ST --- Stan Hopkins [EMAIL PROTECTED] wrote: I see a rich set of graphic device functions to redirect that output. Are there commands to redirect text as well. I have a set of functions that execute many linear regression tests serially and I want to capture this in a file for printing. Thanks, Stan Hopkins [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Ready for the edge of your seat? Check out tonight's top picks on Yahoo! TV. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] OT(slightly) - Tracking extended projects
At 22:36 23.07.2007 +, D L McArthur wrote: James MacDonald jmacdon at med.umich.edu writes: Hi all, Most of the analyses I do are short little once-and-done type things that are easily encapsulated in a .Rnw file. However, I sometimes end up with projects that take an extended amount of time. Usually these projects are not easily encapsulated in an .Rnw file, so I have been using a single .R file with lots of comments. The problem with this approach is keeping track of what you have done and what the results were. Once the .R file gets to be a certain size, the comments aren't as useful, and I find it easy to get lost. I have to assume that others have encountered this problem and hopefully have come up with something more elegant. Any suggestions? Best, Jim One possible choice is the intuitively structured approach of Projects in Tinn-R (see http://www.sciviews.org/Tinn-R/). -- D L McArthur dmca at ucla.edu Or, in case you use emacs and ESS you could find some hints at the ESS help list ([EMAIL PROTECTED]). Look for [ESS] hide function bodies and [ESS] outline-minor-mode. Heinz __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] geoR/likfit: variance explained by model
I have been modelling spatial data using function likfit in package geoR. Now that I have the spatial regression models, I would like to calculate the amount of variance explained by both the trend and the spatial parts of the model. Any advice on how to do this would be much appreciated. Cheers, Ailsa __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] persp and greek symbols in the axes labels
I don't know why it doesn't work but I think people generally recommend that you use wireframe() in lattice rather than persp(), because wireframe is more customizable (the pdf document referred to in this post is pretty good): http://tolstoy.newcastle.edu.au/R/e2/help/07/03/12534.html Here's an example: library(lattice) library(reshape) x - 1:5 y - 1:3 z - matrix(1:15,ncol=3,dimnames=list(NULL,y)) M - melt(data.frame(x,z,check.names=FALSE),id=1,variable=y) wireframe(value~x*y,data=M, screen=list(z=45,x=-75), xlab=expression(kappa[lambda]), ylab=as.expression(substitute(paste(phi,=,true,sigma), list(true=5))), zlab = Z) [you can play around with the 'screen' argument to rotate the view, analogous to phi and theta in persp()] --- Nathalie Peyrard [EMAIL PROTECTED] wrote: Hello, I am plotting a 3D function using persp and I would like to use greek symbols in the axes labels. I have found examples like this one on the web: plot(0,0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,=,true,sigma),list(true=5))) this works well with plot but not with persp: with the command persp(M,theta = -20,phi = 0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,=,true,sigma),list(true=5)),zlab = Z) I get the labels as in toto.eps Any suggestion? Thanks! Nathalie -- ~~ INRA Toulouse - Unité de Biométrie et Intelligence Artificielle Chemin de Borde-Rouge BP 52627 31326 CASTANET-TOLOSAN cedex FRANCE Tel : +33(0)5.61.28.54.39 - Fax : +33(0)5.61.28.53.35 Web :http://mia.toulouse.inra.fr/index.php?id=217 ~~ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Ready for the edge of your seat? Check out tonight's top picks on Yahoo! TV. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] function optimization: reducing the computing time
You need to make use of the profiling methods described in 'Writing R Exensions'. My machine is about 4x faster than yours: I get Each sample represents 0.02 seconds. Total run time: 62.08041 seconds. Total seconds: time spent in function and callees. Self seconds: time spent in function alone. % total % self totalseconds selfsecondsname 100.00 62.08 0.00 0.00 system.time 99.94 62.04 0.00 0.00 crawford.BSDT 99.94 62.04 0.00 0.00 eval 99.10 61.52 1.00 0.62 lapply 99.10 61.52 0.00 0.00 sapply 99.00 61.46 0.00 0.00 replicate 98.61 61.22 2.26 1.40 FUN 98.26 61.00 3.32 2.06 estimation 83.92 52.10 0.26 0.16 riwish 83.67 51.94 4.25 2.64 solve 55.57 34.50 7.18 4.46 solve.default 51.68 32.08 3.77 2.34 rwish ... so 84% of the time is being spent in riwish. Now given that A is fixed, you should be able to speed that up by precomputing the constant parts of the computation (and you can also precompute your 'T'). On Tue, 24 Jul 2007, Matthieu Dubois wrote: Dear useRs, I have written a function that implements a Bayesian method to compare a patient's score on two tasks with that of a small control group, as described in Crawford, J. and Garthwaite, P. (2007). Comparison of a single case to a control or normative sample in neuropsychology: Development of a bayesian approach. Cognitive Neuropsychology, 24(4):343ÿÿ372. The function (see below) return the expected results, but the time needed to compute is quite long (at least for a test that may be routinely used). There is certainly room for improvement. It would really be helpful if some experts of you may have a look ... Thanks a lot. Regards, Matthieu FUNCTION -- The function takes the performance on two tasks and estimate the rarity (the p-value) of the difference between the patient's two scores, in comparison to the difference i the controls subjects. A standardized and an unstandardized version are provided (controlled by the parameter standardized: T vs. F). Also, for congruency with the original publication, both the raw data and summary statistics could be used for the control group. ## # Bayesian (un)Standardized Difference Test ## #from Crawford and Garthwaite (2007) Cognitive Neuropsychology # implemented by Matthieu Dubois, Matthieu.Duboisatpsp.ucl.ac.be #PACKAGE MCMCpack REQUIRED # patient: a vector with the two scores; controls: matrix/data.frame with the raw scores (one column per task) # mean.c, sd.c, r, n: possibility to enter summaries statistics (mean, standard deviation, correlation, group size) # n.simul: number of simulations # two-sided (Boolean): two-sided (T) vs. one-sided (F) Bayesian Credible interval # standardized (Boolean): standardized (T) vs. unstandardized (F) test # values are: $p.value (one_tailed), $confidence.interval crawford.BSDT - function(patient, controls, mean.c=0, sd.c=0 , r=0, n=0, na.rm=F, n.simul=10, two.sided=T, standardized=T) { library(MCMCpack) #if no summaries are entered, they are computed if(missing(n)) { if(!is.data.frame(controls)) controls - as.data.frame(controls) n - dim(controls)[1] mean.c - mean(controls, na.rm=na.rm) sd.c - sd(controls, na.rm=na.rm) na.method - ifelse(na.rm,complete.obs,all.obs) r - cor(controls[,1], controls[,2], na.method) } #variance/covariance matrix s.xx - (sd.c[1]^2) * (n-1) s.yy - (sd.c[2]^2) * (n-1) s.xy - sd.c[1] * sd.c[2] * r * (n-1) A - matrix(c(s.xx, s.xy, s.xy, s.yy), ncol=2) #estimation function if(standardized) { estimation - function(patient, mean.c, n, A) { #estimation of a variance/covariance matrix (sigma) sigma = riwish(n,A) #random obs. from an inverse-Wishart distribution #estimation of the means (mu) z - rnorm(2) T - t(chol(sigma)) #Cholesky decomposition mu - mean.c + T %*% z/sqrt(n) #standardization z.x - (patient[1]-mu[1]) / sqrt(sigma[1,1]) z.y - (patient[2]-mu[2]) / sqrt(sigma[2,2]) rho.xy - sigma[2.2] / sqrt(sigma[1,1]*sigma[2,2]) z.star - (z.x - z.y) / sqrt(2-2*rho.xy) #conditional p-value p - pnorm(z.star) p } } else {
Re: [R] Fitting exponential curve to data points
On 24-Jul-07 01:09:06, Andrew Clegg wrote: Hi folks, I've looked through the list archives and online resources, but I haven't really found an answer to this -- it's pretty basic, but I'm (very much) not a statistician, and I just want to check that my solution is statistically sound. Basically, I have a data file containing two columns of data, call it data.tsv: year count 1999 3 2000 5 2001 9 2002 30 2003 62 2004 154 2005 245 2006 321 These look exponential to me, so what I want to do is plot these points on a graph with linear axes, and add an exponential curve over the top. I also want to give an R-squared for the fit. The way I did it was like so: # Read in the data, make a copy of it, and take logs data = read.table(data.tsv, header=TRUE) log.data = data log.data$count = log(log.data$count) # Fit a model to the logs of the data model = lm(log.data$count ~ year, data = log.data) # Plot the original data points on a graph plot(data) # Draw in the exponents of the model's output lines(data$year, exp(fitted(model))) Is this the right way to do it? log-ing the data and then exp-ing the results seems like a bit of a long-winded way to achieve the desired effect. Is the R-squared given by summary(model) a valid measurement of the fit of the points to an exponential curve, and should I use multiple R-squared or adjusted R-squared? The R-squared I get from this method (0.98 multiple) seems a little high going by the deviation of the last data point from the curve -- you'll see what I mean if you try it. I just did. From the plot of log(count) against year, with the plot of the linear fit of log(count)~year superimposed, I see indications of a non-linear relationship. The departures of the data from the fit follow a rather systematic pattern. Initially the data increase more slowly than the fit, and lie below it. Then they increase faster and corss over above it. Then the data increase less fast than the fit, and the final data point is below the fit. There are not enough data to properly identify the non-linearity, but the overall appearance of the data plot suggests to me that you should be considering one of the growth curve models. Many such models start of with an increasing rate of growth, which then slows down, and typically levels off to an asymptote. The apparent large discrepancy of your final data point could be compatible with this kind of behaviour. At this point, knowledge of what kind of thing is represented by your count variable might be helpful. If, for instance, it is the count of the numbers of individuals of a species in an area, then independent knowledge of growth mechanisms may help to narrow down the kind of model you should be tring to fit. As to your question about Is this the right way to do it (i.e. fitting an exponential curve by doing a linear fit of the logarithm), generally speaking the answer is Yes. But of course you need to be confident that exponential is the right curve to be fitting in the first place. If it's the wrong type of curve to be considering, then it's not the right way to do it! Hoping this help[s, Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 24-Jul-07 Time: 10:08:33 -- XFMail -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Warning generated by Panda GateDefender Integra.
07/24/2007 11:10:07 [GMT+0100] For security reasons certain items found in an email with your address as the sender have not been accepted. File name: attachment.zip Filtered by: Malformed messages Sender: r-help@stat.math.ethz.ch Recipients: [EMAIL PROTECTED] CC: __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fitting exponential curve to data points
I think your way is probably the easiest (shockingly). For instance, here are some alternatives - I think in both cases you have to calculate the coefficient of determination (R^2) manually. My understanding is that multiple R^2 in your case is the usual R^2 because you only have one predictor variable, and the adjusted R^2 considers the degrees of freedom and penalizes for additional predictors. Which is better... depends? (Perhaps more stats-savvy people can help you on that one. I'm a chemical engineer so I unjustifiably claim ignorance). ## Data input input - Year Count 19993 20005 20019 200230 200362 2004154 2005245 2006321 dat - read.table(textConnection(input),header=TRUE) dat[,] - lapply(dat,function(x) x-x[1]) # shifting in origin; will need to add back in later ## Nonlinear least squares plot(dat) out - nls(Count~b0*exp(b1*Year),data=dat, start=list(b0=1,b1=1)) lines(dat[,1],fitted(out),col=2) out - nls(Count~b0+b1*Year+b2*Year^2,data=dat, #polynomial start=list(b0=0,b1=1,b2=1)) lines(dat[,1],fitted(out),col=3) ## Optim f - function(.pars,.dat,.fun) sum((.dat[,2]-.fun(.pars,.dat[,1]))^2) fitFun - function(b,x) cbind(1,x,x^2)%*%b expFun - function(b,x) b[1]*exp(b[2]*x) plot(dat) out - optim(c(0,1,1),f,.dat=dat,.fun=fitFun) lines(dat[,1],fitFun(out$par,dat[,1]),col=2) out - optim(c(1,1),f,.dat=dat,.fun=expFun) lines(dat[,1],expFun(out$par,dat[,1]),col=3) --- Andrew Clegg [EMAIL PROTECTED] wrote: Hi folks, I've looked through the list archives and online resources, but I haven't really found an answer to this -- it's pretty basic, but I'm (very much) not a statistician, and I just want to check that my solution is statistically sound. Basically, I have a data file containing two columns of data, call it data.tsv: year count 1999 3 2000 5 2001 9 2002 30 2003 62 2004 154 2005 245 2006 321 These look exponential to me, so what I want to do is plot these points on a graph with linear axes, and add an exponential curve over the top. I also want to give an R-squared for the fit. The way I did it was like so: # Read in the data, make a copy of it, and take logs data = read.table(data.tsv, header=TRUE) log.data = data log.data$count = log(log.data$count) # Fit a model to the logs of the data model = lm(log.data$count ~ year, data = log.data) # Plot the original data points on a graph plot(data) # Draw in the exponents of the model's output lines(data$year, exp(fitted(model))) Is this the right way to do it? log-ing the data and then exp-ing the results seems like a bit of a long-winded way to achieve the desired effect. Is the R-squared given by summary(model) a valid measurement of the fit of the points to an exponential curve, and should I use multiple R-squared or adjusted R-squared? The R-squared I get from this method (0.98 multiple) seems a little high going by the deviation of the last data point from the curve -- you'll see what I mean if you try it. Thanks in advance for any help! Yours gratefully, Andrew. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fitting exponential curve to data points
Well spoken. And since log transformations are nonlinear and 'compresses' the data, it's not surprising to find that the fit doesn't look so nice while the fit metrics tell you that a model does a good job. --- [EMAIL PROTECTED] wrote: On 24-Jul-07 01:09:06, Andrew Clegg wrote: Hi folks, I've looked through the list archives and online resources, but I haven't really found an answer to this -- it's pretty basic, but I'm (very much) not a statistician, and I just want to check that my solution is statistically sound. Basically, I have a data file containing two columns of data, call it data.tsv: year count 1999 3 2000 5 2001 9 2002 30 2003 62 2004 154 2005 245 2006 321 These look exponential to me, so what I want to do is plot these points on a graph with linear axes, and add an exponential curve over the top. I also want to give an R-squared for the fit. The way I did it was like so: # Read in the data, make a copy of it, and take logs data = read.table(data.tsv, header=TRUE) log.data = data log.data$count = log(log.data$count) # Fit a model to the logs of the data model = lm(log.data$count ~ year, data = log.data) # Plot the original data points on a graph plot(data) # Draw in the exponents of the model's output lines(data$year, exp(fitted(model))) Is this the right way to do it? log-ing the data and then exp-ing the results seems like a bit of a long-winded way to achieve the desired effect. Is the R-squared given by summary(model) a valid measurement of the fit of the points to an exponential curve, and should I use multiple R-squared or adjusted R-squared? The R-squared I get from this method (0.98 multiple) seems a little high going by the deviation of the last data point from the curve -- you'll see what I mean if you try it. I just did. From the plot of log(count) against year, with the plot of the linear fit of log(count)~year superimposed, I see indications of a non-linear relationship. The departures of the data from the fit follow a rather systematic pattern. Initially the data increase more slowly than the fit, and lie below it. Then they increase faster and corss over above it. Then the data increase less fast than the fit, and the final data point is below the fit. There are not enough data to properly identify the non-linearity, but the overall appearance of the data plot suggests to me that you should be considering one of the growth curve models. Many such models start of with an increasing rate of growth, which then slows down, and typically levels off to an asymptote. The apparent large discrepancy of your final data point could be compatible with this kind of behaviour. At this point, knowledge of what kind of thing is represented by your count variable might be helpful. If, for instance, it is the count of the numbers of individuals of a species in an area, then independent knowledge of growth mechanisms may help to narrow down the kind of model you should be tring to fit. As to your question about Is this the right way to do it (i.e. fitting an exponential curve by doing a linear fit of the logarithm), generally speaking the answer is Yes. But of course you need to be confident that exponential is the right curve to be fitting in the first place. If it's the wrong type of curve to be considering, then it's not the right way to do it! Hoping this help[s, Ted. E-Mail: (Ted Harding) [EMAIL PROTECTED] Fax-to-email: +44 (0)870 094 0861 Date: 24-Jul-07 Time: 10:08:33 -- XFMail -- __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] values from a linear model
Dear R users, how can I extrapolate values listed in the summary of an lm model but not directly available between object values such as the the standard errors of the calculated parameters? for example I got a model: mod - lm(Crd ~ 1 + Week, data=data) and its summary: summary(mod) Call: lm(formula = Crd ~ 1 + Week, data = data, model = TRUE, y = TRUE) Residuals: Min 1Q Median 3QMax -4.299e-03 -1.653e-03 2.628e-05 1.291e-03 5.130e-03 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 1.000e+01 3.962e-04 25238.73 2e-16 *** Week5.038e-04 6.812e-0673.96 2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.001966 on 98 degrees of freedom Multiple R-Squared: 0.9824, Adjusted R-squared: 0.9822 F-statistic: 5469 on 1 and 98 DF, p-value: 2.2e-16 I'm interested in values of Std. Error of coefficients... thank you very much Best regards Manuele -- Manuele Pesenti [EMAIL PROTECTED] [EMAIL PROTECTED] http://mpesenti.polito.it __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] problems with character objects and calls to list() [solved]
Thanks to Patrick Burns and Mark Lyman for their suggestions in solving my problem. Patrick suggested creating the list directly (the solution I opted for) On 7/23/07, Patrick Burns [EMAIL PROTECTED] wrote: Why not make the list directly: list.to.convert - vector('list', n) for(x in 1:n) list.to.convert[[x]] - seq((2*x)-1, 2*x) S Poetry may be of use to you. Whilst Mark suggested the correct evaluation of the character object list(1:2, 3:4, 5:6) [[1]] [1] 1 2 [[2]] [1] 3 4 [[3]] [1] 5 6 eval(parse(text=paste(list(,to.convert,),sep=))) [[1]] [1] 1 2 [[2]] [1] 3 4 [[3]] [1] 5 6 [[4]] [1] 7 8 [[5]] [1] 9 10 [[6]] [1] 11 12 -- In mathematics you don't understand things. You just get used to them. - Johann von Neumann Email - [EMAIL PROTECTED] / [EMAIL PROTECTED] Website - http://slack.ser.man.ac.uk/ Photos - http://www.flickr.com/photos/slackline/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] values from a linear model
It's not very clear to me but I think summary(mod)$coef[, Std. Error] is wat you need? Cheers, Thierry ir. Thierry Onkelinx Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest Cel biometrie, methodologie en kwaliteitszorg / Section biometrics, methodology and quality assurance Gaverstraat 4 9500 Geraardsbergen Belgium tel. + 32 54/436 185 [EMAIL PROTECTED] www.inbo.be Do not put your faith in what statistics say until you have carefully considered what they do not say. ~William W. Watt A statistical analysis, properly conducted, is a delicate dissection of uncertainties, a surgery of suppositions. ~M.J.Moroney -Oorspronkelijk bericht- Van: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] Namens Manuele Pesenti Verzonden: dinsdag 24 juli 2007 12:03 Aan: r-help@stat.math.ethz.ch Onderwerp: [R] values from a linear model Dear R users, how can I extrapolate values listed in the summary of an lm model but not directly available between object values such as the the standard errors of the calculated parameters? for example I got a model: mod - lm(Crd ~ 1 + Week, data=data) and its summary: summary(mod) Call: lm(formula = Crd ~ 1 + Week, data = data, model = TRUE, y = TRUE) Residuals: Min 1Q Median 3QMax -4.299e-03 -1.653e-03 2.628e-05 1.291e-03 5.130e-03 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 1.000e+01 3.962e-04 25238.73 2e-16 *** Week5.038e-04 6.812e-0673.96 2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.001966 on 98 degrees of freedom Multiple R-Squared: 0.9824, Adjusted R-squared: 0.9822 F-statistic: 5469 on 1 and 98 DF, p-value: 2.2e-16 I'm interested in values of Std. Error of coefficients... thank you very much Best regards Manuele -- Manuele Pesenti [EMAIL PROTECTED] [EMAIL PROTECTED] http://mpesenti.polito.it __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] values from a linear model
try this: coef(summary(mod))[, Std. Error] I hope it helps. Best, Dimitris Dimitris Rizopoulos Ph.D. Student Biostatistical Centre School of Public Health Catholic University of Leuven Address: Kapucijnenvoer 35, Leuven, Belgium Tel: +32/(0)16/336899 Fax: +32/(0)16/337015 Web: http://med.kuleuven.be/biostat/ http://www.student.kuleuven.be/~m0390867/dimitris.htm - Original Message - From: Manuele Pesenti [EMAIL PROTECTED] To: r-help@stat.math.ethz.ch Sent: Tuesday, July 24, 2007 12:02 PM Subject: [R] values from a linear model Dear R users, how can I extrapolate values listed in the summary of an lm model but not directly available between object values such as the the standard errors of the calculated parameters? for example I got a model: mod - lm(Crd ~ 1 + Week, data=data) and its summary: summary(mod) Call: lm(formula = Crd ~ 1 + Week, data = data, model = TRUE, y = TRUE) Residuals: Min 1Q Median 3QMax -4.299e-03 -1.653e-03 2.628e-05 1.291e-03 5.130e-03 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 1.000e+01 3.962e-04 25238.73 2e-16 *** Week5.038e-04 6.812e-0673.96 2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.001966 on 98 degrees of freedom Multiple R-Squared: 0.9824, Adjusted R-squared: 0.9822 F-statistic: 5469 on 1 and 98 DF, p-value: 2.2e-16 I'm interested in values of Std. Error of coefficients... thank you very much Best regards Manuele -- Manuele Pesenti [EMAIL PROTECTED] [EMAIL PROTECTED] http://mpesenti.polito.it __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Disclaimer: http://www.kuleuven.be/cwis/email_disclaimer.htm __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] values from a linear model
Manuele Pesenti wrote: Dear R users, how can I extrapolate values listed in the summary of an lm model but not directly available between object values such as the the standard errors of the calculated parameters? for example I got a model: mod - lm(Crd ~ 1 + Week, data=data) and its summary: summary(mod) Call: lm(formula = Crd ~ 1 + Week, data = data, model = TRUE, y = TRUE) Residuals: Min 1Q Median 3QMax -4.299e-03 -1.653e-03 2.628e-05 1.291e-03 5.130e-03 Coefficients: Estimate Std. Error t value Pr(|t|) (Intercept) 1.000e+01 3.962e-04 25238.73 2e-16 *** Week5.038e-04 6.812e-0673.96 2e-16 *** --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: 0.001966 on 98 degrees of freedom Multiple R-Squared: 0.9824, Adjusted R-squared: 0.9822 F-statistic: 5469 on 1 and 98 DF, p-value: 2.2e-16 I'm interested in values of Std. Error of coefficients... thank you very much If you want to assign these values to some other variables, try assigning the result of the summary() to a variable and working with its components (the result is a list, use $ or [[]] to get its members) mod.sum-summary(mod) then coef(mod.sum)[,2] or mod.sum$coefficients[,2] will give you those Std. Errors -- View this message in context: http://www.nabble.com/values-from-a-linear-model-tf4134911.html#a11760459 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] extension of rnormp package
Il giorno mar, 24/07/2007 alle 06.39 +0100, Prof Brian Ripley ha scritto: On Mon, 23 Jul 2007, Iwona Szyd?owska wrote: Hello, I would like to ask You, how to generate random numbers from an exponential power family with a shape parameter p less than 1(p-0). I found the rnormp package, which can generate numbers from this distribution, but only for parameter less or equal 1. It seems you mean package 'normalp', and that the package author believes that the exponential power distribution is only defined for p = 1 (although that is not on the help page). Other authors believe it is defined by a relationship to the gamma for all p 0. So all you need to do is to change the condition from p 1 to p = 0 in rnormp and friends. Well, I know that an exponential power distribution is defined for p0, (I think quite all the references I know consider p0), but for 0p1 the algorithms that I have implemented for the estimates of the distribution parameters and for the regression parameters are really instable (pratically are not usable at all). Then, I prefered for all the functions of the normalp package consider only the case p=1. All the best, Angelo Mineo However, the algorithms used are not adequate for large or small p. We know that the distribution tends to uniform for p - Inf, but pnormp and rnormp break down for quite modest values of p. As p - 0 it tends to a point distribution at 0, but you will see very large values far too often. So if you want p smaller than say 0.01 you will need to implement a different algorithm. Regards, Iwona Szydlowska [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nnet 10-fold cross-validation
Hi all, there is tune() in the e1071 package for doing this in general, and, among others, a tune.nnet() wrapper (see ?tune): tmodel = tune.nnet(Species ~ ., data = iris, size = 1:5) summary(tmodel) Parameter tuning of `nnet': - sampling method: 10-fold cross validation - best parameters: size 1 - best performance: 0.0133 - Detailed performance results: size error dispersion 11 0.0133 0.02810913 22 0.0267 0.04661373 33 0.0267 0.04661373 44 0.0200 0.04499657 55 0.0267 0.04661373 plot(tmodel) tmodel$best.model a 4-1-3 network with 11 weights inputs: Sepal.Length Sepal.Width Petal.Length Petal.Width output(s): Species options were - softmax modelling etc. Best David On 7/23/07, S.O. Nyangoma [EMAIL PROTECTED] wrote: Hi It clear that to do a classification with svm under 10-fold cross validation one uses svm(Xm, newlabs, type = C-classification, kernel = linear,cross = 10) What corresponds to the nnet? nnet(.,cross=10)? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Overlaying a single contour from a new data array in levelplot
Dear R-Help community, I am trying to overlay a single contour line over a correlation plot using levelplot in the lattice package. These are the two arrays: 1) a correlation plot over Africa - so each grid square is a different colour dependent on correlation - this is in an array: result_cor with dim[465,465] 2) a single contour line from a ***different data source*** - this is from data related to the p-values for the above correlation plot - I want to overlay only the 95% confidence contour. The p-values are stored in an array: result.p.values with same dimensions as above. I have read about using panel.levelplot and panel.contourplot in the R-help mailing list but I don't know the right way to call two different data arrays, can anybody help me please? I appreciate your time and help with this question. Many thanks, Jenny ~~ Jennifer Barnes PhD student: long range drought prediction Climate Extremes Group Department of Space and Climate Physics University College London Holmbury St Mary Dorking, Surrey, RH5 6NT Tel: 01483 204149 Mob: 07916 139187 Web: http://climate.mssl.ucl.ac.uk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fitting exponential curve to data points
Stephen, Ted -- thanks for your input. I'm glad to know I was barking up the right-ish tree at least. On 7/24/07, Ted Harding [EMAIL PROTECTED] wrote: There are not enough data to properly identify the non-linearity, but the overall appearance of the data plot suggests to me that you should be considering one of the growth curve models. Many such models start of with an increasing rate of growth, which then slows down, and typically levels off to an asymptote. The apparent large discrepancy of your final data point could be compatible with this kind of behaviour. You may have hit the nail on the head there. At least I now know that my method would be reasonable *if* I had a genuine exponential curve. Bound to come in handy. At this point, knowledge of what kind of thing is represented by your count variable might be helpful. If, for instance, it is the count of the numbers of individuals of a species in an area, then independent knowledge of growth mechanisms may help to narrow down the kind of model you should be tring to fit. It's the cumulative number of citations in the MEDLINE literature database about a particular topic (natural language processing in biomedicine). So indeed, it can't maintain an exponential growth rate for long, and an initial spurt while the field is novel and trendy, followed by a levelling-off, is just what we'd expect. There was a review a year or so ago that showed a very good exponential fit *then* but if I could show the last point was indicative of a slowdown, that would be news at least. Can anyone point me at a better modelling framework than lm(), in that case? Thanks again, Andrew. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plotting gam models
Hi everybody, I am working with gams and I have found some questions when plotting gams models. I am using mgcv, and my model looks something like this: model- gam(x ~ s(lat,long)) I can plot the output of the model using plot(model) or plot.gam(model) and I get a surface plot. That is ok, but what I want to do now is to extract the data used to perform the surface plot. Like that I would be able to superpose them to a geographical map, and to plot these data using other programs. Thank you very much in advance Lucía _ Lucia Zarauz AZTI - Tecnalia / Unidad de Investigación Marina Herrera kaia portualdea z/g 20110 Pasaia (Gipuzkoa) Tel: 943 004 800 - Fax: 943 004 801 e-mail: [EMAIL PROTECTED] www.azti.es ; www.tecnalia.info _ LEGE OHARRA AVISO LEGAL DISCLAIMER Mezu hau pertsonala eta isilpekoa da eta baimenik gabeko erabilera debekatua dago legalki. Jasotzailea ez bazara ezabatu mezua, bidali eta kontserbatu gabe. Este mensaje es personal y confidencial y su uso no autorizado está prohibido legalmente. Si usted no es el destinatario, proceda a borrarlo, sin reenviarlo ni conservarlo. This message is personal and confidential, unauthorised use is legally prohibited. If you are not the intended recipient, delete it without resending or backing it. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fwd: Re: Fitting exponential curve to data points
Hope these help for alternatives to lm()? I show the use of a 2nd order polynomial as an example to generalize a bit. Sometimes from the subject line two separate responses can appear as reposts when in fact they are not... (though there are identical reposts too). I should probably figure a way around that. --- Stephen Tucker [EMAIL PROTECTED] wrote: ## Data input input - Year Count 1999 3 2000 5 2001 9 2002 30 2003 62 2004 154 2005 245 2006 321 dat - read.table(textConnection(input),header=TRUE) dat[,] - lapply(dat,function(x) x-x[1]) # shifting in origin; will need to add back in later ## Nonlinear least squares plot(dat) out - nls(Count~b0*exp(b1*Year),data=dat, start=list(b0=1,b1=1)) lines(dat[,1],fitted(out),col=2) out - nls(Count~b0+b1*Year+b2*Year^2,data=dat, #polynomial start=list(b0=0,b1=1,b2=1)) lines(dat[,1],fitted(out),col=3) ## Optim f - function(.pars,.dat,.fun) sum((.dat[,2]-.fun(.pars,.dat[,1]))^2) fitFun - function(b,x) cbind(1,x,x^2)%*%b expFun - function(b,x) b[1]*exp(b[2]*x) plot(dat) out - optim(c(0,1,1),f,.dat=dat,.fun=fitFun) lines(dat[,1],fitFun(out$par,dat[,1]),col=2) out - optim(c(1,1),f,.dat=dat,.fun=expFun) lines(dat[,1],expFun(out$par,dat[,1]),col=3) Got a little couch potato? Check out fun summer activities for kids. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting gam models
see ?predict.gam -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O On 24/07/07, Lucia Zarauz [EMAIL PROTECTED] wrote: Hi everybody, I am working with gams and I have found some questions when plotting gams models. I am using mgcv, and my model looks something like this: model- gam(x ~ s(lat,long)) I can plot the output of the model using plot(model) or plot.gam(model) and I get a surface plot. That is ok, but what I want to do now is to extract the data used to perform the surface plot. Like that I would be able to superpose them to a geographical map, and to plot these data using other programs. Thank you very much in advance Lucía _ Lucia Zarauz AZTI - Tecnalia / Unidad de Investigación Marina Herrera kaia portualdea z/g 20110 Pasaia (Gipuzkoa) Tel: 943 004 800 - Fax: 943 004 801 e-mail: [EMAIL PROTECTED] www.azti.es ; www.tecnalia.info _ LEGE OHARRA AVISO LEGAL DISCLAIMER Mezu hau pertsonala eta isilpekoa da eta baimenik gabeko erabilera debekatua dago legalki. Jasotzailea ez bazara ezabatu mezua, bidali eta kontserbatu gabe. Este mensaje es personal y confidencial y su uso no autorizado está prohibido legalmente. Si usted no es el destinatario, proceda a borrarlo, sin reenviarlo ni conservarlo. This message is personal and confidential, unauthorised use is legally prohibited. If you are not the intended recipient, delete it without resending or backing it. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting gam models
Hi Henrique, Thank you for your suggestion. Actually, I have already tried it, but I am confused because the plot I get is not the same as the output of plot(model) or plot.gam(model). The yaxis is different On the other hand, if I build a more complex model, as for example: model- gam(x ~ s(lat,long) + s(temperature)) I would like to extract the information to build the effects for each explanatory factor (one graph for s(lat,long) and another for s(temperature)), as R does when you use 'plot(model)' and you press return for subsequent pages. My final aim is to plot the influence of s(lat,long) as a contourplot superposed on a geographical map. Maybe there is an easier way to do it... Thank you very much lucía _ Lucia Zarauz AZTI - Tecnalia / Unidad de Investigación Marina Herrera kaia portualdea z/g 20110 Pasaia (Gipuzkoa) Tel: 943 004 800 - Fax: 943 004 801 e-mail: [EMAIL PROTECTED] www.azti.es ; www.tecnalia.info _ LEGE OHARRA AVISO LEGAL DISCLAIMER Mezu hau pertsonala eta isilpekoa da eta baimenik gabeko erabilera debekatua dago legalki. Jasotzailea ez bazara ezabatu mezua, bidali eta kontserbatu gabe. Este mensaje es personal y confidencial y su uso no autorizado está prohibido legalmente. Si usted no es el destinatario, proceda a borrarlo, sin reenviarlo ni conservarlo. This message is personal and confidential, unauthorised use is legally prohibited. If you are not the intended recipient, delete it without resending or backing it. De: Henrique Dallazuanna [mailto:[EMAIL PROTECTED] Enviado el: martes, 24 de julio de 2007 13:36 Para: Lucia Zarauz CC: r-help@stat.math.ethz.ch Asunto: Re: [R] plotting gam models see ?predict.gam -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O On 24/07/07, Lucia Zarauz [EMAIL PROTECTED] wrote: Hi everybody, I am working with gams and I have found some questions when plotting gams models. I am using mgcv, and my model looks something like this: model- gam(x ~ s(lat,long)) I can plot the output of the model using plot(model) or plot.gam(model) and I get a surface plot. That is ok, but what I want to do now is to extract the data used to perform the surface plot. Like that I would be able to superpose them to a geographical map, and to plot these data using other programs. Thank you very much in advance Lucía _ Lucia Zarauz AZTI - Tecnalia / Unidad de Investigación Marina Herrera kaia portualdea z/g 20110 Pasaia (Gipuzkoa) Tel: 943 004 800 - Fax: 943 004 801 e-mail: [EMAIL PROTECTED] www.azti.es ; www.tecnalia.info _ LEGE OHARRA AVISO LEGAL DISCLAIMER Mezu hau pertsonala eta isilpekoa da eta baimenik gabeko erabilera debekatua dago legalki. Jasotzailea ez bazara ezabatu mezua, bidali eta kontserbatu gabe. Este mensaje es personal y confidencial y su uso no autorizado está prohibido legalmente. Si usted no es el destinatario, proceda a borrarlo, sin reenviarlo ni conservarlo. This message is personal and confidential, unauthorised use is legally prohibited. If you are not the intended recipient, delete it without resending or backing it. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting gam models
2007/7/24, Lucia Zarauz [EMAIL PROTECTED]: Hi everybody, I am working with gams and I have found some questions when plotting gams models. I am using mgcv, and my model looks something like this: model- gam(x ~ s(lat,long)) I can plot the output of the model using plot(model) or plot.gam(model) and I get a surface plot. That is ok, but what I want to do now is to extract the data used to perform the surface plot. Like that I would be able to superpose them to a geographical map, and to plot these data using other programs. Thank you very much in advance Lucía _ Lucia Zarauz AZTI - Tecnalia / Unidad de Investigación Marina Herrera kaia portualdea z/g 20110 Pasaia (Gipuzkoa) Tel: 943 004 800 - Fax: 943 004 801 e-mail: [EMAIL PROTECTED] www.azti.es ; www.tecnalia.info _ LEGE OHARRA AVISO LEGAL DISCLAIMER Mezu hau pertsonala eta isilpekoa da eta baimenik gabeko erabilera debekatua dago legalki. Jasotzailea ez bazara ezabatu mezua, bidali eta kontserbatu gabe. Este mensaje es personal y confidencial y su uso no autorizado está prohibido legalmente. Si usted no es el destinatario, proceda a borrarlo, sin reenviarlo ni conservarlo. This message is personal and confidential, unauthorised use is legally prohibited. If you are not the intended recipient, delete it without resending or backing it. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Also see ?vis.gam. Produces perspective or contour plot views of gam model predictions. Rod. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting gam models
Thank you for your suggestion. Actually, I have already tried it, but I am confused because the plot I get is not the same as the output of plot(model) or plot.gam(model). The yaxis is different -- `plot.gam' will always plot the `centered smooth', i.e. the smooth constrained to sum to zero over the data. By default (i.e. using type=link) `predict.gam' will include the intercept in the predctions. If you want centered terms out of `predict.gam' use the `type=terms option. On the other hand, if I build a more complex model, as for example: model- gam(x ~ s(lat,long) + s(temperature)) I would like to extract the information to build the effects for each explanatory factor (one graph for s(lat,long) and another for s(temperature)), as R does when you use 'plot(model)' and you press return for subsequent pages. Again, try predict.gam with type=terms best, Simon -- Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK +44 1225 386603 www.maths.bath.ac.uk/~sw283 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] persp and greek symbols in the axes labels
On Tue, 24 Jul 2007, Stephen Tucker wrote: I don't know why it doesn't work but I think people generally recommend that It has never been implemented, and I believe the main reason is that the labels are plotted at an angle other than a multiple of 90 degrees. Not all devices can do that, and rotated plotmath text can look quite ugly. And of course this _is_ documented in ?persp xlab, ylab, zlab: titles for the axes. N.B. These must be character strings; expressions are not accepted. Numbers will be coerced to character strings. you use wireframe() in lattice rather than persp(), because wireframe is more customizable (the pdf document referred to in this post is pretty good): http://tolstoy.newcastle.edu.au/R/e2/help/07/03/12534.html Here's an example: library(lattice) library(reshape) x - 1:5 y - 1:3 z - matrix(1:15,ncol=3,dimnames=list(NULL,y)) M - melt(data.frame(x,z,check.names=FALSE),id=1,variable=y) wireframe(value~x*y,data=M, screen=list(z=45,x=-75), xlab=expression(kappa[lambda]), ylab=as.expression(substitute(paste(phi,=,true,sigma), list(true=5))), zlab = Z) [you can play around with the 'screen' argument to rotate the view, analogous to phi and theta in persp()] Of course, that does not rotate the labels. If unrotated labels are acceptable, you can easily set up a new coordinate system (by par(usr=)) and call text() to put labels where you want on that. You can even try rotating them via srt=. There would be no harm in implementing this for use on devices where it will work: a nice self-contained project for someone who would like to learn about R internals. --- Nathalie Peyrard [EMAIL PROTECTED] wrote: Hello, I am plotting a 3D function using persp and I would like to use greek symbols in the axes labels. I have found examples like this one on the web: plot(0,0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,=,true,sigma),list(true=5))) this works well with plot but not with persp: with the command persp(M,theta = -20,phi = 0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,=,true,sigma),list(true=5)),zlab = Z) I get the labels as in toto.eps Any suggestion? Thanks! Nathalie -- ~~ INRA Toulouse - Unité de Biométrie et Intelligence Artificielle Chemin de Borde-Rouge BP 52627 31326 CASTANET-TOLOSAN cedex FRANCE Tel : +33(0)5.61.28.54.39 - Fax : +33(0)5.61.28.53.35 Web :http://mia.toulouse.inra.fr/index.php?id=217 ~~ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Ready for the edge of your seat? Check out tonight's top picks on Yahoo! TV. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] recursive vectorized arithmetics
Hi, I would like to write something similar to: for(t in 1:100) v[x[t]] - v[y[t]] + v[z[t]] in a vectorized form. The x, y, and z vectors may contain duplicates (so v[x] - v[y] + v[z] has different semantics). The for loop is not efficient enough for my purposes and I would like to avoid using C/Fortran. This problem occurred to me on several occasions and I feel it is quite general. Does anyone have an idea how to solve it nicely? Thanks, bk __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] persp and greek symbols in the axes labels
Thank you for these answers. I ended up modifying the ps file directly. But next time I will consider wireframe. Nathalie Prof Brian Ripley wrote: On Tue, 24 Jul 2007, Stephen Tucker wrote: I don't know why it doesn't work but I think people generally recommend that It has never been implemented, and I believe the main reason is that the labels are plotted at an angle other than a multiple of 90 degrees. Not all devices can do that, and rotated plotmath text can look quite ugly. And of course this _is_ documented in ?persp xlab, ylab, zlab: titles for the axes. N.B. These must be character strings; expressions are not accepted. Numbers will be coerced to character strings. you use wireframe() in lattice rather than persp(), because wireframe is more customizable (the pdf document referred to in this post is pretty good): http://tolstoy.newcastle.edu.au/R/e2/help/07/03/12534.html Here's an example: library(lattice) library(reshape) x - 1:5 y - 1:3 z - matrix(1:15,ncol=3,dimnames=list(NULL,y)) M - melt(data.frame(x,z,check.names=FALSE),id=1,variable=y) wireframe(value~x*y,data=M, screen=list(z=45,x=-75), xlab=expression(kappa[lambda]), ylab=as.expression(substitute(paste(phi,=,true,sigma), list(true=5))), zlab = Z) [you can play around with the 'screen' argument to rotate the view, analogous to phi and theta in persp()] Of course, that does not rotate the labels. If unrotated labels are acceptable, you can easily set up a new coordinate system (by par(usr=)) and call text() to put labels where you want on that. You can even try rotating them via srt=. There would be no harm in implementing this for use on devices where it will work: a nice self-contained project for someone who would like to learn about R internals. --- Nathalie Peyrard [EMAIL PROTECTED] wrote: Hello, I am plotting a 3D function using persp and I would like to use greek symbols in the axes labels. I have found examples like this one on the web: plot(0,0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,=,true,sigma),list(true=5))) this works well with plot but not with persp: with the command persp(M,theta = -20,phi = 0,xlab=expression(kappa[lambda]),ylab=substitute(paste(phi,=,true,sigma),list(true=5)),zlab = Z) I get the labels as in toto.eps Any suggestion? Thanks! Nathalie -- ~~ INRA Toulouse - Unité de Biométrie et Intelligence Artificielle Chemin de Borde-Rouge BP 52627 31326 CASTANET-TOLOSAN cedex FRANCE Tel : +33(0)5.61.28.54.39 - Fax : +33(0)5.61.28.53.35 Web :http://mia.toulouse.inra.fr/index.php?id=217 ~~ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Ready for the edge of your seat? Check out tonight's top picks on Yahoo! TV. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fitting exponential curve to data points
On 7/24/07, Stephen Tucker [EMAIL PROTECTED] wrote: Hope these help for alternatives to lm()? I show the use of a 2nd order polynomial as an example to generalize a bit. Great, thanks. If I want to demonstrate that a non-linear curve fits better than an exponential, what's the best measure for that? Given that neither of nls() or optim() provide R-squared. Sorry if these are really silly questions. Sometimes from the subject line two separate responses can appear as reposts when in fact they are not... (though there are identical reposts too). I should probably figure a way around that. Nope, my fault, I didn't read them properly and thought you were demonstrating a different way to do exponential curves. Cheers, Andrew. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fitting exponential curve to data points
Andrew Clegg: Great, thanks. If I want to demonstrate that a non-linear curve fits better than an exponential, what's the best measure for that? Given that neither of nls() or optim() provide R-squared. You really need to *very* careful when trying to interprete R² (which can be defined in many nonequivalent ways) in the nonlinear case. Recommended (and, dare I say, *required* reading): Anderson-Sprecher R. (1994). ‘Model comparisons and R²’. The American Statistician, volume 48, no. 2, pages 113–117. DOI: 10.2307/2684259 Kvålseth T.O. (1985). ‘Cautionary note about R²’. The American Statistician, volume 39, no. 4, pages 279–285. DOI: 10.2307/2683704 Scott A. and Wild C. (1991). ‘Transformations and R²’. The American Statistician, volume 45, no. 2, pages 127–129. ISSN 0003-1305. DOI: 10.2307/2684375 The Scott Wild paper has an example that looks very similar to yours, and that may be instructive. FYI, in case you’re not used to DOIs: you can resolve the above DOIs to fulltext URLs using http://dx.doi.org/ -- Karl Ove Hufthammer __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Comparing a distance vs. correlation matrices
Hi all, I am a student of MRes Bioinformatics and Comp. Biology at University of Leeds, UK. I am trying to compare two matrices in R. One is the matrix containing distance and the other one contains the correlation values. I would like to create a XY plot of these matrices and do some linear regression on it. But, I am a bit new to R, so i have a few questions (I searched in the documentation with no success). I have been successful so far in reading both matrices but dont exactly know how can I get the plot for the same. Thanks very much. Urmi - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fit t distribution
Hi all, I am trying to fit t distribution using the function tFit in the library(fBasics). I am using the code tFit(datac[[2]]) and it returns the following list. Title: Student-t Parameter Estimation Call: tFit(x = datac[[2]]) Model: Student-t Distribution Estimated Parameter(s): df 78.4428 I just wonder how can I refer to the estimated parameters. I tried tFit(datac[[2]]) $df,tFit(datac[[2]])@df, but neither of them work. Could anyone give me some advice? -- View this message in context: http://www.nabble.com/Fit-t-distribution-tf4136445.html#a11764680 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fit t distribution
Hi, tFit(datac[[2]])@fit$estimate -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O On 24/07/07, livia [EMAIL PROTECTED] wrote: Hi all, I am trying to fit t distribution using the function tFit in the library(fBasics). I am using the code tFit(datac[[2]]) and it returns the following list. Title: Student-t Parameter Estimation Call: tFit(x = datac[[2]]) Model: Student-t Distribution Estimated Parameter(s): df 78.4428 I just wonder how can I refer to the estimated parameters. I tried tFit(datac[[2]]) $df,tFit(datac[[2]])@df, but neither of them work. Could anyone give me some advice? -- View this message in context: http://www.nabble.com/Fit-t-distribution-tf4136445.html#a11764680 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fit t distribution
It works. Many thanks Henrique Dallazuanna wrote: Hi, tFit(datac[[2]])@fit$estimate -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O On 24/07/07, livia [EMAIL PROTECTED] wrote: Hi all, I am trying to fit t distribution using the function tFit in the library(fBasics). I am using the code tFit(datac[[2]]) and it returns the following list. Title: Student-t Parameter Estimation Call: tFit(x = datac[[2]]) Model: Student-t Distribution Estimated Parameter(s): df 78.4428 I just wonder how can I refer to the estimated parameters. I tried tFit(datac[[2]]) $df,tFit(datac[[2]])@df, but neither of them work. Could anyone give me some advice? -- View this message in context: http://www.nabble.com/Fit-t-distribution-tf4136445.html#a11764680 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/Fit-t-distribution-tf4136445.html#a11764994 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] apply incompatible dimensions error
Hi, I've created the following two matrices (mat1 and mat2) and a function (f) to calculate the correlations between the two on a row by row basis. mat1 - matrix(sample(1:500,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) mat2 - matrix(sample(501:1000,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) f - function(x,y) cor(x,y) When the matrices are squares (# rows = # columns) I have no problems. However, when they are not (as in the example above with 5 columns and 10 rows), I get the following error: apply(mat1, 1, f, y=mat2) Error in cor(x, y, na.method, method == kendall) : incompatible dimensions Any help would be appreciated. Thanks! - Bruce ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] randomForest importance problem with combine [Broadcast]
I've been fixing some problems in the combine() function, but that's only for regression data. Looks like you are doing classification, and I don't see the problem: R library(randomForest) randomForest 4.5-19 Type rfNews() to see new features/changes/bug fixes. R set.seed(1) R rflist - replicate(50, randomForest(iris[-5], iris[[5]], ntree=50, importance=TRUE), simplify=FALSE) R rfall - do.call(combine, rflist) R importance(rfall) setosa versicolor virginica MeanDecreaseAccuracy Sepal.Length 0.4457861 0.53883425 0.55806570.4120840 Sepal.Width 0.3266790 0.07652383 0.36202400.2128450 Petal.Length 1.1950989 1.42014628 1.32204710.7989841 Petal.Width 1.1986973 1.40855969 1.36406200.7951053 MeanDecreaseGini Sepal.Length 9.578580 Sepal.Width 2.301172 Petal.Length42.935832 Petal.Width 44.409058 R importance(rflist[[1]]) setosa versicolor virginica MeanDecreaseAccuracy Sepal.Length 0.401714 0.71583422 0.49464200.4166555 Sepal.Width 0.00 -0.03155946 0.68292870.2317111 Petal.Length 1.290430 1.47915219 1.34567700.8219003 Petal.Width 1.110142 1.44996777 1.35847990.7881210 MeanDecreaseGini Sepal.Length 6.168439 Sepal.Width 2.240723 Petal.Length48.821726 Petal.Width 42.059112 Please provide a reproducible example. Andy From: Joseph Retzer My apologies, subject corrected. I'm building a RF 50 trees at a time due to memory limitations (I have roughly .5 million observations and around 20 variables). I thought I could combine some or all of my forests later and look at global importance. If I have say 2 forests : tree1 and tree2, they have similar Gini and Raw importances and, additionally, are similar to one another. After combining (using the combine command) the trees into one however, the combined tree Raw importances have changed in rank order rather dramtically (e.g. the top most important becomes least important. It is not however a completely reversed ordering). In addtion, the scale of both the Raw and Gini importances is orders of magnitude smaller for the combined tree. Note that the combined tree Gini importance looks roughly similar to the individual tree Gini (and Raw) importance, at least in terms of rank ordering. I'm using the non-formula randomForest specification along with norm.votes=FALSE to facilitate large sample estimation and tree combining. I'm using R 2.5.0 on a windows XP machine with 2 gig RAM. I'm also using randomForest 4.5-18. Any advice is appreciated, Many thanks, Joe [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Notice: This e-mail message, together with any attachments,...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply incompatible dimensions error
Your apply is trying to take the correlations of the rows of mat1 with the columns of mat2 which, of course, does not work if they have different numbers of columns. I think you mean to take the correlations of the columns of mat1 with the columns of mat2. For example, to take the correlations of the 5 columns of mat1 with the first 4 columns of mat2 try: cor(mat1, mat2[,1:4]) col1 col2 col3 col4 col1 -0.34624254 -0.2669519 -0.2705077 0.2183249 col2 -0.26553255 -0.2687643 -0.0865895 0.1819025 col3 0.19474613 -0.2334986 0.1746522 0.2326915 col4 0.09328338 0.5117784 0.2413143 -0.3374916 col5 0.27519716 0.1605331 -0.4057137 0.3282105 On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote: Hi, I've created the following two matrices (mat1 and mat2) and a function (f) to calculate the correlations between the two on a row by row basis. mat1 - matrix(sample(1:500,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) mat2 - matrix(sample(501:1000,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) f - function(x,y) cor(x,y) When the matrices are squares (# rows = # columns) I have no problems. However, when they are not (as in the example above with 5 columns and 10 rows), I get the following error: apply(mat1, 1, f, y=mat2) Error in cor(x, y, na.method, method == kendall) : incompatible dimensions Any help would be appreciated. Thanks! - Bruce ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] cor inside/outside a function has different output
I'm calculating correlations between two matrices mat1 - matrix(sample(1:500,25), ncol = 5, dimnames=list(paste(mat1row, 1:5, sep=), paste(mat1col, 1:5, sep=))) mat2 - matrix(sample(501:1000,25), ncol = 5, dimnames=list(paste(mat2row, 1:5, sep=), paste(mat2col, 1:5, sep=))) using what would seem to be two similar methods: Method 1: f - function(x,y) cor(x,y) apply(mat1, 1, f, y=mat2) Method 2: cor(mat1, mat2) However, the results (see blow) are different: apply(mat1, 1, f, y=mat2) mat1row1 mat1row2mat1row3mat1row4mat1row5 [1,] -0.27601028 -0.1352143 0.03538690 -0.03084075 -0.60171704 [2,] -0.01595532 -0.3881197 -0.43663982 0.49081806 0.33291995 [3,] 0.35969624 -0.0582948 0.57462169 0.09926796 -0.02948423 [4,] -0.41435920 -0.7164638 -0.21213496 -0.55183934 -0.25341790 [5,] 0.33802803 0.5371508 0.05219095 0.83533575 0.17850291 cor(mat1, mat2) mat2col1mat2col2 mat2col3 mat2col4 mat2col5 mat1col1 -0.84077496 -0.01538414 -0.6078933 -0.2263840 -0.1421335 mat1col2 0.23074421 0.54606286 -0.2354733 0.5214255 -0.2129077 mat1col3 -0.8528 0.19550100 -0.5920509 -0.8694040 0.6853990 mat1col4 0.08050976 -0.55449840 0.6225666 0.6187971 -0.8971584 mat1col5 -0.10199564 -0.43854767 -0.5803077 -0.5100285 0.2848351 Also, for method 2, the calculations are done on a column x column basis. Is there any way to do this on a row by row basis. Looking at the help page for cor, I don't see any parameters that could be used to do this. Thanks, - Bruce ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply incompatible dimensions error
are you positive that your function is doing what you expect it to do? it looks like you want something like: sapply(1:10, function(i) cor(mat1[i,], mat2[i,])) b On Jul 24, 2007, at 11:05 AM, Bernzweig, Bruce ((Consultant)) wrote: Hi, I've created the following two matrices (mat1 and mat2) and a function (f) to calculate the correlations between the two on a row by row basis. mat1 - matrix(sample(1:500,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) mat2 - matrix(sample(501:1000,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) f - function(x,y) cor(x,y) When the matrices are squares (# rows = # columns) I have no problems. However, when they are not (as in the example above with 5 columns and 10 rows), I get the following error: apply(mat1, 1, f, y=mat2) Error in cor(x, y, na.method, method == kendall) : incompatible dimensions Any help would be appreciated. Thanks! - Bruce ** Please be aware that, notwithstanding the fact that the pers... {{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply incompatible dimensions error
Thanks Gabor! You state that my apply is taking rows of mat1 with columns of mat2. Is this because I have the y=mat2 parameter? apply(mat1, 1, f, y=mat2) Actually, what I would like is to run the correlations on a row against row basis: mat1 row1 x mat2 row1, etc. Thanks again, - Bruce -Original Message- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 24, 2007 11:31 AM To: Bernzweig, Bruce (Consultant) Cc: r-help@stat.math.ethz.ch Subject: Re: [R] apply incompatible dimensions error Your apply is trying to take the correlations of the rows of mat1 with the columns of mat2 which, of course, does not work if they have different numbers of columns. I think you mean to take the correlations of the columns of mat1 with the columns of mat2. For example, to take the correlations of the 5 columns of mat1 with the first 4 columns of mat2 try: cor(mat1, mat2[,1:4]) col1 col2 col3 col4 col1 -0.34624254 -0.2669519 -0.2705077 0.2183249 col2 -0.26553255 -0.2687643 -0.0865895 0.1819025 col3 0.19474613 -0.2334986 0.1746522 0.2326915 col4 0.09328338 0.5117784 0.2413143 -0.3374916 col5 0.27519716 0.1605331 -0.4057137 0.3282105 On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote: Hi, I've created the following two matrices (mat1 and mat2) and a function (f) to calculate the correlations between the two on a row by row basis. mat1 - matrix(sample(1:500,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) mat2 - matrix(sample(501:1000,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) f - function(x,y) cor(x,y) When the matrices are squares (# rows = # columns) I have no problems. However, when they are not (as in the example above with 5 columns and 10 rows), I get the following error: apply(mat1, 1, f, y=mat2) Error in cor(x, y, na.method, method == kendall) : incompatible dimensions Any help would be appreciated. Thanks! - Bruce ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply incompatible dimensions error
Then try this: cor(t(mat1), t(mat2)) Also note 1. the above implies that mat1 and mat2 must have the same number of columns since if x and y are vectors cor(x,y) only makes sense if they have the same length. 2. the usual convention is that variables are stored as columns andt that rows correspond to cases so typically you would have (in terms of your mat1 and mat2): Mat1 - t(mat1) Mat2 - t(mat2) and then use Mat1 and Mat2, e.g. cor(Mat1, Mat2) On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote: Thanks Gabor! You state that my apply is taking rows of mat1 with columns of mat2. Is this because I have the y=mat2 parameter? apply(mat1, 1, f, y=mat2) Actually, what I would like is to run the correlations on a row against row basis: mat1 row1 x mat2 row1, etc. Thanks again, - Bruce -Original Message- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 24, 2007 11:31 AM To: Bernzweig, Bruce (Consultant) Cc: r-help@stat.math.ethz.ch Subject: Re: [R] apply incompatible dimensions error Your apply is trying to take the correlations of the rows of mat1 with the columns of mat2 which, of course, does not work if they have different numbers of columns. I think you mean to take the correlations of the columns of mat1 with the columns of mat2. For example, to take the correlations of the 5 columns of mat1 with the first 4 columns of mat2 try: cor(mat1, mat2[,1:4]) col1 col2 col3 col4 col1 -0.34624254 -0.2669519 -0.2705077 0.2183249 col2 -0.26553255 -0.2687643 -0.0865895 0.1819025 col3 0.19474613 -0.2334986 0.1746522 0.2326915 col4 0.09328338 0.5117784 0.2413143 -0.3374916 col5 0.27519716 0.1605331 -0.4057137 0.3282105 On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote: Hi, I've created the following two matrices (mat1 and mat2) and a function (f) to calculate the correlations between the two on a row by row basis. mat1 - matrix(sample(1:500,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) mat2 - matrix(sample(501:1000,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) f - function(x,y) cor(x,y) When the matrices are squares (# rows = # columns) I have no problems. However, when they are not (as in the example above with 5 columns and 10 rows), I get the following error: apply(mat1, 1, f, y=mat2) Error in cor(x, y, na.method, method == kendall) : incompatible dimensions Any help would be appreciated. Thanks! - Bruce ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ** Please be aware that, notwithstanding the fact that the person sending this communication has an address in Bear Stearns' e-mail system, this person is not an employee, agent or representative of Bear Stearns. Accordingly, this person has no power or authority to represent, make any recommendation, solicitation, offer or statements or disclose information on behalf of or in any way bind Bear Stearns or any of its affiliates. ** __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply incompatible dimensions error
that's garbor's suggestion then. sorry for the misunderstanding. :-) b On Jul 24, 2007, at 11:35 AM, Bernzweig, Bruce ((Consultant)) wrote: Thanks Benilton, I know what I want to do, just not sure how to do it using R. The help documentation is not very clear. What I am trying to do is calculate correlations on a row against row basis: mat1 row1 x mat2 row1, mat1 row1 x mat2 row2, ... mat1 row1 x mat2 row-n, mat1 row-n, mat2 row-n - Bruce -Original Message- From: Benilton Carvalho [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 24, 2007 11:31 AM To: Bernzweig, Bruce (Consultant) Cc: r-help@stat.math.ethz.ch Subject: Re: [R] apply incompatible dimensions error are you positive that your function is doing what you expect it to do? it looks like you want something like: sapply(1:10, function(i) cor(mat1[i,], mat2[i,])) b On Jul 24, 2007, at 11:05 AM, Bernzweig, Bruce ((Consultant)) wrote: Hi, I've created the following two matrices (mat1 and mat2) and a function (f) to calculate the correlations between the two on a row by row basis. mat1 - matrix(sample(1:500,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) mat2 - matrix(sample(501:1000,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) f - function(x,y) cor(x,y) When the matrices are squares (# rows = # columns) I have no problems. However, when they are not (as in the example above with 5 columns and 10 rows), I get the following error: apply(mat1, 1, f, y=mat2) Error in cor(x, y, na.method, method == kendall) : incompatible dimensions Any help would be appreciated. Thanks! - Bruce * * Please be aware that, notwithstanding the fact that the pers... {{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. ** Please be aware that, notwithstanding the fact that the person sending this communication has an address in Bear Stearns' e-mail system, this person is not an employee, agent or representative of Bear Stearns. Accordingly, this person has no power or authority to represent, make any recommendation, solicitation, offer or statements or disclose information on behalf of or in any way bind Bear Stearns or any of its affiliates. ** __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply incompatible dimensions error
Thanks for the explanation. As for the rows/columns thing, the data I receive is given to me in that way. I currently read it in using read.csv. Is there a function I should look at that can take that and transpose it or should I just process the data first outside of R? Thanks, - Bruce -Original Message- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 24, 2007 11:43 AM To: Bernzweig, Bruce (Consultant) Cc: r-help@stat.math.ethz.ch Subject: Re: [R] apply incompatible dimensions error Then try this: cor(t(mat1), t(mat2)) Also note 1. the above implies that mat1 and mat2 must have the same number of columns since if x and y are vectors cor(x,y) only makes sense if they have the same length. 2. the usual convention is that variables are stored as columns andt that rows correspond to cases so typically you would have (in terms of your mat1 and mat2): Mat1 - t(mat1) Mat2 - t(mat2) and then use Mat1 and Mat2, e.g. cor(Mat1, Mat2) On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote: Thanks Gabor! You state that my apply is taking rows of mat1 with columns of mat2. Is this because I have the y=mat2 parameter? apply(mat1, 1, f, y=mat2) Actually, what I would like is to run the correlations on a row against row basis: mat1 row1 x mat2 row1, etc. Thanks again, - Bruce -Original Message- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 24, 2007 11:31 AM To: Bernzweig, Bruce (Consultant) Cc: r-help@stat.math.ethz.ch Subject: Re: [R] apply incompatible dimensions error Your apply is trying to take the correlations of the rows of mat1 with the columns of mat2 which, of course, does not work if they have different numbers of columns. I think you mean to take the correlations of the columns of mat1 with the columns of mat2. For example, to take the correlations of the 5 columns of mat1 with the first 4 columns of mat2 try: cor(mat1, mat2[,1:4]) col1 col2 col3 col4 col1 -0.34624254 -0.2669519 -0.2705077 0.2183249 col2 -0.26553255 -0.2687643 -0.0865895 0.1819025 col3 0.19474613 -0.2334986 0.1746522 0.2326915 col4 0.09328338 0.5117784 0.2413143 -0.3374916 col5 0.27519716 0.1605331 -0.4057137 0.3282105 On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote: Hi, I've created the following two matrices (mat1 and mat2) and a function (f) to calculate the correlations between the two on a row by row basis. mat1 - matrix(sample(1:500,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) mat2 - matrix(sample(501:1000,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) f - function(x,y) cor(x,y) When the matrices are squares (# rows = # columns) I have no problems. However, when they are not (as in the example above with 5 columns and 10 rows), I get the following error: apply(mat1, 1, f, y=mat2) Error in cor(x, y, na.method, method == kendall) : incompatible dimensions Any help would be appreciated. Thanks! - Bruce ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ** Please be aware that, notwithstanding the fact that the person sending this communication has an address in Bear Stearns' e-mail system, this person is not an employee, agent or representative of Bear Stearns. Accordingly, this person has no power or authority to represent, make any recommendation, solicitation, offer or statements or disclose information on behalf of or in any way bind Bear Stearns or any of its affiliates. ** ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Using lmer with huge amount of data
Based on the examples I've seen in using statistical analysis packages such as lmer, it seems that people usually tabulate all the input data into one file with the first line indicating the variable names (or labels), and then read the file inside R. However, in my case I can't do that because of the huge amount of imaging data. Suppose I have a one-way within-subject ANCOVA with one covariate, and I would like to use lmer in R package lme4 to analyze the data. In the terminology of linear mixed models, I have a fixed factor A with 3 levels, a random factor B (subject), and a covariate (age) with a model like this MyResult - lmer(Response ~ FactorA + Age + (1 | subject), MyData, ...) My input data are like this: For each subject I have a file (a huge matrix) storing the response values of the subject at many locations (~30,000 voxels) corresponding to factor A at the 1st level, another file for factor A at the 2nd level, and a 3rd file for factor A at the 3rd level. Then I have another file storing the age of those subjects. The analysis with the linear mixed model above would be done at each voxel separately. It seems impractical to create one gigantic file or matrix to feed into the above command line because of the big number of voxels. I'm not sure how to proceed in this case. Any suggestions would be highly appreciated. Also if I'm concerned about any potential violation of sphericity among the 3 levels of factor A, how can I test sphericity violation in lmer? And if violation exists, how can I make corrections in contrast testing? Thank you very much, Gang __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cor inside/outside a function has different output
Sorry. I looked up t after writing the previous email and realized that was what I was looking for! -Original Message- From: Gabor Grothendieck [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 24, 2007 11:48 AM To: Bernzweig, Bruce (Consultant) Cc: r-help@stat.math.ethz.ch Subject: Re: [R] cor inside/outside a function has different output I think this is really answered already in my previous post but just in case try this: res1 - t(apply(mat1, 1, cor, t(mat2))) res2 - cor(t(mat1), t(mat2)) all.equal(res1, res2, check.attributes = FALSE) [1] TRUE On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote: I'm calculating correlations between two matrices mat1 - matrix(sample(1:500,25), ncol = 5, dimnames=list(paste(mat1row, 1:5, sep=), paste(mat1col, 1:5, sep=))) mat2 - matrix(sample(501:1000,25), ncol = 5, dimnames=list(paste(mat2row, 1:5, sep=), paste(mat2col, 1:5, sep=))) using what would seem to be two similar methods: Method 1: f - function(x,y) cor(x,y) apply(mat1, 1, f, y=mat2) Method 2: cor(mat1, mat2) However, the results (see blow) are different: apply(mat1, 1, f, y=mat2) mat1row1 mat1row2mat1row3mat1row4mat1row5 [1,] -0.27601028 -0.1352143 0.03538690 -0.03084075 -0.60171704 [2,] -0.01595532 -0.3881197 -0.43663982 0.49081806 0.33291995 [3,] 0.35969624 -0.0582948 0.57462169 0.09926796 -0.02948423 [4,] -0.41435920 -0.7164638 -0.21213496 -0.55183934 -0.25341790 [5,] 0.33802803 0.5371508 0.05219095 0.83533575 0.17850291 cor(mat1, mat2) mat2col1mat2col2 mat2col3 mat2col4 mat2col5 mat1col1 -0.84077496 -0.01538414 -0.6078933 -0.2263840 -0.1421335 mat1col2 0.23074421 0.54606286 -0.2354733 0.5214255 -0.2129077 mat1col3 -0.8528 0.19550100 -0.5920509 -0.8694040 0.6853990 mat1col4 0.08050976 -0.55449840 0.6225666 0.6187971 -0.8971584 mat1col5 -0.10199564 -0.43854767 -0.5803077 -0.5100285 0.2848351 Also, for method 2, the calculations are done on a column x column basis. Is there any way to do this on a row by row basis. Looking at the help page for cor, I don't see any parameters that could be used to do this. Thanks, - Bruce ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] cor inside/outside a function has different output
I think this is really answered already in my previous post but just in case try this: res1 - t(apply(mat1, 1, cor, t(mat2))) res2 - cor(t(mat1), t(mat2)) all.equal(res1, res2, check.attributes = FALSE) [1] TRUE On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote: I'm calculating correlations between two matrices mat1 - matrix(sample(1:500,25), ncol = 5, dimnames=list(paste(mat1row, 1:5, sep=), paste(mat1col, 1:5, sep=))) mat2 - matrix(sample(501:1000,25), ncol = 5, dimnames=list(paste(mat2row, 1:5, sep=), paste(mat2col, 1:5, sep=))) using what would seem to be two similar methods: Method 1: f - function(x,y) cor(x,y) apply(mat1, 1, f, y=mat2) Method 2: cor(mat1, mat2) However, the results (see blow) are different: apply(mat1, 1, f, y=mat2) mat1row1 mat1row2mat1row3mat1row4mat1row5 [1,] -0.27601028 -0.1352143 0.03538690 -0.03084075 -0.60171704 [2,] -0.01595532 -0.3881197 -0.43663982 0.49081806 0.33291995 [3,] 0.35969624 -0.0582948 0.57462169 0.09926796 -0.02948423 [4,] -0.41435920 -0.7164638 -0.21213496 -0.55183934 -0.25341790 [5,] 0.33802803 0.5371508 0.05219095 0.83533575 0.17850291 cor(mat1, mat2) mat2col1mat2col2 mat2col3 mat2col4 mat2col5 mat1col1 -0.84077496 -0.01538414 -0.6078933 -0.2263840 -0.1421335 mat1col2 0.23074421 0.54606286 -0.2354733 0.5214255 -0.2129077 mat1col3 -0.8528 0.19550100 -0.5920509 -0.8694040 0.6853990 mat1col4 0.08050976 -0.55449840 0.6225666 0.6187971 -0.8971584 mat1col5 -0.10199564 -0.43854767 -0.5803077 -0.5100285 0.2848351 Also, for method 2, the calculations are done on a column x column basis. Is there any way to do this on a row by row basis. Looking at the help page for cor, I don't see any parameters that could be used to do this. Thanks, - Bruce ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] plotting a summary.rq object in using pkg quantreg
Hello, I am having problems adjusting the plot output from the quantreg package. Anyone know what I'm doing wrong? For example (borrowing from the help files): plot(summary(rq(foodexp~income,tau = 1:49/50,data=engel)), nrow=1, ncol=2,alpha = .4, ols = TRUE, xlab=test) The alpha= parameter seems to have no effect on my output, even when I set it to a ridiculous value like 0.4. Also, though in the help file it says |...| = optional arguments to plot, xlab (as an example) seems to do nothing. If the answer is that I should extract the values I need and construct the plot I want independently of the rq.process object, that it okay I suppose, if inefficient. Maybe a more fundamental question is how do I get in and see how plot is working in this case so that I can modify. Thanks much! J P.S. I've explored using plot.summary.rqs but the problems seem to be the same. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Fitting exponential curve to data points
Andrew Clegg andrew.clegg at gmail.com writes: ... If I want to demonstrate that a non-linear curve fits better than an exponential, what's the best measure for that? Given that neither of nls() or optim() provide R-squared. To supplement Karl's comment, try Douglas Bates' (author of nls) comments on the matter http://www.ens.gu.edu.au/ROBERTK/R/HELP/00B/0399.HTML Short summary: * ... the lack of automatic ANOVA, R^2 and adj. R^2 from nls is a feature, not a bug :-) * My best advice regarding R^2 statistics with nonlinear models is, as Nancy Reagan suggested, Just say no. Dieter __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] crimtab related question
Dear all, the dataset documented under ?crimtab was also used in: @article{TreloarAE1934, title = {The adequacy of {S}tudent's criterion of deviations in small sample means}, author = {Treloar, A.E. and Wilder, M.A.}, journal = {The Annals of Mathematical Statistics}, volume = {5}, pages = {324-341}, year = {1934} } The following is from page 335 of the above paper: From the table provided by MacDonell (1902) on the associated variation of stature (to the nearest inch) and length of the left middle finger (to the nearest millimeter) in 3000 British criminals, the measusurements were transferred to 3000 numbered Denison metal-rim tags from which the cords has been removed. After thorough checking and mixing of these circular disks, samples of 5 tags each were drawn at random until the supply was exhausted. Unfortunately, three of these samples were erroneously returned to a receiving box before being copied, and the records of 597 samples only are available. Could someone give me a clue about the kind of device that was used here? Is it a kind of lottery machine? I don't understand why three samples were lost. What is this receiving box? Thanks for any hint, Best, -- Jean R. Lobry([EMAIL PROTECTED]) Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - LYON I, 43 Bd 11/11/1918, F-69622 VILLEURBANNE CEDEX, FRANCE allo : +33 472 43 27 56 fax: +33 472 43 13 88 http://pbil.univ-lyon1.fr/members/lobry/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.
Hi all, If the question below as been answered before I apologize for the posting. I would like to get the frequencies of occurrence of all values in a given variable in a multivariate dataset. In short for each variable (or field) a summary of values contained with in a value:frequency pair, there can be many such pairs for a given variable. I would like to do the same for several such variables. I have used table() but am unable to extract the individual value and frequency values. Please advise. Allan. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Fitting the best line to the plot of distance vs. correlation matrix
Hi all, Thanks for your help in generating the matrix of distance vs correlation. I did it using plot(as.vector(as.matrix(cormat)), as.vector(as.matrix(distmat))) Now I want to quantitate the same. May be on linear regression or some other statistical functions. I have tried using linmod for linear regression. But as I have two matrices in the form of the dataframes, I'm wondering if it is the right way to do it in this? Or are there even better options than this available. Can anyone please help me for that. Thanks. Urmi - Once upon a time there was 1 GB storage in your inbox. Click here for happy ending. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lme or gls prediction intervals
Hi folks, I am trying to generate 95% confidence intervals for a gls model using predict.nlme with R version 2.5.1 (2007-06-27) . nlme: Linear and Nonlinear Mixed Effects Models. R package version 3.1-83. I have looked in help, and I can do it for lm and glm models, and I can generate simple predictions for lme models with various levels -- I am familiar with the basics. Is there a way to get prediction intervals for gls models? My best model uses varPower(), so I am reluctant to fall back on lm predictions. Thank you, Hank Dr. Hank Stevens, Associate Professor 338 Pearson Hall Botany Department Miami University Oxford, OH 45056 Office: (513) 529-4206 Lab: (513) 529-4262 FAX: (513) 529-4243 http://www.cas.muohio.edu/~stevenmh/ http://www.muohio.edu/ecology/ http://www.muohio.edu/botany/ E Pluribus Unum __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] x,y,z table to matrix with x as rows and y as columns
Hello all, I am sure I am missing something obvious but I cannot find the function I am looking for. I have a data frame with three columns: X, Y and Z, with X and Y being grid coordinates and Z the value associated with these coordinates. I want to transform this data frame in a matrix of Z values, on the grid defined by X and Y (and, as a plus, fill the X.Y combinations which do no exist in the original data frame with NAs in the resulting matrix). I could do this manually but I guess the appropriate function should be somewhere around. I just can't find it. Thank you in advance for your help. JiHO --- http://jo.irisson.free.fr/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] plotting a summary.rq object in using pkg quantreg
Package questions to package maintainers please. The short answer is that your alpha = .4 parameter needs to be passed to summary not to plot. Try this: plot(summary(rq(foodexp~income,tau = 1:49/50,data=engel),alpha =. 4), nrow=1, ncol=2, ols = TRUE) A longer answer would involve a boring disquisition about various fitting methods and standard error estimation methods and their historical evolution and defaults. (By default rank-based confidence bands are being used for the engel data since the sample size is relatively small.) Regarding your more fundamental question: you can always modify functions such as summary.rq or plot.summary.rqs -- see for example ?fix. url:www.econ.uiuc.edu/~rogerRoger Koenker email [EMAIL PROTECTED] Department of Economics vox:217-333-4558University of Illinois fax:217-244-6678Champaign, IL 61820 On Jul 24, 2007, at 11:07 AM, Jeff G. wrote: Hello, I am having problems adjusting the plot output from the quantreg package. Anyone know what I'm doing wrong? For example (borrowing from the help files): plot(summary(rq(foodexp~income,tau = 1:49/50,data=engel)), nrow=1, ncol=2,alpha = .4, ols = TRUE, xlab=test) The alpha= parameter seems to have no effect on my output, even when I set it to a ridiculous value like 0.4. Also, though in the help file it says |...| = optional arguments to plot, xlab (as an example) seems to do nothing. If the answer is that I should extract the values I need and construct the plot I want independently of the rq.process object, that it okay I suppose, if inefficient. Maybe a more fundamental question is how do I get in and see how plot is working in this case so that I can modify. Thanks much! J P.S. I've explored using plot.summary.rqs but the problems seem to be the same. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Calculating subsets of row pairs using somthing faster than a for loop.
Hi all, Situation: - I have two matrices each w/ 4 rows and 20 columns. mat1 - matrix(sample(1:500,80), ncol = 20, dimnames=list(paste(mat1row, 1:4, sep=), paste(mat1col, 1:20, sep=))) mat2 - matrix(sample(501:1000,80), ncol = 20, dimnames=list(paste(mat2row, 1:4, sep=), paste(mat2col, 1:20, sep=))) - Each column represents a value in a time series. Q: What do I want: Calculate moving average correlations for each row x row pair: For each row x row pair I want 10 values representing moving average correlations for 10 sets of time-values: cor(mat1[1,1:10], mat2[1,1:10]) cor(mat1[1,2:11], mat2[1,2:11]) ... cor(mat1[1,11:20], mat2[1,11:20]) cor(mat1[1,1:10], mat2[2,1:10]) ... cor(mat1[4,11:20], mat2[4,11:20]) Result would be a 16 (rows) x 10 (col) matrix matMA ma1, ma2, ..., ma10 for (mat1 row1) x (mat2 row1) ma1, ma2, ..., ma10 for (mat1 row1) x (mat2 row2) ... ma1, ma2, ..., ma10 for (mat1 row4) x (mat2 row3) ma1, ma2, ..., ma10 for (mat1 row4) x (mat2 row4) I would like to be able to do this without using a for loop due to the slowness of that method. Is it possible to iterate through subsets w/o using a for loop? Thanks, - Bruce P ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] apply incompatible dimensions error
Thanks Benilton, I know what I want to do, just not sure how to do it using R. The help documentation is not very clear. What I am trying to do is calculate correlations on a row against row basis: mat1 row1 x mat2 row1, mat1 row1 x mat2 row2, ... mat1 row1 x mat2 row-n, mat1 row-n, mat2 row-n - Bruce -Original Message- From: Benilton Carvalho [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 24, 2007 11:31 AM To: Bernzweig, Bruce (Consultant) Cc: r-help@stat.math.ethz.ch Subject: Re: [R] apply incompatible dimensions error are you positive that your function is doing what you expect it to do? it looks like you want something like: sapply(1:10, function(i) cor(mat1[i,], mat2[i,])) b On Jul 24, 2007, at 11:05 AM, Bernzweig, Bruce ((Consultant)) wrote: Hi, I've created the following two matrices (mat1 and mat2) and a function (f) to calculate the correlations between the two on a row by row basis. mat1 - matrix(sample(1:500,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) mat2 - matrix(sample(501:1000,50), ncol = 5, dimnames=list(paste(row, 1:10, sep=), paste(col, 1:5, sep=))) f - function(x,y) cor(x,y) When the matrices are squares (# rows = # columns) I have no problems. However, when they are not (as in the example above with 5 columns and 10 rows), I get the following error: apply(mat1, 1, f, y=mat2) Error in cor(x, y, na.method, method == kendall) : incompatible dimensions Any help would be appreciated. Thanks! - Bruce ** Please be aware that, notwithstanding the fact that the pers... {{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Computing the rank of a matrix.
RV == RAVI VARADHAN [EMAIL PROTECTED] on Sat, 07 Apr 2007 18:39:36 -0400 writes: this is a bit a late reply... better late than never RV Hi Martin, Hi Ravi, thanks for your research. RV I played a bit with rankMat. I will first state the RV conclusions of my numerical experiments and then present RV the results to support them: RV 1. I don't believe that rankMat, or equivalently RV Matlab's rank computation, is necessarily better than RV qr(.)$rank or (qr., LAPACK=TRUE)$rank. In fact, for the RV notorious Hilbert matrix, rankMat can give poor RV estimates of rank. RV 2. There exists no universally optimal (i.e. optimal RV for all A) tol in qr(A, tol)$rank that would be RV optimally close to rankMat. The discrepancy in the RV ranks computed by qr(A)$rank and rankMat(A) obviously RV depends on the matrix A. This is evident from the tol RV used in rankMat: RV tol - max(d) * .Machine$double.eps * abs(singValA[1]) RV So, this value of tol in qr will also minimize the discrepancy. RV 3. The tol parameter is irrelevant in qr(A, tol, RV LAPACK=TRUE)$rank, i.e. LAPACK doesn't seem to utilize RV the tol parameter when computing the rank. However, RV qr(A, tol, LAPACK=FALSE)$rank does depend on tol. Yes, indeed! The help page of qr() actually says so {at least now, don't know about 3 months ago}. RV Now, here are the results: RV 1. hilbert.rank - matrix(NA,20,3) hilbert - function(n) { i - 1:n; 1 / outer(i - 1, i, +) } for (i in 1:20) { RV + himat - hilbert(i) RV + hilbert.rank[i,1] - rankMat(himat) RV + hilbert.rank[i,2] - qr(himat,tol=1.0e-14)$rank RV + hilbert.rank[i,3] - qr(himat, tol = .Machine$double.eps, LAPACK = TRUE)$rank RV + } hilbert.rank RV [,1] [,2] [,3] RV [1,]111 RV [2,]222 RV [3,]333 RV [4,]444 RV [5,]555 RV [6,]666 RV [7,]777 RV [8,]888 RV [9,]999 RV [10,] 10 10 10 RV [11,] 10 11 11 RV [12,] 11 12 12 RV [13,] 11 12 13 RV [14,] 11 13 14 RV [15,] 12 13 15 RV [16,] 12 15 16 RV [17,] 12 16 17 RV [18,] 12 16 18 RV [19,] 13 17 19 RV [20,] 13 17 20 RV We see that rankMat underestimates the rank for n 10, but qr(himat, tol = .Machine$double.eps, LAPACK=TRUE)$rank gets it right. Yes, indeed; and that's seems a bit against the ``common knowledge'' that svd() is more reliable than qr() Hmm I'm still baffled a bit.. Note that with the Hilbert matrices, one might argue that hilbert(20) might really not have a correct estimated rank of 20, but at least for hilbert(13) or so, the rank should be == n. BTW, there's a nice plot, slightly related to this problem, using rcond() from the Matrix package : library(Matrix) hilbert - function(n) { i - 1:n; 1 / outer(i - 1, i, +) } rcHilb - sapply(1:20, function(n) rcond(Matrix(hilbert(n plot(rcHilb, xlab = n, log = y, col = 2, type = b, main = reciprocal condition numbers of Hilbert(n)) where one sees that the LAPACK code that underlies Matrix::rcond() also gets into problem when estimating the condition number for hilbert(n) when n ~ 16 . I think if we wanted to make real progress here, we'd have to consult with numerical analyist specialists. But for me the topic is too remote to be followed up further at the moment. One conclusion for me is that to estimate the rank of a matrix in current versions of R, one should use rankMat - function(x) qr(x, LAPACK = TRUE)$rank (as was suggested as one possibility in the original thread) Regards, and thank you again, Ravi. Martin Maechler, ETH Zurich RV 2. RV Here I first, created a random rectangular matrix, and then added a number of rows to it, where these new rows are the same as some of the original rows except for a tiny amount of noise, which I call eps. So, the degree of rank deficiency is controlled by eps. I let eps take on 3 values: 1.e-07, 1.e-10, and 1.e-14, and show that the optimal tol (in terms of being close to rankMat) depends on the level of precision (eps) in the matrix. set.seed(123) nrow - 15 ncol - 20 nsim - 5000 ndefic - 4 # number of nearly dependent rows eps - 1.e-07 rnk - matrix(NA, nsim, 5) for (j in 1:nsim) { RV + A - matrix(rnorm(nrow*ncol),nrow, ncol) RV + newrows - matrix(NA, ndefic, ncol) RV + for (i in 1:ndefic) { RV + newrows[i,] - A[nrow-i,] + rnorm(ncol, sd=min(abs(A[nrow-i+1,]))* eps) RV + } RV + RV + B - rbind(A,newrows) RV + rnk[j,1] - rankMat(B) RV + rnk[j,2] - qr(B, tol = 1.e-07)$rank RV + rnk[j,3] - qr(B, tol = 1.e-11)$rank RV + rnk[j,4] - qr(B, tol = 1.0e-14)$rank RV + rnk[j,5] - qr(B,
[R] quantreg behavior changes for N1000
Hello again R-experts and novices (like me), This seems like a bug to me - or maybe it's intentional...can anyone confirm? Up to 1000 reps, summary() of a rq object gives different output and subtly different confidence interval estimates. ThanksJeff testx=runif(1200) testy=rnorm(1200, 5) test.rq=summary(rq(testy[1:1000]~testx[1:1000], tau=2:98/100)) test.rq[[1]] Gives this output: Call: rq(formula = testy[1:1000] ~ testx[1:1000], tau = 2:98/100) tau: [1] 0.02 Coefficients: coefficients lower bd upper bd (Intercept)3.00026 2.45142 3.17098 testx[1:1000] -0.00870 -0.39817 0.49946 test.rq=summary(rq(testy[1:1001]~testx[1:1001], tau=2:98/100)) test.rq[[1]] Gives this (different) output: Call: rq(formula = testy[1:1001] ~ testx[1:1001], tau = 2:98/100) tau: [1] 0.02 Coefficients: ValueStd. Error t value Pr(|t|) (Intercept)3.00026 0.21605 13.88658 0.0 testx[1:1001] -0.00870 0.32976 -0.02638 0.97896 plot(test.rq, nrow=2, ncol=2) # The slope estimates appear to be the same but there are subtle differences in the confidence intervals, which shouldn't be due simply to the inclusion of one more point. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] quantreg behavior changes for N1000
When in doubt: RTFM -- Quoting from ?summary.rq se: specifies the method used to compute standard standard errors. There are currently five available methods: 1. 'rank' which produces confidence intervals for the estimated parameters by inverting a rank test as described in Koenker (1994). The default option assumes that the errors are iid, while the option iid = FALSE implements the proposal of Koenker Machado (1999). This is the default method unless the sample size exceeds 1001, or cov = FALSE in which case se = nid is used. url:www.econ.uiuc.edu/~rogerRoger Koenker email[EMAIL PROTECTED]Department of Economics vox: 217-333-4558University of Illinois fax: 217-244-6678Champaign, IL 61820 On Jul 24, 2007, at 12:57 PM, Jeff G. wrote: Hello again R-experts and novices (like me), This seems like a bug to me - or maybe it's intentional...can anyone confirm? Up to 1000 reps, summary() of a rq object gives different output and subtly different confidence interval estimates. ThanksJeff testx=runif(1200) testy=rnorm(1200, 5) test.rq=summary(rq(testy[1:1000]~testx[1:1000], tau=2:98/100)) test.rq[[1]] Gives this output: Call: rq(formula = testy[1:1000] ~ testx[1:1000], tau = 2:98/100) tau: [1] 0.02 Coefficients: coefficients lower bd upper bd (Intercept)3.00026 2.45142 3.17098 testx[1:1000] -0.00870 -0.39817 0.49946 test.rq=summary(rq(testy[1:1001]~testx[1:1001], tau=2:98/100)) test.rq[[1]] Gives this (different) output: Call: rq(formula = testy[1:1001] ~ testx[1:1001], tau = 2:98/100) tau: [1] 0.02 Coefficients: ValueStd. Error t value Pr(|t|) (Intercept)3.00026 0.21605 13.88658 0.0 testx[1:1001] -0.00870 0.32976 -0.02638 0.97896 plot(test.rq, nrow=2, ncol=2) # The slope estimates appear to be the same but there are subtle differences in the confidence intervals, which shouldn't be due simply to the inclusion of one more point. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [R-pkgs] New package: pomp, inference for partially-observed Markov processes
To: [EMAIL PROTECTED] Subject: New package: pomp, inference for partially-observed Markov processes The new package 'pomp' is built around a very general realization of nonlinear partially-observed Markov processes (AKA state-space models, nonlinear stochastic dynamical systems). The user provides functions specifying the model's process and measurement components. The package's algorithms are built on top of these functions. At the moment, algorithms are provided for particle filtering (AKA sequential importance sampling or sequential Monte Carlo) and the likelihood maximization by iterated filtering (MIF) method of Ionides, Breto, and King (PNAS, 103:18438-18443, 2006). Future support for a variety of other algorithms is envisioned. A working group of the National Center for Ecological Analysis and Synthesis (NCEAS), Inference for Mechanistic Models, is currently implementing additional methods for this package. Simple worked examples are provided in the form of a vignette, random_walk_example. The package is provided under the GPL. Contributions are welcome, as are comments, suggestions, examples, and bug reports. The development of this package has been aided by support from the U.S. N.S.F (Grants #EF-0545276, #EF-0430120) and by the Inference for Mechanistic Models Working Group supported by the National Center for Ecological Analysis and Synthesis, a Center funded by NSF (Grant #DEB-0553768), the University of California, Santa Barbara, and the State of California. -- Aaron A. King, Ph.D. Ecology Evolutionary Biology University of Michigan http://www.umich.edu/~kingaa GPG Public Key: 0x2B00840F ___ R-packages mailing list [EMAIL PROTECTED] https://stat.ethz.ch/mailman/listinfo/r-packages __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Renamig a factor
Which is the proper way to rename a factor? If I do: test$Parc[test$Parc==OlÞrdola]-Olèrdola R complains that Warning message: invalid factor level, NAs generated in: `[-.factor`(`*tmp*`, test$Parc == OlÞrdola, value = Olèrdola) Thanks Agus -- Dr. Agustin Lobo Institut de Ciencies de la Terra Jaume Almera (CSIC) LLuis Sole Sabaris s/n 08028 Barcelona Spain Tel. 34 934095410 Fax. 34 934110012 email: [EMAIL PROTECTED] http://www.ija.csic.es/gt/obster __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Latent class analysis with ordinal manifest variables
Hi, I need a package to run Latent Class Analysis with ordinal manifest variables but i have not found it. LCA is for dichotomous variables and poLCA is for nominal ones. Can anyone tell me where i can find such a package if it exists. Thanks. Eduard [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] x,y,z table to matrix with x as rows and y as columns
jiho jo.irisson at gmail.com writes: Hello all, I am sure I am missing something obvious but I cannot find the function I am looking for. I have a data frame with three columns: X, Y and Z, with X and Y being grid coordinates and Z the value associated with these coordinates. I want to transform this data frame in a matrix of Z values, on the grid defined by X and Y (and, as a plus, fill the X.Y combinations which do no exist in the original data frame with NAs in the resulting matrix). I could do this manually but I guess the appropriate function should be somewhere around. I just can't find it. I have used xtabs in the past, but xtabs returns 0 instead of NA, which makes great for cross-tabulation, the offending line is x[is.na(x)] - 0. So you would need to either modify the xtabs function or trust that z is never 0 and replace all 0's with NA. mydat - expand.grid(x=1:5, y=1:5) mydat - data.frame(mydat, z=rnorm(25)) mydat$z[sample(1:25,4)] - NA mytab - xtabs(z~x+y, mydat) Mark Lyman __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] How to add circular text for a graph with concentric circles
Dear R experts, I am plotting the population of students who live in a city, and in successive circular bands made of the contiguous districts that surround the city. This is a stylized figure, where I specify the area of each successive circle based on the cumulative population of students. I want to compare two sets of concentric circles across different populations - such as 'All students' and 'Private students' (those attending private school) by using the same colours and the same dimension of the outer circle. I have attached the .pdf file with the output, and the R code to generate the first set of circles. I would appreciate any tips about how to rotate the text label that marks each concentric circle (except the central circle) to be curved around the circle, located in the middle of each band, thus following the circle instead of being horizontal, as I have it now. Thank you very much, best regards, Suhas # Conurbano1a.R # R Program to generate circles, radii based on proportion of ALL STUDENTS # Prepared by Suhas, Tuesday, July 24, 2007 grid.circle(x=0.5, y=0.5, r=3*(.1268), default.units=npc, name=NULL, gp=gpar(fill=olivedrab1,col=NULL), draw=TRUE, vp=NULL) grid.circle(x=0.5, y=0.5, r=3*(0.095882), default.units=npc, name=NULL, gp=gpar(fill=cornflowerblue,col=NULL), draw=TRUE, vp=NULL) grid.circle(x=0.5, y=0.5, r=3*(0.077894), default.units=npc, name=NULL, gp=gpar(fill=aliceblue,col=NULL), draw=TRUE, vp=NULL) grid.circle(x=0.5, y=0.5, r=3*(0.061884), default.units=npc, name=NULL, gp=gpar(fill=seagreen,col=NULL), draw=TRUE, vp=NULL) grid.circle(x=0.5, y=0.5, r=3*(0.050740), default.units=npc, name=NULL, gp=gpar(fill=lightpink2,col=NULL), draw=TRUE, vp=NULL) grid.circle(x=0.5, y=0.5, r=3*(0.045906), default.units=npc, name=NULL, gp=gpar(fill=navy,col=NULL), draw=TRUE, vp=NULL) # Now to add the labels ? used trial and error for x and y values grid.text(Provincia Interior, x=0.5, y=0.85, gp=gpar(fontsize=6, col=black)) grid.text(Conurbano 4, x=0.5, y=0.75, gp=gpar(fontsize=6, col=black)) grid.text(Conurbano 3, x=0.5, y=0.71, gp=gpar(fontsize=6, col=black)) grid.text(Conurbano 2, x=0.5, y=0.67, gp=gpar(fontsize=6, col=white)) grid.text(Conurbano 1, x=0.5, y=0.645, gp=gpar(fontsize=6, col=black)) grid.text(Ciudad de Buenos Aires, x=0.5, y=0.56, gp=gpar(fontsize=6, col=white)) # Title of graph grid.text(All Students, x=0.2,y=0.95,gp=gpar(fontsize=20,col=navy)) ** Suhas D. Parandekar Senior Education Economist Latin American and Caribbean Region Tel: 202 458 7622 e-mail: [EMAIL PROTECTED]__ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using lmer with huge amount of data
Thank you very much for the response, Prof. Ripley. I think I am missing something here: how do you make this 'huge' and 'gigantic'? You have not told us how many subjects you have, but in imaging experiments it is usually no more than 50 and often less. Usually we have 10-30 subjects. For each subject you have 3 x 30,000 responses plus an age. That is under 1Mb of data per subject, so the problem looks modest unless you have many hundreds of subjects. Nothing says you need to read the data in one go, but it will be helpful to have all the data available to R at once (although this could be alleviated by using a DBMS interface). In the hypothetical situation I mentioned in my previous mail (suppose we have 12 subjects), all the input data would be stored in 3 X 12 files each of which contains 30,000 numbers, plus one more file for age. Sure I can read in those 27 files at once, and I'm not concerned about the data size at this point, but my question is: do I have to reshuffle those 36 files and create 30,000 separate arrays (one for each voxel) in R so that I could run lmer voxel-wise? I think the problem is rather going to be running 30,000 lmer fits, which in my experience often take seconds each. Each fit will only need a modest amount of data (3 responses and one age per subject). Right. What is the most efficient strategy to run such an analysis voxel-wise? Write a function, and then use apply()? Or simply do it in a loop? Thanks, Gang On Tue, 24 Jul 2007, Gang Chen wrote: Based on the examples I've seen in using statistical analysis packages such as lmer, it seems that people usually tabulate all the input data into one file with the first line indicating the variable names (or labels), and then read the file inside R. However, in my case I can't do that because of the huge amount of imaging data. Suppose I have a one-way within-subject ANCOVA with one covariate, and I would like to use lmer in R package lme4 to analyze the data. In the terminology of linear mixed models, I have a fixed factor A with 3 levels, a random factor B (subject), and a covariate (age) with a model like this MyResult - lmer(Response ~ FactorA + Age + (1 | subject), MyData, ...) My input data are like this: For each subject I have a file (a huge matrix) storing the response values of the subject at many locations (~30,000 voxels) corresponding to factor A at the 1st level, another file for factor A at the 2nd level, and a 3rd file for factor A at the 3rd level. Then I have another file storing the age of those subjects. The analysis with the linear mixed model above would be done at each voxel separately. It seems impractical to create one gigantic file or matrix to feed into the above command line because of the big number of voxels. I'm not sure how to proceed in this case. Any suggestions would be highly appreciated. Also if I'm concerned about any potential violation of sphericity among the 3 levels of factor A, how can I test sphericity violation in lmer? And if violation exists, how can I make corrections in contrast testing? Thank you very much, Gang -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] x,y,z table to matrix with x as rows and y as columns
I am sure I am missing something obvious but I cannot find the function I am looking for. I have a data frame with three columns: X, Y and Z, with X and Y being grid coordinates and Z the value associated with these coordinates. I want to transform this data frame in a matrix of Z values, on the grid defined by X and Y (and, as a plus, fill the X.Y combinations which do no exist in the original data frame with NAs in the resulting matrix). I could do this manually but I guess the appropriate function should be somewhere around. I just can't find it. Immediately after my last post I realized there was a much better solution mydat - expand.grid(x=1:5, y=1:5) mydat - data.frame(mydat, z=rnorm(25)) mydat$z[sample(1:25,4)] - NA data2mat - function(x, y, z) + { + out - matrix(unlist(split(z, interaction(x,y))), ncol=length(unique(y))) + dimnames(out) - list(unique(x), unique(y)) + out + } with(mydat, data2mat(x, y, z)) Mark Lyman __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Renamig a factor
Try: levels(test$Parc)[levels(test$Parc)==OlÞrdola] - Olèrdola -- Henrique Dallazuanna Curitiba-Paraná-Brasil 25° 25' 40 S 49° 16' 22 O On 24/07/07, Agustin Lobo [EMAIL PROTECTED] wrote: Which is the proper way to rename a factor? If I do: test$Parc[test$Parc==OlÞrdola]-Olèrdola R complains that Warning message: invalid factor level, NAs generated in: `[-.factor`(`*tmp*`, test$Parc == OlÞrdola, value = Olèrdola) Thanks Agus -- Dr. Agustin Lobo Institut de Ciencies de la Terra Jaume Almera (CSIC) LLuis Sole Sabaris s/n 08028 Barcelona Spain Tel. 34 934095410 Fax. 34 934110012 email: [EMAIL PROTECTED] http://www.ija.csic.es/gt/obster __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] ggplot2 axis color
Hi: Does anyone have an idea on how to color the axis and labels using ggplot2? This is what I got: library(ggplot2) p - qplot(total_bill, tip, data = tips) NewPlot- p + geom_abline(slope=c(0.1,0.15,0.2), colour=c(red,blue,yellow),size=c(2,5,2)) NewPlot + geom_smooth(colour=green, size=3,linetype=3) NewPlot$background.fill-cornsilk NewPlot$background.colour - blue NewPlot$axis.colour-red ? it doesn't do it Thanks Felipe D. Carrillo Fishery Biologist US Fish Wildlife Service Red Bluff, California 96080 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Calculating subsets of row pairs using somthing faster than a for loop.
I doubt its any faster than using a loop but probably less code is: library(zoo) z1 - zoo(t(mat1)); z2 - zoo(t(mat2)) idx - 1:ncol(z1) out - rollapply(cbind(z1, z2), 11, by.column = FALSE, FUN = function(x) cor(x[,idx],x[,-idx])) will give you a 10 x 16 multivariate zoo series such that: out[1, ] is c(cor(t(mat1[,1:11]), t(mat2[,1:11]))) out[2, ] is c(cor(t(mat1[,2:12]), t(mat2[,2:12]))) etc. and t(out) is a matrix in the orientation you asked for. Try library(zoo) vignette(zoo) for an intro to zoo. On 7/24/07, Bernzweig, Bruce (Consultant) [EMAIL PROTECTED] wrote: Hi all, Situation: - I have two matrices each w/ 4 rows and 20 columns. mat1 - matrix(sample(1:500,80), ncol = 20, dimnames=list(paste(mat1row, 1:4, sep=), paste(mat1col, 1:20, sep=))) mat2 - matrix(sample(501:1000,80), ncol = 20, dimnames=list(paste(mat2row, 1:4, sep=), paste(mat2col, 1:20, sep=))) - Each column represents a value in a time series. Q: What do I want: Calculate moving average correlations for each row x row pair: For each row x row pair I want 10 values representing moving average correlations for 10 sets of time-values: cor(mat1[1,1:10], mat2[1,1:10]) cor(mat1[1,2:11], mat2[1,2:11]) ... cor(mat1[1,11:20], mat2[1,11:20]) cor(mat1[1,1:10], mat2[2,1:10]) ... cor(mat1[4,11:20], mat2[4,11:20]) Result would be a 16 (rows) x 10 (col) matrix matMA ma1, ma2, ..., ma10 for (mat1 row1) x (mat2 row1) ma1, ma2, ..., ma10 for (mat1 row1) x (mat2 row2) ... ma1, ma2, ..., ma10 for (mat1 row4) x (mat2 row3) ma1, ma2, ..., ma10 for (mat1 row4) x (mat2 row4) I would like to be able to do this without using a for loop due to the slowness of that method. Is it possible to iterate through subsets w/o using a for loop? Thanks, - Bruce P ** Please be aware that, notwithstanding the fact that the pers...{{dropped}} __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Obtaining summary of frequencies of value occurrences for a variable in a multivariate dataset.
The name of the table should give you the value. And if you have a matrix, you just need to convert it into a vector first. m - matrix( LETTERS[ c(1:3, 3:5, 2:4) ], nc=3 ) m [,1] [,2] [,3] [1,] A C B [2,] B D C [3,] C E D tb - table( as.vector(m) ) tb A B C D E 1 2 3 2 1 paste( names(tb), :, tb, sep= ) [1] A:1 B:2 C:3 D:2 E:1 If this is not what you want, then please give a simple example. Regards, Adai Allan Kamau wrote: Hi all, If the question below as been answered before I apologize for the posting. I would like to get the frequencies of occurrence of all values in a given variable in a multivariate dataset. In short for each variable (or field) a summary of values contained with in a value:frequency pair, there can be many such pairs for a given variable. I would like to do the same for several such variables. I have used table() but am unable to extract the individual value and frequency values. Please advise. Allan. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Regression trees using Goodness-of-Split
Hi I have two questions: 1) I would like to know if there is a package in R that constructs a regression tree using the 'goodness-of-split' algorithm for survival analysis proposed by Le Blanc and Crowley (1993) (rather than the usual CART algorithm that uses within-node difference and impurity functions). In other words, is there a tree package based on between-node difference using a two sample test e.g log rank test. 2)If such a package exists, are there user-defined split functions (can we substitute other split functions). Thanks for any help Fiona -- Fiona Callaghan, MS A432 Crabtree Hall Department of Biostatistics Graduate School of Public Health University of Pittsburgh Phone 412 624 3063 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ggplot2 axis color
Hi Felipe, Looks like a bug! I'll try and get it fixed for the next version. In the meantime, you can read the last chapter of the ggplot book to see how to fix it with grid. Hadley On 7/24/07, Felipe Carrillo [EMAIL PROTECTED] wrote: Hi: Does anyone have an idea on how to color the axis and labels using ggplot2? This is what I got: library(ggplot2) p - qplot(total_bill, tip, data = tips) NewPlot- p + geom_abline(slope=c(0.1,0.15,0.2), colour=c(red,blue,yellow),size=c(2,5,2)) NewPlot + geom_smooth(colour=green, size=3,linetype=3) NewPlot$background.fill-cornsilk NewPlot$background.colour - blue NewPlot$axis.colour-red ? it doesn't do it Thanks Felipe D. Carrillo Fishery Biologist US Fish Wildlife Service Red Bluff, California 96080 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Using lmer with huge amount of data
On Tue, 24 Jul 2007, Gang Chen wrote: Thank you very much for the response, Prof. Ripley. I think I am missing something here: how do you make this 'huge' and 'gigantic'? You have not told us how many subjects you have, but in imaging experiments it is usually no more than 50 and often less. Usually we have 10-30 subjects. For each subject you have 3 x 30,000 responses plus an age. That is under 1Mb of data per subject, so the problem looks modest unless you have many hundreds of subjects. Nothing says you need to read the data in one go, but it will be helpful to have all the data available to R at once (although this could be alleviated by using a DBMS interface). In the hypothetical situation I mentioned in my previous mail (suppose we have 12 subjects), all the input data would be stored in 3 X 12 files each of which contains 30,000 numbers, plus one more file for age. Sure I can read in those 27 files at once, and I'm not concerned about the data size at this point, but my question is: do I have to reshuffle those 36 files and create 30,000 separate arrays (one for each voxel) in R so that I could run lmer voxel-wise? No. You can index data structures in R. I think the problem is rather going to be running 30,000 lmer fits, which in my experience often take seconds each. Each fit will only need a modest amount of data (3 responses and one age per subject). Right. What is the most efficient strategy to run such an analysis voxel-wise? Write a function, and then use apply()? Or simply do it in a loop? apply() is a loop internally. I would just use a for() loop here, probably running groups of voxels in different jobs run simultaneously on multi-CPU machines. Thanks, Gang On Tue, 24 Jul 2007, Gang Chen wrote: Based on the examples I've seen in using statistical analysis packages such as lmer, it seems that people usually tabulate all the input data into one file with the first line indicating the variable names (or labels), and then read the file inside R. However, in my case I can't do that because of the huge amount of imaging data. Suppose I have a one-way within-subject ANCOVA with one covariate, and I would like to use lmer in R package lme4 to analyze the data. In the terminology of linear mixed models, I have a fixed factor A with 3 levels, a random factor B (subject), and a covariate (age) with a model like this MyResult - lmer(Response ~ FactorA + Age + (1 | subject), MyData, ...) My input data are like this: For each subject I have a file (a huge matrix) storing the response values of the subject at many locations (~30,000 voxels) corresponding to factor A at the 1st level, another file for factor A at the 2nd level, and a 3rd file for factor A at the 3rd level. Then I have another file storing the age of those subjects. The analysis with the linear mixed model above would be done at each voxel separately. It seems impractical to create one gigantic file or matrix to feed into the above command line because of the big number of voxels. I'm not sure how to proceed in this case. Any suggestions would be highly appreciated. Also if I'm concerned about any potential violation of sphericity among the 3 levels of factor A, how can I test sphericity violation in lmer? And if violation exists, how can I make corrections in contrast testing? Thank you very much, Gang -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Computing the rank of a matrix.
Hi Martin, I just realized (courtesy: ?qr) that LAPACK=TRUE always gives full rank, no matter what the matrix and tolerance are. So, clearly my results using LAPACK=TRUE should be ignored. So, the real comparison is only between rankMat and qr(., LAPACK=FALSE)$rank. I cant help but fell that there can be no correct solution to an ill-posed problem. Furthermore, I haven't come across a real application where the numerical estimate of a rank is directly useful. Best, Ravi. --- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: [EMAIL PROTECTED] Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html -Original Message- From: Martin Maechler [mailto:[EMAIL PROTECTED] Sent: Tuesday, July 24, 2007 1:43 PM To: RAVI VARADHAN Cc: Martin Maechler; r-help@stat.math.ethz.ch Subject: Re: [R] Computing the rank of a matrix. RV == RAVI VARADHAN [EMAIL PROTECTED] on Sat, 07 Apr 2007 18:39:36 -0400 writes: this is a bit a late reply... better late than never RV Hi Martin, Hi Ravi, thanks for your research. RV I played a bit with rankMat. I will first state the RV conclusions of my numerical experiments and then present RV the results to support them: RV 1. I don't believe that rankMat, or equivalently RV Matlab's rank computation, is necessarily better than RV qr(.)$rank or (qr., LAPACK=TRUE)$rank. In fact, for the RV notorious Hilbert matrix, rankMat can give poor RV estimates of rank. RV 2. There exists no universally optimal (i.e. optimal RV for all A) tol in qr(A, tol)$rank that would be RV optimally close to rankMat. The discrepancy in the RV ranks computed by qr(A)$rank and rankMat(A) obviously RV depends on the matrix A. This is evident from the tol RV used in rankMat: RV tol - max(d) * .Machine$double.eps * abs(singValA[1]) RV So, this value of tol in qr will also minimize the discrepancy. RV 3. The tol parameter is irrelevant in qr(A, tol, RV LAPACK=TRUE)$rank, i.e. LAPACK doesn't seem to utilize RV the tol parameter when computing the rank. However, RV qr(A, tol, LAPACK=FALSE)$rank does depend on tol. Yes, indeed! The help page of qr() actually says so {at least now, don't know about 3 months ago}. RV Now, here are the results: RV 1. hilbert.rank - matrix(NA,20,3) hilbert - function(n) { i - 1:n; 1 / outer(i - 1, i, +) } for (i in 1:20) { RV + himat - hilbert(i) RV + hilbert.rank[i,1] - rankMat(himat) RV + hilbert.rank[i,2] - qr(himat,tol=1.0e-14)$rank RV + hilbert.rank[i,3] - qr(himat, tol = .Machine$double.eps, LAPACK = TRUE)$rank RV + } hilbert.rank RV [,1] [,2] [,3] RV [1,]111 RV [2,]222 RV [3,]333 RV [4,]444 RV [5,]555 RV [6,]666 RV [7,]777 RV [8,]888 RV [9,]999 RV [10,] 10 10 10 RV [11,] 10 11 11 RV [12,] 11 12 12 RV [13,] 11 12 13 RV [14,] 11 13 14 RV [15,] 12 13 15 RV [16,] 12 15 16 RV [17,] 12 16 17 RV [18,] 12 16 18 RV [19,] 13 17 19 RV [20,] 13 17 20 RV We see that rankMat underestimates the rank for n 10, but qr(himat, tol = .Machine$double.eps, LAPACK=TRUE)$rank gets it right. Yes, indeed; and that's seems a bit against the ``common knowledge'' that svd() is more reliable than qr() Hmm I'm still baffled a bit.. Note that with the Hilbert matrices, one might argue that hilbert(20) might really not have a correct estimated rank of 20, but at least for hilbert(13) or so, the rank should be == n. BTW, there's a nice plot, slightly related to this problem, using rcond() from the Matrix package : library(Matrix) hilbert - function(n) { i - 1:n; 1 / outer(i - 1, i, +) } rcHilb - sapply(1:20, function(n) rcond(Matrix(hilbert(n plot(rcHilb, xlab = n, log = y, col = 2, type = b, main = reciprocal condition numbers of Hilbert(n)) where one sees that the LAPACK code that underlies Matrix::rcond() also gets into problem when estimating the condition number for hilbert(n) when n ~ 16 . I think if we wanted to make real progress here, we'd have to consult with numerical analyist specialists. But for me the topic is too remote to be followed up further at the moment. One conclusion for me is that to estimate the rank of a matrix in current versions of R, one should use rankMat - function(x) qr(x, LAPACK = TRUE)$rank (as was suggested as one possibility in the original
Re: [R] Using lmer with huge amount of data
I think I am missing something here: how do you make this 'huge' and 'gigantic'? You have not told us how many subjects you have, but in imaging experiments it is usually no more than 50 and often less. For each subject you have 3 x 30,000 responses plus an age. That is under 1Mb of data per subject, so the problem looks modest unless you have many hundreds of subjects. Nothing says you need to read the data in one go, but it will be helpful to have all the data available to R at once (although this could be alleviated by using a DBMS interface). I think the problem is rather going to be running 30,000 lmer fits, which in my experience often take seconds each. Each fit will only need a modest amount of data (3 responses and one age per subject). On Tue, 24 Jul 2007, Gang Chen wrote: Based on the examples I've seen in using statistical analysis packages such as lmer, it seems that people usually tabulate all the input data into one file with the first line indicating the variable names (or labels), and then read the file inside R. However, in my case I can't do that because of the huge amount of imaging data. Suppose I have a one-way within-subject ANCOVA with one covariate, and I would like to use lmer in R package lme4 to analyze the data. In the terminology of linear mixed models, I have a fixed factor A with 3 levels, a random factor B (subject), and a covariate (age) with a model like this MyResult - lmer(Response ~ FactorA + Age + (1 | subject), MyData, ...) My input data are like this: For each subject I have a file (a huge matrix) storing the response values of the subject at many locations (~30,000 voxels) corresponding to factor A at the 1st level, another file for factor A at the 2nd level, and a 3rd file for factor A at the 3rd level. Then I have another file storing the age of those subjects. The analysis with the linear mixed model above would be done at each voxel separately. It seems impractical to create one gigantic file or matrix to feed into the above command line because of the big number of voxels. I'm not sure how to proceed in this case. Any suggestions would be highly appreciated. Also if I'm concerned about any potential violation of sphericity among the 3 levels of factor A, how can I test sphericity violation in lmer? And if violation exists, how can I make corrections in contrast testing? Thank you very much, Gang -- Brian D. Ripley, [EMAIL PROTECTED] Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272866 (PA) Oxford OX1 3TG, UKFax: +44 1865 272595 __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] crimtab related question
Hi Jean, You haven't yet had a reply from an authoratitive source, so here is my tuppence worth to part of your enquiry. It's almost certain that the receiving box is a receptacle into which tags were placed after they had been drawn and the inscribed measurement noted down. Measurements on three tags were unwittingly not noted before the tags were transferred to the receiving box. They lay there with a good many other tags, so the inscribed measurement/tag couldn't be recovered. I hope this clarifies some points. Regards, Mark. Jean lobry wrote: Dear all, the dataset documented under ?crimtab was also used in: @article{TreloarAE1934, title = {The adequacy of {S}tudent's criterion of deviations in small sample means}, author = {Treloar, A.E. and Wilder, M.A.}, journal = {The Annals of Mathematical Statistics}, volume = {5}, pages = {324-341}, year = {1934} } The following is from page 335 of the above paper: From the table provided by MacDonell (1902) on the associated variation of stature (to the nearest inch) and length of the left middle finger (to the nearest millimeter) in 3000 British criminals, the measusurements were transferred to 3000 numbered Denison metal-rim tags from which the cords has been removed. After thorough checking and mixing of these circular disks, samples of 5 tags each were drawn at random until the supply was exhausted. Unfortunately, three of these samples were erroneously returned to a receiving box before being copied, and the records of 597 samples only are available. Could someone give me a clue about the kind of device that was used here? Is it a kind of lottery machine? I don't understand why three samples were lost. What is this receiving box? Thanks for any hint, Best, -- Jean R. Lobry([EMAIL PROTECTED]) Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - LYON I, 43 Bd 11/11/1918, F-69622 VILLEURBANNE CEDEX, FRANCE allo : +33 472 43 27 56 fax: +33 472 43 13 88 http://pbil.univ-lyon1.fr/members/lobry/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/crimtab-related-question-tf4137237.html#a11772414 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] crimtab related question
While on the subject of mechanical methods of statistical research I can't resist quoting Doob's (1997) Statistical Science interview: My system, complicated by my inaccurate typing, led to retyping material over and over, and for some time I had an electric drill on my desk, provided with an eraser bit which I used to erase typing. I rarely used the system of brushing white fluid over a typed error because I was not patient enough to let the fluid dry before retyping. Long after my first book was done I discovered the tape rolls which cover lines of type. As I typed and retyped my work it became so repugnant to me that I had more and more difficulty even to look at it to check it. This fact accounts for many slips that a careful reading would have discovered. I commonly used a stochastic system of checking, picking a page and then a place on the page at random and reading a few sentences, in order to avoid reading it in context and thereby to avoid reading what was in my mind rather than what I had written. At first I would catch something at almost every trial, and I would continue until several trials would yield nothing. I have tried this system on other authors, betting for example that I would find something to correct on a randomly chosen printed page of text, and nonmathematicans suffering under the delusion that mathematics is errorless would be surprised at how many bets I have won. The relevance to the present inquiry is confirmed by the misspelling of Dennison in the Annals reference quoted below. See, for example: http://www.amazon.com/Avery-Dennison-Metal-Rim-Tags/dp/B000AN376G On the substance of Jean's question, Mark's interpretation seems very plausible. Thanks to Jean and to Martin Maechler for adding this dataset to R. url:www.econ.uiuc.edu/~rogerRoger Koenker email[EMAIL PROTECTED]Department of Economics vox: 217-333-4558University of Illinois fax: 217-244-6678Champaign, IL 61820 On Jul 24, 2007, at 4:42 PM, Mark Difford wrote: Hi Jean, You haven't yet had a reply from an authoratitive source, so here is my tuppence worth to part of your enquiry. It's almost certain that the receiving box is a receptacle into which tags were placed after they had been drawn and the inscribed measurement noted down. Measurements on three tags were unwittingly not noted before the tags were transferred to the receiving box. They lay there with a good many other tags, so the inscribed measurement/tag couldn't be recovered. I hope this clarifies some points. Regards, Mark. Jean lobry wrote: Dear all, the dataset documented under ?crimtab was also used in: @article{TreloarAE1934, title = {The adequacy of {S}tudent's criterion of deviations in small sample means}, author = {Treloar, A.E. and Wilder, M.A.}, journal = {The Annals of Mathematical Statistics}, volume = {5}, pages = {324-341}, year = {1934} } The following is from page 335 of the above paper: From the table provided by MacDonell (1902) on the associated variation of stature (to the nearest inch) and length of the left middle finger (to the nearest millimeter) in 3000 British criminals, the measusurements were transferred to 3000 numbered Denison metal-rim tags from which the cords has been removed. After thorough checking and mixing of these circular disks, samples of 5 tags each were drawn at random until the supply was exhausted. Unfortunately, three of these samples were erroneously returned to a receiving box before being copied, and the records of 597 samples only are available. Could someone give me a clue about the kind of device that was used here? Is it a kind of lottery machine? I don't understand why three samples were lost. What is this receiving box? Thanks for any hint, Best, -- Jean R. Lobry([EMAIL PROTECTED]) Laboratoire BBE-CNRS-UMR-5558, Univ. C. Bernard - LYON I, 43 Bd 11/11/1918, F-69622 VILLEURBANNE CEDEX, FRANCE allo : +33 472 43 27 56 fax: +33 472 43 13 88 http://pbil.univ-lyon1.fr/members/lobry/ __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/crimtab-related- question-tf4137237.html#a11772414 Sent from the R help mailing list archive at Nabble.com. __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting- guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Passing equations as arguments
Friends, I'm trying to pass an equation as an argument to a function. The idea is as follows. Let us say i write an independent function Ideal Situation: ifunc - function(x) { return((x*x)-2) } mainfunc - function(a,b) { evala - ifunc(a) evalb - ifunc(b) if (evalaevalb){return(evala)} else return(evalb) } Now I want to try and write this entire program in a single function with the user specifying the equation as an argument to the function. myfunc - function(a, b, eqn) { func1 - function (x) ?? { return(eqn in terms of x) ?? } Further arguments to check The imply that this does not seem to be correct. The idea is how to assign the equation expression from the main equation into the inner function. Is there anyway to do that within this set up? Thanks in advance Regards Anup - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Overlaying a single contour from a new data array in levelplot
On 7/24/07, Jenny Barnes [EMAIL PROTECTED] wrote: Dear R-Help community, I am trying to overlay a single contour line over a correlation plot using levelplot in the lattice package. These are the two arrays: 1) a correlation plot over Africa - so each grid square is a different colour dependent on correlation - this is in an array: result_cor with dim[465,465] 2) a single contour line from a ***different data source*** - this is from data related to the p-values for the above correlation plot - I want to overlay only the 95% confidence contour. The p-values are stored in an array: result.p.values with same dimensions as above. I have read about using panel.levelplot and panel.contourplot in the R-help mailing list but I don't know the right way to call two different data arrays, can anybody help me please? I appreciate your time and help with this question. I can think of a couple of different ways, but the simplest will probably be to compute the single contour beforehand and add it after the standard levelplot using a panel function. E.g., using the 'volcano' data for both matrices: ## you need the explicit x and y arguments because ## the default is different from levelplot. vcl - contourLines(x = seq_len(nrow(volcano)), y = seq_len(ncol(volcano)), z = volcano, levels = c(172, 182)) levelplot(volcano, add.cl = vcl, panel = function(..., add.cl) { panel.levelplot(...) lapply(add.cl, panel.polygon, border = 'red') }) -Deepayan __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Passing equations as arguments
Here is one possible solution: ifun - function(a, b, FUN){ evala - FUN(a) evalb - FUN(b) if (evala evalb) return(evala) else return(evalb) } ifun(1,2,function(x) (x*x) - 2) On 7/24/07, Anup Nandialath [EMAIL PROTECTED] wrote: Friends, I'm trying to pass an equation as an argument to a function. The idea is as follows. Let us say i write an independent function Ideal Situation: ifunc - function(x) { return((x*x)-2) } mainfunc - function(a,b) { evala - ifunc(a) evalb - ifunc(b) if (evalaevalb){return(evala)} else return(evalb) } Now I want to try and write this entire program in a single function with the user specifying the equation as an argument to the function. myfunc - function(a, b, eqn) { func1 - function (x) ?? { return(eqn in terms of x) ?? } Further arguments to check The imply that this does not seem to be correct. The idea is how to assign the equation expression from the main equation into the inner function. Is there anyway to do that within this set up? Thanks in advance Regards Anup - [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Cincinnati, OH +1 513 646 9390 What is the problem you are trying to solve? __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] aggregate.ts
On Wed, 25 Jul 2007, laimonis wrote: Consider the following scrap of code: ...slightly modified to x1 - ts(1:24, start = c(2000, 10), freq = 12) x2 - ts(1:24, start = c(2000, 11), freq = 12) and then y1 - aggregate(x1, nfreq = 4) gives the desired result while y2 - aggregate(x2, nfreq = 4) probably does not what you would like it to do. In both cases, the 24 observations are broken into 8 segments of equal length (as documented on ?aggregate.ts) and then aggregated. Therefore as.vector(y1) as.vector(y2) are identical (and not matched by quarter...as you would probably like). And don't tell me that the aggregating a monthly series into quarters makes no sense!! (see response to Bug 9798). 1. Your tone is not appropriate. 2. You're not quoting the reply correctly. It said: You cannot aggregate a time series that does not run over quarters into quarters. The speculation is plain wrong. The reply means that aggregate.ts() does not do what you think it does. As I tried to illustrate with the example above. One can probably argue about whether it makes sense to aggregate a monthly time series into quarter when I don't have complete observations in each quarter. But maybe it might be worth considering a change in aggregate.ts() so that the old and new frequency are matched even with incomplete observations? Currently, the zoo implementation allows this: Coercing back and forth gives: library(zoo) z1 - as.ts(aggregate(as.zoo(x1), as.yearqtr, sum)) z2 - as.ts(aggregate(as.zoo(x2), as.yearqtr, sum)) where z1 is identical to y1, and z2 is what you probably want. hth, Z __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] aggregate.ts
Consider the following scrap of code: x- ts(1:50,start=c(1,11),freq=12) y - aggregate(x,nfreq=4) c(y) [1] 6 15 24 33 42 51 60 69 78 87 96 105 114 123 132 141 y Error in rep.int(, start.pad) : invalid number of copies in rep.int() tsp(y) [1] 1.83 5.58 4.00 So we can aggregate into quarters, but we cannot print it using print.ts Even if print.ts cannot line the series into columns as it normally does for quarterly data, we would expect it to behave as it does when we aggregate into thirds. y3 - aggregate(x,nfreq=3) y3 Time Series: Start = 1.83 End = 5.5 Frequency = 3 [1] 10 26 42 58 74 90 106 122 138 154 170 186 And don't tell me that the aggregating a monthly series into quarters makes no sense!! (see response to Bug 9798). Laimonis Kavalieris __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] initalizing and checking validity of S4 classes
Dear useRs and wizaRds, I am currently developing a set of functions using S4 classes. On the way I encountered the problem exemplified with the code below. For some reason the 'validity' method does not seem to work, i.e. does not check for errors in the specification of the slots of the defined class. Any hints? My understanding of the whole S4 system was that validity checks are made *after* class initialization. Is that correct? Thanks a lot in advance! PS. Session info: R version 2.5.1 (2007-06-27) i386-pc-mingw32 locale: LC_COLLATE=Polish_Poland.1250;LC_CTYPE=Polish_Poland.1250;LC_MONETARY=Polish_Poland.1250;LC_NUMERIC=C;LC_TIME=Polish_Poland.1250 attached base packages: [1] stats graphics grDevices utils datasets methods [7] base -b-e-g-i-n--r--c-o-d-e- setClass( someclass, representation(v=numeric, l=character), prototype( v=numeric(0), l=character(0) ) ) setMethod(initialize, someclass, function(.Object, v=numeric(0), l=character(0)) { # strip the vector names cv - v cl - l names(cv) - NULL names(cl) - NULL rval - .Object [EMAIL PROTECTED] - cv [EMAIL PROTECTED] - cl rval } ) # at this point this should be OK o - new(someclass, v=1:2, l=letters[1:3]) o # check validity f - function(object) { rval - NULL if( length([EMAIL PROTECTED]) != length([EMAIL PROTECTED]) ) rval - c( rval, lengths dont match) if( is.null(rval) ) return(TRUE) else return(rval) } # this should return error description f(o) # define the validity method setValidity( someclass, f) # this should return an error new(someclass, v=1:2, l=letters[1:3]) # but it doesn't... -e-n-d--r--c-o-d-e- Michal Bojanowski ICS / Department of Sociology Utrecht University Heidelberglaan 2; 3584 CS Utrecht Room 1428 m.j.bojanowski at uu dot nl http://www.fss.uu.nl/soc/bojanowski/ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] initalizing and checking validity of S4 classes
Hi Michal -- Add validObject to your initialize method: setMethod(initialize, someclass, + function(.Object, v=numeric(0), l=character(0)) + { + # strip the vector names + cv - v + cl - l + names(cv) - NULL + names(cl) - NULL + rval - .Object + [EMAIL PROTECTED] - cv + [EMAIL PROTECTED] - cl + validObject(rval) + rval + } ) [1] initialize new(someclass, v=1:2, l=letters[1:3]) Error in validObject(rval) : invalid class someclass object: lengths dont match Note that without initialize, we have: setClass(A, + representation=representation( +v=numeric, l=character)) [1] A setValidity(A, f) Slots: Name: v l Class: numeric character new(A, v=1:2, l=letters[1:3]) Error in validObject(.Object) : invalid class A object: lengths dont match Here are two interpretations of this. (1) using 'initialize' means that you are taking control of the initialization process, and hence know when you need to call validObject. (2) Responsibility for object validity is ambiguous -- does it belong with 'new', 'initialize', or a 'constructor' that the programmer might write? This is particularly problematic with R's copy semantics, where creating transiently invalid objects seems to be almost necessary (e.g., callNextMethod() in 'initialize' might initialize the inherited slots of the object, but the object itself is of the derived class and could well be invalid 'invalid' after the base class has finished with initialize). Martin Bojanowski, M.J. (Michal) [EMAIL PROTECTED] writes: Dear useRs and wizaRds, I am currently developing a set of functions using S4 classes. On the way I encountered the problem exemplified with the code below. For some reason the 'validity' method does not seem to work, i.e. does not check for errors in the specification of the slots of the defined class. Any hints? My understanding of the whole S4 system was that validity checks are made *after* class initialization. Is that correct? Thanks a lot in advance! PS. Session info: R version 2.5.1 (2007-06-27) i386-pc-mingw32 locale: LC_COLLATE=Polish_Poland.1250;LC_CTYPE=Polish_Poland.1250;LC_MONETARY=Polish_Poland.1250;LC_NUMERIC=C;LC_TIME=Polish_Poland.1250 attached base packages: [1] stats graphics grDevices utils datasets methods [7] base -b-e-g-i-n--r--c-o-d-e- setClass( someclass, representation(v=numeric, l=character), prototype( v=numeric(0), l=character(0) ) ) setMethod(initialize, someclass, function(.Object, v=numeric(0), l=character(0)) { # strip the vector names cv - v cl - l names(cv) - NULL names(cl) - NULL rval - .Object [EMAIL PROTECTED] - cv [EMAIL PROTECTED] - cl rval } ) # at this point this should be OK o - new(someclass, v=1:2, l=letters[1:3]) o # check validity f - function(object) { rval - NULL if( length([EMAIL PROTECTED]) != length([EMAIL PROTECTED]) ) rval - c( rval, lengths dont match) if( is.null(rval) ) return(TRUE) else return(rval) } # this should return error description f(o) # define the validity method setValidity( someclass, f) # this should return an error new(someclass, v=1:2, l=letters[1:3]) # but it doesn't... -e-n-d--r--c-o-d-e- Michal Bojanowski ICS / Department of Sociology Utrecht University Heidelberglaan 2; 3584 CS Utrecht Room 1428 m.j.bojanowski at uu dot nl http://www.fss.uu.nl/soc/bojanowski/ [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Martin Morgan Bioconductor / Computational Biology http://bioconductor.org __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] question on using gl1ce from lasso2 package
Hi, I tried several settings by using the family=gaussian in gl1ce, but none of them works. For the case glm can work. Here is the error message I got: glm(Petal.Width~Sepal.Length+Sepal.Width+Petal.Length ,data=iris,family=gaussian()) gl1ce(Petal.Width~Sepal.Length+Sepal.Width+Petal.Length ,data=iris,family=gaussian()) Error in eval(expr, envir, enclos) : object etastart not found Does anyone have experience with this function? Any help will be appreciated, Li [[alternative HTML version deleted]] __ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.