Hi Mark,

Unless you are fitting millions of very, very, very simple models, I doubt that extracting p-values is going to be a limiting factor in the speed of your analysis.

Hadley

On Mon, Mar 8, 2010 at 3:47 AM, Mark Kimpel <mwkim...@gmail.com> wrote:
> Hadley,
>
> Thanks for pointing me to some good articles. Unfortunately, I have already
> read Holger's and my main concern is computational efficiency. The buzzword
> on this list regarding efficient code is "vectorization". I am, frankly,
> surprised that there is a way to vectorize analysis of complex models but
> not to extract p values from them. Dieter's reply points one towards using
> lapply, which in my experience allows for compact code but not an increase
> in efficiency (one of Holger's examples demonstrates this). Anyway, I cannot
> see how to go from Holger's fairly simple examples to one that involves a
> complex model with several factors and interactions.
>
> Limma, which does provide p values if contrasts are used, is blindingly fast
> but I believe Gordon Smyth has hard-coded most of this excellent package in
> C. I was hoping to achieve something similar without the use of the
> moderated t-statistics that Limma uses.
>
> Looks like I am stuck using loops with mclapply. Thank goodness for my
> Core i7!
>
> Mark
>
> Mark W. Kimpel MD ** Neuroinformatics ** Dept. of Psychiatry
> Indiana University School of Medicine
>
> 15032 Hunter Court, Westfield, IN 46074
>
> (317) 490-5129 Work, & Mobile & VoiceMail
> (317) 399-1219 Skype No Voicemail please
>
>
> On Sun, Mar 7, 2010 at 2:08 PM, hadley wickham <h.wick...@gmail.com> wrote:
>>
>> Hi Mark,
>>
>> If efficiency is a concern you might want to read "Computing Thousands
>> of Test Statistics Simultaneously in R" by Holger Schwender and Tina
>> Müller, http://stat-computing.org/newsletter/issues/scgn-18-1.pdf.
>>
>> If you just want to do it, see the examples in
>> http://had.co.nz/plyr/plyr-intro-090510.pdf.
>>
>> Hadley
>>
>> On Sun, Mar 7, 2010 at 7:03 PM, Mark Kimpel <mwkim...@gmail.com> wrote:
>> > Is it possible to vectorize anova over the output of a vectorized lm? I
>> > have a gene expression matrix with each row being a gene and columns for
>> > samples. There are several factors with interactions. I can get p values
>> > by looping over the matrix with lm and anova, but I would like to make
>> > this as computationally efficient as possible. I am able to vectorize the
>> > lm command, but when I try to use anova on the resultant model object I
>> > get just one anova result.
>> >
>> > Is what I want to do possible? And, yes, I am quite conversant with Limma
>> > and other BioC packages, I have my reasons for wanting to use lm and
>> > anova.
>> >
>> > Thanks,
>> >
>> > Mark
>> >
>> > ______________________________________________
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>> --
>> Assistant Professor / Dobelman Family Junior Chair
>> Department of Statistics / Rice University
>> http://had.co.nz/
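
For the original question quoted above (per-gene anova p values from a single matrix-response lm fit), here is a minimal sketch of one way it might be done in base R. It is not taken from the thread or from the articles linked in it. The data are simulated and every name (design, A, B, Y, n_genes) is hypothetical. It assumes a full-rank design, no missing values and no weights, and it gives sequential (Type I) F tests, the same kind anova() reports for a single lm.

```r
# Simulated data: design, A, B, Y and n_genes are hypothetical names.
set.seed(1)
design <- expand.grid(rep = 1:4,
                      A = factor(c("a1", "a2")),
                      B = factor(c("b1", "b2", "b3")))
n_genes <- 200
# lm() wants samples in rows and one response per column, so a
# genes-by-samples expression matrix would be transposed with t() first.
Y <- matrix(rnorm(nrow(design) * n_genes), nrow = nrow(design))

fit <- lm(Y ~ A * B, data = design)   # one QR decomposition for all genes

# Sequential sums of squares per term, taken from the QR effects
# (assumption: full-rank design, so each coefficient maps to one term).
p    <- fit$rank
asgn <- fit$assign[fit$qr$pivot[seq_len(p)]]   # term index of each coefficient
ss   <- rowsum(fit$effects[seq_len(p), , drop = FALSE]^2, asgn)
df   <- as.vector(table(asgn))                 # degrees of freedom per term

keep <- rownames(ss) != "0"                    # drop the intercept row
ss   <- ss[keep, , drop = FALSE]
df   <- df[keep]

rdf <- fit$df.residual
rms <- colSums(residuals(fit)^2) / rdf         # residual mean square per gene

Fstat <- sweep(ss / df, 2, rms, "/")           # terms x genes matrix of F values
pvals <- Fstat
pvals[] <- pf(Fstat, df, rdf, lower.tail = FALSE)
rownames(pvals) <- attr(terms(fit), "term.labels")[as.integer(rownames(ss))]
```

Each column of pvals should agree with the "Pr(>F)" column of anova(lm(Y[, j] ~ A * B, data = design)) for the same gene j, which is an easy spot check to run on a few genes before trusting the result on real data.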