I could get an R-squared from lm.fit by squaring the correlation between fitted.values and my response variable. But could I get it somehow using sums of squares? I am clear on the SS for residuals. But where is the SS for the model, or the total SS, in the lm.fit output?

Thank you!
Dimitri
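For reference, a minimal sketch of the sums-of-squares route. lm.fit returns residuals and fitted.values but no model or total SS component, so both are computed here from the response itself. The simulated data and the names X, ss_res, ss_tot, ss_mod are illustrative assumptions, and the decomposition assumes the design matrix contains an intercept column (lm.fit does not add one).

```r
# Hypothetical data standing in for one simulated sample
set.seed(1)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 2 * x1 - 0.5 * x2 + rnorm(n)

# lm.fit() takes the design matrix as given, so add the intercept by hand
X   <- cbind(Intercept = 1, x1 = x1, x2 = x2)
fit <- lm.fit(X, y)

ss_res <- sum(fit$residuals^2)                  # residual SS
ss_tot <- sum((y - mean(y))^2)                  # total (mean-corrected) SS
ss_mod <- sum((fit$fitted.values - mean(y))^2)  # model SS

r2_ss  <- 1 - ss_res / ss_tot                   # R-squared from sums of squares
r2_cor <- cor(fit$fitted.values, y)^2           # R-squared from the correlation
```

With an intercept in the model, ss_tot equals ss_mod + ss_res and the two R-squared values agree up to rounding; without an intercept the mean-corrected total SS is no longer the right denominator.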
On Mon, Sep 8, 2008 at 1:57 PM, Gabor Grothendieck <[EMAIL PROTECTED]> wrote:
> On Mon, Sep 8, 2008 at 1:47 PM, Dimitri Liakhovitski <[EMAIL PROTECTED]> wrote:
>> Thank you everyone for your responses. I'll answer several questions.
>>
>> 1.
>>> Disclaimer: I have **NO IDEA** of the details of what you want to do
>>> or why -- but I am willing to bet that there are better ways of doing
>>> it than 1.8 million multiple regressions that take 270 secs each!!
>>> (which I find difficult to believe in itself -- are you sure you are
>>> doing things right? Something sounds very fishy here: R's regression
>>> code is typically very fast).
>> I probably should not bore everyone, but just to explain where the
>> large number is coming from. I have an experimental design with 7
>> factors. Each factor has between 3 and 5 levels. Once you cross them
>> all, you end up with 18,000 cells. For each cell, I want to generate a
>> sample of N=100. For each sample I have to analyze the data using 3
>> different statistical methods of analysis (the goal of the Monte Carlo
>> is to compare those methods). One of the methods requires running up
>> to ~32,000 simple multiple regressions - yes, just for one sample, and
>> it's not a mistake. I test-ran one such analysis for a sample with
>> N=800 and 15 predictors and it took 270 seconds. R was actually very
>> fast - it ran each of the individual regressions in about 0.008
>> seconds. Still, I need something faster.
>>
>> 2. Sorry - what was the formula sum(lm.fit(x,y)$residuals^2) for? For
>> example, using it on my data, I got a value of 36,644...
>
> It's the sum of the squares of the residuals.
>
>>
>> 3. I know that for similarly challenging situations people have used
>> Fortran compilers. So, has anyone heard of a free Fortran library or an
>> efficient piece of code?
>>
>> Thank you!
>> Dimitri
>>
>>>
>>> -- Bert Gunter
>>>
>>> -----Original Message-----
>>> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of
>>> Dimitri Liakhovitski
>>> Sent: Monday, September 08, 2008 9:56 AM
>>> To: Prof Brian Ripley
>>> Cc: R-Help List
>>> Subject: Re: [R] Question about multiple regression
>>>
>>> Yes, see my previous e-mail on how long R takes (270 seconds for one
>>> of the 1,800,000 sets I need) - using system.time.
>>> Not sure how to test the same for Fortran...
>>>
>>> On Mon, Sep 8, 2008 at 12:51 PM, Prof Brian Ripley <[EMAIL PROTECTED]> wrote:
>>>> Are you sure R's ways are not fast enough (there are many layers
>>>> underneath lm)? For an example of how you might do this at C/Fortran
>>>> level, see the function lqs() in MASS.
>>>>
>>>> On Mon, 8 Sep 2008, Dimitri Liakhovitski wrote:
>>>>
>>>>> Dear R-list,
>>>>> maybe some of you could point me in the right direction:
>>>>>
>>>>> Are you aware of any FREE Fortran or Java libraries/actual pieces of
>>>>> code that are VERY efficient (time-wise) in running the regular linear
>>>>> least-squares multiple regression?
>>>>
>>>> A lot of the effort is in getting the right answer fast, including for
>>>> e.g. collinear inputs.
>>>>
>>>>> More specifically, I have to run small regression models (between 1
>>>>> and 15 predictors) on samples of up to N=700, but thousands and
>>>>> thousands of them.
>>>>>
>>>>> I am designing a simulation in R and running those regressions, and R
>>>>> itself is way too slow. So, I am thinking of compiling the regression
>>>>> run itself in Fortran or Java and then calling it from R.
>>>>
>>>> I think Java is unlikely to be fast compared to the Fortran R itself uses.
>>>>
>>>> Have you profiled to find where the time is really being spent (both R
>>>> and C/Fortran profiling if necessary)?
>>>>
>>>>>
>>>>> Thank you very much for any advice!
>>>>>
>>>>> Dimitri Liakhovitski
>>>>> MarketTools, Inc.
>>>>> [EMAIL PROTECTED]
>>>>>
>>>>> ______________________________________________
>>>>> R-help@r-project.org mailing list
>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>> PLEASE do read the posting guide
>>>>> http://www.R-project.org/posting-guide.html
>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>
>>>>
>>>> --
>>>> Brian D. Ripley, [EMAIL PROTECTED]
>>>> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
>>>> University of Oxford, Tel: +44 1865 272861 (self)
>>>> 1 South Parks Road, +44 1865 272866 (PA)
>>>> Oxford OX1 3TG, UK Fax: +44 1865 272595