A couple of comments:

o  Methods such as decision trees do not need to expand factors into columns
of 1-df contrasts, so their memory requirements are vastly different.  The
models they produce are also very different.

o  Why would you want "all possible interactions" of 10 variables, 6 of
which are factors?  How do you intend to interpret, e.g., the 6-factor
interaction?  What can you conclude about a significant 10-variable
interaction?  What is your ultimate goal for this exercise?  The answer to
that should help you decide on more reasonable models to fit.

o  One thing to try is to fit the ANOVA model "by hand" by computing the cell
means and examining them.  This avoids creating the huge design matrix that's
mostly 0s.
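
To make the first and last points concrete, here is a minimal sketch in R.
It uses simulated data, and the names f1, f2, y and the helper
full_interaction_columns() are made up for illustration.  The thread only
says that two factors have about 40 levels; the 3 levels assumed for each of
the other four factors are a guess, so treat the resulting size (roughly two
million columns) as an order-of-magnitude illustration only.

```r
# Hypothetical helper: number of columns in the model matrix of a model with
# all possible interactions (intercept plus treatment contrasts).  Each factor
# with k levels multiplies the count by k; each numeric variable multiplies
# it by 2 (the variable is either in a term or not).
full_interaction_columns <- function(factor_levels, n_numeric) {
  prod(factor_levels) * 2^n_numeric
}

# Sanity check of the counting rule on a toy design.
toy <- expand.grid(f1 = factor(1:3), f2 = factor(1:2), x = c(0.5, 1.5))
toy_cols <- ncol(model.matrix(~ f1 * f2 * x, data = toy))

# Assumed level counts: two factors with 40 levels (from the thread), four
# factors with 3 levels each (an assumption), and four numeric variables.
n_col <- full_interaction_columns(c(40, 40, 3, 3, 3, 3), n_numeric = 4)
gib   <- n_col * 200 * 8 / 2^30   # dense double matrix with 200 rows, in GiB

# "By hand" alternative: cell means and cell counts, no design matrix needed.
set.seed(1)
d <- data.frame(
  f1 = factor(sample(letters[1:3], 200, replace = TRUE)),
  f2 = factor(sample(c("u", "v"), 200, replace = TRUE)),
  y  = rnorm(200)
)
cell_means  <- aggregate(y ~ f1 + f2, data = d, FUN = mean)
cell_counts <- aggregate(y ~ f1 + f2, data = d, FUN = length)

# The same means laid out as a two-way table, which is easier to eyeball.
mean_table <- with(d, tapply(y, list(f1, f2), mean))
```

aggregate() returns one row per cell that actually occurs in the data, so the
result can never have more rows than there are observations, no matter how
many levels the factors have.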

HTH,
Andy

> -----Original Message-----
> From: Alexander Sirotkin [at Yahoo] [mailto:[EMAIL PROTECTED]]
> Sent: Friday, October 17, 2003 4:30 AM
> To: John Fox
> Cc: [EMAIL PROTECTED]
> Subject: Re: [R] R memory and CPU requirements
> 
> 
> I agree completely. 
> 
> In fact, I have about 5000 observations, which should
> be enough. 
> I was using 200 samples because of RAM limitations and
>  I'm afraid to think about what amount of RAM I'll
> need to fit an aov() for such data.
> 
> 
> --- John Fox <[EMAIL PROTECTED]> wrote:
> > Dear Alexander,
> > 
> > If I understand you correctly, you have a sample of
> > 200 observations. Even
> > if you had only two factors with 40 levels each, the
> > main effects and 
> > interactions of these factors would require about
> > 1600 degrees of freedom 
> > -- that is, more than the number of observations.
> > This doesn't make a whole 
> > lot of sense.
> > 
> > I hope that this helps,
> >   John
> > 
> > At 05:03 PM 10/16/2003 -0700, Alexander Sirotkin [at Yahoo] wrote:
> > 
> > >--- Deepayan Sarkar <[EMAIL PROTECTED]> wrote:
> > > > On Thursday 16 October 2003 17:59, Alexander Sirotkin [at Yahoo] wrote:
> > > > > Thanks for all the help on my previous questions.
> > > > >
> > > > > One more (hopefully last one) : I've been very surprised when I
> > > > > tried to fit a model (using aov()) for a sample of size 200 and
> > > > > 10 variables and their interactions.
> > > >
> > > > That doesn't really say much. How many of these variables are
> > > > factors ? How many levels do they have ? And what is the order of
> > > > the interaction ? (Note that for 10 numeric variables, if you allow
> > > > all interactions, then there will be a 100 terms in your model.
> > > > This increases for factors.)
> > > >
> > > > In other words, how big is your model matrix ? (See ?model.matrix)
> > > >
> > > > Deepayan
> > > >
> > >
> > >
> > >I see...
> > >
> > >Unfortunately, model.matrix() ran out of memory :)
> > >I have 10 variables, 6 of which are factors, 2 of which
> > >have quite a lot of levels (about 40). And I would like
> > >to allow all interactions.
> > >
> > >I understand your point about categorical variables, but
> > >still - this does not seem like too much data to me.
> > >
> > >
> > >I remember fitting all kinds of models (mostly decision
> > >trees) for much, much larger data sets.
> > >
> > >______________________________________________
> > >[EMAIL PROTECTED] mailing list
> > >https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> > 
> > -----------------------------------------------------
> > John Fox
> > Department of Sociology
> > McMaster University
> > Hamilton, Ontario, Canada L8S 4M4
> > email: [EMAIL PROTECTED]
> > phone: 905-525-9140x23604
> > web: www.socsci.mcmaster.ca/jfox
> > -----------------------------------------------------
> >
> 
> ______________________________________________
> [EMAIL PROTECTED] mailing list 
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>

______________________________________________
[EMAIL PROTECTED] mailing list
https://www.stat.math.ethz.ch/mailman/listinfo/r-help
