My apologies, subject corrected.

I'm building a random forest 50 trees at a time because of memory
limitations (I have roughly 0.5 million observations and around 20
variables). My plan was to combine some or all of these forests later and
look at the overall variable importance.

Say I have two forests, tree1 and tree2. Within each forest the Gini and
Raw importances rank the variables similarly, and the two forests also
agree with one another. After merging them into one forest with the
combine command, however, the Raw importances of the combined forest
change rank order rather dramatically (e.g. the most important variable
becomes the least important, although the ordering is not completely
reversed). In addition, the scale of both the Raw and Gini importances is
orders of magnitude smaller for the combined forest.

Note that the Gini importance of the combined forest looks roughly similar
to the Gini (and Raw) importances of the individual forests, at least in
terms of rank ordering.

I'm using the non-formula randomForest specification along with
norm.votes=FALSE to facilitate large-sample estimation and combining of
forests.
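
For reference, a minimal sketch of this workflow. The built-in iris data
stands in for the real data set, and the object names rf1, rf2 and rf.all
are illustrative only. It assumes a version of randomForest in which
importance() accepts the type and scale arguments; it is not the exact code
that produced the results described above.

```r
library(randomForest)

set.seed(1)
x <- iris[, 1:4]   # predictors (stand-in for the real data)
y <- iris$Species  # response

# Grow two 50-tree forests separately, using the non-formula (x, y)
# interface and keeping raw vote counts so the forests can be merged.
rf1 <- randomForest(x, y, ntree = 50, importance = TRUE, norm.votes = FALSE)
rf2 <- randomForest(x, y, ntree = 50, importance = TRUE, norm.votes = FALSE)

# Merge the two forests into a single 100-tree forest.
rf.all <- combine(rf1, rf2)

# Raw (unscaled permutation) importance, type = 1, side by side.
raw <- cbind(rf1 = importance(rf1, type = 1, scale = FALSE)[, 1],
             rf2 = importance(rf2, type = 1, scale = FALSE)[, 1],
             all = importance(rf.all, type = 1, scale = FALSE)[, 1])

# Gini importance, type = 2, side by side.
gini <- cbind(rf1 = importance(rf1, type = 2)[, 1],
              rf2 = importance(rf2, type = 2)[, 1],
              all = importance(rf.all, type = 2)[, 1])

# Rank order of the variables in each forest (1 = most important).
raw.rank  <- apply(-raw, 2, rank)
gini.rank <- apply(-gini, 2, rank)
print(raw.rank)
print(gini.rank)
```

Comparing the columns of raw.rank and gini.rank shows directly whether the
combined forest ranks the variables differently from its components.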

I'm using R 2.5.0 on a Windows XP machine with 2 GB of RAM, and
randomForest 4.5-18.

Any advice is appreciated,
Many thanks,
Joe



