Hi R Expert Community,
My question: what is the difference between the MeanDecreaseAccuracy values
produced by importance(foo) versus foo$importance in a randomForest model?
I ran a random forest classification model with a binary response and stored
the model in the object FOREST_model. I then ran importance(FOREST_model)
and FOREST_model$importance. I usually use the former, but decided to learn
more about what is inside the fitted object (via summary()), so I ran the
latter. I expected both to produce identical output, yet MeanDecreaseGini is
the only column that is identical in both.
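
For reference, here is a minimal sketch of the comparison on stand-in data
(a two-class subset of iris, since my actual data is not shareable; the
object names are placeholders):

library(randomForest)

## two-class subset of iris as stand-in data for a binary classifier
d <- droplevels(subset(iris, Species != "setosa"))

set.seed(42)
fit <- randomForest(Species ~ ., data = d, importance = TRUE)

importance(fit)  ## what I usually look at
fit$importance   ## the component stored in the fitted object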
I looked at ?randomForest and the 'randomForest' package documentation and
didn't find anything explaining this difference.
I am not including my actual data because this is most likely something
simple, such as one set of values being divided by something (if so, by
what?), that I am just not aware of; the check below tests that idea.
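
One way to test the "divided by something" idea (a sketch, using the fit
object from the stand-in example above): ?importance documents a scale
argument that defaults to TRUE, so if a divisor is at work, this should
expose it:

importance(fit, scale = FALSE)                    ## does this match fit$importance?
importance(fit) / importance(fit, scale = FALSE)  ## elementwise ratio = the implied divisor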
importance(FOREST_model)
                       HC          TER MeanDecreaseAccuracy MeanDecreaseGini
APPT_TYP_CD_LL 0.16025157 -0.521041660           0.15670297        12.793624
ORG_NAM_LL     0.20886631 -0.952057325           0.20208393       107.137049
NEW_DISCIPLINE 0.20685079 -0.960719435           0.20076762        86.495063
FOREST_model$importance
                         HC           TER MeanDecreaseAccuracy MeanDecreaseGini
APPT_TYP_CD_LL 0.0049473962 -3.727629e-03         0.0045949805        12.793624
ORG_NAM_LL     0.0090715845 -2.401016e-02         0.0077298067       107.137049
NEW_DISCIPLINE 0.0130672572 -2.656671e-02         0.0114583178        86.495063
Dan Lopez
LLNL, HRIM, Workforce Analytics & Metrics