I am trying to use the randomForest package for classification in R. The variable importance measures listed are:

- mean raw importance score of variable x for class 0
- mean raw importance score of variable x for class 1
- MeanDecreaseAccuracy
- MeanDecreaseGini

Now I know what these "mean", as in I know their definitions. What I want to know is how to use them.

What I am trying to figure out is what these values mean in practical terms: how accurate they are, what a good value is, what a bad value is, what the maximums and minimums are, and so on.

If a variable has a high MeanDecreaseAccuracy or MeanDecreaseGini, does that mean it is important or unimportant? Any information on the raw scores would be really helpful too. I want to know everything about these numbers that is relevant to applying them.

I don't really want a technical explanation that uses words like 'error', 'summation', or 'permuted', but rather a simpler explanation that doesn't involve any discussion of how random forests work (I have read all about that and didn't find it very helpful). If I wanted someone to explain to me how to use a radio, I wouldn't expect the explanation to involve how a radio converts radio waves into sound.

If anyone can help me out at all, it would be really great. I have read many lectures on random forests and other data mining topics, but I have never found simple answers about how to read the variable importance measures.

Thanks,
Paul Fisch

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.