Does anyone know a reason why, in principle, a call to randomForest cannot accept a data frame with missing predictor values? If each individual tree is built using CART, then it seems like this should be possible. (I understand that one may impute missing values using rfImpute or some other method, but I would like to avoid doing that.)
If this functionality were available, then when the trees are being constructed and when subsequent data are put through the forest, one would also specify an argument for the use of surrogate rules, just like in rpart. I realize this question is very specific to randomForest, as opposed to R in general, but any comments are appreciated. I suppose I am looking for someone to say "It's not appropriate, and here's why ..." or "Good idea. Please implement and post your code." Thanks, Darin England, Senior Scientist Ingenix ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.