The error message is pretty clear, really. To spell it out a bit more, what you have done is as follows.
Your training set has factor variables in it. Suppose one of them is "f", and that in the training set it has, say, 5 levels. Your test set also has a factor "f", as it must, but it appears that in the test set "f" has 6 or more levels, or levels that do not agree with those for "f" in the training set. This mismatch means that the predict method for randomForest cannot use this test set.

What you have to do is make sure that the factor levels agree for every factor in both the test and training sets. One way to do this is to put the test and training sets together, with rbind(...) say, and then separate them again.

But even this will still leave you with a problem, because your training set will then have some factor levels that are empty but which are not empty in the test set. The resulting error will most likely be more subtle, though.

You really need to sort this out yourself. It is not particularly an R problem, but a confusion over data. To be useful, your training set needs to cover the field for all levels of every factor. Think about it.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Nagu
Sent: Saturday, 8 March 2008 5:37 AM
To: r-help@r-project.org; [EMAIL PROTECTED]
Subject: [R] error in random forest

Hi,

I get the following error when I try to predict the probabilities of a test sample:

Error in predict.randomForest(fit.EBA.OM.rf.50, x.OM, type = "prob") :
  New factor levels not present in the training data

I have about 630 predictor variables in the dataset x.OM (25 factor variables and the remaining are continuous variables).

Any ideas on how to trace it?

Thank you,
Nagu

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
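P.S. Since the question asks how to trace the problem, here is a minimal sketch in base R. It is only an illustration under stated assumptions: the data frames 'train' and 'test' and the helper new_levels() are hypothetical names made up for this example, standing in for your own training and test sets. The last step, re-coding a test factor on the training levels with factor(..., levels = ...), is one possible remedy and not the only one.

```r
# Hypothetical toy data; 'train' and 'test' stand in for the real data frames.
train <- data.frame(f = factor(c("a", "b", "c")),
                    g = factor(c("x", "y", "x")),
                    z = c(1.0, 2.5, 3.1))
test  <- data.frame(f = factor(c("a", "d")),
                    g = factor(c("x", "y")),
                    z = c(0.7, 1.9))

# Hypothetical helper: for each factor column of the training set, list the
# values found in the test set that are not levels in the training set.
new_levels <- function(train, test) {
  factor_cols <- names(train)[vapply(train, is.factor, logical(1))]
  extra <- lapply(factor_cols, function(v) {
    vals <- unique(as.character(test[[v]]))
    vals <- vals[!is.na(vals)]            # ignore missing values
    setdiff(vals, levels(train[[v]]))
  })
  names(extra) <- factor_cols
  extra[lengths(extra) > 0]               # keep only the offending columns
}

bad <- new_levels(train, test)
print(bad)

# One possible remedy (an assumption, not the only option): re-code the test
# factor on the training levels. Values unseen in training become NA.
test$f <- factor(test$f, levels = levels(train$f))
print(test$f)
```

Any test rows that end up NA this way carry a level the forest never saw, so they have to be dropped or otherwise dealt with before predicting. That is the data question mentioned above, and no amount of re-coding answers it.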