Hi all,

I'm using the random forest implementation in Mahout 0.8 to perform
classification (org.apache.mahout.classifier.df.mapreduce.BuildForest and
org.apache.mahout.classifier.df.mapreduce.TestForest). I've run the
classifier multiple times with different parameters and different data
splits, and consistently get an accuracy of ~0.9.

I previously ran R's RRF package on the exact same data and consistently
got an accuracy of ~0.95, which is a fair bit higher than the Mahout
results. I've been unable to figure out why the classifiers perform
differently with the same data and the same parameters.

Has anyone found that Mahout's random forest doesn't perform as well as
other implementations? If not, is there any reason why it wouldn't perform
as well?

Cheers,
Tim
