Setting the vote rate in a Random Forest in MLlib

Young, Matthew T Wed, 16 Dec 2015 10:46:53 -0800

One of our data scientists is interested in using Spark to improve performance 
in some random forest binary classifications, but with the parameters MLlib 
exposes she isn't getting results as good as those from R's randomForest 
package. She suggested that if she could tune the vote rate of the forest (the 
number of trees that must vote true for a sample to be classified as positive) 
she might be able to reach the false positive and true positive targets for 
the project.
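
To make "vote rate" concrete, here is a minimal sketch in plain Python of the 
behaviour being asked for. It is not MLlib code: the trees are hypothetical 
stand-ins (plain functions returning 0.0 or 1.0), and the names 
positive_vote_fraction, predict_with_vote_rate and vote_rate are made up for 
illustration. It assumes each tree's individual prediction is available.

```python
from typing import Callable, Sequence

# Hypothetical stand-in for a trained tree: maps a feature vector to 0.0 or 1.0.
Tree = Callable[[Sequence[float]], float]


def positive_vote_fraction(trees: Sequence[Tree], features: Sequence[float]) -> float:
    """Return the fraction of trees that vote 1.0 (true) for this sample."""
    if not trees:
        raise ValueError("the forest must contain at least one tree")
    positive_votes = sum(1 for tree in trees if tree(features) == 1.0)
    return positive_votes / len(trees)


def predict_with_vote_rate(
    trees: Sequence[Tree], features: Sequence[float], vote_rate: float = 0.5
) -> float:
    """Classify as 1.0 when at least `vote_rate` of the trees vote true.

    Raising vote_rate trades true positives for fewer false positives;
    lowering it does the opposite.
    """
    return 1.0 if positive_vote_fraction(trees, features) >= vote_rate else 0.0


# Toy forest of three threshold "trees", for illustration only.
toy_forest = [
    lambda x: 1.0 if x[0] > 0.2 else 0.0,
    lambda x: 1.0 if x[0] > 0.5 else 0.0,
    lambda x: 1.0 if x[0] > 0.8 else 0.0,
]

sample = [0.6]  # two of the three toy trees vote true
lenient = predict_with_vote_rate(toy_forest, sample, vote_rate=0.5)
strict = predict_with_vote_rate(toy_forest, sample, vote_rate=0.9)
```

With the toy forest, the same sample is classified as positive at a vote rate 
of 0.5 and as negative at 0.9, which is the knob she would like to turn.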


Is there any way to set the vote rate for a random forest in Spark 1.5.2? I 
don't see any such option in the trainClassifier 
API<https://spark.apache.org/docs/1.5.2/api/scala/index.html#org.apache.spark.mllib.tree.RandomForest$>.

Thanks,

-- Matthew
