One of our data scientists would like to use Spark to improve the performance of some random forest binary classifications, but with the available parameters MLlib's random forest implementation isn't giving her results as good as R's randomForest library. She suggested that if she could tune the vote rate of the forest (the number of trees that must vote true for a sample to be classified as positive), she might be able to reach the project's false positive and true positive targets.
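To make concrete what we mean by vote rate, here is a minimal sketch in plain Python. It is only an illustration of the idea and does not use any Spark or MLlib call; the names `vote_fraction`, `classify`, and `vote_rate` are hypothetical, and treating the threshold as inclusive (`>=`) is an assumption on our part.

```python
from typing import Sequence


def vote_fraction(tree_votes: Sequence[int]) -> float:
    """Fraction of trees that voted for the positive class (1)."""
    if not tree_votes:
        raise ValueError("need at least one tree vote")
    return sum(1 for v in tree_votes if v == 1) / len(tree_votes)


def classify(tree_votes: Sequence[int], vote_rate: float = 0.5) -> int:
    """Return 1 when at least `vote_rate` of the trees voted positive.

    `vote_rate` is a hypothetical tuning knob, not an MLlib parameter.
    """
    return 1 if vote_fraction(tree_votes) >= vote_rate else 0


# Hypothetical per-tree votes for one sample: 4 of 10 trees vote positive.
votes = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]
print(classify(votes, vote_rate=0.5))  # prints 0
print(classify(votes, vote_rate=0.3))  # prints 1
```

Lowering the threshold trades more true positives for more false positives, which is the knob she would like to be able to turn.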
Is there any way to set the vote rate for a random forest in Spark 1.5.2? I don't see any such option in the trainClassifier API <https://spark.apache.org/docs/1.5.2/api/scala/index.html#org.apache.spark.mllib.tree.RandomForest$>.

Thanks,
-- Matthew