Hi,

The machine learning models in org.apache.spark.mllib have a .predict()
method that can be applied to a Vector to return a prediction.

However, this method does not appear on the new models in
org.apache.spark.ml, and you have to wrap a Vector in a DataFrame to get a
prediction. This means bringing in more of Spark's code as a dependency if
you wish to embed the models in production code outside of Spark.

Also, if you wish to request predictions one at a time in that context, it
makes the process a lot slower. It therefore seems to me that the old models
are more amenable to being used outside of Spark than the new models at this
time.
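
To make the difference concrete, here is a rough sketch of the two call
patterns as I understand them. It assumes the Spark 1.x APIs, where both
packages share org.apache.spark.mllib.linalg.Vector, and the default column
names "features" and "prediction". The object name, the helper names and the
weights are made up for illustration.

```scala
import org.apache.spark.ml.Transformer
import org.apache.spark.mllib.classification.{LogisticRegressionModel => OldLRModel}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.sql.SQLContext

object PredictSketch {
  // Old API: the model is just weights plus an intercept (illustrative
  // values here), and predict() takes a single Vector directly.
  val oldModel = new OldLRModel(Vectors.dense(0.5, -1.2), 0.1)

  def predictOld(features: Vector): Double =
    oldModel.predict(features)

  // New API: no public predict(Vector), so the Vector is wrapped in a
  // one-row DataFrame and the prediction is read back from the result.
  // Assumes the model uses the default "features"/"prediction" columns.
  def predictNew(model: Transformer, sqlContext: SQLContext,
                 features: Vector): Double = {
    val df = sqlContext.createDataFrame(Seq(Tuple1(features))).toDF("features")
    model.transform(df).select("prediction").head().getDouble(0)
  }
}
```

The first path needs only the model and a Vector. The second needs an
SQLContext, and therefore a SparkContext, for every single prediction.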

Are there any plans to add the .predict() method back to the models in the
new API?

Regards,

James
