Thanks for the reply @srowen.

>>I don't think you can move or alter the class APIs.
Agreed. That's not my intention at all.

>>There also isn't much value in copying the code. Maybe there are
>>opportunities for moving some internal code.
There will probably be some copying and moving of internal code, but that is
not the main purpose.
The goal is to have these algorithms implemented using the Dataset API. 
Currently, the implementation of these classes/algorithms uses RDDs by
wrapping the old (mllib) classes, which will eventually be deprecated (and
deleted).

>>But in general I think all this has to wait. 
Do you have any schedule or plan in mind? If deprecation is targeted for
3.0, then we have roughly 1.5 years.
On the other hand, the current situation prevents us from making
improvements to the existing classes. For example, I'd like to add maxDocFreq
to ml.feature.IDF to make it similar to scikit-learn, but that's hard to do
because it's just a wrapper around mllib.feature.IDF.
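
To make the maxDocFreq idea concrete, here is a rough sketch in plain Python
(no Spark involved). It assumes the smoothed formula log((m + 1) / (df + 1))
and the existing minDocFreq behaviour of zeroing terms that are too rare, as
I understand mllib.feature.IDF to work. The max_doc_freq argument and the
fit_idf helper are hypothetical names for illustration only. Neither exists
in Spark today, and the cutoff is modelled loosely on scikit-learn's max_df.

```python
import math
from collections import Counter


def fit_idf(docs, min_doc_freq=0, max_doc_freq=None):
    """Compute an IDF weight per term from a list of tokenized documents.

    min_doc_freq mirrors the existing minDocFreq filter.
    max_doc_freq is the HYPOTHETICAL new filter discussed above.
    """
    m = len(docs)

    # Document frequency: number of documents containing each term.
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))

    idf = {}
    for term, n in df.items():
        too_rare = n < min_doc_freq
        too_common = max_doc_freq is not None and n > max_doc_freq
        if too_rare or too_common:
            # Filtered terms get weight 0 rather than being dropped,
            # so the vector size stays the same.
            idf[term] = 0.0
        else:
            idf[term] = math.log((m + 1) / (n + 1))
    return idf


docs = [
    ["spark", "ml", "idf"],
    ["spark", "mllib"],
    ["spark", "idf"],
]

baseline = fit_idf(docs)
capped = fit_idf(docs, min_doc_freq=1, max_doc_freq=1)
```

Here "idf" appears in two of the three documents, so it keeps a positive
weight in baseline but is zeroed in capped, while "ml" (one document) is
unaffected. Doing the same inside ml.feature.IDF today would mean changing
the mllib class it wraps.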


Thank you for the discussion.
Yacine.



