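ML (the spark.ml package) has a DataFrame-based API, while MLlib (the spark.mllib package) is RDD-based and goes into maintenance mode as of Spark 2.0: it still gets bug fixes, but new features go into the DataFrame-based API.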
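In case it helps, here is a minimal sketch of a linear regression on a DataFrame with the spark.ml API in Java. It assumes Spark 2.x with spark-sql and spark-mllib on the classpath and a local master. The class name, the column names "x" and "y", and the sample values are made up for illustration; you would swap in your own DataFrame.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.ml.feature.VectorAssembler;
import org.apache.spark.ml.regression.LinearRegression;
import org.apache.spark.ml.regression.LinearRegressionModel;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;

// Hypothetical class name, for illustration only.
public class SimpleLinearRegressionSketch {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("SimpleLinearRegressionSketch")
        .master("local[*]") // assumption: running locally
        .getOrCreate();

    // Made-up sample data: one feature column "x" and one label column "y".
    List<Row> rows = Arrays.asList(
        RowFactory.create(1.0, 3.0),
        RowFactory.create(2.0, 5.0),
        RowFactory.create(3.0, 7.0),
        RowFactory.create(4.0, 9.0));
    StructType schema = DataTypes.createStructType(new StructField[] {
        DataTypes.createStructField("x", DataTypes.DoubleType, false),
        DataTypes.createStructField("y", DataTypes.DoubleType, false)
    });
    Dataset<Row> df = spark.createDataFrame(rows, schema);

    // spark.ml estimators expect the features in a single Vector column,
    // so the plain numeric column is assembled into "features" first.
    VectorAssembler assembler = new VectorAssembler()
        .setInputCols(new String[] {"x"})
        .setOutputCol("features");
    Dataset<Row> training = assembler.transform(df);

    // Fit the regression directly on the Dataset<Row>; no JavaRDD involved.
    LinearRegression lr = new LinearRegression()
        .setFeaturesCol("features")
        .setLabelCol("y");
    LinearRegressionModel model = lr.fit(training);

    System.out.println("Coefficients: " + model.coefficients()
        + " Intercept: " + model.intercept());

    spark.stop();
  }
}
```

The VectorAssembler step is usually the part that is missing when going from an existing DataFrame to spark.ml: the estimator reads a Vector column, not the raw numeric columns.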
On Thu, Jul 21, 2016 at 10:41 PM, VG <vlin...@gmail.com> wrote:

> Why do we have these 2 packages ... ml and mllib?
> What is the difference in these?
>
> On Fri, Jul 22, 2016 at 11:09 AM, Bryan Cutler <cutl...@gmail.com> wrote:
>
>> Hi JG,
>>
>> If you didn't know this, Spark MLlib has 2 APIs, one of which uses
>> DataFrames. Take a look at this example
>> https://github.com/apache/spark/blob/master/examples/src/main/java/org/apache/spark/examples/ml/JavaLinearRegressionWithElasticNetExample.java
>>
>> This example uses a Dataset<Row>, which is type equivalent to a DataFrame.
>>
>> On Thu, Jul 21, 2016 at 8:41 PM, Jean Georges Perrin <j...@jgp.net> wrote:
>>
>>> Hi,
>>>
>>> I am looking for some really super basic examples of MLlib (like a
>>> linear regression over a list of values) in Java. I have found a few, but I
>>> only saw them using JavaRDD... and not DataFrame.
>>>
>>> I was kind of hoping to take my current DataFrame and send them in
>>> MLlib. Am I too optimistic? Do you know/have any example like that?
>>>
>>> Thanks!
>>>
>>> jg
>>>
>>> Jean Georges Perrin
>>> j...@jgp.net / @jgperrin