Re: data frame or RDD for machine learning

2016-06-09 Thread Sandeep Nemuri
Please refer to: http://spark.apache.org/docs/latest/mllib-guide.html

~Sandeep

On Thursday 9 June 2016, Jacek Laskowski wrote:
> Hi,
>
> Use the DataFrame-based API (aka spark.ml) first, and if your ML algorithm
> doesn't support it, switch to the RDD-based API (spark.mllib). What

Re: data frame or RDD for machine learning

2016-06-09 Thread Jacek Laskowski
Hi,

Use the DataFrame-based API (aka spark.ml) first, and if your ML algorithm doesn't support it, switch to the RDD-based API (spark.mllib). What algorithm are you going to use?

Jacek

On 9 Jun 2016 9:12 a.m., "pseudo oduesp" wrote:
> Hi,
> since Spark 1.3 we have DataFrames (
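
For anyone reading the archive, here is a rough sketch of how the two APIs mentioned in this thread differ in practice. It is only an illustration under several assumptions: PySpark 2.0 or later running in local mode, logistic regression picked arbitrarily (the thread never names an algorithm), made-up toy data, and a hypothetical helper name `compare_apis`. With spark.ml you call `fit()` on a DataFrame that has `label` and `features` columns; with spark.mllib you call `train()` on an RDD of `LabeledPoint`.

```python
def compare_apis():
    """Train the same tiny classifier through both Spark ML APIs (illustrative only)."""
    # Imports are kept inside the function so this file still loads without PySpark.
    from pyspark.sql import SparkSession
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.linalg import Vectors
    from pyspark.mllib.classification import LogisticRegressionWithLBFGS
    from pyspark.mllib.regression import LabeledPoint

    spark = (SparkSession.builder
             .master("local[1]")
             .appName("df-vs-rdd")
             .getOrCreate())
    try:
        # Toy (label, features) pairs, invented for this sketch.
        rows = [
            (0.0, [0.0, 1.0]),
            (0.0, [0.5, 1.5]),
            (1.0, [3.0, 0.0]),
            (1.0, [3.5, 0.5]),
        ]
        new_point = [3.2, 0.1]

        # DataFrame-based API (spark.ml): model = estimator.fit(df)
        df = spark.createDataFrame(
            [(label, Vectors.dense(feats)) for label, feats in rows],
            ["label", "features"],
        )
        df_model = LogisticRegression(maxIter=20).fit(df)
        new_df = spark.createDataFrame([(Vectors.dense(new_point),)], ["features"])
        df_pred = df_model.transform(new_df).first()["prediction"]

        # RDD-based API (spark.mllib): model = Algorithm.train(rdd)
        rdd = spark.sparkContext.parallelize(
            [LabeledPoint(label, feats) for label, feats in rows]
        )
        rdd_model = LogisticRegressionWithLBFGS.train(rdd, iterations=20)
        rdd_pred = rdd_model.predict(new_point)

        return df_pred, rdd_pred
    finally:
        spark.stop()
```

Calling `compare_apis()` returns the prediction from each model for the same new point. The DataFrame version would normally be the starting point, with the RDD version as the fallback when an algorithm is only available in spark.mllib.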

data frame or RDD for machine learning

2016-06-09 Thread pseudo oduesp
Hi, since Spark 1.3 we have DataFrames (thank God) instead of RDDs. Should we pass an RDD or a DataFrame to machine learning algorithms? I mean, when I build a model: Model = algorithm(rdd) or Model = algorithm(df)? If you have an example with a DataFrame, I would prefer to work with