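Hi Xiaomeng,

Have you confirmed the DataFrame contents before fitting? For example, call assembleddata.show() right before lr.fit(assembleddata). The assertion message suggests that the DataFrame has no rows by the time it reaches fit.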
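A minimal sketch of that check, assuming assembleddata is an ordinary Spark DataFrame with "label" and "features" columns. The helper name inspectAndFit is hypothetical and only there to keep the snippet self-contained:

```scala
import org.apache.spark.ml.regression.{LinearRegression, LinearRegressionModel}
import org.apache.spark.sql.DataFrame

// Hypothetical helper (not from the original code): inspect the input, then fit.
def inspectAndFit(assembleddata: DataFrame): LinearRegressionModel = {
  // Print the column names and types; "features" should be a vector column.
  assembleddata.printSchema()

  // Print up to 5 rows without truncating long feature vectors.
  assembleddata.show(5, false)

  // Fail early with a readable message instead of the assertion
  // raised inside WeightedLeastSquares.
  val rowCount = assembleddata.count()
  require(rowCount > 0, "assembleddata has no rows; check how it is loaded and filtered")

  // Same estimator settings as in the original message.
  val lr = new LinearRegression().setMaxIter(300).setFeaturesCol("features")
  lr.fit(assembleddata)
}
```

If show() prints an empty table or count() returns 0, the problem is upstream of the regression (for instance in how the JSON is read or in a filter or join), not in LinearRegression itself.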
Regards,
Yuhao

2016-12-21 10:05 GMT-08:00 Xiaomeng Wan <shawn...@gmail.com>:

> Hi,
>
> I am running linear regression on a dataframe and get the following error:
>
> Exception in thread "main" java.lang.AssertionError: assertion failed: Training dataset is empty.
>     at scala.Predef$.assert(Predef.scala:170)
>     at org.apache.spark.ml.optim.WeightedLeastSquares$Aggregator.validate(WeightedLeastSquares.scala:247)
>     at org.apache.spark.ml.optim.WeightedLeastSquares.fit(WeightedLeastSquares.scala:82)
>     at org.apache.spark.ml.regression.LinearRegression.train(LinearRegression.scala:180)
>     at org.apache.spark.ml.regression.LinearRegression.train(LinearRegression.scala:70)
>     at org.apache.spark.ml.Predictor.fit(Predictor.scala:90)
>
> here is the data and code:
>
> {"label":79.3,"features":{"type":1,"values":[6412.143500000001,888.0,1407.0,1.5844594594594594,10.614,12.07,0.12062966031483012,0.9991237664152219,6.065,0.49751449875724935]}}
> {"label":72.3,"features":{"type":1,"values":[6306.044500000001,1084.0,1451.0,1.338560885608856,7.018,12.04,0.41710963455149497,0.9992054343916128,6.05,0.4975083056478405]}}
> {"label":76.7,"features":{"type":1,"values":[6142.920000000003,1494.0,1437.0,0.9618473895582329,7.939,12.06,0.34170812603648426,0.9992216101762574,6.06,0.49751243781094534]}}
>
> val lr = new LinearRegression().setMaxIter(300).setFeaturesCol("features")
> val lrModel = lr.fit(assembleddata)
>
> Any clue or inputs are appreciated.
>
> Regards,
> Shawn