Thanks a lot Burak, that helped.

On Fri, Mar 13, 2015 at 1:55 PM, Burak Yavuz <brk...@gmail.com> wrote:
> Hi,
>
> I would suggest you use LBFGS, as I think the step size is hurting you.
> You can run the same thing in LBFGS as:
>
> ```
> import org.apache.spark.mllib.optimization.{LBFGS, LeastSquaresGradient, SimpleUpdater}
>
> val algorithm = new LBFGS(new LeastSquaresGradient(), new SimpleUpdater())
> val initialWeights = Vectors.dense(Array.fill(3)(scala.util.Random.nextDouble()))
>
> // LBFGS.optimize expects an RDD of (label, features) pairs, not LabeledPoints.
> val lbfgsData = parsedData.map(lp => (lp.label, lp.features))
> val weights = algorithm.optimize(lbfgsData, initialWeights)
> ```
>
> Note that weights will be a Vector and not a model. You can then generate
> the model with:
>
> // parsedData puts the constant 1.0 feature first, so the first weight is
> // the intercept.
> val w = weights.toArray
> val intercept = w.head
> val model = new LinearRegressionModel(Vectors.dense(w.drop(1)), intercept)
>
>
> Best,
> Burak
>
> On Wed, Mar 11, 2015 at 11:59 AM, EcoMotto Inc. <ecomot...@gmail.com>
> wrote:
>
>> Hello,
>>
>> I am trying to run LinearRegression on a dummy data set, given below.
>> I have tried many different settings, but I am still failing to
>> reproduce the desired coefficients.
>>
>> Please help me out, as I am facing the same problem in my actual dataset.
>> Thank you.
>>
>> This dataset is generated based on the simple equation:
>> Y = 4 + (2 * x1) + (3 * x2)
>>
>> *Data:*
>> y,x1,x2
>> 6.3,1,0.1
>> 8.6,2,0.2
>> 10.9,3,0.3
>> 13.8,4,0.6
>> 16.4,5,0.8
>> 19.6,6,1.2
>> 22.8,7,1.6
>> 25.7,8,1.9
>> 28.3,9,2.1
>> 31.2,10,2.4
>> 34.1,11,2.7
>>
>> *Spark Code:*
>> val data = sc.textFile("Data/tempData_1.csv")
>>
>> val parsedData = data.mapPartitions(_.drop(1)).map { line =>
>>   val parts = line.split(',')
>>   LabeledPoint(parts(0).toDouble,
>>     Vectors.dense(Array(1.0, parts(1).toDouble, parts(2).toDouble)))
>> }.cache()
>>
>> var numIterations = 400
>> val step = 0.01
>> val algorithm = new LinearRegressionWithSGD()
>> // Even tried with setIntercept(true) and just (x1, x2) features
>> algorithm.setIntercept(false)
>> algorithm.optimizer.setStepSize(step)
>> algorithm.optimizer.setNumIterations(numIterations)
>>   .setUpdater(new SimpleUpdater())
>>   //.setRegParam(0.1)
>>   .setMiniBatchFraction(1.0)
>>
>> val initialWeights =
>>   Vectors.dense(Array.fill(3)(scala.util.Random.nextDouble()))
>>
>> val model = algorithm.run(parsedData, initialWeights)
>> println(s">>>> Model intercept: ${model.intercept}, weights: ${model.weights}")
>>
>>
>>
>> Regards,
>> Arun
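
For anyone comparing results later: the coefficients the optimizer should converge to can be checked without Spark by solving the least-squares normal equations directly on the eleven rows above. This is only a minimal sketch in plain Scala. The names OlsCheck, solve and fit are made up for this example and are not part of MLlib, and it assumes the 3x3 system is non-singular. Since every row satisfies Y = 4 + (2 * x1) + (3 * x2), the printed weights should come out close to 4, 2 and 3 (bias first, as in the parsedData layout).

```scala
// Plain-Scala sketch (no Spark): ordinary least squares via the normal equations.
// OlsCheck, solve and fit are illustrative names, not MLlib API.
object OlsCheck {
  // Rows are (y, x1, x2), copied from the data set in the quoted message.
  val rows: Seq[(Double, Double, Double)] = Seq(
    (6.3, 1.0, 0.1), (8.6, 2.0, 0.2), (10.9, 3.0, 0.3), (13.8, 4.0, 0.6),
    (16.4, 5.0, 0.8), (19.6, 6.0, 1.2), (22.8, 7.0, 1.6), (25.7, 8.0, 1.9),
    (28.3, 9.0, 2.1), (31.2, 10.0, 2.4), (34.1, 11.0, 2.7)
  )

  /** Solves A x = b by Gaussian elimination with partial pivoting.
    * Assumes A is square and non-singular. */
  def solve(a: Array[Array[Double]], b: Array[Double]): Array[Double] = {
    val n = b.length
    // Work on an augmented copy [A | b] so the inputs are left untouched.
    val m = Array.tabulate(n, n + 1)((i, j) => if (j < n) a(i)(j) else b(i))
    for (col <- 0 until n) {
      // Swap in the row with the largest entry in this column.
      val pivot = (col until n).maxBy(r => math.abs(m(r)(col)))
      val tmp = m(col); m(col) = m(pivot); m(pivot) = tmp
      // Eliminate the column from all rows below the pivot row.
      for (r <- col + 1 until n) {
        val f = m(r)(col) / m(col)(col)
        for (c <- col to n) m(r)(c) -= f * m(col)(c)
      }
    }
    // Back substitution on the upper-triangular system.
    val x = new Array[Double](n)
    for (i <- n - 1 to 0 by -1) {
      val s = (i + 1 until n).map(j => m(i)(j) * x(j)).sum
      x(i) = (m(i)(n) - s) / m(i)(i)
    }
    x
  }

  /** Returns (intercept, w1, w2) minimizing the squared error. */
  def fit(data: Seq[(Double, Double, Double)]): Array[Double] = {
    // Design matrix with the bias column first, matching Array(1.0, x1, x2).
    val xs = data.map { case (_, x1, x2) => Array(1.0, x1, x2) }
    val ys = data.map(_._1)
    // Normal equations: (X^T X) w = X^T y
    val xtx = Array.tabulate(3, 3)((i, j) => xs.map(r => r(i) * r(j)).sum)
    val xty = Array.tabulate(3)(i => xs.zip(ys).map { case (r, y) => r(i) * y }.sum)
    solve(xtx, xty)
  }

  def main(args: Array[String]): Unit = {
    val w = fit(rows)
    println(s"intercept: ${w(0)}, weights: ${w(1)}, ${w(2)}")
  }
}
```

If the LBFGS weights differ noticeably from what this prints, the likely suspects are the input parsing or the ordering of the bias column, not the optimizer.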