Re: Getting incorrect weights for LinearRegression

2015-03-13 Thread Burak Yavuz
Hi,

I would suggest you use LBFGS, as I think the step size is hurting you. You
can run the same thing in LBFGS as:

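```
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.{LBFGS, LeastSquaresGradient, SimpleUpdater}

// LBFGS needs no step size; optimize() takes (label, features) pairs
val algorithm = new LBFGS(new LeastSquaresGradient(), new SimpleUpdater())
val initialWeights = Vectors.dense(Array.fill(3)(scala.util.Random.nextDouble()))

val weights = algorithm.optimize(parsedData.map(lp => (lp.label, lp.features)), initialWeights)
```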

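Note that weights will be a Vector and not a model. Since your features put
the constant 1.0 first, the first weight is the intercept. You can then
generate the model with:

import org.apache.spark.mllib.regression.LinearRegressionModel

val w = weights.toArray
val intercept = w.head
val model = new LinearRegressionModel(Vectors.dense(w.tail), intercept)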
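
To make the step-size point concrete, here is a minimal sketch in plain Scala (no Spark) that runs full-batch gradient descent on the dataset from this thread with step 0.01 for 400 iterations. It is not MLlib's implementation: the object name StepSizeSketch, the helper names, the zero initial weights, and the fixed (non-decaying) step are all assumptions made for illustration. The two features are nearly collinear and unscaled, so a step small enough to stay stable moves very slowly along the flat directions of the loss.

```scala
object StepSizeSketch {
  // Rows of (y, x1, x2), copied from the dataset in this thread.
  val rows: Seq[(Double, Double, Double)] = Seq(
    (6.3, 1.0, 0.1), (8.6, 2.0, 0.2), (10.9, 3.0, 0.3), (13.8, 4.0, 0.6),
    (16.4, 5.0, 0.8), (19.6, 6.0, 1.2), (22.8, 7.0, 1.6), (25.7, 8.0, 1.9),
    (28.3, 9.0, 2.1), (31.2, 10.0, 2.4), (34.1, 11.0, 2.7))

  // Feature vectors with the constant 1.0 first, as in parsedData.
  val points: Seq[(Double, Array[Double])] =
    rows.map { case (y, x1, x2) => (y, Array(1.0, x1, x2)) }

  def predict(w: Array[Double], x: Array[Double]): Double =
    w.zip(x).map { case (a, b) => a * b }.sum

  // Mean of half the squared error over all points.
  def loss(w: Array[Double]): Double =
    points.map { case (y, x) => 0.5 * math.pow(predict(w, x) - y, 2) }.sum / points.size

  // Full-batch gradient descent with a fixed step, starting from zeros
  // (both are simplifying assumptions, not what MLlib does internally).
  def gradientDescent(step: Double, iterations: Int): Array[Double] = {
    var w = Array(0.0, 0.0, 0.0)
    for (_ <- 1 to iterations) {
      val grad = Array(0.0, 0.0, 0.0)
      for ((y, x) <- points) {
        val diff = predict(w, x) - y
        for (j <- grad.indices) grad(j) += diff * x(j) / points.size
      }
      w = w.zip(grad).map { case (wj, gj) => wj - step * gj }
    }
    w
  }

  // Euclidean distance from the generating coefficients (4, 2, 3).
  def distanceFromTruth(w: Array[Double]): Double =
    math.sqrt(w.zip(Array(4.0, 2.0, 3.0)).map { case (a, b) => (a - b) * (a - b) }.sum)

  def main(args: Array[String]): Unit = {
    val w = gradientDescent(step = 0.01, iterations = 400)
    println(s"weights after 400 iterations: ${w.mkString(", ")}")
    println(s"loss: ${loss(w)}, distance from (4, 2, 3): ${distanceFromTruth(w)}")
  }
}
```

The loss goes down, but the weights should still be well away from (4, 2, 3) after 400 iterations. A quasi-Newton method such as LBFGS uses curvature information and so avoids hand-tuning the step.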


Best,
Burak

On Wed, Mar 11, 2015 at 11:59 AM, EcoMotto Inc. ecomot...@gmail.com wrote:

 Hello,

 I am trying to run LinearRegression on a dummy data set, given below. I
 have tried many different settings, but I still cannot reproduce the
 expected coefficients.

 Please help me out, as I am facing the same problem with my actual dataset.
 Thank you.

 This dataset is generated based on the simple equation: Y = 4 + (2 * x1) +
 (3 * x2)

 *Data:*
 y,x1,x2
 6.3,1,0.1
 8.6,2,0.2
 10.9,3,0.3
 13.8,4,0.6
 16.4,5,0.8
 19.6,6,1.2
 22.8,7,1.6
 25.7,8,1.9
 28.3,9,2.1
 31.2,10,2.4
 34.1,11,2.7
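
As a sanity check on the data above, the coefficients can be computed exactly instead of iteratively. The following is a minimal sketch in plain Scala (no Spark) that solves the normal equations X^T X w = X^T y by Gaussian elimination. The object name NormalEquationsSketch and the helpers solve and fit are hypothetical names introduced for illustration.

```scala
object NormalEquationsSketch {
  // Rows of (y, x1, x2), copied from the dataset above.
  val rows: Seq[(Double, Double, Double)] = Seq(
    (6.3, 1.0, 0.1), (8.6, 2.0, 0.2), (10.9, 3.0, 0.3), (13.8, 4.0, 0.6),
    (16.4, 5.0, 0.8), (19.6, 6.0, 1.2), (22.8, 7.0, 1.6), (25.7, 8.0, 1.9),
    (28.3, 9.0, 2.1), (31.2, 10.0, 2.4), (34.1, 11.0, 2.7))

  // Solve a * w = b by Gaussian elimination with partial pivoting.
  def solve(a: Array[Array[Double]], b: Array[Double]): Array[Double] = {
    val n = b.length
    val m = a.map(_.clone)
    val r = b.clone
    for (col <- 0 until n) {
      // Swap in the row with the largest pivot for numerical stability.
      val pivot = (col until n).maxBy(i => math.abs(m(i)(col)))
      val tmpRow = m(col); m(col) = m(pivot); m(pivot) = tmpRow
      val tmpVal = r(col); r(col) = r(pivot); r(pivot) = tmpVal
      for (i <- col + 1 until n) {
        val f = m(i)(col) / m(col)(col)
        for (j <- col until n) m(i)(j) -= f * m(col)(j)
        r(i) -= f * r(col)
      }
    }
    // Back substitution on the upper-triangular system.
    val w = new Array[Double](n)
    for (i <- n - 1 to 0 by -1) {
      val s = (i + 1 until n).map(j => m(i)(j) * w(j)).sum
      w(i) = (r(i) - s) / m(i)(i)
    }
    w
  }

  // Least-squares weights ordered as (intercept, x1, x2).
  def fit(): Array[Double] = {
    val xs = rows.map { case (_, x1, x2) => Array(1.0, x1, x2) }
    val ys = rows.map(_._1)
    val xtx = Array.tabulate(3, 3)((i, j) => xs.map(x => x(i) * x(j)).sum)
    val xty = Array.tabulate(3)(i => xs.zip(ys).map { case (x, y) => x(i) * y }.sum)
    solve(xtx, xty)
  }

  def main(args: Array[String]): Unit =
    println(s"intercept, x1, x2 weights: ${fit().mkString(", ")}")
}
```

Every row satisfies Y = 4 + (2 * x1) + (3 * x2) exactly, so the solve should return values very close to 4, 2 and 3. That suggests the data are fine and any mismatch comes from the optimizer settings.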

 *Spark Code:*
 val data = sc.textFile("Data/tempData_1.csv")

 val parsedData = data.mapPartitions(_.drop(1)).map { line =>
   val parts = line.split(',')
   LabeledPoint(parts(0).toDouble,
     Vectors.dense(Array(1.0, parts(1).toDouble, parts(2).toDouble)))
 }.cache()

 var numIterations = 400
 val step = 0.01
 val algorithm = new LinearRegressionWithSGD()
 // Even tried with setIntercept(true) and just (x1, x2) features
 algorithm.setIntercept(false)
 algorithm.optimizer.setStepSize(step)
 algorithm.optimizer.setNumIterations(numIterations)
   .setUpdater(new SimpleUpdater())
   //.setRegParam(0.1)
   .setMiniBatchFraction(1.0)

 val initialWeights =
   Vectors.dense(Array.fill(3)(scala.util.Random.nextDouble()))

 val model = algorithm.run(parsedData, initialWeights)
 println(s"Model intercept: ${model.intercept}, weights: ${model.weights}")



 Regards,
 Arun



Re: Getting incorrect weights for LinearRegression

2015-03-13 Thread EcoMotto Inc.
Thanks a lot Burak, that helped.
