Re: How could output the StreamingLinearRegressionWithSGD prediction result?

2015-06-23 Thread Xiangrui Meng
Please check the input path to your test data, and call `.count()` and
see whether there are records in it. -Xiangrui


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



How could output the StreamingLinearRegressionWithSGD prediction result?

2015-06-20 Thread Gavin Yue
Hey,

I am testing StreamingLinearRegressionWithSGD following the tutorial.

It works, but I could not output the prediction results. I tried
saveAsTextFile, but it only wrote _SUCCESS to the folder.

I am trying to check the prediction results and use
BinaryClassificationMetrics to get areaUnderROC.

Is there an example for this?

Thank you !
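
For reference, BinaryClassificationMetrics in MLlib is built from (score, label) pairs, while predictOnValues yields (key, prediction) pairs, so if the label is used as the key the pair order likely needs swapping before computing metrics. The sketch below is not Spark code. It is a plain-Scala illustration, with hypothetical names (AucSketch, areaUnderROC), of what an area-under-ROC value measures for such pairs, assuming labels of 1.0 for positive and anything else for negative. It may be useful as a sanity check on a handful of collected predictions.

```scala
object AucSketch {
  /** Area under the ROC curve from (score, label) pairs.
    * Assumption for this sketch: label 1.0 means positive, anything else negative.
    * Computed as the fraction of positive/negative pairs in which the positive
    * example has the higher score (ties count as half). O(n^2), for small samples only. */
  def areaUnderROC(scoreAndLabels: Seq[(Double, Double)]): Double = {
    val pos = scoreAndLabels.collect { case (s, l) if l == 1.0 => s }
    val neg = scoreAndLabels.collect { case (s, l) if l != 1.0 => s }
    require(pos.nonEmpty && neg.nonEmpty, "need at least one positive and one negative example")
    // Score every (positive, negative) pair: 1.0 for a win, 0.5 for a tie, 0.0 otherwise.
    val wins = pos.flatMap { p =>
      neg.map { n => if (p > n) 1.0 else if (p == n) 0.5 else 0.0 }
    }
    wins.sum / (pos.size.toLong * neg.size)
  }
}
```

A regression model's raw predictions can be passed in as scores, since only their ordering matters for this value.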


Cannot PredictOnValues or PredictOn base on the model build with StreamingLinearRegressionWithSGD

2014-12-05 Thread Bui, Tri
Hi,

The following example code builds the correct model.weights, but its
predicted value is zero. Am I calling predictOnValues incorrectly? I
also coded a batch version based on LinearRegressionWithSGD() with the same
train and test data, iterations, and step size, and model.predict gave
pretty good results.

I don't know why predictOnValues is coming out zero. Is there another way to
predict with StreamingLinearRegressionWithSGD()?

Attached is the test and train data I am using.

The number of iterations and the step size needed to converge are 600 and .0001.

val trainingData = ssc.textFileStream(inp(0)).map(LabeledPoint.parse)
val testData = ssc.textFileStream(inp(1)).map(LabeledPoint.parse)
val model = new StreamingLinearRegressionWithSGD()
  .setInitialWeights(Vectors.zeros(inp(3).toInt))
  .setNumIterations(inp(4).toInt)
  .setStepSize(inp(5).toFloat)
model.algorithm.setIntercept(true)
model.trainOn(trainingData)
//model.predictOnValues(testData.map(xp => (xp.label, xp.features))).print()
model.predictOn(testData.map(xp => xp.features)).print()
ssc.start()
ssc.awaitTermination()

Thanks for the help.
Tri
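
One possibility worth ruling out (an assumption, not something confirmed in this thread) is that predictions are being computed with the initial weights and not the trained ones. A linear model whose weights are all zero and whose intercept is zero predicts 0.0 for every input, which would match the symptom. The plain-Scala sketch below, using a hypothetical helper named predict, shows the arithmetic. It does not use or reproduce MLlib's implementation.

```scala
object ZeroModelCheck {
  /** Linear prediction: dot(weights, features) + intercept. Hypothetical helper. */
  def predict(weights: Array[Double], intercept: Double, features: Array[Double]): Double = {
    require(weights.length == features.length, "dimension mismatch")
    weights.zip(features).map { case (w, x) => w * x }.sum + intercept
  }

  // Initial state as configured via Vectors.zeros(n): every prediction is 0.0.
  val withInitialWeights: Double = predict(Array(0.0), 0.0, Array(7.0))

  // With a non-zero weight the prediction scales with the feature value.
  val withTrainedWeights: Double = predict(Array(0.8588), 0.0, Array(10.0))
}
```

If the printed predictions are exactly 0.0 for every test point while the logged weights are non-zero, comparing against a hand computation like this can help tell whether the trained weights are reaching the prediction step.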





final.test
Description: final.test


final.train
Description: final.train


RE: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD

2014-12-01 Thread Bui, Tri
Thanks Yanbo!  That works!

The only issue is that it won’t print the predicted value for lp.features
from the line of code below.

model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()

It prints the test input data correctly, but it keeps printing “0.0” as the
predicted value for lp.features.

Thanks
Tri


StreamingLinearRegressionWithSGD

2014-12-01 Thread Joanne Contact
Hi Gurus,

I have not looked at the code yet. I wonder if StreamingLinearRegressionWithSGD
http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/StreamingLinearRegressionWithSGD.html
is equivalent to LinearRegressionWithSGD
http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/LinearRegressionWithSGD.html
with the starting weights of the current batch set to the ending weights of
the last batch?

Since RidgeRegressionModel
http://spark.apache.org/docs/latest/api/java/org/apache/spark/mllib/regression/RidgeRegressionModel.html
does not seem to have a streaming version, I just wonder if this approach
will suffice.


Thanks!

J
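
The idea in the question, reusing the ending weights of one batch as the starting weights of the next, can be sketched in plain Scala. This is an illustration of the concept only, not a description of how MLlib implements streaming regression. The names fitBatch and fitStream, the plain averaged-gradient step, and the toy data are assumptions made for the example.

```scala
object WarmStartSketch {
  type Point = (Double, Array[Double]) // (label, features)

  /** Plain batch gradient descent for least squares (no intercept), starting from `init`. */
  def fitBatch(batch: Seq[Point], init: Array[Double],
               numIterations: Int, stepSize: Double): Array[Double] = {
    var w = init.clone()
    for (_ <- 1 to numIterations) {
      val grad = new Array[Double](w.length)
      for ((y, x) <- batch) {
        // Prediction error for this point under the current weights.
        val err = w.zip(x).map { case (wi, xi) => wi * xi }.sum - y
        // Accumulate the averaged squared-error gradient.
        for (j <- w.indices) grad(j) += err * x(j) / batch.size
      }
      w = w.zip(grad).map { case (wi, gi) => wi - stepSize * gi }
    }
    w
  }

  /** Each batch starts from the weights the previous batch ended with. */
  def fitStream(batches: Seq[Seq[Point]], init: Array[Double],
                numIterations: Int, stepSize: Double): Array[Double] =
    batches.foldLeft(init)((w, batch) => fitBatch(batch, w, numIterations, stepSize))
}
```

In this sketch a ridge-style variant would only change the gradient line by adding a penalty term, which is why the warm-start pattern itself does not depend on the particular batch algorithm.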


RE: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD

2014-11-26 Thread Bui, Tri
Liang,

Can you do me a favor and run predictOnValues on some sample test data to
see whether it works on your end? It is not working for me: it keeps
predicting 0.

My code:

val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingLinearRegression")
val ssc = new StreamingContext(conf, Seconds(args(2).toLong))
val trainingData = ssc.textFileStream(args(0)).map(LabeledPoint.parse)
val testData = ssc.textFileStream(args(1)).map(LabeledPoint.parse)
val model = new StreamingLinearRegressionWithSGD()
  .setInitialWeights(Vectors.zeros(args(3).toInt))
  .setNumIterations(args(4).toInt)
  .setStepSize(.0001)
model.trainOn(trainingData)
model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()
ssc.start()
ssc.awaitTermination()

Thanks
Tri





RE: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD

2014-11-26 Thread Bui, Tri
Thanks Yanbo!

Modified code below:

val conf = new SparkConf().setMaster("local[2]").setAppName("StreamingLinearRegression")
val ssc = new StreamingContext(conf, Seconds(args(2).toLong))
val trainingData = ssc.textFileStream(args(0)).map(LabeledPoint.parse)
val testData = ssc.textFileStream(args(1)).map(LabeledPoint.parse)
val model = new StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt)).setNumIterations(args(4).toInt).setStepSize(.0001).algorithm.setIntercept(true)
model.trainOn(trainingData)
model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()
ssc.start()
ssc.awaitTermination()

But I am getting a compile error:
[error] /data/project/LinearRegression/src/main/scala/StreamingLinearRegression.scala:54: value trainOn is not a member of org.apache.spark.mllib.regression.LinearRegressionWithSGD
[error] model.trainOn(trainingData)
[error]   ^
[error] /data/project/LinearRegression/src/main/scala/StreamingLinearRegression.scala:55: value predictOnValues is not a member of org.apache.spark.mllib.regression.LinearRegressionWithSGD
[error] model.predictOnValues(testData.map(lp => (lp.label, lp.features))).print()
[error]   ^
[error] two errors found
[error] (compile:compile) Compilation failed

Thanks
Tri





Re: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD

2014-11-26 Thread Yanbo Liang
Hi Tri,

Maybe my last response to your problem was lost. In any case, the following
code snippet runs correctly.

val model = new StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))

model.algorithm.setIntercept(true)

All setXXX() functions in StreamingLinearRegressionWithSGD return this.type,
which is the instance itself, so they can be chained. model.algorithm.setIntercept(true)
returns the inner algorithm instead, so that configuration needs to be set on a
separate line whose return value is not assigned.
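
The effect described here can be reproduced without Spark. The sketch below uses two hypothetical stand-in classes (InnerAlgorithm and StreamingWrapper, not MLlib classes) to show why ending a setter chain with .algorithm.setIntercept(true) changes the type of the resulting value, and why a separate statement avoids that.

```scala
object ChainSketch {
  // Hypothetical stand-in for the inner batch algorithm.
  class InnerAlgorithm {
    var addIntercept: Boolean = false
    def setIntercept(flag: Boolean): this.type = { addIntercept = flag; this }
  }

  // Hypothetical stand-in for the streaming wrapper that owns the algorithm.
  class StreamingWrapper {
    val algorithm = new InnerAlgorithm
    var stepSize: Double = 0.1
    def setStepSize(s: Double): this.type = { stepSize = s; this }
    def trainOn(batches: Seq[Seq[Double]]): Unit = ()
  }

  // Ending the chain on the inner object makes the val an InnerAlgorithm.
  val wrong: InnerAlgorithm =
    new StreamingWrapper().setStepSize(0.0001).algorithm.setIntercept(true)
  // wrong.trainOn(Seq.empty) // would not compile: trainOn is not a member of InnerAlgorithm

  // Keep the wrapper in the val and configure the inner algorithm separately.
  val model: StreamingWrapper = new StreamingWrapper().setStepSize(0.0001)
  model.algorithm.setIntercept(true)
}
```

The commented-out line corresponds to the "value trainOn is not a member of ...LinearRegressionWithSGD" error reported in the quoted message.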








Re: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD

2014-11-25 Thread Yanbo Liang
The case runs correctly in my environment.

14/11/25 17:48:20 INFO regression.StreamingLinearRegressionWithSGD: Model updated at time 141690890 ms
14/11/25 17:48:20 INFO regression.StreamingLinearRegressionWithSGD: Current model: weights, [0.8588]

Can you provide more detailed information, if convenient?

The intercept can be turned on as follows:
val model = new StreamingLinearRegressionWithSGD()
  .algorithm.setIntercept(true)

2014-11-25 3:31 GMT+08:00 Bui, Tri tri@verizonwireless.com.invalid:

 Hi,

 I am getting an incorrect weights model from StreamingLinearRegressionWithSGD.

 The one-feature input data is:

 (1,[1])
 (2,[2])
 …
 (20,[20])

 The result from "Current model: weights" is [-4.432], which is not correct.

 Also, how do I turn on the intercept for StreamingLinearRegression?

 Thanks
 Tri
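
As a cross-check on what the weight should be for the quoted data: assuming the elided rows continue the pattern so that the data is (i,[i]) for i from 1 to 20, and assuming no intercept, the least-squares weight has the closed form sum(x*y) / sum(x*x), which is exactly 1.0 here. The helper name slopeNoIntercept is hypothetical, and the sketch is plain Scala with no Spark dependency.

```scala
object ExpectedWeight {
  /** Least-squares slope for a one-feature model with no intercept:
    * sum(x * y) / sum(x * x). Points are (label, feature) pairs. */
  def slopeNoIntercept(points: Seq[(Double, Double)]): Double = {
    val sxy = points.map { case (y, x) => x * y }.sum
    val sxx = points.map { case (_, x) => x * x }.sum
    sxy / sxx
  }

  // Assumed data: (1,[1]), (2,[2]), ..., (20,[20]) as (label, feature).
  val data: Seq[(Double, Double)] = (1 to 20).map(i => (i.toDouble, i.toDouble))

  // Both sums equal 2870 for this data, so the slope is exactly 1.0.
  val expected: Double = slopeNoIntercept(data)
}
```

A streaming estimate that is still below 1.0 is consistent with gradient descent not having fully converged yet, while a negative value such as -4.432 points to a problem with the input data or step size.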



RE: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD

2014-11-25 Thread Bui, Tri
Thanks Liang!

It was my bad: I fat-fingered one of the data points. After correcting it, the
result matches yours.

I am still not able to get the intercept. I am getting:

[error] /data/project/LinearRegression/src/main/scala/StreamingLinearRegression.scala:47: value setIntercept is not a member of org.apache.spark.mllib.regression.StreamingLinearRegressionWithSGD

I tried the code below:

val model = new StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
model.setIntercept(addIntercept = true).trainOn(trainingData)

and:

val model = new StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
  .setIntercept(true)

But I still get a compilation error.

Thanks
Tri







Re: Inaccurate Estimate of weights model from StreamingLinearRegressionWithSGD

2014-11-25 Thread Yanbo Liang
Hi Tri,

setIntercept() is not a member function of StreamingLinearRegressionWithSGD.
It is a member function of LinearRegressionWithSGD (a GeneralizedLinearAlgorithm),
which is a member variable (named algorithm) of StreamingLinearRegressionWithSGD.

So you need to change your code to:

val model = new StreamingLinearRegressionWithSGD().setInitialWeights(Vectors.zeros(args(3).toInt))
  .algorithm.setIntercept(true)


Thanks

Yanbo

