Repository: spark
Updated Branches:
  refs/heads/branch-1.5 3df750029 -> f95129c17


[SPARK-10959] [PYSPARK] StreamingLogisticRegressionWithSGD does not train with given regParam and convergenceTol parameters

The regParam and convergenceTol params were being passed into the
StreamingLogisticRegressionWithSGD constructor, but not forwarded to the call
for model training.  Same with StreamingLinearRegressionWithSGD.  I added the
params as named arguments to the call and also fixed the intercept parameter,
which was being passed positionally and so was read as the regularization
value.
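
For illustration, a minimal sketch of the argument mix-up described above. The
train function here is a hypothetical stand-in, not the PySpark code; its
parameter order is an assumption modeled on the call sites in this patch, and
the numeric values are made up.

```python
# Hypothetical stand-in for the batch trainer. The parameter order is an
# assumption modeled on the call sites in this patch, not the real PySpark code.
def train(data, iterations=100, step=1.0, miniBatchFraction=1.0,
          initialWeights=None, regParam=0.0, regType=None, intercept=False,
          validateData=True, convergenceTol=0.001):
    # Report the settings the trainer would actually use.
    return {"regParam": regParam, "intercept": intercept,
            "convergenceTol": convergenceTol}


model_intercept = 0.5   # illustrative value
convergence_tol = 1e-6  # illustrative value

# Before: the sixth positional argument binds to regParam, and
# convergenceTol silently keeps its default.
before = train([], 50, 0.1, 1.0, [0.0], model_intercept)

# After: keyword arguments bind each value to the intended parameter.
after = train([], 50, 0.1, 1.0, [0.0],
              intercept=model_intercept, convergenceTol=convergence_tol)

print(before)
print(after)
```

Passing everything after the initial weights by keyword keeps the call correct
even if the trainer's parameter order changes later.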

Author: Bryan Cutler <bjcut...@us.ibm.com>

Closes #9002 from BryanCutler/StreamingSGD-convergenceTol-bug-10959.

(cherry picked from commit 5410747a84e9be1cea44159dfc2216d5e0728ab4)
Signed-off-by: Xiangrui Meng <m...@databricks.com>


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f95129c1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f95129c1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f95129c1

Branch: refs/heads/branch-1.5
Commit: f95129c17523ea60220a37576b8a9390943cf98e
Parents: 3df7500
Author: Bryan Cutler <bjcut...@us.ibm.com>
Authored: Thu Oct 8 22:21:07 2015 -0700
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Thu Oct 8 22:23:16 2015 -0700

----------------------------------------------------------------------
 python/pyspark/mllib/classification.py | 3 ++-
 python/pyspark/mllib/regression.py     | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/f95129c1/python/pyspark/mllib/classification.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/classification.py b/python/pyspark/mllib/classification.py
index 8f27c44..a302ed8 100644
--- a/python/pyspark/mllib/classification.py
+++ b/python/pyspark/mllib/classification.py
@@ -632,7 +632,8 @@ class StreamingLogisticRegressionWithSGD(StreamingLinearAlgorithm):
             if not rdd.isEmpty():
                 self._model = LogisticRegressionWithSGD.train(
                     rdd, self.numIterations, self.stepSize,
-                    self.miniBatchFraction, self._model.weights)
+                    self.miniBatchFraction, self._model.weights,
+                    regParam=self.regParam, convergenceTol=self.convergenceTol)
 
         dstream.foreachRDD(update)
 

http://git-wip-us.apache.org/repos/asf/spark/blob/f95129c1/python/pyspark/mllib/regression.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/regression.py b/python/pyspark/mllib/regression.py
index 41946e3..562fb36 100644
--- a/python/pyspark/mllib/regression.py
+++ b/python/pyspark/mllib/regression.py
@@ -669,7 +669,7 @@ class StreamingLinearRegressionWithSGD(StreamingLinearAlgorithm):
                 self._model = LinearRegressionWithSGD.train(
                     rdd, self.numIterations, self.stepSize,
                     self.miniBatchFraction, self._model.weights,
-                    self._model.intercept)
+                    intercept=self._model.intercept, convergenceTol=self.convergenceTol)
 
         dstream.foreachRDD(update)
 


