Repository: spark
Updated Branches:
  refs/heads/branch-1.5 3df750029 -> f95129c17
[SPARK-10959] [PYSPARK] StreamingLogisticRegressionWithSGD does not train with given regParam and convergenceTol parameters

These params were being passed into the StreamingLogisticRegressionWithSGD constructor, but not transferred to the call for model training. Same with StreamingLinearRegressionWithSGD. I added the params as named arguments to the call and also fixed the intercept parameter, which was being passed as regularization value.

Author: Bryan Cutler <bjcut...@us.ibm.com>

Closes #9002 from BryanCutler/StreamingSGD-convergenceTol-bug-10959.

(cherry picked from commit 5410747a84e9be1cea44159dfc2216d5e0728ab4)
Signed-off-by: Xiangrui Meng <m...@databricks.com>

Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f95129c1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f95129c1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f95129c1

Branch: refs/heads/branch-1.5
Commit: f95129c17523ea60220a37576b8a9390943cf98e
Parents: 3df7500
Author: Bryan Cutler <bjcut...@us.ibm.com>
Authored: Thu Oct 8 22:21:07 2015 -0700
Committer: Xiangrui Meng <m...@databricks.com>
Committed: Thu Oct 8 22:23:16 2015 -0700

----------------------------------------------------------------------
 python/pyspark/mllib/classification.py | 3 ++-
 python/pyspark/mllib/regression.py     | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/spark/blob/f95129c1/python/pyspark/mllib/classification.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/classification.py b/python/pyspark/mllib/classification.py
index 8f27c44..a302ed8 100644
--- a/python/pyspark/mllib/classification.py
+++ b/python/pyspark/mllib/classification.py
@@ -632,7 +632,8 @@ class StreamingLogisticRegressionWithSGD(StreamingLinearAlgorithm):
             if not rdd.isEmpty():
                 self._model = LogisticRegressionWithSGD.train(
                     rdd, self.numIterations, self.stepSize,
-                    self.miniBatchFraction, self._model.weights)
+                    self.miniBatchFraction, self._model.weights,
+                    regParam=self.regParam, convergenceTol=self.convergenceTol)
 
         dstream.foreachRDD(update)

http://git-wip-us.apache.org/repos/asf/spark/blob/f95129c1/python/pyspark/mllib/regression.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/regression.py b/python/pyspark/mllib/regression.py
index 41946e3..562fb36 100644
--- a/python/pyspark/mllib/regression.py
+++ b/python/pyspark/mllib/regression.py
@@ -669,7 +669,7 @@ class StreamingLinearRegressionWithSGD(StreamingLinearAlgorithm):
                 self._model = LinearRegressionWithSGD.train(
                     rdd, self.numIterations, self.stepSize,
                     self.miniBatchFraction, self._model.weights,
-                    self._model.intercept)
+                    intercept=self._model.intercept, convergenceTol=self.convergenceTol)
 
         dstream.foreachRDD(update)
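For readers who want to see why the regression.py change matters, here is a minimal sketch of the positional-argument mix-up described in the commit message. It does not use Spark: `train` below is a hypothetical stand-in whose parameter order is an assumption modeled on `LinearRegressionWithSGD.train`, and the data, weights, and tolerance values are made up for illustration.

```python
# Hypothetical stand-in for the trainer. The parameter order is an assumption
# modeled on pyspark.mllib's LinearRegressionWithSGD.train; nothing is trained.
def train(data, iterations=100, step=1.0, miniBatchFraction=1.0,
          initialWeights=None, regParam=0.0, regType=None, intercept=False,
          validateData=True, convergenceTol=0.001):
    """Return the settings the trainer would receive."""
    return {
        "regParam": regParam,
        "intercept": intercept,
        "convergenceTol": convergenceTol,
    }


# Made-up values standing in for the streaming model's state.
data = [(1.0, [0.5]), (0.0, [1.5])]
weights = [0.0]
model_intercept = 0.25
tol = 1e-6

# Before the fix: the sixth positional argument binds to regParam,
# and convergenceTol silently keeps its default.
before = train(data, 50, 0.1, 1.0, weights, model_intercept)

# After the fix: both values are passed by name, so they reach the
# parameters they were meant for.
after = train(data, 50, 0.1, 1.0, weights,
              intercept=model_intercept, convergenceTol=tol)

print("before:", before)
print("after: ", after)
```

Passing trailing options by keyword, as the patch does, keeps the call correct even if the parameter order of `train` differs from what the caller expects.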