[GitHub] spark pull request #21828: Update regression.py
Github user woodthom2 closed the pull request at: https://github.com/apache/spark/pull/21828 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21828: Update regression.py
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21828#discussion_r204012751 --- Diff: python/pyspark/ml/regression.py --- @@ -1116,7 +1116,7 @@ def setParams(self, featuresCol="features", labelCol="label", predictionCol="pre maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0, maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0, checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None, - impuriy="variance", featureSubsetStrategy="all"): + impurity="variance", featureSubsetStrategy="all"): --- End diff -- we could. I would rather use `_NoValue` instance for that purpose though. Also, I would make a warning via warnings package as we do in the code base. Can we add a simple test for that as well while we are here? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21828: Update regression.py
Github user woodthom2 commented on a diff in the pull request: https://github.com/apache/spark/pull/21828#discussion_r203997163 --- Diff: python/pyspark/ml/regression.py --- @@ -1116,7 +1116,7 @@ def setParams(self, featuresCol="features", labelCol="label", predictionCol="pre maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0, maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0, checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None, - impuriy="variance", featureSubsetStrategy="all"): + impurity="variance", featureSubsetStrategy="all"): --- End diff -- what about this until next major release? ``` def setParams(self, featuresCol="features", labelCol="label", predictionCol="prediction", maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0, maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0, checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None, impuriy=None, impurity="variance", featureSubsetStrategy="all"): if impuriy is not None: # for backward compatibility impurity = impuriy ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21828: Update regression.py
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21828#discussion_r203995996 --- Diff: python/pyspark/ml/regression.py --- @@ -1116,7 +1116,7 @@ def setParams(self, featuresCol="features", labelCol="label", predictionCol="pre maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0, maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0, checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None, - impuriy="variance", featureSubsetStrategy="all"): + impurity="variance", featureSubsetStrategy="all"): --- End diff -- Of course that's possible - user upgraded Spark and suddenly it gives no such keyword exception. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21828: Update regression.py
Github user woodthom2 commented on a diff in the pull request: https://github.com/apache/spark/pull/21828#discussion_r203994440 --- Diff: python/pyspark/ml/regression.py --- @@ -1116,7 +1116,7 @@ def setParams(self, featuresCol="features", labelCol="label", predictionCol="pre maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0, maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0, checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None, - impuriy="variance", featureSubsetStrategy="all"): + impurity="variance", featureSubsetStrategy="all"): --- End diff -- is anyone depending on the typoed version? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21828: Update regression.py
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21828#discussion_r203994108 --- Diff: python/pyspark/ml/regression.py --- @@ -1116,7 +1116,7 @@ def setParams(self, featuresCol="features", labelCol="label", predictionCol="pre maxDepth=5, maxBins=32, minInstancesPerNode=1, minInfoGain=0.0, maxMemoryInMB=256, cacheNodeIds=False, subsamplingRate=1.0, checkpointInterval=10, lossType="squared", maxIter=20, stepSize=0.1, seed=None, - impuriy="variance", featureSubsetStrategy="all"): + impurity="variance", featureSubsetStrategy="all"): --- End diff -- I think this can't just changed like this since it's going to break other users codes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21828: Update regression.py
GitHub user woodthom2 opened a pull request: https://github.com/apache/spark/pull/21828 Update regression.py Correct typo impuriy -> impurity (this would have stopped GBT working for some hyperparameter configurations) ## What changes were proposed in this pull request? (Please fill in changes proposed in this fix) ## How was this patch tested? (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests) (If this patch involves UI changes, please attach a screenshot; otherwise, remove this) Please review http://spark.apache.org/contributing.html before opening a pull request. You can merge this pull request into a Git repository by running: $ git pull https://github.com/woodthom2/spark patch-1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21828.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21828 commit d3973d0149ce313210c1f1930a9450bb33d70960 Author: woodthom2 Date: 2018-07-20T09:51:59Z Update regression.py Correct typo impuriy -> impurity (this would have stopped GBT working for some hyperparameter configurations) --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org