[GitHub] spark pull request #21557: [SPARK-24439][ML][PYTHON]Add distanceMeasure to B...
GitHub user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/21557

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21557: [SPARK-24439][ML][PYTHON]Add distanceMeasure to B...
GitHub user huaxingao commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21557#discussion_r198684081

    --- Diff: python/pyspark/ml/clustering.py ---
    @@ -622,10 +621,10 @@ def __init__(self, featuresCol="features", predictionCol="prediction", maxIter=2
         @keyword_only
         @since("2.0.0")
         def setParams(self, featuresCol="features", predictionCol="prediction", maxIter=20,
    -                  seed=None, k=4, minDivisibleClusterSize=1.0):
    +                  seed=None, k=4, minDivisibleClusterSize=1.0, distanceMeasure="euclidean"):
             """
             setParams(self, featuresCol="features", predictionCol="prediction", maxIter=20, \
    -                  seed=None, k=4, minDivisibleClusterSize=1.0)
    +                  seed=None, k=4, minDivisibleClusterSize=1.0, distanceMeasure="euclidean")
             Sets params for BisectingKMeans.
    --- End diff --

    @BryanCutler Thank you very much for your review. I will make the change.
[GitHub] spark pull request #21557: [SPARK-24439][ML][PYTHON]Add distanceMeasure to B...
GitHub user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21557#discussion_r198675445

    --- Diff: python/pyspark/ml/clustering.py ---
    @@ -622,10 +621,10 @@ def __init__(self, featuresCol="features", predictionCol="prediction", maxIter=2
         @keyword_only
         @since("2.0.0")
         def setParams(self, featuresCol="features", predictionCol="prediction", maxIter=20,
    -                  seed=None, k=4, minDivisibleClusterSize=1.0):
    +                  seed=None, k=4, minDivisibleClusterSize=1.0, distanceMeasure="euclidean"):
             """
             setParams(self, featuresCol="features", predictionCol="prediction", maxIter=20, \
    -                  seed=None, k=4, minDivisibleClusterSize=1.0)
    +                  seed=None, k=4, minDivisibleClusterSize=1.0, distanceMeasure="euclidean")
             Sets params for BisectingKMeans.
    --- End diff --

    I know we already have `setDistanceMeasure` and `getDistanceMeasure` methods from the shared param, but can you also add them here so we can use the `since` decorator? (same as KMeans)
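
A minimal sketch of what the request above amounts to, written without PySpark so it runs on its own. Everything in it is a stand-in: the local `since` decorator only approximates a version-tagging decorator by appending a `versionadded` note to the docstring, `HasDistanceMeasure` and `ToyBisectingKMeans` are hypothetical toy classes, and the version string `"2.4.0"` is an assumption rather than something stated in this thread.

```python
def since(version):
    """Simplified stand-in for a `since`-style decorator (not PySpark's own code).

    It appends a Sphinx ``versionadded`` note to the wrapped function's docstring.
    """
    def deco(func):
        func.__doc__ = (func.__doc__ or "").rstrip() + "\n\n.. versionadded:: %s" % version
        return func
    return deco


class HasDistanceMeasure(object):
    """Toy shared-param mixin (hypothetical): accessors carry no version note."""

    def __init__(self):
        self._distanceMeasure = "euclidean"

    def setDistanceMeasure(self, value):
        """Sets the value of distanceMeasure."""
        self._distanceMeasure = value
        return self

    def getDistanceMeasure(self):
        """Gets the value of distanceMeasure or its default value."""
        return self._distanceMeasure


class ToyBisectingKMeans(HasDistanceMeasure):
    """Toy estimator (hypothetical) that redeclares the inherited accessors."""

    @since("2.4.0")  # assumed version string, for illustration only
    def setDistanceMeasure(self, value):
        """Sets the value of distanceMeasure."""
        return super(ToyBisectingKMeans, self).setDistanceMeasure(value)

    @since("2.4.0")  # assumed version string, for illustration only
    def getDistanceMeasure(self):
        """Gets the value of distanceMeasure or its default value."""
        return super(ToyBisectingKMeans, self).getDistanceMeasure()
```

The behaviour is unchanged by the overrides; the only difference is that the estimator's own methods now carry a version note in their docstrings, which the inherited mixin methods do not.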
[GitHub] spark pull request #21557: [SPARK-24439][ML][PYTHON]Add distanceMeasure to B...
GitHub user huaxingao opened a pull request:

    https://github.com/apache/spark/pull/21557

    [SPARK-24439][ML][PYTHON]Add distanceMeasure to BisectingKMeans in PySpark

    ## What changes were proposed in this pull request?

    Add distanceMeasure to BisectingKMeans in Python.

    ## How was this patch tested?

    Added a doctest and also tested it manually.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/huaxingao/spark spark-24439

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21557.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21557

commit 7f4cb6177003482461c063f90e1e642f714ddcea
Author: Huaxin Gao
Date:   2018-06-13T17:38:15Z

    [SPARK-24439][ML][PYTHON]Add distanceMeasure to BisectingKMeans in PySpark
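
To illustrate why a clustering algorithm might expose a `distanceMeasure` parameter at all, here is a small pure-Python sketch comparing the default named in the diff, `"euclidean"`, with cosine distance. The thread itself only names `"euclidean"`; the cosine alternative, the helper functions, and the sample points are illustrative assumptions, and none of this is Spark code.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical sample points: q points the same way as p but is longer,
# r has the same length as p but points in a perpendicular direction.
p = [1.0, 1.0]
q = [3.0, 3.0]
r = [1.0, -1.0]

# Euclidean distance ranks r as the nearer neighbour of p ...
nearest_euclidean = min([q, r], key=lambda v: euclidean_distance(p, v))
# ... while cosine distance ranks q as the nearer neighbour of p.
nearest_cosine = min([q, r], key=lambda v: cosine_distance(p, v))
```

Because the two measures can disagree about which point is nearest, the choice of measure can change which cluster a point is assigned to.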