[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-07-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199478745 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/python/PythonSQLUtils.scala --- @@ -34,17 +33,19 @@ private[sql] object PythonSQLUtils

[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-07-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199498622 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -38,70 +39,75 @@ import org.apache.spark.util.Utils

[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-07-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199502733 --- Diff: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala --- @@ -398,6 +398,25 @@ private[spark] object PythonRDD extends Logging

[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-07-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199482134 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -183,34 +182,111 @@ private[sql] object ArrowConverters

[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-07-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199476976 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -183,34 +182,111 @@ private[sql] object ArrowConverters

[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-07-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199496002 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3236,13 +3237,50 @@ class Dataset[T] private[sql

[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-07-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199497456 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -183,34 +182,111 @@ private[sql] object ArrowConverters

[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-07-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199484323 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3236,13 +3237,50 @@ class Dataset[T] private[sql

[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-07-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199508609 --- Diff: python/pyspark/serializers.py --- @@ -184,27 +184,59 @@ def loads(self, obj): raise NotImplementedError -class

[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-07-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199482021 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -183,34 +182,111 @@ private[sql] object ArrowConverters

[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-07-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199499070 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -38,70 +39,75 @@ import org.apache.spark.util.Utils

[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-07-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199371158 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala --- @@ -183,34 +182,111 @@ private[sql] object ArrowConverters

[GitHub] spark pull request #21546: [WIP][SPARK-23030][SQL][PYTHON] Use Arrow stream ...

2018-06-29 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/21546#discussion_r199275753 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala --- @@ -3236,13 +3237,50 @@ class Dataset[T] private[sql

[GitHub] spark issue #20629: [SPARK-23451][ML] Deprecate KMeans.computeCost

2018-06-28 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20629 +1 for @mgaido91's plan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail

[GitHub] spark issue #16722: [SPARK-19591][ML][MLlib] Add sample weights to decision ...

2018-06-22 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/16722 Yes, feel free to take this over.

[GitHub] spark pull request #19680: [SPARK-22461][ML] Refactor Spark ML model summari...

2018-06-01 Thread sethah
Github user sethah closed the pull request at: https://github.com/apache/spark/pull/19680

[GitHub] spark pull request #20701: [SPARK-23528][ML] Add numIter to ClusteringSummar...

2018-03-26 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20701#discussion_r177200509 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeansModel.scala --- @@ -46,6 +47,10 @@ class KMeansModel @Since("2.4.0") (@Si

[GitHub] spark issue #20632: [SPARK-3159][ML] Add decision tree pruning

2018-03-02 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20632 Merged with master. Thanks!

[GitHub] spark pull request #20632: [SPARK-3159][ML] Add decision tree pruning

2018-03-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r171982559 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -703,4 +707,16 @@ private object RandomForestSuite

[GitHub] spark issue #20632: [SPARK-3159][ML] Add decision tree pruning

2018-03-02 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20632 @asolimando I thought of one more thing on the tests that I'd like to have. Other than that I think this is ready. @srowen For some reason the tests won't run... Do you have any insight

[GitHub] spark pull request #20632: [SPARK-3159][ML] Add decision tree pruning

2018-03-02 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r171899289 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -631,10 +634,70 @@ class RandomForestSuite extends SparkFunSuite

[GitHub] spark issue #20709: [SPARK-18844][MLLIB] Adding more binary classification e...

2018-03-02 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20709 You don't need to (and should not) open a new PR to fix merge conflicts. Just fix them through git, on the same branch

[GitHub] spark issue #20632: [SPARK-3159][ML] Add decision tree pruning

2018-03-01 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20632 Jenkins test this please.

[GitHub] spark issue #20709: [SPARK-18844][MLLIB] Adding more binary classification e...

2018-03-01 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20709 Why did you close the old one and re-open this? The discussion is lost now.

[GitHub] spark issue #20708: [SPARK-21209][MLLLIB] Implement Incremental PCA algorith...

2018-03-01 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20708 * Only committers can trigger the tests. * MLlib is in maintenance only mode, so we wouldn't accept this patch as is. * If this were to go into ML, I think you'd need to discuss it more

[GitHub] spark pull request #18998: [SPARK-21748][ML] Migrate the implementation of H...

2018-02-28 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/18998#discussion_r171425701 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala --- @@ -93,11 +97,21 @@ class HashingTF @Since("1.4.0") (@Si

[GitHub] spark issue #20632: [SPARK-3159][ML] Add decision tree pruning

2018-02-28 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20632 Jenkins retest this please.

[GitHub] spark issue #20632: [SPARK-3159][ML] Add decision tree pruning

2018-02-28 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20632 Squashing makes it impossible to review the history of the code review, so I don't think it's a good idea. It's fine for now. This LGTM. Let's see if @srowen or @jkbradley have any thoughts

[GitHub] spark issue #20632: [SPARK-3159] added subtree pruning in the translation fr...

2018-02-28 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20632 @asolimando Can you change the title to include `[ML]` and also shorten it. Maybe just: `Add decision tree pruning

[GitHub] spark issue #20632: [SPARK-3159] added subtree pruning in the translation fr...

2018-02-28 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20632 @srowen Do we need you to trigger the tests? I'm not sure why they haven't been run...

[GitHub] spark issue #20632: [SPARK-3159] added subtree pruning in the translation fr...

2018-02-28 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20632 Jenkins test this please.

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-27 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r171071499 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -266,15 +265,24 @@ private[tree] class LearningNode( var isLeaf: Boolean

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-27 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r171071028 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala --- @@ -92,6 +92,7 @@ private[spark] object RandomForest extends Logging

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-27 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r171070831 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -631,10 +634,99 @@ class RandomForestSuite extends SparkFunSuite

[GitHub] spark pull request #18998: [SPARK-21748][ML] Migrate the implementation of H...

2018-02-27 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/18998#discussion_r171025256 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala --- @@ -93,11 +97,21 @@ class HashingTF @Since("1.4.0") (@Si

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-26 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170738256 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -541,7 +541,9 @@ object DecisionTreeSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-26 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170738944 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -631,10 +634,99 @@ class RandomForestSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-26 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170738068 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -269,12 +268,19 @@ private[tree] class LearningNode( /** * Convert

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-23 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170410905 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -541,7 +541,7 @@ object DecisionTreeSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-23 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170412046 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -631,6 +651,160 @@ class RandomForestSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-23 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170410747 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -402,20 +405,40 @@ class RandomForestSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-23 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170410687 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -402,20 +407,35 @@ class RandomForestSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-23 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170410775 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -283,10 +292,12 @@ private[tree] class LearningNode( // Here we want

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-23 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170412098 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -631,6 +651,160 @@ class RandomForestSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-23 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170410851 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -18,17 +18,20 @@ package org.apache.spark.ml.tree.impl

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-23 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r170410834 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -270,11 +269,21 @@ private[tree] class LearningNode( * Convert

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-21 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169806418 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -402,20 +405,40 @@ class RandomForestSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-21 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169834234 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -362,10 +365,10 @@ class RandomForestSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-21 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169834018 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -359,29 +339,6 @@ class DecisionTreeSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-21 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169806576 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -402,20 +405,40 @@ class RandomForestSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-21 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169833178 --- Diff: mllib/src/test/scala/org/apache/spark/mllib/tree/DecisionTreeSuite.scala --- @@ -303,26 +303,6 @@ class DecisionTreeSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-21 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169806090 --- Diff: mllib/src/test/scala/org/apache/spark/ml/tree/impl/RandomForestSuite.scala --- @@ -402,20 +405,40 @@ class RandomForestSuite extends SparkFunSuite

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-21 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169737984 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -287,6 +291,34 @@ private[tree] class LearningNode

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-21 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169749740 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -287,6 +291,34 @@ private[tree] class LearningNode

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-21 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169735198 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -287,6 +291,34 @@ private[tree] class LearningNode

[GitHub] spark pull request #20632: [SPARK-3159] added subtree pruning in the transla...

2018-02-21 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20632#discussion_r169703784 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/Node.scala --- @@ -287,6 +291,34 @@ private[tree] class LearningNode

[GitHub] spark pull request #20472: [SPARK-22751][ML]Improve ML RandomForest shuffle ...

2018-02-20 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20472#discussion_r169391525 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala --- @@ -1001,11 +996,18 @@ private[spark] object RandomForest extends

[GitHub] spark pull request #20472: [SPARK-22751][ML]Improve ML RandomForest shuffle ...

2018-02-20 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20472#discussion_r169386551 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tree/impl/RandomForest.scala --- @@ -1001,11 +996,18 @@ private[spark] object RandomForest extends

[GitHub] spark issue #20472: [SPARK-22751][ML]Improve ML RandomForest shuffle perform...

2018-02-20 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20472 @srowen Can you trigger the tests?

[GitHub] spark issue #20332: [SPARK-23138][ML][DOC] Multiclass logistic regression su...

2018-01-29 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20332 Thanks a lot for your review, @MLnick!

[GitHub] spark pull request #20332: [SPARK-23138][ML][DOC] Multiclass logistic regres...

2018-01-29 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20332#discussion_r164476639 --- Diff: docs/ml-classification-regression.md --- @@ -125,7 +123,8 @@ Continuing the earlier example: [`LogisticRegressionTrainingSummary`](api/python

[GitHub] spark issue #20411: [SPARK-17139][ML][FOLLOW-UP] update LogisticRegressionSu...

2018-01-26 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20411 This is already fixed in #20332.

[GitHub] spark pull request #20332: [SPARK-23138][ML][DOC] Multiclass logistic regres...

2018-01-26 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20332#discussion_r164151869 --- Diff: docs/ml-classification-regression.md --- @@ -97,10 +97,6 @@ only available on the driver. [`LogisticRegressionTrainingSummary`](api/scala

[GitHub] spark pull request #20332: [SPARK-23138][ML][DOC] Multiclass logistic regres...

2018-01-26 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20332#discussion_r164151796 --- Diff: docs/ml-classification-regression.md --- @@ -125,7 +117,6 @@ Continuing the earlier example: [`LogisticRegressionTrainingSummary`](api/python

[GitHub] spark pull request #20332: [SPARK-23138][ML][DOC] Multiclass logistic regres...

2018-01-26 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20332#discussion_r164151687 --- Diff: examples/src/main/scala/org/apache/spark/examples/ml/MulticlassLogisticRegressionWithElasticNetExample.scala --- @@ -49,6 +49,48 @@ object

[GitHub] spark pull request #20332: [SPARK-23138][ML][DOC] Multiclass logistic regres...

2018-01-26 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/20332#discussion_r164151731 --- Diff: examples/src/main/python/ml/multiclass_logistic_regression_with_elastic_net.py --- @@ -43,6 +43,43 @@ # Print the coefficients

[GitHub] spark issue #20332: [SPARK-23138][ML][DOC] Multiclass summary example and us...

2018-01-19 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20332 @jkbradley @MLnick

[GitHub] spark pull request #20332: [SPARK-23138][ML][DOC] Multiclass summary example...

2018-01-19 Thread sethah
GitHub user sethah opened a pull request: https://github.com/apache/spark/pull/20332 [SPARK-23138][ML][DOC] Multiclass summary example and user guide ## What changes were proposed in this pull request? User guide and examples are updated to reflect multiclass logistic

[GitHub] spark issue #20188: [SPARK-22993][ML] Clarify HasCheckpointInterval param do...

2018-01-09 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20188 Thanks, latest commit should fix it.

[GitHub] spark issue #20188: [SPARK-22993][ML] Clarify HasCheckpointInterval param do...

2018-01-09 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20188 Good call @felixcheung! Will update shortly.

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160503466 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -126,15 +180,69 @@ abstract class MLWriter extends BaseReadWrite with Logging

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160484001 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala --- @@ -1044,6 +1056,50 @@ class LinearRegressionSuite extends

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160461644 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -126,15 +180,69 @@ abstract class MLWriter extends BaseReadWrite with Logging

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160503640 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -85,12 +87,55 @@ private[util] sealed trait BaseReadWrite { protected

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160496808 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -126,15 +180,69 @@ abstract class MLWriter extends BaseReadWrite with Logging

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160462794 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -85,12 +87,55 @@ private[util] sealed trait BaseReadWrite { protected

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160502536 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -85,12 +87,55 @@ private[util] sealed trait BaseReadWrite { protected

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160501723 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -126,15 +180,69 @@ abstract class MLWriter extends BaseReadWrite with Logging

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160503322 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -126,15 +180,69 @@ abstract class MLWriter extends BaseReadWrite with Logging

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160471845 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -710,15 +711,57 @@ class LinearRegressionModel private[ml

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160463657 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -126,15 +180,69 @@ abstract class MLWriter extends BaseReadWrite with Logging

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160483562 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala --- @@ -1044,6 +1056,50 @@ class LinearRegressionSuite extends

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160461560 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -85,12 +87,55 @@ private[util] sealed trait BaseReadWrite { protected

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160506592 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala --- @@ -1044,6 +1056,50 @@ class LinearRegressionSuite extends

[GitHub] spark pull request #19876: [ML][SPARK-11171][SPARK-11239] Add PMML export to...

2018-01-09 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r160463225 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -85,12 +87,55 @@ private[util] sealed trait BaseReadWrite { protected

[GitHub] spark issue #20188: [SPARK-22993][ML] Clarify HasCheckpointInterval param do...

2018-01-08 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/20188 cc @srowen @holdenk The MLlib counterparts actually make mention of this, but for some reason the note never got ported over to ML package. The only caveat I can think of is that this doc

[GitHub] spark pull request #20188: [SPARK-22993][ML] Clarify HasCheckpointInterval p...

2018-01-08 Thread sethah
GitHub user sethah opened a pull request: https://github.com/apache/spark/pull/20188 [SPARK-22993][ML] Clarify HasCheckpointInterval param doc ## What changes were proposed in this pull request? Add a note to the `HasCheckpointInterval` parameter doc that clarifies

[GitHub] spark pull request #19876: [WIP][ML][SPARK-11171][SPARK-11239] Add PMML expo...

2017-12-12 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r156389238 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -126,15 +180,69 @@ abstract class MLWriter extends BaseReadWrite with Logging

[GitHub] spark pull request #19876: [WIP][ML][SPARK-11171][SPARK-11239] Add PMML expo...

2017-12-12 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r155157785 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -126,15 +180,69 @@ abstract class MLWriter extends BaseReadWrite with Logging

[GitHub] spark pull request #19876: [WIP][ML][SPARK-11171][SPARK-11239] Add PMML expo...

2017-12-12 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r156370588 --- Diff: mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala --- @@ -85,12 +87,55 @@ private[util] sealed trait BaseReadWrite { protected

[GitHub] spark pull request #19876: [WIP][ML][SPARK-11171][SPARK-11239] Add PMML expo...

2017-12-12 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r156388871 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala --- @@ -994,6 +998,38 @@ class LinearRegressionSuite

[GitHub] spark pull request #19876: [WIP][ML][SPARK-11171][SPARK-11239] Add PMML expo...

2017-12-12 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19876#discussion_r156381361 --- Diff: mllib/src/main/scala/org/apache/spark/ml/regression/LinearRegression.scala --- @@ -554,7 +555,49 @@ class LinearRegressionModel private[ml

[GitHub] spark issue #19876: [WIP][ML][SPARK-11171][SPARK-11239] Add PMML export to S...

2017-12-12 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/19876 @holdenk Do you mind leaving some comments on the intentions/benefits of this new API for the benefit of other reviewers? For example, what use cases may exist - adding third party PFA support

[GitHub] spark pull request #19904: [SPARK-22707][ML] Optimize CrossValidator memory ...

2017-12-07 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19904#discussion_r155710913 --- Diff: mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala --- @@ -146,25 +147,18 @@ class CrossValidator @Since("1.2.0") (@Si

[GitHub] spark issue #19904: [SPARK-22707][ML] Optimize CrossValidator memory occupat...

2017-12-07 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/19904 Can you share your test/results with us?

[GitHub] spark issue #18729: [SPARK-21526] [MLlib] Add support to ML LogisticRegressi...

2017-12-05 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/18729 I actually completely agree about perfect being the enemy of good in this case. We should provide something workable that can be safely modified in the future if needed. Still, this needs to be done

[GitHub] spark pull request #19638: [SPARK-22422][ML] Add Adjusted R2 to RegressionMe...

2017-11-08 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19638#discussion_r149731177 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala --- @@ -764,13 +764,17 @@ class LinearRegressionSuite

[GitHub] spark pull request #19638: [SPARK-22422][ML] Add Adjusted R2 to RegressionMe...

2017-11-07 Thread sethah
Github user sethah commented on a diff in the pull request: https://github.com/apache/spark/pull/19638#discussion_r149559666 --- Diff: mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala --- @@ -764,13 +764,17 @@ class LinearRegressionSuite

[GitHub] spark issue #19680: [SPARK-22641][ML] Refactor Spark ML model summaries

2017-11-06 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/19680 I think it's important to restructure these summaries to inherit from the same traits, so different methods can be re-used. That structure has to live somewhere and there isn't really a logical

[GitHub] spark issue #19680: [SPARK-22641][ML] Refactor Spark ML model summaries

2017-11-06 Thread sethah
Github user sethah commented on the issue: https://github.com/apache/spark/pull/19680 Mima failures :) These APIs were all marked as experimental, which does give us some freedom to move them, though I know we prefer to avoid it. It's mainly complaining that we changed these classes
