[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-10-31 Thread yogeshg
Github user yogeshg commented on the issue: https://github.com/apache/spark/pull/22912 In an offline discussion with @MrBago, we noted that there are at most as many (non-cancelled) `timerTasks` on the `timer` as there are slots. So one thread for managing logging is probably fine
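The reasoning in the comment above (a bounded number of pending timer tasks, so one backing thread is enough) can be illustrated with a rough Python analogy. This is only a sketch, not Spark's Scala code: `run_on_single_timer`, `log_waiting`, the thread name `barrier-log-timer`, and the delays are all hypothetical, and the standard-library `sched` module stands in for the JVM `timer` mentioned in the comment.

```python
import sched
import threading
import time


def run_on_single_timer(num_slots, cancel_slot):
    """Schedule one logging task per slot on a single scheduler thread.

    Hypothetical illustration: every task is queued on the same scheduler,
    one of them is cancelled before it fires, and a single worker thread
    drains the queue.
    """
    scheduler = sched.scheduler(time.monotonic, time.sleep)
    fired = []  # (slot, name of the thread that ran the task)

    def log_waiting(slot):
        # Stand-in for the periodic "still waiting" log message.
        fired.append((slot, threading.current_thread().name))

    # One pending task per slot, staggered so the firing order is well defined.
    events = [
        scheduler.enter(0.01 * (slot + 1), 1, log_waiting, argument=(slot,))
        for slot in range(num_slots)
    ]

    # A cancelled task is removed from the queue and never runs.
    scheduler.cancel(events[cancel_slot])

    # The single backing thread: it runs every remaining task in time order.
    worker = threading.Thread(target=scheduler.run, name="barrier-log-timer")
    worker.start()
    worker.join()
    return fired


if __name__ == "__main__":
    for slot, thread_name in run_on_single_timer(num_slots=4, cancel_slot=2):
        print(slot, thread_name)
```

Because the queue never holds more entries than there are slots, the lone worker only has a small, bounded amount of work to do, which is the point the comment makes about logging.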

[GitHub] spark pull request #22912: [SPARK-25901] Use only one thread in BarrierTaskC...

2018-10-31 Thread yogeshg
GitHub user yogeshg opened a pull request: https://github.com/apache/spark/pull/22912 [SPARK-25901] Use only one thread in BarrierTaskContext companion object ## What changes were proposed in this pull request? Now we use only one `timer` (and thus a backing thread

[GitHub] spark issue #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff test Pyth...

2018-04-05 Thread yogeshg
Github user yogeshg commented on the issue: https://github.com/apache/spark/pull/20904 lgtm, I'll defer to @jkbradley --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e

[GitHub] spark pull request #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff te...

2018-04-05 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20904#discussion_r179639255 --- Diff: python/pyspark/ml/stat.py --- @@ -134,6 +134,63 @@ def corr(dataset, column, method="pearson"): return _java2py(sc, javaCo

[GitHub] spark pull request #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff te...

2018-04-04 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20904#discussion_r179278641 --- Diff: python/pyspark/ml/stat.py --- @@ -134,6 +134,63 @@ def corr(dataset, column, method="pearson"): return _java2py(sc, javaCo

[GitHub] spark pull request #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff te...

2018-04-04 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20904#discussion_r179267921 --- Diff: python/pyspark/ml/stat.py --- @@ -134,6 +134,63 @@ def corr(dataset, column, method="pearson"): return _java2py(sc, javaCo

[GitHub] spark pull request #20904: [SPARK-23751][ML][PySpark] Kolmogorov-Smirnoff te...

2018-04-04 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20904#discussion_r179283700 --- Diff: python/pyspark/ml/stat.py --- @@ -134,6 +134,63 @@ def corr(dataset, column, method="pearson"): return _java2py(sc, javaCo

[GitHub] spark pull request #20970: [SPARK-23562][ML] Forward RFormula handleInvalid ...

2018-04-03 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20970#discussion_r178942791 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/RFormulaSuite.scala --- @@ -592,4 +593,26 @@ class RFormulaSuite extends MLTest

[GitHub] spark issue #20970: [SPARK-23562][ML] Forward RFormula handleInvalid Param t...

2018-04-03 Thread yogeshg
Github user yogeshg commented on the issue: https://github.com/apache/spark/pull/20970 - [ ] send all PRs against subtasks, rather than against the parent task - [ ] avoid using infix notation for testing `contains

[GitHub] spark pull request #20970: [SPARK-23562][ML] Forward RFormula handleInvalid ...

2018-04-03 Thread yogeshg
GitHub user yogeshg opened a pull request: https://github.com/apache/spark/pull/20970 [SPARK-23562][ML] Forward RFormula handleInvalid Param to VectorAssembler to handle invalid values in non-string columns ## What changes were proposed in this pull request? `handleInvalid

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-04-02 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r178636550 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala --- @@ -136,34 +181,106 @@ class VectorAssembler @Since("1.4.0"

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-04-02 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r178605922 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala --- @@ -49,32 +55,64 @@ class VectorAssembler @Since("1.4.0"

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-03-21 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r176285294 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala --- @@ -49,32 +53,57 @@ class VectorAssembler @Since("1.4.0"

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-03-21 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r176280827 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorAssemblerSuite.scala --- @@ -147,4 +149,72 @@ class VectorAssemblerSuite

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-03-21 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r176267223 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorAssemblerSuite.scala --- @@ -147,4 +149,72 @@ class VectorAssemblerSuite

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-03-21 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r176265864 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala --- @@ -136,34 +172,88 @@ class VectorAssembler @Since("1.4.0"

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-03-21 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r176266756 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorAssemblerSuite.scala --- @@ -147,4 +149,72 @@ class VectorAssemblerSuite

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-03-21 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r176266213 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorAssemblerSuite.scala --- @@ -147,4 +149,72 @@ class VectorAssemblerSuite

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-03-21 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r176245770 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorAssemblerSuite.scala --- @@ -147,4 +149,72 @@ class VectorAssemblerSuite

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-03-21 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r176228007 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala --- @@ -49,32 +53,57 @@ class VectorAssembler @Since("1.4.0"

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-03-21 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r176220684 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/VectorAssemblerSuite.scala --- @@ -37,24 +37,26 @@ class VectorAssemblerSuite test

[GitHub] spark pull request #6452: [SPARK-7198] [MLLIB] VectorAssembler should output...

2018-03-19 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/6452#discussion_r175548228 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala --- @@ -46,19 +47,59 @@ class VectorAssembler(override val uid: String

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-03-16 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r175153265 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala --- @@ -49,32 +51,65 @@ class VectorAssembler @Since("1.4.0"

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-03-16 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r175152022 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/VectorAssembler.scala --- @@ -85,18 +120,34 @@ class VectorAssembler @Since("1.4.0"

[GitHub] spark issue #20829: [SPARK-23690][ML] Add handleinvalid to VectorAssembler

2018-03-15 Thread yogeshg
Github user yogeshg commented on the issue: https://github.com/apache/spark/pull/20829 test this please

[GitHub] spark issue #20829: [SPARK-23690][ML] Add handleinvalid to VectorAssembler

2018-03-15 Thread yogeshg
Github user yogeshg commented on the issue: https://github.com/apache/spark/pull/20829 I fixed code paths that failed tests, waiting for @SparkQA . Offline talk with @MrBago suggests that we can perhaps decrease the number of maps in `transform` method. Looking

[GitHub] spark pull request #20829: [SPARK-23690][ML] Add handleinvalid to VectorAsse...

2018-03-15 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20829#discussion_r174941546 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala --- @@ -234,7 +234,7 @@ class StringIndexerModel ( val metadata

[GitHub] spark pull request #20829: [SPARK-23690] [ML] Add handleinvalid to VectorAss...

2018-03-14 Thread yogeshg
GitHub user yogeshg opened a pull request: https://github.com/apache/spark/pull/20829 [SPARK-23690] [ML] Add handleinvalid to VectorAssembler ## What changes were proposed in this pull request? Introduce `handleInvalid` parameter in `VectorAssembler` that can take

[GitHub] spark pull request #20724: [SPARK-18630][PYTHON][ML] Move del method from Ja...

2018-03-05 Thread yogeshg
Github user yogeshg commented on a diff in the pull request: https://github.com/apache/spark/pull/20724#discussion_r172273476 --- Diff: python/pyspark/ml/tests.py --- @@ -173,6 +173,45 @@ class MockModel(MockTransformer, Model, HasFake): pass +class

[GitHub] spark pull request #20724: [SPARK-18630][PYTHON][ML] Move del method from Ja...

2018-03-02 Thread yogeshg
GitHub user yogeshg opened a pull request: https://github.com/apache/spark/pull/20724 [SPARK-18630][PYTHON][ML] Move del method from JavaParams to JavaWrapper; add tests ## What changes were proposed in this pull request? Move del method from JavaParams to JavaWrapper; add
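As a rough illustration of why moving a `__del__` method from a subclass to its base class matters, here is a minimal plain-Python sketch. It does not use PySpark: `FakeGateway`, `GATEWAY`, `detach`, and `StatResult` are hypothetical stand-ins, and the two wrapper classes only borrow the names from the pull request title.

```python
import gc


class FakeGateway:
    """Hypothetical stand-in for a bridge that tracks JVM-side objects."""

    def __init__(self):
        self.detached = []

    def detach(self, java_obj):
        # Record that the remote object was released.
        self.detached.append(java_obj)


GATEWAY = FakeGateway()


class JavaWrapper:
    """Base wrapper: cleanup lives here, so every subclass inherits it."""

    def __init__(self, java_obj):
        self._java_obj = java_obj

    def __del__(self):
        # Guard against interpreter shutdown, when globals may already be gone.
        if GATEWAY is not None and self._java_obj is not None:
            GATEWAY.detach(self._java_obj)


class JavaParams(JavaWrapper):
    """Wrapper that also carries params; no cleanup code of its own."""


class StatResult(JavaWrapper):
    """Hypothetical wrapper that is not a JavaParams subclass."""


if __name__ == "__main__":
    params = JavaParams("jvm-object-1")
    result = StatResult("jvm-object-2")
    del params, result
    gc.collect()
    print(GATEWAY.detached)
```

With the cleanup defined only on the params class, a wrapper such as the hypothetical `StatResult` would never release its remote object; defining it on the base class covers both.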