[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

2018-12-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23272#discussion_r240204508 --- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java --- @@ -667,4 +669,54 @@ public void testPeakMemoryUsed

[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

2018-12-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23272#discussion_r240204245 --- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java --- @@ -667,4 +669,54 @@ public void testPeakMemoryUsed

[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

2018-12-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23272#discussion_r240187707 --- Diff: core/src/test/java/org/apache/spark/memory/TestMemoryConsumer.java --- @@ -38,12 +38,14 @@ public long spill(long size, MemoryConsumer trigger

[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

2018-12-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23272#discussion_r240187678 --- Diff: core/src/test/java/org/apache/spark/unsafe/map/AbstractBytesToBytesMapSuite.java --- @@ -667,4 +668,53 @@ public void testPeakMemoryUsed

[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

2018-12-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23272 > have you seen any bug report caused by this dead lock? The original reporter of the JIRA ticket SPARK-26265 has hit this bug in their workl

[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

2018-12-10 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23272#discussion_r240170574 --- Diff: core/src/main/java/org/apache/spark/unsafe/map/BytesToBytesMap.java --- @@ -255,11 +255,18 @@ private MapIterator(int numRecords, Location loc

[GitHub] spark issue #23269: [SPARK-26316] Currently the wrong implementation in the ...

2018-12-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23269 hmm, I think the PR title is too long...Maybe just `Revert hash join metrics that causes performance degradation

[GitHub] spark issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapI...

2018-12-10 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23272 cc @cloud-fan --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h

[GitHub] spark pull request #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesM...

2018-12-10 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/23272 [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager ## What changes were proposed in this pull request

[GitHub] spark pull request #23248: [SPARK-26293][SQL] Cast exception when having pyt...

2018-12-09 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23248#discussion_r240041688 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/python/ExtractPythonUDFs.scala --- @@ -131,8 +131,20 @@ object ExtractPythonUDFs extends

[GitHub] spark pull request #23263: [SPARK-23674][ML] Adds Spark ML Events

2018-12-08 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23263#discussion_r240003952 --- Diff: mllib/src/main/scala/org/apache/spark/ml/Estimator.scala --- @@ -65,7 +65,19 @@ abstract class Estimator[M <: Model[M]] extends PipelineSt

[GitHub] spark pull request #23263: [SPARK-23674][ML] Adds Spark ML Events

2018-12-08 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23263#discussion_r240003674 --- Diff: mllib/src/test/scala/org/apache/spark/ml/MLEventsSuite.scala --- @@ -0,0 +1,199 @@ +/* + * Licensed to the Apache Software Foundation (ASF

[GitHub] spark pull request #23263: [SPARK-23674][ML] Adds Spark ML Events

2018-12-08 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23263#discussion_r240003563 --- Diff: mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala --- @@ -132,7 +132,8 @@ class Pipeline @Since("1.4.0") ( * @return fitte

[GitHub] spark pull request #23253: [SPARK-26303][SQL] Return partial results for bad...

2018-12-08 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23253#discussion_r240003434 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -37,6 +37,8 @@ displayTitle: Spark SQL Upgrading Guide - In Spark version 2.4 and earlier

[GitHub] spark pull request #23253: [SPARK-26303][SQL] Return partial results for bad...

2018-12-08 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23253#discussion_r239998453 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -37,6 +37,8 @@ displayTitle: Spark SQL Upgrading Guide - In Spark version 2.4 and earlier

[GitHub] spark pull request #23253: [SPARK-26303][SQL] Return partial results for bad...

2018-12-08 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23253#discussion_r239998485 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -37,6 +37,8 @@ displayTitle: Spark SQL Upgrading Guide - In Spark version 2.4 and earlier

[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-12-08 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20146 retest this please.

[GitHub] spark pull request #23259: [SPARK-26215][SQL][WIP] Define reserved/non-reser...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23259#discussion_r239994385 --- Diff: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 --- @@ -769,7 +774,7 @@ nonReserved | REVOKE | GRANT | LOCK

[GitHub] spark pull request #23253: [SPARK-26303][SQL] Return partial results for bad...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23253#discussion_r239994326 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -37,6 +37,8 @@ displayTitle: Spark SQL Upgrading Guide - In Spark version 2.4 and earlier

[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-12-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20146 Thanks @holdenk for reviewing! I've resolved some comments and replied to others. Please take a look. Thanks.

[GitHub] spark pull request #20146: [SPARK-11215][ML] Add multiple columns support to...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20146#discussion_r239994064 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala --- @@ -130,21 +159,60 @@ class StringIndexer @Since("1.4.0") (

[GitHub] spark pull request #23258: [SPARK-23375][SQL][FOLLOWUP][TEST] Test Sort metr...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23258#discussion_r239993927 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -182,10 +182,13 @@ class SQLMetricsSuite extends

[GitHub] spark pull request #20146: [SPARK-11215][ML] Add multiple columns support to...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20146#discussion_r239993480 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala --- @@ -130,21 +159,60 @@ class StringIndexer @Since("1.4.0") (

[GitHub] spark pull request #20146: [SPARK-11215][ML] Add multiple columns support to...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20146#discussion_r239992942 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala --- @@ -310,11 +439,23 @@ object StringIndexerModel extends MLReadable

[GitHub] spark pull request #20146: [SPARK-11215][ML] Add multiple columns support to...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20146#discussion_r239992845 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala --- @@ -310,11 +439,23 @@ object StringIndexerModel extends MLReadable

[GitHub] spark pull request #20146: [SPARK-11215][ML] Add multiple columns support to...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20146#discussion_r239992579 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala --- @@ -130,21 +159,60 @@ class StringIndexer @Since("1.4.0") (

[GitHub] spark pull request #20146: [SPARK-11215][ML] Add multiple columns support to...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20146#discussion_r239992360 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala --- @@ -79,26 +81,53 @@ private[feature] trait StringIndexerBase extends

[GitHub] spark pull request #20146: [SPARK-11215][ML] Add multiple columns support to...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20146#discussion_r239992378 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala --- @@ -130,21 +159,60 @@ class StringIndexer @Since("1.4.0") (

[GitHub] spark pull request #20146: [SPARK-11215][ML] Add multiple columns support to...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20146#discussion_r239991238 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala --- @@ -130,21 +159,60 @@ class StringIndexer @Since("1.4.0") (

[GitHub] spark pull request #23207: [SPARK-26193][SQL] Implement shuffle write metric...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23207#discussion_r239990986 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala --- @@ -38,13 +38,21 @@ case class CollectLimitExec(limit: Int, child

[GitHub] spark pull request #23253: [SPARK-26303][SQL] Return partial results for bad...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23253#discussion_r239846601 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -37,6 +37,8 @@ displayTitle: Spark SQL Upgrading Guide - In Spark version 2.4 and earlier

[GitHub] spark pull request #23253: [SPARK-26303][SQL] Return partial results for bad...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23253#discussion_r239846300 --- Diff: docs/sql-migration-guide-upgrade.md --- @@ -37,6 +37,8 @@ displayTitle: Spark SQL Upgrading Guide - In Spark version 2.4 and earlier

[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-12-07 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20146 retest this please.

[GitHub] spark pull request #23224: [SPARK-26277][SQL][TEST] WholeStageCodegen metric...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23224#discussion_r239837818 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala --- @@ -80,8 +80,10 @@ class SQLMetricsSuite extends

[GitHub] spark pull request #23249: [SPARK-26297][SQL] improve the doc of Distributio...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23249#discussion_r239754619 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala --- @@ -118,10 +115,12 @@ case class

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-12-07 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r239744338 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -95,9 +77,116 @@ case class

[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-12-06 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20146 ping @dbtsai

[GitHub] spark issue #23239: [SPARK-26021][SQL][followup] only deal with NaN and -0.0...

2018-12-06 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23239 The migration guide has been changed by another followup https://github.com/apache/spark/pull/23141: > In Spark version 2.4 and earlier, float/double -0.0 is semantically equal to 0.0, but us

[GitHub] spark pull request #23239: [SPARK-26021][SQL][followup] only deal with NaN a...

2018-12-06 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23239#discussion_r239690045 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeWriter.java --- @@ -198,11 +198,45 @@ protected final void

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-12-06 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r239668492 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -95,9 +77,116 @@ case class

[GitHub] spark pull request #23215: [SPARK-26263][SQL] Validate partition values with...

2018-12-06 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23215#discussion_r239460682 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -272,9 +279,14 @@ object PartitioningUtils

[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-12-06 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22514 @cloud-fan I've updated the PR description. Thanks.

[GitHub] spark issue #23215: [SPARK-26263][SQL] Validate partition values with user p...

2018-12-06 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23215 Sounds good to me too. As there is a config, it is good that we can still disable it.

[GitHub] spark pull request #23215: [SPARK-26263][SQL] Validate partition values with...

2018-12-06 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23215#discussion_r239388954 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -272,9 +279,13 @@ object PartitioningUtils

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-12-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r239323943 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -95,9 +77,116 @@ case class

[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23213 I think so, don't know if @cloud-fan or @mgaido91 has other opinions?

[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23236 > Compared to the error values from the test failures above, they match up until the 10th batch but then these continue until the 16th where it has a timeout I suspect that might beca

[GitHub] spark pull request #23239: [SPARK-26021][SQL][followup] only deal with NaN a...

2018-12-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23239#discussion_r239302780 --- Diff: sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/codegen/UnsafeWriter.java --- @@ -198,11 +198,45 @@ protected final void

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-12-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r239300131 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -95,9 +77,116 @@ case class

[GitHub] spark issue #23236: [SPARK-26275][PYTHON][ML] Increases timeout for Streamin...

2018-12-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23236 I agree with @BryanCutler's analysis, and a few things about this test look a bit weird. I also think it is fine to increase the timeout for now

[GitHub] spark pull request #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as a...

2018-12-05 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/23231

[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23231 Then let me close this now.

[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23231 Ok. Maybe we can add a few words to the ML migration guide to clearly announce this.

[GitHub] spark issue #23213: [SPARK-26262][SQL] Runs SQLQueryTestSuite on mixed confi...

2018-12-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23213 `wholeStage=false, factoryMode=CODE_ONLY` and `wholeStage=false, factoryMode=NO_CODEGEN` should have more complete test coverage for `GenerateUnsafeProject`, `GenerateMutableProject`, etc

[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23231 It is because the ML migration guide claims that we will keep OneHotEncoderEstimator as an alias. I'm fine if we have consensus now that we can avoid such an alias

[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23230 Thanks @HyukjinKwon

[GitHub] spark pull request #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as a...

2018-12-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23231#discussion_r239011539 --- Diff: mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoderEstimator.scala --- @@ -0,0 +1,91 @@ +/* + * Licensed to the Apache

[GitHub] spark issue #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to...

2018-12-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23231 cc @srowen @dbtsai

[GitHub] spark pull request #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as a...

2018-12-05 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23231#discussion_r239008438 --- Diff: mllib/src/test/scala/org/apache/spark/ml/feature/OneHotEncoderEstimatorSuite.scala --- @@ -0,0 +1,423 @@ +/* + * Licensed to the Apache

[GitHub] spark pull request #23231: [SPARK-26273][ML] Add OneHotEncoderEstimator as a...

2018-12-05 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/23231 [SPARK-26273][ML] Add OneHotEncoderEstimator as alias to OneHotEncoder ## What changes were proposed in this pull request? SPARK-26133 removed deprecated OneHotEncoder and renamed

[GitHub] spark issue #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder

2018-12-05 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23230 cc @HyukjinKwon @srowen

[GitHub] spark pull request #23230: [SPARK-26133][ML][Followup] Fix doc for OneHotEnc...

2018-12-05 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/23230 [SPARK-26133][ML][Followup] Fix doc for OneHotEncoder ## What changes were proposed in this pull request? This fixes doc of renamed OneHotEncoder in PySpark. ## How was this patch

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-12-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r238909363 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -181,62 +180,39 @@ case class RelationConversions( conf

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-12-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r238902415 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -181,62 +180,39 @@ case class RelationConversions( conf

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-12-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r238707304 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/CreateHiveTableAsSelectCommand.scala --- @@ -95,9 +77,127 @@ case class

[GitHub] spark pull request #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHO...

2018-12-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23213#discussion_r238696052 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQueryTestSuite.scala --- @@ -144,9 +144,10 @@ class SQLQueryTestSuite extends QueryTest

[GitHub] spark pull request #23213: [SPARK-26262][SQL] Run SQLQueryTestSuite with WHO...

2018-12-04 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23213#discussion_r238695610 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala --- @@ -2899,6 +2899,144 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark issue #23214: [SPARK-26155] Optimizing the performance of LongToUnsafe...

2018-12-04 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23214 Thanks for doing this. I think we are closer to the root cause.

[GitHub] spark issue #23204: Revert "[SPARK-21052][SQL] Add hash map metrics to join"

2018-12-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23204 Is this observable in general hash join queries, other than TPC-DS Q19?

[GitHub] spark pull request #23204: Revert "[SPARK-21052][SQL] Add hash map metrics t...

2018-12-03 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23204#discussion_r238270550 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala --- @@ -483,8 +470,6 @@ private[execution] final class

[GitHub] spark issue #23203: [SPARK-26252][PYTHON] Add support to run specific unitte...

2018-12-03 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23203 I haven't looked closely at the changes yet, but I think it should be very useful. Thanks @HyukjinKwon

[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...

2018-12-01 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23184#discussion_r238065068 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala --- @@ -225,4 +225,10 @@ private[sql] object SQLUtils extends Logging

[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...

2018-11-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23184#discussion_r237896514 --- Diff: R/pkg/R/functions.R --- @@ -202,8 +202,9 @@ NULL #' \itemize{ #' \item \code{from_json}: a structType object to use

[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...

2018-11-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23184#discussion_r237898787 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala --- @@ -225,4 +225,10 @@ private[sql] object SQLUtils extends Logging

[GitHub] spark pull request #23184: [SPARK-26227][R] from_[csv|json] should accept sc...

2018-11-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23184#discussion_r237899057 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/api/r/SQLUtils.scala --- @@ -225,4 +225,10 @@ private[sql] object SQLUtils extends Logging

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-30 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237818279 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,35 @@ abstract class Expression

[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-11-30 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22514 retest this please.

[GitHub] spark pull request #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23152#discussion_r237768463 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala --- @@ -2276,4 +2276,16 @@ class SQLQuerySuite extends QueryTest

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r237749421 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala --- @@ -92,4 +92,18 @@ class HiveParquetSuite extends QueryTest

[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-11-29 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22957 Btw, I think we can update the PR title and description to reflect the new changes.

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237749287 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/PlannerSuite.scala --- @@ -780,6 +780,23 @@ class PlannerSuite extends SharedSQLContext

[GitHub] spark issue #22957: [SPARK-25951][SQL] Ignore aliases for distributions and ...

2018-11-29 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22957 This looks good to me. Just a comment about wording.

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237747550 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,35 @@ abstract class Expression

[GitHub] spark pull request #22957: [SPARK-25951][SQL] Ignore aliases for distributio...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22957#discussion_r237747770 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala --- @@ -195,14 +195,35 @@ abstract class Expression

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r237747152 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala --- @@ -181,62 +180,39 @@ case class RelationConversions( conf

[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-11-29 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22514 https://user-images.githubusercontent.com/68855/49268483-aaa6d000-f49a-11e8-92c3-5ee78012fe9e.png

[GitHub] spark pull request #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23152#discussion_r237721273 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -879,13 +879,13 @@ case

[GitHub] spark pull request #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23152#discussion_r237720737 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -879,13 +879,13 @@ case

[GitHub] spark issue #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of `Colum...

2018-11-29 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/23152 Agreed. But it looks like the added test failed.

[GitHub] spark pull request #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts,...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23171#discussion_r237402055 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -335,6 +343,41 @@ case class In(value: Expression

[GitHub] spark pull request #23171: [SPARK-26205][SQL] Optimize In for bytes, shorts,...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23171#discussion_r237405465 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -335,6 +343,41 @@ case class In(value: Expression

[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23176#discussion_r237398085 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -367,11 +367,29 @@ case class InSet(child

[GitHub] spark pull request #23176: [SPARK-26211][SQL] Fix InSet for binary, and stru...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23176#discussion_r237397442 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala --- @@ -367,11 +367,29 @@ case class InSet(child

[GitHub] spark pull request #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of...

2018-11-29 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23152#discussion_r237382531 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/statsEstimation/FilterEstimation.scala --- @@ -879,13 +879,13 @@ case

[GitHub] spark pull request #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of...

2018-11-28 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23152#discussion_r237381966 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -821,6 +822,32 @@ class

[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-11-28 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22514 > can we try a query and see what the SQL UI looks like? Yes. I will try a query and post the SQL

[GitHub] spark pull request #22514: [SPARK-25271][SQL] Hive ctas commands should use ...

2018-11-28 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/22514#discussion_r237364946 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/HiveParquetSuite.scala --- @@ -92,4 +92,18 @@ class HiveParquetSuite extends QueryTest

[GitHub] spark pull request #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of...

2018-11-28 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23152#discussion_r237359890 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -821,6 +822,32 @@ class

[GitHub] spark pull request #23152: [SPARK-26181][SQL] the `hasMinMaxStats` method of...

2018-11-28 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/23152#discussion_r237347496 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/statsEstimation/FilterEstimationSuite.scala --- @@ -821,6 +822,32 @@ class

[GitHub] spark issue #22514: [SPARK-25271][SQL] Hive ctas commands should use data so...

2018-11-28 Thread viirya
Github user viirya commented on the issue: https://github.com/apache/spark/pull/22514 Yea, let's see if the retest works well.
