[GitHub] [spark] viirya commented on a change in pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
viirya commented on a change in pull request #30379: URL: https://github.com/apache/spark/pull/30379#discussion_r523517002 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SubExprEliminationBenchmark.scala ## @@ -0,0 +1,119 @@ +/* + * Licensed to the

[GitHub] [spark] github-actions[bot] closed pull request #29309: [SPARK-29886][SQL] Add support for satisfying HashClusteredDistribution by DataSourceV2 implementations

2020-11-14 Thread GitBox
github-actions[bot] closed pull request #29309: URL: https://github.com/apache/spark/pull/29309 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] github-actions[bot] closed pull request #29266: [SPARK-32464][SQL] Support skew handling on join that has one side wi…

2020-11-14 Thread GitBox
github-actions[bot] closed pull request #29266: URL: https://github.com/apache/spark/pull/29266

[GitHub] [spark] github-actions[bot] commented on pull request #29354: [SPARK-32533][SQL] Improve Avro read/write performance on nested structs and array of structs

2020-11-14 Thread GitBox
github-actions[bot] commented on pull request #29354: URL: https://github.com/apache/spark/pull/29354#issuecomment-727287293 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue

[GitHub] [spark] github-actions[bot] closed pull request #29080: [SPARK-32271][ML] Add option for k-fold cross-validation to CrossValidator

2020-11-14 Thread GitBox
github-actions[bot] closed pull request #29080: URL: https://github.com/apache/spark/pull/29080

[GitHub] [spark] SparkQA commented on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
SparkQA commented on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727287205 **[Test build #131101 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131101/testReport)** for PR 30379 at commit

[GitHub] [spark] viirya commented on a change in pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
viirya commented on a change in pull request #30379: URL: https://github.com/apache/spark/pull/30379#discussion_r523490332 ## File path: sql/core/benchmarks/SubExprEliminationBenchmark-results.txt ## @@ -0,0 +1,15 @@

[GitHub] [spark] viirya edited a comment on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
viirya edited a comment on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727285058 > @maropu That's a good point. Initially, I thought of three steps: 1) add a benchmark with the baseline first, 2) merge #30341, 3) update this benchmark in another

[GitHub] [spark] viirya commented on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
viirya commented on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727285058 > @maropu That's a good point. Initially, I thought of three steps: 1) add a benchmark with the baseline first, 2) merge #30341, 3) update this benchmark in another PR. But,

[GitHub] [spark] SparkQA commented on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
SparkQA commented on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727284961 **[Test build #131100 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131100/testReport)** for PR 30379 at commit

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #30379: URL: https://github.com/apache/spark/pull/30379#discussion_r523487480 ## File path: sql/core/benchmarks/SubExprEliminationBenchmark-results.txt ## @@ -7,9 +7,9 @@ OpenJDK 64-Bit Server VM 1.8.0_265-b01 on Mac OS X

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-727283913

[GitHub] [spark] AmplabJenkins commented on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2020-11-14 Thread GitBox
AmplabJenkins commented on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-727283913

[GitHub] [spark] SparkQA removed a comment on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2020-11-14 Thread GitBox
SparkQA removed a comment on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-727255673 **[Test build #131097 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131097/testReport)** for PR 29695 at commit

[GitHub] [spark] SparkQA commented on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2020-11-14 Thread GitBox
SparkQA commented on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-727283664 **[Test build #131097 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131097/testReport)** for PR 29695 at commit

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #30379: URL: https://github.com/apache/spark/pull/30379#discussion_r523477798 ## File path: sql/core/benchmarks/SubExprEliminationBenchmark-results.txt ## @@ -0,0 +1,15 @@

[GitHub] [spark] dongjoon-hyun commented on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
dongjoon-hyun commented on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727282415 Your idea is better and correct because there is no conf yet~ > Yea, but, on second thought, merging this first looks fine, too.

[GitHub] [spark] maropu commented on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
maropu commented on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727281979 @dongjoon-hyun Yea, but, on second thought, merging this first looks fine, too.

[GitHub] [spark] dongjoon-hyun commented on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
dongjoon-hyun commented on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727280631 @maropu That's a good point. Initially, I thought of three steps: 1) add a benchmark with the baseline first, 2) merge #30341, 3) update this benchmark in another PR.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727277910

[GitHub] [spark] SparkQA commented on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
SparkQA commented on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727277904 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35702/

[GitHub] [spark] maropu commented on a change in pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
maropu commented on a change in pull request #30379: URL: https://github.com/apache/spark/pull/30379#discussion_r523473622 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SubExprEliminationBenchmark.scala ## @@ -0,0 +1,119 @@ +/* + * Licensed to the

[GitHub] [spark] AmplabJenkins commented on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
AmplabJenkins commented on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727277910

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727277666 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727277664 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
AmplabJenkins commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727277664

[GitHub] [spark] SparkQA removed a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
SparkQA removed a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727264772 **[Test build #131098 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131098/testReport)** for PR 30378 at commit

[GitHub] [spark] SparkQA commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
SparkQA commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727277527 **[Test build #131098 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131098/testReport)** for PR 30378 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30377: [SPARK-33453][SQL][TESTS] Unify v1 and v2 SHOW PARTITIONS tests

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #30377: URL: https://github.com/apache/spark/pull/30377#issuecomment-727274887

[GitHub] [spark] janekdb commented on a change in pull request #30376: change 'spark.sql.adaptive.skewedPartitionThresholdInBytes' to 'spark.sql.adaptive.skewJoin.skewedPartitionThresholdInBytes' #SP

2020-11-14 Thread GitBox
janekdb commented on a change in pull request #30376: URL: https://github.com/apache/spark/pull/30376#discussion_r523472097 ## File path: docs/sql-performance-tuning.md ## @@ -280,7 +280,7 @@ Data skew can severely downgrade the performance of join queries. This feature d

[GitHub] [spark] SparkQA commented on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
SparkQA commented on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727275058 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35702/

[GitHub] [spark] AmplabJenkins commented on pull request #30377: [SPARK-33453][SQL][TESTS] Unify v1 and v2 SHOW PARTITIONS tests

2020-11-14 Thread GitBox
AmplabJenkins commented on pull request #30377: URL: https://github.com/apache/spark/pull/30377#issuecomment-727274887

[GitHub] [spark] SparkQA removed a comment on pull request #30377: [SPARK-33453][SQL][TESTS] Unify v1 and v2 SHOW PARTITIONS tests

2020-11-14 Thread GitBox
SparkQA removed a comment on pull request #30377: URL: https://github.com/apache/spark/pull/30377#issuecomment-727244935 **[Test build #131095 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131095/testReport)** for PR 30377 at commit

[GitHub] [spark] SparkQA commented on pull request #30377: [SPARK-33453][SQL][TESTS] Unify v1 and v2 SHOW PARTITIONS tests

2020-11-14 Thread GitBox
SparkQA commented on pull request #30377: URL: https://github.com/apache/spark/pull/30377#issuecomment-727274677 **[Test build #131095 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131095/testReport)** for PR 30377 at commit

[GitHub] [spark] janekdb commented on a change in pull request #30377: [SPARK-33453][SQL][TESTS] Unify v1 and v2 SHOW PARTITIONS tests

2020-11-14 Thread GitBox
janekdb commented on a change in pull request #30377: URL: https://github.com/apache/spark/pull/30377#discussion_r523471140 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/ShowPartitionsSuite.scala ## @@ -0,0 +1,198 @@ +/* + * Licensed to the

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727273046

[GitHub] [spark] AmplabJenkins commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-14 Thread GitBox
AmplabJenkins commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727273046

[GitHub] [spark] SparkQA removed a comment on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-14 Thread GitBox
SparkQA removed a comment on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727242883 **[Test build #131094 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131094/testReport)** for PR 30341 at commit

[GitHub] [spark] SparkQA commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-14 Thread GitBox
SparkQA commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727272817 **[Test build #131094 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131094/testReport)** for PR 30341 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727271846 Test FAILed. Refer to this link for build results (access rights to CI server needed):

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727271844 Merged build finished. Test FAILed.

[GitHub] [spark] SparkQA commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
SparkQA commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727271838 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35701/

[GitHub] [spark] AmplabJenkins commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
AmplabJenkins commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727271844

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #30379: URL: https://github.com/apache/spark/pull/30379#discussion_r523468628 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SubExprEliminationBenchmark.scala ## @@ -0,0 +1,119 @@ +/* + * Licensed to

[GitHub] [spark] SparkQA commented on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
SparkQA commented on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727271125 **[Test build #131099 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131099/testReport)** for PR 30379 at commit

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #30379: URL: https://github.com/apache/spark/pull/30379#discussion_r523468297 ## File path: sql/core/benchmarks/SubExprEliminationBenchmark-results.txt ## @@ -0,0 +1,15 @@

[GitHub] [spark] dongjoon-hyun commented on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
dongjoon-hyun commented on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727271068 Thank you so much for this additional work, @viirya !

[GitHub] [spark] viirya commented on pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
viirya commented on pull request #30379: URL: https://github.com/apache/spark/pull/30379#issuecomment-727270996 cc @dongjoon-hyun @maropu

[GitHub] [spark] viirya opened a new pull request #30379: [SPARK-33455][SQL][TEST] Add SubExprEliminationBenchmark for benchmarking subexpression elimination

2020-11-14 Thread GitBox
viirya opened a new pull request #30379: URL: https://github.com/apache/spark/pull/30379 ### What changes were proposed in this pull request? This patch adds a benchmark `SubExprEliminationBenchmark` for benchmarking the subexpression elimination feature. ### Why are

[GitHub] [spark] SparkQA commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
SparkQA commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727269325 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35701/

[GitHub] [spark] dongjoon-hyun commented on pull request #30357: [SPARK-33432][SQL] SQL parser should use active SQLConf

2020-11-14 Thread GitBox
dongjoon-hyun commented on pull request #30357: URL: https://github.com/apache/spark/pull/30357#issuecomment-727268825 If this is required at `branch-3.0` as described in JIRA, please make a backporting PR, @luluorta .

[GitHub] [spark] dongjoon-hyun closed pull request #30357: [SPARK-33432][SQL] SQL parser should use active SQLConf

2020-11-14 Thread GitBox
dongjoon-hyun closed pull request #30357: URL: https://github.com/apache/spark/pull/30357

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727268527

[GitHub] [spark] AmplabJenkins commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
AmplabJenkins commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727268527

[GitHub] [spark] SparkQA removed a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
SparkQA removed a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727252309 **[Test build #131096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131096/testReport)** for PR 30378 at commit

[GitHub] [spark] SparkQA commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
SparkQA commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727268302 **[Test build #131096 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131096/testReport)** for PR 30378 at commit

[GitHub] [spark] Victsm commented on a change in pull request #30312: [WIP][SPARK-32917][SHUFFLE][CORE][test-maven][test-hadoop2.7] Adds support for executors to push shuffle blocks after successful m

2020-11-14 Thread GitBox
Victsm commented on a change in pull request #30312: URL: https://github.com/apache/spark/pull/30312#discussion_r523465292 ## File path: core/src/main/scala/org/apache/spark/shuffle/ShuffleWriter.scala ## @@ -17,18 +17,466 @@ package org.apache.spark.shuffle -import

[GitHub] [spark] Victsm commented on a change in pull request #30312: [WIP][SPARK-32917][SHUFFLE][CORE][test-maven][test-hadoop2.7] Adds support for executors to push shuffle blocks after successful m

2020-11-14 Thread GitBox
Victsm commented on a change in pull request #30312: URL: https://github.com/apache/spark/pull/30312#discussion_r523465089 ## File path: core/src/main/scala/org/apache/spark/shuffle/ShuffleWriter.scala ## @@ -17,18 +17,466 @@ package org.apache.spark.shuffle -import

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30357: [SPARK-33432][SQL] SQL parser should use active SQLConf

2020-11-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #30357: URL: https://github.com/apache/spark/pull/30357#discussion_r523464426 ## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala ## @@ -200,12 +200,12 @@ class

[GitHub] [spark] dongjoon-hyun commented on pull request #30358: [SPARK-33394][SQL] Throw `NoSuchNamespaceException` for not existing namespace in `InMemoryTableCatalog.listTables()`

2020-11-14 Thread GitBox
dongjoon-hyun commented on pull request #30358: URL: https://github.com/apache/spark/pull/30358#issuecomment-727265759 I'll leave this PR for @cloud-fan 's last sign-off.

[GitHub] [spark] Victsm commented on a change in pull request #30312: [WIP][SPARK-32917][SHUFFLE][CORE][test-maven][test-hadoop2.7] Adds support for executors to push shuffle blocks after successful m

2020-11-14 Thread GitBox
Victsm commented on a change in pull request #30312: URL: https://github.com/apache/spark/pull/30312#discussion_r523463471 ## File path: common/network-common/src/main/java/org/apache/spark/network/client/TransportClientFactory.java ## @@ -254,7 +254,7 @@ TransportClient

[GitHub] [spark] SparkQA commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
SparkQA commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727264772 **[Test build #131098 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131098/testReport)** for PR 30378 at commit

[GitHub] [spark] dongjoon-hyun commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
dongjoon-hyun commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727263989 Sure! Given the importance of Hadoop 2, it sounds good to me.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-727263672

[GitHub] [spark] SparkQA commented on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2020-11-14 Thread GitBox
SparkQA commented on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-727263669 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35700/

[GitHub] [spark] AmplabJenkins commented on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2020-11-14 Thread GitBox
AmplabJenkins commented on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-727263672

[GitHub] [spark] mridulm edited a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
mridulm edited a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727261552 > I mean the result comes in 15 minutes and it's hard to consider it a flaky job. I am concerned sometimes we might just not see it - given it is listed at

[GitHub] [spark] mridulm commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
mridulm commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727261552 > I mean the result comes in 15 minutes and it's hard to consider it a flaky job. I am concerned sometimes we might just not see it - given it is listed at the bottom

[GitHub] [spark] mridulm commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
mridulm commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727261337 +CC @shaneknapp Will it be possible to do the same for Jenkins? It will ensure consistency and prevent accidental merges.

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727260912 Ya. That's true. However, `GitHub Action` is very fast and a `Hadoop 2` build compilation error is easily detected. I mean the result comes in 15 minutes and it's

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727260912 Ya. That's true. However, `GitHub Action` is very fast and a `Hadoop 2` build compilation error is easily detected. I mean the result comes in 15 minutes and it's

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727260912 Ya. That's true. However, `GitHub Action` is very fast and a `Hadoop 2` build compilation error is easily detected. I mean the result comes in 15 minutes and it's

[GitHub] [spark] dongjoon-hyun commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
dongjoon-hyun commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727261032 BTW, I have only read-only access to the Jenkins configuration.

[GitHub] [spark] dongjoon-hyun edited a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
dongjoon-hyun edited a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727260912 Ya. That's true. However, `GitHub Action` is very fast and a `Hadoop 2` build compilation error is easily detected. I mean the result comes in 15 minutes and it's

[GitHub] [spark] dongjoon-hyun commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
dongjoon-hyun commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727260912 Ya. That's true. However, `GitHub Action` is very fast and a `Hadoop 2` build compilation error is easily detected. I mean the result comes in 15 minutes and it's hard to

[GitHub] [spark] SparkQA commented on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2020-11-14 Thread GitBox
SparkQA commented on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-727260461 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35700/

[GitHub] [spark] mridulm commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
mridulm commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727259556 Thanks for making this change, @dongjoon-hyun! This applies only to GitHub Actions, right? I assumed that currently the procedure was to wait for either Jenkins or GitHub

[GitHub] [spark] AmplabJenkins commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
AmplabJenkins commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727259432 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
SparkQA commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727259429 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35699/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727259432 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30377: [SPARK-33453][SQL][TESTS] Unify v1 and v2 SHOW PARTITIONS tests

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #30377: URL: https://github.com/apache/spark/pull/30377#issuecomment-727253695 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
SparkQA commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727256937 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35699/

[GitHub] [spark] SparkQA commented on pull request #29695: [SPARK-22390][SPARK-32833][SQL] [WIP]JDBC V2 Datasource aggregate push down

2020-11-14 Thread GitBox
SparkQA commented on pull request #29695: URL: https://github.com/apache/spark/pull/29695#issuecomment-727255673 **[Test build #131097 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131097/testReport)** for PR 29695 at commit

[GitHub] [spark] viirya commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-14 Thread GitBox
viirya commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727255265 > It would be great if we can have a benchmark suite (from the code in the PR description), @viirya . Yes, @dongjoon-hyun, I remember it. :) I will add a benchmark suite

[GitHub] [spark] dongjoon-hyun commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-14 Thread GitBox
dongjoon-hyun commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727254222 It would be great if we can have a benchmark suite (from the code in the PR description), @viirya . This

[GitHub] [spark] AmplabJenkins commented on pull request #30377: [SPARK-33453][SQL][TESTS] Unify v1 and v2 SHOW PARTITIONS tests

2020-11-14 Thread GitBox
AmplabJenkins commented on pull request #30377: URL: https://github.com/apache/spark/pull/30377#issuecomment-727253695 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] SparkQA commented on pull request #30377: [SPARK-33453][SQL][TESTS] Unify v1 and v2 SHOW PARTITIONS tests

2020-11-14 Thread GitBox
SparkQA commented on pull request #30377: URL: https://github.com/apache/spark/pull/30377#issuecomment-727253690 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35698/

[GitHub] [spark] SparkQA commented on pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
SparkQA commented on pull request #30378: URL: https://github.com/apache/spark/pull/30378#issuecomment-727252309 **[Test build #131096 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131096/testReport)** for PR 30378 at commit

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
dongjoon-hyun commented on a change in pull request #30378: URL: https://github.com/apache/spark/pull/30378#discussion_r523452097 ## File path: .github/workflows/build_and_test.yml ## @@ -419,3 +419,24 @@ jobs: run: | ./dev/change-scala-version.sh 2.13

[GitHub] [spark] dongjoon-hyun commented on pull request #30375: [SPARK-33288][YARN][FOLLOW-UP][test-hadoop2.7] Fix type mismatch error

2020-11-14 Thread GitBox
dongjoon-hyun commented on pull request #30375: URL: https://github.com/apache/spark/pull/30375#issuecomment-727252100 I made a PR to protect Hadoop 2 profile. Could you review that, @mridulm ? - https://github.com/apache/spark/pull/30378

[GitHub] [spark] dongjoon-hyun opened a new pull request #30378: [SPARK-33454][INFRA] Add GitHub Action job for Hadoop 2

2020-11-14 Thread GitBox
dongjoon-hyun opened a new pull request #30378: URL: https://github.com/apache/spark/pull/30378 ### What changes were proposed in this pull request? This PR aims to protect the `Hadoop 2.x` profile in Apache Spark 3.1+. ### Why are the changes needed? Since we switch our

[GitHub] [spark] AmplabJenkins commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-14 Thread GitBox
AmplabJenkins commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727250946 This is an automated message from the Apache Git Service. To respond to the message, please log on to

[GitHub] [spark] AmplabJenkins removed a comment on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-14 Thread GitBox
AmplabJenkins removed a comment on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727250946 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-14 Thread GitBox
SparkQA commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727250940 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35697/

[GitHub] [spark] dongjoon-hyun commented on pull request #30375: [SPARK-33288][YARN][FOLLOW-UP][test-hadoop2.7] Fix type mismatch error

2020-11-14 Thread GitBox
dongjoon-hyun commented on pull request #30375: URL: https://github.com/apache/spark/pull/30375#issuecomment-727250699 My initial question was about dropping the `hadoop-2.x` profile completely, like we did for `hive-1.2`, @mridulm. It was just a question about the possibility. If you

[GitHub] [spark] SparkQA commented on pull request #30377: [SPARK-33453][SQL][TESTS] Unify v1 and v2 SHOW PARTITIONS tests

2020-11-14 Thread GitBox
SparkQA commented on pull request #30377: URL: https://github.com/apache/spark/pull/30377#issuecomment-727250156 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35698/

[GitHub] [spark] SparkQA commented on pull request #30341: [SPARK-33427][SQL] Add subexpression elimination for interpreted expression evaluation

2020-11-14 Thread GitBox
SparkQA commented on pull request #30341: URL: https://github.com/apache/spark/pull/30341#issuecomment-727248167 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35697/

[GitHub] [spark] viirya commented on a change in pull request #30368: [SPARK-33442][SQL] Change Combine Limit to Eliminate limit using max row

2020-11-14 Thread GitBox
viirya commented on a change in pull request #30368: URL: https://github.com/apache/spark/pull/30368#discussion_r523447490 ## File path: sql/core/src/test/resources/tpcds-plan-stability/approved-plans-v1_4/q92/explain.txt ## @@ -1,5 +1,5 @@ == Physical Plan ==

[GitHub] [spark] mridulm commented on pull request #30375: [SPARK-33288][YARN][FOLLOW-UP][test-hadoop2.7] Fix type mismatch error

2020-11-14 Thread GitBox
mridulm commented on pull request #30375: URL: https://github.com/apache/spark/pull/30375#issuecomment-727245118 > BTW, @wangyum and @mridulm and @tgravescs. Do you think it's possible for us to start a discussion about dropping Hadoop 2.7 in Apache Spark 3.2? Is the proposal to drop

[GitHub] [spark] SparkQA commented on pull request #30377: [SPARK-33453][SQL][TESTS] Unify v1 and v2 SHOW PARTITIONS tests

2020-11-14 Thread GitBox
SparkQA commented on pull request #30377: URL: https://github.com/apache/spark/pull/30377#issuecomment-727244935 **[Test build #131095 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/131095/testReport)** for PR 30377 at commit
