[GitHub] [spark] SparkQA commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue run or

2021-05-09 Thread GitBox
SparkQA commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836216634 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] SparkQA commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue run or

2021-05-09 Thread GitBox
SparkQA commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836192784 **[Test build #138321 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138321/testReport)** for PR 32399 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue

2021-05-09 Thread GitBox
SparkQA removed a comment on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836190123 **[Test build #138320 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138320/testReport)** for PR 32399 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still co

2021-05-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836190600 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138320/

[GitHub] [spark] SparkQA commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue run or

2021-05-09 Thread GitBox
SparkQA commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836190579 **[Test build #138320 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138320/testReport)** for PR 32399 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue r

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836190600 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138320/

[GitHub] [spark] SparkQA commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue run or

2021-05-09 Thread GitBox
SparkQA commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836190123 **[Test build #138320 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138320/testReport)** for PR 32399 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue

2021-05-09 Thread GitBox
SparkQA removed a comment on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836187589 **[Test build #138319 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138319/testReport)** for PR 32399 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still co

2021-05-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836188026 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138319/

[GitHub] [spark] AmplabJenkins commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue r

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836188026 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138319/

[GitHub] [spark] SparkQA commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue run or

2021-05-09 Thread GitBox
SparkQA commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836188006 **[Test build #138319 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138319/testReport)** for PR 32399 at commit

[GitHub] [spark] SparkQA commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue run or

2021-05-09 Thread GitBox
SparkQA commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836187589 **[Test build #138319 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138319/testReport)** for PR 32399 at commit

[GitHub] [spark] SparkQA commented on pull request #32031: [WIP] Initial work of Remote Shuffle Service on Kubernetes

2021-05-09 Thread GitBox
SparkQA commented on pull request #32031: URL: https://github.com/apache/spark/pull/32031#issuecomment-836185216 **[Test build #138318 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138318/testReport)** for PR 32031 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32475: [SPARK-34775][SQL] Push down limit through window when partitionSpec is not empty

2021-05-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32475: URL: https://github.com/apache/spark/pull/32475#issuecomment-836183250 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42837/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32487: URL: https://github.com/apache/spark/pull/32487#issuecomment-836183252 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42836/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32482: [SPARK-35332][SQL] Make cache plan disable configs configurable

2021-05-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32482: URL: https://github.com/apache/spark/pull/32482#issuecomment-836183249 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42838/

[GitHub] [spark] AmplabJenkins commented on pull request #32482: [SPARK-35332][SQL] Make cache plan disable configs configurable

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32482: URL: https://github.com/apache/spark/pull/32482#issuecomment-836183249 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42838/

[GitHub] [spark] AmplabJenkins commented on pull request #32475: [SPARK-34775][SQL] Push down limit through window when partitionSpec is not empty

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32475: URL: https://github.com/apache/spark/pull/32475#issuecomment-836183250 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42837/

[GitHub] [spark] AmplabJenkins commented on pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32487: URL: https://github.com/apache/spark/pull/32487#issuecomment-836183252 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42836/

[GitHub] [spark] SparkQA commented on pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
SparkQA commented on pull request #32487: URL: https://github.com/apache/spark/pull/32487#issuecomment-836177985

[GitHub] [spark] SparkQA commented on pull request #32482: [SPARK-35332][SQL] Make cache plan disable configs configurable

2021-05-09 Thread GitBox
SparkQA commented on pull request #32482: URL: https://github.com/apache/spark/pull/32482#issuecomment-836176561

[GitHub] [spark] SparkQA commented on pull request #32475: [SPARK-34775][SQL] Push down limit through window when partitionSpec is not empty

2021-05-09 Thread GitBox
SparkQA commented on pull request #32475: URL: https://github.com/apache/spark/pull/32475#issuecomment-836174973 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42837/

[GitHub] [spark] SparkQA commented on pull request #32475: [SPARK-34775][SQL] Push down limit through window when partitionSpec is not empty

2021-05-09 Thread GitBox
SparkQA commented on pull request #32475: URL: https://github.com/apache/spark/pull/32475#issuecomment-836171886 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42837/

[GitHub] [spark] AmplabJenkins commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-836153120 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138312/

[GitHub] [spark] SparkQA commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue run or

2021-05-09 Thread GitBox
SparkQA commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836152449 **[Test build #138317 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138317/testReport)** for PR 32399 at commit

[GitHub] [spark] SparkQA removed a comment on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
SparkQA removed a comment on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-835964738 **[Test build #138312 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138312/testReport)** for PR 32473 at commit

[GitHub] [spark] SparkQA commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
SparkQA commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-836151733 **[Test build #138312 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138312/testReport)** for PR 32473 at commit

[GitHub] [spark] srowen commented on pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
srowen commented on pull request #32487: URL: https://github.com/apache/spark/pull/32487#issuecomment-836150808 Getting pretty big! but OK if needed.

[GitHub] [spark] ulysses-you commented on a change in pull request #32482: [SPARK-35332][SQL] Make cache plan disable configs configurable

2021-05-09 Thread GitBox
ulysses-you commented on a change in pull request #32482: URL: https://github.com/apache/spark/pull/32482#discussion_r629039347 ## File path: sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala ## @@ -1175,7 +1175,7 @@ class CachedTableSuite extends QueryTest

[GitHub] [spark] SparkQA commented on pull request #32482: [SPARK-35332][SQL] Make cache plan disable configs configurable

2021-05-09 Thread GitBox
SparkQA commented on pull request #32482: URL: https://github.com/apache/spark/pull/32482#issuecomment-836147815 **[Test build #138316 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138316/testReport)** for PR 32482 at commit

[GitHub] [spark] ulysses-you commented on pull request #32482: [SPARK-35332][SQL] Make cache plan disable configs configurable

2021-05-09 Thread GitBox
ulysses-you commented on pull request #32482: URL: https://github.com/apache/spark/pull/32482#issuecomment-836147309 Thank you @maropu @c21 @dongjoon-hyun. Agreed, the current config seems like overkill for users; it's better to just make it `enabled`. Refactor this PR to

[GitHub] [spark] SparkQA commented on pull request #32475: [SPARK-34775][SQL] Push down limit through window when partitionSpec is not empty

2021-05-09 Thread GitBox
SparkQA commented on pull request #32475: URL: https://github.com/apache/spark/pull/32475#issuecomment-83614 **[Test build #138315 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138315/testReport)** for PR 32475 at commit

[GitHub] [spark] SparkQA commented on pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
SparkQA commented on pull request #32487: URL: https://github.com/apache/spark/pull/32487#issuecomment-836145492 **[Test build #138314 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138314/testReport)** for PR 32487 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32488: [SPARK-35316][SQL] UnwrapCastInBinaryComparison support In/InSet predicate

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32488: URL: https://github.com/apache/spark/pull/32488#issuecomment-836144135 Can one of the admins verify this patch?

[GitHub] [spark] cfmcgrady opened a new pull request #32488: [SPARK-35316][SQL] UnwrapCastInBinaryComparison support In/InSet predicate

2021-05-09 Thread GitBox
cfmcgrady opened a new pull request #32488: URL: https://github.com/apache/spark/pull/32488 ### What changes were proposed in this pull request? This PR adds In/InSet predicate support for `UnwrapCastInBinaryComparison`. The current implementation doesn't push down filters for

[GitHub] [spark] c21 commented on a change in pull request #32476: [SPARK-35349][SQL] Add code-gen for left/right outer sort merge join

2021-05-09 Thread GitBox
c21 commented on a change in pull request #32476: URL: https://github.com/apache/spark/pull/32476#discussion_r629027318 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala ## @@ -418,115 +443,140 @@ case class SortMergeJoinExec(

[GitHub] [spark] SparkQA removed a comment on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue

2021-05-09 Thread GitBox
SparkQA removed a comment on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836035623 **[Test build #138313 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138313/testReport)** for PR 32399 at commit

[GitHub] [spark] viirya commented on a change in pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
viirya commented on a change in pull request #32487: URL: https://github.com/apache/spark/pull/32487#discussion_r629025675 ## File path: dev/create-release/release-build.sh ## @@ -210,6 +210,8 @@ if [[ "$1" == "package" ]]; then PYSPARK_VERSION=`echo "$SPARK_VERSION" |

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still co

2021-05-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836109653 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138313/

[GitHub] [spark] AmplabJenkins commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue r

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836109653 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138313/

[GitHub] [spark] SparkQA commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue run or

2021-05-09 Thread GitBox
SparkQA commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836108663 **[Test build #138313 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138313/testReport)** for PR 32399 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-836106608 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138311/

[GitHub] [spark] AmplabJenkins commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-836106608 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138311/

[GitHub] [spark] wangyum commented on a change in pull request #32410: [SPARK-35286][SQL] Replace SessionState.start with SessionState.setCurrentSessionState

2021-05-09 Thread GitBox
wangyum commented on a change in pull request #32410: URL: https://github.com/apache/spark/pull/32410#discussion_r629020979 ## File path: sql/hive-thriftserver/src/main/java/org/apache/hive/service/cli/session/HiveSessionImpl.java ## @@ -141,7 +141,7 @@ public void open(Map

[GitHub] [spark] SparkQA removed a comment on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
SparkQA removed a comment on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-835906957 **[Test build #138311 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138311/testReport)** for PR 32473 at commit

[GitHub] [spark] HeartSaVioR commented on pull request #25911: [SPARK-29223][SQL][SS] Enable global timestamp per topic while specifying offset by timestamp in Kafka source

2021-05-09 Thread GitBox
HeartSaVioR commented on pull request #25911: URL: https://github.com/apache/spark/pull/25911#issuecomment-836089685 I see actual customer demand for this: a single topic has 100+ partitions, and it's awkward to make them craft JSON that lists 100+ partitions for the same timestamp.

[GitHub] [spark] SparkQA commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
SparkQA commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-836088555 **[Test build #138311 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138311/testReport)** for PR 32473 at commit

[GitHub] [spark] c21 commented on pull request #32480: [SPARK-35354][SQL] Replace BaseJoinExec with ShuffledJoin in CoalesceBucketsInJoin

2021-05-09 Thread GitBox
c21 commented on pull request #32480: URL: https://github.com/apache/spark/pull/32480#issuecomment-836086921 Thank you @maropu for review!

[GitHub] [spark] beliefer commented on a change in pull request #32442: [SPARK-35283][SQL] Support query some DDL with CTES

2021-05-09 Thread GitBox
beliefer commented on a change in pull request #32442: URL: https://github.com/apache/spark/pull/32442#discussion_r629016341 ## File path: sql/core/src/test/resources/sql-tests/inputs/cte-ddl.sql ## @@ -0,0 +1,65 @@ +-- Test data. +CREATE NAMESPACE IF NOT EXISTS

[GitHub] [spark] LuciferYang closed pull request #32374: [WIP][SPARK-35253][BUILD][SQL] Upgrade Janino from 3.0.16 to 3.1.3

2021-05-09 Thread GitBox
LuciferYang closed pull request #32374: URL: https://github.com/apache/spark/pull/32374

[GitHub] [spark] LuciferYang commented on pull request #32374: [WIP][SPARK-35253][BUILD][SQL] Upgrade Janino from 3.0.16 to 3.1.3

2021-05-09 Thread GitBox
LuciferYang commented on pull request #32374: URL: https://github.com/apache/spark/pull/32374#issuecomment-836082137 close this because SPARK-35253

[GitHub] [spark] LuciferYang commented on a change in pull request #32455: [SPARK-35253][SQL][BUILD] Bump up the janino version to v3.1.4

2021-05-09 Thread GitBox
LuciferYang commented on a change in pull request #32455: URL: https://github.com/apache/spark/pull/32455#discussion_r629014929 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala ## @@ -1434,9 +1435,10 @@ object

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still co

2021-05-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836069987 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42835/

[GitHub] [spark] AmplabJenkins commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue r

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836069987 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42835/

[GitHub] [spark] zhengruifeng commented on pull request #32350: [SPARK-35231][SQL] logical.Range override maxRowsPerPartition

2021-05-09 Thread GitBox
zhengruifeng commented on pull request #32350: URL: https://github.com/apache/spark/pull/32350#issuecomment-836067509 Thank you so much!

[GitHub] [spark] SparkQA commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue run or

2021-05-09 Thread GitBox
SparkQA commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836058502 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42835/

[GitHub] [spark] maropu commented on a change in pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
maropu commented on a change in pull request #32487: URL: https://github.com/apache/spark/pull/32487#discussion_r629004607 ## File path: dev/create-release/release-build.sh ## @@ -210,6 +210,8 @@ if [[ "$1" == "package" ]]; then PYSPARK_VERSION=`echo "$SPARK_VERSION" |

[GitHub] [spark] huaxingao commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
huaxingao commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-836051980 @dongjoon-hyun > Shall we change the grouping in order to see the trend according to the block size? Sorry, I just saw your comment. I guess it might be a little

[GitHub] [spark] huaxingao commented on a change in pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
huaxingao commented on a change in pull request #32473: URL: https://github.com/apache/spark/pull/32473#discussion_r629004056 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/BloomFilterBenchmark.scala ## @@ -81,8 +80,57 @@ object

[GitHub] [spark] SparkQA commented on pull request #32399: [SPARK-35271][ML][PYSPARK] Fix: After CrossValidator/TrainValidationSplit fit raised error, some backgroud threads may still continue run or

2021-05-09 Thread GitBox
SparkQA commented on pull request #32399: URL: https://github.com/apache/spark/pull/32399#issuecomment-836035623 **[Test build #138313 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138313/testReport)** for PR 32399 at commit

[GitHub] [spark] AmplabJenkins commented on pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32487: URL: https://github.com/apache/spark/pull/32487#issuecomment-836035119 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138310/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-836035114

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32487: URL: https://github.com/apache/spark/pull/32487#issuecomment-836035119 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/138310/

[GitHub] [spark] AmplabJenkins commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-836035114

[GitHub] [spark] maropu commented on pull request #32480: [SPARK-35354][SQL] Replace BaseJoinExec with ShuffledJoin in CoalesceBucketsInJoin

2021-05-09 Thread GitBox
maropu commented on pull request #32480: URL: https://github.com/apache/spark/pull/32480#issuecomment-836019661 Thank you, @c21. Merged to master.

[GitHub] [spark] maropu closed pull request #32480: [SPARK-35354][SQL] Replace BaseJoinExec with ShuffledJoin in CoalesceBucketsInJoin

2021-05-09 Thread GitBox
maropu closed pull request #32480: URL: https://github.com/apache/spark/pull/32480

[GitHub] [spark] SparkQA commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
SparkQA commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-835996955

[GitHub] [spark] wangyum commented on a change in pull request #32442: [SPARK-35283][SQL] Support query some DDL with CTES

2021-05-09 Thread GitBox
wangyum commented on a change in pull request #32442: URL: https://github.com/apache/spark/pull/32442#discussion_r628987144 ## File path: sql/core/src/test/resources/sql-tests/inputs/cte-ddl.sql ## @@ -0,0 +1,65 @@ +-- Test data. +CREATE NAMESPACE IF NOT EXISTS

[GitHub] [spark] maropu commented on a change in pull request #32476: [SPARK-35349][SQL] Add code-gen for left/right outer sort merge join

2021-05-09 Thread GitBox
maropu commented on a change in pull request #32476: URL: https://github.com/apache/spark/pull/32476#discussion_r628986405 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala ## @@ -418,115 +443,140 @@ case class

[GitHub] [spark] wangyum commented on a change in pull request #32442: [SPARK-35283][SQL] Support query some DDL with CTES

2021-05-09 Thread GitBox
wangyum commented on a change in pull request #32442: URL: https://github.com/apache/spark/pull/32442#discussion_r628980328 ## File path: sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4 ## @@ -375,8 +363,18 @@ ctes : WITH namedQuery (','

[GitHub] [spark] c21 commented on a change in pull request #32476: [SPARK-35349][SQL] Add code-gen for left/right outer sort merge join

2021-05-09 Thread GitBox
c21 commented on a change in pull request #32476: URL: https://github.com/apache/spark/pull/32476#discussion_r628977186 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala ## @@ -418,115 +443,140 @@ case class SortMergeJoinExec(

[GitHub] [spark] c21 commented on a change in pull request #32476: [SPARK-35349][SQL] Add code-gen for left/right outer sort merge join

2021-05-09 Thread GitBox
c21 commented on a change in pull request #32476: URL: https://github.com/apache/spark/pull/32476#discussion_r628976694 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala ## @@ -554,67 +604,118 @@ case class SortMergeJoinExec(

[GitHub] [spark] wangyum commented on a change in pull request #32442: [SPARK-35283][SQL] Support query some DDL with CTES

2021-05-09 Thread GitBox
wangyum commented on a change in pull request #32442: URL: https://github.com/apache/spark/pull/32442#discussion_r628976181 ## File path: sql/core/src/test/resources/sql-tests/inputs/cte-ddl.sql ## @@ -0,0 +1,65 @@ +-- Test data. +CREATE NAMESPACE IF NOT EXISTS

[GitHub] [spark] SparkQA removed a comment on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
SparkQA removed a comment on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-835879367 **[Test build #138309 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138309/testReport)** for PR 32473 at commit

[GitHub] [spark] maropu commented on a change in pull request #32476: [SPARK-35349][SQL] Add code-gen for left/right outer sort merge join

2021-05-09 Thread GitBox
maropu commented on a change in pull request #32476: URL: https://github.com/apache/spark/pull/32476#discussion_r628974305 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala ## @@ -418,115 +443,140 @@ case class

[GitHub] [spark] SparkQA commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
SparkQA commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-835985975 **[Test build #138309 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138309/testReport)** for PR 32473 at commit

[GitHub] [spark] maropu commented on a change in pull request #32476: [SPARK-35349][SQL] Add code-gen for left/right outer sort merge join

2021-05-09 Thread GitBox
maropu commented on a change in pull request #32476: URL: https://github.com/apache/spark/pull/32476#discussion_r628972459 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala ## @@ -418,115 +443,140 @@ case class

[GitHub] [spark] maropu commented on a change in pull request #32476: [SPARK-35349][SQL] Add code-gen for left/right outer sort merge join

2021-05-09 Thread GitBox
maropu commented on a change in pull request #32476: URL: https://github.com/apache/spark/pull/32476#discussion_r628969762 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala ## @@ -418,115 +443,140 @@ case class

[GitHub] [spark] SparkQA removed a comment on pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
SparkQA removed a comment on pull request #32487: URL: https://github.com/apache/spark/pull/32487#issuecomment-835906899 **[Test build #138310 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138310/testReport)** for PR 32487 at commit

[GitHub] [spark] SparkQA commented on pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
SparkQA commented on pull request #32487: URL: https://github.com/apache/spark/pull/32487#issuecomment-835979912 **[Test build #138310 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138310/testReport)** for PR 32487 at commit

[GitHub] [spark] maropu commented on pull request #32476: [SPARK-35349][SQL] Add code-gen for left/right outer sort merge join

2021-05-09 Thread GitBox
maropu commented on pull request #32476: URL: https://github.com/apache/spark/pull/32476#issuecomment-835977883 > @maropu - JoinBenchmark has only inner sort merge join, but not left/right outer join. So this PR does not affect the result of benchmark as it is. Shall we have a followup PR

[GitHub] [spark] c21 commented on pull request #32476: [SPARK-35349][SQL] Add code-gen for left/right outer sort merge join

2021-05-09 Thread GitBox
c21 commented on pull request #32476: URL: https://github.com/apache/spark/pull/32476#issuecomment-835976988 @maropu - `JoinBenchmark` has only inner sort merge join, but not left/right outer join. So this PR does not affect the result of benchmark as it is. Shall we have a followup PR to

[GitHub] [spark] wangyum commented on a change in pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
wangyum commented on a change in pull request #32473: URL: https://github.com/apache/spark/pull/32473#discussion_r628967020 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/BloomFilterBenchmark.scala ## @@ -81,8 +80,57 @@ object

[GitHub] [spark] c21 commented on a change in pull request #32476: [SPARK-35349][SQL] Add code-gen for left/right outer sort merge join

2021-05-09 Thread GitBox
c21 commented on a change in pull request #32476: URL: https://github.com/apache/spark/pull/32476#discussion_r628966219 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala ## @@ -353,12 +353,37 @@ case class SortMergeJoinExec(

[GitHub] [spark] wangyum commented on pull request #29642: [SPARK-32792][SQL] Improve Parquet In filter pushdown

2021-05-09 Thread GitBox
wangyum commented on pull request #29642: URL: https://github.com/apache/spark/pull/29642#issuecomment-835972741 @dongjoon-hyun This PR only improves the `In` predicate. I have added the improvement part to the PR description. -- This is an automated message from the Apache Git

[GitHub] [spark] maropu commented on pull request #32476: [SPARK-35349][SQL] Add code-gen for left/right outer sort merge join

2021-05-09 Thread GitBox
maropu commented on pull request #32476: URL: https://github.com/apache/spark/pull/32476#issuecomment-835970308 Could you update the `JoinBenchmark` results, too? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the

[GitHub] [spark] maropu commented on a change in pull request #32476: [SPARK-35349][SQL] Add code-gen for left/right outer sort merge join

2021-05-09 Thread GitBox
maropu commented on a change in pull request #32476: URL: https://github.com/apache/spark/pull/32476#discussion_r628960873 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/SortMergeJoinExec.scala ## @@ -418,115 +443,140 @@ case class

[GitHub] [spark] wangyum commented on a change in pull request #29642: [SPARK-32792][SQL] Improve Parquet In filter pushdown

2021-05-09 Thread GitBox
wangyum commented on a change in pull request #29642: URL: https://github.com/apache/spark/pull/29642#discussion_r628965380 ## File path: sql/core/benchmarks/FilterPushdownBenchmark-jdk11-results.txt ## @@ -2,669 +2,669 @@ Pushdown for many distinct value case

[GitHub] [spark] SparkQA commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
SparkQA commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-835964738 **[Test build #138312 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/138312/testReport)** for PR 32473 at commit

[GitHub] [spark] wangyum commented on a change in pull request #29642: [SPARK-32792][SQL] Improve Parquet In filter pushdown

2021-05-09 Thread GitBox
wangyum commented on a change in pull request #29642: URL: https://github.com/apache/spark/pull/29642#discussion_r628964580 ## File path: sql/core/benchmarks/FilterPushdownBenchmark-jdk11-results.txt ## @@ -2,669 +2,669 @@ Pushdown for many distinct value case

[GitHub] [spark] github-actions[bot] commented on pull request #31296: [SPARK-34205][SQL][SS] Add pipe to Dataset to enable Streaming Dataset pipe

2021-05-09 Thread GitBox
github-actions[bot] commented on pull request #31296: URL: https://github.com/apache/spark/pull/31296#issuecomment-835957576 We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32487: URL: https://github.com/apache/spark/pull/32487#issuecomment-835929791 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42832/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
AmplabJenkins removed a comment on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-835929789 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42833/

[GitHub] [spark] AmplabJenkins commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-835929789 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42833/ --

[GitHub] [spark] AmplabJenkins commented on pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
AmplabJenkins commented on pull request #32487: URL: https://github.com/apache/spark/pull/32487#issuecomment-835929791 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/42832/ --

[GitHub] [spark] viirya commented on a change in pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
viirya commented on a change in pull request #32487: URL: https://github.com/apache/spark/pull/32487#discussion_r628955847 ## File path: dev/create-release/release-build.sh ## @@ -210,6 +210,8 @@ if [[ "$1" == "package" ]]; then PYSPARK_VERSION=`echo "$SPARK_VERSION" |

[GitHub] [spark] dongjoon-hyun commented on pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
dongjoon-hyun commented on pull request #32487: URL: https://github.com/apache/spark/pull/32487#issuecomment-835927141 Also, cc @srowen -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #32487: [SPARK-35358][BUILD] Increase maximum Java heap used for release build to avoid OOM

2021-05-09 Thread GitBox
dongjoon-hyun commented on a change in pull request #32487: URL: https://github.com/apache/spark/pull/32487#discussion_r628955769 ## File path: dev/create-release/release-build.sh ## @@ -210,6 +210,8 @@ if [[ "$1" == "package" ]]; then PYSPARK_VERSION=`echo
