[GitHub] [spark] sarutak opened a new pull request #32691: Docker integration test ga take2

2021-05-27 Thread GitBox
sarutak opened a new pull request #32691: URL: https://github.com/apache/spark/pull/32691 ### What changes were proposed in this pull request? This PR proposes to add `docker-integration-tests` to `run-tests.py` and GA. #32631 was once merged, but there was a lack of consideration.

[GitHub] [spark] SparkQA commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
SparkQA commented on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850157887 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43564/ -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #32653: [SPARK-35312][SS] Introduce new Option in Kafka source to specify minimum number of records to read per trigger

2021-05-27 Thread GitBox
SparkQA commented on pull request #32653: URL: https://github.com/apache/spark/pull/32653#issuecomment-850152542 **[Test build #139048 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139048/testReport)** for PR 32653 at commit

[GitHub] [spark] SparkQA commented on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
SparkQA commented on pull request #32688: URL: https://github.com/apache/spark/pull/32688#issuecomment-850150535 **[Test build #139047 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139047/testReport)** for PR 32688 at commit

[GitHub] [spark] LuciferYang commented on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
LuciferYang commented on pull request #32688: URL: https://github.com/apache/spark/pull/32688#issuecomment-850149331 thx @HyukjinKwon

[GitHub] [spark] SparkQA commented on pull request #32690: [SPARK-35510][PYTHON] Fix and reenable test_stats_on_non_numeric_columns_should_be_discarded_if_numeric_only_is_true

2021-05-27 Thread GitBox
SparkQA commented on pull request #32690: URL: https://github.com/apache/spark/pull/32690#issuecomment-850148551 **[Test build #139046 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139046/testReport)** for PR 32690 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-27 Thread GitBox
AmplabJenkins removed a comment on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-850146664 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43565/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
AmplabJenkins removed a comment on pull request #32688: URL: https://github.com/apache/spark/pull/32688#issuecomment-850146667 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139039/

[GitHub] [spark] sarutak commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox
sarutak commented on a change in pull request #32631: URL: https://github.com/apache/spark/pull/32631#discussion_r641279731 ## File path: .github/workflows/build_and_test.yml ## @@ -625,3 +625,83 @@ jobs: with: name: unit-tests-log-tpcds--8-hadoop3.2-hive2.3

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
AmplabJenkins removed a comment on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850092153

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
AmplabJenkins removed a comment on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850146665

[GitHub] [spark] AmplabJenkins commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-27 Thread GitBox
AmplabJenkins commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-850146664 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43565/

[GitHub] [spark] AmplabJenkins commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
AmplabJenkins commented on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850146665

[GitHub] [spark] AmplabJenkins commented on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
AmplabJenkins commented on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850146669 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139040/

[GitHub] [spark] AmplabJenkins commented on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
AmplabJenkins commented on pull request #32688: URL: https://github.com/apache/spark/pull/32688#issuecomment-850146667 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139039/

[GitHub] [spark] HyukjinKwon edited a comment on pull request #32689: [SPARK-35552][SQL] Make query stage materialized more readable

2021-05-27 Thread GitBox
HyukjinKwon edited a comment on pull request #32689: URL: https://github.com/apache/spark/pull/32689#issuecomment-850145032 @LuciferYang, the docker test failure should be now fixed in the latest master branch.

[GitHub] [spark] HyukjinKwon commented on pull request #32689: [SPARK-35552][SQL] Make query stage materialized more readable

2021-05-27 Thread GitBox
HyukjinKwon commented on pull request #32689: URL: https://github.com/apache/spark/pull/32689#issuecomment-850145032 @LuciferYang, the docker test failure should be fixed in the latest master branch.

[GitHub] [spark] HyukjinKwon edited a comment on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
HyukjinKwon edited a comment on pull request #32688: URL: https://github.com/apache/spark/pull/32688#issuecomment-850144781 @LuciferYang, the test failure should be now fixed in the latest master branch.

[GitHub] [spark] HyukjinKwon commented on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
HyukjinKwon commented on pull request #32688: URL: https://github.com/apache/spark/pull/32688#issuecomment-850144781 @LuciferYang, the test failure should be fixed in the latest master branch.

[GitHub] [spark] HyukjinKwon edited a comment on pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox
HyukjinKwon edited a comment on pull request #32631: URL: https://github.com/apache/spark/pull/32631#issuecomment-850143899 sorry for reverting quickly - I reverted first as the issue is sort of minor but it takes a while to test related to this

[GitHub] [spark] HyukjinKwon commented on pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox
HyukjinKwon commented on pull request #32631: URL: https://github.com/apache/spark/pull/32631#issuecomment-850143899 sorry for a revert quickly - I reverted first as the issue is sort of minor but it takes a while to test related to this

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox
HyukjinKwon commented on a change in pull request #32631: URL: https://github.com/apache/spark/pull/32631#discussion_r641277034 ## File path: .github/workflows/build_and_test.yml ## @@ -625,3 +625,83 @@ jobs: with: name:

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32631: [SPARK-35483][INFRA] Add docker-integration-tests to run-tests.py and GA.

2021-05-27 Thread GitBox
HyukjinKwon commented on a change in pull request #32631: URL: https://github.com/apache/spark/pull/32631#discussion_r641276796 ## File path: .github/workflows/build_and_test.yml ## @@ -625,3 +625,83 @@ jobs: with: name:

[GitHub] [spark] SparkQA commented on pull request #32689: [SPARK-35552][SQL] Make query stage materialized more readable

2021-05-27 Thread GitBox
SparkQA commented on pull request #32689: URL: https://github.com/apache/spark/pull/32689#issuecomment-850142972 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43563/

[GitHub] [spark] SparkQA commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-27 Thread GitBox
SparkQA commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-850142364 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43565/

[GitHub] [spark] SparkQA commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
SparkQA commented on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850140897 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43564/

[GitHub] [spark] SparkQA removed a comment on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
SparkQA removed a comment on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850068550 **[Test build #139040 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139040/testReport)** for PR 32686 at commit

[GitHub] [spark] SparkQA commented on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
SparkQA commented on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850137924 **[Test build #139040 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139040/testReport)** for PR 32686 at commit

[GitHub] [spark] SparkQA commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
SparkQA commented on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850137712 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43562/

[GitHub] [spark] SparkQA commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
SparkQA commented on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850135302 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43561/

[GitHub] [spark] SparkQA removed a comment on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
SparkQA removed a comment on pull request #32688: URL: https://github.com/apache/spark/pull/32688#issuecomment-850068523 **[Test build #139039 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139039/testReport)** for PR 32688 at commit

[GitHub] [spark] SparkQA commented on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
SparkQA commented on pull request #32688: URL: https://github.com/apache/spark/pull/32688#issuecomment-850134234 **[Test build #139039 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139039/testReport)** for PR 32688 at commit

[GitHub] [spark] HyukjinKwon commented on pull request #32690: [SPARK-35510][PYTHON] Fix and reenable test_stats_on_non_numeric_columns_should_be_discarded_if_numeric_only_is_true

2021-05-27 Thread GitBox
HyukjinKwon commented on pull request #32690: URL: https://github.com/apache/spark/pull/32690#issuecomment-850130270 cc @xinrong-databricks and @itholic too fyi

[GitHub] [spark] HyukjinKwon opened a new pull request #32690: [SPARK-35510][PYTHON] Fix and reenable test_stats_on_non_numeric_columns_should_be_discarded_if_numeric_only_is_true

2021-05-27 Thread GitBox
HyukjinKwon opened a new pull request #32690: URL: https://github.com/apache/spark/pull/32690 ### What changes were proposed in this pull request? This PR proposes to fix and reenable `test_stats_on_non_numeric_columns_should_be_discarded_if_numeric_only_is_true` that was disabled

[GitHub] [spark] SparkQA commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
SparkQA commented on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850126315 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43562/

[GitHub] [spark] lidiyag commented on pull request #32664: [SPARK-35516][WEBUI] Storage UI tab Storage Level tool tip correction

2021-05-27 Thread GitBox
lidiyag commented on pull request #32664: URL: https://github.com/apache/spark/pull/32664#issuecomment-850124776 @dongjoon-hyun @srowen please take a look

[GitHub] [spark] SparkQA commented on pull request #32473: [SPARK-35345][SQL] Add Parquet tests to BloomFilterBenchmark

2021-05-27 Thread GitBox
SparkQA commented on pull request #32473: URL: https://github.com/apache/spark/pull/32473#issuecomment-850124566 **[Test build #139045 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139045/testReport)** for PR 32473 at commit

[GitHub] [spark] SparkQA commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
SparkQA commented on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850124515 **[Test build #139044 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139044/testReport)** for PR 32658 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32582: [SPARK-35436][SS] RocksDBFileManager - save checkpoint to DFS

2021-05-27 Thread GitBox
AmplabJenkins removed a comment on pull request #32582: URL: https://github.com/apache/spark/pull/32582#issuecomment-850124238 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43559/

[GitHub] [spark] SparkQA commented on pull request #32689: [SPARK-35552][SQL] Make query stage materialized more readable

2021-05-27 Thread GitBox
SparkQA commented on pull request #32689: URL: https://github.com/apache/spark/pull/32689#issuecomment-850124454 **[Test build #139043 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139043/testReport)** for PR 32689 at commit

[GitHub] [spark] ulysses-you commented on pull request #32689: [SPARK-35552][SQL] Make query stage materialized more readable

2021-05-27 Thread GitBox
ulysses-you commented on pull request #32689: URL: https://github.com/apache/spark/pull/32689#issuecomment-850124397 cc @maropu @cloud-fan

[GitHub] [spark] ulysses-you opened a new pull request #32689: [SPARK-35552][SQL] Make query stage materialized more readable

2021-05-27 Thread GitBox
ulysses-you opened a new pull request #32689: URL: https://github.com/apache/spark/pull/32689 ### What changes were proposed in this pull request? Add a new method `isMaterialized` in `QueryStageExec`. ### Why are the changes needed? Currently, we use

[GitHub] [spark] AmplabJenkins commented on pull request #32582: [SPARK-35436][SS] RocksDBFileManager - save checkpoint to DFS

2021-05-27 Thread GitBox
AmplabJenkins commented on pull request #32582: URL: https://github.com/apache/spark/pull/32582#issuecomment-850124238 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43559/

[GitHub] [spark] SparkQA commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
SparkQA commented on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850123795 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43561/

[GitHub] [spark] SparkQA commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
SparkQA commented on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-85015 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43560/

[GitHub] [spark] SparkQA commented on pull request #32582: [SPARK-35436][SS] RocksDBFileManager - save checkpoint to DFS

2021-05-27 Thread GitBox
SparkQA commented on pull request #32582: URL: https://github.com/apache/spark/pull/32582#issuecomment-850121547 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43559/

[GitHub] [spark] otterc commented on a change in pull request #30691: [SPARK-32920][SHUFFLE] Finalization of Shuffle push/merge with Push based shuffle and preparation step for the reduce stage

2021-05-27 Thread GitBox
otterc commented on a change in pull request #30691: URL: https://github.com/apache/spark/pull/30691#discussion_r641257596 ## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ## @@ -2000,6 +2023,147 @@ private[spark] class DAGScheduler( } }

[GitHub] [spark] allisonwang-db commented on a change in pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-05-27 Thread GitBox
allisonwang-db commented on a change in pull request #32303: URL: https://github.com/apache/spark/pull/32303#discussion_r641255370 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/joinTypes.scala ## @@ -107,6 +107,11 @@ case class UsingJoin(tpe:

[GitHub] [spark] HyukjinKwon edited a comment on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
HyukjinKwon edited a comment on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850107310 @itholic the generated doc looks a bit weird: ![Screen Shot 2021-05-28 at 1 09 38

[GitHub] [spark] HyukjinKwon commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
HyukjinKwon commented on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850107310 @itholic the generated doc looks a bit weird: ![Screen Shot 2021-05-28 at 1 09 38

[GitHub] [spark] allisonwang-db commented on a change in pull request #32303: [SPARK-34382][SQL] Support LATERAL subqueries

2021-05-27 Thread GitBox
allisonwang-db commented on a change in pull request #32303: URL: https://github.com/apache/spark/pull/32303#discussion_r641253529 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/joins.scala ## @@ -168,6 +168,21 @@ object EliminateOuterJoin

[GitHub] [spark] HyukjinKwon commented on a change in pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
HyukjinKwon commented on a change in pull request #32658: URL: https://github.com/apache/spark/pull/32658#discussion_r641253439 ## File path: docs/sql-data-sources-csv.md ## @@ -38,3 +36,217 @@ Spark SQL provides `spark.read().csv("file_name")` to read a file or directory o

[GitHub] [spark] AmplabJenkins commented on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
AmplabJenkins commented on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850092153 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43558/

[GitHub] [spark] SparkQA commented on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
SparkQA commented on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850092142 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43558/

[GitHub] [spark] itholic commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
itholic commented on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850092067 Thanks, @HyukjinKwon

[GitHub] [spark] SparkQA commented on pull request #32658: [SPARK-35433][DOCS] Move CSV data source options from Python and Scala into a single page

2021-05-27 Thread GitBox
SparkQA commented on pull request #32658: URL: https://github.com/apache/spark/pull/32658#issuecomment-850089349 **[Test build #139042 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139042/testReport)** for PR 32658 at commit

[GitHub] [spark] SparkQA commented on pull request #32582: [SPARK-35436][SS] RocksDBFileManager - save checkpoint to DFS

2021-05-27 Thread GitBox
SparkQA commented on pull request #32582: URL: https://github.com/apache/spark/pull/32582#issuecomment-850087350 **[Test build #139041 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139041/testReport)** for PR 32582 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32687: [SPARK-35545][SQL] Split SubqueryExpression's children field into outer attributes and join conditions

2021-05-27 Thread GitBox
AmplabJenkins removed a comment on pull request #32687: URL: https://github.com/apache/spark/pull/32687#issuecomment-850086867 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139036/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
AmplabJenkins removed a comment on pull request #32688: URL: https://github.com/apache/spark/pull/32688#issuecomment-850086868 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43557/

[GitHub] [spark] AmplabJenkins commented on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
AmplabJenkins commented on pull request #32688: URL: https://github.com/apache/spark/pull/32688#issuecomment-850086868 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43557/

[GitHub] [spark] AmplabJenkins commented on pull request #32687: [SPARK-35545][SQL] Split SubqueryExpression's children field into outer attributes and join conditions

2021-05-27 Thread GitBox
AmplabJenkins commented on pull request #32687: URL: https://github.com/apache/spark/pull/32687#issuecomment-850086867 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139036/

[GitHub] [spark] wangyum commented on pull request #32675: [SPARK-35531][SQL] Can not insert into hive bucket table if create table with upper case schema

2021-05-27 Thread GitBox
wangyum commented on pull request #32675: URL: https://github.com/apache/spark/pull/32675#issuecomment-850086524 cc @cloud-fan @yaooqinn @AngersZh

[GitHub] [spark] wangyum commented on a change in pull request #32675: [SPARK-35531][SQL] Can not insert into hive bucket table if create table with upper case schema

2021-05-27 Thread GitBox
wangyum commented on a change in pull request #32675: URL: https://github.com/apache/spark/pull/32675#discussion_r641220605 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala ## @@ -870,4 +871,68 @@ class InsertSuite extends QueryTest with

[GitHub] [spark] wangyum commented on a change in pull request #32675: [SPARK-35531][SQL] Can not insert into hive bucket table if create table with upper case schema

2021-05-27 Thread GitBox
wangyum commented on a change in pull request #32675: URL: https://github.com/apache/spark/pull/32675#discussion_r641220214 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala ## @@ -1092,14 +1092,28 @@ private[hive] object

[GitHub] [spark] SparkQA commented on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
SparkQA commented on pull request #32688: URL: https://github.com/apache/spark/pull/32688#issuecomment-850083738 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43557/

[GitHub] [spark] viirya commented on pull request #32582: [SPARK-35436][SS] RocksDBFileManager - save checkpoint to DFS

2021-05-27 Thread GitBox
viirya commented on pull request #32582: URL: https://github.com/apache/spark/pull/32582#issuecomment-850083083 Thanks @xuanyuanking. I will find some time to review this.

[GitHub] [spark] SparkQA commented on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
SparkQA commented on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850082809 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43558/

[GitHub] [spark] xuanyuanking commented on pull request #32582: [SPARK-35436] RocksDBFileManager - save checkpoint to DFS

2021-05-27 Thread GitBox
xuanyuanking commented on pull request #32582: URL: https://github.com/apache/spark/pull/32582#issuecomment-850079062 As we merged #32272, after rebasing and addressing the comment, this one is ready for review. cc @viirya and @HeartSaVioR

[GitHub] [spark] xuanyuanking commented on pull request #32272: [SPARK-35172][SS] The implementation of RocksDBCheckpointMetadata

2021-05-27 Thread GitBox
xuanyuanking commented on pull request #32272: URL: https://github.com/apache/spark/pull/32272#issuecomment-850078509 Thanks for the review and help!

[GitHub] [spark] xuanyuanking commented on a change in pull request #32272: [SPARK-35172][SS] The implementation of RocksDBCheckpointMetadata

2021-05-27 Thread GitBox
xuanyuanking commented on a change in pull request #32272: URL: https://github.com/apache/spark/pull/32272#discussion_r641196335 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala ## @@ -0,0 +1,165 @@ +/* + * Licensed

[GitHub] [spark] SparkQA removed a comment on pull request #32687: [SPARK-35545][SQL] Split SubqueryExpression's children field into outer attributes and join conditions

2021-05-27 Thread GitBox
SparkQA removed a comment on pull request #32687: URL: https://github.com/apache/spark/pull/32687#issuecomment-849987071 **[Test build #139036 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139036/testReport)** for PR 32687 at commit

[GitHub] [spark] SparkQA commented on pull request #32687: [SPARK-35545][SQL] Split SubqueryExpression's children field into outer attributes and join conditions

2021-05-27 Thread GitBox
SparkQA commented on pull request #32687: URL: https://github.com/apache/spark/pull/32687#issuecomment-850074626 **[Test build #139036 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139036/testReport)** for PR 32687 at commit

[GitHub] [spark] HyukjinKwon closed pull request #32673: [SPARK-35530][ML][TESTS] Fix rounding error in DifferentiableLossAggregatorSuite with Java 11

2021-05-27 Thread GitBox
HyukjinKwon closed pull request #32673: URL: https://github.com/apache/spark/pull/32673

[GitHub] [spark] HyukjinKwon commented on pull request #32673: [SPARK-35530][ML][TESTS] Fix rounding error in DifferentiableLossAggregatorSuite with Java 11

2021-05-27 Thread GitBox
HyukjinKwon commented on pull request #32673: URL: https://github.com/apache/spark/pull/32673#issuecomment-850068886 Merged to master. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] SparkQA commented on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
SparkQA commented on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850068550 **[Test build #139040 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139040/testReport)** for PR 32686 at commit

[GitHub] [spark] SparkQA commented on pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
SparkQA commented on pull request #32688: URL: https://github.com/apache/spark/pull/32688#issuecomment-850068523 **[Test build #139039 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139039/testReport)** for PR 32688 at commit

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32397: [SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode

2021-05-27 Thread GitBox
AmplabJenkins removed a comment on pull request #32397: URL: https://github.com/apache/spark/pull/32397#issuecomment-850067873 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] LuciferYang opened a new pull request #32688: [SPARK-35550][BUILD] Upgrade Jackson to 2.12.3

2021-05-27 Thread GitBox
LuciferYang opened a new pull request #32688: URL: https://github.com/apache/spark/pull/32688 ### What changes were proposed in this pull request? This PR upgrades Jackson to version 2.12.3. Jackson Release 2.12.3:

[GitHub] [spark] AmplabJenkins commented on pull request #32397: [SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode

2021-05-27 Thread GitBox
AmplabJenkins commented on pull request #32397: URL: https://github.com/apache/spark/pull/32397#issuecomment-850067873 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For

[GitHub] [spark] SparkQA removed a comment on pull request #32397: [SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode

2021-05-27 Thread GitBox
SparkQA removed a comment on pull request #32397: URL: https://github.com/apache/spark/pull/32397#issuecomment-850011679 **[Test build #139038 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139038/testReport)** for PR 32397 at commit

[GitHub] [spark] SparkQA commented on pull request #32397: [SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode

2021-05-27 Thread GitBox
SparkQA commented on pull request #32397: URL: https://github.com/apache/spark/pull/32397#issuecomment-850058005 **[Test build #139038 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139038/testReport)** for PR 32397 at commit

[GitHub] [spark] SparkQA commented on pull request #32397: [SPARK-35084][CORE] Spark 3: supporting "--packages" in k8s cluster mode

2021-05-27 Thread GitBox
SparkQA commented on pull request #32397: URL: https://github.com/apache/spark/pull/32397#issuecomment-850056962 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43556/ -- This is an automated message from the

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-05-27 Thread GitBox
AmplabJenkins removed a comment on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-850053730 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139035/

[GitHub] [spark] AmplabJenkins commented on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-05-27 Thread GitBox
AmplabJenkins commented on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-850053730 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139035/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-05-27 Thread GitBox
SparkQA removed a comment on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-849958280 **[Test build #139035 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139035/testReport)** for PR 32301 at commit

[GitHub] [spark] SparkQA commented on pull request #32301: [SPARK-35194][SQL] Refactor nested column aliasing for readability

2021-05-27 Thread GitBox
SparkQA commented on pull request #32301: URL: https://github.com/apache/spark/pull/32301#issuecomment-850053105 **[Test build #139035 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139035/testReport)** for PR 32301 at commit

[GitHub] [spark] otterc commented on a change in pull request #30691: [SPARK-32920][SHUFFLE] Finalization of Shuffle push/merge with Push based shuffle and preparation step for the reduce stage

2021-05-27 Thread GitBox
otterc commented on a change in pull request #30691: URL: https://github.com/apache/spark/pull/30691#discussion_r641118134 ## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ## @@ -2000,6 +2023,147 @@ private[spark] class DAGScheduler( } }

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
AmplabJenkins removed a comment on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850049662 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139037/

[GitHub] [spark] AmplabJenkins commented on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
AmplabJenkins commented on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850049662 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/139037/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
SparkQA removed a comment on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850010173 **[Test build #139037 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139037/testReport)** for PR 32686 at commit

[GitHub] [spark] SparkQA commented on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
SparkQA commented on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850049444 **[Test build #139037 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/139037/testReport)** for PR 32686 at commit

[GitHub] [spark] venkata91 commented on pull request #30691: [SPARK-32920][SHUFFLE] Finalization of Shuffle push/merge with Push based shuffle and preparation step for the reduce stage

2021-05-27 Thread GitBox
venkata91 commented on pull request #30691: URL: https://github.com/apache/spark/pull/30691#issuecomment-850045368 Addressed all the comments AFAIK, please review @mridulm @Victsm @Ngone51 @otterc -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] venkata91 commented on a change in pull request #30691: [SPARK-32920][SHUFFLE] Finalization of Shuffle push/merge with Push based shuffle and preparation step for the reduce stage

2021-05-27 Thread GitBox
venkata91 commented on a change in pull request #30691: URL: https://github.com/apache/spark/pull/30691#discussion_r641095507 ## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ## @@ -2004,6 +2020,131 @@ private[spark] class DAGScheduler( }

[GitHub] [spark] venkata91 commented on a change in pull request #30691: [SPARK-32920][SHUFFLE] Finalization of Shuffle push/merge with Push based shuffle and preparation step for the reduce stage

2021-05-27 Thread GitBox
venkata91 commented on a change in pull request #30691: URL: https://github.com/apache/spark/pull/30691#discussion_r641094842 ## File path: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ## @@ -2136,9 +2137,24 @@ private[spark] class DAGScheduler( }

[GitHub] [spark] allisonwang-db commented on pull request #32687: [SPARK-35545][SQL] Split SubqueryExpression's children field into outer attributes and join conditions

2021-05-27 Thread GitBox
allisonwang-db commented on pull request #32687: URL: https://github.com/apache/spark/pull/32687#issuecomment-850041129 cc @cloud-fan -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the

[GitHub] [spark] sunchao commented on pull request #31998: [SPARK-34859][SQL] parquet vectorized reader - support column index with rowIndexes

2021-05-27 Thread GitBox
sunchao commented on pull request #31998: URL: https://github.com/apache/spark/pull/31998#issuecomment-850040241 @lxian In the current approach we'd have to copy values from one vector to another. I think a better and more efficient approach may be to feed the row indexes to

[GitHub] [spark] zhouyejoe edited a comment on pull request #32007: [SPARK-33350][SHUFFLE] Add support to DiskBlockManager to create merge directory and to get the local shuffle merged data

2021-05-27 Thread GitBox
zhouyejoe edited a comment on pull request #32007: URL: https://github.com/apache/spark/pull/32007#issuecomment-850036241 Created ticket for later improvement [SPARK-35546](https://issues.apache.org/jira/browse/SPARK-35546) -- This is an automated message from the Apache Git Service. To

[GitHub] [spark] zhouyejoe commented on pull request #32007: [SPARK-33350][SHUFFLE] Add support to DiskBlockManager to create merge directory and to get the local shuffle merged data

2021-05-27 Thread GitBox
zhouyejoe commented on pull request #32007: URL: https://github.com/apache/spark/pull/32007#issuecomment-850036241 Created ticket for later improvement https://issues.apache.org/jira/browse/SPARK-35546 -- This is an automated message from the Apache Git Service. To respond to the

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
AmplabJenkins removed a comment on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850035894 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43555/

[GitHub] [spark] AmplabJenkins commented on pull request #32686: [WIP][SPARK-35544][SQL] Add tree pattern pruning to Analyzer rules

2021-05-27 Thread GitBox
AmplabJenkins commented on pull request #32686: URL: https://github.com/apache/spark/pull/32686#issuecomment-850035894 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/43555/ --
