[GitHub] [spark] beliefer commented on pull request #33258: [SPARK-36037][SQL] Support ANSI SQL LOCALTIMESTAMP datetime value function

2021-07-13 Thread GitBox
beliefer commented on pull request #33258: URL: https://github.com/apache/spark/pull/33258#issuecomment-879587025 retest this please -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] sarutak commented on a change in pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-13 Thread GitBox
sarutak commented on a change in pull request #33253: URL: https://github.com/apache/spark/pull/33253#discussion_r669284266 ## File path: core/src/test/resources/HistoryServerExpectations/running_app_list_json_expectation.json ## @@ -1 +1 @@ -[ ] +[ ] Review comment: U

[GitHub] [spark] Ngone51 commented on a change in pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-07-13 Thread GitBox
Ngone51 commented on a change in pull request #32401: URL: https://github.com/apache/spark/pull/32401#discussion_r669282673 ## File path: core/src/main/java/org/apache/spark/shuffle/checksum/ShuffleChecksumHelper.java ## @@ -0,0 +1,83 @@ +package org.apache.spark.shuffle.check

[GitHub] [spark] sunchao commented on pull request #33330: [SPARK-36123][SQL] Parquet vectorized reader doesn't skip null values correctly

2021-07-13 Thread GitBox
sunchao commented on pull request #33330: URL: https://github.com/apache/spark/pull/33330#issuecomment-879583404 thanks @gengliangwang - I opened #33334 for this. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

[GitHub] [spark] sunchao opened a new pull request #33334: [SPARK-35743][SQL][TEST] Refactor ParquetColumnIndexSuite

2021-07-13 Thread GitBox
sunchao opened a new pull request #33334: URL: https://github.com/apache/spark/pull/33334 ### What changes were proposed in this pull request? Refactor `ParquetColumnIndexSuite` and allow better code reuse. ### Why are the changes needed? A few methods in

[GitHub] [spark] cloud-fan commented on a change in pull request #32872: [SPARK-35639][SQL] Make hasCoalescedPartition return true if something was actually coalesced

2021-07-13 Thread GitBox
cloud-fan commented on a change in pull request #32872: URL: https://github.com/apache/spark/pull/32872#discussion_r669277698 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CustomShuffleReaderExec.scala ## @@ -87,8 +87,15 @@ case class CustomShuf

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-879580591 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140995/ -

[GitHub] [spark] SparkQA commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-13 Thread GitBox
SparkQA commented on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-879580580 **[Test build #140995 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140995/testReport)** for PR 33253 at commit [`77920b8`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-13 Thread GitBox
SparkQA removed a comment on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-879580409 **[Test build #140995 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140995/testReport)** for PR 33253 at commit [`77920b8`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-879580591 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140995/ -- This

[GitHub] [spark] SparkQA commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-13 Thread GitBox
SparkQA commented on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-879580409 **[Test build #140995 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140995/testReport)** for PR 33253 at commit [`77920b8`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #33174: URL: https://github.com/apache/spark/pull/33174#issuecomment-879578133 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140991/ -

[GitHub] [spark] venkata91 commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-13 Thread GitBox
venkata91 commented on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-879579744 > @venkata91 Could you fix the style issue first? > > ``` > [error] /home/runner/work/spark/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSu

[GitHub] [spark] AmplabJenkins commented on pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #33174: URL: https://github.com/apache/spark/pull/33174#issuecomment-879578133 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140991/ -- This

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33323: [SPARK-35739][SQL] Add Java-compatible Dataset.join overloads

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #33323: URL: https://github.com/apache/spark/pull/33323#issuecomment-879136851 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use t

[GitHub] [spark] SparkQA removed a comment on pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-07-13 Thread GitBox
SparkQA removed a comment on pull request #33174: URL: https://github.com/apache/spark/pull/33174#issuecomment-879537921 **[Test build #140991 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140991/testReport)** for PR 33174 at commit [`c6d4f21`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33323: [SPARK-35739][SQL] Add Java-compatible Dataset.join overloads

2021-07-13 Thread GitBox
SparkQA commented on pull request #33323: URL: https://github.com/apache/spark/pull/33323#issuecomment-879577688 **[Test build #140994 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140994/testReport)** for PR 33323 at commit [`7beee40`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-07-13 Thread GitBox
SparkQA commented on pull request #33174: URL: https://github.com/apache/spark/pull/33174#issuecomment-879577612 **[Test build #140991 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140991/testReport)** for PR 33174 at commit [`c6d4f21`](https://github.co

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33077: [SPARK-34892][SS] Introduce MergingSortWithSessionWindowStateIterator sorting input rows and rows in state efficiently

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #33077: URL: https://github.com/apache/spark/pull/33077#issuecomment-879577172 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140987/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33330: [SPARK-36123][SQL] Parquet vectorized reader doesn't skip null values correctly

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #33330: URL: https://github.com/apache/spark/pull/33330#issuecomment-879577174 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140986/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #32401: URL: https://github.com/apache/spark/pull/32401#issuecomment-879577176 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45506/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33258: [SPARK-36037][SQL] Support ANSI SQL LOCALTIMESTAMP datetime value function

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #33258: URL: https://github.com/apache/spark/pull/33258#issuecomment-879577173 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140990/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33333: [SPARK-36129][BUILD] Upgrade commons-compress to 1.21 to deal with CVEs

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #33333: URL: https://github.com/apache/spark/pull/33333#issuecomment-879577171 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45507/

[GitHub] [spark] AmplabJenkins commented on pull request #33333: [SPARK-36129][BUILD] Upgrade commons-compress to 1.21 to deal with CVEs

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #33333: URL: https://github.com/apache/spark/pull/33333#issuecomment-879577171 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45507/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #33258: [SPARK-36037][SQL] Support ANSI SQL LOCALTIMESTAMP datetime value function

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #33258: URL: https://github.com/apache/spark/pull/33258#issuecomment-879577173 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140990/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33077: [SPARK-34892][SS] Introduce MergingSortWithSessionWindowStateIterator sorting input rows and rows in state efficiently

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #33077: URL: https://github.com/apache/spark/pull/33077#issuecomment-879577172 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140987/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33330: [SPARK-36123][SQL] Parquet vectorized reader doesn't skip null values correctly

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #33330: URL: https://github.com/apache/spark/pull/33330#issuecomment-879577174 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140986/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #32401: URL: https://github.com/apache/spark/pull/32401#issuecomment-879577176 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45506/ -- T

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33332: [SQL] Warn if less files visible after stats write

2021-07-13 Thread GitBox
HyukjinKwon commented on a change in pull request #33332: URL: https://github.com/apache/spark/pull/33332#discussion_r669267871 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala ## @@ -166,7 +166,7 @@ class BasicWrite

[GitHub] [spark] SparkQA commented on pull request #33333: [SPARK-36129][BUILD] Upgrade commons-compress to 1.21 to deal with CVEs

2021-07-13 Thread GitBox
SparkQA commented on pull request #33333: URL: https://github.com/apache/spark/pull/33333#issuecomment-879572572 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45507/ -- This

[GitHub] [spark] SparkQA removed a comment on pull request #33258: [SPARK-36037][SQL] Support ANSI SQL LOCALTIMESTAMP datetime value function

2021-07-13 Thread GitBox
SparkQA removed a comment on pull request #33258: URL: https://github.com/apache/spark/pull/33258#issuecomment-879537873 **[Test build #140990 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140990/testReport)** for PR 33258 at commit [`a843bc3`](https://gi

[GitHub] [spark] HyukjinKwon commented on pull request #33332: [SQL] Warn if less files visible after stats write

2021-07-13 Thread GitBox
HyukjinKwon commented on pull request #33332: URL: https://github.com/apache/spark/pull/33332#issuecomment-879571129 @tooptoop4 please refer to https://spark.apache.org/contributing.html and make the PR description and title properly with a jira. -- This is an automated message from the

[GitHub] [spark] SparkQA commented on pull request #33258: [SPARK-36037][SQL] Support ANSI SQL LOCALTIMESTAMP datetime value function

2021-07-13 Thread GitBox
SparkQA commented on pull request #33258: URL: https://github.com/apache/spark/pull/33258#issuecomment-879571046 **[Test build #140990 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140990/testReport)** for PR 33258 at commit [`a843bc3`](https://github.co

[GitHub] [spark] HyukjinKwon edited a comment on pull request #33329: [WIP][SPARK-35917][SHUFFLE][CORE][3.2] Disable push-based shuffle feature to prevent it from being used

2021-07-13 Thread GitBox
HyukjinKwon edited a comment on pull request #33329: URL: https://github.com/apache/spark/pull/33329#issuecomment-879569468 Yeah, I think we won't necessarily have to make it failed when it's enabled. I believe it's fine to explicitly document that this feature is unstable, and either corr

[GitHub] [spark] HyukjinKwon commented on pull request #33329: [WIP][SPARK-35917][SHUFFLE][CORE][3.2] Disable push-based shuffle feature to prevent it from being used

2021-07-13 Thread GitBox
HyukjinKwon commented on pull request #33329: URL: https://github.com/apache/spark/pull/33329#issuecomment-879569468 Yeah, I think we won't necessarily have to make it failed when it's enabled. I believe it's fine to explicitly document that this feature is unstable, and either correctness

[GitHub] [spark] gengliangwang commented on pull request #33330: [SPARK-36123][SQL] Parquet vectorized reader doesn't skip null values correctly

2021-07-13 Thread GitBox
gengliangwang commented on pull request #33330: URL: https://github.com/apache/spark/pull/33330#issuecomment-879568535 @sunchao Thanks for the work. I think it's OK to have a PR for test refactoring. -- This is an automated message from the Apache Git Service. To respond to the message,

[GitHub] [spark] HyukjinKwon commented on pull request #33325: [SPARK-36076][SQL][3.0] ArrayIndexOutOfBounds in Cast string to timestamp

2021-07-13 Thread GitBox
HyukjinKwon commented on pull request #33325: URL: https://github.com/apache/spark/pull/33325#issuecomment-879567089 the sparkr test failure should be ignorable. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL

[GitHub] [spark] ekoifman commented on a change in pull request #32872: [SPARK-35639][SQL] Make hasCoalescedPartition return true if something was actually coalesced

2021-07-13 Thread GitBox
ekoifman commented on a change in pull request #32872: URL: https://github.com/apache/spark/pull/32872#discussion_r669262650 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CustomShuffleReaderExec.scala ## @@ -87,8 +87,15 @@ case class CustomShuff

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33324: [SPARK-36093][SQL] RemoveRedundantAliases should not change Command's parameter's expression's name

2021-07-13 Thread GitBox
HyukjinKwon commented on a change in pull request #33324: URL: https://github.com/apache/spark/pull/33324#discussion_r669262443 ## File path: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ## @@ -4058,6 +4058,44 @@ class SQLQuerySuite extends QueryTest with S

[GitHub] [spark] HyukjinKwon commented on pull request #33323: [SPARK-35739][SQL] Add Java-compatible Dataset.join overloads

2021-07-13 Thread GitBox
HyukjinKwon commented on pull request #33323: URL: https://github.com/apache/spark/pull/33323#issuecomment-879566146 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsu

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33323: [SPARK-35739][SQL] Add Java-compatible Dataset.join overloads

2021-07-13 Thread GitBox
HyukjinKwon commented on a change in pull request #33323: URL: https://github.com/apache/spark/pull/33323#discussion_r669261877 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -981,6 +1006,58 @@ class Dataset[T] private[sql]( join(right, usin

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33323: [SPARK-35739][SQL] Add Java-compatible Dataset.join overloads

2021-07-13 Thread GitBox
HyukjinKwon commented on a change in pull request #33323: URL: https://github.com/apache/spark/pull/33323#discussion_r669261712 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -956,6 +956,31 @@ class Dataset[T] private[sql]( join(right, Seq(u

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33323: [SPARK-35739][SQL] Add Java-compatible Dataset.join overloads

2021-07-13 Thread GitBox
HyukjinKwon commented on a change in pull request #33323: URL: https://github.com/apache/spark/pull/33323#discussion_r669261605 ## File path: sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala ## @@ -956,6 +956,31 @@ class Dataset[T] private[sql]( join(right, Seq(u

[GitHub] [spark] SparkQA commented on pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-07-13 Thread GitBox
SparkQA commented on pull request #32401: URL: https://github.com/apache/spark/pull/32401#issuecomment-879565484 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45506/ -- This is an automated message from the A

[GitHub] [spark] SparkQA removed a comment on pull request #33330: [SPARK-36123][SQL] Parquet vectorized reader doesn't skip null values correctly

2021-07-13 Thread GitBox
SparkQA removed a comment on pull request #33330: URL: https://github.com/apache/spark/pull/33330#issuecomment-879470999 **[Test build #140986 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140986/testReport)** for PR 33330 at commit [`41a7ca8`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33330: [SPARK-36123][SQL] Parquet vectorized reader doesn't skip null values correctly

2021-07-13 Thread GitBox
SparkQA commented on pull request #33330: URL: https://github.com/apache/spark/pull/33330#issuecomment-879564742 **[Test build #140986 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140986/testReport)** for PR 33330 at commit [`41a7ca8`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #33077: [SPARK-34892][SS] Introduce MergingSortWithSessionWindowStateIterator sorting input rows and rows in state efficiently

2021-07-13 Thread GitBox
SparkQA removed a comment on pull request #33077: URL: https://github.com/apache/spark/pull/33077#issuecomment-879471127 **[Test build #140987 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140987/testReport)** for PR 33077 at commit [`b540632`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33077: [SPARK-34892][SS] Introduce MergingSortWithSessionWindowStateIterator sorting input rows and rows in state efficiently

2021-07-13 Thread GitBox
SparkQA commented on pull request #33077: URL: https://github.com/apache/spark/pull/33077#issuecomment-879564143 **[Test build #140987 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140987/testReport)** for PR 33077 at commit [`b540632`](https://github.co

[GitHub] [spark] sarutak commented on pull request #33253: [SPARK-36038][CORE] Speculation metrics summary at stage level

2021-07-13 Thread GitBox
sarutak commented on pull request #33253: URL: https://github.com/apache/spark/pull/33253#issuecomment-879563645 @venkata91 Could you fix the style issue first? ``` [error] /home/runner/work/spark/spark/core/src/test/scala/org/apache/spark/deploy/history/HistoryServerSuite.scala:195:

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33258: [SPARK-36037][SQL] Support ANSI SQL LOCALTIMESTAMP datetime value function

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #33258: URL: https://github.com/apache/spark/pull/33258#issuecomment-879562230 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45504/

[GitHub] [spark] SparkQA commented on pull request #33258: [SPARK-36037][SQL] Support ANSI SQL LOCALTIMESTAMP datetime value function

2021-07-13 Thread GitBox
SparkQA commented on pull request #33258: URL: https://github.com/apache/spark/pull/33258#issuecomment-879562217 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45504/ -- This is an automated message from the A

[GitHub] [spark] AmplabJenkins commented on pull request #33258: [SPARK-36037][SQL] Support ANSI SQL LOCALTIMESTAMP datetime value function

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #33258: URL: https://github.com/apache/spark/pull/33258#issuecomment-879562230 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45504/ -- T

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33286: [SPARK-36079][SQL] Null-based filter estimate should always be in the range [0, 1]

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #33286: URL: https://github.com/apache/spark/pull/33286#issuecomment-879558620 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140985/ -

[GitHub] [spark] AmplabJenkins commented on pull request #33286: [SPARK-36079][SQL] Null-based filter estimate should always be in the range [0, 1]

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #33286: URL: https://github.com/apache/spark/pull/33286#issuecomment-879558620 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140985/ -- This

[GitHub] [spark] SparkQA commented on pull request #33333: [SPARK-36129][BUILD] Upgrade commons-compress to 1.21 to deal with CVEs

2021-07-13 Thread GitBox
SparkQA commented on pull request #33333: URL: https://github.com/apache/spark/pull/33333#issuecomment-879558541 **[Test build #140993 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140993/testReport)** for PR 33333 at commit [`ad3da13`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #33174: URL: https://github.com/apache/spark/pull/33174#issuecomment-879557067 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45505/

[GitHub] [spark] AmplabJenkins commented on pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #33174: URL: https://github.com/apache/spark/pull/33174#issuecomment-879557067 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45505/ -- T

[GitHub] [spark] sarutak opened a new pull request #33333: [SPARK-36129][BUILD] Upgrade commons-compress to 1.21 to deal with CVEs

2021-07-13 Thread GitBox
sarutak opened a new pull request #33333: URL: https://github.com/apache/spark/pull/33333 ### What changes were proposed in this pull request? This PR upgrades `commons-compress` from `1.20` to `1.21` to deal with CVEs. ### Why are the changes needed? Some CVEs which aff

[GitHub] [spark] SparkQA commented on pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-07-13 Thread GitBox
SparkQA commented on pull request #32401: URL: https://github.com/apache/spark/pull/32401#issuecomment-879555269 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45506/ -- This is an automated message from the Apache

[GitHub] [spark] cfmcgrady commented on pull request #32488: [SPARK-35316][SQL] UnwrapCastInBinaryComparison support In/InSet predicate

2021-07-13 Thread GitBox
cfmcgrady commented on pull request #32488: URL: https://github.com/apache/spark/pull/32488#issuecomment-879554154 > @allisonwang-db good catch! can you open a JIRA ticket to track this bug? Open a JIRA ticket [SPARK-36130](https://issues.apache.org/jira/projects/SPARK/issues/SPARK-3

[GitHub] [spark] HyukjinKwon commented on pull request #30869: [SPARK-33865][SQL] When HiveDDL, we need check avro schema too

2021-07-13 Thread GitBox
HyukjinKwon commented on pull request #30869: URL: https://github.com/apache/spark/pull/30869#issuecomment-879551742 cc @xkrogen FYI -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

[GitHub] [spark] HeartSaVioR commented on a change in pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-07-13 Thread GitBox
HeartSaVioR commented on a change in pull request #32401: URL: https://github.com/apache/spark/pull/32401#discussion_r669248537 ## File path: core/src/main/java/org/apache/spark/shuffle/checksum/ShuffleChecksumHelper.java ## @@ -0,0 +1,83 @@ +package org.apache.spark.shuffle.c

[GitHub] [spark] SparkQA removed a comment on pull request #33286: [SPARK-36079][SQL] Null-based filter estimate should always be in the range [0, 1]

2021-07-13 Thread GitBox
SparkQA removed a comment on pull request #33286: URL: https://github.com/apache/spark/pull/33286#issuecomment-879449174 **[Test build #140985 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140985/testReport)** for PR 33286 at commit [`67a228b`](https://gi

[GitHub] [spark] SparkQA commented on pull request #33258: [SPARK-36037][SQL] Support ANSI SQL LOCALTIMESTAMP datetime value function

2021-07-13 Thread GitBox
SparkQA commented on pull request #33258: URL: https://github.com/apache/spark/pull/33258#issuecomment-879551344 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45504/ -- This is an automated message from the Apache

[GitHub] [spark] SparkQA commented on pull request #33286: [SPARK-36079][SQL] Null-based filter estimate should always be in the range [0, 1]

2021-07-13 Thread GitBox
SparkQA commented on pull request #33286: URL: https://github.com/apache/spark/pull/33286#issuecomment-879551139 **[Test build #140985 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140985/testReport)** for PR 33286 at commit [`67a228b`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-07-13 Thread GitBox
SparkQA commented on pull request #33174: URL: https://github.com/apache/spark/pull/33174#issuecomment-879549740 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45505/ -- This

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33309: [SPARK-36106][SQL][CORE] Label error classes for subset of QueryCompilationErrors

2021-07-13 Thread GitBox
HyukjinKwon commented on a change in pull request #33309: URL: https://github.com/apache/spark/pull/33309#discussion_r669246375 ## File path: core/src/main/resources/error/error-classes.json ## @@ -11,6 +11,25 @@ "message" : [ "Found duplicate keys '%s'" ], "sqlState"

[GitHub] [spark] HyukjinKwon commented on a change in pull request #33263: [SPARK-35027][CORE] Close the inputStream in FileAppender when writin…

2021-07-13 Thread GitBox
HyukjinKwon commented on a change in pull request #33263: URL: https://github.com/apache/spark/pull/33263#discussion_r669246103 ## File path: core/src/main/scala/org/apache/spark/deploy/worker/ExecutorRunner.scala ## @@ -185,11 +185,11 @@ private[deploy] class ExecutorRunner(

[GitHub] [spark] cloud-fan commented on pull request #32488: [SPARK-35316][SQL] UnwrapCastInBinaryComparison support In/InSet predicate

2021-07-13 Thread GitBox
cloud-fan commented on pull request #32488: URL: https://github.com/apache/spark/pull/32488#issuecomment-879547677 @allisonwang-db good catch! can you open a JIRA ticket to track this bug? -- This is an automated message from the Apache Git Service. To respond to the message, please log o

[GitHub] [spark] cloud-fan commented on a change in pull request #24595: [SPARK-20774][SPARK-27036][SQL] Cancel the running broadcast execution on BroadcastTimeout

2021-07-13 Thread GitBox
cloud-fan commented on a change in pull request #24595: URL: https://github.com/apache/spark/pull/24595#discussion_r669244466 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ## @@ -67,68 +70,74 @@ case class BroadcastEx

[GitHub] [spark] Ngone51 commented on pull request #33116: [SPARK-35259][SHUFFLE] Rename ExternalBlockHandler Timer variables to remove incorrect millis suffix

2021-07-13 Thread GitBox
Ngone51 commented on pull request #33116: URL: https://github.com/apache/spark/pull/33116#issuecomment-87950 > My only concern with this approach is if some other metrics reporter (besides YarnShuffleService) may try to use these custom timers as if they still had nanosecond units. I'm

[GitHub] [spark] HyukjinKwon commented on pull request #33314: [SPARK-36118][SQL] Add bitmap functions for Spark SQL

2021-07-13 Thread GitBox
HyukjinKwon commented on pull request #33314: URL: https://github.com/apache/spark/pull/33314#issuecomment-879544141 are there more references? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to t

[GitHub] [spark] Shockang commented on a change in pull request #24595: [SPARK-20774][SPARK-27036][SQL] Cancel the running broadcast execution on BroadcastTimeout

2021-07-13 Thread GitBox
Shockang commented on a change in pull request #24595: URL: https://github.com/apache/spark/pull/24595#discussion_r669236306 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/BroadcastExchangeExec.scala ## @@ -67,68 +70,74 @@ case class BroadcastExc

[GitHub] [spark] Shockang edited a comment on pull request #24595: [SPARK-20774][SPARK-27036][SQL] Cancel the running broadcast execution on BroadcastTimeout

2021-07-13 Thread GitBox
Shockang edited a comment on pull request #24595: URL: https://github.com/apache/spark/pull/24595#issuecomment-879541147 > @Shockang how do you think about this proposal? https://github.com/apache/spark/pull/24595/files#r667590820 Sorry, I'm busy in the past several days. You can tak

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #32401: URL: https://github.com/apache/spark/pull/32401#issuecomment-879540980 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140992/ -

[GitHub] [spark] SparkQA removed a comment on pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-07-13 Thread GitBox
SparkQA removed a comment on pull request #32401: URL: https://github.com/apache/spark/pull/32401#issuecomment-879540799 **[Test build #140992 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140992/testReport)** for PR 32401 at commit [`8d54e38`](https://gi

[GitHub] [spark] Shockang commented on pull request #24595: [SPARK-20774][SPARK-27036][SQL] Cancel the running broadcast execution on BroadcastTimeout

2021-07-13 Thread GitBox
Shockang commented on pull request #24595: URL: https://github.com/apache/spark/pull/24595#issuecomment-879541147 > @Shockang how do you think about this proposal? https://github.com/apache/spark/pull/24595/files#r667590820 Sorry, I'm busy these two days. You can take a look at my su

[GitHub] [spark] SparkQA commented on pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-07-13 Thread GitBox
SparkQA commented on pull request #32401: URL: https://github.com/apache/spark/pull/32401#issuecomment-879540969 **[Test build #140992 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140992/testReport)** for PR 32401 at commit [`8d54e38`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #32401: URL: https://github.com/apache/spark/pull/32401#issuecomment-879540980 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140992/ -- This

[GitHub] [spark] SparkQA commented on pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-07-13 Thread GitBox
SparkQA commented on pull request #32401: URL: https://github.com/apache/spark/pull/32401#issuecomment-879540799 **[Test build #140992 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140992/testReport)** for PR 32401 at commit [`8d54e38`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #33332: [SQL] Warn if less files visible after stats write

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #33332: URL: https://github.com/apache/spark/pull/33332#issuecomment-879540304 Can one of the admins verify this patch? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL a

[GitHub] [spark] ulysses-you commented on a change in pull request #32872: [SPARK-35639][SQL] Make hasCoalescedPartition return true if something was actually coalesced

2021-07-13 Thread GitBox
ulysses-you commented on a change in pull request #32872: URL: https://github.com/apache/spark/pull/32872#discussion_r669237731 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CustomShuffleReaderExec.scala ## @@ -87,8 +87,15 @@ case class CustomSh

[GitHub] [spark] tooptoop4 opened a new pull request #33332: [SQL] Warn if less files visible after stats write

2021-07-13 Thread GitBox
tooptoop4 opened a new pull request #33332: URL: https://github.com/apache/spark/pull/33332 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: review

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32401: [SPARK-35276][CORE] Calculate checksum for shuffle data and write as checksum file

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #32401: URL: https://github.com/apache/spark/pull/32401#issuecomment-879200836 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45489/

[GitHub] [spark] SparkQA commented on pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-07-13 Thread GitBox
SparkQA commented on pull request #33174: URL: https://github.com/apache/spark/pull/33174#issuecomment-879537921 **[Test build #140991 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140991/testReport)** for PR 33174 at commit [`c6d4f21`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #33258: [SPARK-36037][SQL] Support ANSI SQL LOCALTIMESTAMP datetime value function

2021-07-13 Thread GitBox
SparkQA commented on pull request #33258: URL: https://github.com/apache/spark/pull/33258#issuecomment-879537873 **[Test build #140990 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140990/testReport)** for PR 33258 at commit [`a843bc3`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33077: [SPARK-34892][SS] Introduce MergingSortWithSessionWindowStateIterator sorting input rows and rows in state efficiently

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #33077: URL: https://github.com/apache/spark/pull/33077#issuecomment-879536574 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45503/

[GitHub] [spark] AmplabJenkins removed a comment on pull request #32049: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #32049: URL: https://github.com/apache/spark/pull/32049#issuecomment-879536575 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140984/ -

[GitHub] [spark] AmplabJenkins removed a comment on pull request #33331: [SPARK-36125][PYTHON] Implement non-equality comparison operators between two Categoricals

2021-07-13 Thread GitBox
AmplabJenkins removed a comment on pull request #33331: URL: https://github.com/apache/spark/pull/33331#issuecomment-879536572 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45502/

[GitHub] [spark] AmplabJenkins commented on pull request #33331: [SPARK-36125][PYTHON] Implement non-equality comparison operators between two Categoricals

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #33331: URL: https://github.com/apache/spark/pull/33331#issuecomment-879536572 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45502/ -- T

[GitHub] [spark] AmplabJenkins commented on pull request #32049: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #32049: URL: https://github.com/apache/spark/pull/32049#issuecomment-879536575 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/140984/ -- This

[GitHub] [spark] AmplabJenkins commented on pull request #33077: [SPARK-34892][SS] Introduce MergingSortWithSessionWindowStateIterator sorting input rows and rows in state efficiently

2021-07-13 Thread GitBox
AmplabJenkins commented on pull request #33077: URL: https://github.com/apache/spark/pull/33077#issuecomment-879536574 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/45503/ -- T

[GitHub] [spark] ekoifman commented on a change in pull request #32872: [SPARK-35639][SQL] Make hasCoalescedPartition return true if something was actually coalesced

2021-07-13 Thread GitBox
ekoifman commented on a change in pull request #32872: URL: https://github.com/apache/spark/pull/32872#discussion_r669233485 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CustomShuffleReaderExec.scala ## @@ -87,8 +87,15 @@ case class CustomShuff

[GitHub] [spark] Yikun commented on a change in pull request #33174: [SPARK-35721][PYTHON] Path level discover for python unittests

2021-07-13 Thread GitBox
Yikun commented on a change in pull request #33174: URL: https://github.com/apache/spark/pull/33174#discussion_r669228900 ## File path: python/run-tests.py ## @@ -40,6 +44,111 @@ from sparktestsupport.shellutils import which, subprocess_check_output # noqa from sparktestsupp

[GitHub] [spark] SparkQA commented on pull request #33077: [SPARK-34892][SS] Introduce MergingSortWithSessionWindowStateIterator sorting input rows and rows in state efficiently

2021-07-13 Thread GitBox
SparkQA commented on pull request #33077: URL: https://github.com/apache/spark/pull/33077#issuecomment-879529600 Kubernetes integration test unable to build dist. exiting with code: 1 URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45503/ -- This

[GitHub] [spark] otterc commented on a change in pull request #33329: [WIP][SPARK-35917][SHUFFLE][CORE][3.2] Disable push-based shuffle feature to prevent it from being used

2021-07-13 Thread GitBox
otterc commented on a change in pull request #33329: URL: https://github.com/apache/spark/pull/33329#discussion_r669189297 ## File path: core/src/main/scala/org/apache/spark/internal/config/package.scala ## @@ -2079,7 +2079,7 @@ package object config { "conjunction wit

[GitHub] [spark] SparkQA removed a comment on pull request #32049: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-07-13 Thread GitBox
SparkQA removed a comment on pull request #32049: URL: https://github.com/apache/spark/pull/32049#issuecomment-879419291 **[Test build #140984 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140984/testReport)** for PR 32049 at commit [`2c889c6`](https://gi

[GitHub] [spark] SparkQA commented on pull request #32049: [SPARK-34952][SQL] Aggregate (Min/Max/Count) push down for Parquet

2021-07-13 Thread GitBox
SparkQA commented on pull request #32049: URL: https://github.com/apache/spark/pull/32049#issuecomment-879528075 **[Test build #140984 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/140984/testReport)** for PR 32049 at commit [`2c889c6`](https://github.co

[GitHub] [spark] SparkQA commented on pull request #33331: [SPARK-36125][PYTHON] Implement non-equality comparison operators between two Categoricals

2021-07-13 Thread GitBox
SparkQA commented on pull request #33331: URL: https://github.com/apache/spark/pull/33331#issuecomment-879522682 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/45502/ -- This is an automated message from the A
