[GitHub] [spark] LuciferYang commented on a change in pull request #30701: [SPARK-33212][BUILD] Upgrade to Hadoop 3.2.2 and move to shaded clients for Hadoop 3.x profile

2021-09-25 Thread GitBox


LuciferYang commented on a change in pull request #30701:
URL: https://github.com/apache/spark/pull/30701#discussion_r716146866



##
File path: core/pom.xml
##
@@ -66,7 +66,13 @@
 
 
   <groupId>org.apache.hadoop</groupId>
-  <artifactId>hadoop-client</artifactId>
+  <artifactId>${hadoop-client-api.artifact}</artifactId>

Review comment:
   @sunchao Yes, the behavior is as expected now. Thanks!




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] mridulm edited a comment on pull request #34098: [SPARK-36842][Core] TaskSchedulerImpl - stop TaskResultGetter properly

2021-09-25 Thread GitBox


mridulm edited a comment on pull request #34098:
URL: https://github.com/apache/spark/pull/34098#issuecomment-927237439


   The change looks good to me.
   Do you want to do the same within `SparkEnv.stop` as well?





[GitHub] [spark] mridulm commented on pull request #34098: [SPARK-36842][Core] TaskSchedulerImpl - stop TaskResultGetter properly

2021-09-25 Thread GitBox


mridulm commented on pull request #34098:
URL: https://github.com/apache/spark/pull/34098#issuecomment-927237439


   Do you want to do the same within `SparkEnv.stop` as well?





[GitHub] [spark] dongjoon-hyun commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

2021-09-25 Thread GitBox


dongjoon-hyun commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927237411


   Thank you!





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code

2021-09-25 Thread GitBox


AmplabJenkins removed a comment on pull request #34097:
URL: https://github.com/apache/spark/pull/34097#issuecomment-927237205


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48142/
   





[GitHub] [spark] SparkQA commented on pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code

2021-09-25 Thread GitBox


SparkQA commented on pull request #34097:
URL: https://github.com/apache/spark/pull/34097#issuecomment-927237197


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48142/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code

2021-09-25 Thread GitBox


AmplabJenkins commented on pull request #34097:
URL: https://github.com/apache/spark/pull/34097#issuecomment-927237205


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48142/
   





[GitHub] [spark] gengliangwang closed pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

2021-09-25 Thread GitBox


gengliangwang closed pull request #34100:
URL: https://github.com/apache/spark/pull/34100


   





[GitHub] [spark] gengliangwang commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

2021-09-25 Thread GitBox


gengliangwang commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927236621


   Merging to master/3.2. Thanks all!





[GitHub] [spark] mridulm commented on pull request #34083: Add docs about using Shiv for packaging (similar to PEX)

2021-09-25 Thread GitBox


mridulm commented on pull request #34083:
URL: https://github.com/apache/spark/pull/34083#issuecomment-927235605


   +CC @zhouyejoe 





[GitHub] [spark] SparkQA commented on pull request #32084: [SPARK-34980][SQL] Support coalesce partition through union in AQE

2021-09-25 Thread GitBox


SparkQA commented on pull request #32084:
URL: https://github.com/apache/spark/pull/32084#issuecomment-927234818


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48144/
   





[GitHub] [spark] SparkQA removed a comment on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP

2021-09-25 Thread GitBox


SparkQA removed a comment on pull request #34051:
URL: https://github.com/apache/spark/pull/34051#issuecomment-927202526


   **[Test build #143627 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143627/testReport)** for PR 34051 at commit [`190fa2b`](https://github.com/apache/spark/commit/190fa2b796454125d83a90309b17a1f970e90fe0).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #33873: [SPARK-36624][YARN] In yarn client mode, when ApplicationMaster failed with KILLED/FAILED, driver should exit with code not 0

2021-09-25 Thread GitBox


AmplabJenkins removed a comment on pull request #33873:
URL: https://github.com/apache/spark/pull/33873#issuecomment-927230473


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143631/
   





[GitHub] [spark] SparkQA removed a comment on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns

2021-09-25 Thread GitBox


SparkQA removed a comment on pull request #34038:
URL: https://github.com/apache/spark/pull/34038#issuecomment-927202534


   **[Test build #143628 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143628/testReport)** for PR 34038 at commit [`f382cf2`](https://github.com/apache/spark/commit/f382cf27d1b9eb640129e08da3c2811af04cdc5f).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP

2021-09-25 Thread GitBox


AmplabJenkins removed a comment on pull request #34051:
URL: https://github.com/apache/spark/pull/34051#issuecomment-927233297


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143627/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns

2021-09-25 Thread GitBox


AmplabJenkins removed a comment on pull request #34038:
URL: https://github.com/apache/spark/pull/34038#issuecomment-927233299


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143628/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

2021-09-25 Thread GitBox


AmplabJenkins removed a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927233298


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48141/
   





[GitHub] [spark] SparkQA commented on pull request #32084: [SPARK-34980][SQL] Support coalesce partition through union in AQE

2021-09-25 Thread GitBox


SparkQA commented on pull request #32084:
URL: https://github.com/apache/spark/pull/32084#issuecomment-927233448


   **[Test build #143632 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143632/testReport)** for PR 32084 at commit [`a846ecd`](https://github.com/apache/spark/commit/a846ecd5221bc4b21416c9c52552cdaa0e683d0d).





[GitHub] [spark] AmplabJenkins commented on pull request #34107: [SPARK-36851][SQL] Incorrect parsing of negative ANSI typed interval literals

2021-09-25 Thread GitBox


AmplabJenkins commented on pull request #34107:
URL: https://github.com/apache/spark/pull/34107#issuecomment-927233406


   Can one of the admins verify this patch?





[GitHub] [spark] AmplabJenkins commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

2021-09-25 Thread GitBox


AmplabJenkins commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927233298


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48141/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns

2021-09-25 Thread GitBox


AmplabJenkins commented on pull request #34038:
URL: https://github.com/apache/spark/pull/34038#issuecomment-927233299


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143628/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP

2021-09-25 Thread GitBox


AmplabJenkins commented on pull request #34051:
URL: https://github.com/apache/spark/pull/34051#issuecomment-927233297


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143627/
   





[GitHub] [spark] SparkQA commented on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns

2021-09-25 Thread GitBox


SparkQA commented on pull request #34038:
URL: https://github.com/apache/spark/pull/34038#issuecomment-927232746


   **[Test build #143628 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143628/testReport)** for PR 34038 at commit [`f382cf2`](https://github.com/apache/spark/commit/f382cf27d1b9eb640129e08da3c2811af04cdc5f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on pull request #33873: [SPARK-36624][YARN] In yarn client mode, when ApplicationMaster failed with KILLED/FAILED, driver should exit with code not 0

2021-09-25 Thread GitBox


SparkQA commented on pull request #33873:
URL: https://github.com/apache/spark/pull/33873#issuecomment-927232712


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48143/
   





[GitHub] [spark] SparkQA commented on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP

2021-09-25 Thread GitBox


SparkQA commented on pull request #34051:
URL: https://github.com/apache/spark/pull/34051#issuecomment-927232713


   **[Test build #143627 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143627/testReport)** for PR 34051 at commit [`190fa2b`](https://github.com/apache/spark/commit/190fa2b796454125d83a90309b17a1f970e90fe0).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] Peng-Lei commented on pull request #34107: [SPARK-36851][SQL] Incorrect parsing of negative ANSI typed interval literals

2021-09-25 Thread GitBox


Peng-Lei commented on pull request #34107:
URL: https://github.com/apache/spark/pull/34107#issuecomment-927232500


   @MaxGekk Could you take a look? Is this fix okay? Thank you.





[GitHub] [spark] SparkQA commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

2021-09-25 Thread GitBox


SparkQA commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927232425


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48141/
   





[GitHub] [spark] Peng-Lei opened a new pull request #34107: [SPARK-36851][SQL] Incorrect parsing of negative ANSI typed interval literals

2021-09-25 Thread GitBox


Peng-Lei opened a new pull request #34107:
URL: https://github.com/apache/spark/pull/34107


   ### What changes were proposed in this pull request?
   Handle incorrect parsing of negative ANSI typed interval literals
   [SPARK-36851](https://issues.apache.org/jira/browse/SPARK-36851)
   
   
   ### Why are the changes needed?
   Incorrect result:
   ```
   spark-sql> select interval -'1' year;
   1-0
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Added a unit test case.
   





[GitHub] [spark] SparkQA commented on pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code

2021-09-25 Thread GitBox


SparkQA commented on pull request #34097:
URL: https://github.com/apache/spark/pull/34097#issuecomment-927232018


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48142/
   





[GitHub] [spark] AmplabJenkins commented on pull request #33873: [SPARK-36624][YARN] In yarn client mode, when ApplicationMaster failed with KILLED/FAILED, driver should exit with code not 0

2021-09-25 Thread GitBox


AmplabJenkins commented on pull request #33873:
URL: https://github.com/apache/spark/pull/33873#issuecomment-927230473


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143631/
   





[GitHub] [spark] SparkQA removed a comment on pull request #33873: [SPARK-36624][YARN] In yarn client mode, when ApplicationMaster failed with KILLED/FAILED, driver should exit with code not 0

2021-09-25 Thread GitBox


SparkQA removed a comment on pull request #33873:
URL: https://github.com/apache/spark/pull/33873#issuecomment-927228151


   **[Test build #143631 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143631/testReport)** for PR 33873 at commit [`fc7c271`](https://github.com/apache/spark/commit/fc7c2716e6231982f6bae91e30e6a2aac5e27aa2).





[GitHub] [spark] SparkQA commented on pull request #33873: [SPARK-36624][YARN] In yarn client mode, when ApplicationMaster failed with KILLED/FAILED, driver should exit with code not 0

2021-09-25 Thread GitBox


SparkQA commented on pull request #33873:
URL: https://github.com/apache/spark/pull/33873#issuecomment-927230429


   **[Test build #143631 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143631/testReport)** for PR 33873 at commit [`fc7c271`](https://github.com/apache/spark/commit/fc7c2716e6231982f6bae91e30e6a2aac5e27aa2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] ulysses-you commented on pull request #32084: [SPARK-34980][SQL] Support coalesce partition through union in AQE

2021-09-25 Thread GitBox


ulysses-you commented on pull request #32084:
URL: https://github.com/apache/spark/pull/32084#issuecomment-927229959


   retest this please





[GitHub] [spark] sunchao commented on a change in pull request #30701: [SPARK-33212][BUILD] Upgrade to Hadoop 3.2.2 and move to shaded clients for Hadoop 3.x profile

2021-09-25 Thread GitBox


sunchao commented on a change in pull request #30701:
URL: https://github.com/apache/spark/pull/30701#discussion_r716138328



##
File path: core/pom.xml
##
@@ -66,7 +66,13 @@
 
 
   <groupId>org.apache.hadoop</groupId>
-  <artifactId>hadoop-client</artifactId>
+  <artifactId>${hadoop-client-api.artifact}</artifactId>

Review comment:
   @LuciferYang could you check with the fix in #34100? I just tested it with the command you pasted above:
   ```
   mvn clean install -DskipTests -pl resource-managers/yarn -am -Phadoop-2.7 -Pyarn
   mvn test -pl resource-managers/yarn -Phadoop-2.7 -Pyarn -DwildcardSuites=org.apache.spark.deploy.yarn.YarnClusterSuite
   ```
   and the tests all passed for me.







[GitHub] [spark] SparkQA commented on pull request #33873: [SPARK-36624][YARN] In yarn client mode, when ApplicationMaster failed with KILLED/FAILED, driver should exit with code not 0

2021-09-25 Thread GitBox


SparkQA commented on pull request #33873:
URL: https://github.com/apache/spark/pull/33873#issuecomment-927228151


   **[Test build #143631 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143631/testReport)** for PR 33873 at commit [`fc7c271`](https://github.com/apache/spark/commit/fc7c2716e6231982f6bae91e30e6a2aac5e27aa2).





[GitHub] [spark] SparkQA commented on pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code

2021-09-25 Thread GitBox


SparkQA commented on pull request #34097:
URL: https://github.com/apache/spark/pull/34097#issuecomment-927228025


   **[Test build #143630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143630/testReport)** for PR 34097 at commit [`85297cf`](https://github.com/apache/spark/commit/85297cf9017a5a58c5cee2e9140197ccd607b188).





[GitHub] [spark] SparkQA commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

2021-09-25 Thread GitBox


SparkQA commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927227822


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48141/
   





[GitHub] [spark] LuciferYang commented on a change in pull request #30701: [SPARK-33212][BUILD] Upgrade to Hadoop 3.2.2 and move to shaded clients for Hadoop 3.x profile

2021-09-25 Thread GitBox


LuciferYang commented on a change in pull request #30701:
URL: https://github.com/apache/spark/pull/30701#discussion_r716135602



##
File path: core/pom.xml
##
@@ -66,7 +66,13 @@
 
 
       <groupId>org.apache.hadoop</groupId>
-      <artifactId>hadoop-client</artifactId>
+      <artifactId>${hadoop-client-api.artifact}</artifactId>

Review comment:
   I tested these commands on 3.2-rc4 (3.2-rc5 can't be built with hadoop-2.7 right now), and the problem still exists.
   
   







[GitHub] [spark] AngersZhuuuu commented on pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code

2021-09-25 Thread GitBox


AngersZhuuuu commented on pull request #34097:
URL: https://github.com/apache/spark/pull/34097#issuecomment-927225409


   > @AngersZhuuuu,
   > 
   > > Make generated code more simple
   > 
   > can you elaborate on it more in the PR description?
   
   Done





[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code

2021-09-25 Thread GitBox


AngersZhuuuu commented on a change in pull request #34097:
URL: https://github.com/apache/spark/pull/34097#discussion_r716135156



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##
@@ -612,26 +612,28 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
 ""
   }
 
-  val ret = child.dataType match {
+  val isNaNCode = child.dataType match {
 case DoubleType => Some((v: Any) => s"java.lang.Double.isNaN($v)")
 case FloatType => Some((v: Any) => s"java.lang.Float.isNaN($v)")
 case _ => None
   }
 
-  ret.map { isNaN =>
-s"""
-  |if ($setTerm.contains($c)) {
-  |  ${ev.value} = true;
-  |} else if (${isNaN(c)}) {
-  |  ${ev.value} =  $hasNaN;
-  |}
-  |$setIsNull
-  |""".stripMargin
-  }.getOrElse(
-s"""
-   |${ev.value} = $setTerm.contains($c);
-   |$setIsNull
- """.stripMargin)
+  hasNaN match {

Review comment:
   > Can we just use if-else here? Also, let's file a separate JIRA. This is technically a performance improvement to avoid dispatching on nan per the values at in-set.
   
   Done







[GitHub] [spark] LuciferYang commented on a change in pull request #30701: [SPARK-33212][BUILD] Upgrade to Hadoop 3.2.2 and move to shaded clients for Hadoop 3.x profile

2021-09-25 Thread GitBox


LuciferYang commented on a change in pull request #30701:
URL: https://github.com/apache/spark/pull/30701#discussion_r716134847



##
File path: core/pom.xml
##
@@ -66,7 +66,13 @@
 
 
       <groupId>org.apache.hadoop</groupId>
-      <artifactId>hadoop-client</artifactId>
+      <artifactId>${hadoop-client-api.artifact}</artifactId>

Review comment:
   @sunchao Yes, this problem still exists; at present only branch-3.1 behaves as expected.
   
   







[GitHub] [spark] LuciferYang commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom

2021-09-25 Thread GitBox


LuciferYang commented on pull request #34100:
URL: https://github.com/apache/spark/pull/34100#issuecomment-927224824


   branch-3.2 also seems to need this fix
   
   





[GitHub] [spark] HyukjinKwon closed pull request #34106: [SPARK-36854][SQL] Handle ANSI intervals by the off-heap column vector

2021-09-25 Thread GitBox


HyukjinKwon closed pull request #34106:
URL: https://github.com/apache/spark/pull/34106


   





[GitHub] [spark] HyukjinKwon commented on pull request #34106: [SPARK-36854][SQL] Handle ANSI intervals by the off-heap column vector

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34106:
URL: https://github.com/apache/spark/pull/34106#issuecomment-927224439


   Merged to master.





[GitHub] [spark] HyukjinKwon closed pull request #34105: [SPARK-36852][SQL][TESTS] Test ANSI interval support by the Parquet datasource

2021-09-25 Thread GitBox


HyukjinKwon closed pull request #34105:
URL: https://github.com/apache/spark/pull/34105


   





[GitHub] [spark] HyukjinKwon commented on pull request #34105: [SPARK-36852][SQL][TESTS] Test ANSI interval support by the Parquet datasource

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34105:
URL: https://github.com/apache/spark/pull/34105#issuecomment-927224188


   Merged to master.





[GitHub] [spark] HyukjinKwon commented on pull request #34098: [SPARK-36842][Core] TaskSchedulerImpl - stop TaskResultGetter properly

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34098:
URL: https://github.com/apache/spark/pull/34098#issuecomment-927224007


   cc @mridulm, @Ngone51 and @tgravescs FYI





[GitHub] [spark] HyukjinKwon commented on pull request #34097: [SPARK-36792][SQL][FOLLOWUP] Refactor InSet generated code

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34097:
URL: https://github.com/apache/spark/pull/34097#issuecomment-927223930


   @AngersZhuuuu,
   
   > Make generated code more simple
   
   can you elaborate on it more in the PR description?





[GitHub] [spark] HyukjinKwon commented on a change in pull request #34097: [SPARK-36792][SQL][FOLLOWUP] Refactor InSet generated code

2021-09-25 Thread GitBox


HyukjinKwon commented on a change in pull request #34097:
URL: https://github.com/apache/spark/pull/34097#discussion_r716133921



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##
@@ -612,26 +612,28 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
 ""
   }
 
-  val ret = child.dataType match {
+  val isNaNCode = child.dataType match {
 case DoubleType => Some((v: Any) => s"java.lang.Double.isNaN($v)")
 case FloatType => Some((v: Any) => s"java.lang.Float.isNaN($v)")
 case _ => None
   }
 
-  ret.map { isNaN =>
-s"""
-  |if ($setTerm.contains($c)) {
-  |  ${ev.value} = true;
-  |} else if (${isNaN(c)}) {
-  |  ${ev.value} =  $hasNaN;
-  |}
-  |$setIsNull
-  |""".stripMargin
-  }.getOrElse(
-s"""
-   |${ev.value} = $setTerm.contains($c);
-   |$setIsNull
- """.stripMargin)
+  hasNaN match {

Review comment:
   Can we just use if-else here? Also, let's file a separate JIRA. This is 
technically a performance improvement to avoid dispatching on nan per the 
values at in-set.
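   
   For readers following the diff, here is a minimal Java sketch of the `if` / `else if` shape that the removed generated code had. It is only an illustration under stated assumptions: `InSetNaNSketch`, `containsByEquals` and `inSet` are hypothetical names, and the containment check is assumed to use a comparison under which NaN never matches itself. It is not Spark's actual generated code.
   
   ```java
   class InSetNaNSketch {
       // Hypothetical containment check using primitive ==.
       // Under ==, NaN is not equal to any value, including itself.
       static boolean containsByEquals(double[] values, double v) {
           for (double x : values) {
               if (x == v) {
                   return true;
               }
           }
           return false;
       }

       // Mirrors the if / else-if branch of the removed template:
       // when the probe value is NaN, fall back to a precomputed hasNaN flag.
       static boolean inSet(double[] values, boolean hasNaN, double v) {
           if (containsByEquals(values, v)) {
               return true;
           } else if (Double.isNaN(v)) {
               return hasNaN;
           }
           return false;
       }

       public static void main(String[] args) {
           double[] values = {1.0, Double.NaN};
           System.out.println(containsByEquals(values, Double.NaN));
           System.out.println(inSet(values, true, Double.NaN));
       }
   }
   ```
   
   Since `hasNaN` is known when the code is generated, the NaN branch only needs to be emitted when the set actually contains NaN, which appears to be the simplification discussed here.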
   







[GitHub] [spark] HyukjinKwon commented on pull request #34099: dataset - toIterator

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34099:
URL: https://github.com/apache/spark/pull/34099#issuecomment-927223355


   Yeah .. there's no benefit in this ..





[GitHub] [spark] HyukjinKwon commented on pull request #34093: [SPARK-36294][SQL] Refactor fifth set of 20 query execution errors to use error classes

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34093:
URL: https://github.com/apache/spark/pull/34093#issuecomment-927223084


   Thanks for working on this @Peng-Lei 





[GitHub] [spark] HyukjinKwon commented on pull request #34093: [SPARK-36294][SQL] Refactor fifth set of 20 query execution errors to use error classes

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34093:
URL: https://github.com/apache/spark/pull/34093#issuecomment-927223051


   This seems to be a related test failure:
   
   
   ```
   `write.df(df, source = "csv")` threw an error with unexpected message.
   Expected match: "Error in save : 
org.apache.spark.SparkIllegalArgumentException:   Expected exactly one path to 
be specified"
   Actual message: "Error in save : 
org.apache.spark.SparkIllegalArgumentException: Expected exactly one path to be 
specified, but got: \n\tat 
org.apache.spark.sql.errors.QueryExecutionErrors$.multiplePathsSpecifiedError(QueryExecutionErrors.scala:450)\n\tat
 
org.apache.spark.sql.execution.datasources.DataSource.planForWritingFileFormat(DataSource.scala:464)\n\tat
 
org.apache.spark.sql.execution.datasources.DataSource.planForWriting(DataSource.scala:558)\n\tat
 
org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:382)\n\tat
 
org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:355)\n\tat
 org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:247)\n\tat 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat
 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat
 java.lang.reflect.Method.invoke(Method.java:498)\n\tat 
org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:164)\n\tat
 
org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:105)\n\tat
 
org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:39)\n\tat
 
io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)\n\tat
 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat
 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat
 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat
 
io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)\n\tat
 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat
 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat
 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)\n\tat
 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat
 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat
 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat
 
io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)\n\tat
 
io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)\n\tat
 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat
 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat
 io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat 
io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)\n\tat
 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat
 
io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat
 
io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)\n\tat
 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)\n\tat
 
io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)\n\tat
 
io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)\n\tat
 
io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)\n\tat
 io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)\n\tat 
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)\n\tat 
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat 
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\tat
 java.lang.Thread.run(Thread.java:748)\n\n"
   Backtrace:
 1. testthat::expect_error(...) test_sparkSQL.R:3875:2
 7. SparkR::write.df(df, source = "csv")
 8. SparkR:::.local(df, path, ...)
 9. SparkR:::handledCallJMethod(write, "save")
10. base::tryCatch(...)
11. base:::tryCatchList(expr, classes, parentenv, handlers)
12. 

[GitHub] [spark] HyukjinKwon commented on pull request #34053: [SPARK-36813][SQL][PYTHON] Propose an infrastructure of as-of join and imlement ps.merge_asof

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34053:
URL: https://github.com/apache/spark/pull/34053#issuecomment-927222757


   Will merge it in a few days if there are no more comments.





[GitHub] [spark] SparkQA commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

2021-09-25 Thread GitBox


SparkQA commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927222515


   **[Test build #143629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143629/testReport)** for PR 34039 at commit [`5e6b359`](https://github.com/apache/spark/commit/5e6b3596da38ed0a98ef47c97169faf3ce52fa70).





[GitHub] [spark] HyukjinKwon commented on a change in pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP

2021-09-25 Thread GitBox


HyukjinKwon commented on a change in pull request #34051:
URL: https://github.com/apache/spark/pull/34051#discussion_r716132533



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala
##
@@ -172,8 +172,7 @@ object PartitionPruning extends Rule[LogicalPlan] with PredicateHelper with Join
   // We can't reuse the broadcast because the join type doesn't support broadcast,
   // and doing DPP means running an extra query that may have significant overhead.
   // We need to make sure the pruning side is very big so that DPP is still worthy.
-  canBroadcastBySize(otherPlan, conf) &&

Review comment:
   cc @maryannxue FYI







[GitHub] [spark] AmplabJenkins removed a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

2021-09-25 Thread GitBox


AmplabJenkins removed a comment on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-922243538


   Can one of the admins verify this patch?





[GitHub] [spark] HyukjinKwon commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34039:
URL: https://github.com/apache/spark/pull/34039#issuecomment-927222094


   ok to test





[GitHub] [spark] HyukjinKwon commented on pull request #34009: [SPARK-34378][SQL][AVRO] Enhance AvroSerializer validation to allow extra nullable Avro fields

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34009:
URL: https://github.com/apache/spark/pull/34009#issuecomment-927222065


   cc @HeartSaVioR, who might have a bit of context too





[GitHub] [spark] HyukjinKwon commented on pull request #33839: [SPARK-36291][SQL] Refactor second set of 20 in QueryExecutionErrors to use error classes

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #33839:
URL: https://github.com/apache/spark/pull/33839#issuecomment-927221521


   @dgd-contributor, please contact me or priv...@spark.apache.org. As I shared 
in the email, the submissions from the specific shared account will not be 
accepted for now.





[GitHub] [spark] wangyum commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side

2021-09-25 Thread GitBox


wangyum commented on pull request #34070:
URL: https://github.com/apache/spark/pull/34070#issuecomment-927219835


   cc @cloud-fan  @maryannxue





[GitHub] [spark] HyukjinKwon commented on pull request #34091: [SPARK-36839][INFRA] Add daily build with Hadoop 2 profile in GitHub Actions build

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34091:
URL: https://github.com/apache/spark/pull/34091#issuecomment-927219739


   Merged to master.





[GitHub] [spark] HyukjinKwon closed pull request #34091: [SPARK-36839][INFRA] Add daily build with Hadoop 2 profile in GitHub Actions build

2021-09-25 Thread GitBox


HyukjinKwon closed pull request #34091:
URL: https://github.com/apache/spark/pull/34091


   





[GitHub] [spark] HyukjinKwon commented on pull request #34091: [SPARK-36839][INFRA] Add daily build with Hadoop 2 profile in GitHub Actions build

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34091:
URL: https://github.com/apache/spark/pull/34091#issuecomment-927219648


   BTW, I am working on JDK 11 build too. Let me make a PR soon next week cc 
@dongjoon-hyun 





[GitHub] [spark] HyukjinKwon commented on pull request #34091: [SPARK-36839][INFRA] Add daily build with Hadoop 2 profile in GitHub Actions build

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34091:
URL: https://github.com/apache/spark/pull/34091#issuecomment-927219430


   It won't block anything on dev .. let me merge this and fix the tests 
separately ..





[GitHub] [spark] ulysses-you commented on pull request #34069: [SPARK-36823][SQL] Support broadcast nested loop join hint for equi-join

2021-09-25 Thread GitBox


ulysses-you commented on pull request #34069:
URL: https://github.com/apache/spark/pull/34069#issuecomment-927217863


   hi @c21 , I agree. In general, bnlj is much slower than smj. I found an extreme case: a left join with a very small left side and a large right side where, unfortunately, the right side is also skewed. In that case smj does not work well and can even fail with OOM on the skewed partition.
   
   Here is a simple benchmark from my local environment:
   ```scala
   spark.range(0, 1000).selectExpr("id % 1 as c1", "id as c2").repartition(100).createOrReplaceTempView("t1")
   spark.range(0, 10).selectExpr("id as c1").createOrReplaceTempView("t2")
   
   // 5s
   spark.sql("select /*+ merge(t2) */ count(*) from t2 left join t1 on t1.c1 = t2.c1").collect
   
   // 3s
   spark.sql("select /*+ broadcast_nl(t2) */ count(*) from t2 left join t1 on t1.c1 = t2.c1").collect
   ```
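   
   To make the shape of this benchmark concrete, here is a small single-process Java sketch of a left outer nested loop join count, with a tiny left side and a right side where every row shares one key. It is only an illustration: `NestedLoopLeftJoinSketch` and `leftJoinCount` are hypothetical names, and this is not how Spark implements broadcast nested loop join.
   
   ```java
   class NestedLoopLeftJoinSketch {
       // Counts the rows of `left LEFT JOIN right ON left.key = right.key`
       // by comparing every left key against every right key.
       static long leftJoinCount(long[] left, long[] right) {
           long count = 0;
           for (long l : left) {
               long matches = 0;
               for (long r : right) {
                   if (l == r) {
                       matches++;
                   }
               }
               // A left outer join keeps an unmatched left row as one output row.
               count += Math.max(matches, 1);
           }
           return count;
       }

       public static void main(String[] args) {
           long[] left = new long[10];      // keys 0..9, like t2
           for (int i = 0; i < left.length; i++) {
               left[i] = i;
           }
           long[] right = new long[1000];   // every key is 0, like t1 with id % 1
           System.out.println(leftJoinCount(left, right));
       }
   }
   ```
   
   No sort or key-based partitioning of the right side is needed in this sketch, so a single hot key does not concentrate work in one place. That is the intuition behind the numbers above, not a general claim that bnlj is faster.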





[GitHub] [spark] HyukjinKwon commented on pull request #34102: [SPARK-36847][PYTHON] Explicitly specify error codes when ignoring type hint errors

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34102:
URL: https://github.com/apache/spark/pull/34102#issuecomment-927216525


   Merged to master.





[GitHub] [spark] HyukjinKwon closed pull request #34102: [SPARK-36847][PYTHON] Explicitly specify error codes when ignoring type hint errors

2021-09-25 Thread GitBox


HyukjinKwon closed pull request #34102:
URL: https://github.com/apache/spark/pull/34102


   





[GitHub] [spark] HyukjinKwon commented on pull request #34058: [SPARK-36711][PYTHON] Support multi-index in new syntax

2021-09-25 Thread GitBox


HyukjinKwon commented on pull request #34058:
URL: https://github.com/apache/spark/pull/34058#issuecomment-927216403


   Yeah, let's hold off for a while.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP

2021-09-25 Thread GitBox


AmplabJenkins removed a comment on pull request #34051:
URL: https://github.com/apache/spark/pull/34051#issuecomment-927215025


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48140/
   





[GitHub] [spark] AngersZhuuuu commented on pull request #34097: [SPARK-36792][SQL][FOLLOWUP] Refactor InSet generated code

2021-09-25 Thread GitBox


AngersZhuuuu commented on pull request #34097:
URL: https://github.com/apache/spark/pull/34097#issuecomment-927215585


   ping @cloud-fan 





[GitHub] [spark] AmplabJenkins commented on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP

2021-09-25 Thread GitBox


AmplabJenkins commented on pull request #34051:
URL: https://github.com/apache/spark/pull/34051#issuecomment-927215025


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48140/
   





[GitHub] [spark] SparkQA commented on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP

2021-09-25 Thread GitBox


SparkQA commented on pull request #34051:
URL: https://github.com/apache/spark/pull/34051#issuecomment-927213916


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48140/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns

2021-09-25 Thread GitBox


AmplabJenkins removed a comment on pull request #34038:
URL: https://github.com/apache/spark/pull/34038#issuecomment-927211243


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48139/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns

2021-09-25 Thread GitBox


AmplabJenkins commented on pull request #34038:
URL: https://github.com/apache/spark/pull/34038#issuecomment-927211243


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48139/
   





[GitHub] [spark] SparkQA commented on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns

2021-09-25 Thread GitBox


SparkQA commented on pull request #34038:
URL: https://github.com/apache/spark/pull/34038#issuecomment-927211237


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48139/
   





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)

2021-09-25 Thread GitBox


AmplabJenkins removed a comment on pull request #34103:
URL: https://github.com/apache/spark/pull/34103#issuecomment-927211169


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143626/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)

2021-09-25 Thread GitBox


AmplabJenkins commented on pull request #34103:
URL: https://github.com/apache/spark/pull/34103#issuecomment-927211169


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143626/
   





[GitHub] [spark] SparkQA removed a comment on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)

2021-09-25 Thread GitBox


SparkQA removed a comment on pull request #34103:
URL: https://github.com/apache/spark/pull/34103#issuecomment-927198458


   **[Test build #143626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143626/testReport)** for PR 34103 at commit [`12a8aca`](https://github.com/apache/spark/commit/12a8aca635ac15b1042ded973a244d3872a18c93).





[GitHub] [spark] SparkQA commented on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)

2021-09-25 Thread GitBox


SparkQA commented on pull request #34103:
URL: https://github.com/apache/spark/pull/34103#issuecomment-927211000


   **[Test build #143626 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143626/testReport)** for PR 34103 at commit [`12a8aca`](https://github.com/apache/spark/commit/12a8aca635ac15b1042ded973a244d3872a18c93).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)

2021-09-25 Thread GitBox


AmplabJenkins removed a comment on pull request #34103:
URL: https://github.com/apache/spark/pull/34103#issuecomment-927207228


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48138/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)

2021-09-25 Thread GitBox


AmplabJenkins commented on pull request #34103:
URL: https://github.com/apache/spark/pull/34103#issuecomment-927207228


   
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48138/
   





[GitHub] [spark] SparkQA commented on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)

2021-09-25 Thread GitBox


SparkQA commented on pull request #34103:
URL: https://github.com/apache/spark/pull/34103#issuecomment-927207062


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48138/
   





[GitHub] [spark] SparkQA commented on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP

2021-09-25 Thread GitBox


SparkQA commented on pull request #34051:
URL: https://github.com/apache/spark/pull/34051#issuecomment-927206217


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48140/
   





[GitHub] [spark] srowen closed pull request #33761: Increasing performance of upper case operation for non-ascii-only strings

2021-09-25 Thread GitBox


srowen closed pull request #33761:
URL: https://github.com/apache/spark/pull/33761


   





[GitHub] [spark] SparkQA commented on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns

2021-09-25 Thread GitBox


SparkQA commented on pull request #34038:
URL: https://github.com/apache/spark/pull/34038#issuecomment-927205558


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48139/
   





[GitHub] [spark] srowen closed pull request #32808: [SPARK-35598] Improve Spark-ML PCA analysis

2021-09-25 Thread GitBox


srowen closed pull request #32808:
URL: https://github.com/apache/spark/pull/32808


   





[GitHub] [spark] srowen closed pull request #32866: [SPARK-35713]Bug fix for thread leak in JobCancellationSuite

2021-09-25 Thread GitBox


srowen closed pull request #32866:
URL: https://github.com/apache/spark/pull/32866


   





[GitHub] [spark] srowen commented on pull request #33879: [SPARK-36627][CORE] Fix java deserialization of proxy classes

2021-09-25 Thread GitBox


srowen commented on pull request #33879:
URL: https://github.com/apache/spark/pull/33879#issuecomment-927205189


   Out of curiosity, where do proxy classes typically come up?





[GitHub] [spark] srowen commented on pull request #34071: [SPARK-36168][BUILD] Add support for Scala 2.13 in dev/test-dependencies.sh

2021-09-25 Thread GitBox


srowen commented on pull request #34071:
URL: https://github.com/apache/spark/pull/34071#issuecomment-927205034


   Do we need this? The dependency graph isn't Scala-version-specific, at least not for the purpose here of detecting changes.





[GitHub] [spark] srowen commented on pull request #34099: dataset - toIterator

2021-09-25 Thread GitBox


srowen commented on pull request #34099:
URL: https://github.com/apache/spark/pull/34099#issuecomment-927204935


   Why does this help compared with calling collect() and iterating over the result?
   toLocalIterator is already optimized, on purpose, relative to what you are trying to do here.





[GitHub] [spark] srowen commented on a change in pull request #34086: [SPARK-36836][SQL] Fix incorrect result in `sha2` expression

2021-09-25 Thread GitBox


srowen commented on a change in pull request #34086:
URL: https://github.com/apache/spark/pull/34086#discussion_r716117546



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala
##
@@ -134,8 +137,10 @@ case class Sha2(left: Expression, right: Expression)
 if ($eval2 == 224) {
   try {
 java.security.MessageDigest md = java.security.MessageDigest.getInstance("SHA-224");
-md.update($eval1);
-${ev.value} = UTF8String.fromBytes(md.digest());
+byte[] messageDigest = md.digest($eval1);
+String hashText = new java.math.BigInteger(1, messageDigest).toString(16);
+String paddedHashText = String.format("%56s", hashText).replace(' ', '0');

Review comment:
   How about using this same code above to ensure consistency?
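
The padding in the diff above matters because `BigInteger.toString(16)` drops leading zeros, so an unpadded SHA-224 digest can come out shorter than 56 hex characters. A minimal standalone Java sketch of the same idea follows. The class `Sha2HexSketch` and its method names are hypothetical and are not part of the Spark code base.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not Spark's implementation.
class Sha2HexSketch {
    // Renders bytes as lowercase hex, left-padded with '0' to hexWidth characters.
    static String toPaddedHex(byte[] bytes, int hexWidth) {
        // Signum 1 reads the bytes as an unsigned magnitude; toString(16) drops leading zeros.
        String hex = new BigInteger(1, bytes).toString(16);
        return String.format("%" + hexWidth + "s", hex).replace(' ', '0');
    }

    // SHA-224 produces 28 bytes, which is 56 hex characters.
    static String sha224Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-224");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            return toPaddedHex(digest, 56);
        } catch (NoSuchAlgorithmException e) {
            // Wrapped so that callers do not have to handle a checked exception.
            throw new IllegalStateException("SHA-224 is not available", e);
        }
    }
}
```

For example, `toPaddedHex(new byte[] {0x00, 0x0f, (byte) 0xa0}, 6)` keeps the leading zero byte that `toString(16)` alone would drop. Sharing one such helper across the 224, 256, 384 and 512 branches would be one way to get the consistency asked for in the review comment.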







[GitHub] [spark] srowen closed pull request #32925: SPARK-35622: DataFrame's count function do not need groupBy and avoid shuffle

2021-09-25 Thread GitBox


srowen closed pull request #32925:
URL: https://github.com/apache/spark/pull/32925


   





[GitHub] [spark] SparkQA commented on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)

2021-09-25 Thread GitBox


SparkQA commented on pull request #34103:
URL: https://github.com/apache/spark/pull/34103#issuecomment-927202978


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48138/
   





[GitHub] [spark] SparkQA removed a comment on pull request #34106: [SPARK-36854][SQL] Handle ANSI intervals by the off-heap column vector

2021-09-25 Thread GitBox


SparkQA removed a comment on pull request #34106:
URL: https://github.com/apache/spark/pull/34106#issuecomment-927173007


   **[Test build #143625 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143625/testReport)** for PR 34106 at commit [`7dfb85d`](https://github.com/apache/spark/commit/7dfb85d1089a34d248f1d1a094872cde57c5d48a).





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34106: [SPARK-36854][SQL] Handle ANSI intervals by the off-heap column vector

2021-09-25 Thread GitBox


AmplabJenkins removed a comment on pull request #34106:
URL: https://github.com/apache/spark/pull/34106#issuecomment-927202626


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143625/
   





[GitHub] [spark] AmplabJenkins commented on pull request #34106: [SPARK-36854][SQL] Handle ANSI intervals by the off-heap column vector

2021-09-25 Thread GitBox


AmplabJenkins commented on pull request #34106:
URL: https://github.com/apache/spark/pull/34106#issuecomment-927202626


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143625/
   





[GitHub] [spark] SparkQA commented on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns

2021-09-25 Thread GitBox


SparkQA commented on pull request #34038:
URL: https://github.com/apache/spark/pull/34038#issuecomment-927202534


   **[Test build #143628 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143628/testReport)** for PR 34038 at commit [`f382cf2`](https://github.com/apache/spark/commit/f382cf27d1b9eb640129e08da3c2811af04cdc5f).





[GitHub] [spark] SparkQA commented on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP

2021-09-25 Thread GitBox


SparkQA commented on pull request #34051:
URL: https://github.com/apache/spark/pull/34051#issuecomment-927202526


   **[Test build #143627 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143627/testReport)** for PR 34051 at commit [`190fa2b`](https://github.com/apache/spark/commit/190fa2b796454125d83a90309b17a1f970e90fe0).





[GitHub] [spark] SparkQA commented on pull request #34106: [SPARK-36854][SQL] Handle ANSI intervals by the off-heap column vector

2021-09-25 Thread GitBox


SparkQA commented on pull request #34106:
URL: https://github.com/apache/spark/pull/34106#issuecomment-927202385


   **[Test build #143625 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143625/testReport)** for PR 34106 at commit [`7dfb85d`](https://github.com/apache/spark/commit/7dfb85d1089a34d248f1d1a094872cde57c5d48a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] viirya commented on a change in pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP

2021-09-25 Thread GitBox


viirya commented on a change in pull request #34051:
URL: https://github.com/apache/spark/pull/34051#discussion_r716114779



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala
##
@@ -172,8 +172,7 @@ object PartitionPruning extends Rule[LogicalPlan] with PredicateHelper with Join
   // We can't reuse the broadcast because the join type doesn't support broadcast,
   // and doing DPP means running an extra query that may have significant overhead.
   // We need to make sure the pruning side is very big so that DPP is still worthy.
-  canBroadcastBySize(otherPlan, conf) &&

Review comment:
   I added a config that sets a threshold for collecting this query's result.






