[GitHub] [spark] LuciferYang commented on a change in pull request #30701: [SPARK-33212][BUILD] Upgrade to Hadoop 3.2.2 and move to shaded clients for Hadoop 3.x profile
LuciferYang commented on a change in pull request #30701: URL: https://github.com/apache/spark/pull/30701#discussion_r716146866 ## File path: core/pom.xml ## @@ -66,7 +66,13 @@ org.apache.hadoop - hadoop-client + ${hadoop-client-api.artifact} Review comment: @sunchao Yes, the behavior is expected now ~ thx ~ -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] mridulm edited a comment on pull request #34098: [SPARK-36842][Core] TaskSchedulerImpl - stop TaskResultGetter properly
mridulm edited a comment on pull request #34098: URL: https://github.com/apache/spark/pull/34098#issuecomment-927237439 The change looks good to me. Do you want to do the same within `SparkEnv.stop` as well ?
[GitHub] [spark] mridulm commented on pull request #34098: [SPARK-36842][Core] TaskSchedulerImpl - stop TaskResultGetter properly
mridulm commented on pull request #34098: URL: https://github.com/apache/spark/pull/34098#issuecomment-927237439 Do you want to do the same within `SparkEnv.stop` as well ?
[GitHub] [spark] dongjoon-hyun commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom
dongjoon-hyun commented on pull request #34100: URL: https://github.com/apache/spark/pull/34100#issuecomment-927237411 Thank you!
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code
AmplabJenkins removed a comment on pull request #34097: URL: https://github.com/apache/spark/pull/34097#issuecomment-927237205 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48142/
[GitHub] [spark] SparkQA commented on pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code
SparkQA commented on pull request #34097: URL: https://github.com/apache/spark/pull/34097#issuecomment-927237197 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48142/
[GitHub] [spark] AmplabJenkins commented on pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code
AmplabJenkins commented on pull request #34097: URL: https://github.com/apache/spark/pull/34097#issuecomment-927237205 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48142/
[GitHub] [spark] gengliangwang closed pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom
gengliangwang closed pull request #34100: URL: https://github.com/apache/spark/pull/34100
[GitHub] [spark] gengliangwang commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom
gengliangwang commented on pull request #34100: URL: https://github.com/apache/spark/pull/34100#issuecomment-927236621 Merging to master/3.2. Thanks all!
[GitHub] [spark] mridulm commented on pull request #34083: Add docs about using Shiv for packaging (similar to PEX)
mridulm commented on pull request #34083: URL: https://github.com/apache/spark/pull/34083#issuecomment-927235605 +CC @zhouyejoe
[GitHub] [spark] SparkQA commented on pull request #32084: [SPARK-34980][SQL] Support coalesce partition through union in AQE
SparkQA commented on pull request #32084: URL: https://github.com/apache/spark/pull/32084#issuecomment-927234818 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48144/
[GitHub] [spark] SparkQA removed a comment on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP
SparkQA removed a comment on pull request #34051: URL: https://github.com/apache/spark/pull/34051#issuecomment-927202526 **[Test build #143627 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143627/testReport)** for PR 34051 at commit [`190fa2b`](https://github.com/apache/spark/commit/190fa2b796454125d83a90309b17a1f970e90fe0).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33873: [SPARK-36624][YARN] In yarn client mode, when ApplicationMaster failed with KILLED/FAILED, driver should exit with code not 0
AmplabJenkins removed a comment on pull request #33873: URL: https://github.com/apache/spark/pull/33873#issuecomment-927230473 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143631/
[GitHub] [spark] SparkQA removed a comment on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns
SparkQA removed a comment on pull request #34038: URL: https://github.com/apache/spark/pull/34038#issuecomment-927202534 **[Test build #143628 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143628/testReport)** for PR 34038 at commit [`f382cf2`](https://github.com/apache/spark/commit/f382cf27d1b9eb640129e08da3c2811af04cdc5f).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP
AmplabJenkins removed a comment on pull request #34051: URL: https://github.com/apache/spark/pull/34051#issuecomment-927233297 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143627/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns
AmplabJenkins removed a comment on pull request #34038: URL: https://github.com/apache/spark/pull/34038#issuecomment-927233299 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143628/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics
AmplabJenkins removed a comment on pull request #34039: URL: https://github.com/apache/spark/pull/34039#issuecomment-927233298 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48141/
[GitHub] [spark] SparkQA commented on pull request #32084: [SPARK-34980][SQL] Support coalesce partition through union in AQE
SparkQA commented on pull request #32084: URL: https://github.com/apache/spark/pull/32084#issuecomment-927233448 **[Test build #143632 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143632/testReport)** for PR 32084 at commit [`a846ecd`](https://github.com/apache/spark/commit/a846ecd5221bc4b21416c9c52552cdaa0e683d0d).
[GitHub] [spark] AmplabJenkins commented on pull request #34107: [SPARK-36851][SQL] Incorrect parsing of negative ANSI typed interval literals
AmplabJenkins commented on pull request #34107: URL: https://github.com/apache/spark/pull/34107#issuecomment-927233406 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics
AmplabJenkins commented on pull request #34039: URL: https://github.com/apache/spark/pull/34039#issuecomment-927233298 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48141/
[GitHub] [spark] AmplabJenkins commented on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns
AmplabJenkins commented on pull request #34038: URL: https://github.com/apache/spark/pull/34038#issuecomment-927233299 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143628/
[GitHub] [spark] AmplabJenkins commented on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP
AmplabJenkins commented on pull request #34051: URL: https://github.com/apache/spark/pull/34051#issuecomment-927233297 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143627/
[GitHub] [spark] SparkQA commented on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns
SparkQA commented on pull request #34038: URL: https://github.com/apache/spark/pull/34038#issuecomment-927232746 **[Test build #143628 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143628/testReport)** for PR 34038 at commit [`f382cf2`](https://github.com/apache/spark/commit/f382cf27d1b9eb640129e08da3c2811af04cdc5f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33873: [SPARK-36624][YARN] In yarn client mode, when ApplicationMaster failed with KILLED/FAILED, driver should exit with code not 0
SparkQA commented on pull request #33873: URL: https://github.com/apache/spark/pull/33873#issuecomment-927232712 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48143/
[GitHub] [spark] SparkQA commented on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP
SparkQA commented on pull request #34051: URL: https://github.com/apache/spark/pull/34051#issuecomment-927232713 **[Test build #143627 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143627/testReport)** for PR 34051 at commit [`190fa2b`](https://github.com/apache/spark/commit/190fa2b796454125d83a90309b17a1f970e90fe0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] Peng-Lei commented on pull request #34107: [SPARK-36851][SQL] Incorrect parsing of negative ANSI typed interval literals
Peng-Lei commented on pull request #34107: URL: https://github.com/apache/spark/pull/34107#issuecomment-927232500 @MaxGekk Could you take a look ? Is this fix okay ? Thank you.
[GitHub] [spark] SparkQA commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics
SparkQA commented on pull request #34039: URL: https://github.com/apache/spark/pull/34039#issuecomment-927232425 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48141/
[GitHub] [spark] Peng-Lei opened a new pull request #34107: [SPARK-36851][SQL] Incorrect parsing of negative ANSI typed interval literals
Peng-Lei opened a new pull request #34107: URL: https://github.com/apache/spark/pull/34107

### What changes were proposed in this pull request?
Fix the incorrect parsing of negative ANSI typed interval literals. See [SPARK-36851](https://issues.apache.org/jira/browse/SPARK-36851).

### Why are the changes needed?
Incorrect result:
```
spark-sql> select interval -'1' year;
1-0
```

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Added a unit test case.
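To make the reported symptom concrete, here is a minimal Java sketch of sign handling for a typed year literal. It is a toy parser written for illustration only, not Spark's parser or the code in this PR: the class `IntervalSignSketch`, the regular expression, and the `years-months` output format are all assumptions. The point it shows is that the sign written before the quoted string has to be carried into the resulting month count, so that `interval -'1' year` no longer prints the same value as `interval '1' year`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy parser; names and format are assumptions, not Spark internals.
class IntervalSignSketch {
    // Matches: interval [+|-]'<digits>' year  (case-insensitive)
    private static final Pattern YEAR_LITERAL =
        Pattern.compile("(?i)interval\\s+([+-])?\\s*'\\s*(\\d+)\\s*'\\s+year");

    // Converts the literal to a signed number of months.
    static int toMonths(String literal) {
        Matcher m = YEAR_LITERAL.matcher(literal.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unsupported literal: " + literal);
        }
        // The sign in front of the quoted string must not be dropped.
        int sign = "-".equals(m.group(1)) ? -1 : 1;
        int years = Integer.parseInt(m.group(2));
        return Math.multiplyExact(sign * 12, years);
    }

    // Renders months as "<years>-<months>" with a leading '-' for negative values.
    static String format(int months) {
        String sign = months < 0 ? "-" : "";
        long abs = Math.abs((long) months);
        return sign + (abs / 12) + "-" + (abs % 12);
    }

    public static void main(String[] args) {
        System.out.println(format(toMonths("interval -'1' year")));
        System.out.println(format(toMonths("interval '1' year")));
    }
}
```

In this sketch the sign is applied once, at the point where the quoted value is converted to months, which keeps the formatting step independent of how the literal was written.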
[GitHub] [spark] SparkQA commented on pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code
SparkQA commented on pull request #34097: URL: https://github.com/apache/spark/pull/34097#issuecomment-927232018 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48142/
[GitHub] [spark] AmplabJenkins commented on pull request #33873: [SPARK-36624][YARN] In yarn client mode, when ApplicationMaster failed with KILLED/FAILED, driver should exit with code not 0
AmplabJenkins commented on pull request #33873: URL: https://github.com/apache/spark/pull/33873#issuecomment-927230473 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143631/
[GitHub] [spark] SparkQA removed a comment on pull request #33873: [SPARK-36624][YARN] In yarn client mode, when ApplicationMaster failed with KILLED/FAILED, driver should exit with code not 0
SparkQA removed a comment on pull request #33873: URL: https://github.com/apache/spark/pull/33873#issuecomment-927228151 **[Test build #143631 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143631/testReport)** for PR 33873 at commit [`fc7c271`](https://github.com/apache/spark/commit/fc7c2716e6231982f6bae91e30e6a2aac5e27aa2).
[GitHub] [spark] SparkQA commented on pull request #33873: [SPARK-36624][YARN] In yarn client mode, when ApplicationMaster failed with KILLED/FAILED, driver should exit with code not 0
SparkQA commented on pull request #33873: URL: https://github.com/apache/spark/pull/33873#issuecomment-927230429 **[Test build #143631 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143631/testReport)** for PR 33873 at commit [`fc7c271`](https://github.com/apache/spark/commit/fc7c2716e6231982f6bae91e30e6a2aac5e27aa2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] ulysses-you commented on pull request #32084: [SPARK-34980][SQL] Support coalesce partition through union in AQE
ulysses-you commented on pull request #32084: URL: https://github.com/apache/spark/pull/32084#issuecomment-927229959 retest this please
[GitHub] [spark] sunchao commented on a change in pull request #30701: [SPARK-33212][BUILD] Upgrade to Hadoop 3.2.2 and move to shaded clients for Hadoop 3.x profile
sunchao commented on a change in pull request #30701: URL: https://github.com/apache/spark/pull/30701#discussion_r716138328
## File path: core/pom.xml ##
@@ -66,7 +66,13 @@
 org.apache.hadoop
- hadoop-client
+ ${hadoop-client-api.artifact}
Review comment: @LuciferYang could you check with the fix in #34100? I just tested it with the command you pasted above:
```
mvn clean install -DskipTests -pl resource-managers/yarn -am -Phadoop-2.7 -Pyarn
mvn test -pl resource-managers/yarn -Phadoop-2.7 -Pyarn -DwildcardSuites=org.apache.spark.deploy.yarn.YarnClusterSuite
```
and the tests all passed for me.
[GitHub] [spark] SparkQA commented on pull request #33873: [SPARK-36624][YARN] In yarn client mode, when ApplicationMaster failed with KILLED/FAILED, driver should exit with code not 0
SparkQA commented on pull request #33873: URL: https://github.com/apache/spark/pull/33873#issuecomment-927228151 **[Test build #143631 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143631/testReport)** for PR 33873 at commit [`fc7c271`](https://github.com/apache/spark/commit/fc7c2716e6231982f6bae91e30e6a2aac5e27aa2).
[GitHub] [spark] SparkQA commented on pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code
SparkQA commented on pull request #34097: URL: https://github.com/apache/spark/pull/34097#issuecomment-927228025 **[Test build #143630 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143630/testReport)** for PR 34097 at commit [`85297cf`](https://github.com/apache/spark/commit/85297cf9017a5a58c5cee2e9140197ccd607b188).
[GitHub] [spark] SparkQA commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics
SparkQA commented on pull request #34039: URL: https://github.com/apache/spark/pull/34039#issuecomment-927227822 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48141/
[GitHub] [spark] LuciferYang commented on a change in pull request #30701: [SPARK-33212][BUILD] Upgrade to Hadoop 3.2.2 and move to shaded clients for Hadoop 3.x profile
LuciferYang commented on a change in pull request #30701: URL: https://github.com/apache/spark/pull/30701#discussion_r716135602 ## File path: core/pom.xml ## @@ -66,7 +66,13 @@ org.apache.hadoop - hadoop-client + ${hadoop-client-api.artifact} Review comment: I tested these commands on 3.2-rc4 (3.2-rc5 can't build with hadoop-2.7 now); the problem still exists.
[GitHub] [spark] AngersZhuuuu commented on pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code
AngersZhuuuu commented on pull request #34097: URL: https://github.com/apache/spark/pull/34097#issuecomment-927225409
> @AngersZhuuuu,
>
> > Make generated code more simple
>
> can you elaborate on it more in the PR description?

Done
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34097: [SPARK-36838][SQL] Refactor InSet generated code
AngersZhuuuu commented on a change in pull request #34097: URL: https://github.com/apache/spark/pull/34097#discussion_r716135156
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ##
@@ -612,26 +612,28 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
       ""
     }
-    val ret = child.dataType match {
+    val isNaNCode = child.dataType match {
       case DoubleType => Some((v: Any) => s"java.lang.Double.isNaN($v)")
       case FloatType => Some((v: Any) => s"java.lang.Float.isNaN($v)")
       case _ => None
     }
-    ret.map { isNaN =>
-      s"""
-         |if ($setTerm.contains($c)) {
-         |  ${ev.value} = true;
-         |} else if (${isNaN(c)}) {
-         |  ${ev.value} = $hasNaN;
-         |}
-         |$setIsNull
-         |""".stripMargin
-    }.getOrElse(
-      s"""
-         |${ev.value} = $setTerm.contains($c);
-         |$setIsNull
-       """.stripMargin)
+    hasNaN match {
Review comment:
> Can we just use if-else here? Also, let's file a separate JIRA. This is technically a performance improvement to avoid dispatching on nan per the values at in-set.

Done
[GitHub] [spark] LuciferYang commented on a change in pull request #30701: [SPARK-33212][BUILD] Upgrade to Hadoop 3.2.2 and move to shaded clients for Hadoop 3.x profile
LuciferYang commented on a change in pull request #30701: URL: https://github.com/apache/spark/pull/30701#discussion_r716134847 ## File path: core/pom.xml ## @@ -66,7 +66,13 @@ org.apache.hadoop - hadoop-client + ${hadoop-client-api.artifact} Review comment: @sunchao Yes, this problem still exists; at present only branch-3.1 behaves as expected.
[GitHub] [spark] LuciferYang commented on pull request #34100: [SPARK-36835][FOLLOWUP][BUILD][TEST-HADOOP2.7] Fix maven issue for Hadoop 2.7 profile after enabling dependency reduced pom
LuciferYang commented on pull request #34100: URL: https://github.com/apache/spark/pull/34100#issuecomment-927224824 branch-3.2 also seems to need this fix
[GitHub] [spark] HyukjinKwon closed pull request #34106: [SPARK-36854][SQL] Handle ANSI intervals by the off-heap column vector
HyukjinKwon closed pull request #34106: URL: https://github.com/apache/spark/pull/34106
[GitHub] [spark] HyukjinKwon commented on pull request #34106: [SPARK-36854][SQL] Handle ANSI intervals by the off-heap column vector
HyukjinKwon commented on pull request #34106: URL: https://github.com/apache/spark/pull/34106#issuecomment-927224439 Merged to master.
[GitHub] [spark] HyukjinKwon closed pull request #34105: [SPARK-36852][SQL][TESTS] Test ANSI interval support by the Parquet datasource
HyukjinKwon closed pull request #34105: URL: https://github.com/apache/spark/pull/34105
[GitHub] [spark] HyukjinKwon commented on pull request #34105: [SPARK-36852][SQL][TESTS] Test ANSI interval support by the Parquet datasource
HyukjinKwon commented on pull request #34105: URL: https://github.com/apache/spark/pull/34105#issuecomment-927224188 Merged to master.
[GitHub] [spark] HyukjinKwon commented on pull request #34098: [SPARK-36842][Core] TaskSchedulerImpl - stop TaskResultGetter properly
HyukjinKwon commented on pull request #34098: URL: https://github.com/apache/spark/pull/34098#issuecomment-927224007 cc @mridulm, @Ngone51 and @tgravescs FYI
[GitHub] [spark] HyukjinKwon commented on pull request #34097: [SPARK-36792][SQL][FOLLOWUP] Refactor InSet generated code
HyukjinKwon commented on pull request #34097: URL: https://github.com/apache/spark/pull/34097#issuecomment-927223930 @AngersZhuuuu,
> Make generated code more simple

can you elaborate on it more in the PR description?
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34097: [SPARK-36792][SQL][FOLLOWUP] Refactor InSet generated code
HyukjinKwon commented on a change in pull request #34097: URL: https://github.com/apache/spark/pull/34097#discussion_r716133921
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ##
@@ -612,26 +612,28 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
       ""
     }
-    val ret = child.dataType match {
+    val isNaNCode = child.dataType match {
       case DoubleType => Some((v: Any) => s"java.lang.Double.isNaN($v)")
       case FloatType => Some((v: Any) => s"java.lang.Float.isNaN($v)")
       case _ => None
     }
-    ret.map { isNaN =>
-      s"""
-         |if ($setTerm.contains($c)) {
-         |  ${ev.value} = true;
-         |} else if (${isNaN(c)}) {
-         |  ${ev.value} = $hasNaN;
-         |}
-         |$setIsNull
-         |""".stripMargin
-    }.getOrElse(
-      s"""
-         |${ev.value} = $setTerm.contains($c);
-         |$setIsNull
-       """.stripMargin)
+    hasNaN match {
Review comment: Can we just use if-else here? Also, let's file a separate JIRA. This is technically a performance improvement that avoids dispatching on NaN for each value checked against the in-set.
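A minimal sketch of the if-else shape suggested in the review comment above, under stated assumptions. It is not the code from the PR: `SimpleType`, `InSetCodegenSketch`, and `genInSetCheck` are hypothetical stand-ins for Catalyst's types and codegen, and the sketch assumes the generated value defaults to false, so the NaN test only needs to be emitted when the set actually contains NaN.

```scala
// Hypothetical stand-ins for Catalyst's DoubleType, FloatType and all other types.
sealed trait SimpleType
case object DoubleT extends SimpleType
case object FloatT extends SimpleType
case object OtherT extends SimpleType

object InSetCodegenSketch {
  // Returns the Java statement that assigns `value` for the input term `c`.
  // All parameter names are illustrative, not Spark's API.
  def genInSetCheck(
      dataType: SimpleType,
      hasNaN: Boolean,
      setTerm: String,
      c: String,
      value: String): String = {
    // Only floating-point types can hold NaN.
    val isNaNCode: Option[String] = dataType match {
      case DoubleT => Some(s"java.lang.Double.isNaN($c)")
      case FloatT => Some(s"java.lang.Float.isNaN($c)")
      case OtherT => None
    }
    val plain = s"$value = $setTerm.contains($c);"
    if (hasNaN) {
      // The set holds NaN: a NaN input must also evaluate to true.
      isNaNCode.map(nan => s"$value = $setTerm.contains($c) || $nan;").getOrElse(plain)
    } else {
      // No NaN in the set: skip the NaN test entirely.
      plain
    }
  }
}
```

With `hasNaN = false` the sketch emits only the `contains` call even for `DoubleT`, which is the per-value saving the review comment describes.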
[GitHub] [spark] HyukjinKwon commented on pull request #34099: dataset - toIterator
HyukjinKwon commented on pull request #34099: URL: https://github.com/apache/spark/pull/34099#issuecomment-927223355 Yeah, there's no benefit to this.
[GitHub] [spark] HyukjinKwon commented on pull request #34093: [SPARK-36294][SQL] Refactor fifth set of 20 query execution errors to use error classes
HyukjinKwon commented on pull request #34093: URL: https://github.com/apache/spark/pull/34093#issuecomment-927223084 Thanks for working on this @Peng-Lei
[GitHub] [spark] HyukjinKwon commented on pull request #34093: [SPARK-36294][SQL] Refactor fifth set of 20 query execution errors to use error classes
HyukjinKwon commented on pull request #34093: URL: https://github.com/apache/spark/pull/34093#issuecomment-927223051 This seems to be a related test failure: ``` `write.df(df, source = "csv")` threw an error with unexpected message. Expected match: "Error in save : org.apache.spark.SparkIllegalArgumentException: Expected exactly one path to be specified" Actual message: "Error in save : org.apache.spark.SparkIllegalArgumentException: Expected exactly one path to be specified, but got: \n\tat org.apache.spark.sql.errors.QueryExecutionErrors$.multiplePathsSpecifiedError(QueryExecutionErrors.scala:450)\n\tat org.apache.spark.sql.execution.datasources.DataSource.planForWritingFileFormat(DataSource.scala:464)\n\tat org.apache.spark.sql.execution.datasources.DataSource.planForWriting(DataSource.scala:558)\n\tat org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:382)\n\tat org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:355)\n\tat org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:247)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)\n\tat sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)\n\tat java.lang.reflect.Method.invoke(Method.java:498)\n\tat org.apache.spark.api.r.RBackendHandler.handleMethodCall(RBackendHandler.scala:164)\n\tat org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:105)\n\tat org.apache.spark.api.r.RBackendHandler.channelRead0(RBackendHandler.scala:39)\n\tat io.netty.channel.SimpleChannelInboundHandler.channelRead(SimpleChannelInboundHandler.java:99)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat 
io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:286)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:103)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:324)\n\tat io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:296)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:357)\n\tat io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:379)\n\tat io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:365)\n\tat io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)\n\tat 
io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:166)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:719)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:655)\n\tat io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:581)\n\tat io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:493)\n\tat io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)\n\tat io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)\n\tat io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)\n\tat java.lang.Thread.run(Thread.java:748)\n\n" Backtrace: 1. testthat::expect_error(...) test_sparkSQL.R:3875:2 7. SparkR::write.df(df, source = "csv") 8. SparkR:::.local(df, path, ...) 9. SparkR:::handledCallJMethod(write, "save") 10. base::tryCatch(...) 11. base:::tryCatchList(expr, classes, parentenv, handlers) 12.
[GitHub] [spark] HyukjinKwon commented on pull request #34053: [SPARK-36813][SQL][PYTHON] Propose an infrastructure of as-of join and imlement ps.merge_asof
HyukjinKwon commented on pull request #34053: URL: https://github.com/apache/spark/pull/34053#issuecomment-927222757 Will merge it in a few days if there are no more comments.
[GitHub] [spark] SparkQA commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics
SparkQA commented on pull request #34039: URL: https://github.com/apache/spark/pull/34039#issuecomment-927222515 **[Test build #143629 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143629/testReport)** for PR 34039 at commit [`5e6b359`](https://github.com/apache/spark/commit/5e6b3596da38ed0a98ef47c97169faf3ce52fa70).
[GitHub] [spark] HyukjinKwon commented on a change in pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP
HyukjinKwon commented on a change in pull request #34051: URL: https://github.com/apache/spark/pull/34051#discussion_r716132533 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala ## @@ -172,8 +172,7 @@ object PartitionPruning extends Rule[LogicalPlan] with PredicateHelper with Join // We can't reuse the broadcast because the join type doesn't support broadcast, // and doing DPP means running an extra query that may have significant overhead. // We need to make sure the pruning side is very big so that DPP is still worthy. - canBroadcastBySize(otherPlan, conf) && Review comment: cc @maryannxue FYI
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics
AmplabJenkins removed a comment on pull request #34039: URL: https://github.com/apache/spark/pull/34039#issuecomment-922243538 Can one of the admins verify this patch?
[GitHub] [spark] HyukjinKwon commented on pull request #34039: [SPARK-36798][CORE] Wait for listeners to finish before flushing metrics
HyukjinKwon commented on pull request #34039: URL: https://github.com/apache/spark/pull/34039#issuecomment-927222094 ok to test
[GitHub] [spark] HyukjinKwon commented on pull request #34009: [SPARK-34378][SQL][AVRO] Enhance AvroSerializer validation to allow extra nullable Avro fields
HyukjinKwon commented on pull request #34009: URL: https://github.com/apache/spark/pull/34009#issuecomment-927222065 cc @HeartSaVioR, who might also have some context
[GitHub] [spark] HyukjinKwon commented on pull request #33839: [SPARK-36291][SQL] Refactor second set of 20 in QueryExecutionErrors to use error classes
HyukjinKwon commented on pull request #33839: URL: https://github.com/apache/spark/pull/33839#issuecomment-927221521 @dgd-contributor, please contact me or priv...@spark.apache.org. As I shared in the email, submissions from the specific shared account will not be accepted for now.
[GitHub] [spark] wangyum commented on pull request #34070: [SPARK-36840][SQL] Support DPP if there is no selective predicate on the filtering side
wangyum commented on pull request #34070: URL: https://github.com/apache/spark/pull/34070#issuecomment-927219835 cc @cloud-fan @maryannxue
[GitHub] [spark] HyukjinKwon commented on pull request #34091: [SPARK-36839][INFRA] Add daily build with Hadoop 2 profile in GitHub Actions build
HyukjinKwon commented on pull request #34091: URL: https://github.com/apache/spark/pull/34091#issuecomment-927219739 Merged to master.
[GitHub] [spark] HyukjinKwon closed pull request #34091: [SPARK-36839][INFRA] Add daily build with Hadoop 2 profile in GitHub Actions build
HyukjinKwon closed pull request #34091: URL: https://github.com/apache/spark/pull/34091
[GitHub] [spark] HyukjinKwon commented on pull request #34091: [SPARK-36839][INFRA] Add daily build with Hadoop 2 profile in GitHub Actions build
HyukjinKwon commented on pull request #34091: URL: https://github.com/apache/spark/pull/34091#issuecomment-927219648 BTW, I am working on a JDK 11 build too. I'll open a PR next week. cc @dongjoon-hyun
[GitHub] [spark] HyukjinKwon commented on pull request #34091: [SPARK-36839][INFRA] Add daily build with Hadoop 2 profile in GitHub Actions build
HyukjinKwon commented on pull request #34091: URL: https://github.com/apache/spark/pull/34091#issuecomment-927219430 It won't block anything on dev, so let me merge this and fix the tests separately.
[GitHub] [spark] ulysses-you commented on pull request #34069: [SPARK-36823][SQL] Support broadcast nested loop join hint for equi-join
ulysses-you commented on pull request #34069: URL: https://github.com/apache/spark/pull/34069#issuecomment-927217863 Hi @c21, I agree. In general, BNLJ is much slower than SMJ. However, I found an extreme case: a left join with a very small left side and a large right side, where the right side is also skewed. In that case SMJ does not work well and can even fail with OOM on the skewed partition. Here is a simple benchmark from my local environment:
```scala
spark.range(0, 1000).selectExpr("id % 1 as c1", "id as c2").repartition(100).createOrReplaceTempView("t1")
spark.range(0, 10).selectExpr("id as c1").createOrReplaceTempView("t2")

// 5s
spark.sql("select /*+ merge(t2) */ count(*) from t2 left join t1 on t1.c1 = t2.c1").collect
// 3s
spark.sql("select /*+ broadcast_nl(t2) */ count(*) from t2 left join t1 on t1.c1 = t2.c1").collect
```
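To make the skew in that benchmark concrete, here is a small Spark-free sketch of the same data using plain Scala collections. The object and value names (`SkewSketch`, `rowsPerKey`, `leftJoinCount`) are illustrative assumptions, and the sketch only reproduces the key distribution and the expected `count(*)`; it says nothing about the timings.

```scala
object SkewSketch {
  // Mirrors `spark.range(0, 1000).selectExpr("id % 1 as c1")`: every key is 0.
  val t1Keys: Seq[Long] = (0L until 1000L).map(_ % 1)
  // Mirrors `spark.range(0, 10).selectExpr("id as c1")`: keys 0 to 9.
  val t2Keys: Seq[Long] = (0L until 10L).toList

  // Number of t1 rows for each join key.
  val rowsPerKey: Map[Long, Int] =
    t1Keys.groupBy(identity).map { case (k, rows) => k -> rows.size }

  // Row count of `t2 left join t1 on t1.c1 = t2.c1`:
  // a t2 row with no match still yields one output row.
  val leftJoinCount: Int =
    t2Keys.map(k => math.max(rowsPerKey.getOrElse(k, 0), 1)).sum
}
```

Because `id % 1` is always 0, all 1000 rows of `t1` share a single join key, so a join that partitions by key sends the whole right side to one task. This is the skew the comment refers to.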
[GitHub] [spark] HyukjinKwon commented on pull request #34102: [SPARK-36847][PYTHON] Explicitly specify error codes when ignoring type hint errors
HyukjinKwon commented on pull request #34102: URL: https://github.com/apache/spark/pull/34102#issuecomment-927216525 Merged to master.
[GitHub] [spark] HyukjinKwon closed pull request #34102: [SPARK-36847][PYTHON] Explicitly specify error codes when ignoring type hint errors
HyukjinKwon closed pull request #34102: URL: https://github.com/apache/spark/pull/34102
[GitHub] [spark] HyukjinKwon commented on pull request #34058: [SPARK-36711][PYTHON] Support multi-index in new syntax
HyukjinKwon commented on pull request #34058: URL: https://github.com/apache/spark/pull/34058#issuecomment-927216403 Yeah, let's hold off for a while.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP
AmplabJenkins removed a comment on pull request #34051: URL: https://github.com/apache/spark/pull/34051#issuecomment-927215025 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48140/
[GitHub] [spark] AngersZhuuuu commented on pull request #34097: [SPARK-36792][SQL][FOLLOWUP] Refactor InSet generated code
AngersZhuuuu commented on pull request #34097: URL: https://github.com/apache/spark/pull/34097#issuecomment-927215585 ping @cloud-fan
[GitHub] [spark] AmplabJenkins commented on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP
AmplabJenkins commented on pull request #34051: URL: https://github.com/apache/spark/pull/34051#issuecomment-927215025 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48140/
[GitHub] [spark] SparkQA commented on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP
SparkQA commented on pull request #34051: URL: https://github.com/apache/spark/pull/34051#issuecomment-927213916 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48140/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns
AmplabJenkins removed a comment on pull request #34038: URL: https://github.com/apache/spark/pull/34038#issuecomment-927211243 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48139/
[GitHub] [spark] AmplabJenkins commented on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns
AmplabJenkins commented on pull request #34038: URL: https://github.com/apache/spark/pull/34038#issuecomment-927211243 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48139/
[GitHub] [spark] SparkQA commented on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns
SparkQA commented on pull request #34038: URL: https://github.com/apache/spark/pull/34038#issuecomment-927211237 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48139/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)
AmplabJenkins removed a comment on pull request #34103: URL: https://github.com/apache/spark/pull/34103#issuecomment-927211169 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143626/
[GitHub] [spark] AmplabJenkins commented on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)
AmplabJenkins commented on pull request #34103: URL: https://github.com/apache/spark/pull/34103#issuecomment-927211169 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143626/
[GitHub] [spark] SparkQA removed a comment on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)
SparkQA removed a comment on pull request #34103: URL: https://github.com/apache/spark/pull/34103#issuecomment-927198458 **[Test build #143626 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143626/testReport)** for PR 34103 at commit [`12a8aca`](https://github.com/apache/spark/commit/12a8aca635ac15b1042ded973a244d3872a18c93).
[GitHub] [spark] SparkQA commented on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)
SparkQA commented on pull request #34103: URL: https://github.com/apache/spark/pull/34103#issuecomment-927211000 **[Test build #143626 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143626/testReport)** for PR 34103 at commit [`12a8aca`](https://github.com/apache/spark/commit/12a8aca635ac15b1042ded973a244d3872a18c93). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)
AmplabJenkins removed a comment on pull request #34103: URL: https://github.com/apache/spark/pull/34103#issuecomment-927207228 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48138/
[GitHub] [spark] AmplabJenkins commented on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)
AmplabJenkins commented on pull request #34103: URL: https://github.com/apache/spark/pull/34103#issuecomment-927207228 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48138/
[GitHub] [spark] SparkQA commented on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)
SparkQA commented on pull request #34103: URL: https://github.com/apache/spark/pull/34103#issuecomment-927207062 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48138/
[GitHub] [spark] SparkQA commented on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP
SparkQA commented on pull request #34051: URL: https://github.com/apache/spark/pull/34051#issuecomment-927206217 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48140/
[GitHub] [spark] srowen closed pull request #33761: Increasing performance of upper case operation for non-ascii-only strings
srowen closed pull request #33761: URL: https://github.com/apache/spark/pull/33761
[GitHub] [spark] SparkQA commented on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns
SparkQA commented on pull request #34038: URL: https://github.com/apache/spark/pull/34038#issuecomment-927205558 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48139/
[GitHub] [spark] srowen closed pull request #32808: [SPARK-35598] Improve Spark-ML PCA analysis
srowen closed pull request #32808: URL: https://github.com/apache/spark/pull/32808
[GitHub] [spark] srowen closed pull request #32866: [SPARK-35713]Bug fix for thread leak in JobCancellationSuite
srowen closed pull request #32866: URL: https://github.com/apache/spark/pull/32866
[GitHub] [spark] srowen commented on pull request #33879: [SPARK-36627][CORE] Fix java deserialization of proxy classes
srowen commented on pull request #33879: URL: https://github.com/apache/spark/pull/33879#issuecomment-927205189 Out of curiosity, where do proxy classes typically come up?
[GitHub] [spark] srowen commented on pull request #34071: [SPARK-36168][BUILD] Add support for Scala 2.13 in dev/test-dependencies.sh
srowen commented on pull request #34071: URL: https://github.com/apache/spark/pull/34071#issuecomment-927205034 Do we need this? The dependency graph isn't Scala-version-specific, at least not for the purpose here of detecting changes.
[GitHub] [spark] srowen commented on pull request #34099: dataset - toIterator
srowen commented on pull request #34099: URL: https://github.com/apache/spark/pull/34099#issuecomment-927204935 Why does this help vs. calling collect() and iterating over the result? toLocalIterator is, on purpose, optimized for what you are trying to do here.
[GitHub] [spark] srowen commented on a change in pull request #34086: [SPARK-36836][SQL] Fix incorrect result in `sha2` expression
srowen commented on a change in pull request #34086: URL: https://github.com/apache/spark/pull/34086#discussion_r716117546
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/hash.scala ##
@@ -134,8 +137,10 @@ case class Sha2(left: Expression, right: Expression)
 if ($eval2 == 224) {
 try {
 java.security.MessageDigest md = java.security.MessageDigest.getInstance("SHA-224");
-md.update($eval1);
-${ev.value} = UTF8String.fromBytes(md.digest());
+byte[] messageDigest = md.digest($eval1);
+String hashText = new java.math.BigInteger(1, messageDigest).toString(16);
+String paddedHashText = String.format("%56s", hashText).replace(' ', '0');
Review comment: How about using this same code above to ensure consistency?
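For readers following the hunk above, here is a minimal standalone sketch of the hex-and-pad step the added lines perform. It is not the code from the pull request: the class name `Sha224HexSketch` and the helpers `toPaddedHex` and `sha224Hex` are hypothetical, and the pad width is derived from the digest length rather than hard-coded to 56 as in the diff. The reason padding matters is that `BigInteger.toString(16)` drops leading zeros, so a digest whose first bytes are zero would otherwise render shorter than expected.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class Sha224HexSketch {

    // Hypothetical helper: renders a digest as lowercase hex, left-padded
    // with '0' to two characters per byte.
    public static String toPaddedHex(byte[] digest) {
        if (digest.length == 0) {
            return "";
        }
        // Signum 1 treats the bytes as an unsigned magnitude;
        // toString(16) omits leading zeros.
        String hex = new BigInteger(1, digest).toString(16);
        int width = digest.length * 2;
        // Right-align in a field of `width` spaces, then turn the spaces into zeros.
        return String.format("%" + width + "s", hex).replace(' ', '0');
    }

    // Hypothetical helper: SHA-224 of the input as padded hex (56 characters).
    public static String sha224Hex(byte[] input) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-224");
        return toPaddedHex(md.digest(input));
    }

    public static void main(String[] args) throws Exception {
        // The leading zero byte survives as "00" thanks to the padding.
        System.out.println(toPaddedHex(new byte[] {0x00, 0x0f}));
        System.out.println(sha224Hex("Spark".getBytes(StandardCharsets.UTF_8)));
    }
}
```

Sharing one such helper between the branches of the generated code is one way to get the consistency the review comment asks for, though how the pull request actually resolved it is not shown in this thread.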
[GitHub] [spark] srowen closed pull request #32925: SPARK-35622: DataFrame's count function do not need groupBy and avoid shuffle
srowen closed pull request #32925: URL: https://github.com/apache/spark/pull/32925
[GitHub] [spark] SparkQA commented on pull request #34103: [SPARK-32712][SQL] Support writing Hive bucketed table (Hive file formats with Hive hash)
SparkQA commented on pull request #34103: URL: https://github.com/apache/spark/pull/34103#issuecomment-927202978 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48138/
[GitHub] [spark] SparkQA removed a comment on pull request #34106: [SPARK-36854][SQL] Handle ANSI intervals by the off-heap column vector
SparkQA removed a comment on pull request #34106: URL: https://github.com/apache/spark/pull/34106#issuecomment-927173007 **[Test build #143625 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143625/testReport)** for PR 34106 at commit [`7dfb85d`](https://github.com/apache/spark/commit/7dfb85d1089a34d248f1d1a094872cde57c5d48a).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #34106: [SPARK-36854][SQL] Handle ANSI intervals by the off-heap column vector
AmplabJenkins removed a comment on pull request #34106: URL: https://github.com/apache/spark/pull/34106#issuecomment-927202626 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143625/
[GitHub] [spark] AmplabJenkins commented on pull request #34106: [SPARK-36854][SQL] Handle ANSI intervals by the off-heap column vector
AmplabJenkins commented on pull request #34106: URL: https://github.com/apache/spark/pull/34106#issuecomment-927202626 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143625/
[GitHub] [spark] SparkQA commented on pull request #34038: [SPARK-36797][SQL] Union should resolve nested columns as top-level columns
SparkQA commented on pull request #34038: URL: https://github.com/apache/spark/pull/34038#issuecomment-927202534 **[Test build #143628 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143628/testReport)** for PR 34038 at commit [`f382cf2`](https://github.com/apache/spark/commit/f382cf27d1b9eb640129e08da3c2811af04cdc5f).
[GitHub] [spark] SparkQA commented on pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP
SparkQA commented on pull request #34051: URL: https://github.com/apache/spark/pull/34051#issuecomment-927202526 **[Test build #143627 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143627/testReport)** for PR 34051 at commit [`190fa2b`](https://github.com/apache/spark/commit/190fa2b796454125d83a90309b17a1f970e90fe0).
[GitHub] [spark] SparkQA commented on pull request #34106: [SPARK-36854][SQL] Handle ANSI intervals by the off-heap column vector
SparkQA commented on pull request #34106: URL: https://github.com/apache/spark/pull/34106#issuecomment-927202385 **[Test build #143625 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143625/testReport)** for PR 34106 at commit [`7dfb85d`](https://github.com/apache/spark/commit/7dfb85d1089a34d248f1d1a094872cde57c5d48a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] viirya commented on a change in pull request #34051: [SPARK-36809][SQL] Remove broadcast for InSubqueryExec used in DPP
viirya commented on a change in pull request #34051: URL: https://github.com/apache/spark/pull/34051#discussion_r716114779
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/dynamicpruning/PartitionPruning.scala ##
@@ -172,8 +172,7 @@ object PartitionPruning extends Rule[LogicalPlan] with PredicateHelper with Join
 // We can't reuse the broadcast because the join type doesn't support broadcast,
 // and doing DPP means running an extra query that may have significant overhead.
 // We need to make sure the pruning side is very big so that DPP is still worthy.
- canBroadcastBySize(otherPlan, conf) &&
Review comment: I added one config to set a threshold for this query collecting.