[GitHub] [spark] AmplabJenkins commented on pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…
AmplabJenkins commented on pull request #29907: URL: https://github.com/apache/spark/pull/29907#issuecomment-703051380 Can one of the admins verify this patch? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…
AmplabJenkins removed a comment on pull request #29907: URL: https://github.com/apache/spark/pull/29907#issuecomment-703051265 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…
AmplabJenkins removed a comment on pull request #29907: URL: https://github.com/apache/spark/pull/29907#issuecomment-701138683 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…
AmplabJenkins commented on pull request #29907: URL: https://github.com/apache/spark/pull/29907#issuecomment-703051265 Can one of the admins verify this patch?
[GitHub] [spark] duanmeng commented on pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…
duanmeng commented on pull request #29907: URL: https://github.com/apache/spark/pull/29907#issuecomment-703051053 > > Hi @duanmeng Could you elaborate more on how to reproduce the issue? Sorry that I accidentally closed it; I will reopen it. This issue is hard to reproduce because it is probably caused by a bug in the cluster's disk / kernel, which makes the shuffle data file empty after the records are committed. But we defend against it in Spark, so should we change it from **bug** to **improvement**?
[GitHub] [spark] duanmeng commented on pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…
duanmeng commented on pull request #29907: URL: https://github.com/apache/spark/pull/29907#issuecomment-703050101 > Hi @duanmeng Could you elaborate more on how to reproduce the issue?
[GitHub] [spark] duanmeng closed pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…
duanmeng closed pull request #29907: URL: https://github.com/apache/spark/pull/29907
[GitHub] [spark] HyukjinKwon commented on pull request #29934: Make sure the pod template configmap has a unique name
HyukjinKwon commented on pull request #29934: URL: https://github.com/apache/spark/pull/29934#issuecomment-703042759 Can you file a jira and link it to the PR title? See also https://spark.apache.org/contributing.html
[GitHub] [spark] HyukjinKwon commented on a change in pull request #29935: [SPARK-33055][PYTHON][SQL] Add Python CalendarIntervalType
HyukjinKwon commented on a change in pull request #29935: URL: https://github.com/apache/spark/pull/29935#discussion_r499113675 ## File path: python/pyspark/sql/types.py ## @@ -186,6 +186,30 @@ def fromInternal(self, ts): return datetime.datetime.fromtimestamp(ts // 100).replace(microsecond=ts % 100) +class CalendarIntervalType(DataType, metaclass=DataTypeSingleton): Review comment: There have been a lot of discussions about exposing the interval type in other language APIs, but I lost track of them. @yaooqinn and @cloud-fan, are we going to make interval a properly exposed type? Or only support it in some contexts?
[GitHub] [spark] viirya closed pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
viirya closed pull request #29916: URL: https://github.com/apache/spark/pull/29916
[GitHub] [spark] viirya commented on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
viirya commented on pull request #29916: URL: https://github.com/apache/spark/pull/29916#issuecomment-703038983 Thanks! Merging to master.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
AmplabJenkins removed a comment on pull request #26935: URL: https://github.com/apache/spark/pull/26935#issuecomment-703036917
[GitHub] [spark] AmplabJenkins commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
AmplabJenkins commented on pull request #26935: URL: https://github.com/apache/spark/pull/26935#issuecomment-703036917
[GitHub] [spark] SparkQA commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
SparkQA commented on pull request #26935: URL: https://github.com/apache/spark/pull/26935#issuecomment-703036911 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33981/
[GitHub] [spark] SparkQA commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
SparkQA commented on pull request #26935: URL: https://github.com/apache/spark/pull/26935#issuecomment-703035085 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33981/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier
AmplabJenkins removed a comment on pull request #29880: URL: https://github.com/apache/spark/pull/29880#issuecomment-703032475
[GitHub] [spark] AmplabJenkins commented on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier
AmplabJenkins commented on pull request #29880: URL: https://github.com/apache/spark/pull/29880#issuecomment-703032475
[GitHub] [spark] SparkQA removed a comment on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier
SparkQA removed a comment on pull request #29880: URL: https://github.com/apache/spark/pull/29880#issuecomment-702976531 **[Test build #129368 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129368/testReport)** for PR 29880 at commit [`11cfcd3`](https://github.com/apache/spark/commit/11cfcd30f5e38789698cbbfdd3e2a740685339f0).
[GitHub] [spark] SparkQA commented on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier
SparkQA commented on pull request #29880: URL: https://github.com/apache/spark/pull/29880#issuecomment-703032175 **[Test build #129368 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129368/testReport)** for PR 29880 at commit [`11cfcd3`](https://github.com/apache/spark/commit/11cfcd30f5e38789698cbbfdd3e2a740685339f0). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
AmplabJenkins removed a comment on pull request #29916: URL: https://github.com/apache/spark/pull/29916#issuecomment-703030743
[GitHub] [spark] AmplabJenkins commented on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
AmplabJenkins commented on pull request #29916: URL: https://github.com/apache/spark/pull/29916#issuecomment-703030743
[GitHub] [spark] SparkQA removed a comment on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
SparkQA removed a comment on pull request #29916: URL: https://github.com/apache/spark/pull/29916#issuecomment-703005189 **[Test build #129372 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129372/testReport)** for PR 29916 at commit [`1dd54e8`](https://github.com/apache/spark/commit/1dd54e846b52edaa10a8bddd229ff743f9a9b1da).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
AmplabJenkins removed a comment on pull request #29916: URL: https://github.com/apache/spark/pull/29916#issuecomment-703008513 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/33980/ Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
SparkQA commented on pull request #29916: URL: https://github.com/apache/spark/pull/29916#issuecomment-703030400 **[Test build #129372 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129372/testReport)** for PR 29916 at commit [`1dd54e8`](https://github.com/apache/spark/commit/1dd54e846b52edaa10a8bddd229ff743f9a9b1da). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
AmplabJenkins removed a comment on pull request #26935: URL: https://github.com/apache/spark/pull/26935#issuecomment-702872834 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
SparkQA commented on pull request #26935: URL: https://github.com/apache/spark/pull/26935#issuecomment-703027812 **[Test build #129373 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129373/testReport)** for PR 26935 at commit [`844422f`](https://github.com/apache/spark/commit/844422f38403f50b47d8eb10d4bb47c05c3f43d6).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()
AmplabJenkins removed a comment on pull request #29831: URL: https://github.com/apache/spark/pull/29831#issuecomment-703027624
[GitHub] [spark] AmplabJenkins commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()
AmplabJenkins commented on pull request #29831: URL: https://github.com/apache/spark/pull/29831#issuecomment-703027624
[GitHub] [spark] SparkQA removed a comment on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()
SparkQA removed a comment on pull request #29831: URL: https://github.com/apache/spark/pull/29831#issuecomment-702970066 **[Test build #129367 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129367/testReport)** for PR 29831 at commit [`15ec353`](https://github.com/apache/spark/commit/15ec3534e345631fd775d5679507e651291e0552).
[GitHub] [spark] SparkQA commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()
SparkQA commented on pull request #29831: URL: https://github.com/apache/spark/pull/29831#issuecomment-703027320 **[Test build #129367 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129367/testReport)** for PR 29831 at commit [`15ec353`](https://github.com/apache/spark/commit/15ec3534e345631fd775d5679507e651291e0552). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] HeartSaVioR commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore
HeartSaVioR commented on pull request #26935: URL: https://github.com/apache/spark/pull/26935#issuecomment-703026816 retest this, please
[GitHub] [spark] github-actions[bot] closed pull request #28780: [SPARK-31952][SQL]Fix incorrect memory spill metric when doing Aggregate
github-actions[bot] closed pull request #28780: URL: https://github.com/apache/spark/pull/28780
[GitHub] [spark] AmplabJenkins commented on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
AmplabJenkins commented on pull request #29855: URL: https://github.com/apache/spark/pull/29855#issuecomment-703013765
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
AmplabJenkins removed a comment on pull request #29855: URL: https://github.com/apache/spark/pull/29855#issuecomment-703013765
[GitHub] [spark] SparkQA commented on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
SparkQA commented on pull request #29855: URL: https://github.com/apache/spark/pull/29855#issuecomment-703013464 **[Test build #129370 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129370/testReport)** for PR 29855 at commit [`db36f3f`](https://github.com/apache/spark/commit/db36f3fcaab6793379f6fa99ee7d27f9b5abb90d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
SparkQA removed a comment on pull request #29855: URL: https://github.com/apache/spark/pull/29855#issuecomment-702978917 **[Test build #129370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129370/testReport)** for PR 29855 at commit [`db36f3f`](https://github.com/apache/spark/commit/db36f3fcaab6793379f6fa99ee7d27f9b5abb90d).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29885: [SPARK-33010][SQL]Make DataFrameWriter.jdbc work for DataSource V2
AmplabJenkins removed a comment on pull request #29885: URL: https://github.com/apache/spark/pull/29885#issuecomment-703011638
[GitHub] [spark] AmplabJenkins commented on pull request #29885: [SPARK-33010][SQL]Make DataFrameWriter.jdbc work for DataSource V2
AmplabJenkins commented on pull request #29885: URL: https://github.com/apache/spark/pull/29885#issuecomment-703011638
[GitHub] [spark] SparkQA removed a comment on pull request #29885: [SPARK-33010][SQL]Make DataFrameWriter.jdbc work for DataSource V2
SparkQA removed a comment on pull request #29885: URL: https://github.com/apache/spark/pull/29885#issuecomment-702933292 **[Test build #129365 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129365/testReport)** for PR 29885 at commit [`19441da`](https://github.com/apache/spark/commit/19441da91073a48aa07e5af6642cb1cea667861e).
[GitHub] [spark] SparkQA commented on pull request #29885: [SPARK-33010][SQL]Make DataFrameWriter.jdbc work for DataSource V2
SparkQA commented on pull request #29885: URL: https://github.com/apache/spark/pull/29885#issuecomment-703011273 **[Test build #129365 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129365/testReport)** for PR 29885 at commit [`19441da`](https://github.com/apache/spark/commit/19441da91073a48aa07e5af6642cb1cea667861e). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
AmplabJenkins removed a comment on pull request #29916: URL: https://github.com/apache/spark/pull/29916#issuecomment-703008508 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
AmplabJenkins commented on pull request #29916: URL: https://github.com/apache/spark/pull/29916#issuecomment-703008508
[GitHub] [spark] SparkQA commented on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
SparkQA commented on pull request #29916: URL: https://github.com/apache/spark/pull/29916#issuecomment-703005189 **[Test build #129372 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129372/testReport)** for PR 29916 at commit [`1dd54e8`](https://github.com/apache/spark/commit/1dd54e846b52edaa10a8bddd229ff743f9a9b1da).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
AmplabJenkins removed a comment on pull request #29916: URL: https://github.com/apache/spark/pull/29916#issuecomment-702872830 Merged build finished. Test FAILed.
[GitHub] [spark] viirya commented on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin
viirya commented on pull request #29916: URL: https://github.com/apache/spark/pull/29916#issuecomment-703003729 retest this please
[GitHub] [spark] AmplabJenkins commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2
AmplabJenkins commented on pull request #29936: URL: https://github.com/apache/spark/pull/29936#issuecomment-70249
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2
AmplabJenkins removed a comment on pull request #29936: URL: https://github.com/apache/spark/pull/29936#issuecomment-70249
[GitHub] [spark] SparkQA commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2
SparkQA commented on pull request #29936: URL: https://github.com/apache/spark/pull/29936#issuecomment-702999509 **[Test build #129366 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129366/testReport)** for PR 29936 at commit [`16b3452`](https://github.com/apache/spark/commit/16b3452d88824615a094671cb5aa9b0bdba9b498). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2
SparkQA removed a comment on pull request #29936: URL: https://github.com/apache/spark/pull/29936#issuecomment-702953727 **[Test build #129366 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129366/testReport)** for PR 29936 at commit [`16b3452`](https://github.com/apache/spark/commit/16b3452d88824615a094671cb5aa9b0bdba9b498).
[GitHub] [spark] xkrogen commented on a change in pull request #29906: [SPARK-32037][CORE] Rename blacklisting feature
xkrogen commented on a change in pull request #29906:
URL: https://github.com/apache/spark/pull/29906#discussion_r499077732

## File path: core/src/main/scala/org/apache/spark/internal/config/package.scala ##
@@ -722,74 +722,83 @@ package object config {
       .booleanConf
       .createWithDefault(true)

-  // Blacklist confs
-  private[spark] val BLACKLIST_ENABLED =
-    ConfigBuilder("spark.blacklist.enabled")
+  private[spark] val EXCLUDE_ON_FAILURE_ENABLED =
+    ConfigBuilder("spark.excludeOnFailure.enabled")
       .version("2.1.0")

Review comment:
   Do we need to update the "from" version strings here?

## File path: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ##
@@ -907,13 +908,13 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
   protected def currentDelegationTokens: Array[Byte] = delegationTokens.get()

   /**
-   * Checks whether the executor is blacklisted. This is called when the executor tries to
-   * register with the scheduler, and will deny registration if this method returns true.
+   * Checks whether the executor is excluded due to failure(s). This is called when the executor
+   *  tries to register with the scheduler, and will deny registration if this method returns true.

Review comment:
   minor nit: extra space at the start of the line

## File path: core/src/main/scala/org/apache/spark/status/api/v1/api.scala ##
@@ -82,10 +82,11 @@ class ExecutorStageSummary private[spark](
     val shuffleWriteRecords : Long,
     val memoryBytesSpilled : Long,
     val diskBytesSpilled : Long,
-    val isBlacklistedForStage: Boolean,
+    val isBlacklistedForStage: Boolean, // deprecated

Review comment:
   Can we `@deprecated` for this and others?

## File path: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ##
@@ -284,80 +284,138 @@ private[spark] class AppStatusListener(
   }

   override def onExecutorBlacklisted(event: SparkListenerExecutorBlacklisted): Unit = {
-    updateBlackListStatus(event.executorId, true)
+    updateExcludedStatus(event.executorId, true)
+  }
+
+  override def onExecutorExcluded(event: SparkListenerExecutorExcluded): Unit = {
+    updateExcludedStatus(event.executorId, true)
   }

   override def onExecutorBlacklistedForStage(
-      event: SparkListenerExecutorBlacklistedForStage): Unit = {
+    event: SparkListenerExecutorBlacklistedForStage): Unit = {
+    val now = System.nanoTime()
+
+    Option(liveStages.get((event.stageId, event.stageAttemptId))).foreach { stage =>
+      setStageExcludedStatus(stage, now, event.executorId)
+    }
+    liveExecutors.get(event.executorId).foreach { exec =>
+      addExcludedStageTo(exec, event.stageId, now)
+    }
+  }
+
+  override def onExecutorExcludedForStage(
+      event: SparkListenerExecutorExcludedForStage): Unit = {
     val now = System.nanoTime()

     Option(liveStages.get((event.stageId, event.stageAttemptId))).foreach { stage =>
-      setStageBlackListStatus(stage, now, event.executorId)
+      setStageExcludedStatus(stage, now, event.executorId)
     }
     liveExecutors.get(event.executorId).foreach { exec =>
-      addBlackListedStageTo(exec, event.stageId, now)
+      addExcludedStageTo(exec, event.stageId, now)
     }
   }

   override def onNodeBlacklistedForStage(event: SparkListenerNodeBlacklistedForStage): Unit = {
     val now = System.nanoTime()

-    // Implicitly blacklist every available executor for the stage associated with this node
+    // Implicitly exclude every available executor for the stage associated with this node
     Option(liveStages.get((event.stageId, event.stageAttemptId))).foreach { stage =>
       val executorIds = liveExecutors.values.filter(_.host == event.hostId).map(_.executorId).toSeq
-      setStageBlackListStatus(stage, now, executorIds: _*)
+      setStageExcludedStatus(stage, now, executorIds: _*)
     }
     liveExecutors.values.filter(_.hostname == event.hostId).foreach { exec =>
-      addBlackListedStageTo(exec, event.stageId, now)
+      addExcludedStageTo(exec, event.stageId, now)
+    }
+  }
+
+  override def onNodeExcludedForStage(event: SparkListenerNodeExcludedForStage): Unit = {
+    val now = System.nanoTime()
+
+    // Implicitly exclude every available executor for the stage associated with this node
+    Option(liveStages.get((event.stageId, event.stageAttemptId))).foreach { stage =>
+      val executorIds = liveExecutors.values.filter(_.host == event.hostId).map(_.executorId).toSeq
+      setStageExcludedStatus(stage, now, executorIds: _*)
+    }
+    liveExecutors.values.filter(_.hostname == event.hostId).foreach { exec =>
+      addExcludedStageTo(exec, event.stageId, now)
     }
   }

   private def addBlackListedStageTo(exec: LiveExecutor, stageId: Int, now: Long): Unit = {
-    exec.blacklistedInStages += stageId
+
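The AppStatusListener diff above keeps each old `...Blacklisted...` callback and adds an `...Excluded...` counterpart, with both calling the same renamed helper. A minimal sketch of that pattern follows, written in Java only to keep it self-contained (the PR itself is Scala). The class name, the `String` executor ids, and the `isExcluded` accessor are hypothetical and are not Spark's listener API; marking the old callback `@Deprecated` here is an illustrative choice, not something the PR is confirmed to do.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical names throughout; this is not Spark's listener API.
public class ExclusionListenerSketch {
  private final Set<String> excludedExecutors = new HashSet<>();

  /** Old callback kept for compatibility; forwards to the shared helper. */
  @Deprecated
  public void onExecutorBlacklisted(String executorId) {
    updateExcludedStatus(executorId, true);
  }

  /** New callback under the renamed terminology; same helper, same effect. */
  public void onExecutorExcluded(String executorId) {
    updateExcludedStatus(executorId, true);
  }

  // Single place where the status is recorded, so both callbacks stay in sync.
  private void updateExcludedStatus(String executorId, boolean excluded) {
    if (excluded) {
      excludedExecutors.add(executorId);
    } else {
      excludedExecutors.remove(executorId);
    }
  }

  public boolean isExcluded(String executorId) {
    return excludedExecutors.contains(executorId);
  }

  public static void main(String[] args) {
    ExclusionListenerSketch listener = new ExclusionListenerSketch();
    listener.onExecutorBlacklisted("exec-1"); // legacy event
    listener.onExecutorExcluded("exec-2");    // renamed event
    System.out.println(listener.isExcluded("exec-1") && listener.isExcluded("exec-2"));
  }
}
```

Routing both callbacks through one helper means a consumer that still emits the legacy event ends up in the same state as one that emits the renamed event.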
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
AmplabJenkins removed a comment on pull request #29855: URL: https://github.com/apache/spark/pull/29855#issuecomment-702995892
[GitHub] [spark] AmplabJenkins commented on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
AmplabJenkins commented on pull request #29855: URL: https://github.com/apache/spark/pull/29855#issuecomment-702995892
[GitHub] [spark] SparkQA commented on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
SparkQA commented on pull request #29855: URL: https://github.com/apache/spark/pull/29855#issuecomment-702995878 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33979/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier
AmplabJenkins removed a comment on pull request #29880: URL: https://github.com/apache/spark/pull/29880#issuecomment-702994100
[GitHub] [spark] SparkQA commented on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier
SparkQA commented on pull request #29880: URL: https://github.com/apache/spark/pull/29880#issuecomment-702994093 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33978/
[GitHub] [spark] AmplabJenkins commented on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier
AmplabJenkins commented on pull request #29880: URL: https://github.com/apache/spark/pull/29880#issuecomment-702994100
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29919: [SPARK-33042][SQL][TEST] Add a test case to ensure changes to spark.sql.optimizer.maxIterations take effect at runtime
AmplabJenkins removed a comment on pull request #29919: URL: https://github.com/apache/spark/pull/29919#issuecomment-702992524
[GitHub] [spark] AmplabJenkins commented on pull request #29919: [SPARK-33042][SQL][TEST] Add a test case to ensure changes to spark.sql.optimizer.maxIterations take effect at runtime
AmplabJenkins commented on pull request #29919: URL: https://github.com/apache/spark/pull/29919#issuecomment-702992524
[GitHub] [spark] SparkQA removed a comment on pull request #29919: [SPARK-33042][SQL][TEST] Add a test case to ensure changes to spark.sql.optimizer.maxIterations take effect at runtime
SparkQA removed a comment on pull request #29919: URL: https://github.com/apache/spark/pull/29919#issuecomment-702894900 **[Test build #129360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129360/testReport)** for PR 29919 at commit [`188d667`](https://github.com/apache/spark/commit/188d6671a4bac3b4422824f578606c52a5d527f1).
[GitHub] [spark] SparkQA commented on pull request #29919: [SPARK-33042][SQL][TEST] Add a test case to ensure changes to spark.sql.optimizer.maxIterations take effect at runtime
SparkQA commented on pull request #29919: URL: https://github.com/apache/spark/pull/29919#issuecomment-702992003 **[Test build #129360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129360/testReport)** for PR 29919 at commit [`188d667`](https://github.com/apache/spark/commit/188d6671a4bac3b4422824f578606c52a5d527f1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
SparkQA commented on pull request #29855: URL: https://github.com/apache/spark/pull/29855#issuecomment-702991343 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33979/
[GitHub] [spark] SparkQA commented on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier
SparkQA commented on pull request #29880: URL: https://github.com/apache/spark/pull/29880#issuecomment-702987903 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33978/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()
AmplabJenkins removed a comment on pull request #29831: URL: https://github.com/apache/spark/pull/29831#issuecomment-702986590
[GitHub] [spark] SparkQA commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()
SparkQA commented on pull request #29831: URL: https://github.com/apache/spark/pull/29831#issuecomment-702986570 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33977/
[GitHub] [spark] AmplabJenkins commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()
AmplabJenkins commented on pull request #29831: URL: https://github.com/apache/spark/pull/29831#issuecomment-702986590
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
AmplabJenkins removed a comment on pull request #29874: URL: https://github.com/apache/spark/pull/29874#issuecomment-702985437 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129371/ Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
SparkQA commented on pull request #29874: URL: https://github.com/apache/spark/pull/29874#issuecomment-702985425 **[Test build #129371 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129371/testReport)** for PR 29874 at commit [`12a06c0`](https://github.com/apache/spark/commit/12a06c042011ef8302ab2b61c935714c58e8453f). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
AmplabJenkins removed a comment on pull request #29874: URL: https://github.com/apache/spark/pull/29874#issuecomment-702985432 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
AmplabJenkins commented on pull request #29874: URL: https://github.com/apache/spark/pull/29874#issuecomment-702985432
[GitHub] [spark] SparkQA removed a comment on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
SparkQA removed a comment on pull request #29874: URL: https://github.com/apache/spark/pull/29874#issuecomment-702984982 **[Test build #129371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129371/testReport)** for PR 29874 at commit [`12a06c0`](https://github.com/apache/spark/commit/12a06c042011ef8302ab2b61c935714c58e8453f).
[GitHub] [spark] SparkQA commented on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
SparkQA commented on pull request #29874: URL: https://github.com/apache/spark/pull/29874#issuecomment-702984982 **[Test build #129371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129371/testReport)** for PR 29874 at commit [`12a06c0`](https://github.com/apache/spark/commit/12a06c042011ef8302ab2b61c935714c58e8453f).
[GitHub] [spark] Victsm commented on a change in pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
Victsm commented on a change in pull request #29855:
URL: https://github.com/apache/spark/pull/29855#discussion_r499071938

## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/BlockPushException.java ##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.network.shuffle;
+
+import java.nio.ByteBuffer;
+import java.nio.charset.StandardCharsets;
+
+import org.apache.spark.network.shuffle.protocol.BlockTransferMessage;
+import org.apache.spark.network.shuffle.protocol.PushBlockStream;
+
+/**
+ * A special exception type that would decode the encoded {@link PushBlockStream} from the
+ * exception String. This complements the encoding logic in
+ * {@link org.apache.spark.network.server.TransportRequestHandler}.
+ */
+public class BlockPushException extends RuntimeException {
+  private PushBlockStream header;
+
+  /**
+   * String constant used for generating exception messages indicating a block to be merged
+   * arrives too late on the server side, and also for later checking such exceptions on the
+   * client side. When we get a block push failure because of the block arrives too late, we
+   * will not retry pushing the block nor log the exception on the client side.
+   */
+  public static final String TOO_LATE_MESSAGE_SUFFIX =
+    "received after merged shuffle is finalized";
+
+  /**
+   * String constant used for generating exception messages indicating the server couldn't
+   * append a block after all available attempts due to collision with other blocks belonging
+   * to the same shuffle partition, and also for later checking such exceptions on the client
+   * side. When we get a block push failure because of the block couldn't be written due to
+   * this reason, we will not log the exception on the client side.
+   */
+  public static final String COULD_NOT_FIND_OPPORTUNITY_MSG_PREFIX =
+    "Couldn't find an opportunity to write block";
+
+  private BlockPushException(PushBlockStream header, String message) {
+    super(message);
+    this.header = header;
+  }
+
+  public static BlockPushException decodeException(String message) {
+    // Use ISO_8859_1 encoding instead of UTF_8. UTF_8 will change the byte content
+    // for bytes larger than 127. This would render incorrect result when encoding
+    // decoding the index inside the PushBlockStream message.
+    ByteBuffer rawBuffer = ByteBuffer.wrap(message.getBytes(StandardCharsets.ISO_8859_1));
+    try {
+      BlockTransferMessage msgObj = BlockTransferMessage.Decoder.fromByteBuffer(rawBuffer);
+      if (msgObj instanceof PushBlockStream) {
+        PushBlockStream header = (PushBlockStream) msgObj;
+        // When decoding the header, the rawBuffer's position is not updated since it was
+        // consumed via netty's ByteBuf. Updating the rawBuffer's position here to retrieve
+        // the remaining exception message.
+        ByteBuffer remainingBuffer = (ByteBuffer) rawBuffer.position(rawBuffer.position()
+          + header.encodedLength() + 1);
+        return new BlockPushException(header,
+          StandardCharsets.UTF_8.decode(remainingBuffer).toString());
+      } else {
+        throw new UnsupportedOperationException(String.format("Cannot decode the header. "
+          + "Expected PushBlockStream but got %s instead", msgObj.getClass().getSimpleName()));
+      }
+    } catch (Exception e) {
+      return new BlockPushException(null, message);

Review comment:
   Before fixing this, want to first settle the discussion on `TransportRequestHandler` regarding your suggestion to keep `PushBlockStream` as a metadata tracked on the client side. I updated that thread with some of my previous thoughts.
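The code comment in `decodeException` above says ISO_8859_1 is used because UTF_8 alters bytes larger than 127. A minimal standalone Java sketch of that effect follows; the class name, method name, and sample bytes are hypothetical and not part of the PR. It decodes raw bytes into a `String` and encodes them back under each charset, then compares the result with the input.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; not part of the PR under review.
public class Latin1RoundTrip {

  // Decode raw bytes into a String with the given charset, then encode back.
  public static byte[] roundTrip(byte[] raw, Charset charset) {
    return new String(raw, charset).getBytes(charset);
  }

  public static void main(String[] args) {
    // Sample bytes including values above 127, as a binary-encoded header might contain.
    byte[] raw = {0x01, (byte) 0x80, (byte) 0xC3, (byte) 0xFF};

    // ISO_8859_1 maps each byte to exactly one char, so the round trip is lossless.
    boolean latin1Ok = Arrays.equals(raw, roundTrip(raw, StandardCharsets.ISO_8859_1));
    // In UTF_8 these high bytes form malformed sequences and are replaced while decoding.
    boolean utf8Ok = Arrays.equals(raw, roundTrip(raw, StandardCharsets.UTF_8));

    System.out.println("ISO_8859_1 preserves bytes: " + latin1Ok); // true
    System.out.println("UTF_8 preserves bytes: " + utf8Ok);        // false
  }
}
```

This only demonstrates the charset behaviour the comment relies on; it does not model the `PushBlockStream` encoding itself.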
[GitHub] [spark] SparkQA commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()
SparkQA commented on pull request #29831: URL: https://github.com/apache/spark/pull/29831#issuecomment-702980304 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33977/
[GitHub] [spark] SparkQA commented on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
SparkQA commented on pull request #29855: URL: https://github.com/apache/spark/pull/29855#issuecomment-702978917 **[Test build #129370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129370/testReport)** for PR 29855 at commit [`db36f3f`](https://github.com/apache/spark/commit/db36f3fcaab6793379f6fa99ee7d27f9b5abb90d).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
AmplabJenkins removed a comment on pull request #29874: URL: https://github.com/apache/spark/pull/29874#issuecomment-702977132 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129369/ Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
AmplabJenkins removed a comment on pull request #29874: URL: https://github.com/apache/spark/pull/29874#issuecomment-702977127 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
SparkQA removed a comment on pull request #29874: URL: https://github.com/apache/spark/pull/29874#issuecomment-702976554 **[Test build #129369 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129369/testReport)** for PR 29874 at commit [`1b0ba28`](https://github.com/apache/spark/commit/1b0ba28af9e3c7b80ebc095bea8b78b70c5b5c4a). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
AmplabJenkins commented on pull request #29874: URL: https://github.com/apache/spark/pull/29874#issuecomment-702977127 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
SparkQA commented on pull request #29874: URL: https://github.com/apache/spark/pull/29874#issuecomment-702977113 **[Test build #129369 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129369/testReport)** for PR 29874 at commit [`1b0ba28`](https://github.com/apache/spark/commit/1b0ba28af9e3c7b80ebc095bea8b78b70c5b5c4a). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Victsm commented on a change in pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
Victsm commented on a change in pull request #29855: URL: https://github.com/apache/spark/pull/29855#discussion_r498965319 ## File path: common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java ## @@ -181,6 +182,17 @@ public void onFailure(Throwable e) { private void processStreamUpload(final UploadStream req) { assert (req.body() == null); try { + // Retain the original metadata buffer, since it will be used during the invocation of + // this method. Will be released later. + req.meta.retain(); + // Make a copy of the original metadata buffer. In benchmark, we noticed that + // we cannot respond the original metadata buffer back to the client, otherwise + // in cases where multiple concurrent shuffles are present, a wrong metadata might + // be sent back to client. This is related to the eager release of the metadata buffer, + // i.e., we always release the original buffer by the time the invocation of this + // method ends, instead of by the time we respond it to the client. This is necessary, + // otherwise we start seeing memory issues very quickly in benchmarks. + ByteBuffer meta = cloneBuffer(req.meta.nioByteBuffer()); Review comment: For the `req.meta` issue, my understanding is the following: `processStreamUpload` is only responsible for creating a `StreamCallbackWithID` to be added into the FrameDecoder as a stream interceptor. The Netty ByteBuf `req.meta` will be released by the time this method exits. However, the stream callback would need to respond with `req.meta` after this method exits. Accessing the value of the Netty ByteBuf after it's released is what's causing the issue mentioned in the comment. I tried to delay the release of `req.meta` until the stream callback finishes processing the stream; however, that can lead to memory issues on the shuffle service side when there are many blocks to be transferred. This is an automated message from the Apache Git Service. 
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
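The copy-before-release idea described in the comment above can be illustrated with plain `java.nio`. This is only a minimal sketch: `cloneBuffer` here is a hypothetical stand-in for the helper named in the diff (its real implementation is not shown in this thread), and overwriting a backing array merely simulates pooled memory being reused after a release. No Netty classes are involved.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MetaBufferCopy {
  /** Hypothetical stand-in for the cloneBuffer helper referenced in the diff. */
  public static ByteBuffer cloneBuffer(ByteBuffer source) {
    ByteBuffer copy = ByteBuffer.allocate(source.remaining());
    // duplicate() shares the bytes but has its own position,
    // so reading through it leaves the source's position untouched.
    copy.put(source.duplicate());
    copy.flip();
    return copy;
  }

  public static void main(String[] args) {
    byte[] backing = "block-1".getBytes(StandardCharsets.UTF_8);
    ByteBuffer original = ByteBuffer.wrap(backing);
    ByteBuffer copy = cloneBuffer(original);

    // Simulate the released memory being recycled for another request.
    Arrays.fill(backing, (byte) 'x');

    // The copy still holds the metadata captured before the "release".
    System.out.println(StandardCharsets.UTF_8.decode(copy.duplicate()));     // block-1
    // A late read through the original sees the recycled bytes instead.
    System.out.println(StandardCharsets.UTF_8.decode(original.duplicate())); // xxxxxxx
  }
}
```

The trade-off is the one raised in the comment: the copy costs an extra allocation per request, in exchange for letting the original buffer be released as soon as the method returns.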
[GitHub] [spark] SparkQA commented on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier
SparkQA commented on pull request #29880: URL: https://github.com/apache/spark/pull/29880#issuecomment-702976531 **[Test build #129368 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129368/testReport)** for PR 29880 at commit [`11cfcd3`](https://github.com/apache/spark/commit/11cfcd30f5e38789698cbbfdd3e2a740685339f0). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
SparkQA commented on pull request #29874: URL: https://github.com/apache/spark/pull/29874#issuecomment-702976554 **[Test build #129369 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129369/testReport)** for PR 29874 at commit [`1b0ba28`](https://github.com/apache/spark/commit/1b0ba28af9e3c7b80ebc095bea8b78b70c5b5c4a). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2
AmplabJenkins removed a comment on pull request #29936: URL: https://github.com/apache/spark/pull/29936#issuecomment-702976109 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2
AmplabJenkins commented on pull request #29936: URL: https://github.com/apache/spark/pull/29936#issuecomment-702976109 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2
SparkQA commented on pull request #29936: URL: https://github.com/apache/spark/pull/29936#issuecomment-702976090 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33976/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] Victsm commented on a change in pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks
Victsm commented on a change in pull request #29855: URL: https://github.com/apache/spark/pull/29855#discussion_r498985394 ## File path: common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java ## @@ -181,6 +182,17 @@ public void onFailure(Throwable e) { private void processStreamUpload(final UploadStream req) { assert (req.body() == null); try { + // Retain the original metadata buffer, since it will be used during the invocation of + // this method. Will be released later. + req.meta.retain(); + // Make a copy of the original metadata buffer. In benchmark, we noticed that + // we cannot respond the original metadata buffer back to the client, otherwise + // in cases where multiple concurrent shuffles are present, a wrong metadata might + // be sent back to client. This is related to the eager release of the metadata buffer, + // i.e., we always release the original buffer by the time the invocation of this + // method ends, instead of by the time we respond it to the client. This is necessary, + // otherwise we start seeing memory issues very quickly in benchmarks. + ByteBuffer meta = cloneBuffer(req.meta.nioByteBuffer()); Review comment: I still do not want to change the `TransportClient#uploadStream` API itself. This transport layer utility was previously used for transferring large RDD partition blocks, and now reused for doing shuffle block push. In the future, it is possible that other use cases might benefit from this utility as well. I believe keeping this API generic and not specific to one use case is important. For the change you proposed to keep the `PushBlockStream` as metadata tracked on the client side, I also thought about doing that during implementation. It's cleaner than the current approach. One way to do this without incurring any potential protocol change would be to make `BlockPushCallback` inside `OneForOneBlockPusher` stateful. 
Currently, that callback is stateless, so multiple invocations of `TransportClient#uploadStream` for the same batch of blocks would reuse the same callback object. If we make that callback object stateful, to keep track of the additional metadata ManagedBuffer, then the callback object would have what we need built into it during object creation. My concern during the implementation was the potential JVM pressure this approach might generate, since we will create one callback object per block to be pushed. What do you think? Also CC @mridulm @tgravescs @squito @Ngone51 @jiangxb1987 for your input on this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
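A rough sketch of the stateful-callback alternative discussed above may help. Everything in it is hypothetical: `ResponseCallback` and `PerBlockCallback` are simplified stand-ins rather than the actual `BlockPushCallback` or Spark RPC interfaces, and the "push" is simulated locally. The point is only that a per-block callback already knows which block it belongs to, so the server would not need to echo the metadata back.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PushCallbackSketch {
  /** Simplified, hypothetical stand-in for an RPC response callback. */
  interface ResponseCallback {
    void onSuccess();
    void onFailure(Throwable cause);
  }

  /** Stateful variant: one instance per pushed block, carrying that block's id. */
  static final class PerBlockCallback implements ResponseCallback {
    private final String blockId;
    private final List<String> events;

    PerBlockCallback(String blockId, List<String> events) {
      this.blockId = blockId;
      this.events = events;
    }

    @Override
    public void onSuccess() {
      events.add("pushed " + blockId);
    }

    @Override
    public void onFailure(Throwable cause) {
      events.add("failed " + blockId + ": " + cause.getMessage());
    }
  }

  /** Simulates pushing a batch; ids starting with "bad" fail (an arbitrary rule for the demo). */
  public static List<String> pushAll(List<String> blockIds) {
    List<String> events = new ArrayList<>();
    for (String blockId : blockIds) {
      // One callback object per block: this is the allocation cost raised in the comment.
      ResponseCallback callback = new PerBlockCallback(blockId, events);
      if (blockId.startsWith("bad")) {
        callback.onFailure(new RuntimeException("simulated push failure"));
      } else {
        callback.onSuccess();
      }
    }
    return events;
  }

  public static void main(String[] args) {
    for (String event : pushAll(Arrays.asList("shuffle_0_1_0", "bad_block"))) {
      System.out.println(event);
    }
  }
}
```

Whether one small object per block creates meaningful JVM pressure would need measuring; the sketch does not attempt to answer that.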
[GitHub] [spark] xkrogen commented on a change in pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
xkrogen commented on a change in pull request #29874: URL: https://github.com/apache/spark/pull/29874#discussion_r499067150 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala ## @@ -61,7 +61,8 @@ private[hive] object IsolatedClientLoader extends Logging { val files = if (resolvedVersions.contains((resolvedVersion, hadoopVersion))) { resolvedVersions((resolvedVersion, hadoopVersion)) } else { - val remoteRepos = sparkConf.get(SQLConf.ADDITIONAL_REMOTE_REPOSITORIES) + val remoteRepos = sys.env.getOrElse( +"DEFAULT_ARTIFACT_REPOSITORY", sparkConf.get(SQLConf.ADDITIONAL_REMOTE_REPOSITORIES)) Review comment: IMO if you want to fully change the repository, you should be configuring both `DEFAULT_ARTIFACT_REPOSITORY` and `spark.sql.maven.additionalRemoteRepositories`. I think having a config whose default value changes based on an environment variable is confusing behavior. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] ankits commented on a change in pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…
ankits commented on a change in pull request #29874: URL: https://github.com/apache/spark/pull/29874#discussion_r499065894 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala ## @@ -61,7 +61,8 @@ private[hive] object IsolatedClientLoader extends Logging { val files = if (resolvedVersions.contains((resolvedVersion, hadoopVersion))) { resolvedVersions((resolvedVersion, hadoopVersion)) } else { - val remoteRepos = sparkConf.get(SQLConf.ADDITIONAL_REMOTE_REPOSITORIES) + val remoteRepos = sys.env.getOrElse( +"DEFAULT_ARTIFACT_REPOSITORY", sparkConf.get(SQLConf.ADDITIONAL_REMOTE_REPOSITORIES)) Review comment: @xkrogen During my testing of your suggested changes, the test still tries to download the artifact from `SQLConf.ADDITIONAL_REMOTE_REPOSITORIES` which points to `https://maven-central.storage-download.googleapis.com/maven2/`. I still need the change in `SQLConf.scala` to overwrite the maven repo. ``` val ADDITIONAL_REMOTE_REPOSITORIES = buildConf("spark.sql.maven.additionalRemoteRepositories") .doc("A comma-delimited string config of the optional additional remote Maven mirror " + "repositories. This is only used for downloading Hive jars in IsolatedClientLoader " + "if the default Maven Central repo is unreachable.") .version("3.0.0") .stringConf .createWithDefault( sys.env.getOrElse( "DEFAULT_ARTIFACT_REPOSITORY", "https://maven-central.storage-download.googleapis.com/maven2/")) ``` Let me know your thoughts on this. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
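The two snippets in this comment apply the environment variable at different points, which changes the precedence. The sketch below spells that out with plain maps and `Optional` instead of `SparkConf`/`SQLConf`; the method names `envFirst` and `envAsDefault` and the example URLs are made up for illustration and do not exist in Spark.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Optional;

public class RepoResolutionSketch {
  static final String ENV_KEY = "DEFAULT_ARTIFACT_REPOSITORY";
  static final String BUILT_IN =
      "https://maven-central.storage-download.googleapis.com/maven2/";

  /** Order in the IsolatedClientLoader diff: the environment variable wins over the config. */
  public static String envFirst(Map<String, String> env, Optional<String> configured) {
    String fromConf = configured.orElse(BUILT_IN);
    return env.getOrDefault(ENV_KEY, fromConf);
  }

  /** Order in the SQLConf snippet: the environment variable only replaces the default. */
  public static String envAsDefault(Map<String, String> env, Optional<String> configured) {
    return configured.orElse(env.getOrDefault(ENV_KEY, BUILT_IN));
  }

  public static void main(String[] args) {
    // Hypothetical values, standing in for the process environment and a user-set config.
    Map<String, String> env =
        Collections.singletonMap(ENV_KEY, "https://repo.example.internal/maven2/");
    Optional<String> userSet = Optional.of("https://conf.example/maven2/");

    System.out.println(envFirst(env, userSet));     // environment variable wins
    System.out.println(envAsDefault(env, userSet)); // explicit config wins
  }
}
```

With the second order, a user who never sets the config gets a default that silently depends on the environment, which is the behavior xkrogen describes as confusing.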
[GitHub] [spark] imback82 commented on a change in pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier
imback82 commented on a change in pull request #29880: URL: https://github.com/apache/spark/pull/29880#discussion_r499065054 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala ## @@ -315,6 +315,17 @@ case class DescribeRelation( override def output: Seq[Attribute] = DescribeTableSchema.describeTableAttributes() } +/** + * The logical plan of the DESCRIBE relation_name col_name command that works for v2 tables. + */ +case class DescribeColumn( +relation: LogicalPlan, +colNameParts: Seq[String], Review comment: > A simple idea is to put an `UnresolvedAttribute` here, and analyzer can do the work for us. Since we need to have the relation resolved first, we need to match like the following in the analyzer: ```scala case DescribeColumn(r: ResolvedTable, u: UnresolvedAttribute, _) => ... ``` Is that what you had in mind? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()
AmplabJenkins removed a comment on pull request #29831: URL: https://github.com/apache/spark/pull/29831#issuecomment-702872831 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()
SparkQA commented on pull request #29831: URL: https://github.com/apache/spark/pull/29831#issuecomment-702970066 **[Test build #129367 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129367/testReport)** for PR 29831 at commit [`15ec353`](https://github.com/apache/spark/commit/15ec3534e345631fd775d5679507e651291e0552). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2
AmplabJenkins removed a comment on pull request #29936: URL: https://github.com/apache/spark/pull/29936#issuecomment-702968945 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2
SparkQA commented on pull request #29936: URL: https://github.com/apache/spark/pull/29936#issuecomment-702969149 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33976/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2
AmplabJenkins commented on pull request #29936: URL: https://github.com/apache/spark/pull/29936#issuecomment-702968945 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA removed a comment on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2
SparkQA removed a comment on pull request #29936: URL: https://github.com/apache/spark/pull/29936#issuecomment-702908152 **[Test build #129362 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129362/testReport)** for PR 29936 at commit [`032499e`](https://github.com/apache/spark/commit/032499ea09191cf86aa5eb4f06ca559c5e30d0c2). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2
SparkQA commented on pull request #29936: URL: https://github.com/apache/spark/pull/29936#issuecomment-702968296 **[Test build #129362 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129362/testReport)** for PR 29936 at commit [`032499e`](https://github.com/apache/spark/commit/032499ea09191cf86aa5eb4f06ca559c5e30d0c2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] CodingCat commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()
CodingCat commented on pull request #29831: URL: https://github.com/apache/spark/pull/29831#issuecomment-702967806 Jenkins, retest this, please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on pull request #29885: [SPARK-33010][SQL]Make DataFrameWriter.jdbc work for DataSource V2
AmplabJenkins removed a comment on pull request #29885: URL: https://github.com/apache/spark/pull/29885#issuecomment-702957113 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #29885: [SPARK-33010][SQL]Make DataFrameWriter.jdbc work for DataSource V2
SparkQA commented on pull request #29885: URL: https://github.com/apache/spark/pull/29885#issuecomment-702957091 Kubernetes integration test status success URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33975/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org