[GitHub] [spark] AmplabJenkins commented on pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29907:
URL: https://github.com/apache/spark/pull/29907#issuecomment-703051380


   Can one of the admins verify this patch?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29907:
URL: https://github.com/apache/spark/pull/29907#issuecomment-703051265


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29907:
URL: https://github.com/apache/spark/pull/29907#issuecomment-701138683


   Can one of the admins verify this patch?






[GitHub] [spark] AmplabJenkins commented on pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29907:
URL: https://github.com/apache/spark/pull/29907#issuecomment-703051265


   Can one of the admins verify this patch?






[GitHub] [spark] duanmeng commented on pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…

2020-10-02 Thread GitBox


duanmeng commented on pull request #29907:
URL: https://github.com/apache/spark/pull/29907#issuecomment-703051053


   > > Hi @duanmeng Could you elaborate more on how to reproduce the issue?
   
   Sorry, I accidentally closed it; I will reopen it.
   
   This issue is hard to reproduce because it is most likely a bug in the 
cluster's disk / kernel, which leaves the shuffle data file empty after the 
records are committed. Since we only defend against it in Spark, should we 
change it from **bug** to **improvement**?






[GitHub] [spark] duanmeng commented on pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…

2020-10-02 Thread GitBox


duanmeng commented on pull request #29907:
URL: https://github.com/apache/spark/pull/29907#issuecomment-703050101


   > Hi @duanmeng Could you elaborate more on how to reproduce the issue?
   
   






[GitHub] [spark] duanmeng closed pull request #29907: [SPARK-33022] partition length is wrong after merge partition segment…

2020-10-02 Thread GitBox


duanmeng closed pull request #29907:
URL: https://github.com/apache/spark/pull/29907


   






[GitHub] [spark] HyukjinKwon commented on pull request #29934: Make sure the pod template configmap has a unique name

2020-10-02 Thread GitBox


HyukjinKwon commented on pull request #29934:
URL: https://github.com/apache/spark/pull/29934#issuecomment-703042759


   Can you file a JIRA ticket and reference it in the PR title? See also 
https://spark.apache.org/contributing.html






[GitHub] [spark] HyukjinKwon commented on a change in pull request #29935: [SPARK-33055][PYTHON][SQL] Add Python CalendarIntervalType

2020-10-02 Thread GitBox


HyukjinKwon commented on a change in pull request #29935:
URL: https://github.com/apache/spark/pull/29935#discussion_r499113675



##
File path: python/pyspark/sql/types.py
##
@@ -186,6 +186,30 @@ def fromInternal(self, ts):
             return datetime.datetime.fromtimestamp(ts // 1000000).replace(microsecond=ts % 1000000)
 
 
+class CalendarIntervalType(DataType, metaclass=DataTypeSingleton):

Review comment:
   There has been a lot of discussion about exposing the interval type in 
other language APIs, but I lost track of it. @yaooqinn and @cloud-fan, are we 
going to make interval a properly exposed type? Or only support it in some 
contexts?








[GitHub] [spark] viirya closed pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin

2020-10-02 Thread GitBox


viirya closed pull request #29916:
URL: https://github.com/apache/spark/pull/29916


   






[GitHub] [spark] viirya commented on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin

2020-10-02 Thread GitBox


viirya commented on pull request #29916:
URL: https://github.com/apache/spark/pull/29916#issuecomment-703038983


   Thanks! Merging to master.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-703036917










[GitHub] [spark] AmplabJenkins commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-703036917










[GitHub] [spark] SparkQA commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-10-02 Thread GitBox


SparkQA commented on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-703036911


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33981/
   






[GitHub] [spark] SparkQA commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-10-02 Thread GitBox


SparkQA commented on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-703035085


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33981/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29880:
URL: https://github.com/apache/spark/pull/29880#issuecomment-703032475










[GitHub] [spark] AmplabJenkins commented on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29880:
URL: https://github.com/apache/spark/pull/29880#issuecomment-703032475










[GitHub] [spark] SparkQA removed a comment on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier

2020-10-02 Thread GitBox


SparkQA removed a comment on pull request #29880:
URL: https://github.com/apache/spark/pull/29880#issuecomment-702976531


   **[Test build #129368 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129368/testReport)**
 for PR 29880 at commit 
[`11cfcd3`](https://github.com/apache/spark/commit/11cfcd30f5e38789698cbbfdd3e2a740685339f0).






[GitHub] [spark] SparkQA commented on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier

2020-10-02 Thread GitBox


SparkQA commented on pull request #29880:
URL: https://github.com/apache/spark/pull/29880#issuecomment-703032175


   **[Test build #129368 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129368/testReport)**
 for PR 29880 at commit 
[`11cfcd3`](https://github.com/apache/spark/commit/11cfcd30f5e38789698cbbfdd3e2a740685339f0).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29916:
URL: https://github.com/apache/spark/pull/29916#issuecomment-703030743










[GitHub] [spark] AmplabJenkins commented on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29916:
URL: https://github.com/apache/spark/pull/29916#issuecomment-703030743










[GitHub] [spark] SparkQA removed a comment on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin

2020-10-02 Thread GitBox


SparkQA removed a comment on pull request #29916:
URL: https://github.com/apache/spark/pull/29916#issuecomment-703005189


   **[Test build #129372 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129372/testReport)**
 for PR 29916 at commit 
[`1dd54e8`](https://github.com/apache/spark/commit/1dd54e846b52edaa10a8bddd229ff743f9a9b1da).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29916:
URL: https://github.com/apache/spark/pull/29916#issuecomment-703008513


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/33980/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin

2020-10-02 Thread GitBox


SparkQA commented on pull request #29916:
URL: https://github.com/apache/spark/pull/29916#issuecomment-703030400


   **[Test build #129372 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129372/testReport)**
 for PR 29916 at commit 
[`1dd54e8`](https://github.com/apache/spark/commit/1dd54e846b52edaa10a8bddd229ff743f9a9b1da).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-702872834


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-10-02 Thread GitBox


SparkQA commented on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-703027812


   **[Test build #129373 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129373/testReport)**
 for PR 26935 at commit 
[`844422f`](https://github.com/apache/spark/commit/844422f38403f50b47d8eb10d4bb47c05c3f43d6).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29831:
URL: https://github.com/apache/spark/pull/29831#issuecomment-703027624










[GitHub] [spark] AmplabJenkins commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29831:
URL: https://github.com/apache/spark/pull/29831#issuecomment-703027624










[GitHub] [spark] SparkQA removed a comment on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()

2020-10-02 Thread GitBox


SparkQA removed a comment on pull request #29831:
URL: https://github.com/apache/spark/pull/29831#issuecomment-702970066


   **[Test build #129367 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129367/testReport)**
 for PR 29831 at commit 
[`15ec353`](https://github.com/apache/spark/commit/15ec3534e345631fd775d5679507e651291e0552).






[GitHub] [spark] SparkQA commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()

2020-10-02 Thread GitBox


SparkQA commented on pull request #29831:
URL: https://github.com/apache/spark/pull/29831#issuecomment-703027320


   **[Test build #129367 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129367/testReport)**
 for PR 29831 at commit 
[`15ec353`](https://github.com/apache/spark/commit/15ec3534e345631fd775d5679507e651291e0552).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] HeartSaVioR commented on pull request #26935: [SPARK-30294][SS] Explicitly defines read-only StateStore and optimize for HDFSBackedStateStore

2020-10-02 Thread GitBox


HeartSaVioR commented on pull request #26935:
URL: https://github.com/apache/spark/pull/26935#issuecomment-703026816


   retest this, please






[GitHub] [spark] github-actions[bot] closed pull request #28780: [SPARK-31952][SQL]Fix incorrect memory spill metric when doing Aggregate

2020-10-02 Thread GitBox


github-actions[bot] closed pull request #28780:
URL: https://github.com/apache/spark/pull/28780


   






[GitHub] [spark] AmplabJenkins commented on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29855:
URL: https://github.com/apache/spark/pull/29855#issuecomment-703013765










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29855:
URL: https://github.com/apache/spark/pull/29855#issuecomment-703013765










[GitHub] [spark] SparkQA commented on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

2020-10-02 Thread GitBox


SparkQA commented on pull request #29855:
URL: https://github.com/apache/spark/pull/29855#issuecomment-703013464


   **[Test build #129370 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129370/testReport)**
 for PR 29855 at commit 
[`db36f3f`](https://github.com/apache/spark/commit/db36f3fcaab6793379f6fa99ee7d27f9b5abb90d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

2020-10-02 Thread GitBox


SparkQA removed a comment on pull request #29855:
URL: https://github.com/apache/spark/pull/29855#issuecomment-702978917


   **[Test build #129370 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129370/testReport)**
 for PR 29855 at commit 
[`db36f3f`](https://github.com/apache/spark/commit/db36f3fcaab6793379f6fa99ee7d27f9b5abb90d).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29885: [SPARK-33010][SQL]Make DataFrameWriter.jdbc work for DataSource V2

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29885:
URL: https://github.com/apache/spark/pull/29885#issuecomment-703011638










[GitHub] [spark] AmplabJenkins commented on pull request #29885: [SPARK-33010][SQL]Make DataFrameWriter.jdbc work for DataSource V2

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29885:
URL: https://github.com/apache/spark/pull/29885#issuecomment-703011638










[GitHub] [spark] SparkQA removed a comment on pull request #29885: [SPARK-33010][SQL]Make DataFrameWriter.jdbc work for DataSource V2

2020-10-02 Thread GitBox


SparkQA removed a comment on pull request #29885:
URL: https://github.com/apache/spark/pull/29885#issuecomment-702933292


   **[Test build #129365 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129365/testReport)**
 for PR 29885 at commit 
[`19441da`](https://github.com/apache/spark/commit/19441da91073a48aa07e5af6642cb1cea667861e).






[GitHub] [spark] SparkQA commented on pull request #29885: [SPARK-33010][SQL]Make DataFrameWriter.jdbc work for DataSource V2

2020-10-02 Thread GitBox


SparkQA commented on pull request #29885:
URL: https://github.com/apache/spark/pull/29885#issuecomment-703011273


   **[Test build #129365 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129365/testReport)**
 for PR 29885 at commit 
[`19441da`](https://github.com/apache/spark/commit/19441da91073a48aa07e5af6642cb1cea667861e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29916:
URL: https://github.com/apache/spark/pull/29916#issuecomment-703008508


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29916:
URL: https://github.com/apache/spark/pull/29916#issuecomment-703008508










[GitHub] [spark] SparkQA commented on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin

2020-10-02 Thread GitBox


SparkQA commented on pull request #29916:
URL: https://github.com/apache/spark/pull/29916#issuecomment-703005189


   **[Test build #129372 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129372/testReport)** for PR 29916 at commit [`1dd54e8`](https://github.com/apache/spark/commit/1dd54e846b52edaa10a8bddd229ff743f9a9b1da).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29916:
URL: https://github.com/apache/spark/pull/29916#issuecomment-702872830


   Merged build finished. Test FAILed.






[GitHub] [spark] viirya commented on pull request #29916: [SPARK-33037][SHUFFLE] Remove knownManagers to support user's custom shuffle manager plugin

2020-10-02 Thread GitBox


viirya commented on pull request #29916:
URL: https://github.com/apache/spark/pull/29916#issuecomment-703003729


   retest this please






[GitHub] [spark] AmplabJenkins commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29936:
URL: https://github.com/apache/spark/pull/29936#issuecomment-70249










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29936:
URL: https://github.com/apache/spark/pull/29936#issuecomment-70249










[GitHub] [spark] SparkQA commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2

2020-10-02 Thread GitBox


SparkQA commented on pull request #29936:
URL: https://github.com/apache/spark/pull/29936#issuecomment-702999509


   **[Test build #129366 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129366/testReport)** for PR 29936 at commit [`16b3452`](https://github.com/apache/spark/commit/16b3452d88824615a094671cb5aa9b0bdba9b498).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2

2020-10-02 Thread GitBox


SparkQA removed a comment on pull request #29936:
URL: https://github.com/apache/spark/pull/29936#issuecomment-702953727


   **[Test build #129366 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129366/testReport)** for PR 29936 at commit [`16b3452`](https://github.com/apache/spark/commit/16b3452d88824615a094671cb5aa9b0bdba9b498).






[GitHub] [spark] xkrogen commented on a change in pull request #29906: [SPARK-32037][CORE] Rename blacklisting feature

2020-10-02 Thread GitBox


xkrogen commented on a change in pull request #29906:
URL: https://github.com/apache/spark/pull/29906#discussion_r499077732



##
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##
@@ -722,74 +722,83 @@ package object config {
   .booleanConf
   .createWithDefault(true)
 
-  // Blacklist confs
-  private[spark] val BLACKLIST_ENABLED =
-ConfigBuilder("spark.blacklist.enabled")
+  private[spark] val EXCLUDE_ON_FAILURE_ENABLED =
+ConfigBuilder("spark.excludeOnFailure.enabled")
   .version("2.1.0")

Review comment:
   Do we need to update the "from" version strings here?

##
File path: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala
##
@@ -907,13 +908,13 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val rpcEnv: Rp
   protected def currentDelegationTokens: Array[Byte] = delegationTokens.get()
 
   /**
-   * Checks whether the executor is blacklisted. This is called when the executor tries to
-   * register with the scheduler, and will deny registration if this method returns true.
+   * Checks whether the executor is excluded due to failure(s). This is called when the executor
+   *  tries to register with the scheduler, and will deny registration if this method returns true.

Review comment:
   minor nit: extra space at the start of the line

##
File path: core/src/main/scala/org/apache/spark/status/api/v1/api.scala
##
@@ -82,10 +82,11 @@ class ExecutorStageSummary private[spark](
 val shuffleWriteRecords : Long,
 val memoryBytesSpilled : Long,
 val diskBytesSpilled : Long,
-val isBlacklistedForStage: Boolean,
+val isBlacklistedForStage: Boolean, // deprecated

Review comment:
   Can we use `@deprecated` for this and others?

##
File path: core/src/main/scala/org/apache/spark/status/AppStatusListener.scala
##
@@ -284,80 +284,138 @@ private[spark] class AppStatusListener(
   }
 
   override def onExecutorBlacklisted(event: SparkListenerExecutorBlacklisted): Unit = {
-updateBlackListStatus(event.executorId, true)
+updateExcludedStatus(event.executorId, true)
+  }
+
+  override def onExecutorExcluded(event: SparkListenerExecutorExcluded): Unit = {
+updateExcludedStatus(event.executorId, true)
   }
 
   override def onExecutorBlacklistedForStage(
-  event: SparkListenerExecutorBlacklistedForStage): Unit = {
+event: SparkListenerExecutorBlacklistedForStage): Unit = {
+val now = System.nanoTime()
+
+Option(liveStages.get((event.stageId, event.stageAttemptId))).foreach { stage =>
+  setStageExcludedStatus(stage, now, event.executorId)
+}
+liveExecutors.get(event.executorId).foreach { exec =>
+  addExcludedStageTo(exec, event.stageId, now)
+}
+  }
+
+  override def onExecutorExcludedForStage(
+  event: SparkListenerExecutorExcludedForStage): Unit = {
 val now = System.nanoTime()
 
 Option(liveStages.get((event.stageId, event.stageAttemptId))).foreach { stage =>
-  setStageBlackListStatus(stage, now, event.executorId)
+  setStageExcludedStatus(stage, now, event.executorId)
 }
 liveExecutors.get(event.executorId).foreach { exec =>
-  addBlackListedStageTo(exec, event.stageId, now)
+  addExcludedStageTo(exec, event.stageId, now)
 }
   }
 
   override def onNodeBlacklistedForStage(event: SparkListenerNodeBlacklistedForStage): Unit = {
 val now = System.nanoTime()
 
-// Implicitly blacklist every available executor for the stage associated with this node
+// Implicitly exclude every available executor for the stage associated with this node
 Option(liveStages.get((event.stageId, event.stageAttemptId))).foreach { stage =>
   val executorIds = liveExecutors.values.filter(_.host == event.hostId).map(_.executorId).toSeq
-  setStageBlackListStatus(stage, now, executorIds: _*)
+  setStageExcludedStatus(stage, now, executorIds: _*)
 }
 liveExecutors.values.filter(_.hostname == event.hostId).foreach { exec =>
-  addBlackListedStageTo(exec, event.stageId, now)
+  addExcludedStageTo(exec, event.stageId, now)
+}
+  }
+
+  override def onNodeExcludedForStage(event: SparkListenerNodeExcludedForStage): Unit = {
+val now = System.nanoTime()
+
+// Implicitly exclude every available executor for the stage associated with this node
+Option(liveStages.get((event.stageId, event.stageAttemptId))).foreach { stage =>
+  val executorIds = liveExecutors.values.filter(_.host == event.hostId).map(_.executorId).toSeq
+  setStageExcludedStatus(stage, now, executorIds: _*)
+}
+liveExecutors.values.filter(_.hostname == event.hostId).foreach { exec =>
+  addExcludedStageTo(exec, event.stageId, now)
 }
   }
 
   private def addBlackListedStageTo(exec: LiveExecutor, stageId: Int, now: Long): Unit = {
-exec.blacklistedInStages += stageId
+

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29855:
URL: https://github.com/apache/spark/pull/29855#issuecomment-702995892










[GitHub] [spark] AmplabJenkins commented on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29855:
URL: https://github.com/apache/spark/pull/29855#issuecomment-702995892










[GitHub] [spark] SparkQA commented on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

2020-10-02 Thread GitBox


SparkQA commented on pull request #29855:
URL: https://github.com/apache/spark/pull/29855#issuecomment-702995878


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33979/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29880:
URL: https://github.com/apache/spark/pull/29880#issuecomment-702994100










[GitHub] [spark] SparkQA commented on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier

2020-10-02 Thread GitBox


SparkQA commented on pull request #29880:
URL: https://github.com/apache/spark/pull/29880#issuecomment-702994093


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33978/
   






[GitHub] [spark] AmplabJenkins commented on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29880:
URL: https://github.com/apache/spark/pull/29880#issuecomment-702994100










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29919: [SPARK-33042][SQL][TEST] Add a test case to ensure changes to spark.sql.optimizer.maxIterations take effect at runtime

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29919:
URL: https://github.com/apache/spark/pull/29919#issuecomment-702992524










[GitHub] [spark] AmplabJenkins commented on pull request #29919: [SPARK-33042][SQL][TEST] Add a test case to ensure changes to spark.sql.optimizer.maxIterations take effect at runtime

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29919:
URL: https://github.com/apache/spark/pull/29919#issuecomment-702992524










[GitHub] [spark] SparkQA removed a comment on pull request #29919: [SPARK-33042][SQL][TEST] Add a test case to ensure changes to spark.sql.optimizer.maxIterations take effect at runtime

2020-10-02 Thread GitBox


SparkQA removed a comment on pull request #29919:
URL: https://github.com/apache/spark/pull/29919#issuecomment-702894900


   **[Test build #129360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129360/testReport)** for PR 29919 at commit [`188d667`](https://github.com/apache/spark/commit/188d6671a4bac3b4422824f578606c52a5d527f1).






[GitHub] [spark] SparkQA commented on pull request #29919: [SPARK-33042][SQL][TEST] Add a test case to ensure changes to spark.sql.optimizer.maxIterations take effect at runtime

2020-10-02 Thread GitBox


SparkQA commented on pull request #29919:
URL: https://github.com/apache/spark/pull/29919#issuecomment-702992003


   **[Test build #129360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129360/testReport)** for PR 29919 at commit [`188d667`](https://github.com/apache/spark/commit/188d6671a4bac3b4422824f578606c52a5d527f1).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

2020-10-02 Thread GitBox


SparkQA commented on pull request #29855:
URL: https://github.com/apache/spark/pull/29855#issuecomment-702991343


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33979/
   






[GitHub] [spark] SparkQA commented on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier

2020-10-02 Thread GitBox


SparkQA commented on pull request #29880:
URL: https://github.com/apache/spark/pull/29880#issuecomment-702987903


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33978/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29831:
URL: https://github.com/apache/spark/pull/29831#issuecomment-702986590










[GitHub] [spark] SparkQA commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()

2020-10-02 Thread GitBox


SparkQA commented on pull request #29831:
URL: https://github.com/apache/spark/pull/29831#issuecomment-702986570


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33977/
   






[GitHub] [spark] AmplabJenkins commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29831:
URL: https://github.com/apache/spark/pull/29831#issuecomment-702986590










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29874:
URL: https://github.com/apache/spark/pull/29874#issuecomment-702985437


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129371/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


SparkQA commented on pull request #29874:
URL: https://github.com/apache/spark/pull/29874#issuecomment-702985425


   **[Test build #129371 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129371/testReport)** for PR 29874 at commit [`12a06c0`](https://github.com/apache/spark/commit/12a06c042011ef8302ab2b61c935714c58e8453f).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29874:
URL: https://github.com/apache/spark/pull/29874#issuecomment-702985432


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29874:
URL: https://github.com/apache/spark/pull/29874#issuecomment-702985432










[GitHub] [spark] SparkQA removed a comment on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


SparkQA removed a comment on pull request #29874:
URL: https://github.com/apache/spark/pull/29874#issuecomment-702984982


   **[Test build #129371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129371/testReport)** for PR 29874 at commit [`12a06c0`](https://github.com/apache/spark/commit/12a06c042011ef8302ab2b61c935714c58e8453f).






[GitHub] [spark] SparkQA commented on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


SparkQA commented on pull request #29874:
URL: https://github.com/apache/spark/pull/29874#issuecomment-702984982


   **[Test build #129371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129371/testReport)** for PR 29874 at commit [`12a06c0`](https://github.com/apache/spark/commit/12a06c042011ef8302ab2b61c935714c58e8453f).






[GitHub] [spark] Victsm commented on a change in pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

2020-10-02 Thread GitBox


Victsm commented on a change in pull request #29855:
URL: https://github.com/apache/spark/pull/29855#discussion_r499071938



##
File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/BlockPushException.java
##
@@ -0,0 +1,86 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.network.shuffle;
+
+import java.nio.ByteBuffer;
+import java.nio.charset.StandardCharsets;
+
+import org.apache.spark.network.shuffle.protocol.BlockTransferMessage;
+import org.apache.spark.network.shuffle.protocol.PushBlockStream;
+
+/**
+ * A special exception type that would decode the encoded {@link PushBlockStream} from the
+ * exception String. This complements the encoding logic in
+ * {@link org.apache.spark.network.server.TransportRequestHandler}.
+ */
+public class BlockPushException extends RuntimeException {
+  private PushBlockStream header;
+
+  /**
+   * String constant used for generating exception messages indicating a block to be merged
+   * arrives too late on the server side, and also for later checking such exceptions on the
+   * client side. When we get a block push failure because of the block arrives too late, we
+   * will not retry pushing the block nor log the exception on the client side.
+   */
+  public static final String TOO_LATE_MESSAGE_SUFFIX =
+  "received after merged shuffle is finalized";
+
+  /**
+   * String constant used for generating exception messages indicating the server couldn't
+   * append a block after all available attempts due to collision with other blocks belonging
+   * to the same shuffle partition, and also for later checking such exceptions on the client
+   * side. When we get a block push failure because of the block couldn't be written due to
+   * this reason, we will not log the exception on the client side.
+   */
+  public static final String COULD_NOT_FIND_OPPORTUNITY_MSG_PREFIX =
+  "Couldn't find an opportunity to write block";
+
+  private BlockPushException(PushBlockStream header, String message) {
+super(message);
+this.header = header;
+  }
+
+  public static BlockPushException decodeException(String message) {
+// Use ISO_8859_1 encoding instead of UTF_8. UTF_8 will change the byte 
content
+// for bytes larger than 127. This would render incorrect result when 
encoding
+// decoding the index inside the PushBlockStream message.
+ByteBuffer rawBuffer = 
ByteBuffer.wrap(message.getBytes(StandardCharsets.ISO_8859_1));
+try {
+  BlockTransferMessage msgObj = 
BlockTransferMessage.Decoder.fromByteBuffer(rawBuffer);
+  if (msgObj instanceof PushBlockStream) {
+PushBlockStream header = (PushBlockStream) msgObj;
+// When decoding the header, the rawBuffer's position is not updated 
since it was
+// consumed via netty's ByteBuf. Updating the rawBuffer's position 
here to retrieve
+// the remaining exception message.
+ByteBuffer remainingBuffer = (ByteBuffer) 
rawBuffer.position(rawBuffer.position()
++ header.encodedLength() + 1);
+return new BlockPushException(header,
+StandardCharsets.UTF_8.decode(remainingBuffer).toString());
+  } else {
+throw new UnsupportedOperationException(String.format("Cannot decode 
the header. "
++ "Expected PushBlockStream but got %s instead", 
msgObj.getClass().getSimpleName()));
+  }
+} catch (Exception e) {
+  return new BlockPushException(null, message);

Review comment:
   Before fixing this, I want to first settle the discussion on `TransportRequestHandler` regarding your suggestion to keep `PushBlockStream` as metadata tracked on the client side.
   I updated that thread with some of my previous thoughts.
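
The charset remark in the diff's code comment can be checked in isolation. The following is a minimal sketch, not code from the PR: the class `CharsetRoundTrip` and its `roundTrip` helper are hypothetical names, and only standard JDK calls are used. It shows that decoding arbitrary bytes as ISO_8859_1 and re-encoding them returns the same bytes, while the same round trip through UTF_8 alters bytes above 127.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class CharsetRoundTrip {
  /** Decodes the bytes into a String with the given charset, then encodes them again. */
  static byte[] roundTrip(byte[] raw, Charset charset) {
    return new String(raw, charset).getBytes(charset);
  }

  public static void main(String[] args) {
    // 0x80 and 0xFF are larger than 127 and are not valid UTF-8 on their own.
    byte[] raw = {0x01, (byte) 0x80, (byte) 0xFF};
    // ISO_8859_1 maps every byte to exactly one char, so nothing is lost.
    System.out.println(Arrays.equals(raw, roundTrip(raw, StandardCharsets.ISO_8859_1))); // true
    // UTF_8 replaces the malformed bytes, so the re-encoded array differs.
    System.out.println(Arrays.equals(raw, roundTrip(raw, StandardCharsets.UTF_8))); // false
  }
}
```

This is why a binary header carried inside an exception String has to be recovered with a single-byte charset, while the human-readable remainder can still be decoded as UTF_8.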





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: 

[GitHub] [spark] SparkQA commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()

2020-10-02 Thread GitBox


SparkQA commented on pull request #29831:
URL: https://github.com/apache/spark/pull/29831#issuecomment-702980304


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33977/
   






[GitHub] [spark] SparkQA commented on pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

2020-10-02 Thread GitBox


SparkQA commented on pull request #29855:
URL: https://github.com/apache/spark/pull/29855#issuecomment-702978917


   **[Test build #129370 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129370/testReport)**
 for PR 29855 at commit 
[`db36f3f`](https://github.com/apache/spark/commit/db36f3fcaab6793379f6fa99ee7d27f9b5abb90d).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29874:
URL: https://github.com/apache/spark/pull/29874#issuecomment-702977132


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/129369/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29874:
URL: https://github.com/apache/spark/pull/29874#issuecomment-702977127


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


SparkQA removed a comment on pull request #29874:
URL: https://github.com/apache/spark/pull/29874#issuecomment-702976554


   **[Test build #129369 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129369/testReport)**
 for PR 29874 at commit 
[`1b0ba28`](https://github.com/apache/spark/commit/1b0ba28af9e3c7b80ebc095bea8b78b70c5b5c4a).






[GitHub] [spark] AmplabJenkins commented on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29874:
URL: https://github.com/apache/spark/pull/29874#issuecomment-702977127










[GitHub] [spark] SparkQA commented on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


SparkQA commented on pull request #29874:
URL: https://github.com/apache/spark/pull/29874#issuecomment-702977113


   **[Test build #129369 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129369/testReport)**
 for PR 29874 at commit 
[`1b0ba28`](https://github.com/apache/spark/commit/1b0ba28af9e3c7b80ebc095bea8b78b70c5b5c4a).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] Victsm commented on a change in pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

2020-10-02 Thread GitBox


Victsm commented on a change in pull request #29855:
URL: https://github.com/apache/spark/pull/29855#discussion_r498965319



##
File path: 
common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java
##
@@ -181,6 +182,17 @@ public void onFailure(Throwable e) {
   private void processStreamUpload(final UploadStream req) {
     assert (req.body() == null);
     try {
+      // Retain the original metadata buffer, since it will be used during the invocation of
+      // this method. Will be released later.
+      req.meta.retain();
+      // Make a copy of the original metadata buffer. In benchmark, we noticed that
+      // we cannot respond the original metadata buffer back to the client, otherwise
+      // in cases where multiple concurrent shuffles are present, a wrong metadata might
+      // be sent back to client. This is related to the eager release of the metadata buffer,
+      // i.e., we always release the original buffer by the time the invocation of this
+      // method ends, instead of by the time we respond it to the client. This is necessary,
+      // otherwise we start seeing memory issues very quickly in benchmarks.
+      ByteBuffer meta = cloneBuffer(req.meta.nioByteBuffer());

Review comment:
   For the `req.meta` issue, my understanding is the following:
   `processStreamUpload` is only responsible for creating a `StreamCallbackWithID` to be added into the FrameDecoder as a stream interceptor.
   The Netty ByteBuf `req.meta` will be released by the time this method exits.
   However, the stream callback needs to send `req.meta` back in its response after this method exits.
   Accessing the value of the Netty ByteBuf after it's released is what's causing the issue mentioned in the comment.
   I tried to delay the release of `req.meta` until the stream callback finishes processing the stream; however, that can lead to memory issues on the shuffle service side when there are many blocks to be transferred.
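
The copy-before-release idea described above can be illustrated with plain `java.nio` buffers. This is a rough sketch under stated assumptions: the `cloneBuffer` below is a hypothetical stand-in (the PR's own `cloneBuffer` is not shown in this thread), and overwriting the backing array only simulates pooled memory being reused after release. Netty itself is not involved here.

```java
import java.nio.ByteBuffer;

final class MetaCopySketch {
  /** Deep-copies the readable bytes of src; the copy does not share memory with src. */
  static ByteBuffer cloneBuffer(ByteBuffer src) {
    ByteBuffer copy = ByteBuffer.allocate(src.remaining());
    copy.put(src.duplicate()); // duplicate() keeps src's own position untouched
    copy.flip();               // make the copied bytes readable from index 0
    return copy;
  }

  public static void main(String[] args) {
    ByteBuffer pooled = ByteBuffer.wrap(new byte[] {1, 2, 3});
    ByteBuffer view = pooled.duplicate();  // shares the underlying memory
    ByteBuffer copy = cloneBuffer(pooled); // owns its own memory

    pooled.put(0, (byte) 9); // simulates the released memory being reused for other data

    System.out.println(view.get(0)); // 9: a late reader of the shared memory sees wrong data
    System.out.println(copy.get(0)); // 1: the copy still holds the original metadata
  }
}
```

The trade-off is one extra allocation per upload in exchange for being able to release the original buffer as soon as the method returns.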








[GitHub] [spark] SparkQA commented on pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier

2020-10-02 Thread GitBox


SparkQA commented on pull request #29880:
URL: https://github.com/apache/spark/pull/29880#issuecomment-702976531


   **[Test build #129368 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129368/testReport)**
 for PR 29880 at commit 
[`11cfcd3`](https://github.com/apache/spark/commit/11cfcd30f5e38789698cbbfdd3e2a740685339f0).






[GitHub] [spark] SparkQA commented on pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


SparkQA commented on pull request #29874:
URL: https://github.com/apache/spark/pull/29874#issuecomment-702976554


   **[Test build #129369 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129369/testReport)**
 for PR 29874 at commit 
[`1b0ba28`](https://github.com/apache/spark/commit/1b0ba28af9e3c7b80ebc095bea8b78b70c5b5c4a).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29936:
URL: https://github.com/apache/spark/pull/29936#issuecomment-702976109










[GitHub] [spark] AmplabJenkins commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29936:
URL: https://github.com/apache/spark/pull/29936#issuecomment-702976109










[GitHub] [spark] SparkQA commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2

2020-10-02 Thread GitBox


SparkQA commented on pull request #29936:
URL: https://github.com/apache/spark/pull/29936#issuecomment-702976090


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33976/
   






[GitHub] [spark] Victsm commented on a change in pull request #29855: [SPARK-32915][CORE] Network-layer and shuffle RPC layer changes to support push shuffle blocks

2020-10-02 Thread GitBox


Victsm commented on a change in pull request #29855:
URL: https://github.com/apache/spark/pull/29855#discussion_r498985394



##
File path: 
common/network-common/src/main/java/org/apache/spark/network/server/TransportRequestHandler.java
##
@@ -181,6 +182,17 @@ public void onFailure(Throwable e) {
   private void processStreamUpload(final UploadStream req) {
     assert (req.body() == null);
     try {
+      // Retain the original metadata buffer, since it will be used during the invocation of
+      // this method. Will be released later.
+      req.meta.retain();
+      // Make a copy of the original metadata buffer. In benchmark, we noticed that
+      // we cannot respond the original metadata buffer back to the client, otherwise
+      // in cases where multiple concurrent shuffles are present, a wrong metadata might
+      // be sent back to client. This is related to the eager release of the metadata buffer,
+      // i.e., we always release the original buffer by the time the invocation of this
+      // method ends, instead of by the time we respond it to the client. This is necessary,
+      // otherwise we start seeing memory issues very quickly in benchmarks.
+      ByteBuffer meta = cloneBuffer(req.meta.nioByteBuffer());

Review comment:
   I still do not want to change the `TransportClient#uploadStream` API itself.
   This transport layer utility was previously used for transferring large RDD partition blocks, and is now reused for shuffle block push.
   In the future, it is possible that other use cases might benefit from this utility as well.
   I believe keeping this API generic and not specific to one use case is important.
   
   For the change you proposed to keep the `PushBlockStream` as metadata tracked on the client side, I also thought about doing that during implementation.
   It's cleaner than the current approach.
   
   One way to do this without incurring any potential protocol change would be to make `BlockPushCallback` inside `OneForOneBlockPusher` stateful.
   Currently, that callback is stateless, so multiple invocations of `TransportClient#uploadStream` for the same batch of blocks reuse the same callback object.
   If we make that callback object stateful, so that it keeps track of the additional metadata ManagedBuffer, then the callback object would have what we need built into it at creation.
   My concern during the implementation was the potential JVM pressure this approach might generate, since we would create one callback object per block to be pushed.
   
   What do you think?
   Also CC @mridulm @tgravescs @squito @Ngone51 @jiangxb1987 for your input on this.
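
The stateful-callback alternative discussed above might look roughly like the sketch below. Everything in it is hypothetical: `ResponseCallback`, `PerBlockCallback`, and `pushAllAndFail` are illustrative names rather than Spark classes, and a plain block id String stands in for the metadata the real callback would track. The point is only that a callback created per block already knows which block failed, so nothing has to be encoded into the error response.

```java
import java.util.ArrayList;
import java.util.List;

final class StatefulCallbackSketch {
  /** Hypothetical stand-in for a transport-layer response callback. */
  interface ResponseCallback {
    void onFailure(Throwable e);
  }

  /** Stateful variant: one instance per block, with the block id fixed at creation. */
  static final class PerBlockCallback implements ResponseCallback {
    private final String blockId;
    private final List<String> failedBlocks;

    PerBlockCallback(String blockId, List<String> failedBlocks) {
      this.blockId = blockId;
      this.failedBlocks = failedBlocks;
    }

    @Override
    public void onFailure(Throwable e) {
      // The callback already knows its block; nothing is decoded from the error message.
      failedBlocks.add(blockId);
    }
  }

  /** Simulates pushing every block and having each push fail. */
  static List<String> pushAllAndFail(String[] blockIds) {
    List<String> failedBlocks = new ArrayList<>();
    for (String blockId : blockIds) {
      // One callback object per block: this is the allocation cost raised in the comment.
      ResponseCallback callback = new PerBlockCallback(blockId, failedBlocks);
      callback.onFailure(new RuntimeException("simulated push failure"));
    }
    return failedBlocks;
  }

  public static void main(String[] args) {
    System.out.println(pushAllAndFail(new String[] {"shuffle_0_1_2", "shuffle_0_3_2"}));
  }
}
```

Each callback here holds two references, so the per-block cost is one small short-lived object; whether that matters at the scale of a real shuffle is the open question in the comment.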
   








[GitHub] [spark] xkrogen commented on a change in pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


xkrogen commented on a change in pull request #29874:
URL: https://github.com/apache/spark/pull/29874#discussion_r499067150



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala
##
@@ -61,7 +61,8 @@ private[hive] object IsolatedClientLoader extends Logging {
     val files = if (resolvedVersions.contains((resolvedVersion, hadoopVersion))) {
       resolvedVersions((resolvedVersion, hadoopVersion))
     } else {
-      val remoteRepos = sparkConf.get(SQLConf.ADDITIONAL_REMOTE_REPOSITORIES)
+      val remoteRepos = sys.env.getOrElse(
+        "DEFAULT_ARTIFACT_REPOSITORY", sparkConf.get(SQLConf.ADDITIONAL_REMOTE_REPOSITORIES))

Review comment:
   IMO if you want to fully change the repository, you should be 
configuring both `DEFAULT_ARTIFACT_REPOSITORY` and 
`spark.sql.maven.additionalRemoteRepositories`. I think having a config whose 
default value changes based on an environment variable is confusing behavior.
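
One way to read the suggestion above is that the two settings stay independent, each with a fixed default. The sketch below is an illustration only, not Spark code: `RepoResolutionSketch` and `resolve` are hypothetical, the environment and configuration are passed in as plain maps, and the Maven Central URL used as the first default is an assumption. The environment variable selects the primary repository, the config key lists the additional ones, and neither default depends on the other setting.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RepoResolutionSketch {
  // Both defaults are fixed constants; neither one is derived from the environment.
  static final String MAVEN_CENTRAL = "https://repo1.maven.org/maven2/"; // assumed default
  static final String DEFAULT_ADDITIONAL =
      "https://maven-central.storage-download.googleapis.com/maven2/";

  /** Returns the primary repository followed by the additional (comma-delimited) ones. */
  static List<String> resolve(Map<String, String> env, Map<String, String> conf) {
    List<String> repos = new ArrayList<>();
    repos.add(env.getOrDefault("DEFAULT_ARTIFACT_REPOSITORY", MAVEN_CENTRAL));
    String additional =
        conf.getOrDefault("spark.sql.maven.additionalRemoteRepositories", DEFAULT_ADDITIONAL);
    for (String repo : additional.split(",")) {
      String trimmed = repo.trim();
      if (!trimmed.isEmpty()) {
        repos.add(trimmed);
      }
    }
    return repos;
  }

  public static void main(String[] args) {
    Map<String, String> env = new HashMap<>();
    env.put("DEFAULT_ARTIFACT_REPOSITORY", "https://repo.internal.example/maven2/");
    Map<String, String> conf = new HashMap<>();
    conf.put("spark.sql.maven.additionalRemoteRepositories",
        "https://mirror.internal.example/maven2/");
    // To replace every public repository, both settings are supplied explicitly.
    System.out.println(resolve(env, conf));
  }
}
```

With this split, setting only the environment variable leaves the additional-repository default in place, which matches the behavior reported in the follow-up comment.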








[GitHub] [spark] ankits commented on a change in pull request #29874: [SPARK-32998] Add ability to override default remote repos with inter…

2020-10-02 Thread GitBox


ankits commented on a change in pull request #29874:
URL: https://github.com/apache/spark/pull/29874#discussion_r499065894



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/IsolatedClientLoader.scala
##
@@ -61,7 +61,8 @@ private[hive] object IsolatedClientLoader extends Logging {
     val files = if (resolvedVersions.contains((resolvedVersion, hadoopVersion))) {
       resolvedVersions((resolvedVersion, hadoopVersion))
     } else {
-      val remoteRepos = sparkConf.get(SQLConf.ADDITIONAL_REMOTE_REPOSITORIES)
+      val remoteRepos = sys.env.getOrElse(
+        "DEFAULT_ARTIFACT_REPOSITORY", sparkConf.get(SQLConf.ADDITIONAL_REMOTE_REPOSITORIES))

Review comment:
   @xkrogen During my testing of your suggested changes, the test still tries to download the artifact from `SQLConf.ADDITIONAL_REMOTE_REPOSITORIES`, which points to `https://maven-central.storage-download.googleapis.com/maven2/`. I still need the change in `SQLConf.scala` to override the Maven repo.
   
   ```scala
   val ADDITIONAL_REMOTE_REPOSITORIES =
     buildConf("spark.sql.maven.additionalRemoteRepositories")
       .doc("A comma-delimited string config of the optional additional remote Maven mirror " +
         "repositories. This is only used for downloading Hive jars in IsolatedClientLoader " +
         "if the default Maven Central repo is unreachable.")
       .version("3.0.0")
       .stringConf
       .createWithDefault(
         sys.env.getOrElse(
           "DEFAULT_ARTIFACT_REPOSITORY",
           "https://maven-central.storage-download.googleapis.com/maven2/"))
   ```
   
   Let me know your thoughts on this.








[GitHub] [spark] imback82 commented on a change in pull request #29880: [SPARK-33004][SQL] Migrate DESCRIBE column to use UnresolvedTableOrView to resolve the identifier

2020-10-02 Thread GitBox


imback82 commented on a change in pull request #29880:
URL: https://github.com/apache/spark/pull/29880#discussion_r499065054



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/v2Commands.scala
##
@@ -315,6 +315,17 @@ case class DescribeRelation(
   override def output: Seq[Attribute] = DescribeTableSchema.describeTableAttributes()
 }
 
+/**
+ * The logical plan of the DESCRIBE relation_name col_name command that works for v2 tables.
+ */
+case class DescribeColumn(
+    relation: LogicalPlan,
+    colNameParts: Seq[String],

Review comment:
   > A simple idea is to put an `UnresolvedAttribute` here, and analyzer can do the work for us.
   
   Since we need to have the relation resolved first, we need to match like the following in the analyzer:
   ```scala
   case DescribeColumn(r: ResolvedTable, u: UnresolvedAttribute, _) =>
   ...
   ```
   Is that what you had in mind?








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29831:
URL: https://github.com/apache/spark/pull/29831#issuecomment-702872831


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()

2020-10-02 Thread GitBox


SparkQA commented on pull request #29831:
URL: https://github.com/apache/spark/pull/29831#issuecomment-702970066


   **[Test build #129367 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129367/testReport)**
 for PR 29831 at commit 
[`15ec353`](https://github.com/apache/spark/commit/15ec3534e345631fd775d5679507e651291e0552).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29936:
URL: https://github.com/apache/spark/pull/29936#issuecomment-702968945










[GitHub] [spark] SparkQA commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2

2020-10-02 Thread GitBox


SparkQA commented on pull request #29936:
URL: https://github.com/apache/spark/pull/29936#issuecomment-702969149


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33976/
   






[GitHub] [spark] AmplabJenkins commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2

2020-10-02 Thread GitBox


AmplabJenkins commented on pull request #29936:
URL: https://github.com/apache/spark/pull/29936#issuecomment-702968945










[GitHub] [spark] SparkQA removed a comment on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2

2020-10-02 Thread GitBox


SparkQA removed a comment on pull request #29936:
URL: https://github.com/apache/spark/pull/29936#issuecomment-702908152


   **[Test build #129362 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129362/testReport)**
 for PR 29936 at commit 
[`032499e`](https://github.com/apache/spark/commit/032499ea09191cf86aa5eb4f06ca559c5e30d0c2).






[GitHub] [spark] SparkQA commented on pull request #29936: [WIP][BUILD][SQL] Remove Hive 1.2

2020-10-02 Thread GitBox


SparkQA commented on pull request #29936:
URL: https://github.com/apache/spark/pull/29936#issuecomment-702968296


   **[Test build #129362 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129362/testReport)**
 for PR 29936 at commit 
[`032499e`](https://github.com/apache/spark/commit/032499ea09191cf86aa5eb4f06ca559c5e30d0c2).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] CodingCat commented on pull request #29831: [SPARK-32351][SQL] Show partially pushed down partition filters in explain()

2020-10-02 Thread GitBox


CodingCat commented on pull request #29831:
URL: https://github.com/apache/spark/pull/29831#issuecomment-702967806


   Jenkins, retest this, please






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29885: [SPARK-33010][SQL]Make DataFrameWriter.jdbc work for DataSource V2

2020-10-02 Thread GitBox


AmplabJenkins removed a comment on pull request #29885:
URL: https://github.com/apache/spark/pull/29885#issuecomment-702957113










[GitHub] [spark] SparkQA commented on pull request #29885: [SPARK-33010][SQL]Make DataFrameWriter.jdbc work for DataSource V2

2020-10-02 Thread GitBox


SparkQA commented on pull request #29885:
URL: https://github.com/apache/spark/pull/29885#issuecomment-702957091


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/33975/
   





