[GitHub] [spark] SparkQA commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-29 Thread GitBox


SparkQA commented on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-635798742


   **[Test build #123266 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123266/testReport)** for PR 28593 at commit [`2e55be3`](https://github.com/apache/spark/commit/2e55be370685530ac98ee0aa9c9b4ab9c5b9ab96).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-635799284










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-635799284










[GitHub] [spark] SparkQA commented on pull request #27627: [WIP][SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow

2020-05-29 Thread GitBox


SparkQA commented on pull request #27627:
URL: https://github.com/apache/spark/pull/27627#issuecomment-635800559


   **[Test build #123260 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123260/testReport)** for PR 27627 at commit [`59a00c4`](https://github.com/apache/spark/commit/59a00c4e1092579532c37569261fb830c194f891).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-29 Thread GitBox


SparkQA commented on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-635800563


   **[Test build #123265 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123265/testReport)** for PR 28627 at commit [`2e805ec`](https://github.com/apache/spark/commit/2e805ec3276d820935b987861ab90a042c1a8638).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #28619: [SPARK-21040][CORE] Speculate tasks which are running on decommission executors

2020-05-29 Thread GitBox


SparkQA commented on pull request #28619:
URL: https://github.com/apache/spark/pull/28619#issuecomment-635800561


   **[Test build #123262 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123262/testReport)** for PR 28619 at commit [`1cae338`](https://github.com/apache/spark/commit/1cae338342376b16c42e436b2a8fdb2240e9d9b9).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-05-29 Thread GitBox


SparkQA commented on pull request #27246:
URL: https://github.com/apache/spark/pull/27246#issuecomment-635800568


   **[Test build #123264 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123264/testReport)** for PR 27246 at commit [`e48a936`](https://github.com/apache/spark/commit/e48a936c0a4205e12994767ff47a21fddef60ac4).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #27627: [WIP][SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #27627:
URL: https://github.com/apache/spark/pull/27627#issuecomment-635800850










[GitHub] [spark] AmplabJenkins commented on pull request #28619: [SPARK-21040][CORE] Speculate tasks which are running on decommission executors

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28619:
URL: https://github.com/apache/spark/pull/28619#issuecomment-635800877










[GitHub] [spark] AmplabJenkins commented on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #27246:
URL: https://github.com/apache/spark/pull/27246#issuecomment-635800748










[GitHub] [spark] AmplabJenkins commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-635800585










[GitHub] [spark] SparkQA removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-29 Thread GitBox


SparkQA removed a comment on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-635779238


   **[Test build #123265 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123265/testReport)** for PR 28627 at commit [`2e805ec`](https://github.com/apache/spark/commit/2e805ec3276d820935b987861ab90a042c1a8638).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28619: [SPARK-21040][CORE] Speculate tasks which are running on decommission executors

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28619:
URL: https://github.com/apache/spark/pull/28619#issuecomment-635800877


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27627: [WIP][SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #27627:
URL: https://github.com/apache/spark/pull/27627#issuecomment-635800850


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-635800585


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #27246:
URL: https://github.com/apache/spark/pull/27246#issuecomment-635800748


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-05-29 Thread GitBox


SparkQA removed a comment on pull request #27246:
URL: https://github.com/apache/spark/pull/27246#issuecomment-635776883


   **[Test build #123264 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123264/testReport)** for PR 27246 at commit [`e48a936`](https://github.com/apache/spark/commit/e48a936c0a4205e12994767ff47a21fddef60ac4).






[GitHub] [spark] SparkQA removed a comment on pull request #28619: [SPARK-21040][CORE] Speculate tasks which are running on decommission executors

2020-05-29 Thread GitBox


SparkQA removed a comment on pull request #28619:
URL: https://github.com/apache/spark/pull/28619#issuecomment-635763920


   **[Test build #123262 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123262/testReport)** for PR 28619 at commit [`1cae338`](https://github.com/apache/spark/commit/1cae338342376b16c42e436b2a8fdb2240e9d9b9).






[GitHub] [spark] SparkQA removed a comment on pull request #27627: [WIP][SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow

2020-05-29 Thread GitBox


SparkQA removed a comment on pull request #27627:
URL: https://github.com/apache/spark/pull/27627#issuecomment-635732632


   **[Test build #123260 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123260/testReport)** for PR 27627 at commit [`59a00c4`](https://github.com/apache/spark/commit/59a00c4e1092579532c37569261fb830c194f891).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27627: [WIP][SPARK-28067][SQL] Fix incorrect results for decimal aggregate sum by returning null on decimal overflow

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #27627:
URL: https://github.com/apache/spark/pull/27627#issuecomment-635800855


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123260/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28619: [SPARK-21040][CORE] Speculate tasks which are running on decommission executors

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28619:
URL: https://github.com/apache/spark/pull/28619#issuecomment-635800888


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123262/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #27246:
URL: https://github.com/apache/spark/pull/27246#issuecomment-635800753


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123264/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-635800594


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123265/
   Test FAILed.






[GitHub] [spark] maropu commented on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-05-29 Thread GitBox


maropu commented on pull request #27246:
URL: https://github.com/apache/spark/pull/27246#issuecomment-635803012


   retest this please






[GitHub] [spark] sarutak commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-29 Thread GitBox


sarutak commented on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-635804155


   retest this please.






[GitHub] [spark] SparkQA commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-29 Thread GitBox


SparkQA commented on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-635804837


   **[Test build #123268 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123268/testReport)** for PR 28627 at commit [`2e805ec`](https://github.com/apache/spark/commit/2e805ec3276d820935b987861ab90a042c1a8638).






[GitHub] [spark] SparkQA commented on pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF

2020-05-29 Thread GitBox


SparkQA commented on pull request #28645:
URL: https://github.com/apache/spark/pull/28645#issuecomment-635804842


   **[Test build #123267 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123267/testReport)** for PR 28645 at commit [`8576d28`](https://github.com/apache/spark/commit/8576d283cca15366aa86e01a7826345865308fd5).






[GitHub] [spark] SparkQA commented on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-05-29 Thread GitBox


SparkQA commented on pull request #27246:
URL: https://github.com/apache/spark/pull/27246#issuecomment-635804840


   **[Test build #123269 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123269/testReport)** for PR 27246 at commit [`e48a936`](https://github.com/apache/spark/commit/e48a936c0a4205e12994767ff47a21fddef60ac4).






[GitHub] [spark] AmplabJenkins commented on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #27246:
URL: https://github.com/apache/spark/pull/27246#issuecomment-635805381










[GitHub] [spark] maropu commented on pull request #28527: [SPARK-31709][SQL] Proper base path for database/table location when it is a relative path

2020-05-29 Thread GitBox


maropu commented on pull request #28527:
URL: https://github.com/apache/spark/pull/28527#issuecomment-635805570


   retest this please






[GitHub] [spark] AmplabJenkins commented on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-635805507










[GitHub] [spark] AmplabJenkins commented on pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28645:
URL: https://github.com/apache/spark/pull/28645#issuecomment-635805274










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28645:
URL: https://github.com/apache/spark/pull/28645#issuecomment-635805274










[GitHub] [spark] AmplabJenkins removed a comment on pull request #27246: [SPARK-30536][CORE][SQL] Sort-merge join operator spilling performance improvements

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #27246:
URL: https://github.com/apache/spark/pull/27246#issuecomment-635805381










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28627: [SPARK-31756][WEBUI][test-maven] Add real headless browser support for UI test

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28627:
URL: https://github.com/apache/spark/pull/28627#issuecomment-635805507










[GitHub] [spark] maropu commented on pull request #28527: [SPARK-31709][SQL] Proper base path for database/table location when it is a relative path

2020-05-29 Thread GitBox


maropu commented on pull request #28527:
URL: https://github.com/apache/spark/pull/28527#issuecomment-635805629


   cc: @cloud-fan 






[GitHub] [spark] SparkQA commented on pull request #28527: [SPARK-31709][SQL] Proper base path for database/table location when it is a relative path

2020-05-29 Thread GitBox


SparkQA commented on pull request #28527:
URL: https://github.com/apache/spark/pull/28527#issuecomment-635807929


   **[Test build #123270 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123270/testReport)** for PR 28527 at commit [`3fbe65a`](https://github.com/apache/spark/commit/3fbe65a0c3f1e59e5af1f5d3f3b7beb13c0636f6).






[GitHub] [spark] cloud-fan commented on pull request #28650: [SPARK-31830][SQL] Consistent error handling for datetime formatting and parsing functions

2020-05-29 Thread GitBox


cloud-fan commented on pull request #28650:
URL: https://github.com/apache/spark/pull/28650#issuecomment-635808253


   Since this PR is for master only, let's fix the `format()` first, in case this PR introduces conflicts.






[GitHub] [spark] AmplabJenkins commented on pull request #28527: [SPARK-31709][SQL] Proper base path for database/table location when it is a relative path

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28527:
URL: https://github.com/apache/spark/pull/28527#issuecomment-635808451










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28527: [SPARK-31709][SQL] Proper base path for database/table location when it is a relative path

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28527:
URL: https://github.com/apache/spark/pull/28527#issuecomment-635808451










[GitHub] [spark] cloud-fan commented on pull request #28626: [SPARK-28481][SQL] More expressions should extend NullIntolerant

2020-05-29 Thread GitBox


cloud-fan commented on pull request #28626:
URL: https://github.com/apache/spark/pull/28626#issuecomment-635810298


   thanks, merging to master!






[GitHub] [spark] cloud-fan closed pull request #28626: [SPARK-28481][SQL] More expressions should extend NullIntolerant

2020-05-29 Thread GitBox


cloud-fan closed pull request #28626:
URL: https://github.com/apache/spark/pull/28626


   






[GitHub] [spark] huaxingao commented on pull request #28672: [SPARK-31866][SQL][DOCS] Add Coalesce/Repartition/Repartition_By_Range Hints to SQL Reference

2020-05-29 Thread GitBox


huaxingao commented on pull request #28672:
URL: https://github.com/apache/spark/pull/28672#issuecomment-635811122


   Yes, it's for 3.0. I created jira SPARK-31866. @maropu  






[GitHub] [spark] SparkQA commented on pull request #28672: [SPARK-31866][SQL][DOCS] Add Coalesce/Repartition/Repartition_By_Range Hints to SQL Reference

2020-05-29 Thread GitBox


SparkQA commented on pull request #28672:
URL: https://github.com/apache/spark/pull/28672#issuecomment-635811171


   **[Test build #123271 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123271/testReport)**
 for PR 28672 at commit 
[`60fdb93`](https://github.com/apache/spark/commit/60fdb93800a2843268ed21501ee7f26950949411).






[GitHub] [spark] viirya commented on a change in pull request #28648: [SPARK-31788][CORE][DSTREAM][PYTHON] Recover the support of union for different types of RDD and DStreams

2020-05-29 Thread GitBox


viirya commented on a change in pull request #28648:
URL: https://github.com/apache/spark/pull/28648#discussion_r432297967



##
File path: python/pyspark/context.py
##
@@ -864,8 +865,21 @@ def union(self, rdds):
         first_jrdd_deserializer = rdds[0]._jrdd_deserializer
         if any(x._jrdd_deserializer != first_jrdd_deserializer for x in rdds):
             rdds = [x._reserialize() for x in rdds]
-        cls = SparkContext._jvm.org.apache.spark.api.java.JavaRDD
-        jrdds = SparkContext._gateway.new_array(cls, len(rdds))
+        gw = SparkContext._gateway
+        jvm = SparkContext._jvm
+        jrdd_cls = jvm.org.apache.spark.api.java.JavaRDD
+        jpair_rdd_cls = jvm.org.apache.spark.api.java.JavaPairRDD
+        jdouble_rdd_cls = jvm.org.apache.spark.api.java.JavaDoubleRDD
+        if is_instance_of(gw, rdds[0]._jrdd, jrdd_cls):
+            cls = jrdd_cls
+        elif is_instance_of(gw, rdds[0]._jrdd, jpair_rdd_cls):
+            cls = jpair_rdd_cls
+        elif is_instance_of(gw, rdds[0]._jrdd, jdouble_rdd_cls):
+            cls = jdouble_rdd_cls
+        else:
+            cls_name = rdds[0]._jrdd.getClass().getCanonicalName()
+            raise TypeError("Unsupported Java DStream class %s" % cls_name)
+        jrdds = gw.new_array(cls, len(rdds))

Review comment:
   Why do we say "Java DStream class" here?
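
   For readers following the hunk, the dispatch it performs can be sketched in isolation. This is only an illustration: the three classes below are plain Python stand-ins for the Py4J class handles, `pick_array_class` is a hypothetical helper that does not exist in PySpark, and the error text uses "RDD" as one possible wording since the branch inspects `rdds[0]._jrdd`.

```python
# Stand-ins for the Java-side classes; assumptions for illustration only.
class JavaRDD:
    pass


class JavaPairRDD:
    pass


class JavaDoubleRDD:
    pass


# Checked in the same order as the if/elif chain in the hunk.
SUPPORTED_CLASSES = (JavaRDD, JavaPairRDD, JavaDoubleRDD)


def pick_array_class(jrdd):
    """Return the first supported class that jrdd is an instance of."""
    for cls in SUPPORTED_CLASSES:
        if isinstance(jrdd, cls):
            return cls
    # No supported class matched: fail loudly and name the offending type.
    raise TypeError("Unsupported Java RDD class %s" % type(jrdd).__name__)
```

   The chosen class would then play the role of `cls` in `gw.new_array(cls, len(rdds))`.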








[GitHub] [spark] AmplabJenkins commented on pull request #28672: [SPARK-31866][SQL][DOCS] Add Coalesce/Repartition/Repartition_By_Range Hints to SQL Reference

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28672:
URL: https://github.com/apache/spark/pull/28672#issuecomment-635811629










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28672: [SPARK-31866][SQL][DOCS] Add Coalesce/Repartition/Repartition_By_Range Hints to SQL Reference

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28672:
URL: https://github.com/apache/spark/pull/28672#issuecomment-635811629










[GitHub] [spark] SparkQA commented on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-29 Thread GitBox


SparkQA commented on pull request #28647:
URL: https://github.com/apache/spark/pull/28647#issuecomment-635814387


   **[Test build #123272 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123272/testReport)**
 for PR 28647 at commit 
[`175d0e2`](https://github.com/apache/spark/commit/175d0e29fe5c53af0da3d8bbae7156eee1416f69).






[GitHub] [spark] ulysses-you commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-29 Thread GitBox


ulysses-you commented on a change in pull request #28647:
URL: https://github.com/apache/spark/pull/28647#discussion_r432308450



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala
##
@@ -115,6 +116,15 @@ case class CreateTableLikeCommand(
   CatalogTableType.EXTERNAL
 }
 
+val newProperties = sourceTableDesc.tableType match {
+  case VIEW =>
+// For view, we just use new properties
+properties

Review comment:
   Keep view behavior as before. Hive also does not copy view properties.








[GitHub] [spark] AmplabJenkins commented on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28647:
URL: https://github.com/apache/spark/pull/28647#issuecomment-635814973










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28647:
URL: https://github.com/apache/spark/pull/28647#issuecomment-635814973










[GitHub] [spark] SparkQA commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-29 Thread GitBox


SparkQA commented on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-635816516


   **[Test build #123266 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123266/testReport)**
 for PR 28593 at commit 
[`2e55be3`](https://github.com/apache/spark/commit/2e55be370685530ac98ee0aa9c9b4ab9c5b9ab96).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-635816600


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-29 Thread GitBox


SparkQA removed a comment on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-635798742


   **[Test build #123266 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123266/testReport)**
 for PR 28593 at commit 
[`2e55be3`](https://github.com/apache/spark/commit/2e55be370685530ac98ee0aa9c9b4ab9c5b9ab96).






[GitHub] [spark] AmplabJenkins commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-635816600










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-635816609


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123266/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28672: [SPARK-31866][SQL][DOCS] Add COALESCE/REPARTITION/REPARTITION_BY_RANGE Hints to SQL Reference

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28672:
URL: https://github.com/apache/spark/pull/28672#issuecomment-635818029










[GitHub] [spark] SparkQA commented on pull request #28672: [SPARK-31866][SQL][DOCS] Add COALESCE/REPARTITION/REPARTITION_BY_RANGE Hints to SQL Reference

2020-05-29 Thread GitBox


SparkQA commented on pull request #28672:
URL: https://github.com/apache/spark/pull/28672#issuecomment-635817841


   **[Test build #123271 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123271/testReport)**
 for PR 28672 at commit 
[`60fdb93`](https://github.com/apache/spark/commit/60fdb93800a2843268ed21501ee7f26950949411).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins commented on pull request #28672: [SPARK-31866][SQL][DOCS] Add COALESCE/REPARTITION/REPARTITION_BY_RANGE Hints to SQL Reference

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28672:
URL: https://github.com/apache/spark/pull/28672#issuecomment-635818029










[GitHub] [spark] SparkQA removed a comment on pull request #28672: [SPARK-31866][SQL][DOCS] Add COALESCE/REPARTITION/REPARTITION_BY_RANGE Hints to SQL Reference

2020-05-29 Thread GitBox


SparkQA removed a comment on pull request #28672:
URL: https://github.com/apache/spark/pull/28672#issuecomment-635811171


   **[Test build #123271 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123271/testReport)**
 for PR 28672 at commit 
[`60fdb93`](https://github.com/apache/spark/commit/60fdb93800a2843268ed21501ee7f26950949411).






[GitHub] [spark] maropu commented on a change in pull request #28672: [SPARK-31866][SQL][DOCS] Add COALESCE/REPARTITION/REPARTITION_BY_RANGE Hints to SQL Reference

2020-05-29 Thread GitBox


maropu commented on a change in pull request #28672:
URL: https://github.com/apache/spark/pull/28672#discussion_r432308677



##
File path: docs/sql-ref-syntax-qry-select-hints.md
##
@@ -21,14 +21,69 @@ license: |
 
 ### Description
 
-Join Hints allow users to suggest the join strategy that Spark should use. 
Prior to Spark 3.0, only the `BROADCAST` Join Hint was supported. `MERGE`, 
`SHUFFLE_HASH` and `SHUFFLE_REPLICATE_NL` Joint Hints support was added in 3.0. 
When different join strategy hints are specified on both sides of a join, Spark 
prioritizes hints in the following order: `BROADCAST` over `MERGE` over 
`SHUFFLE_HASH` over `SHUFFLE_REPLICATE_NL`. When both sides are specified with 
the `BROADCAST` hint or the `SHUFFLE_HASH` hint, Spark will pick the build side 
based on the join type and the sizes of the relations. Since a given strategy 
may not support all join types, Spark is not guaranteed to use the join 
strategy suggested by the hint.
+Hints give users a way to suggest how Spark SQL to use specific approaches to 
generate its execution plan.
 
 ### Syntax
 
 ```sql
-/*+ join_hint [ , ... ] */
+/*+ hint [ , ... ] */
 ```
 
+### Partitioning Hints
+
+`COALESCE`/`REPARTITION`/`REPARTITION_BY_RANGE` hints have functionalities 
equivalent to those of the
+`Dataset` `coalesce`/`repartition`/`repartitionByRange` APIs. The `COALESCE` 
hint can be used to reduce
+the number of partitions to the specified number of partitions. The 
`REPARTITION`/`REPARTITION_BY_RANGE`
+hint can be used to repartition to the specified number of partitions using 
the specified partitioning expressions.
+The `COALESCE` hint takes a partition number as a
+parameter. The `REPARTITION` hint takes a partition number, column names, or 
both as parameters.
+The `REPARTITION_BY_RANGE` hint takes column names and an optional partition 
number as parameters.
+These hints give users a way to tune performance and control the number of 
output files in Spark SQL.
+
+### Examples
+```sql
+SELECT /*+ COALESCE(3) */ * FROM t;
+
+EXPLAIN SELECT /*+ COALESCE(3) */ * FROM t;
+== Physical Plan ==
+Coalesce 3
++- *(1) ColumnarToRow
+   +- FileScan parquet default.t[name#5,c#6] Batched: true, DataFilters: [], 
Format: Parquet,
+  Location: CatalogFileIndex[file:/spark/spark-warehouse/t], 
PartitionFilters: [],
+  PushedFilters: [], ReadSchema: struct
+
+SELECT /*+ REPARTITION(3) */ * FROM t;
+
+SELECT /*+ REPARTITION(c) */ * FROM t;
+
+SELECT /*+ REPARTITION(3, c) */ * FROM t;
+
+EXPLAIN SELECT /*+ REPARTITION(3, c) */ * FROM t;
+== Physical Plan ==
+Exchange hashpartitioning(c#6, 3), false, [id=#148]
++- *(1) ColumnarToRow
+   +- FileScan parquet default.t[name#5,c#6] Batched: true, DataFilters: [], 
Format: Parquet,
+  Location: CatalogFileIndex[file:/spark/spark-warehouse/t], 
PartitionFilters: [],
+  PushedFilters: [], ReadSchema: struct
+
+SELECT /*+ REPARTITION_BY_RANGE(c) */ * FROM t;
+
+SELECT /*+ REPARTITION_BY_RANGE(3, c) */ * FROM t;
+
+EXPLAIN SELECT /*+ REPARTITION_BY_RANGE(3, c) */ * FROM t;
+== Physical Plan ==
+Exchange rangepartitioning(c#6 ASC NULLS FIRST, 3), false, [id=#167]
++- *(1) ColumnarToRow
+   +- FileScan parquet default.t[name#5,c#6] Batched: true, DataFilters: [], 
Format: Parquet,
+  Location: CatalogFileIndex[file:/spark/spark-warehouse/t], 
PartitionFilters: [],
+  PushedFilters: [], ReadSchema: struct
+```
+
+
+### Join Hints
+
+Join Hints allow users to suggest the join strategy that Spark should use. 
Prior to Spark 3.0, only the `BROADCAST` Join Hint was supported. `MERGE`, 
`SHUFFLE_HASH` and `SHUFFLE_REPLICATE_NL` Joint Hints support was added in 3.0. 
When different join strategy hints are specified on both sides of a join, Spark 
prioritizes hints in the following order: `BROADCAST` over `MERGE` over 
`SHUFFLE_HASH` over `SHUFFLE_REPLICATE_NL`. When both sides are specified with 
the `BROADCAST` hint or the `SHUFFLE_HASH` hint, Spark will pick the build side 
based on the join type and the sizes of the relations. Since a given strategy 
may not support all join types, Spark is not guaranteed to use the join 
strategy suggested by the hint.

Review comment:
   `Hints` -> `hints`?
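
   As a side note on the Join Hints paragraph quoted above, the stated priority order can be sketched as a small function. This is a rough illustration only: `preferred_join_hint` is a hypothetical name, not a Spark API, and it models nothing beyond the ordering described in the doc text.

```python
# Priority order as stated in the doc text: earlier entries win.
JOIN_HINT_PRIORITY = ["BROADCAST", "MERGE", "SHUFFLE_HASH", "SHUFFLE_REPLICATE_NL"]


def preferred_join_hint(left_hint, right_hint):
    """Return the higher-priority hint of the two sides, or None if neither is set.

    An unrecognized hint name raises ValueError from list.index.
    """
    candidates = [h for h in (left_hint, right_hint) if h is not None]
    if not candidates:
        return None
    # The hint with the smallest index in the priority list is preferred.
    return min(candidates, key=JOIN_HINT_PRIORITY.index)
```

   Per the same paragraph, the preferred hint is still only a suggestion, because a given strategy may not support every join type.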


[GitHub] [spark] maropu commented on a change in pull request #28672: [SPARK-31866][SQL][DOCS] Add COALESCE/REPARTITION/REPARTITION_BY_RANGE Hints to SQL Reference

2020-05-29 Thread GitBox


maropu commented on a change in pull request #28672:
URL: https://github.com/apache/spark/pull/28672#discussion_r432314600



##
File path: docs/sql-ref-syntax-qry-select-hints.md
##
@@ -21,14 +21,69 @@ license: |
 
 ### Description
 
-Join Hints allow users to suggest the join strategy that Spark should use. 
Prior to Spark 3.0, only the `BROADCAST` Join Hint was supported. `MERGE`, 
`SHUFFLE_HASH` and `SHUFFLE_REPLICATE_NL` Joint Hints support was added in 3.0. 
When different join strategy hints are specified on both sides of a join, Spark 
prioritizes hints in the following order: `BROADCAST` over `MERGE` over 
`SHUFFLE_HASH` over `SHUFFLE_REPLICATE_NL`. When both sides are specified with 
the `BROADCAST` hint or the `SHUFFLE_HASH` hint, Spark will pick the build side 
based on the join type and the sizes of the relations. Since a given strategy 
may not support all join types, Spark is not guaranteed to use the join 
strategy suggested by the hint.
+Hints give users a way to suggest how Spark SQL to use specific approaches to 
generate its execution plan.
 
 ### Syntax
 
 ```sql
-/*+ join_hint [ , ... ] */
+/*+ hint [ , ... ] */
 ```
 
+### Partitioning Hints
+
+`COALESCE`/`REPARTITION`/`REPARTITION_BY_RANGE` hints have functionalities 
equivalent to those of the

Review comment:
   Also, could you add links to the Dataset APIs if we describe them here? 
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Dataset








[GitHub] [spark] AmplabJenkins commented on pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28645:
URL: https://github.com/apache/spark/pull/28645#issuecomment-635820750










[GitHub] [spark] maropu commented on a change in pull request #28672: [SPARK-31866][SQL][DOCS] Add COALESCE/REPARTITION/REPARTITION_BY_RANGE Hints to SQL Reference

2020-05-29 Thread GitBox


maropu commented on a change in pull request #28672:
URL: https://github.com/apache/spark/pull/28672#discussion_r432315161



##
File path: docs/sql-ref-syntax-qry-select-hints.md
##
@@ -21,14 +21,69 @@ license: |
 
 ### Description
 
-Join Hints allow users to suggest the join strategy that Spark should use. 
Prior to Spark 3.0, only the `BROADCAST` Join Hint was supported. `MERGE`, 
`SHUFFLE_HASH` and `SHUFFLE_REPLICATE_NL` Joint Hints support was added in 3.0. 
When different join strategy hints are specified on both sides of a join, Spark 
prioritizes hints in the following order: `BROADCAST` over `MERGE` over 
`SHUFFLE_HASH` over `SHUFFLE_REPLICATE_NL`. When both sides are specified with 
the `BROADCAST` hint or the `SHUFFLE_HASH` hint, Spark will pick the build side 
based on the join type and the sizes of the relations. Since a given strategy 
may not support all join types, Spark is not guaranteed to use the join 
strategy suggested by the hint.
+Hints give users a way to suggest how Spark SQL to use specific approaches to 
generate its execution plan.
 
 ### Syntax
 
 ```sql
-/*+ join_hint [ , ... ] */
+/*+ hint [ , ... ] */
 ```
 
+### Partitioning Hints
+
+`COALESCE`/`REPARTITION`/`REPARTITION_BY_RANGE` hints have functionalities 
equivalent to those of the
+`Dataset` `coalesce`/`repartition`/`repartitionByRange` APIs. The `COALESCE` 
hint can be used to reduce
+the number of partitions to the specified number of partitions. The 
`REPARTITION`/`REPARTITION_BY_RANGE`
+hint can be used to repartition to the specified number of partitions using 
the specified partitioning expressions.
+The `COALESCE` hint takes a partition number as a
+parameter. The `REPARTITION` hint takes a partition number, column names, or 
both as parameters.
+The `REPARTITION_BY_RANGE` hint takes column names and an optional partition 
number as parameters.
+These hints give users a way to tune performance and control the number of 
output files in Spark SQL.
+
+### Examples
+```sql
+SELECT /*+ COALESCE(3) */ * FROM t;
+
+EXPLAIN SELECT /*+ COALESCE(3) */ * FROM t;
+== Physical Plan ==
+Coalesce 3
++- *(1) ColumnarToRow
+   +- FileScan parquet default.t[name#5,c#6] Batched: true, DataFilters: [], 
Format: Parquet,
+  Location: CatalogFileIndex[file:/spark/spark-warehouse/t], 
PartitionFilters: [],
+  PushedFilters: [], ReadSchema: struct
+
+SELECT /*+ REPARTITION(3) */ * FROM t;

Review comment:
   One more comment; probably, the join hint section should have the same 
format for the examples.
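
   To summarize the parameter rules from the Partitioning Hints text in one place, here is a minimal sketch that builds the hint comment and rejects combinations the text does not describe. `partitioning_hint` is a hypothetical helper for illustration, not part of Spark; it only formats a string and does not talk to Spark at all.

```python
def partitioning_hint(name, num_partitions=None, columns=()):
    """Build a '/*+ NAME(args) */' hint string following the documented rules."""
    name = name.upper()
    columns = list(columns)
    if name == "COALESCE":
        # COALESCE takes a partition number and nothing else.
        if num_partitions is None or columns:
            raise ValueError("COALESCE takes exactly a partition number")
    elif name == "REPARTITION":
        # REPARTITION takes a partition number, column names, or both.
        if num_partitions is None and not columns:
            raise ValueError("REPARTITION needs a partition number or columns")
    elif name == "REPARTITION_BY_RANGE":
        # REPARTITION_BY_RANGE takes column names and an optional partition number.
        if not columns:
            raise ValueError("REPARTITION_BY_RANGE needs at least one column")
    else:
        raise ValueError("unknown partitioning hint: %s" % name)
    # The partition number, when present, comes first, as in the doc examples.
    args = ([str(num_partitions)] if num_partitions is not None else []) + columns
    return "/*+ %s(%s) */" % (name, ", ".join(args))
```

   For example, `"SELECT %s * FROM t" % partitioning_hint("REPARTITION", 3, ["c"])` yields the same statement shape as the `REPARTITION(3, c)` example in the diff.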








[GitHub] [spark] SparkQA commented on pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF

2020-05-29 Thread GitBox


SparkQA commented on pull request #28645:
URL: https://github.com/apache/spark/pull/28645#issuecomment-635820656


   **[Test build #123267 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123267/testReport)**
 for PR 28645 at commit 
[`8576d28`](https://github.com/apache/spark/commit/8576d283cca15366aa86e01a7826345865308fd5).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF

2020-05-29 Thread GitBox


SparkQA removed a comment on pull request #28645:
URL: https://github.com/apache/spark/pull/28645#issuecomment-635804842


   **[Test build #123267 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123267/testReport)**
 for PR 28645 at commit 
[`8576d28`](https://github.com/apache/spark/commit/8576d283cca15366aa86e01a7826345865308fd5).






[GitHub] [spark] SparkQA commented on pull request #28656: [SPARK-31837][CORE] Shift to the new highest locality level if there is when recomputeLocality

2020-05-29 Thread GitBox


SparkQA commented on pull request #28656:
URL: https://github.com/apache/spark/pull/28656#issuecomment-635821157


   **[Test build #123273 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123273/testReport)**
 for PR 28656 at commit 
[`8744f1e`](https://github.com/apache/spark/commit/8744f1ec02db28975a80410b5a01d2c2557ad216).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28645:
URL: https://github.com/apache/spark/pull/28645#issuecomment-635820750


   Merged build finished. Test FAILed.






[GitHub] [spark] JkSelf commented on a change in pull request #28669: [SPARK-31864][SQL] Adjust AQE skew join trigger condition

2020-05-29 Thread GitBox


JkSelf commented on a change in pull request #28669:
URL: https://github.com/apache/spark/pull/28669#discussion_r432316209



##
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/OptimizeSkewedJoin.scala
##
@@ -164,15 +164,15 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends 
Rule[SparkPlan] {
   assert(left.partitionsWithSizes.length == 
right.partitionsWithSizes.length)
   val numPartitions = left.partitionsWithSizes.length
   // We use the median size of the original shuffle partitions to detect 
skewed partitions.

Review comment:
   We may also need to adjust the comment here.








[GitHub] [spark] AmplabJenkins commented on pull request #28656: [SPARK-31837][CORE] Shift to the new highest locality level if there is when recomputeLocality

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28656:
URL: https://github.com/apache/spark/pull/28656#issuecomment-635821738










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28645: [SPARK-31826][SQL] Support composed type of case class for typed Scala UDF

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28645:
URL: https://github.com/apache/spark/pull/28645#issuecomment-635820758


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/123267/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28656: [SPARK-31837][CORE] Shift to the new highest locality level if there is when recomputeLocality

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28656:
URL: https://github.com/apache/spark/pull/28656#issuecomment-635821738










[GitHub] [spark] JkSelf commented on pull request #28669: [SPARK-31864][SQL] Adjust AQE skew join trigger condition

2020-05-29 Thread GitBox


JkSelf commented on pull request #28669:
URL: https://github.com/apache/spark/pull/28669#issuecomment-635823877


   Good improvement. LGTM, except for one small comment.






[GitHub] [spark] Ngone51 commented on a change in pull request #28656: [SPARK-31837][CORE] Shift to the new highest locality level if there is when recomputeLocality

2020-05-29 Thread GitBox


Ngone51 commented on a change in pull request #28656:
URL: https://github.com/apache/spark/pull/28656#discussion_r432318441



##
File path: core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala
##
@@ -1107,10 +1107,19 @@ private[spark] class TaskSetManager(
   def recomputeLocality(): Unit = {
 // A zombie TaskSetManager may reach here while executorLost happens
 if (isZombie) return
+val previousLocalityIndex = currentLocalityIndex
 val previousLocalityLevel = myLocalityLevels(currentLocalityIndex)
+val previousMyLocalityLevels = myLocalityLevels
 myLocalityLevels = computeValidLocalityLevels()
 localityWaits = myLocalityLevels.map(getLocalityWait)
 currentLocalityIndex = getLocalityIndex(previousLocalityLevel)
+if (currentLocalityIndex > previousLocalityIndex) {
+  // SPARK-31837: If the new level is more local, shift to the new most 
local locality
+  // level in terms of better data locality. For example, say the previous 
locality
+  // levels are [PROCESS, NODE, ANY] and current level is ANY. After 
recompute, the
+  // locality levels are [PROCESS, NODE, RACK, ANY]. Then, we'll shift to 
RACK level.
+  currentLocalityIndex = 
getLocalityIndex(myLocalityLevels.diff(previousMyLocalityLevels).head)

Review comment:
   Hi all, there's a defect in the previous implementation (which always reset 
`currentLocalityIndex` to 0). Consider this case: we have locality 
levels [PROCESS, NODE, ANY] and the current locality level is ANY. After recompute, 
we might have locality levels [PROCESS, NODE, RACK, ANY]. In this case, I think 
we'd better shift to the RACK level instead of the PROCESS level, since the 
TaskSetManager has already been delayed for a while on the known levels (PROCESS, 
NODE). With this update, I think it could also ease our concern about the 
possible perf regression introduced by aggressive locality level resetting. 
@bmarcott @tgravescs @cloud-fan 
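
As a rough illustration of the shift described above, here is a minimal Python sketch. It is not the TaskSetManager code: `shift_locality_index` and the plain string levels are hypothetical stand-ins, and it assumes the previous level is still present after the recompute. Following the quoted diff, the previous level is looked up in the recomputed list and, if its index moved to the right, the first newly added level is chosen instead.

```python
def shift_locality_index(previous_levels, current_index, new_levels):
    """Return the locality index to use after the levels are recomputed.

    Simplified stand-in for the logic in the quoted diff: levels are plain
    strings ordered from most to least local, and the previous level is
    assumed to still be present in new_levels.
    """
    previous_level = previous_levels[current_index]
    new_index = new_levels.index(previous_level)
    if new_index > current_index:
        # Levels were inserted before the previous level; move to the first
        # newly added level rather than back to index 0.
        added = [lvl for lvl in new_levels if lvl not in previous_levels]
        new_index = new_levels.index(added[0])
    return new_index


old = ["PROCESS", "NODE", "ANY"]
new = ["PROCESS", "NODE", "RACK", "ANY"]
print(new[shift_locality_index(old, 2, new)])  # RACK
```

When the recomputed levels are unchanged, the function returns the same index, so nothing shifts.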








[GitHub] [spark] SparkQA commented on pull request #28661: [SPARK-31849][PYTHON][SQL] Make PySpark SQL exceptions more Pythonic

2020-05-29 Thread GitBox


SparkQA commented on pull request #28661:
URL: https://github.com/apache/spark/pull/28661#issuecomment-635824757


   **[Test build #123274 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123274/testReport)**
 for PR 28661 at commit 
[`28c2a51`](https://github.com/apache/spark/commit/28c2a516eec5ffbd527b9e62869b279a162458e4).






[GitHub] [spark] AmplabJenkins commented on pull request #28661: [SPARK-31849][PYTHON][SQL] Make PySpark SQL exceptions more Pythonic

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28661:
URL: https://github.com/apache/spark/pull/28661#issuecomment-635825202










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28661: [SPARK-31849][PYTHON][SQL] Make PySpark SQL exceptions more Pythonic

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28661:
URL: https://github.com/apache/spark/pull/28661#issuecomment-635825202










[GitHub] [spark] HyukjinKwon commented on a change in pull request #28648: [SPARK-31788][CORE][DSTREAM][PYTHON] Recover the support of union for different types of RDD and DStreams

2020-05-29 Thread GitBox


HyukjinKwon commented on a change in pull request #28648:
URL: https://github.com/apache/spark/pull/28648#discussion_r432320301



##
File path: python/pyspark/context.py
##
@@ -864,8 +865,21 @@ def union(self, rdds):
 first_jrdd_deserializer = rdds[0]._jrdd_deserializer
 if any(x._jrdd_deserializer != first_jrdd_deserializer for x in rdds):
 rdds = [x._reserialize() for x in rdds]
-cls = SparkContext._jvm.org.apache.spark.api.java.JavaRDD
-jrdds = SparkContext._gateway.new_array(cls, len(rdds))
+gw = SparkContext._gateway
+jvm = SparkContext._jvm
+jrdd_cls = jvm.org.apache.spark.api.java.JavaRDD
+jpair_rdd_cls = jvm.org.apache.spark.api.java.JavaPairRDD
+jdouble_rdd_cls = jvm.org.apache.spark.api.java.JavaDoubleRDD
+if is_instance_of(gw, rdds[0]._jrdd, jrdd_cls):
+cls = jrdd_cls
+elif is_instance_of(gw, rdds[0]._jrdd, jpair_rdd_cls):
+cls = jpair_rdd_cls
+elif is_instance_of(gw, rdds[0]._jrdd, jdouble_rdd_cls):
+cls = jdouble_rdd_cls
+else:
+cls_name = rdds[0]._jrdd.getClass().getCanonicalName()
+raise TypeError("Unsupported Java DStream class %s" % cls_name)
+jrdds = gw.new_array(cls, len(rdds))

Review comment:
   Thanks for pointing this out.








[GitHub] [spark] SparkQA commented on pull request #28648: [SPARK-31788][CORE][DSTREAM][PYTHON] Recover the support of union for different types of RDD and DStreams

2020-05-29 Thread GitBox


SparkQA commented on pull request #28648:
URL: https://github.com/apache/spark/pull/28648#issuecomment-635832203


   **[Test build #123275 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123275/testReport)**
 for PR 28648 at commit 
[`a7b2af4`](https://github.com/apache/spark/commit/a7b2af405e6050372930c41181ccb974210f262f).






[GitHub] [spark] AmplabJenkins commented on pull request #28648: [SPARK-31788][CORE][DSTREAM][PYTHON] Recover the support of union for different types of RDD and DStreams

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28648:
URL: https://github.com/apache/spark/pull/28648#issuecomment-635833750










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28648: [SPARK-31788][CORE][DSTREAM][PYTHON] Recover the support of union for different types of RDD and DStreams

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28648:
URL: https://github.com/apache/spark/pull/28648#issuecomment-635833750










[GitHub] [spark] maropu commented on a change in pull request #18323: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET

2020-05-29 Thread GitBox


maropu commented on a change in pull request #18323:
URL: https://github.com/apache/spark/pull/18323#discussion_r432325959



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
##
@@ -1319,3 +1319,123 @@ case class BRound(child: Expression, scale: Expression)
 with Serializable with ImplicitCastInputTypes {
   def this(child: Expression) = this(child, Literal(0))
 }
+
+/**
+ *  Returns the bucket number into which
+ *  the value of this expression would fall after being evaluated.
+ *
+ * @param expr is the expression for which the histogram is being created
+ * @param minValue is an expression that resolves
+ * to the minimum end point of the acceptable range for expr
+ * @param maxValue is an expression that resolves
+ * to the maximum end point of the acceptable range for expr
+ * @param numBucket is an expression that resolves to
+ *  a constant indicating the number of buckets
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(expr, min_value, max_value, num_bucket) - Returns the 
`bucket` to which operand would be assigned in an equidepth histogram with 
`num_bucket` buckets, in the range `min_value` to `max_value`.",
+  extended = """
+Examples:
+  > SELECT _FUNC_(5.35, 0.024, 10.06, 5);
+   3
+  """)
+// scalastyle:on line.size.limit
+case class WidthBucket(
+  expr: Expression,
+  minValue: Expression,
+  maxValue: Expression,
+  numBucket: Expression) extends Expression with ImplicitCastInputTypes {

Review comment:
   Could we do `WidthBucket (...) extends QuaternaryExpression with 
ImplicitCastInputTypes with CodegenFallback` for simplicity? If we need the 
codegen support, I think it's okay to do so in a follow-up.








[GitHub] [spark] maropu commented on a change in pull request #18323: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET

2020-05-29 Thread GitBox


maropu commented on a change in pull request #18323:
URL: https://github.com/apache/spark/pull/18323#discussion_r432326338



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
##
@@ -1319,3 +1319,123 @@ case class BRound(child: Expression, scale: Expression)
 with Serializable with ImplicitCastInputTypes {
   def this(child: Expression) = this(child, Literal(0))
 }
+
+/**
+ *  Returns the bucket number into which
+ *  the value of this expression would fall after being evaluated.

Review comment:
   nit format:
   ```
   /**
* Returns the bucket number into which the value of this expression would 
fall
* after being evaluated.
*
   ```








[GitHub] [spark] maropu commented on a change in pull request #18323: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET

2020-05-29 Thread GitBox


maropu commented on a change in pull request #18323:
URL: https://github.com/apache/spark/pull/18323#discussion_r432326868



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
##
@@ -1319,3 +1319,123 @@ case class BRound(child: Expression, scale: Expression)
 with Serializable with ImplicitCastInputTypes {
   def this(child: Expression) = this(child, Literal(0))
 }
+
+/**
+ *  Returns the bucket number into which
+ *  the value of this expression would fall after being evaluated.
+ *
+ * @param expr is the expression for which the histogram is being created
+ * @param minValue is an expression that resolves
+ * to the minimum end point of the acceptable range for expr
+ * @param maxValue is an expression that resolves
+ * to the maximum end point of the acceptable range for expr
+ * @param numBucket is an expression that resolves to
+ *  a constant indicating the number of buckets
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(expr, min_value, max_value, num_bucket) - Returns the 
`bucket` to which operand would be assigned in an equidepth histogram with 
`num_bucket` buckets, in the range `min_value` to `max_value`.",
+  extended = """

Review comment:
   `extended` -> `examples`. Also, please add a `since` tag.








[GitHub] [spark] juliuszsompolski commented on a change in pull request #28671: [SPARK-31859][SPARK-31861][SPARK-31863] Fix Thriftserver session timezone issues

2020-05-29 Thread GitBox


juliuszsompolski commented on a change in pull request #28671:
URL: https://github.com/apache/spark/pull/28671#discussion_r432327826



##
File path: 
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkOperationUtils.scala
##
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.hive.thriftserver
+
+import org.apache.hive.service.cli.session.HiveSession
+
+import org.apache.spark.SparkContext
+import org.apache.spark.sql.{SparkSession, SQLContext}
+import org.apache.spark.sql.catalyst.catalog.CatalogTableType
+import org.apache.spark.sql.catalyst.catalog.CatalogTableType.{EXTERNAL, 
MANAGED, VIEW}
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.util.Utils
+
+/**
+ * Utils for Spark operations.
+ */
+private[hive] trait SparkOperationUtils {

Review comment:
   Nice. Do you think I could roll the close() function like that as well 
in this PR? It's repeated across all operations; only 
SparkExecuteStatementOperation needs a slightly different version.








[GitHub] [spark] maropu commented on a change in pull request #18323: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET

2020-05-29 Thread GitBox


maropu commented on a change in pull request #18323:
URL: https://github.com/apache/spark/pull/18323#discussion_r432329283



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
##
@@ -1319,3 +1319,123 @@ case class BRound(child: Expression, scale: Expression)
 with Serializable with ImplicitCastInputTypes {
   def this(child: Expression) = this(child, Literal(0))
 }
+
+/**
+ *  Returns the bucket number into which
+ *  the value of this expression would fall after being evaluated.
+ *
+ * @param expr is the expression for which the histogram is being created
+ * @param minValue is an expression that resolves
+ * to the minimum end point of the acceptable range for expr
+ * @param maxValue is an expression that resolves
+ * to the maximum end point of the acceptable range for expr
+ * @param numBucket is an expression that resolves to
+ *  a constant indicating the number of buckets
+ */
+// scalastyle:off line.size.limit
+@ExpressionDescription(
+  usage = "_FUNC_(expr, min_value, max_value, num_bucket) - Returns the 
`bucket` to which operand would be assigned in an equidepth histogram with 
`num_bucket` buckets, in the range `min_value` to `max_value`.",
+  extended = """
+Examples:
+  > SELECT _FUNC_(5.35, 0.024, 10.06, 5);
+   3
+  """)
+// scalastyle:on line.size.limit
+case class WidthBucket(
+  expr: Expression,
+  minValue: Expression,
+  maxValue: Expression,
+  numBucket: Expression) extends Expression with ImplicitCastInputTypes {
+
+  override def children: Seq[Expression] = Seq(expr, minValue, maxValue, 
numBucket)
+  override def foldable: Boolean = children.drop(1).forall(_.foldable)
+  override def inputTypes: Seq[AbstractDataType] = Seq(DoubleType, DoubleType, 
DoubleType, LongType)
+  override def dataType: DataType = LongType
+  override def nullable: Boolean = true
+
+  private lazy val _minValue: Any = minValue.eval()
+  private lazy val minValueV = _minValue.asInstanceOf[Double]
+
+  private lazy val _maxValue: Any = maxValue.eval()
+  private lazy val maxValueV = _maxValue.asInstanceOf[Double]
+
+  private lazy val _numBucket: Any = numBucket.eval()
+  private lazy val numBucketV = _numBucket.asInstanceOf[Long]
+
+  private val errMsg = "The argument [%d] of WIDTH_BUCKET function is NULL or 
invalid."
+
+  override def eval(input: InternalRow): Any = {
+
+if (foldable) {
+  if (_minValue == null) {
+throw new RuntimeException(errMsg.format(2))
+  } else if (_maxValue == null) {
+throw new RuntimeException(errMsg.format(3))
+  } else if (_numBucket == null || numBucketV <= 0) {
+throw new RuntimeException(errMsg.format(4))
+  } else {
+val exprV = expr.eval(input)
+if (exprV == null) {
+  null
+} else {
+  MathUtils.widthBucket(exprV.asInstanceOf[Double], minValueV, 
maxValueV, numBucketV)
+}
+  }
+} else {
+  val evals = children.map(_.eval(input))
+  val invalid = evals.zipWithIndex.filter { case (e, i) =>
+(i > 0 && e == null) || (i == 3 && e.asInstanceOf[Long] <= 0)
+  }
+  if (invalid.nonEmpty) {
+invalid.foreach(l => throw new RuntimeException(errMsg.format(l._2 + 
1)))
+  } else if (evals(0) == null) {
+null
+  } else {
+MathUtils.widthBucket(

Review comment:
   Is `MathUtils.widthBucket` used only by this new expr now? If so, could 
we move the method into this expr?
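
For reference, the bucket arithmetic behind the example in the usage string can be sketched in Python. This is not the `MathUtils.widthBucket` implementation, which is not shown in this thread. `width_bucket` below is a hypothetical helper that assumes the usual SQL `WIDTH_BUCKET` convention: equal-width buckets over [min_value, max_value), with 0 and num_bucket + 1 as the underflow and overflow buckets, and min_value < max_value.

```python
import math


def width_bucket(value, min_value, max_value, num_bucket):
    """Equal-width bucket number for value; assumes min_value < max_value."""
    if num_bucket <= 0:
        raise ValueError("num_bucket must be positive")
    if value < min_value:
        return 0  # underflow bucket
    if value >= max_value:
        return num_bucket + 1  # overflow bucket
    width = (max_value - min_value) / num_bucket
    # Clamp to num_bucket to guard against floating-point rounding near max_value.
    return min(math.floor((value - min_value) / width) + 1, num_bucket)


print(width_bucket(5.35, 0.024, 10.06, 5))  # 3
```

The printed value matches the `SELECT _FUNC_(5.35, 0.024, 10.06, 5)` example in the diff: (5.35 - 0.024) / 2.0072 is about 2.65, so the value falls in the third bucket.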








[GitHub] [spark] yaooqinn opened a new pull request #28673: [SPARK-31867][SQL] Fix silent data change for datetime formatting

2020-05-29 Thread GitBox


yaooqinn opened a new pull request #28673:
URL: https://github.com/apache/spark/pull/28673


   ### What changes were proposed in this pull request?
   
   
   The new datetime formatter introduces a silent data change, for example:
   
   ```sql
   spark-sql> select from_unixtime(1, 'yyy-MM-dd');
   NULL
   spark-sql> set spark.sql.legacy.timeParserPolicy=legacy;
   spark.sql.legacy.timeParserPolicy	legacy
   spark-sql> select from_unixtime(1, 'yyy-MM-dd');
   0001970-01-01
   spark-sql>
   ```
   
   For patterns that support `SignStyle.EXCEEDS_PAD`, e.g. `y..y` (len >= 4), 
the `NumberPrinterParser` formats the value as follows:
   
   ```java
   switch (signStyle) {
 case EXCEEDS_PAD:
   if (minWidth < 19 && value >= EXCEED_POINTS[minWidth]) {
 buf.append(decimalStyle.getPositiveSign());
   }
   break;
  
  
   ``` 
   Here `minWidth` == `len(y..y)`, and `EXCEED_POINTS` is:
   
   ```java
   /**
    * Array of 10 to the power of n.
    */
   static final long[] EXCEED_POINTS = new long[] {
       0L,
       10L,
       100L,
       1000L,
       10000L,
       100000L,
       1000000L,
       10000000L,
       100000000L,
       1000000000L,
       10000000000L,
   };
   ```
   
   So when `len(y..y)` is greater than 10, an 
`ArrayIndexOutOfBoundsException` will be raised.
   
   On the caller side, `from_unixtime` suppresses the exception, so a silent 
data change occurs; for `date_format`, the 
`ArrayIndexOutOfBoundsException` propagates to the user.
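
The failure mode can be reproduced in miniature. The Python sketch below only mirrors the array lookup from the snippet above and is not the JDK code: `needs_positive_sign` is a hypothetical name, and the list holds the eleven entries that the Javadoc comment describes as powers of ten. A pattern width of 11 through 18 passes the `minWidth < 19` guard but indexes past the end of the array.

```python
# Eleven entries, indices 0..10, mirroring EXCEED_POINTS in the snippet above.
EXCEED_POINTS = [0] + [10 ** n for n in range(1, 11)]


def needs_positive_sign(min_width, value):
    """Mirror of the EXCEEDS_PAD check; raises IndexError for widths 11..18."""
    return min_width < 19 and value >= EXCEED_POINTS[min_width]


print(needs_positive_sign(4, 1970))   # False: 1970 < 10000
print(needs_positive_sign(4, 12345))  # True: the year exceeds four digits
try:
    needs_positive_sign(11, 1970)
except IndexError as exc:
    # Python's analogue of ArrayIndexOutOfBoundsException
    print("lookup failed:", exc)
```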
   
   In this PR, for `EXCEPTION` mode, we wrap exceptions thrown by the new 
formatter during formatting in a `SparkUpgradeException` so that end users can make 
proper decisions.
   
   ### Why are the changes needed?
   
   Fix the silent data change.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, a `SparkUpgradeException` will take the place of the `null` result when something 
goes wrong during the new formatter's formatting phase.
   
   ### How was this patch tested?
   
   Added unit tests.
   
   `sql/core/src/test/resources/sql-tests/results/datetime-corrected.sql.out` is 
added to cover the different behaviors on the caller side.






[GitHub] [spark] SparkQA commented on pull request #28673: [SPARK-31867][SQL] Fix silent data change for datetime formatting

2020-05-29 Thread GitBox


SparkQA commented on pull request #28673:
URL: https://github.com/apache/spark/pull/28673#issuecomment-635842893


   **[Test build #123276 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123276/testReport)**
 for PR 28673 at commit 
[`fc91f2d`](https://github.com/apache/spark/commit/fc91f2d7a79561471077dabddf8b0b0e028d537f).






[GitHub] [spark] maropu commented on a change in pull request #18323: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET

2020-05-29 Thread GitBox


maropu commented on a change in pull request #18323:
URL: https://github.com/apache/spark/pull/18323#discussion_r432326338



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/mathExpressions.scala
##
@@ -1319,3 +1319,123 @@ case class BRound(child: Expression, scale: Expression)
 with Serializable with ImplicitCastInputTypes {
   def this(child: Expression) = this(child, Literal(0))
 }
+
+/**
+ *  Returns the bucket number into which
+ *  the value of this expression would fall after being evaluated.

Review comment:
   nit: how about the format and the rephrasing below?
   ```
   /**
* Returns the bucket number into which the value of this expression would 
fall
* after being evaluated.
*
* @param expr is the expression to compute a bucket number in the histogram
* @param minValue is the minimum value of the histogram
* @param maxValue is the maximum value of the histogram
* @param numBucket is the number of buckets
*/
   ```








[GitHub] [spark] AmplabJenkins commented on pull request #28673: [SPARK-31867][SQL] Fix silent data change for datetime formatting

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28673:
URL: https://github.com/apache/spark/pull/28673#issuecomment-635843497










[GitHub] [spark] AmplabJenkins removed a comment on pull request #28673: [SPARK-31867][SQL] Fix silent data change for datetime formatting

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28673:
URL: https://github.com/apache/spark/pull/28673#issuecomment-635843497










[GitHub] [spark] yaooqinn commented on pull request #28673: [SPARK-31867][SQL] Fix silent data change for datetime formatting

2020-05-29 Thread GitBox


yaooqinn commented on pull request #28673:
URL: https://github.com/apache/spark/pull/28673#issuecomment-635844180


   cc @cloud-fan @MaxGekk @maropu thanks very much






[GitHub] [spark] maropu commented on pull request #18323: [SPARK-21117][SQL] Built-in SQL Function Support - WIDTH_BUCKET

2020-05-29 Thread GitBox


maropu commented on pull request #18323:
URL: https://github.com/apache/spark/pull/18323#issuecomment-635844301


   Could you resolve the conflict?






[GitHub] [spark] yaooqinn commented on a change in pull request #28673: [SPARK-31867][SQL] Fix silent data change for datetime formatting

2020-05-29 Thread GitBox


yaooqinn commented on a change in pull request #28673:
URL: https://github.com/apache/spark/pull/28673#discussion_r432334147



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DateTimeFormatterHelper.scala
##
@@ -118,6 +118,31 @@ trait DateTimeFormatterHelper {
 s"before Spark 3.0, or set to CORRECTED and treat it as an invalid datetime string.", e)
   }
 
+  // When legacy time parser policy set to EXCEPTION, check whether we will get different results
+  // between legacy formatter and new formatter. If new formatter fails but legacy formatter works,
+  // throw a SparkUpgradeException. On the contrary, if the legacy policy set to CORRECTED,
+  // DateTimeParseException will address by the caller side.
+  protected def checkDiffFormatResult[T <: Date](
+  d: T,
+  legacyFormatFunc: T => String): PartialFunction[Throwable, String] = {
+case e if needConvertToSparkUpgradeException(e) =>
+  val resultCandidate = try {
+  legacyFormatFunc(d)
+} catch {
+  case _: Throwable => throw e
+}
+throw new SparkUpgradeException("3.0", s"Fail to format '$resultCandidate' in the new" +
+  s" formatter. You can set ${SQLConf.LEGACY_TIME_PARSER_POLICY.key} to LEGACY to restore" +
+  s" the behavior before Spark 3.0, or set to CORRECTED and treat it as an invalid" +
+  s" datetime string.", e)
+  }
+
+  private def needConvertToSparkUpgradeException(e: Throwable): Boolean = e match {
+case _: DateTimeException | _: ArrayIndexOutOfBoundsException

Review comment:
   Shall we use `NonFatal` here?
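
For context, here is a minimal sketch of how a match on `scala.util.control.NonFatal` differs from the type-based match quoted in the diff. The object and method names (`NonFatalSketch`, `narrowMatch`, `nonFatalMatch`) are hypothetical and are not part of the PR. The sketch only illustrates the standard-library extractor and does not claim which form the PR should adopt.

```scala
import java.time.DateTimeException
import scala.util.control.NonFatal

// Hypothetical names for illustration only; not code from the PR.
object NonFatalSketch {
  // Type-based match, mirroring the shape of the case in the diff:
  // only the listed exception types qualify.
  def narrowMatch(e: Throwable): Boolean = e match {
    case _: DateTimeException | _: ArrayIndexOutOfBoundsException => true
    case _ => false
  }

  // NonFatal matches any Throwable except fatal ones such as
  // VirtualMachineError (e.g. OutOfMemoryError), InterruptedException,
  // LinkageError and ControlThrowable.
  def nonFatalMatch(e: Throwable): Boolean = e match {
    case NonFatal(_) => true
    case _ => false
  }
}
```

The trade-off is scope. `NonFatal` would also catch unrelated failures such as `IllegalArgumentException`, while the type-based match stays limited to the exceptions it names.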








[GitHub] [spark] SparkQA commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-29 Thread GitBox


SparkQA commented on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-635846348


   **[Test build #123277 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123277/testReport)** for PR 28593 at commit [`7637256`](https://github.com/apache/spark/commit/7637256038c483b35b615bfc824bcb3867d4cc9f).






[GitHub] [spark] AmplabJenkins commented on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-635846861










[GitHub] [spark] Deegue commented on a change in pull request #26422: [SPARK-29786][SQL] Fix MetaException when dropping a partition not exists on HDFS

2020-05-29 Thread GitBox


Deegue commented on a change in pull request #26422:
URL: https://github.com/apache/spark/pull/26422#discussion_r432336314



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
##
@@ -638,6 +644,17 @@ private[hive] class HiveClientImpl(
 s"No partition is dropped. One partition spec '$s' does not exist in table '$table' " +
 s"database '$db'")
 }
+// Check whether the partition we are going to drop is empty.
+// We make a dummy one for the empty partition. See [SPARK-29786] for more details.
+parts.foreach { partition =>
+  val partPath = partition.getPath.head
+  if (isExistPath(partPath)) {
+val fs = partPath.getFileSystem(conf)
+fs.mkdirs(partPath)
+fs.deleteOnExit(partPath)
+  }

Review comment:
   > I'm confused. When the partition exists (`isExistPath` returns true), why do you need to mkdir it again?
   
   Sorry, I mistakenly deleted the `!` when adjusting the code.
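
A minimal sketch of the intended condition, with the `!` in place: the placeholder directory is created only when the partition path is missing. It assumes `java.nio.file` as a stand-in for the Hadoop `FileSystem` calls in the diff, so that it runs on its own. `EmptyPartitionSketch` and `ensurePlaceholder` are hypothetical names, and the PR's `isExistPath` helper is not reproduced.

```scala
import java.nio.file.{Files, Path}

// Hypothetical stand-in for the logic in the diff; uses java.nio.file
// rather than Hadoop's FileSystem so the sketch is self-contained.
object EmptyPartitionSketch {
  // Creates a placeholder directory only when the path does NOT exist.
  // Returns true if a directory was created, false if it was already there.
  def ensurePlaceholder(partPath: Path): Boolean = {
    if (!Files.exists(partPath)) {
      Files.createDirectories(partPath)
      true
    } else {
      false
    }
  }
}
```

Without the negation, the branch would run only for paths that already exist, which is the case the reviewer questioned.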








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28593: [SPARK-31710][SQL] Fail casting numeric to timestamp by default

2020-05-29 Thread GitBox


AmplabJenkins removed a comment on pull request #28593:
URL: https://github.com/apache/spark/pull/28593#issuecomment-635846861










[GitHub] [spark] SparkQA commented on pull request #28648: [SPARK-31788][CORE][DSTREAM][PYTHON] Recover the support of union for different types of RDD and DStreams

2020-05-29 Thread GitBox


SparkQA commented on pull request #28648:
URL: https://github.com/apache/spark/pull/28648#issuecomment-635848051


   **[Test build #123275 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/123275/testReport)** for PR 28648 at commit [`a7b2af4`](https://github.com/apache/spark/commit/a7b2af405e6050372930c41181ccb974210f262f).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `raise TypeError(\"Unsupported Java RDD class %s\" % cls_name)`






[GitHub] [spark] AmplabJenkins commented on pull request #28648: [SPARK-31788][CORE][DSTREAM][PYTHON] Recover the support of union for different types of RDD and DStreams

2020-05-29 Thread GitBox


AmplabJenkins commented on pull request #28648:
URL: https://github.com/apache/spark/pull/28648#issuecomment-635848225









