[GitHub] [spark] MaxGekk opened a new pull request #29690: [SPARK-32810][SQL][TESTS][FOLLOWUP][3.0] Check path globbing in JSON/CSV datasources v1 and v2

2020-09-09 Thread GitBox


MaxGekk opened a new pull request #29690:
URL: https://github.com/apache/spark/pull/29690


   ### What changes were proposed in this pull request?
   In this PR, I propose moving the test `SPARK-32810: CSV and JSON data 
sources should be able to read files with escaped glob metacharacter in the 
paths` from `DataFrameReaderWriterSuite` to `CSVSuite` and `JsonSuite`, so 
that the same test runs in `CSVv1Suite`/`CSVv2Suite` and in 
`JsonV1Suite`/`JsonV2Suite`.
   
   ### Why are the changes needed?
   To improve test coverage by checking JSON/CSV datasources v1 and v2.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   By running affected test suites:
   ```
   $ build/sbt "sql/test:testOnly org.apache.spark.sql.execution.datasources.csv.*"
   $ build/sbt "sql/test:testOnly org.apache.spark.sql.execution.datasources.json.*"
   ```
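
For readers unfamiliar with what an "escaped glob metacharacter" looks like, here is a minimal sketch in plain Scala. It is not taken from the PR: the object `GlobEscape`, the helper `escapeGlob`, and the chosen metacharacter set are hypothetical, and the sketch assumes the backslash-escaping convention used by Hadoop-style path globbing.

```scala
object GlobEscape {
  // Assumed set of glob metacharacters; the real set depends on the file system layer.
  private val GlobMeta: Set[Char] = Set('*', '?', '[', ']', '{', '}', '\\')

  // Prefix each metacharacter with a backslash so the path is matched literally
  // instead of being expanded as a glob pattern.
  def escapeGlob(path: String): String =
    path.flatMap(c => if (GlobMeta(c)) "\\" + c else c.toString)

  def main(args: Array[String]): Unit = {
    // A directory literally named "[abc]" must be escaped before it is passed as a read path.
    println(escapeGlob("/tmp/[abc].json")) // prints /tmp/\[abc\].json
  }
}
```

A test like the one being moved would write data under a directory whose name contains such characters and then read it back through the escaped path, once per datasource version.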



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] MaxGekk commented on pull request #29684: [SPARK-32810][SQL][TESTS][FOLLOWUP] Check path globbing in JSON/CSV datasources v1 and v2

2020-09-09 Thread GitBox


MaxGekk commented on pull request #29684:
URL: https://github.com/apache/spark/pull/29684#issuecomment-689351266


   @HyukjinKwon Here is the PR for 3.0: 
https://github.com/apache/spark/pull/29690






[GitHub] [spark] dbtsai commented on pull request #29565: [SPARK-24994][SQL] Add UnwrapCastInBinaryComparison optimizer to simplify literal types

2020-09-09 Thread GitBox


dbtsai commented on pull request #29565:
URL: https://github.com/apache/spark/pull/29565#issuecomment-689351227


   @sunchao there is a conflict. Can you rebase it? Thanks.






[GitHub] [spark] AmplabJenkins commented on pull request #29690: [SPARK-32810][SQL][TESTS][FOLLOWUP][3.0] Check path globbing in JSON/CSV datasources v1 and v2

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29690:
URL: https://github.com/apache/spark/pull/29690#issuecomment-689351428










[GitHub] [spark] AmplabJenkins commented on pull request #29689: [SPARK-32755][SQL][FOLLOWUP] Ensure `--` method of AttributeSet have same behavior under Scala 2.12 and 2.13

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29689:
URL: https://github.com/apache/spark/pull/29689#issuecomment-689353416










[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox


SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689353373


   **[Test build #128429 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128429/testReport)**
 for PR 29626 at commit 
[`9a44bec`](https://github.com/apache/spark/commit/9a44bec648e9cabddf50675ac6cd5010f9856013).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29689: [SPARK-32755][SQL][FOLLOWUP] Ensure `--` method of AttributeSet have same behavior under Scala 2.12 and 2.13

2020-09-09 Thread GitBox


SparkQA commented on pull request #29689:
URL: https://github.com/apache/spark/pull/29689#issuecomment-689353368


   **[Test build #128436 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128436/testReport)**
 for PR 29689 at commit 
[`a595600`](https://github.com/apache/spark/commit/a595600750170d5ff4e913b29a9a3079abb895af).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29654: [SPARK-32802][SQL] Avoid using SpecificInternalRow in RunLengthEncoding#Encoder

2020-09-09 Thread GitBox


SparkQA commented on pull request #29654:
URL: https://github.com/apache/spark/pull/29654#issuecomment-689353376


   **[Test build #128432 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128432/testReport)**
 for PR 29654 at commit 
[`2e18856`](https://github.com/apache/spark/commit/2e188569be535311819ebb205fa95e0173b02749).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


SparkQA commented on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689353365


   **[Test build #128435 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128435/testReport)**
 for PR 29688 at commit 
[`eb67ccd`](https://github.com/apache/spark/commit/eb67ccd435b13784c1a14595e805b69bff81069d).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29673: [SPARK-32816][SQL] Fix analyzer bug when aggregating multiple distinct DECIMAL columns

2020-09-09 Thread GitBox


SparkQA commented on pull request #29673:
URL: https://github.com/apache/spark/pull/29673#issuecomment-689353366


   **[Test build #128428 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128428/testReport)**
 for PR 29673 at commit 
[`4df4f7c`](https://github.com/apache/spark/commit/4df4f7c8ebc80bcf854fb26764338789ea13b319).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #29654: [SPARK-32802][SQL] Avoid using SpecificInternalRow in RunLengthEncoding#Encoder

2020-09-09 Thread GitBox


SparkQA removed a comment on pull request #29654:
URL: https://github.com/apache/spark/pull/29654#issuecomment-689277246


   **[Test build #128432 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128432/testReport)**
 for PR 29654 at commit 
[`2e18856`](https://github.com/apache/spark/commit/2e188569be535311819ebb205fa95e0173b02749).






[GitHub] [spark] SparkQA removed a comment on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


SparkQA removed a comment on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689298793


   **[Test build #128435 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128435/testReport)**
 for PR 29688 at commit 
[`eb67ccd`](https://github.com/apache/spark/commit/eb67ccd435b13784c1a14595e805b69bff81069d).






[GitHub] [spark] AmplabJenkins commented on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689353879










[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689353930










[GitHub] [spark] AmplabJenkins commented on pull request #29654: [SPARK-32802][SQL] Avoid using SpecificInternalRow in RunLengthEncoding#Encoder

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29654:
URL: https://github.com/apache/spark/pull/29654#issuecomment-689353708










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29689: [SPARK-32755][SQL][FOLLOWUP] Ensure `--` method of AttributeSet have same behavior under Scala 2.12 and 2.13

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29689:
URL: https://github.com/apache/spark/pull/29689#issuecomment-689353416


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox


SparkQA removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689266524


   **[Test build #128429 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128429/testReport)**
 for PR 29626 at commit 
[`9a44bec`](https://github.com/apache/spark/commit/9a44bec648e9cabddf50675ac6cd5010f9856013).






[GitHub] [spark] SparkQA removed a comment on pull request #29689: [SPARK-32755][SQL][FOLLOWUP] Ensure `--` method of AttributeSet have same behavior under Scala 2.12 and 2.13

2020-09-09 Thread GitBox


SparkQA removed a comment on pull request #29689:
URL: https://github.com/apache/spark/pull/29689#issuecomment-689339622


   **[Test build #128436 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128436/testReport)**
 for PR 29689 at commit 
[`a595600`](https://github.com/apache/spark/commit/a595600750170d5ff4e913b29a9a3079abb895af).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29654: [SPARK-32802][SQL] Avoid using SpecificInternalRow in RunLengthEncoding#Encoder

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29654:
URL: https://github.com/apache/spark/pull/29654#issuecomment-689353708


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689353879


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #29673: [SPARK-32816][SQL] Fix analyzer bug when aggregating multiple distinct DECIMAL columns

2020-09-09 Thread GitBox


SparkQA removed a comment on pull request #29673:
URL: https://github.com/apache/spark/pull/29673#issuecomment-689263764


   **[Test build #128428 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128428/testReport)**
 for PR 29673 at commit 
[`4df4f7c`](https://github.com/apache/spark/commit/4df4f7c8ebc80bcf854fb26764338789ea13b319).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29673: [SPARK-32816][SQL] Fix analyzer bug when aggregating multiple distinct DECIMAL columns

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29673:
URL: https://github.com/apache/spark/pull/29673#issuecomment-689354327


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29689: [SPARK-32755][SQL][FOLLOWUP] Ensure `--` method of AttributeSet have same behavior under Scala 2.12 and 2.13

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29689:
URL: https://github.com/apache/spark/pull/29689#issuecomment-689353423


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128436/
   Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29673: [SPARK-32816][SQL] Fix analyzer bug when aggregating multiple distinct DECIMAL columns

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29673:
URL: https://github.com/apache/spark/pull/29673#issuecomment-689354327










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689353887


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128435/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689353930


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29654: [SPARK-32802][SQL] Avoid using SpecificInternalRow in RunLengthEncoding#Encoder

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29654:
URL: https://github.com/apache/spark/pull/29654#issuecomment-689353719


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128432/
   Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #29690: [SPARK-32810][SQL][TESTS][FOLLOWUP][3.0] Check path globbing in JSON/CSV datasources v1 and v2

2020-09-09 Thread GitBox


SparkQA commented on pull request #29690:
URL: https://github.com/apache/spark/pull/29690#issuecomment-689354697


   **[Test build #128437 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128437/testReport)**
 for PR 29690 at commit 
[`633e05e`](https://github.com/apache/spark/commit/633e05ee7aa775cc3ba17e0d1bb8c33429e28a9c).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29690: [SPARK-32810][SQL][TESTS][FOLLOWUP][3.0] Check path globbing in JSON/CSV datasources v1 and v2

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29690:
URL: https://github.com/apache/spark/pull/29690#issuecomment-689351428










[GitHub] [spark] ulysses-you commented on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


ulysses-you commented on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689355209


   @maropu @dongjoon-hyun @cloud-fan do you have time to look at this? Thanks!






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29673: [SPARK-32816][SQL] Fix analyzer bug when aggregating multiple distinct DECIMAL columns

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29673:
URL: https://github.com/apache/spark/pull/29673#issuecomment-689354343


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128428/
   Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689353934


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128429/
   Test FAILed.






[GitHub] [spark] beliefer commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox


beliefer commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689355143


   retest this please






[GitHub] [spark] AmplabJenkins commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689355568










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689355568










[GitHub] [spark] SparkQA commented on pull request #29626: [SPARK-32777][SQL] Aggregation support aggregate function with multiple foldable expressions.

2020-09-09 Thread GitBox


SparkQA commented on pull request #29626:
URL: https://github.com/apache/spark/pull/29626#issuecomment-689358829


   **[Test build #128438 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128438/testReport)**
 for PR 29626 at commit 
[`9a44bec`](https://github.com/apache/spark/commit/9a44bec648e9cabddf50675ac6cd5010f9856013).






[GitHub] [spark] LuciferYang commented on a change in pull request #29689: [SPARK-32755][SQL][FOLLOWUP] Ensure `--` method of AttributeSet have same behavior under Scala 2.12 and 2.13

2020-09-09 Thread GitBox


LuciferYang commented on a change in pull request #29689:
URL: https://github.com/apache/spark/pull/29689#discussion_r485390154



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AttributeSet.scala
##
@@ -105,10 +105,12 @@ class AttributeSet private (private val baseSet: mutable.LinkedHashSet[Attribute
    */
   def --(other: Iterable[NamedExpression]): AttributeSet = {
     other match {
+      // SPARK-32755: `--` method behave differently under scala 2.12 and 2.13,
+      // use a Scala 2.12 based code to maintains the insertion order in Scala 2.13
       case otherSet: AttributeSet =>
-        new AttributeSet(baseSet -- otherSet.baseSet)
+        new AttributeSet(baseSet.clone() --= otherSet.baseSet)

Review comment:
   This conversion seems inevitable if we use `diff` :(








[GitHub] [spark] hvanhovell commented on a change in pull request #29689: [SPARK-32755][SQL][FOLLOWUP] Ensure `--` method of AttributeSet have same behavior under Scala 2.12 and 2.13

2020-09-09 Thread GitBox


hvanhovell commented on a change in pull request #29689:
URL: https://github.com/apache/spark/pull/29689#discussion_r485390844



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AttributeSet.scala
##
@@ -105,10 +105,12 @@ class AttributeSet private (private val baseSet: mutable.LinkedHashSet[Attribute
    */
   def --(other: Iterable[NamedExpression]): AttributeSet = {
     other match {
+      // SPARK-32755: `--` method behave differently under scala 2.12 and 2.13,
+      // use a Scala 2.12 based code to maintains the insertion order in Scala 2.13
       case otherSet: AttributeSet =>
-        new AttributeSet(baseSet -- otherSet.baseSet)
+        new AttributeSet(baseSet.clone() --= otherSet.baseSet)

Review comment:
   Let's use the current method then :).








[GitHub] [spark] LuciferYang commented on a change in pull request #29689: [SPARK-32755][SQL][FOLLOWUP] Ensure `--` method of AttributeSet have same behavior under Scala 2.12 and 2.13

2020-09-09 Thread GitBox


LuciferYang commented on a change in pull request #29689:
URL: https://github.com/apache/spark/pull/29689#discussion_r485393116



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/AttributeSet.scala
##
@@ -105,10 +105,12 @@ class AttributeSet private (private val baseSet: mutable.LinkedHashSet[Attribute
    */
   def --(other: Iterable[NamedExpression]): AttributeSet = {
     other match {
+      // SPARK-32755: `--` method behave differently under scala 2.12 and 2.13,
+      // use a Scala 2.12 based code to maintains the insertion order in Scala 2.13
       case otherSet: AttributeSet =>
-        new AttributeSet(baseSet -- otherSet.baseSet)
+        new AttributeSet(baseSet.clone() --= otherSet.baseSet)

Review comment:
   ok ~








[GitHub] [spark] maropu opened a new pull request #29691: [SPARK-32828][SQL] Cast from a derived user-defined type to a base type

2020-09-09 Thread GitBox


maropu opened a new pull request #29691:
URL: https://github.com/apache/spark/pull/29691


   
   
   ### What changes were proposed in this pull request?
   
   This PR intends to fix the existing bug below in `UserDefinedTypeSuite`:
   ```
   [info] - SPARK-19311: UDFs disregard UDT type hierarchy (931 milliseconds)
   16:22:35.936 WARN org.apache.spark.sql.catalyst.expressions.SafeProjection: Expr codegen error and falling back to interpreter mode
   org.apache.spark.SparkException: Cannot cast org.apache.spark.sql.ExampleSubTypeUDT@46b1771f to org.apache.spark.sql.ExampleBaseTypeUDT@31e8d979.
       at org.apache.spark.sql.catalyst.expressions.CastBase.nullSafeCastFunction(Cast.scala:891)
       at org.apache.spark.sql.catalyst.expressions.CastBase.doGenCode(Cast.scala:852)
       at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$genCode$3(Expression.scala:147)
   ...
   ```
   
   ### Why are the changes needed?
   
   bugfix
   
   ### Does this PR introduce _any_ user-facing change?
   
   No.
   
   ### How was this patch tested?
   
   Added unit tests.






[GitHub] [spark] cloud-fan commented on pull request #29160: [SPARK-32364][SQL] Use CaseInsensitiveMap for DataFrameReader/Writer options

2020-09-09 Thread GitBox


cloud-fan commented on pull request #29160:
URL: https://github.com/apache/spark/pull/29160#issuecomment-689366303


   @dongjoon-hyun shall we fix the issue in DataStreamReader/Writer as well? cc 
@HeartSaVioR 






[GitHub] [spark] SparkQA commented on pull request #29691: [SPARK-32828][SQL] Cast from a derived user-defined type to a base type

2020-09-09 Thread GitBox


SparkQA commented on pull request #29691:
URL: https://github.com/apache/spark/pull/29691#issuecomment-689366489


   **[Test build #128439 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128439/testReport)** for PR 29691 at commit [`2c51daa`](https://github.com/apache/spark/commit/2c51daaed003d17d007b3a8bfdcad8e7993c6557).






[GitHub] [spark] AmplabJenkins commented on pull request #29691: [SPARK-32828][SQL] Cast from a derived user-defined type to a base type

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29691:
URL: https://github.com/apache/spark/pull/29691#issuecomment-689366995










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29691: [SPARK-32828][SQL] Cast from a derived user-defined type to a base type

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29691:
URL: https://github.com/apache/spark/pull/29691#issuecomment-689366995










[GitHub] [spark] kiszk edited a comment on pull request #29686: [SPARK-32312][SQL][PYTHON][test-java11] Upgrade Apache Arrow to version 1.0.1

2020-09-09 Thread GitBox


kiszk edited a comment on pull request #29686:
URL: https://github.com/apache/spark/pull/29686#issuecomment-689367460


   One question. Do we still need the environment variable 
`ARROW_PRE_0_15_IPC_FORMAT` ?






[GitHub] [spark] kiszk commented on pull request #29686: [SPARK-32312][SQL][PYTHON][test-java11] Upgrade Apache Arrow to version 1.0.1

2020-09-09 Thread GitBox


kiszk commented on pull request #29686:
URL: https://github.com/apache/spark/pull/29686#issuecomment-689367460


   One question. Do we still need `ARROW_PRE_0_15_IPC_FORMAT` ?






[GitHub] [spark] dongjoon-hyun commented on pull request #29160: [SPARK-32364][SQL] Use CaseInsensitiveMap for DataFrameReader/Writer options

2020-09-09 Thread GitBox


dongjoon-hyun commented on pull request #29160:
URL: https://github.com/apache/spark/pull/29160#issuecomment-689372116


   Sure, I'll make a PR for that tomorrow, @cloud-fan .






[GitHub] [spark] LuciferYang edited a comment on pull request #29660: [SPARK-32808][SQL] Fix some test cases of `sql/core` module in scala 2.13

2020-09-09 Thread GitBox


LuciferYang edited a comment on pull request #29660:
URL: https://github.com/apache/spark/pull/29660#issuecomment-689326251


   org.apache.spark.sql.hive.thriftserver.CliSuite.* failed because `Database 
clitestdb already exists`...






[GitHub] [spark] maropu commented on a change in pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


maropu commented on a change in pull request #29688:
URL: https://github.com/apache/spark/pull/29688#discussion_r485409066



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -2370,6 +2370,13 @@ object SQLConf {
   "(nonnegative and shorter than the maximum size).")
 .createWithDefaultString(s"${ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH}")
 
+  val MAX_METADATA_STRING_LENGTH = buildConf("spark.sql.maxMetadataStringLength")
+    .doc("Maximum number of characters to output for a metadata string. e.g. " +
+      "`DataSourceScanExec`, every value will be abbreviated if exceed length.")
+    .version("3.1.0")
+    .intConf

Review comment:
   Please add `checkValue`.
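
As a rough illustration of why a value check matters for a length limit like this, here is a standalone sketch that does not use Spark's config builder. `MetadataStringSketch`, `validateMaxLength` and `abbreviate` are hypothetical names, and the lower bound of 3 (room for a `...` marker) is an assumption made for this sketch, not something stated in the PR.

```scala
// Hypothetical helper, for illustration only (not part of Spark).
object MetadataStringSketch {
  /** Rejects limits too small to hold the "..." marker plus at least one character. */
  def validateMaxLength(maxLength: Int): Int = {
    require(maxLength > 3, s"maxLength must be greater than 3, but got $maxLength")
    maxLength
  }

  /** Returns `value` unchanged if it fits, otherwise a prefix followed by "...". */
  def abbreviate(value: String, maxLength: Int): String = {
    val limit = validateMaxLength(maxLength)
    if (value.length <= limit) value else value.take(limit - 3) + "..."
  }
}
```

Without the check, a zero or negative limit would only surface later, when a plan string is rendered.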








[GitHub] [spark] maropu commented on a change in pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


maropu commented on a change in pull request #29688:
URL: https://github.com/apache/spark/pull/29688#discussion_r485408952



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -2370,6 +2370,13 @@ object SQLConf {
   "(nonnegative and shorter than the maximum size).")
 .createWithDefaultString(s"${ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH}")
 
+  val MAX_METADATA_STRING_LENGTH = buildConf("spark.sql.maxMetadataStringLength")
+    .doc("Maximum number of characters to output for a metadata string. e.g. " +
+      "`DataSourceScanExec`, every value will be abbreviated if exceed length.")

Review comment:
   `e.g. DataSourceScanExec` => `e.g. file location in DataSourceScanExec`








[GitHub] [spark] cloud-fan commented on a change in pull request #29543: [SPARK-32516][SQL][FOLLOWUP] 'path' option cannot coexist with path parameter for DataFrameWriter.save(), DataStreamReader.load

2020-09-09 Thread GitBox


cloud-fan commented on a change in pull request #29543:
URL: https://github.com/apache/spark/pull/29543#discussion_r485409606



##
File path: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala
##
@@ -284,6 +285,12 @@ final class DataFrameWriter[T] private[sql](ds: Dataset[T]) {
* @since 1.4.0
*/
   def save(path: String): Unit = {
+if (!df.sparkSession.sessionState.conf.legacyPathOptionBehavior &&
+extraOptions.contains("path") && path.nonEmpty) {

Review comment:
   The `path` here is a String, do we really need to check `path.nonEmpty`?
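
The condition in the diff above can be isolated in a small sketch to see when `path.nonEmpty` changes the outcome. This assumes a plain `Map[String, String]` for the options and a boolean flag standing in for `legacyPathOptionBehavior`; `PathOptionCheck` and `conflicts` are hypothetical names, not Spark APIs.

```scala
// Hypothetical helper mirroring the condition in the diff, for illustration only.
object PathOptionCheck {
  /** True when both a "path" option and a non-empty path argument are given in non-legacy mode. */
  def conflicts(legacyBehavior: Boolean, extraOptions: Map[String, String], path: String): Boolean =
    !legacyBehavior && extraOptions.contains("path") && path.nonEmpty
}
```

Under these assumptions, the `nonEmpty` clause only matters when `save` is called with an empty string: `conflicts(false, Map("path" -> "/a"), "")` is `false`, while any non-empty argument makes it `true`.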








[GitHub] [spark] maropu commented on a change in pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


maropu commented on a change in pull request #29688:
URL: https://github.com/apache/spark/pull/29688#discussion_r485409379



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala
##
@@ -881,6 +882,28 @@ class FileBasedDataSourceSuite extends QueryTest
   }
 }
   }
+
+  test("SPARK-32827: Add spark.sql.maxMetadataStringLength config") {
+withTempDir { dir =>
+  val tableName = "t1"

Review comment:
   nit: `t1` -> `t`








[GitHub] [spark] maropu commented on a change in pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


maropu commented on a change in pull request #29688:
URL: https://github.com/apache/spark/pull/29688#discussion_r485409893



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala
##
@@ -881,6 +882,28 @@ class FileBasedDataSourceSuite extends QueryTest
   }
 }
   }
+
+  test("SPARK-32827: Add spark.sql.maxMetadataStringLength config") {
+withTempDir { dir =>
+  val tableName = "t1"
+  val path = s"${dir.getCanonicalPath}/$tableName"
+  withTable(tableName) {
+sql(s"create table t1(c int) using parquet location '$path'")

Review comment:
   Please use uppercase for SQL keywords, e.g., `CREATE TABLE`.








[GitHub] [spark] maropu commented on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


maropu commented on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689388959


   Adding the config looks okay to me.






[GitHub] [spark] cxzl25 commented on a change in pull request #29316: [SPARK-32508][SQL] Disallow empty part col values in partition spec before static partition writing

2020-09-09 Thread GitBox


cxzl25 commented on a change in pull request #29316:
URL: https://github.com/apache/spark/pull/29316#discussion_r485418420



##
File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala
##
@@ -847,4 +847,26 @@ class InsertSuite extends QueryTest with TestHiveSingleton with BeforeAndAfter
   }
 }
   }
+
+  test("SPARK-32508 " +
+"Disallow empty part col values in partition spec before static partition writing") {
+withTable("t1") {
+  spark.sql(
+"""
+  |CREATE TABLE t1 (c1 int)

Review comment:
   `InsertIntoHadoopFsRelationCommand`
   When `manageFilesourcePartitions` is turned on, `catalog.listPartitions` is 
called, which checks whether the partition value is empty.
   
   When `manageFilesourcePartitions` is not turned on, the partition value is 
currently not checked, so the SQL execution does not fail. If I now move the 
check to the `PreprocessTableInsertion` rule, such executions will start to fail.
   
   Perhaps this check should only be performed when `tracksPartitionsInCatalog` 
is true and a static partition is written.
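
The kind of check being discussed can be sketched in isolation as follows. This is not the PR's code: it assumes a partition spec represented as `Map[String, Option[String]]`, where `Some(value)` marks a static partition column and `None` a dynamic one, and `PartitionSpecCheck` with its methods are hypothetical names.

```scala
// Hypothetical helper, for illustration only (not part of Spark).
object PartitionSpecCheck {
  /** Names of static partition columns whose value is the empty string, sorted. */
  def emptyStaticColumns(spec: Map[String, Option[String]]): Seq[String] =
    spec.collect { case (column, Some(value)) if value.isEmpty => column }.toSeq.sorted

  /** Left(error message) if any static partition value is empty, Right(()) otherwise. */
  def validate(spec: Map[String, Option[String]]): Either[String, Unit] = {
    val bad = emptyStaticColumns(spec)
    if (bad.isEmpty) Right(())
    else Left(s"Partition spec contains empty values for columns: ${bad.mkString(", ")}")
  }
}
```

Dynamic partition columns (`None`) are deliberately ignored, since their values are only known at write time.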








[GitHub] [spark] cxzl25 commented on a change in pull request #29316: [SPARK-32508][SQL] Disallow empty part col values in partition spec before static partition writing

2020-09-09 Thread GitBox


cxzl25 commented on a change in pull request #29316:
URL: https://github.com/apache/spark/pull/29316#discussion_r485420463



##
File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala
##
@@ -847,4 +847,26 @@ class InsertSuite extends QueryTest with TestHiveSingleton with BeforeAndAfter
   }
 }
   }
+
+  test("SPARK-32508 " +
+"Disallow empty part col values in partition spec before static partition writing") {
+withTable("t1") {
+  spark.sql(
+"""
+  |CREATE TABLE t1 (c1 int)

Review comment:
   Hive calls `getPartition` during `loadPartition`, which checks whether the 
partition value is empty.
   
   ```java
   public Partition getPartition(...){
 || (val != null && val.length() == 0)) {
   throw new HiveException("get partition: Value for key "
   + field.getName() + " is null or empty");
   }
   ```








[GitHub] [spark] AngersZhuuuu commented on pull request #29672: [SPARK-32818][SQL] Make `CONVERT_METASTORE_PARQUET` and `CONVERT_METASTORE_ORC` session level configurable

2020-09-09 Thread GitBox


AngersZhuuuu commented on pull request #29672:
URL: https://github.com/apache/spark/pull/29672#issuecomment-689404108


   retest this please






[GitHub] [spark] SparkQA commented on pull request #29672: [SPARK-32818][SQL] Make `CONVERT_METASTORE_PARQUET` and `CONVERT_METASTORE_ORC` session level configurable

2020-09-09 Thread GitBox


SparkQA commented on pull request #29672:
URL: https://github.com/apache/spark/pull/29672#issuecomment-689407145


   **[Test build #128440 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128440/testReport)** for PR 29672 at commit [`c2ff589`](https://github.com/apache/spark/commit/c2ff589a813fe7db38245f5f03cc8df019613960).






[GitHub] [spark] AmplabJenkins commented on pull request #29672: [SPARK-32818][SQL] Make `CONVERT_METASTORE_PARQUET` and `CONVERT_METASTORE_ORC` session level configurable

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29672:
URL: https://github.com/apache/spark/pull/29672#issuecomment-689407758










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29672: [SPARK-32818][SQL] Make `CONVERT_METASTORE_PARQUET` and `CONVERT_METASTORE_ORC` session level configurable

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29672:
URL: https://github.com/apache/spark/pull/29672#issuecomment-689407758










[GitHub] [spark] AmplabJenkins commented on pull request #29565: [SPARK-24994][SQL] Add UnwrapCastInBinaryComparison optimizer to simplify literal types

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29565:
URL: https://github.com/apache/spark/pull/29565#issuecomment-689411721










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29565: [SPARK-24994][SQL] Add UnwrapCastInBinaryComparison optimizer to simplify literal types

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29565:
URL: https://github.com/apache/spark/pull/29565#issuecomment-689411721










[GitHub] [spark] sunchao commented on a change in pull request #29565: [SPARK-24994][SQL] Add UnwrapCastInBinaryComparison optimizer to simplify literal types

2020-09-09 Thread GitBox


sunchao commented on a change in pull request #29565:
URL: https://github.com/apache/spark/pull/29565#discussion_r485432133



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/UnwrapCastInBinaryComparison.scala
##
@@ -0,0 +1,216 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.optimizer
+
+import org.apache.spark.sql.catalyst.expressions._
+import org.apache.spark.sql.catalyst.expressions.Literal.FalseLiteral
+import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
+import org.apache.spark.sql.catalyst.rules.Rule
+import org.apache.spark.sql.types._
+
+/**
+ * Unwrap casts in binary comparison operations with patterns like following:
+ *
+ * `BinaryComparison(Cast(fromExp, toType), Literal(value, toType))`
+ *   or
+ * `BinaryComparison(Literal(value, toType), Cast(fromExp, toType))`
+ *
+ * This rule optimizes expressions with the above pattern by either replacing the cast with simpler
+ * constructs, or moving the cast from the expression side to the literal side, which enables them
+ * to be optimized away later and pushed down to data sources.
+ *
+ * Currently this only handles cases where `fromType` (of `fromExp`) and `toType` are of integral
+ * types (i.e., byte, short, int and long). The rule checks to see if the literal `value` is
+ * within range `(min, max)`, where `min` and `max` are the minimum and maximum value of
+ * `fromType`, respectively. If this is true then it means we can safely cast `value` to `fromType`
+ * and thus able to move the cast to the literal side.
+ *
+ * If the `value` is not within range `(min, max)`, the rule breaks the scenario into different
+ * cases and try to replace each with simpler constructs.
+ *
+ * if `value > max`, the cases are of following:
+ *  - `cast(fromExp, toType) > value` ==> if(isnull(fromExp), null, false)
+ *  - `cast(fromExp, toType) >= value` ==> if(isnull(fromExp), null, false)
+ *  - `cast(fromExp, toType) === value` ==> if(isnull(fromExp), null, false)
+ *  - `cast(fromExp, toType) <=> value` ==> false
+ *  - `cast(fromExp, toType) <= value` ==> if(isnull(fromExp), null, true)
+ *  - `cast(fromExp, toType) < value` ==> if(isnull(fromExp), null, true)
+ *
+ * if `value == max`, the cases are of following:
+ *  - `cast(fromExp, toType) > value` ==> if(isnull(fromExp), null, false)
+ *  - `cast(fromExp, toType) >= value` ==> fromExp == max
+ *  - `cast(fromExp, toType) === value` ==> fromExp == max
+ *  - `cast(fromExp, toType) <=> value` ==> fromExp == max
+ *  - `cast(fromExp, toType) <= value` ==> if(isnull(fromExp), null, true)
+ *  - `cast(fromExp, toType) < value` ==> fromExp =!= max
+ *
+ * Similarly for the cases when `value == min` and `value < min`.
+ *
+ * Further, the above `if(isnull(fromExp), null, false)` is represented using conjunction
+ * `and(isnull(fromExp), null)`, to enable further optimization and filter pushdown to data sources.
+ * Similarly, `if(isnull(fromExp), null, true)` is represented with `or(isnotnull(fromExp), null)`.
+ */
+object UnwrapCastInBinaryComparison extends Rule[LogicalPlan] {
+  override def apply(plan: LogicalPlan): LogicalPlan = plan transform {
+case l: LogicalPlan =>
+  l transformExpressionsUp {
+case e @ BinaryComparison(_, _) => unwrapCast(e)
+  }
+  }
+
+  private def unwrapCast(exp: Expression): Expression = exp match {
+// Not a canonical form. In this case we first canonicalize the expression by swapping the
+// literal and cast side, then process the result and swap the literal and cast again to
+// restore the original order.
+case BinaryComparison(Literal(_, toType: IntegralType), Cast(fromExp, _: IntegralType, _))
+if canImplicitlyCast(fromExp, toType) =>
+  def swap(e: Expression): Expression = e match {
+case GreaterThan(left, right) => LessThan(right, left)
+case GreaterThanOrEqual(left, right) => LessThanOrEqual(right, left)
+case EqualTo(left, right) => EqualTo(right, left)
+case EqualNullSafe(left, right) => EqualNullSafe(right, left)
+case LessThanOrEqual(left, right) => GreaterThanOrEqual(right, left)
+case LessTha
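
The case analysis in the Scaladoc quoted above can be made concrete with a small standalone sketch. It is not the rule's implementation: it covers only `cast(x as int) > value` for a `short`-typed `x`, and `UnwrapCastSketch`, `Rewrite` and the case names are hypothetical, introduced for illustration. The `value < min` branch follows the "similarly" remark in the Scaladoc rather than an explicitly listed case.

```scala
// Hypothetical model of the rewrite for `cast(x as int) > value`, x being a short column.
object UnwrapCastSketch {
  sealed trait Rewrite
  /** Stands for `if(isnull(x), null, false)`. */
  case object FalseUnlessNull extends Rewrite
  /** Stands for `if(isnull(x), null, true)`. */
  case object TrueUnlessNull extends Rewrite
  /** Stands for `x > literal`: the cast is dropped and the literal narrowed to short. */
  final case class GreaterThanShort(literal: Short) extends Rewrite

  def rewriteGreaterThan(value: Int): Rewrite =
    if (value >= Short.MaxValue) FalseUnlessNull      // no short is greater than max or beyond
    else if (value < Short.MinValue) TrueUnlessNull   // every short is greater than such a value
    else GreaterThanShort(value.toShort)              // safe to narrow; includes value == min
}
```

For instance, `rewriteGreaterThan(40000)` gives `FalseUnlessNull`, because no `short` can exceed 40000, while `rewriteGreaterThan(5)` gives `GreaterThanShort(5)`, a comparison that no longer needs the cast.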

[GitHub] [spark] SparkQA commented on pull request #29565: [SPARK-24994][SQL] Add UnwrapCastInBinaryComparison optimizer to simplify literal types

2020-09-09 Thread GitBox


SparkQA commented on pull request #29565:
URL: https://github.com/apache/spark/pull/29565#issuecomment-689415160


   **[Test build #128441 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128441/testReport)** for PR 29565 at commit [`265c169`](https://github.com/apache/spark/commit/265c1698c29730b0dee2865773557d7a3c4c1144).






[GitHub] [spark] SparkQA commented on pull request #29316: [SPARK-32508][SQL] Disallow empty part col values in partition spec before static partition writing

2020-09-09 Thread GitBox


SparkQA commented on pull request #29316:
URL: https://github.com/apache/spark/pull/29316#issuecomment-689419138


   **[Test build #128442 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128442/testReport)** for PR 29316 at commit [`232a835`](https://github.com/apache/spark/commit/232a8358f3c22e14afc419f11b60047d9a1a3882).






[GitHub] [spark] AmplabJenkins commented on pull request #29316: [SPARK-32508][SQL] Disallow empty part col values in partition spec before static partition writing

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29316:
URL: https://github.com/apache/spark/pull/29316#issuecomment-689419911










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29316: [SPARK-32508][SQL] Disallow empty part col values in partition spec before static partition writing

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29316:
URL: https://github.com/apache/spark/pull/29316#issuecomment-689419911










[GitHub] [spark] dongjoon-hyun commented on pull request #29160: [SPARK-32364][SQL] Use CaseInsensitiveMap for DataFrameReader/Writer options

2020-09-09 Thread GitBox


dongjoon-hyun commented on pull request #29160:
URL: https://github.com/apache/spark/pull/29160#issuecomment-689425629


   BTW, @cloud-fan, is @HeartSaVioR working on that? I'm wondering why you 
pinged him on that task.






[GitHub] [spark] ulysses-you commented on a change in pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


ulysses-you commented on a change in pull request #29688:
URL: https://github.com/apache/spark/pull/29688#discussion_r485450580



##
File path: 
sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala
##
@@ -881,6 +882,28 @@ class FileBasedDataSourceSuite extends QueryTest
   }
 }
   }
+
+  test("SPARK-32827: Add spark.sql.maxMetadataStringLength config") {
+withTempDir { dir =>
+  val tableName = "t1"
+  val path = s"${dir.getCanonicalPath}/$tableName"
+  withTable(tableName) {
+sql(s"create table t1(c int) using parquet location '$path'")

Review comment:
   done.








[GitHub] [spark] SparkQA commented on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


SparkQA commented on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689427681


   **[Test build #128443 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128443/testReport)**
 for PR 29688 at commit 
[`239f62b`](https://github.com/apache/spark/commit/239f62bc7f58cac1a690b123ce37acfdd2279422).






[GitHub] [spark] AmplabJenkins commented on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689428477










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689428477










[GitHub] [spark] ulysses-you commented on a change in pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


ulysses-you commented on a change in pull request #29688:
URL: https://github.com/apache/spark/pull/29688#discussion_r485453573



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -2370,6 +2370,14 @@ object SQLConf {
   "(nonnegative and shorter than the maximum size).")
 .createWithDefaultString(s"${ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH}")
 
+  val MAX_METADATA_STRING_LENGTH = buildConf("spark.sql.maxMetadataStringLength")
+    .doc("Maximum number of characters to output for a metadata string. e.g. " +
+      "file location in `DataSourceScanExec`, every value will be abbreviated if exceed length.")
+    .version("3.1.0")
+    .intConf
+    .checkValue(_ > 3, "This value must be bigger than 3.")

Review comment:
   This value comes from `org.apache.commons.lang3.StringUtils.abbreviate`, 
whose abbreviation marker `...` is three characters long, so the limit must be 
bigger than 3.
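
To illustrate why the lower bound is 3, here is a minimal sketch of marker-based abbreviation. It is not the commons-lang3 implementation; the object name `AbbreviateSketch` and its `abbreviate` helper are hypothetical and only mirror the basic idea that the marker itself consumes three characters of the budget.

```scala
object AbbreviateSketch {
  // The three-character marker appended to shortened values.
  private val Marker = "..."

  // Hypothetical helper: shorten `value` to at most `maxWidth` characters.
  // The width must leave room for the marker plus at least one real character.
  def abbreviate(value: String, maxWidth: Int): String = {
    require(maxWidth > Marker.length, s"maxWidth must be bigger than ${Marker.length}")
    if (value.length <= maxWidth) value
    else value.take(maxWidth - Marker.length) + Marker
  }

  def main(args: Array[String]): Unit = {
    // A long path is cut to 7 characters plus the marker.
    println(abbreviate("file:/warehouse/t1/part-00000.parquet", 10))
    // A value that already fits is returned unchanged.
    println(abbreviate("short", 10))
  }
}
```

With a width of 3 or less there would be no room for any original character next to the marker, which is presumably what the `checkValue(_ > 3, ...)` guard protects against.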








[GitHub] [spark] SparkQA commented on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


SparkQA commented on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689432286


   **[Test build #128444 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128444/testReport)**
 for PR 29688 at commit 
[`681f84e`](https://github.com/apache/spark/commit/681f84e4991bc2477e9354f29b1d2841ae5b656e).






[GitHub] [spark] AmplabJenkins commented on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689432986










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29688:
URL: https://github.com/apache/spark/pull/29688#issuecomment-689432986










[GitHub] [spark] ulysses-you commented on a change in pull request #29688: [SPARK-32827][SQL] Add spark.sql.maxMetadataStringLength config

2020-09-09 Thread GitBox


ulysses-you commented on a change in pull request #29688:
URL: https://github.com/apache/spark/pull/29688#discussion_r485453573



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -2370,6 +2370,14 @@ object SQLConf {
   "(nonnegative and shorter than the maximum size).")
 .createWithDefaultString(s"${ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH}")
 
+  val MAX_METADATA_STRING_LENGTH = buildConf("spark.sql.maxMetadataStringLength")
+    .doc("Maximum number of characters to output for a metadata string. e.g. " +
+      "file location in `DataSourceScanExec`, every value will be abbreviated if exceed length.")
+    .version("3.1.0")
+    .intConf
+    .checkValue(_ > 3, "This value must be bigger than 3.")

Review comment:
   This value comes from `org.apache.commons.lang3.StringUtils.abbreviate`, 
whose abbreviation marker `...` is three characters long, so the limit must be 
bigger than 3.








[GitHub] [spark] AmplabJenkins commented on pull request #29691: [SPARK-32828][SQL] Cast from a derived user-defined type to a base type

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29691:
URL: https://github.com/apache/spark/pull/29691#issuecomment-68923










[GitHub] [spark] SparkQA commented on pull request #29691: [SPARK-32828][SQL] Cast from a derived user-defined type to a base type

2020-09-09 Thread GitBox


SparkQA commented on pull request #29691:
URL: https://github.com/apache/spark/pull/29691#issuecomment-689444209


   **[Test build #128439 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128439/testReport)**
 for PR 29691 at commit 
[`2c51daa`](https://github.com/apache/spark/commit/2c51daaed003d17d007b3a8bfdcad8e7993c6557).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #29691: [SPARK-32828][SQL] Cast from a derived user-defined type to a base type

2020-09-09 Thread GitBox


SparkQA removed a comment on pull request #29691:
URL: https://github.com/apache/spark/pull/29691#issuecomment-689366489


   **[Test build #128439 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128439/testReport)**
 for PR 29691 at commit 
[`2c51daa`](https://github.com/apache/spark/commit/2c51daaed003d17d007b3a8bfdcad8e7993c6557).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29691: [SPARK-32828][SQL] Cast from a derived user-defined type to a base type

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29691:
URL: https://github.com/apache/spark/pull/29691#issuecomment-68923


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29691: [SPARK-32828][SQL] Cast from a derived user-defined type to a base type

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29691:
URL: https://github.com/apache/spark/pull/29691#issuecomment-68934


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128439/
   Test FAILed.






[GitHub] [spark] Ngone51 commented on pull request #29580: [SPARK-32738][CORE] Should reduce the number of active threads if fatal error happens in `Inbox.process`

2020-09-09 Thread GitBox


Ngone51 commented on pull request #29580:
URL: https://github.com/apache/spark/pull/29580#issuecomment-689451055


   LGTM.






[GitHub] [spark] LuciferYang commented on pull request #29660: [SPARK-32808][SQL] Fix some test cases of `sql/core` module in scala 2.13

2020-09-09 Thread GitBox


LuciferYang commented on pull request #29660:
URL: https://github.com/apache/spark/pull/29660#issuecomment-689451710


   Ran `org.apache.spark.sql.hive.thriftserver.CliSuite` locally; all 28 cases 
succeeded.






[GitHub] [spark] AngersZhuuuu opened a new pull request #29692: [WIP][SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox


AngersZhuuuu opened a new pull request #29692:
URL: https://github.com/apache/spark/pull/29692


   ### What changes were proposed in this pull request?
   For BroadcastNestedLoopJoin, we broadcast the broadcast-side child to all 
executors, and each stream-side partition traverses the broadcast-side data 
row by row. 
   
   We have met some cases where the stream-side data is skewed and all the 
finished tasks wait for the skewed partition to finish.
   
   The execution time grows with the amount of data in a partition.
   
   If a partition is skewed by 100x, it will take about 100x longer to process 
than a non-skewed partition.
   
   This is a bottleneck. With AE, we can avoid it by splitting the skewed 
partition's data to make the partitions more balanced.
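
The splitting idea described above can be sketched on plain Scala collections. This is only an illustration under stated assumptions and not the code in this PR: `SkewSplitSketch`, `splitSkewed`, `nestedLoopJoin` and the `targetSize` threshold are hypothetical names, and partitions are modelled as in-memory `Seq`s rather than Spark RDD partitions.

```scala
object SkewSplitSketch {
  // Hypothetical helper: any stream-side partition with more than `targetSize`
  // rows is cut into chunks of at most `targetSize` rows.
  def splitSkewed[A](partitions: Seq[Seq[A]], targetSize: Int): Seq[Seq[A]] = {
    require(targetSize > 0, "targetSize must be positive")
    partitions.flatMap { p =>
      if (p.size <= targetSize) Seq(p) else p.grouped(targetSize).toSeq
    }
  }

  // Each stream-side row is compared with every broadcast-side row, so the work
  // for one partition is (rows in partition) x (rows in broadcast data).
  def nestedLoopJoin[A, B](stream: Seq[A], broadcast: Seq[B])(cond: (A, B) => Boolean): Seq[(A, B)] =
    for (a <- stream; b <- broadcast if cond(a, b)) yield (a, b)

  def main(args: Array[String]): Unit = {
    val broadcast = Seq(1, 2, 3)
    // One partition holds 100 rows while the others hold a single row each.
    val partitions = Seq(Seq.range(0, 100), Seq(100), Seq(101))
    // After splitting there are six partitions: four chunks of 25 rows
    // plus the two small ones.
    val balanced = splitSkewed(partitions, targetSize = 25)
    println(balanced.map(_.size))
    // The join result does not depend on how the stream side is partitioned.
    val joined = balanced.flatMap(p => nestedLoopJoin(p, broadcast)(_ % 50 == _))
    println(joined.size)
  }
}
```

Because every chunk still sees the full broadcast side, splitting changes only how the work is distributed across tasks, not the join output.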
   
   
   ### Why are the changes needed?
   NO
   
   
   ### Does this PR introduce _any_ user-facing change?
   NO
   
   
   ### How was this patch tested?
   WIP
   






[GitHub] [spark] SparkQA commented on pull request #29692: [WIP][SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox


SparkQA commented on pull request #29692:
URL: https://github.com/apache/spark/pull/29692#issuecomment-689463957


   **[Test build #128445 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128445/testReport)**
 for PR 29692 at commit 
[`7aba44d`](https://github.com/apache/spark/commit/7aba44d5aa6cac633e56d8e4b34d213b5d0a87d6).






[GitHub] [spark] AmplabJenkins commented on pull request #29692: [WIP][SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29692:
URL: https://github.com/apache/spark/pull/29692#issuecomment-689464467










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29692: [WIP][SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29692:
URL: https://github.com/apache/spark/pull/29692#issuecomment-689464467










[GitHub] [spark] SparkQA commented on pull request #29692: [WIP][SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox


SparkQA commented on pull request #29692:
URL: https://github.com/apache/spark/pull/29692#issuecomment-689465338


   **[Test build #128445 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128445/testReport)**
 for PR 29692 at commit 
[`7aba44d`](https://github.com/apache/spark/commit/7aba44d5aa6cac633e56d8e4b34d213b5d0a87d6).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA removed a comment on pull request #29692: [WIP][SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox


SparkQA removed a comment on pull request #29692:
URL: https://github.com/apache/spark/pull/29692#issuecomment-689463957


   **[Test build #128445 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128445/testReport)**
 for PR 29692 at commit 
[`7aba44d`](https://github.com/apache/spark/commit/7aba44d5aa6cac633e56d8e4b34d213b5d0a87d6).






[GitHub] [spark] AmplabJenkins commented on pull request #29692: [WIP][SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29692:
URL: https://github.com/apache/spark/pull/29692#issuecomment-689465349










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29692: [WIP][SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29692:
URL: https://github.com/apache/spark/pull/29692#issuecomment-689465349


   Merged build finished. Test FAILed.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29692: [WIP][SPARK-32830][SQL] Optimize Skewed BroadcastNestedLoopJoin with AE

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29692:
URL: https://github.com/apache/spark/pull/29692#issuecomment-689465359


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/128445/
   Test FAILed.






[GitHub] [spark] xuanyuanking commented on pull request #29660: [SPARK-32808][SQL] Fix some test cases of `sql/core` module in scala 2.13

2020-09-09 Thread GitBox


xuanyuanking commented on pull request #29660:
URL: https://github.com/apache/spark/pull/29660#issuecomment-689468148


   retest this please






[GitHub] [spark] SparkQA commented on pull request #29660: [SPARK-32808][SQL] Fix some test cases of `sql/core` module in scala 2.13

2020-09-09 Thread GitBox


SparkQA commented on pull request #29660:
URL: https://github.com/apache/spark/pull/29660#issuecomment-689471142


   **[Test build #128446 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128446/testReport)**
 for PR 29660 at commit 
[`9185a95`](https://github.com/apache/spark/commit/9185a95c29bc532e9f0aea5bdd4f2185c5093642).






[GitHub] [spark] AmplabJenkins commented on pull request #29660: [SPARK-32808][SQL] Fix some test cases of `sql/core` module in scala 2.13

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29660:
URL: https://github.com/apache/spark/pull/29660#issuecomment-689471539










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29660: [SPARK-32808][SQL] Fix some test cases of `sql/core` module in scala 2.13

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29660:
URL: https://github.com/apache/spark/pull/29660#issuecomment-689471539










[GitHub] [spark] SparkQA commented on pull request #29691: [SPARK-32828][SQL] Cast from a derived user-defined type to a base type

2020-09-09 Thread GitBox


SparkQA commented on pull request #29691:
URL: https://github.com/apache/spark/pull/29691#issuecomment-689474457


   **[Test build #128447 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/128447/testReport)**
 for PR 29691 at commit 
[`6d3ecdb`](https://github.com/apache/spark/commit/6d3ecdb156c87df516a4484e0167f3c2d8d93f6d).






[GitHub] [spark] AmplabJenkins commented on pull request #29691: [SPARK-32828][SQL] Cast from a derived user-defined type to a base type

2020-09-09 Thread GitBox


AmplabJenkins commented on pull request #29691:
URL: https://github.com/apache/spark/pull/29691#issuecomment-689475074










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29691: [SPARK-32828][SQL] Cast from a derived user-defined type to a base type

2020-09-09 Thread GitBox


AmplabJenkins removed a comment on pull request #29691:
URL: https://github.com/apache/spark/pull/29691#issuecomment-689475074









