[GitHub] [spark] SparkQA removed a comment on pull request #30326: [MINOR][GRAPHX] Correct typos in the sub-modules: graphx, external, and examples

2020-11-10 Thread GitBox


SparkQA removed a comment on pull request #30326:
URL: https://github.com/apache/spark/pull/30326#issuecomment-725261615


   **[Test build #130916 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130916/testReport)**
 for PR 30326 at commit 
[`4203606`](https://github.com/apache/spark/commit/4203606bf33ad903441ea2a8be81f9f9fcf997a2).



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] AmplabJenkins removed a comment on pull request #30326: [MINOR][GRAPHX] Correct typos in the sub-modules: graphx, external, and examples

2020-11-10 Thread GitBox


AmplabJenkins removed a comment on pull request #30326:
URL: https://github.com/apache/spark/pull/30326#issuecomment-725268481


   Merged build finished. Test FAILed.






[GitHub] [spark] cloud-fan commented on a change in pull request #30297: [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound

2020-11-10 Thread GitBox


cloud-fan commented on a change in pull request #30297:
URL: https://github.com/apache/spark/pull/30297#discussion_r521179168



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##
@@ -1919,9 +1920,14 @@ case class ArrayPosition(left: Expression, right: 
Expression)
b
   """,
   since = "2.4.0")
-case class ElementAt(left: Expression, right: Expression)
+case class ElementAt(
+left: Expression,
+right: Expression,
+failOnError: Boolean = SQLConf.get.ansiEnabled)

Review comment:
   Making it a parameter is a more robust way to retain this info. Otherwise, we may change it when transforming and copying the expression.
   
   This also helps if we want to support a SAFE prefix like https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators
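This constructor-parameter point can be illustrated outside Spark. Below is a minimal Python sketch (not Spark code; the class and the `ANSI_ENABLED` flag are hypothetical stand-ins for `ElementAt` and `SQLConf.get.ansiEnabled`): because the flag is captured as a field at construction time, copies made while transforming the tree keep the original value even if the session configuration changes underneath.

```python
from dataclasses import dataclass, replace

# Hypothetical stand-in for SQLConf.get.ansiEnabled.
ANSI_ENABLED = True


def default_fail_on_error() -> bool:
    return ANSI_ENABLED


@dataclass(frozen=True)
class ElementAt:
    left: str
    right: str
    # Captured once at construction; later copies reuse the stored value.
    fail_on_error: bool = None

    def __post_init__(self):
        if self.fail_on_error is None:
            object.__setattr__(self, "fail_on_error", default_fail_on_error())


expr = ElementAt("arr", "0")
assert expr.fail_on_error is True

# Flip the "config" and copy the expression, as a tree transform would:
# the captured flag survives instead of being re-read from the config.
ANSI_ENABLED = False
copied = replace(expr, right="1")
assert copied.fail_on_error is True
```

If the flag were instead re-read from the config at evaluation time, a transformed copy would silently pick up the new value.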








[GitHub] [spark] AmplabJenkins commented on pull request #30326: [MINOR][GRAPHX] Correct typos in the sub-modules: graphx, external, and examples

2020-11-10 Thread GitBox


AmplabJenkins commented on pull request #30326:
URL: https://github.com/apache/spark/pull/30326#issuecomment-725268481










[GitHub] [spark] SparkQA commented on pull request #30326: [MINOR][GRAPHX] Correct typos in the sub-modules: graphx, external, and examples

2020-11-10 Thread GitBox


SparkQA commented on pull request #30326:
URL: https://github.com/apache/spark/pull/30326#issuecomment-725268456


   **[Test build #130916 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130916/testReport)**
 for PR 30326 at commit 
[`4203606`](https://github.com/apache/spark/commit/4203606bf33ad903441ea2a8be81f9f9fcf997a2).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30297: [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound

2020-11-10 Thread GitBox


AmplabJenkins removed a comment on pull request #30297:
URL: https://github.com/apache/spark/pull/30297#issuecomment-725265973


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/130917/
   Test FAILed.






[GitHub] [spark] SparkQA removed a comment on pull request #30297: [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound

2020-11-10 Thread GitBox


SparkQA removed a comment on pull request #30297:
URL: https://github.com/apache/spark/pull/30297#issuecomment-725264615


   **[Test build #130917 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130917/testReport)**
 for PR 30297 at commit 
[`a9312a0`](https://github.com/apache/spark/commit/a9312a0546e891266323423e007f65d78ce49ff4).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30297: [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound

2020-11-10 Thread GitBox


AmplabJenkins removed a comment on pull request #30297:
URL: https://github.com/apache/spark/pull/30297#issuecomment-725265964


   Merged build finished. Test FAILed.






[GitHub] [spark] viirya commented on a change in pull request #30297: [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound

2020-11-10 Thread GitBox


viirya commented on a change in pull request #30297:
URL: https://github.com/apache/spark/pull/30297#discussion_r521175748



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##
@@ -1919,9 +1920,14 @@ case class ArrayPosition(left: Expression, right: 
Expression)
b
   """,
   since = "2.4.0")
-case class ElementAt(left: Expression, right: Expression)
+case class ElementAt(
+left: Expression,
+right: Expression,
+failOnError: Boolean = SQLConf.get.ansiEnabled)

Review comment:
   Why not just have `val failOnError: Boolean = SQLConf.get.ansiEnabled`? 
Do you need to assign it a value other than `SQLConf.get.ansiEnabled` when 
constructing?








[GitHub] [spark] SparkQA commented on pull request #30297: [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound

2020-11-10 Thread GitBox


SparkQA commented on pull request #30297:
URL: https://github.com/apache/spark/pull/30297#issuecomment-725265949


   **[Test build #130917 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130917/testReport)**
 for PR 30297 at commit 
[`a9312a0`](https://github.com/apache/spark/commit/a9312a0546e891266323423e007f65d78ce49ff4).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] viirya commented on a change in pull request #30297: [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound

2020-11-10 Thread GitBox


viirya commented on a change in pull request #30297:
URL: https://github.com/apache/spark/pull/30297#discussion_r521175748



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##
@@ -1919,9 +1920,14 @@ case class ArrayPosition(left: Expression, right: 
Expression)
b
   """,
   since = "2.4.0")
-case class ElementAt(left: Expression, right: Expression)
+case class ElementAt(
+left: Expression,
+right: Expression,
+failOnError: Boolean = SQLConf.get.ansiEnabled)

Review comment:
   Why not just have `val failOnError: Boolean = SQLConf.get.ansiEnabled`? Do you need to assign it a value other than `SQLConf.get.ansiEnabled` when constructing?








[GitHub] [spark] AmplabJenkins commented on pull request #30297: [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound

2020-11-10 Thread GitBox


AmplabJenkins commented on pull request #30297:
URL: https://github.com/apache/spark/pull/30297#issuecomment-725265964










[GitHub] [spark] SparkQA commented on pull request #30297: [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound

2020-11-10 Thread GitBox


SparkQA commented on pull request #30297:
URL: https://github.com/apache/spark/pull/30297#issuecomment-725264607










[GitHub] [spark] SparkQA commented on pull request #30309: [WIP][SPARK-33407][PYTHON] Simplify the exception message from Python UDFs

2020-11-10 Thread GitBox


SparkQA commented on pull request #30309:
URL: https://github.com/apache/spark/pull/30309#issuecomment-725264247


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35516/
   






[GitHub] [spark] SparkQA commented on pull request #30297: [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound

2020-11-10 Thread GitBox


SparkQA commented on pull request #30297:
URL: https://github.com/apache/spark/pull/30297#issuecomment-725263945


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35515/
   






[GitHub] [spark] sarutak edited a comment on pull request #30292: [SPARK-33166][DOC] Provide Search Function in Spark docs site

2020-11-10 Thread GitBox


sarutak edited a comment on pull request #30292:
URL: https://github.com/apache/spark/pull/30292#issuecomment-725262955


   > @sarutak Because this PR not merged and not publish to the online Spark 
document.
   
   Yes. But even if this is published, the search results will consistently refer to the latest release at that point in time.
   Imagine that this feature is merged into 3.1, then 3.2 is released in the future, and someone still uses Spark 3.1.
   In that case, even though the user uses this feature on the 3.1 docs, the search results refer to the document for 3.2, right?






[GitHub] [spark] sarutak edited a comment on pull request #30292: [SPARK-33166][DOC] Provide Search Function in Spark docs site

2020-11-10 Thread GitBox


sarutak edited a comment on pull request #30292:
URL: https://github.com/apache/spark/pull/30292#issuecomment-725262955


   > @sarutak Because this PR not merged and not publish to the online Spark 
document.
   
   Yes. But even if this is published, the search results will consistently refer to the latest release at that point in time.
   Imagine that this feature is merged into 3.1, then 3.2 is released in the future, and someone still uses Spark 3.1.
   In that case, even though the user uses this feature on the 3.1 docs, one is taken to the document for 3.2, right?






[GitHub] [spark] sarutak edited a comment on pull request #30292: [SPARK-33166][DOC] Provide Search Function in Spark docs site

2020-11-10 Thread GitBox


sarutak edited a comment on pull request #30292:
URL: https://github.com/apache/spark/pull/30292#issuecomment-725262955


   > @sarutak Because this PR not merged and not publish to the online Spark 
document.
   
   Yes. But even if this is published, the search results will consistently refer to the latest release at that point in time.
   Imagine that this feature is merged into 3.1, then 3.2 is released in the future, and someone still uses Spark 3.1.
   In that case, even though the user uses this feature on the 3.1 docs, one is taken to the document for 3.2, right?






[GitHub] [spark] sarutak edited a comment on pull request #30292: [SPARK-33166][DOC] Provide Search Function in Spark docs site

2020-11-10 Thread GitBox


sarutak edited a comment on pull request #30292:
URL: https://github.com/apache/spark/pull/30292#issuecomment-725262955


   > @sarutak Because this PR not merged and not publish to the online Spark 
document.
   
   Yes. But even if this is published, the search results will consistently refer to the latest release at that point in time.
   Imagine that this feature is merged into 3.1, then 3.2 is released in the future, and someone still uses Spark 3.1.
   In that case, even though the user uses this feature on the 3.1 docs, one is taken to the document for 3.2, right?






[GitHub] [spark] leanken commented on a change in pull request #30297: [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound

2020-11-10 Thread GitBox


leanken commented on a change in pull request #30297:
URL: https://github.com/apache/spark/pull/30297#discussion_r521172555



##
File path: docs/sql-ref-ansi-compliance.md
##
@@ -111,6 +111,13 @@ SELECT * FROM t;
 
 The behavior of some SQL functions can be different under ANSI mode 
(`spark.sql.ansi.enabled=true`).
   - `size`: This function returns null for null input under ANSI mode.

Review comment:
   OK








[GitHub] [spark] sarutak commented on pull request #30292: [SPARK-33166][DOC] Provide Search Function in Spark docs site

2020-11-10 Thread GitBox


sarutak commented on pull request #30292:
URL: https://github.com/apache/spark/pull/30292#issuecomment-725262955


   > @sarutak Because this PR not merged and not publish to the online Spark 
document.
   
   Yes. But even if this is published, the search results will consistently refer to the latest release at that point in time.
   Imagine that this feature is merged into 3.1, then 3.2 is released in the future, and someone still uses Spark 3.1.
   In that case, even though the user uses this feature on the 3.1 docs, one is taken to the document for 3.2, right?






[GitHub] [spark] SparkQA commented on pull request #30326: [MINOR][GRAPHX] Correct typos

2020-11-10 Thread GitBox


SparkQA commented on pull request #30326:
URL: https://github.com/apache/spark/pull/30326#issuecomment-725261615


   **[Test build #130916 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130916/testReport)**
 for PR 30326 at commit 
[`4203606`](https://github.com/apache/spark/commit/4203606bf33ad903441ea2a8be81f9f9fcf997a2).






[GitHub] [spark] maropu commented on pull request #30297: [SPARK-33386][SQL] Accessing array elements in ElementAt/Elt/GetArrayItem should failed if index is out of bound

2020-11-10 Thread GitBox


maropu commented on pull request #30297:
URL: https://github.com/apache/spark/pull/30297#issuecomment-725261520


   Looks fine otherwise.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30325: [SPARK-33416][SQL] Avoid Hive metastore stack overflow when InSet predicate have many values

2020-11-10 Thread GitBox


AmplabJenkins removed a comment on pull request #30325:
URL: https://github.com/apache/spark/pull/30325#issuecomment-725261010










[GitHub] [spark] li36909 commented on pull request #30248: [SPARK-33339][PYTHON] Pyspark application will hang due to non Exception error

2020-11-10 Thread GitBox


li36909 commented on pull request #30248:
URL: https://github.com/apache/spark/pull/30248#issuecomment-725260785


   > It has a conflict in branch-2.4. It's sort of a corner case so I think we 
don't bother porting it back. @li36909, please go ahead and open a PR to 
backport if you're willing to do.
   
   ok, I will open a PR at branch-2.4, thank you!






[GitHub] [spark] SparkQA commented on pull request #30325: [SPARK-33416][SQL] Avoid Hive metastore stack overflow when InSet predicate have many values

2020-11-10 Thread GitBox


SparkQA commented on pull request #30325:
URL: https://github.com/apache/spark/pull/30325#issuecomment-725260995


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35513/
   






[GitHub] [spark] AmplabJenkins commented on pull request #30325: [SPARK-33416][SQL] Avoid Hive metastore stack overflow when InSet predicate have many values

2020-11-10 Thread GitBox


AmplabJenkins commented on pull request #30325:
URL: https://github.com/apache/spark/pull/30325#issuecomment-725261010










[GitHub] [spark] maropu commented on a change in pull request #30297: [SPARK-33386][SQL] Accessing array elements should failed if index is out of bound.

2020-11-10 Thread GitBox


maropu commented on a change in pull request #30297:
URL: https://github.com/apache/spark/pull/30297#discussion_r521169867



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/stringExpressions.scala
##
@@ -231,15 +232,23 @@ case class ConcatWs(children: Seq[Expression])
  */
 // scalastyle:off line.size.limit
 @ExpressionDescription(
-  usage = "_FUNC_(n, input1, input2, ...) - Returns the `n`-th input, e.g., 
returns `input2` when `n` is 2.",
+  usage = """
+_FUNC_(n, input1, input2, ...) - Returns the `n`-th input, e.g., returns 
`input2` when `n` is 2.
+If the index exceeds the length of the array, Returns NULL if ANSI mode is 
off;
+Throws ArrayIndexOutOfBoundsException when ANSI mode is on.

Review comment:
   The same comment with 
https://github.com/apache/spark/pull/30297#discussion_r521166711.
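For readers following along, the `elt` semantics described in the diff hunk above can be modeled in plain Python (an illustrative sketch of the documented behavior, not Spark's implementation; the `ansi_enabled` parameter stands in for `spark.sql.ansi.enabled`):

```python
def elt(n, *inputs, ansi_enabled=False):
    """Return the n-th input (1-based), e.g. inputs[1] when n is 2.

    For an out-of-range n: return None when ANSI mode is off, raise
    when it is on, matching the doc text under review.
    """
    if 1 <= n <= len(inputs):
        return inputs[n - 1]
    if ansi_enabled:
        raise IndexError(f"Invalid index: {n}, numElements: {len(inputs)}")
    return None


assert elt(2, "scala", "java") == "java"   # in range
assert elt(4, "scala", "java") is None     # out of range, ANSI off
```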








[GitHub] [spark] maropu commented on a change in pull request #30297: [SPARK-33386][SQL] Accessing array elements should failed if index is out of bound.

2020-11-10 Thread GitBox


maropu commented on a change in pull request #30297:
URL: https://github.com/apache/spark/pull/30297#discussion_r521166711



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##
@@ -1906,8 +1906,8 @@ case class ArrayPosition(left: Expression, right: 
Expression)
 @ExpressionDescription(
   usage = """
 _FUNC_(array, index) - Returns element of array at given (1-based) index. 
If index < 0,
-  accesses elements from the last to the first. Returns NULL if the index 
exceeds the length
-  of the array.
+  accesses elements from the last to the first. If the index exceeds the 
length of the array,
+  Returns NULL if ANSI mode is off; Throws ArrayIndexOutOfBoundsException 
when ANSI mode is on.

Review comment:
   How about rewriting it like this?
   ```
The function returns NULL if the index exceeds the length of the array and 
`spark.sql.ansi.enabled` is set to false. If `spark.sql.ansi.enabled` is set to 
true, it throws ArrayIndexOutOfBoundsException for invalid indices.
   ```
   by referring to the `Size` usage:
   
https://github.com/apache/spark/blob/8760032f4f7e1ef36fee6afc45923d3826ef14fc/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala#L79-L82

##
File path: docs/sql-ref-ansi-compliance.md
##
@@ -111,6 +111,13 @@ SELECT * FROM t;
 
 The behavior of some SQL functions can be different under ANSI mode 
(`spark.sql.ansi.enabled=true`).
   - `size`: This function returns null for null input under ANSI mode.

Review comment:
   (This is not related to this PR though) could you remove `under ANSI 
mode` in this statement, too?
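The `element_at` behavior that the suggested wording above describes can be sketched in plain Python (illustrative only: 1-based indexing with negative indices counting from the end, and `ansi_enabled` standing in for `spark.sql.ansi.enabled`):

```python
def element_at(arr, index, ansi_enabled=False):
    """Return the element of arr at the given 1-based index.

    Negative indices count from the end. For an index past either end:
    return None when ANSI mode is off, raise when it is on.
    """
    if index == 0:
        raise ValueError("SQL array indices start at 1")
    pos = index - 1 if index > 0 else len(arr) + index
    if 0 <= pos < len(arr):
        return arr[pos]
    if ansi_enabled:
        raise IndexError(f"Invalid index: {index}, numElements: {len(arr)}")
    return None


assert element_at([10, 20, 30], 2) == 20    # 1-based
assert element_at([10, 20, 30], -1) == 30   # counts from the end
assert element_at([10, 20, 30], 5) is None  # out of bounds, ANSI off
```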








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-10 Thread GitBox


AmplabJenkins removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725257940










[GitHub] [spark] AmplabJenkins commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-10 Thread GitBox


AmplabJenkins commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725257940










[GitHub] [spark] SparkQA commented on pull request #30309: [WIP][SPARK-33407][PYTHON] Simplify the exception message from Python UDFs

2020-11-10 Thread GitBox


SparkQA commented on pull request #30309:
URL: https://github.com/apache/spark/pull/30309#issuecomment-725257686


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35514/
   






[GitHub] [spark] beliefer edited a comment on pull request #30292: [SPARK-33166][DOC] Provide Search Function in Spark docs site

2020-11-10 Thread GitBox


beliefer edited a comment on pull request #30292:
URL: https://github.com/apache/spark/pull/30292#issuecomment-725256974


   @sarutak Because this PR is not merged and not yet published to the online Spark documentation.






[GitHub] [spark] SparkQA removed a comment on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-10 Thread GitBox


SparkQA removed a comment on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725094540


   **[Test build #130900 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130900/testReport)**
 for PR 29999 at commit 
[`15bac5b`](https://github.com/apache/spark/commit/15bac5bfecb209ba7b6963d83423b659fbc5086d).






[GitHub] [spark] beliefer commented on pull request #30292: [SPARK-33166][DOC] Provide Search Function in Spark docs site

2020-11-10 Thread GitBox


beliefer commented on pull request #30292:
URL: https://github.com/apache/spark/pull/30292#issuecomment-725256974


   @sarutak Because this PR has not been merged and published to the online Spark documentation.






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-10 Thread GitBox


SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725256865


   **[Test build #130900 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130900/testReport)**
 for PR 29999 at commit 
[`15bac5b`](https://github.com/apache/spark/commit/15bac5bfecb209ba7b6963d83423b659fbc5086d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] SparkQA commented on pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-10 Thread GitBox


SparkQA commented on pull request #29999:
URL: https://github.com/apache/spark/pull/29999#issuecomment-725255595


   **[Test build #130915 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130915/testReport)**
 for PR 29999 at commit 
[`d039c33`](https://github.com/apache/spark/commit/d039c33de33ea4bab4cea3170925c0c4f92ca771).






[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-10 Thread GitBox


beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r521164688



##
File path: sql/core/src/test/resources/sql-functions/sql-expression-schema.md
##
@@ -346,4 +346,4 @@
 | org.apache.spark.sql.catalyst.expressions.xml.XPathList | xpath | SELECT xpath('<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>','a/b/text()') | struct<xpath(<a><b>b1</b><b>b2</b><b>b3</b><c>c1</c><c>c2</c></a>, a/b/text()):array<string>> |
 | org.apache.spark.sql.catalyst.expressions.xml.XPathLong | xpath_long | SELECT xpath_long('<a><b>1</b><b>2</b></a>', 'sum(a/b)') | struct<xpath_long(<a><b>1</b><b>2</b></a>, sum(a/b)):bigint> |
 | org.apache.spark.sql.catalyst.expressions.xml.XPathShort | xpath_short | SELECT xpath_short('<a><b>1</b><b>2</b></a>', 'sum(a/b)') | struct<xpath_short(<a><b>1</b><b>2</b></a>, sum(a/b)):smallint> |
-| org.apache.spark.sql.catalyst.expressions.xml.XPathString | xpath_string | SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') | struct<xpath_string(<a><b>b</b><c>cc</c></a>, a/c):string> |
\ No newline at end of file
+| org.apache.spark.sql.catalyst.expressions.xml.XPathString | xpath_string | SELECT xpath_string('<a><b>b</b><c>cc</c></a>','a/c') | struct<xpath_string(<a><b>b</b><c>cc</c></a>, a/c):string> |

Review comment:
   I tried reverting it.








[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-10 Thread GitBox


beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r521163422



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##
@@ -178,6 +180,86 @@ case class Like(left: Expression, right: Expression, 
escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes 
with NullIntolerant {
+
+  protected def patterns: Seq[Any]
+
+  protected def isNotDefined: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+.map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+if (hasNull) {
+  null

Review comment:
   Thanks!
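For context, the `eval` shown in the diff returns null as soon as any pattern literal is null, which matches SQL's three-valued logic for `LIKE ALL`, and the non-null patterns are compiled once into a cache. A minimal Python sketch of that structure (using `None` for SQL NULL; `compile_like` is an illustrative stand-in for `StringUtils.escapeLikeRegex`, not a Spark API):

```python
import re

def compile_like(pattern):
    # Roughly what escapeLikeRegex does: '%' -> '.*', '_' -> '.', rest literal.
    return re.compile("".join(
        ".*" if c == "%" else "." if c == "_" else re.escape(c)
        for c in pattern))

def like_all_eval(value, patterns):
    # Mirrors LikeAllBase.eval: a NULL input or any NULL pattern yields NULL;
    # the non-null patterns are compiled once, like the `cache` val above.
    if value is None or any(p is None for p in patterns):
        return None
    cache = [compile_like(p) for p in patterns if p is not None]
    return all(r.fullmatch(value) is not None for r in cache)

print(like_all_eval("spark", ["s%", None]))   # None (SQL NULL)
print(like_all_eval("spark", ["s%", "%k"]))   # True
print(like_all_eval("spark", ["s%", "x%"]))   # False
```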








[GitHub] [spark] SparkQA commented on pull request #30327: [WIP] Test

2020-11-10 Thread GitBox


SparkQA commented on pull request #30327:
URL: https://github.com/apache/spark/pull/30327#issuecomment-725252621


   **[Test build #130914 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130914/testReport)**
 for PR 30327 at commit 
[`0e6f09f`](https://github.com/apache/spark/commit/0e6f09f9b7a7984c93e80269aace51f66a3662b3).






[GitHub] [spark] SparkQA commented on pull request #30325: [SPARK-33416][SQL] Avoid Hive metastore stack overflow when InSet predicate have many values

2020-11-10 Thread GitBox


SparkQA commented on pull request #30325:
URL: https://github.com/apache/spark/pull/30325#issuecomment-725252148


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35513/
   






[GitHub] [spark] HyukjinKwon commented on pull request #30318: [SPARK-33412][SQL] OverwriteByExpression should resolve its delete condition based on the table relation not the input query

2020-11-10 Thread GitBox


HyukjinKwon commented on pull request #30318:
URL: https://github.com/apache/spark/pull/30318#issuecomment-725250995


   Merged to master. It has a conflict with branch-3.0. @cloud-fan mind opening a 
PR for branch-3.0?






[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-10 Thread GitBox


beliefer commented on a change in pull request #29999:
URL: https://github.com/apache/spark/pull/29999#discussion_r521160412



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala
##
@@ -1408,7 +1408,20 @@ class AstBuilder(conf: SQLConf) extends 
SqlBaseBaseVisitor[AnyRef] with Logging
   case Some(SqlBaseParser.ANY) | Some(SqlBaseParser.SOME) =>
 getLikeQuantifierExprs(ctx.expression).reduceLeft(Or)
   case Some(SqlBaseParser.ALL) =>
-getLikeQuantifierExprs(ctx.expression).reduceLeft(And)
+validate(!ctx.expression.isEmpty, "Expected something between '(' 
and ')'.", ctx)
+val expressions = ctx.expression.asScala.map(expression)
+if (expressions.size > 200 && expressions.forall(_.foldable)) {

Review comment:
   OK
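The parser branch in the diff expands `x LIKE ALL (p1, p2, ...)` by left-folding one `Like` test per pattern with `And` (and with `Or` for `LIKE ANY`/`SOME`), switching to the single optimized expression once the pattern list grows large and foldable, since a deeply nested `And` tree is what triggered the StackOverflowError. A rough Python sketch of the two folds (`sql_like` is an illustrative matcher, not a Spark API):

```python
import re
from functools import reduce

def sql_like(value, pattern):
    # Illustrative LIKE matcher: '%' -> '.*', '_' -> '.', rest literal.
    regex = "".join(".*" if c == "%" else "." if c == "_" else re.escape(c)
                    for c in pattern)
    return re.fullmatch(regex, value) is not None

def like_all(value, patterns):
    # reduceLeft(And): conjunction of one LIKE test per pattern.
    return reduce(lambda acc, p: acc and sql_like(value, p), patterns, True)

def like_any(value, patterns):
    # reduceLeft(Or): disjunction of one LIKE test per pattern.
    return reduce(lambda acc, p: acc or sql_like(value, p), patterns, False)

print(like_all("spark", ["s%", "%k", "%par%"]))  # True
print(like_any("spark", ["x%", "%k"]))           # True
print(like_all("spark", ["s%", "x%"]))           # False
```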








[GitHub] [spark] HyukjinKwon closed pull request #30318: [SPARK-33412][SQL] OverwriteByExpression should resolve its delete condition based on the table relation not the input query

2020-11-10 Thread GitBox


HyukjinKwon closed pull request #30318:
URL: https://github.com/apache/spark/pull/30318


   






[GitHub] [spark] SparkQA commented on pull request #30327: [WIP] Test

2020-11-10 Thread GitBox


SparkQA commented on pull request #30327:
URL: https://github.com/apache/spark/pull/30327#issuecomment-725249562


   **[Test build #130912 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130912/testReport)**
 for PR 30327 at commit 
[`52602f4`](https://github.com/apache/spark/commit/52602f43ef3b80630375937f970932948527a7ff).






[GitHub] [spark] SparkQA commented on pull request #30299: [SPARK-33389][SQL] Make internal classes of SparkSession always using active SQLConf

2020-11-10 Thread GitBox


SparkQA commented on pull request #30299:
URL: https://github.com/apache/spark/pull/30299#issuecomment-725249564


   **[Test build #130913 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130913/testReport)**
 for PR 30299 at commit 
[`7047924`](https://github.com/apache/spark/commit/70479247efe6a6b4b3e4d653281ce3a4ea8c5224).






[GitHub] [spark] WeichenXu123 opened a new pull request #30327: [WIP] Test

2020-11-10 Thread GitBox


WeichenXu123 opened a new pull request #30327:
URL: https://github.com/apache/spark/pull/30327


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   






[GitHub] [spark] SparkQA commented on pull request #30297: [SPARK-33386][SQL] Accessing array elements should failed if index is out of bound.

2020-11-10 Thread GitBox


SparkQA commented on pull request #30297:
URL: https://github.com/apache/spark/pull/30297#issuecomment-725246332


   **[Test build #130911 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130911/testReport)**
 for PR 30297 at commit 
[`6109838`](https://github.com/apache/spark/commit/610983835626ae5afb7bc7fd6ec4efa0aec9f548).






[GitHub] [spark] SparkQA commented on pull request #30309: [WIP][SPARK-33407][PYTHON] Simplify the exception message from Python UDFs

2020-11-10 Thread GitBox


SparkQA commented on pull request #30309:
URL: https://github.com/apache/spark/pull/30309#issuecomment-725246240


   **[Test build #130910 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130910/testReport)**
 for PR 30309 at commit 
[`2c7b4af`](https://github.com/apache/spark/commit/2c7b4af57adbe627fd8d9322a13702dded136daa).






[GitHub] [spark] HyukjinKwon commented on a change in pull request #30309: [WIP][SPARK-33407][PYTHON] Simplify the exception message from Python UDFs

2020-11-10 Thread GitBox


HyukjinKwon commented on a change in pull request #30309:
URL: https://github.com/apache/spark/pull/30309#discussion_r521155393



##
File path: python/pyspark/worker.py
##
@@ -604,17 +604,19 @@ def process():
 # reuse.
 TaskContext._setTaskContext(None)
 BarrierTaskContext._setTaskContext(None)
-except BaseException:
+except BaseException as e:
 try:
-exc_info = traceback.format_exc()
-if isinstance(exc_info, bytes):
-# exc_info may contains other encoding bytes, replace the 
invalid bytes and convert
-# it back to utf-8 again
-exc_info = exc_info.decode("utf-8", "replace").encode("utf-8")

Review comment:
   We dropped Python 2. It doesn't need it anymore.
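The deleted branch existed because on Python 2 the result of `traceback.format_exc()` was a byte string that could carry arbitrary encodings; on Python 3 it always returns `str`, so the decode/replace/encode round-trip is dead code. A quick check:

```python
import traceback

try:
    raise ValueError("boom \u00e9")  # non-ASCII detail in the message
except ValueError:
    exc_info = traceback.format_exc()

# On Python 3, format_exc always returns str, never bytes, so no
# decode("utf-8", "replace") round-trip is needed.
print(type(exc_info) is str)            # True
print("ValueError: boom" in exc_info)   # True
```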








[GitHub] [spark] cloud-fan commented on a change in pull request #30297: [SPARK-33386][SQL] Accessing array elements should failed if index is out of bound.

2020-11-10 Thread GitBox


cloud-fan commented on a change in pull request #30297:
URL: https://github.com/apache/spark/pull/30297#discussion_r521152883



##
File path: docs/sql-ref-ansi-compliance.md
##
@@ -111,6 +111,13 @@ SELECT * FROM t;
 
 The behavior of some SQL functions can be different under ANSI mode 
(`spark.sql.ansi.enabled=true`).
   - `size`: This function returns null for null input under ANSI mode.
+  - `element_at`: This function throws `ArrayIndexOutOfBoundsException` if 
using invalid indices. 
+  - `elt`: This function throws `ArrayIndexOutOfBoundsException` if using 
invalid indices.
+
+### SQL Operators
+
+The behavior of some SQL operators can be different under ANSI mode 
(`spark.sql.ansi.enabled=true`).
+  - `GetArrayItem`: This operator throws `ArrayIndexOutOfBoundsException` if 
using invalid indices.

Review comment:
   to be more user facing, `GetArrayItem` -> `array_col[index]`
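For context, `element_at` uses 1-based indexing (negative indices count back from the end), and the PR makes it, `elt`, and `array_col[index]` throw on out-of-range indices under ANSI mode instead of returning null. A plain-Python sketch of the two modes (function name, flag, and messages are illustrative, not Spark APIs):

```python
def element_at(arr, index, ansi_enabled=False):
    # SQL element_at is 1-based; negative indices count back from the end.
    if index == 0:
        raise ValueError("SQL array indices start at 1")
    pos = index - 1 if index > 0 else len(arr) + index
    if 0 <= pos < len(arr):
        return arr[pos]
    if ansi_enabled:
        # ANSI mode: fail loudly on an out-of-bounds access.
        raise IndexError(f"Invalid index: {index}, numElements: {len(arr)}")
    return None  # legacy behavior: NULL for out-of-bounds access

print(element_at([10, 20, 30], 3))    # 30
print(element_at([10, 20, 30], -1))   # 30
print(element_at([10, 20, 30], 5))    # None (non-ANSI mode)
```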








[GitHub] [spark] leanken commented on pull request #30297: [SPARK-33386][SQL] Accessing array elements should failed if index is out of bound.

2020-11-10 Thread GitBox


leanken commented on pull request #30297:
URL: https://github.com/apache/spark/pull/30297#issuecomment-725240046


   updated, @cloud-fan and @maropu 






[GitHub] [spark] SparkQA commented on pull request #30297: [SPARK-33386][SQL] Accessing array elements should failed if index is out of bound.

2020-11-10 Thread GitBox


SparkQA commented on pull request #30297:
URL: https://github.com/apache/spark/pull/30297#issuecomment-725240297


   **[Test build #130909 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130909/testReport)**
 for PR 30297 at commit 
[`f8cfa5b`](https://github.com/apache/spark/commit/f8cfa5be4573a495e5fd281fd665a1c140be4c0f).






[GitHub] [spark] cloud-fan commented on a change in pull request #30297: [SPARK-33386][SQL] Accessing array elements should failed if index is out of bound.

2020-11-10 Thread GitBox


cloud-fan commented on a change in pull request #30297:
URL: https://github.com/apache/spark/pull/30297#discussion_r521150251



##
File path: docs/sql-ref-ansi-compliance.md
##
@@ -111,6 +111,8 @@ SELECT * FROM t;
 
 The behavior of some SQL functions can be different under ANSI mode 
(`spark.sql.ansi.enabled=true`).
   - `size`: This function returns null for null input under ANSI mode.
+  - `element_at`: This function throws `ArrayIndexOutOfBoundsException` if 
using invalid indices under ANSI mode.
+  - `elt`: This function throws `ArrayIndexOutOfBoundsException` if using 
invalid indices under ANSI mode.

Review comment:
   +1








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29339: [SPARK-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-11-10 Thread GitBox


AmplabJenkins removed a comment on pull request #29339:
URL: https://github.com/apache/spark/pull/29339#issuecomment-725238424


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/35512/
   Test FAILed.






[GitHub] [spark] AmplabJenkins commented on pull request #29339: [SPARK-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-11-10 Thread GitBox


AmplabJenkins commented on pull request #29339:
URL: https://github.com/apache/spark/pull/29339#issuecomment-725238415










[GitHub] [spark] AmplabJenkins removed a comment on pull request #29339: [SPARK-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-11-10 Thread GitBox


AmplabJenkins removed a comment on pull request #29339:
URL: https://github.com/apache/spark/pull/29339#issuecomment-725238415


   Merged build finished. Test FAILed.






[GitHub] [spark] SparkQA commented on pull request #29339: [SPARK-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-11-10 Thread GitBox


SparkQA commented on pull request #29339:
URL: https://github.com/apache/spark/pull/29339#issuecomment-725238388


   Kubernetes integration test status failure
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35512/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #27429: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression

2020-11-10 Thread GitBox


AmplabJenkins removed a comment on pull request #27429:
URL: https://github.com/apache/spark/pull/27429#issuecomment-725237671










[GitHub] [spark] maropu commented on a change in pull request #30297: [SPARK-33386][SQL] Accessing array elements should failed if index is out of bound.

2020-11-10 Thread GitBox


maropu commented on a change in pull request #30297:
URL: https://github.com/apache/spark/pull/30297#discussion_r521149081



##
File path: docs/sql-ref-ansi-compliance.md
##
@@ -111,6 +111,8 @@ SELECT * FROM t;
 
 The behavior of some SQL functions can be different under ANSI mode 
(`spark.sql.ansi.enabled=true`).
   - `size`: This function returns null for null input under ANSI mode.
+  - `element_at`: This function throws `ArrayIndexOutOfBoundsException` if 
using invalid indices under ANSI mode.
+  - `elt`: This function throws `ArrayIndexOutOfBoundsException` if using 
invalid indices under ANSI mode.

Review comment:
   nit: how about removing "`under ANSI mode`" in each entry? They look 
redundant.








[GitHub] [spark] AmplabJenkins commented on pull request #27429: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression

2020-11-10 Thread GitBox


AmplabJenkins commented on pull request #27429:
URL: https://github.com/apache/spark/pull/27429#issuecomment-725237671










[GitHub] [spark] SparkQA commented on pull request #30309: [WIP][SPARK-33407][PYTHON] Simplify the exception message from Python UDFs

2020-11-10 Thread GitBox


SparkQA commented on pull request #30309:
URL: https://github.com/apache/spark/pull/30309#issuecomment-725237233


   **[Test build #130908 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130908/testReport)**
 for PR 30309 at commit 
[`6b74f5f`](https://github.com/apache/spark/commit/6b74f5f04b44c78708ffbd26470316303bb657ef).






[GitHub] [spark] SparkQA removed a comment on pull request #27429: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression

2020-11-10 Thread GitBox


SparkQA removed a comment on pull request #27429:
URL: https://github.com/apache/spark/pull/27429#issuecomment-725082859


   **[Test build #130898 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130898/testReport)**
 for PR 27429 at commit 
[`bbe10d6`](https://github.com/apache/spark/commit/bbe10d61cd3845ba0d0a031dfacde1a2861df8db).






[GitHub] [spark] maropu commented on a change in pull request #30297: [SPARK-33386][SQL] Accessing array elements should failed if index is out of bound.

2020-11-10 Thread GitBox


maropu commented on a change in pull request #30297:
URL: https://github.com/apache/spark/pull/30297#discussion_r521147000



##
File path: docs/sql-ref-ansi-compliance.md
##
@@ -111,6 +111,8 @@ SELECT * FROM t;
 
 The behavior of some SQL functions can be different under ANSI mode 
(`spark.sql.ansi.enabled=true`).
   - `size`: This function returns null for null input under ANSI mode.
+  - `element_at`: This function throws `ArrayIndexOutOfBoundsException` if 
using invalid indices under ANSI mode.
+  - `elt`: This function throws `ArrayIndexOutOfBoundsException` if using 
invalid indices under ANSI mode.

Review comment:
   I think it's better to describe the behaviour change of `GetArrayItem` 
too, so how about creating a new subsection for it like `Other SQL Operations`?








[GitHub] [spark] SparkQA commented on pull request #27429: [SPARK-28330][SQL] Support ANSI SQL: result offset clause in query expression

2020-11-10 Thread GitBox


SparkQA commented on pull request #27429:
URL: https://github.com/apache/spark/pull/27429#issuecomment-725236676


   **[Test build #130898 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130898/testReport)**
 for PR 27429 at commit 
[`bbe10d6`](https://github.com/apache/spark/commit/bbe10d61cd3845ba0d0a031dfacde1a2861df8db).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.






[GitHub] [spark] luluorta commented on a change in pull request #30299: [SPARK-33389][SQL] Make internal classes of SparkSession always using active SQLConf

2020-11-10 Thread GitBox


luluorta commented on a change in pull request #30299:
URL: https://github.com/apache/spark/pull/30299#discussion_r521143561



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/BooleanSimplificationSuite.scala
##
@@ -188,27 +188,29 @@ class BooleanSimplificationSuite extends PlanTest with 
ExpressionEvalHelper with
 checkCondition(!(('e || 'f) && ('g || 'h)), (!'e && !'f) || (!'g && !'h))
   }
 
-  private val caseInsensitiveConf = new SQLConf().copy(SQLConf.CASE_SENSITIVE 
-> false)
-  private val caseInsensitiveAnalyzer = new Analyzer(
-new SessionCatalog(new InMemoryCatalog, EmptyFunctionRegistry, 
caseInsensitiveConf),
-caseInsensitiveConf)
+  private val analyzer = new Analyzer(
+new SessionCatalog(new InMemoryCatalog, EmptyFunctionRegistry))
 
   test("(a && b) || (a && c) => a && (b || c) when case insensitive") {
-val plan = caseInsensitiveAnalyzer.execute(
-  testRelation.where(('a > 2 && 'b > 3) || ('A > 2 && 'b < 5)))
-val actual = Optimize.execute(plan)
-val expected = caseInsensitiveAnalyzer.execute(
-  testRelation.where('a > 2 && ('b > 3 || 'b < 5)))
-comparePlans(actual, expected)
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  val plan = analyzer.execute(
+testRelation.where(('a > 2 && 'b > 3) || ('A > 2 && 'b < 5)))
+  val actual = Optimize.execute(plan)
+  val expected = analyzer.execute(
+testRelation.where('a > 2 && ('b > 3 || 'b < 5)))
+  comparePlans(actual, expected)
+}
   }
 
   test("(a || b) && (a || c) => a || (b && c) when case insensitive") {
-val plan = caseInsensitiveAnalyzer.execute(
-  testRelation.where(('a > 2 || 'b > 3) && ('A > 2 || 'b < 5)))
-val actual = Optimize.execute(plan)
-val expected = caseInsensitiveAnalyzer.execute(
-  testRelation.where('a > 2 || ('b > 3 && 'b < 5)))
-comparePlans(actual, expected)
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {

Review comment:
   reverted

##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/HiveTableScanExec.scala
##
@@ -60,8 +59,6 @@ case class HiveTableScanExec(
   require(partitionPruningPred.isEmpty || relation.isPartitioned,
 "Partition pruning predicates only supported for partitioned tables.")
 
-  override def conf: SQLConf = sparkSession.sessionState.conf

Review comment:
   reverted
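
[Editor's note] For context on the pattern discussed in the hunks above: `withSQLConf` temporarily applies conf overrides for the body of a test and restores the previous values afterwards. A minimal analogue of that save-and-restore behavior, sketched as a Python context manager over a plain dict (the names here are illustrative, not Spark's API):

```python
from contextlib import contextmanager

# toy stand-in for a mutable session conf
conf = {}

@contextmanager
def with_sql_conf(pairs):
    # save the current values, apply the overrides, and restore the
    # originals on exit, even if the body raises
    saved = {key: conf.get(key) for key in pairs}
    conf.update(pairs)
    try:
        yield
    finally:
        for key, old in saved.items():
            if old is None:
                conf.pop(key, None)
            else:
                conf[key] = old
```

Inside the `with` block the override is visible; on exit the conf returns to its prior state, which is why a test written this way can share one analyzer that reads the active conf.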








[GitHub] [spark] luluorta commented on a change in pull request #30299: [SPARK-33389][SQL] Make internal classes of SparkSession always using active SQLConf

2020-11-10 Thread GitBox


luluorta commented on a change in pull request #30299:
URL: https://github.com/apache/spark/pull/30299#discussion_r521143263



##
File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/BooleanSimplificationSuite.scala
##
@@ -188,27 +188,29 @@ class BooleanSimplificationSuite extends PlanTest with 
ExpressionEvalHelper with
 checkCondition(!(('e || 'f) && ('g || 'h)), (!'e && !'f) || (!'g && !'h))
   }
 
-  private val caseInsensitiveConf = new SQLConf().copy(SQLConf.CASE_SENSITIVE 
-> false)
-  private val caseInsensitiveAnalyzer = new Analyzer(
-new SessionCatalog(new InMemoryCatalog, EmptyFunctionRegistry, 
caseInsensitiveConf),
-caseInsensitiveConf)
+  private val analyzer = new Analyzer(
+new SessionCatalog(new InMemoryCatalog, EmptyFunctionRegistry))
 
   test("(a && b) || (a && c) => a && (b || c) when case insensitive") {
-val plan = caseInsensitiveAnalyzer.execute(
-  testRelation.where(('a > 2 && 'b > 3) || ('A > 2 && 'b < 5)))
-val actual = Optimize.execute(plan)
-val expected = caseInsensitiveAnalyzer.execute(
-  testRelation.where('a > 2 && ('b > 3 || 'b < 5)))
-comparePlans(actual, expected)
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {

Review comment:
   reverted








[GitHub] [spark] luluorta commented on a change in pull request #30299: [SPARK-33389][SQL] Make internal classes of SparkSession always using active SQLConf

2020-11-10 Thread GitBox


luluorta commented on a change in pull request #30299:
URL: https://github.com/apache/spark/pull/30299#discussion_r521142212



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala
##
@@ -42,7 +42,7 @@ import org.apache.spark.sql.catalyst.trees.TreeNodeRef
 import org.apache.spark.sql.catalyst.util.toPrettySQL
 import org.apache.spark.sql.connector.catalog._
 import org.apache.spark.sql.connector.catalog.CatalogV2Implicits._
-import org.apache.spark.sql.connector.catalog.TableChange.{AddColumn, After, 
ColumnChange, ColumnPosition, DeleteColumn, RenameColumn, UpdateColumnComment, 
UpdateColumnNullability, UpdateColumnPosition, UpdateColumnType}
+import org.apache.spark.sql.connector.catalog.TableChange.{First => _, _}

Review comment:
   reverted








[GitHub] [spark] luluorta commented on a change in pull request #30299: [SPARK-33389][SQL] Make internal classes of SparkSession always using active SQLConf

2020-11-10 Thread GitBox


luluorta commented on a change in pull request #30299:
URL: https://github.com/apache/spark/pull/30299#discussion_r521142283



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala
##
@@ -61,34 +61,38 @@ class SessionCatalog(
 externalCatalogBuilder: () => ExternalCatalog,
 globalTempViewManagerBuilder: () => GlobalTempViewManager,
 functionRegistry: FunctionRegistry,
-conf: SQLConf,
 hadoopConf: Configuration,
 parser: ParserInterface,
-functionResourceLoader: FunctionResourceLoader) extends Logging {
+functionResourceLoader: FunctionResourceLoader,
+cacheSize: Int = SQLConf.get.tableRelationCacheSize,
+cacheTTL: Long = SQLConf.get.metadataCacheTTL) extends Logging with 
HasConf {
   import SessionCatalog._
   import CatalogTypes.TablePartitionSpec
 
   // For testing only.
   def this(
   externalCatalog: ExternalCatalog,
   functionRegistry: FunctionRegistry,
-  conf: SQLConf) = {
+  staticConf: SQLConf) = {

Review comment:
   reverted








[GitHub] [spark] luluorta commented on a change in pull request #30299: [SPARK-33389][SQL] Make internal classes of SparkSession always using active SQLConf

2020-11-10 Thread GitBox


luluorta commented on a change in pull request #30299:
URL: https://github.com/apache/spark/pull/30299#discussion_r521141904



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/HasConf.scala
##
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst
+
+import org.apache.spark.sql.internal.SQLConf
+
+/**
+ * Trait for shared SQLConf.

Review comment:
   done








[GitHub] [spark] SparkQA commented on pull request #30325: [SPARK-33416][SQL] Avoid Hive metastore stack overflow when InSet predicate have many values

2020-11-10 Thread GitBox


SparkQA commented on pull request #30325:
URL: https://github.com/apache/spark/pull/30325#issuecomment-725231145


   **[Test build #130907 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130907/testReport)**
 for PR 30325 at commit 
[`f9d018c`](https://github.com/apache/spark/commit/f9d018c28a860c65c5ba47bd28f2a23a3b5d2be3).






[GitHub] [spark] luluorta commented on a change in pull request #30299: [SPARK-33389][SQL] Make internal classes of SparkSession always using active SQLConf

2020-11-10 Thread GitBox


luluorta commented on a change in pull request #30299:
URL: https://github.com/apache/spark/pull/30299#discussion_r521134961



##
File path: sql/core/src/test/resources/sql-tests/inputs/datetime.sql
##
@@ -149,7 +149,7 @@ select to_timestamp('2019-10-06 A', '-MM-dd G');
 select to_timestamp('22 05 2020 Friday', 'dd MM  EE');
 select to_timestamp('22 05 2020 Friday', 'dd MM  E');
 select unix_timestamp('22 05 2020 Friday', 'dd MM  E');
-select from_json('{"time":"26/October/2015"}', 'time Timestamp', 
map('timestampFormat', 'dd/M/'));
+select from_json('{"timestamp":"26/October/2015"}', 'timestamp Timestamp', 
map('timestampFormat', 'dd/M/'));

Review comment:
   After this PR, dynamically setting "spark.sql.ansi.enabled" actually takes 
effect in the parsing phase. This query will fail to parse because `time` is a 
reserved keyword in the SQL standard.
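
[Editor's note] A toy illustration of the behavior described above: under ANSI mode a reserved word is rejected as a field name, so the schema string has to use a non-reserved identifier. The reserved-word subset and helper below are hypothetical, not Spark's parser:

```python
# illustrative subset of ANSI reserved words; Spark's real list is much larger
ANSI_RESERVED = {"time", "date", "table", "user"}

def check_field_name(name, ansi_enabled):
    # reject reserved identifiers only when ANSI mode is on
    if ansi_enabled and name.lower() in ANSI_RESERVED:
        raise ValueError(f"'{name}' is a reserved keyword under ANSI mode")
    return name
```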








[GitHub] [spark] SparkQA commented on pull request #29339: [SPARK-32512][SQL] add alter table add/drop partition command for datasourcev2

2020-11-10 Thread GitBox


SparkQA commented on pull request #29339:
URL: https://github.com/apache/spark/pull/29339#issuecomment-725228809


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35512/
   






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30326: [MINOR][GRAPHX] Correct typos

2020-11-10 Thread GitBox


AmplabJenkins removed a comment on pull request #30326:
URL: https://github.com/apache/spark/pull/30326#issuecomment-725228634










[GitHub] [spark] SparkQA commented on pull request #30326: [MINOR][GRAPHX] Correct typos

2020-11-10 Thread GitBox


SparkQA commented on pull request #30326:
URL: https://github.com/apache/spark/pull/30326#issuecomment-725228620


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35511/
   






[GitHub] [spark] AmplabJenkins commented on pull request #30326: [MINOR][GRAPHX] Correct typos

2020-11-10 Thread GitBox


AmplabJenkins commented on pull request #30326:
URL: https://github.com/apache/spark/pull/30326#issuecomment-725228634










[GitHub] [spark] AmplabJenkins removed a comment on pull request #30315: [SPARK-33388][SQL] Merge In and InSet predicate

2020-11-10 Thread GitBox


AmplabJenkins removed a comment on pull request #30315:
URL: https://github.com/apache/spark/pull/30315#issuecomment-725228239










[GitHub] [spark] AmplabJenkins commented on pull request #30315: [SPARK-33388][SQL] Merge In and InSet predicate

2020-11-10 Thread GitBox


AmplabJenkins commented on pull request #30315:
URL: https://github.com/apache/spark/pull/30315#issuecomment-725228239










[GitHub] [spark] maropu commented on pull request #30324: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

2020-11-10 Thread GitBox


maropu commented on pull request #30324:
URL: https://github.com/apache/spark/pull/30324#issuecomment-725228254


   Thanks, @dongjoon-hyun @HyukjinKwon ! Merged to master/branch-3.0/branch-2.4.






[GitHub] [spark] SparkQA removed a comment on pull request #30315: [SPARK-33388][SQL] Merge In and InSet predicate

2020-11-10 Thread GitBox


SparkQA removed a comment on pull request #30315:
URL: https://github.com/apache/spark/pull/30315#issuecomment-725078337


   **[Test build #130897 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130897/testReport)**
 for PR 30315 at commit 
[`a3675d9`](https://github.com/apache/spark/commit/a3675d92941b0db08d2b7a36e63d3076f200797e).






[GitHub] [spark] maropu closed pull request #30324: [SPARK-33417][SQL][TEST] Correct the behaviour of query filters in TPCDSQueryBenchmark

2020-11-10 Thread GitBox


maropu closed pull request #30324:
URL: https://github.com/apache/spark/pull/30324


   






[GitHub] [spark] SparkQA commented on pull request #30315: [SPARK-33388][SQL] Merge In and InSet predicate

2020-11-10 Thread GitBox


SparkQA commented on pull request #30315:
URL: https://github.com/apache/spark/pull/30315#issuecomment-725227411


   **[Test build #130897 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/130897/testReport)**
 for PR 30315 at commit 
[`a3675d9`](https://github.com/apache/spark/commit/a3675d92941b0db08d2b7a36e63d3076f200797e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `trait BaseIn extends Predicate `
 * `case class In(value: Expression, override val list: Seq[Expression]) 
extends BaseIn `
 * `case class InSet(value: Expression, override val hset: Set[Any]) 
extends BaseIn `






[GitHub] [spark] AmplabJenkins removed a comment on pull request #30318: [SPARK-33412][SQL] OverwriteByExpression should resolve its delete condition based on the table relation not the input query

2020-11-10 Thread GitBox


AmplabJenkins removed a comment on pull request #30318:
URL: https://github.com/apache/spark/pull/30318#issuecomment-725226485










[GitHub] [spark] SparkQA commented on pull request #30318: [SPARK-33412][SQL] OverwriteByExpression should resolve its delete condition based on the table relation not the input query

2020-11-10 Thread GitBox


SparkQA commented on pull request #30318:
URL: https://github.com/apache/spark/pull/30318#issuecomment-725226468


   Kubernetes integration test status success
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35510/
   






[GitHub] [spark] AmplabJenkins commented on pull request #30318: [SPARK-33412][SQL] OverwriteByExpression should resolve its delete condition based on the table relation not the input query

2020-11-10 Thread GitBox


AmplabJenkins commented on pull request #30318:
URL: https://github.com/apache/spark/pull/30318#issuecomment-725226485










[GitHub] [spark] wangyum commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-10 Thread GitBox


wangyum commented on a change in pull request #2:
URL: https://github.com/apache/spark/pull/2#discussion_r521135433



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##
@@ -178,6 +180,86 @@ case class Like(left: Expression, right: Expression, 
escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes 
with NullIntolerant {
+
+  protected def patterns: Seq[Any]
+
+  protected def isNotDefined: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+.map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+if (hasNull) {
+  null

Review comment:
   ```sql
   spark-sql> select 'a' like all ('%a%', null);
   NULL
   spark-sql> select 'a' not like all ('%a%', null);
   false
   spark-sql> select 'a' like any ('%a%', null);
   true
   spark-sql> select 'a' not like any ('%a%', null);
   NULL
   ```
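
[Editor's note] The NULL behavior in these examples follows SQL's three-valued logic: `LIKE ALL` is a conjunction and `LIKE ANY` a disjunction over the per-pattern matches, where a NULL pattern yields NULL. A small Python model of that logic (helper names are illustrative, not Spark code):

```python
import re

def like(value, pattern):
    # value LIKE pattern; a NULL (None) operand yields NULL
    if value is None or pattern is None:
        return None
    # translate SQL wildcards: % -> .*, _ -> .
    regex = "".join(
        ".*" if ch == "%" else "." if ch == "_" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value) is not None

def tv_and(results):
    # three-valued AND: False dominates, then NULL, else True
    if any(r is False for r in results):
        return False
    return None if any(r is None for r in results) else True

def tv_or(results):
    # three-valued OR: True dominates, then NULL, else False
    if any(r is True for r in results):
        return True
    return None if any(r is None for r in results) else False

def like_all(value, patterns, negate=False):
    results = [like(value, p) for p in patterns]
    if negate:
        results = [None if r is None else not r for r in results]
    return tv_and(results)

def like_any(value, patterns, negate=False):
    results = [like(value, p) for p in patterns]
    if negate:
        results = [None if r is None else not r for r in results]
    return tv_or(results)
```

This reproduces the four spark-sql results above: the conjunction over a match and a NULL is NULL, its negated form is False, the disjunction is True, and the negated disjunction is NULL.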








[GitHub] [spark] maropu commented on a change in pull request #30325: [SPARK-33416][SQL] Avoid Hive metastore stack overflow when InSet predicate have many values

2020-11-10 Thread GitBox


maropu commented on a change in pull request #30325:
URL: https://github.com/apache/spark/pull/30325#discussion_r521126955



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
##
@@ -1267,9 +1267,19 @@ private[spark] class HiveExternalCatalog(conf: 
SparkConf, hadoopConf: Configurat
 val catalogTable = restoreTableMetadata(rawTable)
 
 val partColNameMap = buildLowerCasePartColNameMap(catalogTable)
+val hivePredicates = predicates.map {
+  // Avoid Hive metastore stack overflow.
+  case InSet(child, values)

Review comment:
   Ah, ok. Thanks.
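
[Editor's note] The hunk above special-cases large `InSet` predicates before they are turned into a metastore filter string. A rough model of that threshold-based split, with predicates as plain tuples (the shapes and names are assumptions, not the PR's actual code):

```python
def split_for_metastore(predicates, inset_threshold):
    # partition predicates into those pushed to the metastore and those
    # kept for client-side evaluation
    pushable, residual = [], []
    for kind, column, values in predicates:
        if kind == "inset" and len(values) > inset_threshold:
            # too many values: pushing this down can overflow the
            # metastore's recursive filter parser, so evaluate it locally
            residual.append((kind, column, values))
        else:
            pushable.append((kind, column, values))
    return pushable, residual
```

The threshold corresponds to the `HIVE_METASTORE_PARTITION_PRUNING_INSET_THRESHOLD` config discussed in this thread; whether the oversize `InSet` is dropped, rewritten, or evaluated client-side is the design point under review.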








[GitHub] [spark] maropu commented on a change in pull request #30325: [SPARK-33416][SQL] Avoid Hive metastore stack overflow when InSet predicate have many values

2020-11-10 Thread GitBox


maropu commented on a change in pull request #30325:
URL: https://github.com/apache/spark/pull/30325#discussion_r521135541



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/StaticSQLConf.scala
##
@@ -54,6 +54,15 @@ object StaticSQLConf {
 .transform(_.toLowerCase(Locale.ROOT))
 .createWithDefault("global_temp")
 
+  val HIVE_METASTORE_PARTITION_PRUNING_INSET_THRESHOLD =

Review comment:
   hm, I see. I think users might lower the value at runtime just after 
they see the exception, so IMO it is useful that users can update the value at 
runtime.








[GitHub] [spark] wangyum commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-10 Thread GitBox


wangyum commented on a change in pull request #2:
URL: https://github.com/apache/spark/pull/2#discussion_r521135433



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/regexpExpressions.scala
##
@@ -178,6 +180,86 @@ case class Like(left: Expression, right: Expression, 
escapeChar: Char)
   }
 }
 
+/**
+ * Optimized version of LIKE ALL, when all pattern values are literal.
+ */
+abstract class LikeAllBase extends UnaryExpression with ImplicitCastInputTypes 
with NullIntolerant {
+
+  protected def patterns: Seq[Any]
+
+  protected def isNotDefined: Boolean
+
+  override def inputTypes: Seq[DataType] = StringType :: Nil
+
+  override def dataType: DataType = BooleanType
+
+  override def nullable: Boolean = true
+
+  private lazy val hasNull: Boolean = patterns.contains(null)
+
+  private lazy val cache = patterns.filterNot(_ == null)
+.map(s => Pattern.compile(StringUtils.escapeLikeRegex(s.toString, '\\')))
+
+  override def eval(input: InternalRow): Any = {
+if (hasNull) {
+  null

Review comment:
   ```sql
   spark-sql> select 'a' like all ('%a%', null);
   NULL
   spark-sql> select 'a' not like all ('%a%', null);
   false
   ```








[GitHub] [spark] luluorta commented on a change in pull request #30299: [SPARK-33389][SQL] Make internal classes of SparkSession always using active SQLConf

2020-11-10 Thread GitBox


luluorta commented on a change in pull request #30299:
URL: https://github.com/apache/spark/pull/30299#discussion_r521134961



##
File path: sql/core/src/test/resources/sql-tests/inputs/datetime.sql
##
@@ -149,7 +149,7 @@ select to_timestamp('2019-10-06 A', '-MM-dd G');
 select to_timestamp('22 05 2020 Friday', 'dd MM  EE');
 select to_timestamp('22 05 2020 Friday', 'dd MM  E');
 select unix_timestamp('22 05 2020 Friday', 'dd MM  E');
-select from_json('{"time":"26/October/2015"}', 'time Timestamp', 
map('timestampFormat', 'dd/M/'));
+select from_json('{"timestamp":"26/October/2015"}', 'timestamp Timestamp', 
map('timestampFormat', 'dd/M/'));

Review comment:
   After this PR, dynamically setting "spark.sql.ansi.enabled" actually takes 
effect in the parsing phase. This query will fail to parse because `time` is a 
reserved keyword in the SQL standard.








[GitHub] [spark] beliefer commented on a change in pull request #29999: [SPARK-33045][SQL] Support build-in function like_all and fix StackOverflowError issue.

2020-11-10 Thread GitBox


beliefer commented on a change in pull request #2:
URL: https://github.com/apache/spark/pull/2#discussion_r521133744



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala
##
@@ -102,6 +102,8 @@ package object dsl {
 def like(other: Expression, escapeChar: Char = '\\'): Expression =
   Like(expr, other, escapeChar)
 def rlike(other: Expression): Expression = RLike(expr, other)
+def likeAll(others: Literal*): Expression = LikeAll(expr, 
others.map(_.eval(EmptyRow)))

Review comment:
   OK








[GitHub] [spark] maropu commented on a change in pull request #30225: [SPARK-33187][SQL] Add a check on the number of returned partitions in the HiveShim#getPartitionsByFilter method

2020-11-10 Thread GitBox


maropu commented on a change in pull request #30225:
URL: https://github.com/apache/spark/pull/30225#discussion_r521132554



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -815,6 +815,14 @@ object SQLConf {
   .booleanConf
   .createWithDefault(true)
 
+  val HIVE_METASTORE_PARTITION_LIMIT =
+buildConf("spark.sql.hive.metastorePartitionLimit")
+  .doc("The maximum number of metastore partitions allowed. Use -1 for 
unlimited.")

Review comment:
   The current description looks a bit ambiguous, so how about following the 
wording of the Hive config you pointed out?








[GitHub] [spark] maropu commented on a change in pull request #30225: [SPARK-33187][SQL] Add a check on the number of returned partitions in the HiveShim#getPartitionsByFilter method

2020-11-10 Thread GitBox


maropu commented on a change in pull request #30225:
URL: https://github.com/apache/spark/pull/30225#discussion_r521132100



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala
##
@@ -773,26 +773,7 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
 filters.flatMap(convert).mkString(" and ")
   }
 
-  private def quoteStringLiteral(str: String): String = {

Review comment:
   Why did you move this func?








[GitHub] [spark] SparkQA commented on pull request #30326: [MINOR][GRAPHX] Correct typos

2020-11-10 Thread GitBox


SparkQA commented on pull request #30326:
URL: https://github.com/apache/spark/pull/30326#issuecomment-725221115


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35511/
   






[GitHub] [spark] maropu commented on a change in pull request #30225: [SPARK-33187][SQL] Add a check on the number of returned partitions in the HiveShim#getPartitionsByFilter method

2020-11-10 Thread GitBox


maropu commented on a change in pull request #30225:
URL: https://github.com/apache/spark/pull/30225#discussion_r521131433



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala
##
@@ -1233,6 +1242,34 @@ private[client] class Shim_v2_1 extends Shim_v2_0 {
   override def alterPartitions(hive: Hive, tableName: String, newParts: 
JList[Partition]): Unit = {
 alterPartitionsMethod.invoke(hive, tableName, newParts, 
environmentContextInAlterTable)
   }
+
+  override def getPartitionsByFilter(
+  hive: Hive,
+  table: Table,
+  predicates: Seq[Expression]): Seq[Partition] = {
+
+// Hive getPartitionsByFilter() takes a string that represents partition
+// predicates like "str_key=\"value\" and int_key=1 ..."
+val filter = convertFilters(table, predicates)
+
+val limit = SQLConf.get.metastorePartitionLimit
+if (limit > -1) {
+  val num = try {
+getNumPartitionsByFilterMethod.invoke(hive, table, 
filter).asInstanceOf[Int]
+  } catch {
+case ex: Exception =>
+  logWarning("Caught Hive MetaException attempting to get partition 
metadata by " +

Review comment:
   Could you make the message clearer? e.g., ...get the number of 
partitions from , but  using 0 for a partition number
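
   The diff context above notes that Hive's `getPartitionsByFilter()` takes its
   predicates as a string like `str_key="value" and int_key=1`. As a rough
   illustration of that conversion (a hypothetical, much-simplified stand-in
   for Spark's actual `convertFilters`, which works on Catalyst `Expression`s
   and handles many more cases):

   ```scala
   // Hypothetical simplified predicate types; Spark's real implementation
   // converts Catalyst Expressions and rejects unsupported predicates.
   sealed trait PartitionPredicate
   case class EqString(key: String, value: String) extends PartitionPredicate
   case class EqInt(key: String, value: Int) extends PartitionPredicate

   // Render predicates in the string form Hive's metastore filter expects,
   // quoting string literals and joining conjuncts with "and".
   def toHiveFilter(preds: Seq[PartitionPredicate]): String =
     preds.map {
       case EqString(k, v) => s"""$k="$v""""
       case EqInt(k, v)    => s"$k=$v"
     }.mkString(" and ")

   // toHiveFilter(Seq(EqString("str_key", "value"), EqInt("int_key", 1)))
   // yields: str_key="value" and int_key=1
   ```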








[GitHub] [spark] maropu commented on a change in pull request #30225: [SPARK-33187][SQL] Add a check on the number of returned partitions in the HiveShim#getPartitionsByFilter method

2020-11-10 Thread GitBox


maropu commented on a change in pull request #30225:
URL: https://github.com/apache/spark/pull/30225#discussion_r521130875



##
File path: 
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala
##
@@ -1233,6 +1242,34 @@ private[client] class Shim_v2_1 extends Shim_v2_0 {
   override def alterPartitions(hive: Hive, tableName: String, newParts: 
JList[Partition]): Unit = {
 alterPartitionsMethod.invoke(hive, tableName, newParts, 
environmentContextInAlterTable)
   }
+
+  override def getPartitionsByFilter(
+  hive: Hive,
+  table: Table,
+  predicates: Seq[Expression]): Seq[Partition] = {
+
+// Hive getPartitionsByFilter() takes a string that represents partition
+// predicates like "str_key=\"value\" and int_key=1 ..."
+val filter = convertFilters(table, predicates)
+
+val limit = SQLConf.get.metastorePartitionLimit
+if (limit > -1) {
+  val num = try {
+getNumPartitionsByFilterMethod.invoke(hive, table, 
filter).asInstanceOf[Int]
+  } catch {
+case ex: Exception =>

Review comment:
   `case NonFatal(_) `?
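
   For context, `scala.util.control.NonFatal` deliberately does not match
   throwables such as `InterruptedException`, which a plain
   `case ex: Exception` handler would swallow. A minimal sketch of the
   suggested pattern (`getNumPartitions` here is a hypothetical stand-in for
   the reflective Hive call in the PR):

   ```scala
   import scala.util.control.NonFatal

   // Catch only non-fatal errors when asking the metastore for a partition
   // count, falling back to 0; fatal throwables propagate past this handler
   // instead of being silently turned into a partition count of 0.
   def safePartitionCount(getNumPartitions: () => Int): Int =
     try {
       getNumPartitions()
     } catch {
       case NonFatal(_) => 0
     }
   ```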








[GitHub] [spark] wangyum commented on a change in pull request #30325: [SPARK-33416][SQL] Avoid Hive metastore stack overflow when InSet predicate have many values

2020-11-10 Thread GitBox


wangyum commented on a change in pull request #30325:
URL: https://github.com/apache/spark/pull/30325#discussion_r521130075



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/StaticSQLConf.scala
##
@@ -54,6 +54,15 @@ object StaticSQLConf {
 .transform(_.toLowerCase(Locale.ROOT))
 .createWithDefault("global_temp")
 
+  val HIVE_METASTORE_PARTITION_PRUNING_INSET_THRESHOLD =

Review comment:
   Yes, for 2 reasons:
   1. This parameter should be set according to your own Hive Metastore and 
does not need to be modified frequently.
   2. All SQL configs in `HiveExternalCatalog` are static configs, e.g. 
`SCHEMA_STRING_LENGTH_THRESHOLD` and `DEBUG_MODE`.
   
   Of course, we can make this parameter a runtime config if needed.








[GitHub] [spark] SparkQA commented on pull request #30318: [SPARK-33412][SQL] OverwriteByExpression should resolve its delete condition based on the table relation not the input query

2020-11-10 Thread GitBox


SparkQA commented on pull request #30318:
URL: https://github.com/apache/spark/pull/30318#issuecomment-725218357


   Kubernetes integration test starting
   URL: 
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35510/
   






[GitHub] [spark] maropu commented on a change in pull request #30225: [SPARK-33187][SQL] Add a check on the number of returned partitions in the HiveShim#getPartitionsByFilter method

2020-11-10 Thread GitBox


maropu commented on a change in pull request #30225:
URL: https://github.com/apache/spark/pull/30225#discussion_r521128972



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -815,6 +815,14 @@ object SQLConf {
   .booleanConf
   .createWithDefault(true)
 
+  val HIVE_METASTORE_PARTITION_LIMIT =
+buildConf("spark.sql.hive.metastorePartitionLimit")
+  .doc("The maximum number of metastore partitions allowed. Use -1 for 
unlimited.")
+  .version("3.0.2")
+  .intConf
+  .checkValue(_ >= -1, "The maximum must be a positive integer, -1 to 
apply no limit.")
+  .createWithDefault(10)

Review comment:
   `-1` to follow the Hive config?








[GitHub] [spark] maropu commented on a change in pull request #30225: [SPARK-33187][SQL] Add a check on the number of returned partitions in the HiveShim#getPartitionsByFilter method

2020-11-10 Thread GitBox


maropu commented on a change in pull request #30225:
URL: https://github.com/apache/spark/pull/30225#discussion_r521128775



##
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##
@@ -815,6 +815,14 @@ object SQLConf {
   .booleanConf
   .createWithDefault(true)
 
+  val HIVE_METASTORE_PARTITION_LIMIT =
+buildConf("spark.sql.hive.metastorePartitionLimit")
+  .doc("The maximum number of metastore partitions allowed. Use -1 for 
unlimited.")
+  .version("3.0.2")

Review comment:
   `3.0.2` -> `3.1.0`








  1   2   3   4   5   6   7   8   9   10   >