date:20200813

[GitHub] [spark] cloud-fan closed pull request #29423: [SPARK-20680][SQL][FOLLOW-UP] Add HiveVoidType in HiveClientImpl

2020-08-13 Thread GitBox



cloud-fan closed pull request #29423:
URL: https://github.com/apache/spark/pull/29423


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

[GitHub] [spark] cloud-fan commented on pull request #29423: [SPARK-20680][SQL][FOLLOW-UP] Add HiveVoidType in HiveClientImpl

2020-08-13 Thread GitBox



cloud-fan commented on pull request #29423:
URL: https://github.com/apache/spark/pull/29423#issuecomment-673921605


   thanks, merging to master!




[GitHub] [spark] maropu commented on a change in pull request #29428: [SPARK-32608][SQL] Script Transform ROW FORMAT DELIMIT value should format value

2020-08-13 Thread GitBox



maropu commented on a change in pull request #29428:
URL: https://github.com/apache/spark/pull/29428#discussion_r470442824



##
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/SparkSqlParserSuite.scala
##
@@ -330,4 +331,44 @@ class SparkSqlParserSuite extends AnalysisTest {
     assertEqual("ADD FILE /path with space/abc.txt", AddFileCommand("/path with space/abc.txt"))
     assertEqual("ADD JAR /path with space/abc.jar", AddJarCommand("/path with space/abc.jar"))
   }
+
+  test("SPARK-32608: script transform with row format delimit") {
+    assertEqual(

Review comment:
   Could you add end-to-end tests, too?






[GitHub] [spark] SparkQA commented on pull request #28939: [SPARK-32119][CORE] ExecutorPlugin doesn't work with Standalone Cluster and Kubernetes with --jars

2020-08-13 Thread GitBox



SparkQA commented on pull request #28939:
URL: https://github.com/apache/spark/pull/28939#issuecomment-673919276


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/32059/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29378: [SPARK-30069][CORE][YARN] Clean up non-shuffle disk block manager files following executor exists on YARN

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29378:
URL: https://github.com/apache/spark/pull/29378#issuecomment-673917333








[GitHub] [spark] AngersZhuuuu edited a comment on pull request #29428: [SPARK-32608][SQL] Script Transform ROW FORMAT DELIMIT value should format value

2020-08-13 Thread GitBox



AngersZhuuuu edited a comment on pull request #29428:
URL: https://github.com/apache/spark/pull/29428#issuecomment-673841449


   FYI @maropu @cloud-fan 




[GitHub] [spark] AmplabJenkins commented on pull request #29378: [SPARK-30069][CORE][YARN] Clean up non-shuffle disk block manager files following executor exists on YARN

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29378:
URL: https://github.com/apache/spark/pull/29378#issuecomment-673917333








[GitHub] [spark] SparkQA commented on pull request #29378: [SPARK-30069][CORE][YARN] Clean up non-shuffle disk block manager files following executor exists on YARN

2020-08-13 Thread GitBox



SparkQA commented on pull request #29378:
URL: https://github.com/apache/spark/pull/29378#issuecomment-673916869


   **[Test build #127441 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127441/testReport)** for PR 29378 at commit [`21e6f60`](https://github.com/apache/spark/commit/21e6f609c4f66861c9b2e87d178fe160811dced5).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29430: [SPARK-32609] Incorrect exchange reuse with DataSourceV2

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29430:
URL: https://github.com/apache/spark/pull/29430#issuecomment-673914905








[GitHub] [spark] AmplabJenkins commented on pull request #29430: [SPARK-32609] Incorrect exchange reuse with DataSourceV2

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29430:
URL: https://github.com/apache/spark/pull/29430#issuecomment-673914905








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29082:
URL: https://github.com/apache/spark/pull/29082#issuecomment-673913916


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127434/
   Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #29430: [SPARK-32609] Incorrect exchange reuse with DataSourceV2

2020-08-13 Thread GitBox



SparkQA commented on pull request #29430:
URL: https://github.com/apache/spark/pull/29430#issuecomment-673914502


   **[Test build #127440 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127440/testReport)** for PR 29430 at commit [`d46483f`](https://github.com/apache/spark/commit/d46483ff8e493fbdebd73a0713afb7cfd6e44e8e).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29082:
URL: https://github.com/apache/spark/pull/29082#issuecomment-673913912


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29082:
URL: https://github.com/apache/spark/pull/29082#issuecomment-673913912








[GitHub] [spark] SparkQA removed a comment on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-13 Thread GitBox



SparkQA removed a comment on pull request #29082:
URL: https://github.com/apache/spark/pull/29082#issuecomment-673880029


   **[Test build #127434 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127434/testReport)** for PR 29082 at commit [`b231182`](https://github.com/apache/spark/commit/b23118243bf89f4afebc13640743cc92ff3bb15f).




[GitHub] [spark] SparkQA commented on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-13 Thread GitBox



SparkQA commented on pull request #29082:
URL: https://github.com/apache/spark/pull/29082#issuecomment-673913538


   **[Test build #127434 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127434/testReport)** for PR 29082 at commit [`b231182`](https://github.com/apache/spark/commit/b23118243bf89f4afebc13640743cc92ff3bb15f).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.




[GitHub] [spark] cloud-fan commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-13 Thread GitBox



cloud-fan commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r470436157



##
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/window/WindowExecBase.scala
##
@@ -184,6 +188,18 @@ trait WindowExecBase extends UnaryExecNode {
   MutableProjection.create(expressions, schema),
 offset)
 
+  case ("WHOLE_OFFSET", _, IntegerLiteral(offset), _) =>
+target: InternalRow =>
+  new FixedOffsetWindowFunctionFrame(

Review comment:
   nit: `PartitionBasedOffsetWindowFunctionFrame`






[GitHub] [spark] gatorsmile commented on pull request #29430: [SPARK-32609] Incorrect exchange reuse with DataSourceV2

2020-08-13 Thread GitBox



gatorsmile commented on pull request #29430:
URL: https://github.com/apache/spark/pull/29430#issuecomment-673913013


   This is against 2.4. Could you also check whether the master branch still has such an issue?




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29430: [SPARK-32609] Incorrect exchange reuse with DataSourceV2

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29430:
URL: https://github.com/apache/spark/pull/29430#issuecomment-673907864


   Can one of the admins verify this patch?




[GitHub] [spark] cloud-fan commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-13 Thread GitBox



cloud-fan commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r470435851



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -474,6 +479,54 @@ case class Lag(input: Expression, offset: Expression, default: Expression)
   override val direction = Descending
 }
 
+/**
+ * The NthValue function returns the value of `input` at the `offset`th row of beginning
+ * the window partition (counting from 1). Offsets start at 1. When the value of `input` is null

Review comment:
   We can remove `(counting from 1)`; it duplicates "Offsets start at 1".
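
To make the 1-based offset in the quoted doc comment concrete, here is a minimal sketch of the lookup it describes. It is not Spark's `NthValue` implementation: `NthValueSketch` and `nthValue` are hypothetical names, the window partition is modelled as a plain `Seq`, null handling is left out, and returning `None` when the offset runs past the end of the partition is an assumption.

```scala
// Hypothetical illustration only; not part of Spark.
object NthValueSketch {
  // Returns the value at the `offset`th row counted from the beginning of the
  // partition. Offsets start at 1, so offset 1 is the first row.
  // Assumption: an offset beyond the partition yields None.
  def nthValue[A](partition: Seq[A], offset: Int): Option[A] = {
    require(offset >= 1, "offset is 1-based and must be positive")
    partition.lift(offset - 1)
  }

  def main(args: Array[String]): Unit = {
    val partition = Seq("a", "b", "c")
    println(nthValue(partition, 1)) // Some(a)
    println(nthValue(partition, 3)) // Some(c)
    println(nthValue(partition, 4)) // None
  }
}
```

Using `Option` keeps the "no such row" case explicit in the sketch; how the real function reports that case is not stated in the quoted diff.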






[GitHub] [spark] cloud-fan commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-13 Thread GitBox



cloud-fan commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r470435737



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -474,6 +479,54 @@ case class Lag(input: Expression, offset: Expression, default: Expression)
   override val direction = Descending
 }
 
+/**
+ * The NthValue function returns the value of `input` at the `offset`th row of beginning

Review comment:
   `of beginning` -> `from beginning of the window partition.`






[GitHub] [spark] gatorsmile commented on pull request #29430: [SPARK-32609] Incorrect exchange reuse with DataSourceV2

2020-08-13 Thread GitBox



gatorsmile commented on pull request #29430:
URL: https://github.com/apache/spark/pull/29430#issuecomment-673912640


   ok to test




[GitHub] [spark] cloud-fan commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-13 Thread GitBox



cloud-fan commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r470435293



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -363,6 +363,11 @@ abstract class OffsetWindowFunction
*/
   val direction: SortDirection
 
+  /**
+   * Whether the offset is based on the entire frame.

Review comment:
   nit
   ```
   /**
    * Whether the offset is based on the entire partition.
    */
   def isOffsetPartitionBased ...
   ```






[GitHub] [spark] cloud-fan commented on a change in pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-13 Thread GitBox



cloud-fan commented on a change in pull request #28685:
URL: https://github.com/apache/spark/pull/28685#discussion_r470435293



##
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/windowExpressions.scala
##
@@ -363,6 +363,11 @@ abstract class OffsetWindowFunction
*/
   val direction: SortDirection
 
+  /**
+   * Whether the offset is based on the entire frame.

Review comment:
   nit
   ```
   /**
    * Whether the offset is based on the entire partitions.
    */
   def isOffsetPartitionBased ...
   ```






[GitHub] [spark] gatorsmile commented on pull request #29430: [SPARK-32609] Incorrect exchange reuse with DataSourceV2

2020-08-13 Thread GitBox



gatorsmile commented on pull request #29430:
URL: https://github.com/apache/spark/pull/29430#issuecomment-673912266


   CC @cloud-fan @MaxGekk 




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29429:
URL: https://github.com/apache/spark/pull/29429#issuecomment-673910191








[GitHub] [spark] AmplabJenkins commented on pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29429:
URL: https://github.com/apache/spark/pull/29429#issuecomment-673910191








[GitHub] [spark] SparkQA commented on pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



SparkQA commented on pull request #29429:
URL: https://github.com/apache/spark/pull/29429#issuecomment-673909855


   **[Test build #127439 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127439/testReport)** for PR 29429 at commit [`a9dc47e`](https://github.com/apache/spark/commit/a9dc47eac12e6d4ef4f482eaa81796b65c977c9c).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29430: [SPARK-32609] Incorrect exchange reuse with DataSourceV2

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29430:
URL: https://github.com/apache/spark/pull/29430#issuecomment-673907450


   Can one of the admins verify this patch?




[GitHub] [spark] HyukjinKwon closed pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



HyukjinKwon closed pull request #29429:
URL: https://github.com/apache/spark/pull/29429


   




[GitHub] [spark] AmplabJenkins commented on pull request #29430: [SPARK-32609] Incorrect exchange reuse with DataSourceV2

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29430:
URL: https://github.com/apache/spark/pull/29430#issuecomment-673907864


   Can one of the admins verify this patch?




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29428: [SPARK-32608][SQL] Script Transform ROW FORMAT DELIMIT value should format value

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29428:
URL: https://github.com/apache/spark/pull/29428#issuecomment-673907599








[GitHub] [spark] AmplabJenkins commented on pull request #29428: [SPARK-32608][SQL] Script Transform ROW FORMAT DELIMIT value should format value

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29428:
URL: https://github.com/apache/spark/pull/29428#issuecomment-673907599








[GitHub] [spark] AmplabJenkins commented on pull request #29430: [SPARK-32609] Incorrect exchange reuse with DataSourceV2

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29430:
URL: https://github.com/apache/spark/pull/29430#issuecomment-673907450


   Can one of the admins verify this patch?




[GitHub] [spark] SparkQA removed a comment on pull request #29428: [SPARK-32608][SQL] Script Transform ROW FORMAT DELIMIT value should format value

2020-08-13 Thread GitBox



SparkQA removed a comment on pull request #29428:
URL: https://github.com/apache/spark/pull/29428#issuecomment-673842527


   **[Test build #127430 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127430/testReport)** for PR 29428 at commit [`b4d816e`](https://github.com/apache/spark/commit/b4d816e26766923a40c42d2b3ae4356802b16886).




[GitHub] [spark] SparkQA commented on pull request #29428: [SPARK-32608][SQL] Script Transform ROW FORMAT DELIMIT value should format value

2020-08-13 Thread GitBox



SparkQA commented on pull request #29428:
URL: https://github.com/apache/spark/pull/29428#issuecomment-673906820


   **[Test build #127430 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127430/testReport)** for PR 29428 at commit [`b4d816e`](https://github.com/apache/spark/commit/b4d816e26766923a40c42d2b3ae4356802b16886).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.




[GitHub] [spark] HyukjinKwon commented on pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



HyukjinKwon commented on pull request #29333:
URL: https://github.com/apache/spark/pull/29333#issuecomment-673906662


   Okay, I checked that it works for both PRs and commits. There is a delay of roughly 2 to 5 minutes before the test report appears after the main tests finish, but I think this is good enough.




[GitHub] [spark] mingjialiu opened a new pull request #29430: [SPARK-32609] Incorrect exchange reuse with DataSourceV2

2020-08-13 Thread GitBox



mingjialiu opened a new pull request #29430:
URL: https://github.com/apache/spark/pull/29430


   
   
   ### What changes were proposed in this pull request?
   Compare pushedFilters in DataSourceV2ScanExec's equals function.
   
   
   
   
   ### Why are the changes needed?
   Scans with different filters were considered equal, which caused incorrect exchange reuse. This change fixes the issue.
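
As a rough illustration of why the comparison matters, the sketch below uses hypothetical stand-in classes (`PushedFilter`, `ScanIgnoringFilters`, `ScanComparingFilters`) rather than Spark's actual `DataSourceV2ScanExec`. It assumes that plan reuse is decided through `equals`/`hashCode`, so two scans that differ only in their pushed filters look interchangeable unless the filters take part in the comparison. Comparing the filters as an ordered `Seq` is also an assumption of this sketch.

```scala
// Hypothetical stand-ins for illustration only; these are not Spark classes.
final case class PushedFilter(column: String, op: String, value: String)

// Equality ignores pushedFilters, mirroring the problem described above.
class ScanIgnoringFilters(
    val output: Seq[String],
    val options: Map[String, String],
    val pushedFilters: Seq[PushedFilter]) {
  override def equals(other: Any): Boolean = other match {
    case that: ScanIgnoringFilters =>
      output == that.output && options == that.options
    case _ => false
  }
  override def hashCode(): Int = (output, options).hashCode()
}

// Equality also compares pushedFilters, which is the idea behind the fix.
class ScanComparingFilters(
    val output: Seq[String],
    val options: Map[String, String],
    val pushedFilters: Seq[PushedFilter]) {
  override def equals(other: Any): Boolean = other match {
    case that: ScanComparingFilters =>
      output == that.output &&
        options == that.options &&
        pushedFilters == that.pushedFilters
    case _ => false
  }
  override def hashCode(): Int = (output, options, pushedFilters).hashCode()
}

object ExchangeReuseSketch {
  val cols: Seq[String] = Seq("id", "name")
  val opts: Map[String, String] = Map("path" -> "/tmp/t")
  val byId: Seq[PushedFilter] = Seq(PushedFilter("id", "=", "1"))
  val byName: Seq[PushedFilter] = Seq(PushedFilter("name", "=", "a"))

  def main(args: Array[String]): Unit = {
    // Different filters, yet the scans compare equal: prints true.
    println(new ScanIgnoringFilters(cols, opts, byId) ==
      new ScanIgnoringFilters(cols, opts, byName))
    // With filters in the comparison the scans differ: prints false.
    println(new ScanComparingFilters(cols, opts, byId) ==
      new ScanComparingFilters(cols, opts, byName))
  }
}
```

In the first case a reuse check based on equality would treat the two scans as the same work, even though they return different rows.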
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   no.
   
   
   
   ### How was this patch tested?
   Unit test coverage and ad hoc verification. Without this change, the unit test fails.
   
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29429:
URL: https://github.com/apache/spark/pull/29429#issuecomment-673905624








[GitHub] [spark] AmplabJenkins commented on pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29429:
URL: https://github.com/apache/spark/pull/29429#issuecomment-673905624








[GitHub] [spark] SparkQA commented on pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



SparkQA commented on pull request #29429:
URL: https://github.com/apache/spark/pull/29429#issuecomment-673905236


   **[Test build #127437 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127437/testReport)** for PR 29429 at commit [`ed2bb56`](https://github.com/apache/spark/commit/ed2bb56d01f6187d27c891daf3aea2971c96013d).




[GitHub] [spark] AmplabJenkins commented on pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29429:
URL: https://github.com/apache/spark/pull/29429#issuecomment-673905175








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29429:
URL: https://github.com/apache/spark/pull/29429#issuecomment-673905175








[GitHub] [spark] SparkQA commented on pull request #28939: [SPARK-32119][CORE] ExecutorPlugin doesn't work with Standalone Cluster and Kubernetes with --jars

2020-08-13 Thread GitBox



SparkQA commented on pull request #28939:
URL: https://github.com/apache/spark/pull/28939#issuecomment-673905273


   **[Test build #127438 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127438/testReport)** for PR 28939 at commit [`e57d053`](https://github.com/apache/spark/commit/e57d053990e634fc6f03afbea761a9f32eeaee89).




[GitHub] [spark] maropu edited a comment on pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



maropu edited a comment on pull request #29333:
URL: https://github.com/apache/spark/pull/29333#issuecomment-673902655


   Thanks, @HyukjinKwon @cpintado ! Great work and the results became very easy 
to see!!!




[GitHub] [spark] sarutak edited a comment on pull request #28939: [SPARK-32119][CORE] ExecutorPlugin doesn't work with Standalone Cluster and Kubernetes with --jars

2020-08-13 Thread GitBox



sarutak edited a comment on pull request #28939:
URL: https://github.com/apache/spark/pull/28939#issuecomment-673904151


   > This looks good to me - once Tom's comment is addressed.
   
   Oops, I've forgotten to remove the unused code. Thanks.




[GitHub] [spark] sarutak commented on pull request #28939: [SPARK-32119][CORE] ExecutorPlugin doesn't work with Standalone Cluster and Kubernetes with --jars

2020-08-13 Thread GitBox



sarutak commented on pull request #28939:
URL: https://github.com/apache/spark/pull/28939#issuecomment-673904151


   > This looks good to me - once Tom's comment is addressed.
   Oops, I've forgotten to remove the unused code. Thanks.




[GitHub] [spark] maropu commented on pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



maropu commented on pull request #29333:
URL: https://github.com/apache/spark/pull/29333#issuecomment-673902655


   Thanks, @HyukjinKwon @cpintado ! Great work!




[GitHub] [spark] Ngone51 commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-13 Thread GitBox



Ngone51 commented on pull request #29413:
URL: https://github.com/apache/spark/pull/29413#issuecomment-673901356


   @itskals Ah, no. I was not saying that configuring individual event sizes is easy. I was just confused by @tgravescs's comment.




[GitHub] [spark] SparkQA commented on pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-13 Thread GitBox



SparkQA commented on pull request #29395:
URL: https://github.com/apache/spark/pull/29395#issuecomment-673900225


   **[Test build #127436 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127436/testReport)** for PR 29395 at commit [`9c18479`](https://github.com/apache/spark/commit/9c18479e0b71c5b6ec1a2a0f268c598cf03fa879).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29395:
URL: https://github.com/apache/spark/pull/29395#issuecomment-673898076








[GitHub] [spark] AmplabJenkins commented on pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29395:
URL: https://github.com/apache/spark/pull/29395#issuecomment-673898076








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29270: [SPARK-32466][TEST][SQL] Add PlanStabilitySuite to detect SparkPlan regression

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29270:
URL: https://github.com/apache/spark/pull/29270#issuecomment-673896427


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127429/
   Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29270: [SPARK-32466][TEST][SQL] Add PlanStabilitySuite to detect SparkPlan regression

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29270:
URL: https://github.com/apache/spark/pull/29270#issuecomment-673896423


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #29270: [SPARK-32466][TEST][SQL] Add PlanStabilitySuite to detect SparkPlan regression

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29270:
URL: https://github.com/apache/spark/pull/29270#issuecomment-673896423








[GitHub] [spark] SparkQA removed a comment on pull request #29270: [SPARK-32466][TEST][SQL] Add PlanStabilitySuite to detect SparkPlan regression

2020-08-13 Thread GitBox



SparkQA removed a comment on pull request #29270:
URL: https://github.com/apache/spark/pull/29270#issuecomment-673840529


   **[Test build #127429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127429/testReport)** for PR 29270 at commit [`891346e`](https://github.com/apache/spark/commit/891346e6b541cc181f1aa5213d0540330bdf99ec).




[GitHub] [spark] SparkQA commented on pull request #29270: [SPARK-32466][TEST][SQL] Add PlanStabilitySuite to detect SparkPlan regression

2020-08-13 Thread GitBox



SparkQA commented on pull request #29270:
URL: https://github.com/apache/spark/pull/29270#issuecomment-673896023


   **[Test build #127429 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127429/testReport)** for PR 29270 at commit [`891346e`](https://github.com/apache/spark/commit/891346e6b541cc181f1aa5213d0540330bdf99ec).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.




[GitHub] [spark] Ngone51 commented on a change in pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-13 Thread GitBox



Ngone51 commented on a change in pull request #29395:
URL: https://github.com/apache/spark/pull/29395#discussion_r470420645



##
File path: core/src/main/scala/org/apache/spark/internal/config/Tests.scala
##
@@ -61,4 +61,19 @@ private[spark] object Tests {
 .version("3.0.0")
 .intConf
 .createWithDefault(2)
+
+  val RESOURCES_WARNING_TESTING = ConfigBuilder("spark.resources.warnings.testing")
+.version("3.1.0")
+.booleanConf
+.createWithDefault(false)
+
+  // This configuration is used for unit tests to allow skipping the task cpus to cores validation
+  // to allow emulating standalone mode behavior while running in local mode. Standalone mode
+  // by default doesn't specify a number of executor cores, it just uses all the ones available
+  // on the host.
+  val SKIP_VALIDATE_CORES_TESTING =
+  ConfigBuilder("spark.testing.skipValidateCores")
+.version("3.1.0")
+.booleanConf
+.createWithDefault(false)

Review comment:
   Thank you @dongjoon-hyun for letting me know. I was wondering about it 
previously.






[GitHub] [spark] LuciferYang commented on pull request #29370: [SPARK-32526][SQL]Fix some test cases of `sql/catalyst` module in scala 2.13

2020-08-13 Thread GitBox



LuciferYang commented on pull request #29370:
URL: https://github.com/apache/spark/pull/29370#issuecomment-673893066


   @srowen @HyukjinKwon I will try to open a new PR to resolve the remaining problems.




[GitHub] [spark] gengliangwang commented on pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



gengliangwang commented on pull request #29333:
URL: https://github.com/apache/spark/pull/29333#issuecomment-673892677


   A late LGTM. Thanks for the great work!




[GitHub] [spark] AmplabJenkins removed a comment on pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #28685:
URL: https://github.com/apache/spark/pull/28685#issuecomment-673882009








[GitHub] [spark] AmplabJenkins commented on pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #28685:
URL: https://github.com/apache/spark/pull/28685#issuecomment-673882009








[GitHub] [spark] SparkQA commented on pull request #28685: [SPARK-27951][SQL] Support ANSI SQL NTH_VALUE window function

2020-08-13 Thread GitBox



SparkQA commented on pull request #28685:
URL: https://github.com/apache/spark/pull/28685#issuecomment-673881668


   **[Test build #127435 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127435/testReport)** for PR 28685 at commit [`08405e8`](https://github.com/apache/spark/commit/08405e8703cb8119530161b7a86a3d564c1224ce).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29082:
URL: https://github.com/apache/spark/pull/29082#issuecomment-673880280








[GitHub] [spark] AmplabJenkins commented on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29082:
URL: https://github.com/apache/spark/pull/29082#issuecomment-673880280








[GitHub] [spark] SparkQA commented on pull request #29082: [SPARK-32288][UI] Add exception summary for failed tasks in stage page

2020-08-13 Thread GitBox



SparkQA commented on pull request #29082:
URL: https://github.com/apache/spark/pull/29082#issuecomment-673880029


   **[Test build #127434 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127434/testReport)** for PR 29082 at commit [`b231182`](https://github.com/apache/spark/commit/b23118243bf89f4afebc13640743cc92ff3bb15f).




[GitHub] [spark] itskals edited a comment on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-13 Thread GitBox



itskals edited a comment on pull request #29413:
URL: https://github.com/apache/spark/pull/29413#issuecomment-673878203


   > Much harder? IIUC, if users have some experienced stats of the queues of 
the applications, I guess they could set the individual queues more accurately 
and we don't need such "pool" at all.
   
   @Ngone51 If configuring the event sizes is that easy, then I am fine. In my opinion it is a bit hard to arrive at the right number, and it might need trial and error. I guessed it would be easier to configure one number than three or four, and some dynamism like in this PR would also help. Anyway, thanks.




[GitHub] [spark] itskals commented on pull request #29413: [SPARK-32597][CORE] Tune Event Drop in Async Event Queue

2020-08-13 Thread GitBox



itskals commented on pull request #29413:
URL: https://github.com/apache/spark/pull/29413#issuecomment-673878203


   > Much harder? IIUC, if users have some experienced stats of the queues of 
the applications, I guess they could set the individual queues more accurately 
and we don't need such "pool" at all.
   
   @Ngone51 If configuring the event sizes is that easy, then I am fine. I guessed it would be easier to configure one number than three or four. Thanks.




[GitHub] [spark] HyukjinKwon commented on pull request #29370: [SPARK-32526][SQL]Fix some test cases of `sql/catalyst` module in scala 2.13

2020-08-13 Thread GitBox



HyukjinKwon commented on pull request #29370:
URL: https://github.com/apache/spark/pull/29370#issuecomment-673875162


   Nice to see this getting fixed!




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29423: [SPARK-20680][SQL][FOLLOW-UP] Add HiveVoidType in HiveClientImpl

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29423:
URL: https://github.com/apache/spark/pull/29423#issuecomment-673874120








[GitHub] [spark] AmplabJenkins commented on pull request #29423: [SPARK-20680][SQL][FOLLOW-UP] Add HiveVoidType in HiveClientImpl

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29423:
URL: https://github.com/apache/spark/pull/29423#issuecomment-673874120








[GitHub] [spark] SparkQA removed a comment on pull request #29423: [SPARK-20680][SQL][FOLLOW-UP] Add HiveVoidType in HiveClientImpl

2020-08-13 Thread GitBox



SparkQA removed a comment on pull request #29423:
URL: https://github.com/apache/spark/pull/29423#issuecomment-673767434


   **[Test build #127427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127427/testReport)** for PR 29423 at commit [`57d8fd8`](https://github.com/apache/spark/commit/57d8fd86c93caf34d1586175f96df173a6239946).




[GitHub] [spark] SparkQA commented on pull request #29423: [SPARK-20680][SQL][FOLLOW-UP] Add HiveVoidType in HiveClientImpl

2020-08-13 Thread GitBox



SparkQA commented on pull request #29423:
URL: https://github.com/apache/spark/pull/29423#issuecomment-673873670


   **[Test build #127427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127427/testReport)** for PR 29423 at commit [`57d8fd8`](https://github.com/apache/spark/commit/57d8fd86c93caf34d1586175f96df173a6239946).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.




[GitHub] [spark] viirya commented on a change in pull request #29427: [SPARK-25557][SQL][TEST][Followup] Add case-sensitivity test for ORC predicate pushdown

2020-08-13 Thread GitBox



viirya commented on a change in pull request #29427:
URL: https://github.com/apache/spark/pull/29427#discussion_r470399287



##
File path: 
sql/core/v1.2/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala
##
@@ -513,5 +513,98 @@ class OrcFilterSuite extends OrcTest with SharedSparkSession {
   ).get.toString
 }
   }
+
+  test("SPARK-25557: case sensitivity in predicate pushdown") {
+withTempPath { dir =>
+  val count = 10
+  val tableName = "spark_25557"
+  val tableDir1 = dir.getAbsoluteFile + "/table1"
+
+  // Physical ORC files have both `A` and `a` fields.
+  withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+spark.range(count).repartition(count).selectExpr("id - 1 as A", "id as a")
+  .write.mode("overwrite").orc(tableDir1)
+  }
+
+  // Metastore table has both `A` and `a` fields too.
+  withTable(tableName) {
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+  sql(
+s"""
+   |CREATE TABLE $tableName (A LONG, a LONG) USING ORC LOCATION '$tableDir1'
+ """.stripMargin)
+
+  checkAnswer(sql(s"select a, A from $tableName"), (0 until count).map(c => Row(c, c - 1)))
+
+  val actual1 = stripSparkFilter(sql(s"select A from $tableName where A < 0"))
+  assert(actual1.count() == 1)
+
+  val actual2 = stripSparkFilter(sql(s"select A from $tableName where a < 0"))
+  assert(actual2.count() == 0)
+}
+
+// Exception thrown for ambiguous case.
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  val e = intercept[AnalysisException] {
+sql(s"select a from $tableName where a < 0").collect()
+  }
+  assert(e.getMessage.contains(
+"Reference 'a' is ambiguous"))
+}
+  }
+
+  // Metastore table has only `A` field.
+  withTable(tableName) {
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  sql(
+s"""
+   |CREATE TABLE $tableName (A LONG) USING ORC LOCATION '$tableDir1'
+ """.stripMargin)
+
+  val e = intercept[SparkException] {
+sql(s"select A from $tableName where A < 0").collect()
+  }
+  assert(e.getCause.isInstanceOf[RuntimeException] && e.getCause.getMessage.contains(
+"""Found duplicate field(s) "A": [A, a] in case-insensitive mode"""))
+}
+  }
+
+  // Physical ORC files have only `A` field.
+  val tableDir2 = dir.getAbsoluteFile + "/table2"
+  withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+spark.range(count).repartition(count).selectExpr("id - 1 as A")
+  .write.mode("overwrite").orc(tableDir2)
+  }
+
+  withTable(tableName) {
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  sql(
+s"""
+   |CREATE TABLE $tableName (a LONG) USING ORC LOCATION '$tableDir2'
+ """.stripMargin)
+
+  checkAnswer(sql(s"select a from $tableName"), (0 until count).map(c => Row(c - 1)))
+
+  val actual = stripSparkFilter(sql(s"select a from $tableName where a < 0"))
+  // TODO: ORC predicate pushdown should work under case-insensitive analysis.
+  // assert(actual.count() == 1)

Review comment:
   I use original SPARK-25557 as PR title now. If we want to backport this 
test to branch-3.0, should I create a new JIRA ticket for this?
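
The name-resolution behaviour that the quoted test exercises can be sketched in plain Scala. This is a simplified model under stated assumptions, not Spark's implementation: `FieldResolutionSketch` and `resolveField` are hypothetical names, and only the error message text is taken from the assertion in the test above.

```scala
object FieldResolutionSketch {
  // Hypothetical helper: resolve a requested column name against the field
  // names physically present in a file.
  def resolveField(
      requested: String,
      physical: Seq[String],
      caseSensitive: Boolean): Option[String] = {
    if (caseSensitive) {
      // Exact match only, so `A` and `a` are different fields.
      physical.find(_ == requested)
    } else {
      // Case-insensitive match; more than one candidate is ambiguous.
      val matches = physical.filter(_.equalsIgnoreCase(requested))
      if (matches.size > 1) {
        throw new RuntimeException(
          s"""Found duplicate field(s) "$requested": """ +
            s"""${matches.mkString("[", ", ", "]")} in case-insensitive mode""")
      }
      matches.headOption
    }
  }

  def main(args: Array[String]): Unit = {
    // File has both `A` and `a`; case-sensitive lookup is unambiguous.
    println(resolveField("A", Seq("A", "a"), caseSensitive = true))
    // File has only `A`; case-insensitive lookup of `a` still finds it.
    println(resolveField("a", Seq("A"), caseSensitive = false))
  }
}
```

The three table setups in the test map onto this model: both fields present with case-sensitive analysis, both fields present with case-insensitive analysis (the ambiguous case that raises the error), and a single physical field `A` read through a schema that names it `a`.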






[GitHub] [spark] HyukjinKwon commented on a change in pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



HyukjinKwon commented on a change in pull request #29333:
URL: https://github.com/apache/spark/pull/29333#discussion_r470397420



##
File path: .github/workflows/master.yml
##
@@ -170,13 +170,19 @@ jobs:
 # Show installed packages in R.
 sudo Rscript -e 'pkg_list <- as.data.frame(installed.packages()[, c(1,3:4)]); pkg_list[is.na(pkg_list$Priority), 1:2, drop = FALSE]'
 # Run the tests.
-- name: "Run tests: ${{ matrix.modules }}"
+- name: Run tests
   run: |
 # Hive tests become flaky when running in parallel as it's too intensive.
 if [[ "$MODULES_TO_TEST" == "hive" ]]; then export SERIAL_SBT_TESTS=1; fi
 mkdir -p ~/.m2
 ./dev/run-tests --parallelism 2 --modules "$MODULES_TO_TEST" --included-tags "$INCLUDED_TAGS" --excluded-tags "$EXCLUDED_TAGS"
 rm -rf ~/.m2/repository/org/apache/spark
+- name: Upload test results to report
+  if: always()
+  uses: actions/upload-artifact@v2

Review comment:
   Yeah, if the tests don't fail, it should upload JUnit XML files and then 
report the successful test cases, e.g., 1000 tests passed, 6 skipped, 0 failures.
   
   GitHub Actions has things like `failure()` but I think we should run this 
always (to report successful cases and also failed cases).
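
The difference between the status conditions discussed here can be illustrated with a small sketch. This is a simplified, hypothetical model written for this thread (the names `StepConditions` and `shouldRun` are assumptions, and cancellation is ignored); it is not how the GitHub Actions runner is implemented. It only shows why `always()` covers both the passing and the failing case, while `failure()` would skip the upload when the tests pass.

```scala
// Simplified, hypothetical model of GitHub Actions step status checks.
// It ignores cancellation and is not how the runner is implemented.
object StepConditions {
  sealed trait Condition
  case object Success extends Condition // implicit default when a step has no `if`
  case object Failure extends Condition // corresponds to `if: failure()`
  case object Always extends Condition  // corresponds to `if: always()`

  // Decide whether a step runs, given whether any earlier step failed.
  def shouldRun(condition: Condition, earlierStepFailed: Boolean): Boolean =
    condition match {
      case Success => !earlierStepFailed
      case Failure => earlierStepFailed
      case Always  => true
    }

  def main(args: Array[String]): Unit = {
    // Print the decision for every combination of test outcome and condition.
    for (failed <- Seq(false, true); cond <- Seq(Success, Failure, Always)) {
      println(s"tests failed=$failed, condition=$cond, upload runs=${shouldRun(cond, failed)}")
    }
  }
}
```

Under this model, only `Always` runs the upload step in both cases, which is what reporting successful as well as failed test cases needs.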






[GitHub] [spark] viirya commented on a change in pull request #29427: [SPARK-25557][SQL][TEST][Followup] Add case-sensitivity test for ORC predicate pushdown

2020-08-13 Thread GitBox



viirya commented on a change in pull request #29427:
URL: https://github.com/apache/spark/pull/29427#discussion_r470398272



##
File path: sql/core/v1.2/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala
##
@@ -513,5 +513,98 @@ class OrcFilterSuite extends OrcTest with SharedSparkSession {
   ).get.toString
 }
   }
+
+  test("SPARK-25557: case sensitivity in predicate pushdown") {
+withTempPath { dir =>
+  val count = 10
+  val tableName = "spark_25557"
+  val tableDir1 = dir.getAbsoluteFile + "/table1"
+
+  // Physical ORC files have both `A` and `a` fields.
+  withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+spark.range(count).repartition(count).selectExpr("id - 1 as A", "id as a")
+  .write.mode("overwrite").orc(tableDir1)
+  }
+
+  // Metastore table has both `A` and `a` fields too.
+  withTable(tableName) {
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+  sql(
+s"""
+   |CREATE TABLE $tableName (A LONG, a LONG) USING ORC LOCATION '$tableDir1'
+ """.stripMargin)
+
+  checkAnswer(sql(s"select a, A from $tableName"), (0 until count).map(c => Row(c, c - 1)))
+
+  val actual1 = stripSparkFilter(sql(s"select A from $tableName where A < 0"))
+  assert(actual1.count() == 1)
+
+  val actual2 = stripSparkFilter(sql(s"select A from $tableName where a < 0"))
+  assert(actual2.count() == 0)
+}
+
+// Exception thrown for ambiguous case.
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  val e = intercept[AnalysisException] {
+sql(s"select a from $tableName where a < 0").collect()
+  }
+  assert(e.getMessage.contains(
+"Reference 'a' is ambiguous"))
+}
+  }
+
+  // Metastore table has only `A` field.
+  withTable(tableName) {
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  sql(
+s"""
+   |CREATE TABLE $tableName (A LONG) USING ORC LOCATION '$tableDir1'
+ """.stripMargin)
+
+  val e = intercept[SparkException] {
+sql(s"select A from $tableName where A < 0").collect()
+  }
+  assert(e.getCause.isInstanceOf[RuntimeException] && e.getCause.getMessage.contains(
+"""Found duplicate field(s) "A": [A, a] in case-insensitive mode"""))
+}
+  }
+
+  // Physical ORC files have only `A` field.
+  val tableDir2 = dir.getAbsoluteFile + "/table2"
+  withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+spark.range(count).repartition(count).selectExpr("id - 1 as A")
+  .write.mode("overwrite").orc(tableDir2)
+  }
+
+  withTable(tableName) {
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  sql(
+s"""
+   |CREATE TABLE $tableName (a LONG) USING ORC LOCATION '$tableDir2'
+ """.stripMargin)
+
+  checkAnswer(sql(s"select a from $tableName"), (0 until count).map(c => Row(c - 1)))
+
+  val actual = stripSparkFilter(sql(s"select a from $tableName where a < 0"))
+  // TODO: ORC predicate pushdown should work under case-insensitive analysis.
+  // assert(actual.count() == 1)

Review comment:
   Yes, this should be fixed in branch-3.0 too.
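
The duplicate-field case exercised by the quoted test can be sketched in isolation. The resolver below is a hypothetical illustration written for this thread (`FieldResolution` and `resolve` are assumed names); it is not Spark's actual ORC schema-resolution code. It only mirrors the behaviour the test asserts: in case-insensitive mode, a requested field that matches more than one physical field is an error.

```scala
// Hypothetical sketch, not Spark's implementation: map a requested column
// name onto the field names physically present in a file.
object FieldResolution {
  def resolve(
      physicalFields: Seq[String],
      requested: String,
      caseSensitive: Boolean): Either[String, Option[String]] = {
    if (caseSensitive) {
      // Exact match only; `A` and `a` are distinct fields.
      Right(physicalFields.find(_ == requested))
    } else {
      val matches = physicalFields.filter(_.equalsIgnoreCase(requested))
      if (matches.size > 1) {
        // More than one physical field matches, so the request is ambiguous.
        val found = matches.mkString("[", ", ", "]")
        Left(s"""Found duplicate field(s) "$requested": $found in case-insensitive mode""")
      } else {
        Right(matches.headOption)
      }
    }
  }

  def main(args: Array[String]): Unit = {
    println(resolve(Seq("A", "a"), "A", caseSensitive = true))
    println(resolve(Seq("A", "a"), "A", caseSensitive = false))
    println(resolve(Seq("A"), "a", caseSensitive = false))
  }
}
```

The last call corresponds to the final block of the test, where the file has only `A` and the table declares `a`: the field resolves, which is why pushdown on `a < 0` is expected to work once the TODO is addressed.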






[GitHub] [spark] HyukjinKwon commented on pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



HyukjinKwon commented on pull request #29333:
URL: https://github.com/apache/spark/pull/29333#issuecomment-673870618


   Also, I would like to thank @cpintado from GitHub. He virtually guided me 
here a lot on this.




[GitHub] [spark] HyukjinKwon commented on a change in pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



HyukjinKwon commented on a change in pull request #29333:
URL: https://github.com/apache/spark/pull/29333#discussion_r470397420



##
File path: .github/workflows/master.yml
##
@@ -170,13 +170,19 @@ jobs:
 # Show installed packages in R.
 sudo Rscript -e 'pkg_list <- as.data.frame(installed.packages()[, c(1,3:4)]); pkg_list[is.na(pkg_list$Priority), 1:2, drop = FALSE]'
 # Run the tests.
-- name: "Run tests: ${{ matrix.modules }}"
+- name: Run tests
   run: |
 # Hive tests become flaky when running in parallel as it's too intensive.
 if [[ "$MODULES_TO_TEST" == "hive" ]]; then export SERIAL_SBT_TESTS=1; fi
 mkdir -p ~/.m2
 ./dev/run-tests --parallelism 2 --modules "$MODULES_TO_TEST" --included-tags "$INCLUDED_TAGS" --excluded-tags "$EXCLUDED_TAGS"
 rm -rf ~/.m2/repository/org/apache/spark
+- name: Upload test results to report
+  if: always()
+  uses: actions/upload-artifact@v2

Review comment:
   Yeah, if the tests fail, it should upload JUnit XML files and then 
report the failed test cases. GitHub Actions has things like `failure()` but I 
think we should run this always.






[GitHub] [spark] HyukjinKwon commented on a change in pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



HyukjinKwon commented on a change in pull request #29333:
URL: https://github.com/apache/spark/pull/29333#discussion_r470397420



##
File path: .github/workflows/master.yml
##
@@ -170,13 +170,19 @@ jobs:
 # Show installed packages in R.
 sudo Rscript -e 'pkg_list <- as.data.frame(installed.packages()[, c(1,3:4)]); pkg_list[is.na(pkg_list$Priority), 1:2, drop = FALSE]'
 # Run the tests.
-- name: "Run tests: ${{ matrix.modules }}"
+- name: Run tests
   run: |
 # Hive tests become flaky when running in parallel as it's too intensive.
 if [[ "$MODULES_TO_TEST" == "hive" ]]; then export SERIAL_SBT_TESTS=1; fi
 mkdir -p ~/.m2
 ./dev/run-tests --parallelism 2 --modules "$MODULES_TO_TEST" --included-tags "$INCLUDED_TAGS" --excluded-tags "$EXCLUDED_TAGS"
 rm -rf ~/.m2/repository/org/apache/spark
+- name: Upload test results to report
+  if: always()
+  uses: actions/upload-artifact@v2

Review comment:
   Yeah, if the tests fail, it should upload JUnit XML files and then 
report the failed test cases. GitHub Actions has things like `failure()` but I 
think we should run this always (to report successful cases and also failed 
cases).






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #29429:
URL: https://github.com/apache/spark/pull/29429#issuecomment-673870274








[GitHub] [spark] HyukjinKwon commented on pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



HyukjinKwon commented on pull request #29333:
URL: https://github.com/apache/spark/pull/29333#issuecomment-673868134


   Thank you all!!




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29427: [SPARK-25557][SQL][TEST][Followup] Add case-sensitivity test for ORC predicate pushdown

2020-08-13 Thread GitBox



dongjoon-hyun commented on a change in pull request #29427:
URL: https://github.com/apache/spark/pull/29427#discussion_r470396482



##
File path: sql/core/v1.2/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala
##
@@ -513,5 +513,98 @@ class OrcFilterSuite extends OrcTest with SharedSparkSession {
   ).get.toString
 }
   }
+
+  test("SPARK-25557: case sensitivity in predicate pushdown") {
+withTempPath { dir =>
+  val count = 10
+  val tableName = "spark_25557"
+  val tableDir1 = dir.getAbsoluteFile + "/table1"
+
+  // Physical ORC files have both `A` and `a` fields.
+  withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+spark.range(count).repartition(count).selectExpr("id - 1 as A", "id as a")
+  .write.mode("overwrite").orc(tableDir1)
+  }
+
+  // Metastore table has both `A` and `a` fields too.
+  withTable(tableName) {
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+  sql(
+s"""
+   |CREATE TABLE $tableName (A LONG, a LONG) USING ORC LOCATION '$tableDir1'
+ """.stripMargin)
+
+  checkAnswer(sql(s"select a, A from $tableName"), (0 until count).map(c => Row(c, c - 1)))
+
+  val actual1 = stripSparkFilter(sql(s"select A from $tableName where A < 0"))
+  assert(actual1.count() == 1)
+
+  val actual2 = stripSparkFilter(sql(s"select A from $tableName where a < 0"))
+  assert(actual2.count() == 0)
+}
+
+// Exception thrown for ambiguous case.
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  val e = intercept[AnalysisException] {
+sql(s"select a from $tableName where a < 0").collect()
+  }
+  assert(e.getMessage.contains(
+"Reference 'a' is ambiguous"))
+}
+  }
+
+  // Metastore table has only `A` field.
+  withTable(tableName) {
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  sql(
+s"""
+   |CREATE TABLE $tableName (A LONG) USING ORC LOCATION '$tableDir1'
+ """.stripMargin)
+
+  val e = intercept[SparkException] {
+sql(s"select A from $tableName where A < 0").collect()
+  }
+  assert(e.getCause.isInstanceOf[RuntimeException] && e.getCause.getMessage.contains(
+"""Found duplicate field(s) "A": [A, a] in case-insensitive mode"""))
+}
+  }
+
+  // Physical ORC files have only `A` field.
+  val tableDir2 = dir.getAbsoluteFile + "/table2"
+  withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+spark.range(count).repartition(count).selectExpr("id - 1 as A")
+  .write.mode("overwrite").orc(tableDir2)
+  }
+
+  withTable(tableName) {
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  sql(
+s"""
+   |CREATE TABLE $tableName (a LONG) USING ORC LOCATION '$tableDir2'
+ """.stripMargin)
+
+  checkAnswer(sql(s"select a from $tableName"), (0 until count).map(c => Row(c - 1)))
+
+  val actual = stripSparkFilter(sql(s"select a from $tableName where a < 0"))
+  // TODO: ORC predicate pushdown should work under case-insensitive analysis.
+  // assert(actual.count() == 1)

Review comment:
   Can we have this non-nested test case on `branch-3.0`, too?






[GitHub] [spark] AmplabJenkins removed a comment on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-673868049


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/127432/
   Test FAILed.




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-13 Thread GitBox



dongjoon-hyun commented on a change in pull request #29395:
URL: https://github.com/apache/spark/pull/29395#discussion_r470395517



##
File path: core/src/main/scala/org/apache/spark/internal/config/Tests.scala
##
@@ -61,4 +61,19 @@ private[spark] object Tests {
 .version("3.0.0")
 .intConf
 .createWithDefault(2)
+
+  val RESOURCES_WARNING_TESTING = ConfigBuilder("spark.resources.warnings.testing")
+.version("3.1.0")

Review comment:
   This should be `3.0.1` when it comes to `branch-3.0`, @Ngone51 .






[GitHub] [spark] dongjoon-hyun closed pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



dongjoon-hyun closed pull request #29333:
URL: https://github.com/apache/spark/pull/29333


   




[GitHub] [spark] AmplabJenkins commented on pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #29429:
URL: https://github.com/apache/spark/pull/29429#issuecomment-673870274








[GitHub] [spark] AmplabJenkins removed a comment on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-13 Thread GitBox



AmplabJenkins removed a comment on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-673868046


   Merged build finished. Test FAILed.




[GitHub] [spark] SparkQA commented on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-13 Thread GitBox



SparkQA commented on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-673868031


   **[Test build #127432 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127432/testReport)** for PR 28841 at commit [`1ee4af4`](https://github.com/apache/spark/commit/1ee4af433229baa55b3b1d3c970ef362bb2525fa).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.




[GitHub] [spark] viirya commented on a change in pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



viirya commented on a change in pull request #29333:
URL: https://github.com/apache/spark/pull/29333#discussion_r470397100



##
File path: .github/workflows/master.yml
##
@@ -170,13 +170,19 @@ jobs:
 # Show installed packages in R.
 sudo Rscript -e 'pkg_list <- as.data.frame(installed.packages()[, c(1,3:4)]); pkg_list[is.na(pkg_list$Priority), 1:2, drop = FALSE]'
 # Run the tests.
-- name: "Run tests: ${{ matrix.modules }}"
+- name: Run tests
   run: |
 # Hive tests become flaky when running in parallel as it's too intensive.
 if [[ "$MODULES_TO_TEST" == "hive" ]]; then export SERIAL_SBT_TESTS=1; fi
 mkdir -p ~/.m2
 ./dev/run-tests --parallelism 2 --modules "$MODULES_TO_TEST" --included-tags "$INCLUDED_TAGS" --excluded-tags "$EXCLUDED_TAGS"
 rm -rf ~/.m2/repository/org/apache/spark
+- name: Upload test results to report
+  if: always()
+  uses: actions/upload-artifact@v2

Review comment:
   If the previous `Run tests` step passes without failure, do we still need to 
run this? I remember GitHub Actions has conditions other than `always()` that 
can be used.






[GitHub] [spark] HyukjinKwon commented on a change in pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



HyukjinKwon commented on a change in pull request #29333:
URL: https://github.com/apache/spark/pull/29333#discussion_r470397420



##
File path: .github/workflows/master.yml
##
@@ -170,13 +170,19 @@ jobs:
 # Show installed packages in R.
 sudo Rscript -e 'pkg_list <- as.data.frame(installed.packages()[, c(1,3:4)]); pkg_list[is.na(pkg_list$Priority), 1:2, drop = FALSE]'
 # Run the tests.
-- name: "Run tests: ${{ matrix.modules }}"
+- name: Run tests
   run: |
 # Hive tests become flaky when running in parallel as it's too intensive.
 if [[ "$MODULES_TO_TEST" == "hive" ]]; then export SERIAL_SBT_TESTS=1; fi
 mkdir -p ~/.m2
 ./dev/run-tests --parallelism 2 --modules "$MODULES_TO_TEST" --included-tags "$INCLUDED_TAGS" --excluded-tags "$EXCLUDED_TAGS"
 rm -rf ~/.m2/repository/org/apache/spark
+- name: Upload test results to report
+  if: always()
+  uses: actions/upload-artifact@v2

Review comment:
   Yeah, if the tests fail, it should upload JUnit XML files and then 
report the failed test cases. GitHub Actions has things like `failure()` but I 
think we should run this always.






[GitHub] [spark] SparkQA removed a comment on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-13 Thread GitBox



SparkQA removed a comment on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-673865344


   **[Test build #127432 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127432/testReport)** for PR 28841 at commit [`1ee4af4`](https://github.com/apache/spark/commit/1ee4af433229baa55b3b1d3c970ef362bb2525fa).




[GitHub] [spark] AmplabJenkins commented on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-673868046








[GitHub] [spark] HyukjinKwon commented on pull request #29333: [SPARK-32357][INFRA] Publish failed and succeeded test reports in GitHub Actions

2020-08-13 Thread GitBox



HyukjinKwon commented on pull request #29333:
URL: https://github.com/apache/spark/pull/29333#issuecomment-673869619


   I opened a PR to verify that this works well both in the main commit 
(https://github.com/apache/spark/commit/5debde94019d46d4ab66f7927d9e5e8c4d16a7ec) 
and in the PR (https://github.com/apache/spark/pull/29429).




[GitHub] [spark] HyukjinKwon opened a new pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



HyukjinKwon opened a new pull request #29429:
URL: https://github.com/apache/spark/pull/29429


   ### What changes were proposed in this pull request?
   
   This PR is to trigger the test report at https://github.com/apache/spark/pull/29333.
   
   ### Why are the changes needed?
   
   N/A
   
   ### Does this PR introduce _any_ user-facing change?
   
   N/A
   
   ### How was this patch tested?
   
   N/A
   




[GitHub] [spark] Karl-WangSK commented on pull request #29360: [SPARK-32542][SQL] Add an optimizer rule to split an Expand into multiple Expands for aggregates

2020-08-13 Thread GitBox



Karl-WangSK commented on pull request #29360:
URL: https://github.com/apache/spark/pull/29360#issuecomment-673868646


   Yes. The shuffle output is the same because the size of the data is the 
same. 
   As you can see in the benchmark, compare a cube on 7 fields (k1, k2, k3, k4, 
k5, k6, k7; 128x projections) with a cube on 6 fields (k1, k2, k3, k4, k5, k6; 
64x projections), both with grouping off.
   The data size only doubles, but one takes 2.4 min and the other 8.7 min, 
which is much more than double. The time is affected by the data size, 
especially when memory is limited.
   The original data I created is about 20M, and executor memory is 1g. When 
the data expands to 64x or 128x, it has a big impact on shuffle performance.
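
The 64x and 128x figures follow from CUBE grouping by every subset of its keys. A minimal sketch of that arithmetic, assuming nothing about Spark's `Expand` implementation (`CubeExpansion` and `groupingSets` are hypothetical names used only for illustration):

```scala
// Hypothetical helper, not Spark's implementation: enumerate the grouping
// sets that a CUBE over the given keys expands to.
object CubeExpansion {
  // CUBE(k1..kn) groups by every subset of the keys, so each input row is
  // projected once per subset, i.e. 2^n times.
  def groupingSets(keys: Seq[String]): Seq[Seq[String]] =
    (0 to keys.size).flatMap(n => keys.combinations(n).toSeq)

  def main(args: Array[String]): Unit = {
    val six = Seq("k1", "k2", "k3", "k4", "k5", "k6")
    val seven = six :+ "k7"
    println(s"6 keys: ${groupingSets(six).size} projections per input row")
    println(s"7 keys: ${groupingSets(seven).size} projections per input row")
  }
}
```

Adding one key doubles the number of projected rows, which is why the expanded data for 7 fields is twice that for 6 fields.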
   
   
   




[GitHub] [spark] SparkQA commented on pull request #29429: [DO-NOT-MERGE] Verify GitHub Actions test report

2020-08-13 Thread GitBox



SparkQA commented on pull request #29429:
URL: https://github.com/apache/spark/pull/29429#issuecomment-673870087


   **[Test build #127433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/127433/testReport)** for PR 29429 at commit [`c356076`](https://github.com/apache/spark/commit/c356076a0b761e9e8f598fe8468eca5191b7bad6).




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29395: [3.0][SPARK-32518][CORE] CoarseGrainedSchedulerBackend.maxNumConcurrentTasks should consider all kinds of resources

2020-08-13 Thread GitBox



dongjoon-hyun commented on a change in pull request #29395:
URL: https://github.com/apache/spark/pull/29395#discussion_r470395606



##
File path: core/src/main/scala/org/apache/spark/internal/config/Tests.scala
##
@@ -61,4 +61,19 @@ private[spark] object Tests {
 .version("3.0.0")
 .intConf
 .createWithDefault(2)
+
+  val RESOURCES_WARNING_TESTING = ConfigBuilder("spark.resources.warnings.testing")
+.version("3.1.0")
+.booleanConf
+.createWithDefault(false)
+
+  // This configuration is used for unit tests to allow skipping the task cpus to cores validation
+  // to allow emulating standalone mode behavior while running in local mode. Standalone mode
+  // by default doesn't specify a number of executor cores, it just uses all the ones available
+  // on the host.
+  val SKIP_VALIDATE_CORES_TESTING =
+  ConfigBuilder("spark.testing.skipValidateCores")
+.version("3.1.0")
+.booleanConf
+.createWithDefault(false)

Review comment:
   Ditto. This should be `3.0.1` when it comes to `branch-3.0`, @Ngone51 .
   Also, after merging this, please update the `master` branch consistently.






[GitHub] [spark] dongjoon-hyun commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-08-13 Thread GitBox



dongjoon-hyun commented on pull request #28904:
URL: https://github.com/apache/spark/pull/28904#issuecomment-673870133


   Sure. Have a nice vacation and take care, @HeartSaVioR .




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #29427: [SPARK-25557][SQL][TEST][Followup] Add case-sensitivity test for ORC predicate pushdown

2020-08-13 Thread GitBox



dongjoon-hyun commented on a change in pull request #29427:
URL: https://github.com/apache/spark/pull/29427#discussion_r470396482



##
File path: sql/core/v1.2/src/test/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilterSuite.scala
##
@@ -513,5 +513,98 @@ class OrcFilterSuite extends OrcTest with SharedSparkSession {
   ).get.toString
 }
   }
+
+  test("SPARK-25557: case sensitivity in predicate pushdown") {
+withTempPath { dir =>
+  val count = 10
+  val tableName = "spark_25557"
+  val tableDir1 = dir.getAbsoluteFile + "/table1"
+
+  // Physical ORC files have both `A` and `a` fields.
+  withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+spark.range(count).repartition(count).selectExpr("id - 1 as A", "id as a")
+  .write.mode("overwrite").orc(tableDir1)
+  }
+
+  // Metastore table has both `A` and `a` fields too.
+  withTable(tableName) {
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+  sql(
+s"""
+   |CREATE TABLE $tableName (A LONG, a LONG) USING ORC LOCATION 
'$tableDir1'
+ """.stripMargin)
+
+  checkAnswer(sql(s"select a, A from $tableName"), (0 until count).map(c => Row(c, c - 1)))
+
+  val actual1 = stripSparkFilter(sql(s"select A from $tableName where A < 0"))
+  assert(actual1.count() == 1)
+
+  val actual2 = stripSparkFilter(sql(s"select A from $tableName where a < 0"))
+  assert(actual2.count() == 0)
+}
+
+// Exception thrown for ambiguous case.
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  val e = intercept[AnalysisException] {
+sql(s"select a from $tableName where a < 0").collect()
+  }
+  assert(e.getMessage.contains(
+"Reference 'a' is ambiguous"))
+}
+  }
+
+  // Metastore table has only `A` field.
+  withTable(tableName) {
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  sql(
+s"""
+   |CREATE TABLE $tableName (A LONG) USING ORC LOCATION '$tableDir1'
+ """.stripMargin)
+
+  val e = intercept[SparkException] {
+sql(s"select A from $tableName where A < 0").collect()
+  }
+  assert(e.getCause.isInstanceOf[RuntimeException] && e.getCause.getMessage.contains(
+"""Found duplicate field(s) "A": [A, a] in case-insensitive mode"""))
+}
+  }
+
+  // Physical ORC files have only `A` field.
+  val tableDir2 = dir.getAbsoluteFile + "/table2"
+  withSQLConf(SQLConf.CASE_SENSITIVE.key -> "true") {
+spark.range(count).repartition(count).selectExpr("id - 1 as A")
+  .write.mode("overwrite").orc(tableDir2)
+  }
+
+  withTable(tableName) {
+withSQLConf(SQLConf.CASE_SENSITIVE.key -> "false") {
+  sql(
+s"""
+   |CREATE TABLE $tableName (a LONG) USING ORC LOCATION '$tableDir2'
+ """.stripMargin)
+
+  checkAnswer(sql(s"select a from $tableName"), (0 until count).map(c => Row(c - 1)))
+
+  val actual = stripSparkFilter(sql(s"select a from $tableName where a < 0"))
+  // TODO: ORC predicate pushdown should work under case-insensitive analysis.
+  // assert(actual.count() == 1)

Review comment:
   Can we have this case on `branch-3.0`, too?






[GitHub] [spark] AmplabJenkins commented on pull request #28841: [SPARK-31962][SQL] Provide modifiedAfter and modifiedBefore options when filtering from a batch-based file data source

2020-08-13 Thread GitBox



AmplabJenkins commented on pull request #28841:
URL: https://github.com/apache/spark/pull/28841#issuecomment-673865720







