[GitHub] [spark] AmplabJenkins removed a comment on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27570: [SPARK-30820][SPARKR][ML] Add 
FMClassifier to SparkR
URL: https://github.com/apache/spark/pull/27570#issuecomment-594176426
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23996/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] SparkQA commented on issue #27752: [SPARK-30999][SQL] Don't cancel a QueryStageExec which failed before call doMaterialize

2020-03-03 Thread GitBox
SparkQA commented on issue #27752: [SPARK-30999][SQL] Don't cancel a 
QueryStageExec which failed before call doMaterialize 
URL: https://github.com/apache/spark/pull/27752#issuecomment-594176778
 
 
   **[Test build #119236 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119236/testReport)** for PR 27752 at commit [`654d9d3`](https://github.com/apache/spark/commit/654d9d33bbffdaee7d818f938dd6a1f271208c0d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27571: [SPARK-30819][SPARKR][ML] Add FMRegressor wrapper to SparkR

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27571: [SPARK-30819][SPARKR][ML]  Add 
FMRegressor wrapper to SparkR
URL: https://github.com/apache/spark/pull/27571#issuecomment-594176386
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27571: [SPARK-30819][SPARKR][ML] Add FMRegressor wrapper to SparkR

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27571: [SPARK-30819][SPARKR][ML]  Add 
FMRegressor wrapper to SparkR
URL: https://github.com/apache/spark/pull/27571#issuecomment-594176398
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23995/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27571: [SPARK-30819][SPARKR][ML] Add FMRegressor wrapper to SparkR

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27571: [SPARK-30819][SPARKR][ML]  Add 
FMRegressor wrapper to SparkR
URL: https://github.com/apache/spark/pull/27571#issuecomment-594176386
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27570: [SPARK-30820][SPARKR][ML] Add 
FMClassifier to SparkR
URL: https://github.com/apache/spark/pull/27570#issuecomment-594176426
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23996/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594176363
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594176377
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23993/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27593: [SPARK-30818][SPARKR][ML] Add SparkR LinearRegression wrapper

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27593: [SPARK-30818][SPARKR][ML] Add SparkR 
LinearRegression wrapper
URL: https://github.com/apache/spark/pull/27593#issuecomment-594176297
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27593: [SPARK-30818][SPARKR][ML] Add SparkR LinearRegression wrapper

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27593: [SPARK-30818][SPARKR][ML] Add 
SparkR LinearRegression wrapper
URL: https://github.com/apache/spark/pull/27593#issuecomment-594176309
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23994/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27571: [SPARK-30819][SPARKR][ML] Add FMRegressor wrapper to SparkR

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27571: [SPARK-30819][SPARKR][ML]  Add 
FMRegressor wrapper to SparkR
URL: https://github.com/apache/spark/pull/27571#issuecomment-594176398
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23995/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594176363
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27593: [SPARK-30818][SPARKR][ML] Add SparkR LinearRegression wrapper

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27593: [SPARK-30818][SPARKR][ML] Add 
SparkR LinearRegression wrapper
URL: https://github.com/apache/spark/pull/27593#issuecomment-594176297
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27776: [SPARK-31024][SQL] Allow 
specifying session catalog name `spark_catalog` in qualified column names for 
v1 tables
URL: https://github.com/apache/spark/pull/27776#issuecomment-594170070
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594176377
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23993/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27570: [SPARK-30820][SPARKR][ML] Add 
FMClassifier to SparkR
URL: https://github.com/apache/spark/pull/27570#issuecomment-594176417
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27593: [SPARK-30818][SPARKR][ML] Add SparkR LinearRegression wrapper

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27593: [SPARK-30818][SPARKR][ML] Add SparkR 
LinearRegression wrapper
URL: https://github.com/apache/spark/pull/27593#issuecomment-594176309
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23994/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27571: [SPARK-30819][SPARKR][ML] Add FMRegressor wrapper to SparkR

2020-03-03 Thread GitBox
SparkQA commented on issue #27571: [SPARK-30819][SPARKR][ML]  Add FMRegressor 
wrapper to SparkR
URL: https://github.com/apache/spark/pull/27571#issuecomment-594175789
 
 
   **[Test build #119254 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119254/testReport)** for PR 27571 at commit [`4c5b2a5`](https://github.com/apache/spark/commit/4c5b2a59574f59927b962f9657f82837f88db74b).





[GitHub] [spark] SparkQA commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier to SparkR

2020-03-03 Thread GitBox
SparkQA commented on issue #27570: [SPARK-30820][SPARKR][ML] Add FMClassifier 
to SparkR
URL: https://github.com/apache/spark/pull/27570#issuecomment-594175786
 
 
   **[Test build #119255 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119255/testReport)** for PR 27570 at commit [`2156bed`](https://github.com/apache/spark/commit/2156bed223ec28279fbaa18e2bc0f8c47ade7d0d).





[GitHub] [spark] SparkQA commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
SparkQA commented on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594175748
 
 
   **[Test build #119253 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119253/testReport)** for PR 27728 at commit [`77ea177`](https://github.com/apache/spark/commit/77ea177985516e05bf89e3c05a9c87050583).





[GitHub] [spark] zero323 commented on issue #27593: [SPARK-30818][SPARKR][ML] Add SparkR LinearRegression wrapper

2020-03-03 Thread GitBox
zero323 commented on issue #27593: [SPARK-30818][SPARKR][ML] Add SparkR 
LinearRegression wrapper
URL: https://github.com/apache/spark/pull/27593#issuecomment-594175502
 
 
   Retest this please.





[GitHub] [spark] dongjoon-hyun commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
dongjoon-hyun commented on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594175144
 
 
   @dbtsai . You can address the above comments in the spin-off PR.
   - https://github.com/apache/spark/pull/27778





[GitHub] [spark] zero323 commented on issue #27571: [SPARK-30819][SPARKR][ML] Add FMRegressor wrapper to SparkR

2020-03-03 Thread GitBox
zero323 commented on issue #27571: [SPARK-30819][SPARKR][ML]  Add FMRegressor 
wrapper to SparkR
URL: https://github.com/apache/spark/pull/27571#issuecomment-594175279
 
 
   Retest this please.





[GitHub] [spark] dongjoon-hyun commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
dongjoon-hyun commented on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594175454
 
 
   Retest this please.





[GitHub] [spark] dongjoon-hyun edited a comment on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
dongjoon-hyun edited a comment on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594175144
 
 
   @dbtsai . You can address the above two comments in the spin-off PR.
   - https://github.com/apache/spark/pull/27778





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
dongjoon-hyun commented on a change in pull request #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#discussion_r387299506
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ##
 @@ -437,61 +437,74 @@ object DataSourceStrategy {
 }
   }
 
+  /**
+   * Find the column name of an expression that can be pushed down.
+   */
+  private[sql] def pushDownColName(e: Expression): Option[String] = {
+import org.apache.spark.sql.connector.catalog.CatalogV2Implicits.MultipartIdentifierHelper
+def helper(e: Expression): Option[Seq[String]] = e match {
+  case a: Attribute => Some(Seq(a.name))
+  case s: GetStructField => helper(s.child).map(_ :+ s.childSchema(s.ordinal).name)
+  case _ => None
+}
+helper(e).map(_.quoted)
+  }
+
   private def translateLeafNodeFilter(predicate: Expression): Option[Filter] = predicate match {
-case expressions.EqualTo(a: Attribute, Literal(v, t)) =>
-  Some(sources.EqualTo(a.name, convertToScala(v, t)))
-case expressions.EqualTo(Literal(v, t), a: Attribute) =>
-  Some(sources.EqualTo(a.name, convertToScala(v, t)))
-
-case expressions.EqualNullSafe(a: Attribute, Literal(v, t)) =>
-  Some(sources.EqualNullSafe(a.name, convertToScala(v, t)))
-case expressions.EqualNullSafe(Literal(v, t), a: Attribute) =>
-  Some(sources.EqualNullSafe(a.name, convertToScala(v, t)))
-
-case expressions.GreaterThan(a: Attribute, Literal(v, t)) =>
-  Some(sources.GreaterThan(a.name, convertToScala(v, t)))
-case expressions.GreaterThan(Literal(v, t), a: Attribute) =>
-  Some(sources.LessThan(a.name, convertToScala(v, t)))
-
-case expressions.LessThan(a: Attribute, Literal(v, t)) =>
-  Some(sources.LessThan(a.name, convertToScala(v, t)))
-case expressions.LessThan(Literal(v, t), a: Attribute) =>
-  Some(sources.GreaterThan(a.name, convertToScala(v, t)))
-
-case expressions.GreaterThanOrEqual(a: Attribute, Literal(v, t)) =>
-  Some(sources.GreaterThanOrEqual(a.name, convertToScala(v, t)))
-case expressions.GreaterThanOrEqual(Literal(v, t), a: Attribute) =>
-  Some(sources.LessThanOrEqual(a.name, convertToScala(v, t)))
-
-case expressions.LessThanOrEqual(a: Attribute, Literal(v, t)) =>
-  Some(sources.LessThanOrEqual(a.name, convertToScala(v, t)))
-case expressions.LessThanOrEqual(Literal(v, t), a: Attribute) =>
-  Some(sources.GreaterThanOrEqual(a.name, convertToScala(v, t)))
-
-case expressions.InSet(a: Attribute, set) =>
-  val toScala = CatalystTypeConverters.createToScalaConverter(a.dataType)
-  Some(sources.In(a.name, set.toArray.map(toScala)))
+case expressions.EqualTo(e: Expression, Literal(v, t)) =>
+  pushDownColName(e).map(sources.EqualTo(_, convertToScala(v, t)))
+case expressions.EqualTo(Literal(v, t), e: Expression) =>
+  pushDownColName(e).map(sources.EqualTo(_, convertToScala(v, t)))
+
+case expressions.EqualNullSafe(e: Expression, Literal(v, t)) =>
+  pushDownColName(e).map(sources.EqualNullSafe(_, convertToScala(v, t)))
+case expressions.EqualNullSafe(Literal(v, t), e: Expression) =>
+  pushDownColName(e).map(sources.EqualNullSafe(_, convertToScala(v, t)))
+
+case expressions.GreaterThan(e: Expression, Literal(v, t)) =>
+  pushDownColName(e).map(sources.GreaterThan(_, convertToScala(v, t)))
+case expressions.GreaterThan(Literal(v, t), e: Expression) =>
+  pushDownColName(e).map(sources.LessThan(_, convertToScala(v, t)))
+
+case expressions.LessThan(e: Expression, Literal(v, t)) =>
+  pushDownColName(e).map(sources.LessThan(_, convertToScala(v, t)))
+case expressions.LessThan(Literal(v, t), e: Expression) =>
+  pushDownColName(e).map(sources.GreaterThan(_, convertToScala(v, t)))
+
+case expressions.GreaterThanOrEqual(e: Expression, Literal(v, t)) =>
+  pushDownColName(e).map(sources.GreaterThanOrEqual(_, convertToScala(v, t)))
+case expressions.GreaterThanOrEqual(Literal(v, t), e: Expression) =>
+  pushDownColName(e).map(sources.LessThanOrEqual(_, convertToScala(v, t)))
+
+case expressions.LessThanOrEqual(e: Expression, Literal(v, t)) =>
+  pushDownColName(e).map(sources.LessThanOrEqual(_, convertToScala(v, t)))
+case expressions.LessThanOrEqual(Literal(v, t), e: Expression) =>
+  pushDownColName(e).map(sources.GreaterThanOrEqual(_, convertToScala(v, t)))
+
+case expressions.InSet(e: Expression, set) =>
+  val toScala = CatalystTypeConverters.createToScalaConverter(e.dataType)
+  pushDownColName(e).map(sources.In(_, set.toArray.map(toScala)))
 
 // Because we only convert In to InSet in Optimizer when there are more than certain
 // items. So it is possible we still get an In expression here that needs to be pushed
 // down.
-case 
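
The `pushDownColName` helper in the hunk above walks a chain of struct-field accesses down to a top-level attribute and joins the collected name parts into one dotted column name. A minimal, Spark-free sketch of that recursion follows. `Expr`, `Attr`, `GetField`, `Plus` and `quoteIfNeeded` are hypothetical stand-ins introduced here for illustration (they are not Catalyst classes), and the backtick quoting rule is an assumption about what `.quoted` does, not a copy of it.

```scala
object PushDownSketch {
  // Hypothetical stand-ins for Catalyst's Expression, Attribute and GetStructField.
  sealed trait Expr
  final case class Attr(name: String) extends Expr
  final case class GetField(child: Expr, fieldName: String) extends Expr
  final case class Plus(left: Expr, right: Expr) extends Expr

  // Assumed quoting rule: wrap a name part in backticks only when it contains
  // a dot or a backtick, doubling any embedded backticks.
  private def quoteIfNeeded(part: String): String =
    if (part.contains(".") || part.contains("`")) "`" + part.replace("`", "``") + "`"
    else part

  // Returns the dotted column name when the expression is an attribute or a
  // chain of field accesses rooted at an attribute; None for anything else.
  def pushDownColName(e: Expr): Option[String] = {
    def helper(e: Expr): Option[Seq[String]] = e match {
      case Attr(name)                 => Some(Seq(name))
      case GetField(child, fieldName) => helper(child).map(_ :+ fieldName)
      case _                          => None
    }
    helper(e).map(_.map(quoteIfNeeded).mkString("."))
  }
}
```

Under these assumptions, `PushDownSketch.pushDownColName(GetField(GetField(Attr("a"), "b"), "c"))` yields `Some("a.b.c")`, while an arithmetic expression such as `Plus(Attr("a"), Attr("b"))` yields `None`, so a filter on it would not be pushed down.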

[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
dongjoon-hyun commented on a change in pull request #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#discussion_r387298806
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ##
 @@ -437,61 +437,74 @@ object DataSourceStrategy {
 }
   }
 
+  /**
+   * Find the column name of an expression that can be pushed down.
+   */
+  private[sql] def pushDownColName(e: Expression): Option[String] = {
+import org.apache.spark.sql.connector.catalog.CatalogV2Implicits.MultipartIdentifierHelper
+def helper(e: Expression): Option[Seq[String]] = e match {
+  case a: Attribute => Some(Seq(a.name))
+  case s: GetStructField => helper(s.child).map(_ :+ s.childSchema(s.ordinal).name)
+  case _ => None
+}
+helper(e).map(_.quoted)
+  }
+
   private def translateLeafNodeFilter(predicate: Expression): Option[Filter] = predicate match {
-case expressions.EqualTo(a: Attribute, Literal(v, t)) =>
-  Some(sources.EqualTo(a.name, convertToScala(v, t)))
-case expressions.EqualTo(Literal(v, t), a: Attribute) =>
-  Some(sources.EqualTo(a.name, convertToScala(v, t)))
-
-case expressions.EqualNullSafe(a: Attribute, Literal(v, t)) =>
-  Some(sources.EqualNullSafe(a.name, convertToScala(v, t)))
-case expressions.EqualNullSafe(Literal(v, t), a: Attribute) =>
-  Some(sources.EqualNullSafe(a.name, convertToScala(v, t)))
-
-case expressions.GreaterThan(a: Attribute, Literal(v, t)) =>
-  Some(sources.GreaterThan(a.name, convertToScala(v, t)))
-case expressions.GreaterThan(Literal(v, t), a: Attribute) =>
-  Some(sources.LessThan(a.name, convertToScala(v, t)))
-
-case expressions.LessThan(a: Attribute, Literal(v, t)) =>
-  Some(sources.LessThan(a.name, convertToScala(v, t)))
-case expressions.LessThan(Literal(v, t), a: Attribute) =>
-  Some(sources.GreaterThan(a.name, convertToScala(v, t)))
-
-case expressions.GreaterThanOrEqual(a: Attribute, Literal(v, t)) =>
-  Some(sources.GreaterThanOrEqual(a.name, convertToScala(v, t)))
-case expressions.GreaterThanOrEqual(Literal(v, t), a: Attribute) =>
-  Some(sources.LessThanOrEqual(a.name, convertToScala(v, t)))
-
-case expressions.LessThanOrEqual(a: Attribute, Literal(v, t)) =>
-  Some(sources.LessThanOrEqual(a.name, convertToScala(v, t)))
-case expressions.LessThanOrEqual(Literal(v, t), a: Attribute) =>
-  Some(sources.GreaterThanOrEqual(a.name, convertToScala(v, t)))
-
-case expressions.InSet(a: Attribute, set) =>
-  val toScala = CatalystTypeConverters.createToScalaConverter(a.dataType)
-  Some(sources.In(a.name, set.toArray.map(toScala)))
+case expressions.EqualTo(e: Expression, Literal(v, t)) =>
+  pushDownColName(e).map(sources.EqualTo(_, convertToScala(v, t)))
+case expressions.EqualTo(Literal(v, t), e: Expression) =>
+  pushDownColName(e).map(sources.EqualTo(_, convertToScala(v, t)))
+
+case expressions.EqualNullSafe(e: Expression, Literal(v, t)) =>
+  pushDownColName(e).map(sources.EqualNullSafe(_, convertToScala(v, t)))
+case expressions.EqualNullSafe(Literal(v, t), e: Expression) =>
+  pushDownColName(e).map(sources.EqualNullSafe(_, convertToScala(v, t)))
+
+case expressions.GreaterThan(e: Expression, Literal(v, t)) =>
+  pushDownColName(e).map(sources.GreaterThan(_, convertToScala(v, t)))
+case expressions.GreaterThan(Literal(v, t), e: Expression) =>
+  pushDownColName(e).map(sources.LessThan(_, convertToScala(v, t)))
+
+case expressions.LessThan(e: Expression, Literal(v, t)) =>
+  pushDownColName(e).map(sources.LessThan(_, convertToScala(v, t)))
+case expressions.LessThan(Literal(v, t), e: Expression) =>
+  pushDownColName(e).map(sources.GreaterThan(_, convertToScala(v, t)))
+
+case expressions.GreaterThanOrEqual(e: Expression, Literal(v, t)) =>
+  pushDownColName(e).map(sources.GreaterThanOrEqual(_, convertToScala(v, t)))
+case expressions.GreaterThanOrEqual(Literal(v, t), e: Expression) =>
+  pushDownColName(e).map(sources.LessThanOrEqual(_, convertToScala(v, t)))
+
+case expressions.LessThanOrEqual(e: Expression, Literal(v, t)) =>
+  pushDownColName(e).map(sources.LessThanOrEqual(_, convertToScala(v, t)))
+case expressions.LessThanOrEqual(Literal(v, t), e: Expression) =>
+  pushDownColName(e).map(sources.GreaterThanOrEqual(_, convertToScala(v, t)))
+
+case expressions.InSet(e: Expression, set) =>
+  val toScala = CatalystTypeConverters.createToScalaConverter(e.dataType)
+  pushDownColName(e).map(sources.In(_, set.toArray.map(toScala)))
 
 Review comment:
   If you don't mind, can we rewrite this like the following to prevent a 
potential minor regression? The new code above executes 
`CatalystTypeConverters.createToScalaConverter` for all expressions while

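The reviewer's suggested rewrite is cut off in this excerpt, so the following is only a minimal sketch of one way to avoid building the converter for expressions that cannot be pushed down. All types here (`Expr`, `Attr`, `Other`, `In`) and the helper `createConverter` are hypothetical stand-ins, not the real Catalyst or `sources` classes; the counter exists only to make the saving observable.

```scala
// Hypothetical stand-ins; not the real Catalyst or sources classes.
sealed trait Expr
final case class Attr(name: String) extends Expr
final case class Other(description: String) extends Expr
final case class In(attribute: String, values: Seq[Any])

object InSetSketch {
  // Counts how many converters were constructed.
  var convertersBuilt = 0

  // Stand-in for CatalystTypeConverters.createToScalaConverter (assumption).
  def createConverter(e: Expr): Any => Any = {
    convertersBuilt += 1
    (v: Any) => v
  }

  // Returns a column name only for plain attributes.
  def colName(e: Expr): Option[String] = e match {
    case Attr(n) => Some(n)
    case _ => None
  }

  // The converter is created inside map, so it is built only
  // when a pushable column name was actually found.
  def translateInSet(e: Expr, set: Set[Any]): Option[In] =
    colName(e).map { name =>
      val toScala = createConverter(e)
      In(name, set.toSeq.map(toScala))
    }
}
```

With this shape, a non-attribute expression returns `None` without ever constructing a converter.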
[GitHub] [spark] AmplabJenkins commented on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27776: [SPARK-31024][SQL] Allow specifying 
session catalog name `spark_catalog` in qualified column names for v1 tables
URL: https://github.com/apache/spark/pull/27776#issuecomment-594170082
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119249/
   Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27776: [SPARK-31024][SQL] Allow specifying 
session catalog name `spark_catalog` in qualified column names for v1 tables
URL: https://github.com/apache/spark/pull/27776#issuecomment-594170070
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] 
Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387294541
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala
 ##
 @@ -437,61 +437,72 @@ object DataSourceStrategy {
 }
   }
 
+  /**
+   * Find the column name of an expression that can be pushed down.
+   */
+  private[sql] def pushDownColName(e: Expression): Option[String] = {
+def helper(e: Expression): Option[Seq[String]] = e match {
+  case a: Attribute => Some(Seq(a.name))
+  case _ => None
+}
+helper(e).flatMap(_.headOption)
 
 Review comment:
   Although I know the background, shall we write it in the following simpler 
way in this PR?
   ```scala
   def helper(e: Expression): Option[String] = e match {
     case a: Attribute => Some(a.name)
     case _ => None
   }
   helper(e)
   ```
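
The simplification suggested above can be tried outside Spark with a small self-contained sketch. The `Expression`, `Attribute`, and `Add` types below are hypothetical stand-ins for the Catalyst classes, introduced only for illustration; once the helper returns `Option[String]` directly, the inner function is arguably unnecessary, so the sketch folds it into `pushDownColName`.

```scala
// Hypothetical stand-ins for Catalyst's Expression and Attribute;
// these are NOT the real Spark classes.
sealed trait Expression
final case class Attribute(name: String) extends Expression
final case class Add(left: Expression, right: Expression) extends Expression

object PushDownSketch {
  // Returns the column name when the expression is a plain attribute,
  // and None for anything that cannot be pushed down by name.
  def pushDownColName(e: Expression): Option[String] = e match {
    case a: Attribute => Some(a.name)
    case _ => None
  }
}
```

Here `PushDownSketch.pushDownColName(Attribute("cint"))` yields `Some("cint")`, while a compound expression such as `Add(Attribute("a"), Attribute("b"))` yields `None`.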





[GitHub] [spark] SparkQA removed a comment on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables

2020-03-03 Thread GitBox
SparkQA removed a comment on issue #27776: [SPARK-31024][SQL] Allow specifying 
session catalog name `spark_catalog` in qualified column names for v1 tables
URL: https://github.com/apache/spark/pull/27776#issuecomment-594116491
 
 
   **[Test build #119249 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119249/testReport)**
 for PR 27776 at commit 
[`e588663`](https://github.com/apache/spark/commit/e5886637eed1166f5b9abbbe669709573ce289ce).





[GitHub] [spark] SparkQA commented on issue #27776: [SPARK-31024][SQL] Allow specifying session catalog name `spark_catalog` in qualified column names for v1 tables

2020-03-03 Thread GitBox
SparkQA commented on issue #27776: [SPARK-31024][SQL] Allow specifying session 
catalog name `spark_catalog` in qualified column names for v1 tables
URL: https://github.com/apache/spark/pull/27776#issuecomment-594169684
 
 
   **[Test build #119249 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119249/testReport)**
 for PR 27776 at commit 
[`e588663`](https://github.com/apache/spark/commit/e5886637eed1166f5b9abbbe669709573ce289ce).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] tgravescs commented on issue #27773: [SPARK-29154][CORE] Update Spark scheduler for stage level scheduling

2020-03-03 Thread GitBox
tgravescs commented on issue #27773: [SPARK-29154][CORE] Update Spark scheduler 
for stage level scheduling
URL: https://github.com/apache/spark/pull/27773#issuecomment-594169369
 
 
   @mridulm @squito  if either of you have time to review





[GitHub] [spark] AmplabJenkins removed a comment on issue #27771: [SPARK-31020][SQL] Support foldable schemas by `from_csv`

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27771: [SPARK-31020][SQL] Support 
foldable schemas by `from_csv`
URL: https://github.com/apache/spark/pull/27771#issuecomment-594164395
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119239/
   Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594164391
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119248/
   Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594164378
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA removed a comment on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
SparkQA removed a comment on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594112894
 
 
   **[Test build #119248 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119248/testReport)**
 for PR 27728 at commit 
[`77ea177`](https://github.com/apache/spark/commit/77ea177985516e05bf89e3c05a9c87050583).





[GitHub] [spark] AmplabJenkins removed a comment on issue #27771: [SPARK-31020][SQL] Support foldable schemas by `from_csv`

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27771: [SPARK-31020][SQL] Support 
foldable schemas by `from_csv`
URL: https://github.com/apache/spark/pull/27771#issuecomment-594164383
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #27771: [SPARK-31020][SQL] Support foldable schemas by `from_csv`

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27771: [SPARK-31020][SQL] Support foldable 
schemas by `from_csv`
URL: https://github.com/apache/spark/pull/27771#issuecomment-594164383
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594164391
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119248/
   Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #27771: [SPARK-31020][SQL] Support foldable schemas by `from_csv`

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27771: [SPARK-31020][SQL] Support foldable 
schemas by `from_csv`
URL: https://github.com/apache/spark/pull/27771#issuecomment-594164395
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119239/
   Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594164378
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA removed a comment on issue #27771: [SPARK-31020][SQL] Support foldable schemas by `from_csv`

2020-03-03 Thread GitBox
SparkQA removed a comment on issue #27771: [SPARK-31020][SQL] Support foldable 
schemas by `from_csv`
URL: https://github.com/apache/spark/pull/27771#issuecomment-594045806
 
 
   **[Test build #119239 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119239/testReport)**
 for PR 27771 at commit 
[`a72568e`](https://github.com/apache/spark/commit/a72568ef7560e0996a012235873cc5bc395ec364).





[GitHub] [spark] SparkQA commented on issue #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
SparkQA commented on issue #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#issuecomment-594164190
 
 
   **[Test build #119248 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119248/testReport)**
 for PR 27728 at commit 
[`77ea177`](https://github.com/apache/spark/commit/77ea177985516e05bf89e3c05a9c87050583).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] SparkQA commented on issue #27771: [SPARK-31020][SQL] Support foldable schemas by `from_csv`

2020-03-03 Thread GitBox
SparkQA commented on issue #27771: [SPARK-31020][SQL] Support foldable schemas 
by `from_csv`
URL: https://github.com/apache/spark/pull/27771#issuecomment-594164032
 
 
   **[Test build #119239 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119239/testReport)**
 for PR 27771 at commit 
[`a72568e`](https://github.com/apache/spark/commit/a72568ef7560e0996a012235873cc5bc395ec364).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594162321
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594162329
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23992/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594162329
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23992/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594162321
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594161759
 
 
   **[Test build #119252 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119252/testReport)**
 for PR 27778 at commit 
[`ea2d1f6`](https://github.com/apache/spark/commit/ea2d1f6bbe6e57424097cc3b5c80fe0a6e90afe2).





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] 
Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387278397
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ##
 @@ -22,68 +22,82 @@ import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.plans.PlanTest
 import org.apache.spark.sql.sources
 import org.apache.spark.sql.test.SharedSparkSession
+import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}
 
 class DataSourceStrategySuite extends PlanTest with SharedSparkSession {
+  val attrInts = Seq(
+'cint.int,
+  ).zip(Seq(
+"cint",
+  ))
 
-  test("translate simple expression") {
-val attrInt = 'cint.int
-val attrStr = 'cstr.string
+  val attrStrs = Seq(
+'cstr.int,
+  ).zip(Seq(
+"cstr",
+  ))
+
+  test("translate simple expression") { attrInts.zip(attrStrs)
 
 Review comment:
   ~Indentation?~ Never mind.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594156029
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119251/
   Test FAILed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594156019
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] SparkQA removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
SparkQA removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594154950
 
 
   **[Test build #119251 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119251/testReport)**
 for PR 27778 at commit 
[`56c56a0`](https://github.com/apache/spark/commit/56c56a0e36fd8e610d9cb525e2cd9f8f08ba99ca).





[GitHub] [spark] SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594156000
 
 
   **[Test build #119251 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119251/testReport)**
 for PR 27778 at commit 
[`56c56a0`](https://github.com/apache/spark/commit/56c56a0e36fd8e610d9cb525e2cd9f8f08ba99ca).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594156029
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119251/
   Test FAILed.





[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594156019
 
 
   Merged build finished. Test FAILed.





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
dongjoon-hyun commented on a change in pull request #27778: [SPARK-31027] [SQL] 
Refactor DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#discussion_r387278397
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategySuite.scala
 ##
 @@ -22,68 +22,82 @@ import org.apache.spark.sql.catalyst.expressions._
 import org.apache.spark.sql.catalyst.plans.PlanTest
 import org.apache.spark.sql.sources
 import org.apache.spark.sql.test.SharedSparkSession
+import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}
 
 class DataSourceStrategySuite extends PlanTest with SharedSparkSession {
+  val attrInts = Seq(
+'cint.int,
+  ).zip(Seq(
+"cint",
+  ))
 
-  test("translate simple expression") {
-val attrInt = 'cint.int
-val attrStr = 'cstr.string
+  val attrStrs = Seq(
+'cstr.int,
+  ).zip(Seq(
+"cstr",
+  ))
+
+  test("translate simple expression") { attrInts.zip(attrStrs)
 
 Review comment:
   Indentation?





[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594151900
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594151910
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23991/
   Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
SparkQA commented on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594154950
 
 
   **[Test build #119251 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119251/testReport)**
 for PR 27778 at commit 
[`56c56a0`](https://github.com/apache/spark/commit/56c56a0e36fd8e610d9cb525e2cd9f8f08ba99ca).





[GitHub] [spark] dbtsai commented on issue #27778: [SPARK-31027] [SQL] Refactor DataSourceStrategy to be more extendable

2020-03-03 Thread GitBox
dbtsai commented on issue #27778: [SPARK-31027] [SQL] Refactor 
DataSourceStrategy to be more extendable
URL: https://github.com/apache/spark/pull/27778#issuecomment-594153195
 
 
   cc @dongjoon-hyun @gengliangwang @cloud-fan @rdblue 





[GitHub] [spark] dongjoon-hyun commented on issue #27769: [SPARK-30998][SQL][2.4] ClassCastException when a generator having nested inner generators

2020-03-03 Thread GitBox
dongjoon-hyun commented on issue #27769: [SPARK-30998][SQL][2.4] 
ClassCastException when a generator having nested inner generators
URL: https://github.com/apache/spark/pull/27769#issuecomment-594152999
 
 
   Thank you, @maropu and @cloud-fan .





[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor `DataSourceStrategy.scala` to minimize the changes to support nested predicate pushdown

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor 
`DataSourceStrategy.scala` to minimize the changes to support nested predicate 
pushdown
URL: https://github.com/apache/spark/pull/27778#issuecomment-594151900
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor `DataSourceStrategy.scala` to minimize the changes to support nested predicate pushdown

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27778: [SPARK-31027] [SQL] Refactor 
`DataSourceStrategy.scala` to minimize the changes to support nested predicate 
pushdown
URL: https://github.com/apache/spark/pull/27778#issuecomment-594151910
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23991/
   Test PASSed.





[GitHub] [spark] dbtsai opened a new pull request #27778: [SPARK-31027] [SQL] Refactor `DataSourceStrategy.scala` to minimize the changes to support nested predicate pushdown

2020-03-03 Thread GitBox
dbtsai opened a new pull request #27778: [SPARK-31027] [SQL] Refactor 
`DataSourceStrategy.scala` to minimize the changes to support nested predicate 
pushdown
URL: https://github.com/apache/spark/pull/27778
 
 
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce any user-facing change?
   
   
   
   ### How was this patch tested?
   
   





[GitHub] [spark] dongjoon-hyun closed pull request #27749: [SPARK-30997][SQL] Fix an analysis failure in generators with aggregate functions

2020-03-03 Thread GitBox
dongjoon-hyun closed pull request #27749: [SPARK-30997][SQL] Fix an analysis 
failure in generators with aggregate functions
URL: https://github.com/apache/spark/pull/27749
 
 
   





[GitHub] [spark] MaxGekk commented on a change in pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference from an example

2020-03-03 Thread GitBox
MaxGekk commented on a change in pull request #22666: [SPARK-25672][SQL] 
schema_of_csv() - schema inference from an example
URL: https://github.com/apache/spark/pull/22666#discussion_r387269929
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/ExprUtils.scala
 ##
 @@ -19,14 +19,39 @@ package org.apache.spark.sql.catalyst.expressions
 
 import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.util.ArrayBasedMapData
-import org.apache.spark.sql.types.{MapType, StringType, StructType}
+import org.apache.spark.sql.types.{DataType, MapType, StringType, StructType}
+import org.apache.spark.unsafe.types.UTF8String
 
 object ExprUtils {
 
-  def evalSchemaExpr(exp: Expression): StructType = exp match {
-case Literal(s, StringType) => StructType.fromDDL(s.toString)
+  def evalSchemaExpr(exp: Expression): StructType = {
+// Use `DataType.fromDDL` since the type string can be struct<...>.
+val dataType = exp match {
+  case Literal(s, StringType) =>
+DataType.fromDDL(s.toString)
+  case e @ SchemaOfCsv(_: Literal, _) =>
+val ddlSchema = e.eval(EmptyRow).asInstanceOf[UTF8String]
+DataType.fromDDL(ddlSchema.toString)
+  case e => throw new AnalysisException(
+"Schema should be specified in DDL format as a string literal or 
output of " +
+  s"the schema_of_csv function instead of ${e.sql}")
+}
+
+if (!dataType.isInstanceOf[StructType]) {
+  throw new AnalysisException(
+s"Schema should be struct type but got ${dataType.sql}.")
+}
+dataType.asInstanceOf[StructType]
+  }
+
+  def evalTypeExpr(exp: Expression): DataType = exp match {
+case Literal(s, StringType) => DataType.fromDDL(s.toString)
 
 Review comment:
   For example, a column with CSV strings may be the result of string functions, 
so you could just invoke those functions with particular inputs. Currently, 
we force people to materialize an example and copy-paste it into 
`schema_of_csv()`. That could cause maintainability issues, because users must 
keep the example in `schema_of_csv()` in sync with the code that forms the CSV 
column.
   
   I prepared the PR https://github.com/apache/spark/pull/2 to avoid the 
restriction, which is not necessary from my point of view.





[GitHub] [spark] viirya commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
viirya commented on a change in pull request #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#discussion_r387264452
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala
 ##
 @@ -49,15 +49,34 @@ class ParquetFilters(
 pushDownInFilterThreshold: Int,
 caseSensitive: Boolean) {
   // A map which contains parquet field name and data type, if predicate push 
down applies.
-  private val nameToParquetField : Map[String, ParquetField] = {
-// Here we don't flatten the fields in the nested schema but just look up 
through
-// root fields. Currently, accessing to nested fields does not push down 
filters
-// and it does not support to create filters for them.
-val primitiveFields =
-
schema.getFields.asScala.filter(_.isPrimitive).map(_.asPrimitiveType()).map { f 
=>
-  f.getName -> ParquetField(f.getName,
-ParquetSchemaType(f.getOriginalType,
-  f.getPrimitiveTypeName, f.getTypeLength, f.getDecimalMetadata))
+  // The keys are the column names. For nested column, `dot` will be used as a 
separator.
+  // For column name that contains `dot`, backquote will be used.
+  // See `org.apache.spark.sql.connector.catalog.quote` for implementation 
details.
+  private val nameToParquetField : Map[String, ParquetPrimitiveField] = {
+// Recursively traverse the parquet schema to get primitive fields that 
can be pushed-down.
+// `parentFieldNames` is used to keep track of the current nested level 
when traversing.
+def getPrimitiveFields(
+fields: Seq[Type],
+parentFieldNames: Array[String] = Array.empty): 
Seq[ParquetPrimitiveField] = {
+  fields.flatMap {
+case p: PrimitiveType =>
+  Some(ParquetPrimitiveField(fieldNames = parentFieldNames :+ 
p.getName,
+fieldType = ParquetSchemaType(p.getOriginalType,
+  p.getPrimitiveTypeName, p.getTypeLength, p.getDecimalMetadata)))
+// Note that when g is a `Struct`, `g.getOriginalType` is `null`.
+// When g is a `Map`, `g.getOriginalType` is `MAP`.
+// When g is a `List`, `g.getOriginalType` is `LIST`.
+case g: GroupType if g.getOriginalType == null =>
+  getPrimitiveFields(g.getFields.asScala, parentFieldNames :+ 
g.getName)
+// Parquet only supports push-down for primitive types; as a result, 
Map and List types
+// are removed.
+case _ => None
+  }
+}
+
+val primitiveFields = getPrimitiveFields(schema.getFields.asScala).map { 
field =>
+  import 
org.apache.spark.sql.connector.catalog.CatalogV2Implicits.MultipartIdentifierHelper
 
 Review comment:
   nit: move `import` outside `map`?
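
For readers following the diff above, the recursive traversal it describes can be sketched without Parquet on the classpath. This is only an illustration under stated assumptions: `ToyType`, `ToyPrimitive`, `ToyGroup` and `primitiveFields` are hypothetical stand-ins, not the Parquet or Spark API, and `original = None` plays the role of `getOriginalType == null` for a struct.

```scala
// Hypothetical stand-ins for Parquet's Type hierarchy (not the real API).
sealed trait ToyType { def name: String }
final case class ToyPrimitive(name: String, typeName: String) extends ToyType
// `original` mimics getOriginalType: None for a struct, Some("MAP") or Some("LIST") otherwise.
final case class ToyGroup(name: String, original: Option[String], fields: Seq[ToyType])
  extends ToyType

object NestedFieldSketch {
  // Recursively collect primitive leaves, carrying the names of the enclosing structs.
  def primitiveFields(
      fields: Seq[ToyType],
      parents: Vector[String] = Vector.empty): Seq[(Vector[String], String)] =
    fields.flatMap {
      // A primitive leaf: emit its full path and its type name.
      case ToyPrimitive(n, t) => Seq((parents :+ n, t))
      // A struct: descend, extending the path with the struct's name.
      case ToyGroup(n, None, children) => primitiveFields(children, parents :+ n)
      // Maps and lists are skipped, as in the diff.
      case _ => Seq.empty
    }

  def main(args: Array[String]): Unit = {
    val schema = Seq(
      ToyPrimitive("id", "INT32"),
      ToyGroup("person", None, Seq(ToyPrimitive("age", "INT32"))),
      ToyGroup("tags", Some("LIST"), Seq(ToyPrimitive("element", "BINARY"))))
    primitiveFields(schema).foreach { case (path, t) =>
      println(s"${path.mkString(".")} -> $t")
    }
  }
}
```

The sketch keeps the path as a sequence of names rather than a pre-joined string, which mirrors the `fieldNames` idea in the diff and leaves quoting to a later step.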





[GitHub] [spark] viirya commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
viirya commented on a change in pull request #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#discussion_r387261748
 
 

 ##
 File path: 
sql/core/src/main/java/org/apache/parquet/filter2/predicate/NestedFilterApi.java
 ##
 @@ -0,0 +1,57 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.parquet.filter2.predicate;
+
+import org.apache.parquet.hadoop.metadata.ColumnPath;
 
 Review comment:
   nit: import order?





[GitHub] [spark] viirya commented on a change in pull request #27728: [SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
viirya commented on a change in pull request #27728: 
[SPARK-25556][SPARK-17636][SPARK-31026][SQL][test-hive1.2] Nested Column 
Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#discussion_r387266696
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFilters.scala
 ##
 @@ -75,13 +94,13 @@ class ParquetFilters(
   }
 
   /**
-   * Holds a single field information stored in the underlying parquet file.
+   * Holds a single primitive field information stored in the underlying 
parquet file.
*
-   * @param fieldName field name in parquet file
+   * @param fieldNames field names in parquet file
 
 Review comment:
   I think it still indicates a single field name, though it could contain 
multiple identifiers.
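
To make the "single field name, multiple identifiers" point concrete, here is a minimal sketch of joining name parts with a dot and backquoting any part that itself contains a dot, as the comments in the ParquetFilters diff describe. `QuotedNameSketch`, `quotePart` and `quoted` are hypothetical names for illustration; doubling an embedded backquote is an assumption, not something this thread confirms.

```scala
object QuotedNameSketch {
  // Backquote a part only when a bare rendering would be ambiguous.
  // Assumption: an embedded backquote is escaped by doubling it.
  def quotePart(part: String): String =
    if (part.contains(".") || part.contains("`")) "`" + part.replace("`", "``") + "`"
    else part

  // One logical field name, rendered from its identifier parts.
  def quoted(parts: Seq[String]): String = parts.map(quotePart).mkString(".")

  def main(args: Array[String]): Unit = {
    println(quoted(Seq("person", "age")))       // nested column
    println(quoted(Seq("person", "first.name"))) // part containing a dot
  }
}
```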





[GitHub] [spark] AmplabJenkins removed a comment on issue #26918: [SPARK-30279][SQL] Support 32 or more grouping attributes for GROUPING_ID

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #26918: [SPARK-30279][SQL] Support 32 
or more grouping attributes for GROUPING_ID 
URL: https://github.com/apache/spark/pull/26918#issuecomment-594145847
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119234/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #26918: [SPARK-30279][SQL] Support 32 or more grouping attributes for GROUPING_ID

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #26918: [SPARK-30279][SQL] Support 32 
or more grouping attributes for GROUPING_ID 
URL: https://github.com/apache/spark/pull/26918#issuecomment-594145833
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26918: [SPARK-30279][SQL] Support 32 or more grouping attributes for GROUPING_ID

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #26918: [SPARK-30279][SQL] Support 32 or more 
grouping attributes for GROUPING_ID 
URL: https://github.com/apache/spark/pull/26918#issuecomment-594145847
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119234/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #26918: [SPARK-30279][SQL] Support 32 or more grouping attributes for GROUPING_ID

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #26918: [SPARK-30279][SQL] Support 32 or more 
grouping attributes for GROUPING_ID 
URL: https://github.com/apache/spark/pull/26918#issuecomment-594145833
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #26918: [SPARK-30279][SQL] Support 32 or more grouping attributes for GROUPING_ID

2020-03-03 Thread GitBox
SparkQA removed a comment on issue #26918: [SPARK-30279][SQL] Support 32 or 
more grouping attributes for GROUPING_ID 
URL: https://github.com/apache/spark/pull/26918#issuecomment-593991498
 
 
   **[Test build #119234 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119234/testReport)**
 for PR 26918 at commit 
[`6476b62`](https://github.com/apache/spark/commit/6476b62667b2a38cabf44c2fad447c4bab9005d5).





[GitHub] [spark] SparkQA commented on issue #26918: [SPARK-30279][SQL] Support 32 or more grouping attributes for GROUPING_ID

2020-03-03 Thread GitBox
SparkQA commented on issue #26918: [SPARK-30279][SQL] Support 32 or more 
grouping attributes for GROUPING_ID 
URL: https://github.com/apache/spark/pull/26918#issuecomment-594144781
 
 
   **[Test build #119234 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119234/testReport)**
 for PR 26918 at commit 
[`6476b62`](https://github.com/apache/spark/commit/6476b62667b2a38cabf44c2fad447c4bab9005d5).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27777: [SPARK-31025][SQL] Support foldable CSV strings by `schema_of_csv`

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #2: [SPARK-31025][SQL] Support 
foldable CSV strings by `schema_of_csv`
URL: https://github.com/apache/spark/pull/2#issuecomment-594138770
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA commented on issue #27777: [SPARK-31025][SQL] Support foldable CSV strings by `schema_of_csv`

2020-03-03 Thread GitBox
SparkQA commented on issue #2: [SPARK-31025][SQL] Support foldable CSV 
strings by `schema_of_csv`
URL: https://github.com/apache/spark/pull/2#issuecomment-594141546
 
 
   **[Test build #119250 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119250/testReport)**
 for PR 2 at commit 
[`d4da235`](https://github.com/apache/spark/commit/d4da2352f90e683c02e12b4f9e161284f0146734).





[GitHub] [spark] AmplabJenkins removed a comment on issue #27777: [SPARK-31025][SQL] Support foldable CSV strings by `schema_of_csv`

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #2: [SPARK-31025][SQL] Support 
foldable CSV strings by `schema_of_csv`
URL: https://github.com/apache/spark/pull/2#issuecomment-594138782
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23990/
   Test PASSed.





[GitHub] [spark] dongjoon-hyun commented on a change in pull request #27749: [SPARK-30997][SQL] An analysis failure in generators with aggregate functions

2020-03-03 Thread GitBox
dongjoon-hyun commented on a change in pull request #27749: [SPARK-30997][SQL] 
An analysis failure in generators with aggregate functions
URL: https://github.com/apache/spark/pull/27749#discussion_r387260348
 
 

 ##
 File path: 
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisErrorSuite.scala
 ##
 @@ -433,6 +433,13 @@ class AnalysisErrorSuite extends AnalysisTest {
   :: Nil
   )
 
+  errorTest(
+"generator nested in expressions for aggregates",
+testRelation.select(Explode(CreateArray(min($"a") :: max($"a") :: Nil)) + 
1),
+"Generators are not supported when it's nested in expressions, but got: " +
+  "(explode(array(min(a), max(a))) + 1)" :: Nil
+  )
+
 
 Review comment:
   Thanks for adding. It looks fine.





[GitHub] [spark] AmplabJenkins commented on issue #27777: [SPARK-31025][SQL] Support foldable CSV strings by `schema_of_csv`

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #2: [SPARK-31025][SQL] Support foldable 
CSV strings by `schema_of_csv`
URL: https://github.com/apache/spark/pull/2#issuecomment-594138770
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27777: [SPARK-31025][SQL] Support foldable CSV strings by `schema_of_csv`

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #2: [SPARK-31025][SQL] Support foldable 
CSV strings by `schema_of_csv`
URL: https://github.com/apache/spark/pull/2#issuecomment-594138782
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/23990/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27679: [SPARK-30776][ML] Support FValueSelector for continuous features and continuous labels

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27679: [SPARK-30776][ML] Support 
FValueSelector for continuous features and continuous labels
URL: https://github.com/apache/spark/pull/27679#issuecomment-594138522
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27679: [SPARK-30776][ML] Support FValueSelector for continuous features and continuous labels

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27679: [SPARK-30776][ML] Support 
FValueSelector for continuous features and continuous labels
URL: https://github.com/apache/spark/pull/27679#issuecomment-594138522
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27679: [SPARK-30776][ML] Support FValueSelector for continuous features and continuous labels

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27679: [SPARK-30776][ML] Support 
FValueSelector for continuous features and continuous labels
URL: https://github.com/apache/spark/pull/27679#issuecomment-594138532
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119247/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27679: [SPARK-30776][ML] Support FValueSelector for continuous features and continuous labels

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27679: [SPARK-30776][ML] Support 
FValueSelector for continuous features and continuous labels
URL: https://github.com/apache/spark/pull/27679#issuecomment-594138532
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119247/
   Test PASSed.





[GitHub] [spark] MaxGekk opened a new pull request #27777: [SPARK-31025][SQL] Support foldable CSV strings by `schema_of_csv`

2020-03-03 Thread GitBox
MaxGekk opened a new pull request #27777: [SPARK-31025][SQL] Support foldable 
CSV strings by `schema_of_csv`
URL: https://github.com/apache/spark/pull/27777
 
 
   ### What changes were proposed in this pull request?
   In this PR, I propose to change how the input parameter of the `SchemaOfCsv` expression is checked, and to allow a foldable `child` expression.
   
   ### Why are the changes needed?
   To improve the user experience with Spark SQL. Currently, `schema_of_csv` accepts only string literals as CSV examples:
   ```sql
   spark-sql> select schema_of_csv(concat_ws(',', 0.1, 1));
   Error in query: cannot resolve 'schema_of_csv(concat_ws(',', CAST(0.1BD AS STRING), CAST(1 AS STRING)))' due to data type mismatch: The input csv should be a string literal and not null; however, got concat_ws(',', CAST(0.1BD AS STRING), CAST(1 AS STRING)).; line 1 pos 7;
   'Project [unresolvedalias(schema_of_csv(concat_ws(,, cast(0.1 as string), cast(1 as string))), None)]
   +- OneRowRelation
   ```
   
   ### Does this PR introduce any user-facing change?
   Yes. After the change, `schema_of_csv` accepts foldable expressions, for example:
   ```sql
   ```
   
   ### How was this patch tested?
   - By existing test suites `CsvFunctionsSuite` and `CsvExpressionsSuite`.
   - Added new test to `CsvFunctionsSuite` to check foldable input.
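
The difference between a literal-only check and a foldable check can be sketched with a toy expression tree. This is an illustration only and is not code from this pull request or from Spark: the classes `Literal`, `ColumnRef`, and `ConcatWs` and the helpers `accepts_literal_only`, `accepts_foldable`, and `fold` are hypothetical names, assuming that "foldable" means "can be evaluated to a constant without reading any rows".

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Literal:
    """A constant string value (hypothetical toy class)."""
    value: str

    @property
    def foldable(self) -> bool:
        return True


@dataclass(frozen=True)
class ColumnRef:
    """A reference to a column; its value depends on the input row."""
    name: str

    @property
    def foldable(self) -> bool:
        return False


@dataclass(frozen=True)
class ConcatWs:
    """Joins its children with a separator; foldable only if all children are."""
    sep: str
    children: Tuple["Expr", ...]

    @property
    def foldable(self) -> bool:
        return all(child.foldable for child in self.children)


Expr = Union[Literal, ColumnRef, ConcatWs]


def accepts_literal_only(child: Expr) -> bool:
    # Models the stricter check: only a bare literal is allowed.
    return isinstance(child, Literal)


def accepts_foldable(child: Expr) -> bool:
    # Models the relaxed check: any constant-foldable expression is allowed.
    return child.foldable


def fold(expr: Expr) -> str:
    """Evaluate a foldable expression to a constant string."""
    if isinstance(expr, Literal):
        return expr.value
    if isinstance(expr, ConcatWs):
        return expr.sep.join(fold(child) for child in expr.children)
    raise ValueError(f"expression is not foldable: {expr!r}")


# Toy counterpart of concat_ws(',', '0.1', '1') from the query above.
csv_example = ConcatWs(",", (Literal("0.1"), Literal("1")))
print(accepts_literal_only(csv_example))  # False: it is not a bare literal
print(accepts_foldable(csv_example))      # True: all children are constants
print(fold(csv_example))                  # 0.1,1
```

In this toy model the relaxed check still rejects anything that depends on a column, so the CSV example remains a constant that can be computed once during analysis.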





[GitHub] [spark] SparkQA removed a comment on issue #27679: [SPARK-30776][ML] Support FValueSelector for continuous features and continuous labels

2020-03-03 Thread GitBox
SparkQA removed a comment on issue #27679: [SPARK-30776][ML] Support 
FValueSelector for continuous features and continuous labels
URL: https://github.com/apache/spark/pull/27679#issuecomment-594105637
 
 
   **[Test build #119247 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119247/testReport)** for PR 27679 at commit [`4584465`](https://github.com/apache/spark/commit/4584465a7681b5199f9cc31c755e7e96ee36bb1d).





[GitHub] [spark] SparkQA commented on issue #27679: [SPARK-30776][ML] Support FValueSelector for continuous features and continuous labels

2020-03-03 Thread GitBox
SparkQA commented on issue #27679: [SPARK-30776][ML] Support FValueSelector for 
continuous features and continuous labels
URL: https://github.com/apache/spark/pull/27679#issuecomment-594138027
 
 
   **[Test build #119247 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119247/testReport)** for PR 27679 at commit [`4584465`](https://github.com/apache/spark/commit/4584465a7681b5199f9cc31c755e7e96ee36bb1d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] [spark] viirya commented on a change in pull request #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys

2020-03-03 Thread GitBox
viirya commented on a change in pull request #27772: [SPARK-31019][SQL] make it 
clear that people can deduplicate map keys
URL: https://github.com/apache/spark/pull/27772#discussion_r387252151
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
 ##
 @@ -2435,17 +2450,6 @@ object SQLConf {
 .booleanConf
 .createWithDefault(false)
 
-  val LEGACY_ALLOW_DUPLICATED_MAP_KEY =
 
 Review comment:
   We need to update the sql-migration-guide doc too; `spark.sql.legacy.allowDuplicatedMapKeys` is already documented there.





[GitHub] [spark] viirya commented on a change in pull request #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys

2020-03-03 Thread GitBox
viirya commented on a change in pull request #27772: [SPARK-31019][SQL] make it 
clear that people can deduplicate map keys
URL: https://github.com/apache/spark/pull/27772#discussion_r387252151
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
 ##
 @@ -2435,17 +2450,6 @@ object SQLConf {
 .booleanConf
 .createWithDefault(false)
 
-  val LEGACY_ALLOW_DUPLICATED_MAP_KEY =
 
 Review comment:
   We need to update the sql-migration-guide doc too.





[GitHub] [spark] dbtsai commented on a change in pull request #27728: [SPARK-17636][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
dbtsai commented on a change in pull request #27728: 
[SPARK-17636][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#discussion_r387248516
 
 

 ##
 File path: 
sql/core/v2.3/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFilters.scala
 ##
 @@ -64,9 +64,11 @@ private[sql] object OrcFilters extends OrcFiltersBase {
    * Create ORC filter as a SearchArgument instance.
    */
   def createFilter(schema: StructType, filters: Seq[Filter]): Option[SearchArgument] = {
-    val dataTypeMap = schema.map(f => f.name -> f.dataType).toMap
+    val dataTypeMap = schema.map(f => quoteAttributeNameIfNeeded(f.name) -> f.dataType).toMap
     // Combines all convertible filters using `And` to produce a single conjunction
-    val conjunctionOptional = buildTree(convertibleFilters(schema, dataTypeMap, filters))
+    // TODO: ORC doesn't support predicate pushdown for nested field yet, so they are removed.
 
 Review comment:
   Done. thanks.





[GitHub] [spark] dbtsai commented on a change in pull request #27728: [SPARK-17636][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
dbtsai commented on a change in pull request #27728: 
[SPARK-17636][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#discussion_r387245545
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetTest.scala
 ##
 @@ -63,13 +63,38 @@ private[sql] trait ParquetTest extends FileBasedDataSourceTest {
       (f: String => Unit): Unit = withDataSourceFile(data)(f)
 
   /**
-   * Writes `data` to a Parquet file and reads it back as a [[DataFrame]],
+   * Writes `data` objects to a Parquet file and reads it back as a [[DataFrame]],
    * which is then passed to `f`. The Parquet file will be deleted after `f` returns.
    */
-  protected def withParquetDataFrame[T <: Product: ClassTag: TypeTag]
+  protected def withParquetDFfromObjs[T <: Product: ClassTag: TypeTag]
       (data: Seq[T], testVectorized: Boolean = true)
       (f: DataFrame => Unit): Unit = withDataSourceDataFrame(data, testVectorized)(f)
 
+  /**
+   * Writes `df` dataframe to a Parquet file and reads it back as a [[DataFrame]],
+   * which is then passed to `f`. The Parquet file will be deleted after `f` returns.
+   */
+  protected def withParquetDFfromDF[T <: Product: ClassTag: TypeTag]
+      (df: DataFrame, testVectorized: Boolean = true)
+      (f: DataFrame => Unit): Unit = {
+    withTempPath { file =>
+      df.write.format(dataSourceName).save(file.getCanonicalPath)
+      readFile(file.getCanonicalPath, testVectorized)(f)
+    }
+  }
+
+  /**
+   * Writes `df` to a Parquet file and reads it back as a [[DataFrame]],
+   * which is then passed to `f`. The Parquet file will be deleted after `f` returns.
+   */
+  protected def toParquetDataFrame(df: DataFrame, testVectorized: Boolean = true)
 
 Review comment:
   They are used in a couple of places. I can submit another PR for this.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27772: [SPARK-31019][SQL] make it 
clear that people can deduplicate map keys
URL: https://github.com/apache/spark/pull/27772#issuecomment-594127264
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119232/
   Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27772: [SPARK-31019][SQL] make it clear that 
people can deduplicate map keys
URL: https://github.com/apache/spark/pull/27772#issuecomment-594127264
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/119232/
   Test PASSed.





[GitHub] [spark] AmplabJenkins removed a comment on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys

2020-03-03 Thread GitBox
AmplabJenkins removed a comment on issue #27772: [SPARK-31019][SQL] make it 
clear that people can deduplicate map keys
URL: https://github.com/apache/spark/pull/27772#issuecomment-594127250
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] AmplabJenkins commented on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys

2020-03-03 Thread GitBox
AmplabJenkins commented on issue #27772: [SPARK-31019][SQL] make it clear that 
people can deduplicate map keys
URL: https://github.com/apache/spark/pull/27772#issuecomment-594127250
 
 
   Merged build finished. Test PASSed.





[GitHub] [spark] SparkQA removed a comment on issue #27772: [SPARK-31019][SQL] make it clear that people can deduplicate map keys

2020-03-03 Thread GitBox
SparkQA removed a comment on issue #27772: [SPARK-31019][SQL] make it clear 
that people can deduplicate map keys
URL: https://github.com/apache/spark/pull/27772#issuecomment-593967450
 
 
   **[Test build #119232 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/119232/testReport)** for PR 27772 at commit [`80c7450`](https://github.com/apache/spark/commit/80c74509ab1a86cc001887060b34fd3c29ec5a81).





[GitHub] [spark] dbtsai commented on a change in pull request #27728: [SPARK-17636][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet

2020-03-03 Thread GitBox
dbtsai commented on a change in pull request #27728: 
[SPARK-17636][SQL][test-hive1.2] Nested Column Predicate Pushdown for Parquet
URL: https://github.com/apache/spark/pull/27728#discussion_r387244858
 
 

 ##
 File path: 
sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetTest.scala
 ##
 @@ -63,13 +63,38 @@ private[sql] trait ParquetTest extends FileBasedDataSourceTest {
       (f: String => Unit): Unit = withDataSourceFile(data)(f)
 
   /**
-   * Writes `data` to a Parquet file and reads it back as a [[DataFrame]],
+   * Writes `data` objects to a Parquet file and reads it back as a [[DataFrame]],
    * which is then passed to `f`. The Parquet file will be deleted after `f` returns.
    */
-  protected def withParquetDataFrame[T <: Product: ClassTag: TypeTag]
+  protected def withParquetDFfromObjs[T <: Product: ClassTag: TypeTag]
 
 Review comment:
   Sounds like a good idea.




