[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-10-29 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/21320 > https://issues.apache.org/jira/browse/SPARK-25879 > > If we select a nested field and a top level field, the schema pruning will fail. Here is the reproducible test, > ... Hi

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-08 Thread jainaks
Github user jainaks commented on the issue: https://github.com/apache/spark/pull/21320 Hi @mallman, Thanks for this PR. It has a huge impact on performance when querying a nested Parquet schema. I used the original PR #16578 and found an issue: it does not work well when the

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/21320 Hi @jainaks. I can see why your query would not work. In the example you provide, if you refer to the column as `name.First`, does your query succeed?

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional comma

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/28/

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3918/ Tes

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #91678 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91678/testReport)** for PR 21320 at commit [`8ead76e`](https://github.com/apache/spark/commit/8e

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91678/ Test FAILed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #91678 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91678/testReport)** for PR 21320 at commit [`8ead76e`](https://github.com/apache/spark/commit/8

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test FAILed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3920/ Tes

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #91679 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91679/testReport)** for PR 21320 at commit [`89febc8`](https://github.com/apache/spark/commit/89

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/30/

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #91679 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91679/testReport)** for PR 21320 at commit [`89febc8`](https://github.com/apache/spark/commit/8

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test FAILed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91679/ Test FAILed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/33/

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/3923/ Tes

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #91684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91684/testReport)** for PR 21320 at commit [`7f67ec0`](https://github.com/apache/spark/commit/7f

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test FAILed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #91684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91684/testReport)** for PR 21320 at commit [`7f67ec0`](https://github.com/apache/spark/commit/7

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91684/ Test FAILed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-11 Thread jainaks
Github user jainaks commented on the issue: https://github.com/apache/spark/pull/21320 @mallman It does work fine with "name.First".

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-12 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21320 @mallman Sorry for the delay. Super busy during the Spark summit. Will continue the code review in the next few days.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-12 Thread jainaks
Github user jainaks commented on the issue: https://github.com/apache/spark/pull/21320 Hi @mallman , I found another major issue after having this fix. Schema: a: struct (nullable = true) ||-- b: struct (nullable = true) |||-- c1: string (nullable = tru

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-12 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/21320 Hi @jainaks. Thanks for your report. Do you have the same problem running your test with this PR?

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-12 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/21320 @gatorsmile The last couple of build test failures appear to be entirely unrelated to this PR. The error message in the one failed test reads `org.scalatest.exceptions.TestFailedException: Unable to

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-12 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/21320 > @mallman It does work fine with "name.First". @jainaks What is the value of the Spark SQL configuration setting `spark.sql.caseSensitive` when you run this query? Also, are you querying th

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-12 Thread jainaks
Github user jainaks commented on the issue: https://github.com/apache/spark/pull/21320 > @jainaks What is the value of the Spark SQL configuration setting spark.sql.caseSensitive when you run this query? Also, are you querying the parquet file as part of a Hive metastore table or from

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-18 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/21320 > @mallman Yes, the issue with window functions is reproducible even with this PR. Can you attach a (small) parquet file I can use to test this scenario?

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/310/

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-18 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #92070 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92070/testReport)** for PR 21320 at commit [`a255bcb`](https://github.com/apache/spark/commit/a2

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4204/ Tes

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92070/ Test FAILed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-19 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #92070 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92070/testReport)** for PR 21320 at commit [`a255bcb`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test FAILed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-19 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/21320 @gatorsmile The last build was killed by SIGKILL. Can you start a new build, please?

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-21 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/21320 Retest this please.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/392/

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #92191 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92191/testReport)** for PR 21320 at commit [`a255bcb`](https://github.com/apache/spark/commit/a2

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #92191 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92191/testReport)** for PR 21320 at commit [`a255bcb`](https://github.com/apache/spark/commit/a

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92191/ Test PASSed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/444/

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #92272 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92272/testReport)** for PR 21320 at commit [`cb858f2`](https://github.com/apache/spark/commit/cb

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-24 Thread mallman
Github user mallman commented on the issue: https://github.com/apache/spark/pull/21320 @gatorsmile I've removed the changes to the files as you requested. This removes support for schema pruning on filters of queries. I've pushed the previous revision to a new branch in our `spark-pub

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21320 **[Test build #92272 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92272/testReport)** for PR 21320 at commit [`cb858f2`](https://github.com/apache/spark/commit/c

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Merged build finished. Test PASSed.

[GitHub] spark issue #21320: [SPARK-4502][SQL] Parquet nested column pruning - founda...

2018-06-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21320 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92272/ Test PASSed.
