[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-17 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20511 Thank you, @gatorsmile ! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands,

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-17 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20511 LGTM Thanks! Merged to master.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20511 Sure, @gatorsmile. I'll create a new PR for the Apache Spark 2.4 default configuration and a migration guide for 2.3 to 2.4 after this PR is merged.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20511 I agree with what @omalley said. The new reader based on ORC 1.4 is better than the old reader. That is why we chose the new reader as the default at the beginning. We also saw the performance

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-16 Thread omalley
Github user omalley commented on the issue: https://github.com/apache/spark/pull/20511 I'm frustrated with the direction this has gone. The new reader is much better than the old reader, which uses Hive 1.2. ORC 1.4.3 had a pair of important, but not large or complex, fixes.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20511 Thank you. I created [SPARK-23452 Extend test coverage to all ORC readers](https://issues.apache.org/jira/browse/SPARK-23452) for that. Maybe we can add more comments during that JIRA task.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-16 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20511 @dongjoon-hyun That also sounds good to me. More contributions that help improve the test coverage are always welcome. We are doing our best to improve the test coverage and code readability.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-16 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20511 @gatorsmile, in general, I agree with your idea of having complete test coverage. I also think we can double the test coverage by enabling and disabling the vectorization for the
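As an illustration of the "enable and disable vectorization" idea in the comment above, here is a minimal sketch that expands one base configuration into two variants. It assumes the Spark SQL keys `spark.sql.orc.enableVectorizedReader` and `spark.sql.orc.impl`; the helper name `vectorization_variants` is hypothetical and is not taken from the PR or from Spark's test suites.

```python
# Assumed Spark SQL key for toggling the vectorized ORC reader.
ORC_VECTORIZED_KEY = "spark.sql.orc.enableVectorizedReader"


def vectorization_variants(base_conf):
    """Return two copies of base_conf: vectorization on, then off.

    Hypothetical helper for illustration; a real suite would apply each
    dict to a SparkSession before running the same ORC tests.
    """
    variants = []
    for enabled in ("true", "false"):
        conf = dict(base_conf)  # copy so the caller's dict is untouched
        conf[ORC_VECTORIZED_KEY] = enabled
        variants.append(conf)
    return variants


# Example: run the same tests twice against the native reader.
base = {"spark.sql.orc.impl": "native"}
for conf in vectorization_variants(base):
    print(conf)
```

Each existing test case would then run once per returned configuration, which is one way the coverage could be doubled without writing new test bodies.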

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20511 Thank you, All. Now, it's ready for review again for Apache Spark 2.4.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87483/

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Merged build finished. Test PASSed.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20511 **[Test build #87483 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87483/testReport)** for PR 20511 at commit

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20511 **[Test build #87483 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87483/testReport)** for PR 20511 at commit

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Merged build finished. Test PASSed.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/920/

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20511 Sure, I rebased this.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-15 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20511 @dongjoon-hyun Could you rebase this PR? We want to merge it to master. Thanks!

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20511 Uh, I missed the previous comment.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20511 Based on our standard, we do not change the dependencies in the minor releases.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-14 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20511 I see. Thank you for the confirmation, @marmbrus!

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-14 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/20511 Unfortunately, dependency changes are not typically allowed in patch releases.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-14 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20511 @marmbrus, @sameeragarwal, @gatorsmile, @cloud-fan, @rxin: can we have this patch for Apache Spark 2.3.1?

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-14 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20511 +1.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-14 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20511 Based on what @marmbrus suggested above, given how late it is in the cycle, we have to disable the native ORC reader (and also the filter pushdown) in 2.3.0 to avoid delaying the release, but
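For readers who want to see what "disable the native ORC reader (and also the filter pushdown)" can look like in configuration terms, here is a minimal sketch. It assumes the Spark SQL keys `spark.sql.orc.impl` (with value `hive` selecting the older reader) and `spark.sql.orc.filterPushdown`; the helper names are hypothetical, and the actual change made for 2.3.0 is not shown in this thread excerpt.

```python
def legacy_orc_conf():
    """Settings that select the Hive-based ORC reader and turn off pushdown.

    The key names are assumed Spark SQL configuration keys; the function
    itself is a hypothetical helper for illustration.
    """
    return {
        "spark.sql.orc.impl": "hive",
        "spark.sql.orc.filterPushdown": "false",
    }


def to_submit_args(conf):
    """Render a config dict as `--conf key=value` arguments for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


print(" ".join(to_submit_args(legacy_orc_conf())))
```

The same key/value pairs could equally be placed in `spark-defaults.conf` or set on a session; the command-line form is used here only because it is easy to inspect.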

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Merged build finished. Test PASSed.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87418/

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20511 **[Test build #87418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87418/testReport)** for PR 20511 at commit

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Merged build finished. Test PASSed.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87412/

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20511 **[Test build #87412 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87412/testReport)** for PR 20511 at commit

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87408/

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Merged build finished. Test PASSed.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20511 **[Test build #87408 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87408/testReport)** for PR 20511 at commit

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87409/

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Merged build finished. Test PASSed.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20511 **[Test build #87409 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87409/testReport)** for PR 20511 at commit

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread marmbrus
Github user marmbrus commented on the issue: https://github.com/apache/spark/pull/20511 Sorry if I'm missing some context here, but our typical process this late in the release (we are over a month since the branch was cut) would be to disable any new features that still have

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/871/

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Merged build finished. Test PASSed.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20511 Rebased to the master in order to update the migration doc, too. BTW, the previous failure is irrelevant: `org.apache.spark.sql.kafka010.KafkaContinuousSourceTopicDeletionSuite.subscribing

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20511 **[Test build #87418 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87418/testReport)** for PR 20511 at commit

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Merged build finished. Test FAILed.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87398/

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20511 **[Test build #87398 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87398/testReport)** for PR 20511 at commit

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20511 **[Test build #87412 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87412/testReport)** for PR 20511 at commit

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Merged build finished. Test PASSed.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/866/

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20511 **[Test build #87409 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87409/testReport)** for PR 20511 at commit

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/865/

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Merged build finished. Test PASSed.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20511 **[Test build #87408 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87408/testReport)** for PR 20511 at commit

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/864/

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20511 Merged build finished. Test PASSed.

[GitHub] spark issue #20511: [SPARK-23340][SQL] Upgrade Apache ORC to 1.4.3

2018-02-13 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20511 Title of PR/JIRA is updated, too.