[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-09-24 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19868 Can we also update the title? ``` Avoid iterating all partition paths when spark.sql.hive.verifyPartitionPath=true ``` This is not true, we didn't fix the problem of

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-09-22 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 Sure, updated. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-09-21 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19868 @jinxing64 Let's not talk too much about the problem of `spark.sql.hive.verifyPartitionPath`. We should just introduce `spark.sql.files.ignoreMissingFiles` and say this can replace

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-09-21 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 @cloud-fan Thanks for ping~ I updated the description. Let me know if I should refine it.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-09-20 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19868 Basically we need to introduce this new `spark.sql.files.ignoreMissingFiles` config in detail, and then explain how we can use it to replace `spark.sql.hive.verifyPartitionPath`.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-09-20 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 Sure, let me do it today.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-09-19 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19868 We switched to a totally different approach in the middle and forgot to update the PR description... @jinxing64 can you update it? thanks!

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-09-18 Thread rxin
Github user rxin commented on the issue: https://github.com/apache/spark/pull/19868 can somebody explain to me what the pr description has to do with missingFiles? I'm probably missing something but i feel the implementation is very different from the pr description.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 @cloud-fan Thanks a lot for merging. I will address the remaining comments today.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19868 thanks, merging to master!

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19868 nvm, the new test follows the existing code style in this file. Feel free to address them in followup PR.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89438/ Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #89438 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89438/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2379/

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #89438 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89438/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19868 retest this please

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test FAILed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89424/ Test FAILed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-17 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #89424 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89424/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2366/

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #89424 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89424/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89397/ Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #89397 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89397/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-16 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2345/

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-16 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #89397 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89397/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-16 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 @cloud-fan Thanks again for the review; I updated the PR according to your comments. Please take another look.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89372/ Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #89372 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89372/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2331/

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #89372 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89372/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19868 retest this please

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89371/ Test FAILed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #89370 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89370/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test FAILed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/89370/ Test FAILed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test FAILed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #89371 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89371/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 @cloud-fan @jiangxb1987 I updated the PR and added a config `spark.files.ignoreMissingFiles`. It works for HadoopRDD and NewHadoopRDD in two cases: 1. "file not found" when `getPartitions` 2.
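
As a rough illustration of what an "ignore missing files" switch means in general, the sketch below skips inputs that have disappeared instead of failing the whole read. It is plain Python, not Spark code: `read_lines`, `demo`, and the `ignore_missing_files` parameter are hypothetical names chosen for this example, and nothing here reflects how HadoopRDD or NewHadoopRDD implement the config.

```python
import os
import tempfile
from typing import Iterable, Iterator, List


def read_lines(paths: Iterable[str], ignore_missing_files: bool = False) -> Iterator[str]:
    """Yield lines from each path in order.

    Hypothetical helper: when ignore_missing_files is True, a path that
    does not exist is skipped; otherwise FileNotFoundError propagates.
    """
    for path in paths:
        try:
            with open(path, encoding="utf-8") as handle:
                for line in handle:
                    yield line.rstrip("\n")
        except FileNotFoundError:
            if not ignore_missing_files:
                raise
            # The missing file simply contributes no rows.
            continue


def demo() -> List[str]:
    """Read one existing file and one path that was never created."""
    with tempfile.TemporaryDirectory() as root:
        present = os.path.join(root, "part-0")
        missing = os.path.join(root, "part-1")  # intentionally not created
        with open(present, "w", encoding="utf-8") as handle:
            handle.write("a\nb\n")
        return list(read_lines([present, missing], ignore_missing_files=True))
```

With the flag off, the same call raises `FileNotFoundError` on the missing path, which is the fail-fast behavior a user would opt out of by enabling such a setting.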

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2330/

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #89371 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89371/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-15 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #89370 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/89370/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2018-04-08 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 cc @cloud-fan @jerryshao @jiangxb1987 would you take a look at this?

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84408/ Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test PASSed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #84408 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84408/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #84408 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84408/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/19868 Jenkins, retest this please.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84399/ Test FAILed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19868 Merged build finished. Test FAILed.

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #84399 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84399/testReport)** for PR 19868 at commit

[GitHub] spark issue #19868: [SPARK-22676] Avoid iterating all partition paths when s...

2017-12-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19868 **[Test build #84399 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84399/testReport)** for PR 19868 at commit