[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/21601 merged to master --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
Github user tgravescs commented on the issue: https://github.com/apache/spark/pull/21601 +1
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92773/ Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Merged build finished. Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21601 **[Test build #92773 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92773/testReport)** for PR 21601 at commit [`bcb2991`](https://github.com/apache/spark/commit/bcb2991b278cafb2f163bae0069293c61b939898). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/792/ Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/791/ Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Merged build finished. Test FAILed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21601 **[Test build #92772 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92772/testReport)** for PR 21601 at commit [`f1c4160`](https://github.com/apache/spark/commit/f1c41608c22e3b11271838852370021b10d546ed). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92772/ Test FAILed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21601 **[Test build #92772 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92772/testReport)** for PR 21601 at commit [`f1c4160`](https://github.com/apache/spark/commit/f1c41608c22e3b11271838852370021b10d546ed).
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92566/ Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21601 **[Test build #92566 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92566/testReport)** for PR 21601 at commit [`15356df`](https://github.com/apache/spark/commit/15356df4f796a5e811a79431fb9f9bb122f03c8b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21601 **[Test build #92566 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92566/testReport)** for PR 21601 at commit [`15356df`](https://github.com/apache/spark/commit/15356df4f796a5e811a79431fb9f9bb122f03c8b).
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/640/ Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92545/ Test FAILed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Merged build finished. Test FAILed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21601 **[Test build #92545 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92545/testReport)** for PR 21601 at commit [`b351406`](https://github.com/apache/spark/commit/b3514067db43b543d8ceac38a0e1ffe6c1a5692e). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21601 **[Test build #92545 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92545/testReport)** for PR 21601 at commit [`b351406`](https://github.com/apache/spark/commit/b3514067db43b543d8ceac38a0e1ffe6c1a5692e).
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/629/ Test PASSed.
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/21601 @attilapiros I will modify the test to add a check/assert so that it is easy to follow and validates what the test is trying to achieve. As for the remaining cases: these are Hadoop configs and not directly related to Spark, so I didn't add further test cases; they concern `CombineFileInputFormat` rather than `WholeTextFileInputFormat`.
Github user attilapiros commented on the issue: https://github.com/apache/spark/pull/21601 I read your changes, and in the test I searched for a check/assert but found none. I understand the test checks that no exception is thrown while reading the directory content, but I still miss some asserts and coverage of more cases, at least:
- min split size per node < maxSplitSize && min split size per rack < maxSplitSize
- min split size per node > maxSplitSize && min split size per rack < maxSplitSize
- min split size per node < maxSplitSize && min split size per rack > maxSplitSize

As I see it, checks/asserts are hard to add, but what about testing WholeTextFileInputFormat directly? In your test you could inherit from WholeTextFileInputFormat, override the protected setters for maxSplitSize, minSplitSizeNode, and minSplitSizeRack, and store the values in your new test class so that asserts and checks can be added.
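The suggestion above (subclass the format, override the protected setters, record the values, then assert on them) might look roughly like the following. This is a minimal, dependency-free sketch rather than the PR's actual test: `BaseFormat`, `WholeTextLikeFormat`, and `RecordingFormat` are hypothetical stand-ins for the Hadoop and Spark classes, and the rule that the per-node and per-rack minimums are capped at maxSplitSize is an assumption inferred from the three cases listed, not something confirmed in this thread.

```java
public class SplitSizeSketch {

    /** Hypothetical stand-in for the Hadoop base class: only the three protected setters. */
    static class BaseFormat {
        protected void setMaxSplitSize(long value) { }
        protected void setMinSplitSizeNode(long value) { }
        protected void setMinSplitSizeRack(long value) { }
    }

    /** Hypothetical stand-in for WholeTextFileInputFormat. */
    static class WholeTextLikeFormat extends BaseFormat {
        void setMinPartitions(long totalLen, int minPartitions,
                              long configuredMinNode, long configuredMinRack) {
            long parts = (minPartitions == 0) ? 1 : minPartitions;
            long maxSplitSize = (totalLen + parts - 1) / parts; // ceiling division
            // Assumption: the minimums are capped so they never exceed maxSplitSize.
            setMinSplitSizeNode(Math.min(configuredMinNode, maxSplitSize));
            setMinSplitSizeRack(Math.min(configuredMinRack, maxSplitSize));
            setMaxSplitSize(maxSplitSize);
        }
    }

    /** Test subclass: records what each setter receives so a test can assert on it. */
    static class RecordingFormat extends WholeTextLikeFormat {
        long recordedMax = -1;
        long recordedNode = -1;
        long recordedRack = -1;

        @Override protected void setMaxSplitSize(long value) {
            recordedMax = value;
            super.setMaxSplitSize(value);
        }
        @Override protected void setMinSplitSizeNode(long value) {
            recordedNode = value;
            super.setMinSplitSizeNode(value);
        }
        @Override protected void setMinSplitSizeRack(long value) {
            recordedRack = value;
            super.setMinSplitSizeRack(value);
        }
    }

    /** True when neither recorded minimum exceeds the recorded maximum. */
    static boolean withinMax(RecordingFormat f) {
        return f.recordedNode <= f.recordedMax && f.recordedRack <= f.recordedMax;
    }

    public static void main(String[] args) {
        // {configured min per node, configured min per rack}; maxSplitSize is 100 here.
        long[][] cases = { {10, 20}, {500, 20}, {10, 500} };
        for (long[] c : cases) {
            RecordingFormat f = new RecordingFormat();
            f.setMinPartitions(100L, 1, c[0], c[1]);
            System.out.println("node=" + f.recordedNode + " rack=" + f.recordedRack
                    + " max=" + f.recordedMax + " ok=" + withinMax(f));
        }
    }
}
```

Recording the values in the subclass lets each of the three cases be asserted directly, instead of relying only on the absence of an exception while reading the directory.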
Github user dhruve commented on the issue: https://github.com/apache/spark/pull/21601 @vanzin Can you review this PR?
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92144/ Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Merged build finished. Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21601 **[Test build #92144 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92144/testReport)** for PR 21601 at commit [`e2d4e07`](https://github.com/apache/spark/commit/e2d4e07984751a7fc08e53f98dbd604d47f2f035). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Merged build finished. Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/361/ Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21601 **[Test build #92144 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92144/testReport)** for PR 21601 at commit [`e2d4e07`](https://github.com/apache/spark/commit/e2d4e07984751a7fc08e53f98dbd604d47f2f035).
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Merged build finished. Test FAILed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92143/ Test FAILed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21601 **[Test build #92143 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92143/testReport)** for PR 21601 at commit [`2369e3a`](https://github.com/apache/spark/commit/2369e3acee730b7d4e45175870de0ecac601069b). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/360/ Test PASSed.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21601 Merged build finished. Test PASSed.
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21601 **[Test build #92143 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92143/testReport)** for PR 21601 at commit [`2369e3a`](https://github.com/apache/spark/commit/2369e3acee730b7d4e45175870de0ecac601069b).