[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-12 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/21601
  
merged to master


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-10 Thread tgravescs
Github user tgravescs commented on the issue:

https://github.com/apache/spark/pull/21601
  
+1





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92773/
Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92773 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92773/testReport)**
 for PR 21601 at commit 
[`bcb2991`](https://github.com/apache/spark/commit/bcb2991b278cafb2f163bae0069293c61b939898).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/792/
Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/791/
Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test FAILed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92772 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92772/testReport)**
 for PR 21601 at commit 
[`f1c4160`](https://github.com/apache/spark/commit/f1c41608c22e3b11271838852370021b10d546ed).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92772/
Test FAILed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-09 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92772 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92772/testReport)**
 for PR 21601 at commit 
[`f1c4160`](https://github.com/apache/spark/commit/f1c41608c22e3b11271838852370021b10d546ed).





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92566/
Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92566 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92566/testReport)**
 for PR 21601 at commit 
[`15356df`](https://github.com/apache/spark/commit/15356df4f796a5e811a79431fb9f9bb122f03c8b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92566 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92566/testReport)**
 for PR 21601 at commit 
[`15356df`](https://github.com/apache/spark/commit/15356df4f796a5e811a79431fb9f9bb122f03c8b).





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/640/
Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92545/
Test FAILed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test FAILed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92545 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92545/testReport)**
 for PR 21601 at commit 
[`b351406`](https://github.com/apache/spark/commit/b3514067db43b543d8ceac38a0e1ffe6c1a5692e).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92545 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92545/testReport)**
 for PR 21601 at commit 
[`b351406`](https://github.com/apache/spark/commit/b3514067db43b543d8ceac38a0e1ffe6c1a5692e).





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/629/
Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-07-02 Thread dhruve
Github user dhruve commented on the issue:

https://github.com/apache/spark/pull/21601
  
@attilapiros I will modify the test to add a check/assert that makes it 
easy to follow and validate what the test is trying to achieve. As for the 
remaining cases, these are Hadoop configs that are not directly related to 
Spark, so I didn't add test cases for them; they concern 
`CombineFileInputFormat` rather than `WholeTextFileInputFormat`. 





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-23 Thread attilapiros
Github user attilapiros commented on the issue:

https://github.com/apache/spark/pull/21601
  
I read your changes and searched the test for a check/assert but found none. 
I understand the test verifies that no exception is thrown while reading the 
directory content, but I would still like to see some asserts and more cases 
covered, at least:
- min split size per node < maxSplitSize && min split size per rack < maxSplitSize
- min split size per node > maxSplitSize && min split size per rack < maxSplitSize
- min split size per node < maxSplitSize && min split size per rack > maxSplitSize

I see that it is hard to add checks/asserts, but what about testing 
WholeTextFileInputFormat directly?
In your test you could inherit from WholeTextFileInputFormat, override the 
protected setters for maxSplitSize, minSplitSizeNode and minSplitSizeRack, and 
store the values in your new test class so that asserts and checks can be added.
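
The recording-subclass idea above could look roughly like the sketch below. It is 
illustrative only: `SplitSizeFormat` is a hypothetical stand-in for 
WholeTextFileInputFormat so that the example compiles without Hadoop or Spark 
on the classpath, and the clamping logic inside `configure` is an assumption 
about what the fix does, not code taken from this PR. Only the three setter 
names follow the ones mentioned in the comment.

```java
// Hypothetical stand-in for the input format under test. The real class
// inherits its protected setters from Hadoop; here they are empty stubs.
class SplitSizeFormat {
    protected void setMaxSplitSize(long value) { }
    protected void setMinSplitSizeNode(long value) { }
    protected void setMinSplitSizeRack(long value) { }

    // Assumed behaviour: derive maxSplitSize from the total input length and
    // lower the per-node / per-rack minimums only when they exceed it.
    void configure(long totalLen, int minPartitions,
                   long minSizePerNode, long minSizePerRack) {
        int divisor = (minPartitions == 0) ? 1 : minPartitions;
        long maxSplitSize = (long) Math.ceil(totalLen * 1.0 / divisor);
        if (maxSplitSize < minSizePerNode) {
            setMinSplitSizeNode(maxSplitSize);
        }
        if (maxSplitSize < minSizePerRack) {
            setMinSplitSizeRack(maxSplitSize);
        }
        setMaxSplitSize(maxSplitSize);
    }
}

// Test-only subclass: overrides the protected setters and stores the values
// so that a test can assert on them. -1 means "setter was never called".
class RecordingFormat extends SplitSizeFormat {
    long maxSplitSize = -1;
    long minSplitSizeNode = -1;
    long minSplitSizeRack = -1;

    @Override
    protected void setMaxSplitSize(long value) { maxSplitSize = value; }

    @Override
    protected void setMinSplitSizeNode(long value) { minSplitSizeNode = value; }

    @Override
    protected void setMinSplitSizeRack(long value) { minSplitSizeRack = value; }
}
```

A test would create one `RecordingFormat` per case in the list above, call 
`configure` with per-node and per-rack minimums on either side of the computed 
maxSplitSize, and then assert on the three recorded fields.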





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-21 Thread dhruve
Github user dhruve commented on the issue:

https://github.com/apache/spark/pull/21601
  
@vanzin Can you review this PR?





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92144/
Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92144 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92144/testReport)**
 for PR 21601 at commit 
[`e2d4e07`](https://github.com/apache/spark/commit/e2d4e07984751a7fc08e53f98dbd604d47f2f035).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/361/
Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92144 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92144/testReport)**
 for PR 21601 at commit 
[`e2d4e07`](https://github.com/apache/spark/commit/e2d4e07984751a7fc08e53f98dbd604d47f2f035).





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test FAILed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92143/
Test FAILed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92143 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92143/testReport)**
 for PR 21601 at commit 
[`2369e3a`](https://github.com/apache/spark/commit/2369e3acee730b7d4e45175870de0ecac601069b).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/360/
Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-20 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21601
  
Merged build finished. Test PASSed.





[GitHub] spark issue #21601: [SPARK-24610] fix reading small files via wholeTextFiles

2018-06-20 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21601
  
**[Test build #92143 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92143/testReport)**
 for PR 21601 at commit 
[`2369e3a`](https://github.com/apache/spark/commit/2369e3acee730b7d4e45175870de0ecac601069b).

