[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-26 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/20372
  
Please see https://spark.apache.org/contributing.html.
Could you open a JIRA and update this PR?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-26 Thread vgankidi
Github user vgankidi commented on the issue:

https://github.com/apache/spark/pull/20372
  
I agree with @ash211. Applications shouldn't rely on the order of the files 
within a partition. 
This optimization looks good to me. 


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-24 Thread ash211
Github user ash211 commented on the issue:

https://github.com/apache/spark/pull/20372
  
Tagging folks who have touched this code recently: @vgankidi @ericl @davies 

This seems to provide a more compact packing in every scenario, which 
should improve execution times.  One risk is that individual partitions are no 
longer always contiguous ranges of files in order, but rather sometimes they 
have a gap.  In the test this is the `(file1, file6)` partition.  If something 
depends on this past behavior it could now break, though I don't think anything 
should be requiring this partition ordering.
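
To make the gap concrete, here is a minimal sketch of how a tail-filling packer can produce a `(file1, file6)` partition. It is not the code from this PR: `FileSplit`, `packContiguous`, `packWithTailFill`, `maxBytes`, and the file sizes are all hypothetical, and per-file open cost is ignored for simplicity.

```scala
// Illustrative sketch only: all names and sizes here are hypothetical,
// and this is not the code from the PR.
final case class FileSplit(name: String, length: Long)

object PackingSketch {
  type Partition = Vector[FileSplit]

  /** Next-fit over size-descending splits: close the partition as soon as a split does not fit. */
  def packContiguous(splits: Seq[FileSplit], maxBytes: Long): Vector[Partition] = {
    val partitions = Vector.newBuilder[Partition]
    var current: Partition = Vector.empty
    var size = 0L
    for (s <- splits.sortBy(-_.length)) {
      if (current.nonEmpty && size + s.length > maxBytes) {
        partitions += current
        current = Vector.empty
        size = 0L
      }
      current = current :+ s
      size += s.length
    }
    if (current.nonEmpty) partitions += current
    partitions.result()
  }

  /** Variant: before closing a partition, top it up with the smallest remaining splits. */
  def packWithTailFill(splits: Seq[FileSplit], maxBytes: Long): Vector[Partition] = {
    val sorted = splits.sortBy(-_.length).toVector
    val partitions = Vector.newBuilder[Partition]
    var lo = 0                 // index of the largest unassigned split
    var hi = sorted.length - 1 // index of the smallest unassigned split
    while (lo <= hi) {
      // Start with the largest remaining split, even if it alone exceeds maxBytes.
      var current: Partition = Vector(sorted(lo))
      var size = sorted(lo).length
      lo += 1
      // Keep taking large splits while they fit.
      while (lo <= hi && size + sorted(lo).length <= maxBytes) {
        current = current :+ sorted(lo)
        size += sorted(lo).length
        lo += 1
      }
      // Fill the leftover space from the small end of the list.
      while (lo <= hi && size + sorted(hi).length <= maxBytes) {
        current = current :+ sorted(hi)
        size += sorted(hi).length
        hi -= 1
      }
      partitions += current
    }
    partitions.result()
  }

  /** Helper for printing and comparing: keep only the file names. */
  def names(parts: Seq[Seq[FileSplit]]): Seq[Seq[String]] = parts.map(_.map(_.name))

  // Hypothetical sizes, chosen so that the two strategies differ.
  val example: Seq[FileSplit] = Seq(
    FileSplit("file1", 8), FileSplit("file2", 7), FileSplit("file3", 6),
    FileSplit("file4", 4), FileSplit("file5", 3), FileSplit("file6", 2))

  def main(args: Array[String]): Unit = {
    // 4 partitions: (file1) (file2) (file3, file4) (file5, file6)
    println(names(packContiguous(example, 10)))
    // 3 partitions: (file1, file6) (file2, file5) (file3, file4)
    println(names(packWithTailFill(example, 10)))
  }
}
```

With these made-up sizes and a cap of 10, the tail-filling variant needs three partitions instead of four, at the cost of pairing the largest file with the smallest one. Code that assumes each partition is a contiguous run of the sorted file list would see the difference.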


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20372
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20372
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86591/
Test PASSed.


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20372
  
**[Test build #86591 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86591/testReport)** for PR 20372 at commit [`57722cf`](https://github.com/apache/spark/commit/57722cfaa035dc63da21c6bd442d995b8a0bcf0a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-24 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20372
  
**[Test build #86591 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86591/testReport)** for PR 20372 at commit [`57722cf`](https://github.com/apache/spark/commit/57722cfaa035dc63da21c6bd442d995b8a0bcf0a).


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-24 Thread glentakahashi
Github user glentakahashi commented on the issue:

https://github.com/apache/spark/pull/20372
  
The large non-splittable files case is actually already tested by 
https://github.com/glentakahashi/spark/blob/c575977a5952bf50b605be8079c9be1e30f3bd36/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala#L346


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-23 Thread ash211
Github user ash211 commented on the issue:

https://github.com/apache/spark/pull/20372
  
Please fix the Scala style checks:

```
Running Scala style checks

Scalastyle checks failed at following occurrences:
[error] /home/jenkins/workspace/SparkPullRequestBuilder/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala:459: File line length exceeds 100 characters
[error] /home/jenkins/workspace/SparkPullRequestBuilder/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala:463: File line length exceeds 100 characters
[error] Total time: 14 s, completed Jan 23, 2018 10:44:36 PM
[error] running /home/jenkins/workspace/SparkPullRequestBuilder/dev/lint-scala ; received return code 1
```

Then verify locally with `./dev/lint-scala`.


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20372
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20372
  
**[Test build #86567 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86567/testReport)** for PR 20372 at commit [`ef04de9`](https://github.com/apache/spark/commit/ef04de9766584b0a8ab13f290c9850e44570).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20372
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86567/
Test FAILed.


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-23 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20372
  
**[Test build #86567 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86567/testReport)** for PR 20372 at commit [`ef04de9`](https://github.com/apache/spark/commit/ef04de9766584b0a8ab13f290c9850e44570).


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-23 Thread ash211
Github user ash211 commented on the issue:

https://github.com/apache/spark/pull/20372
  
Jenkins, this is ok to test


---




[GitHub] spark issue #20372: Improved block merging logic for partitions

2018-01-23 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20372
  
Can one of the admins verify this patch?


---



