[GitHub] spark issue #20372: Improved block merging logic for partitions
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20372 Can one of the admins verify this patch?
Github user ash211 commented on the issue: https://github.com/apache/spark/pull/20372 Jenkins, this is ok to test
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20372 **[Test build #86567 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86567/testReport)** for PR 20372 at commit [`ef04de9`](https://github.com/apache/spark/commit/ef04de9766584b0a8ab13f290c9850e44570).
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20372 **[Test build #86567 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86567/testReport)** for PR 20372 at commit [`ef04de9`](https://github.com/apache/spark/commit/ef04de9766584b0a8ab13f290c9850e44570). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20372 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86567/
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20372 Merged build finished. Test FAILed.
Github user ash211 commented on the issue: https://github.com/apache/spark/pull/20372 Please fix the Scala style checks:

```
Running Scala style checks
Scalastyle checks failed at following occurrences:
[error] /home/jenkins/workspace/SparkPullRequestBuilder/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala:459: File line length exceeds 100 characters
[error] /home/jenkins/workspace/SparkPullRequestBuilder/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala:463: File line length exceeds 100 characters
[error] Total time: 14 s, completed Jan 23, 2018 10:44:36 PM
[error] running /home/jenkins/workspace/SparkPullRequestBuilder/dev/lint-scala ; received return code 1
```

and verify locally with `./dev/lint-scala`.
Github user glentakahashi commented on the issue: https://github.com/apache/spark/pull/20372 The case of large non-splittable files is actually already tested by https://github.com/glentakahashi/spark/blob/c575977a5952bf50b605be8079c9be1e30f3bd36/sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategySuite.scala#L346
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20372 **[Test build #86591 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86591/testReport)** for PR 20372 at commit [`57722cf`](https://github.com/apache/spark/commit/57722cfaa035dc63da21c6bd442d995b8a0bcf0a).
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20372 **[Test build #86591 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86591/testReport)** for PR 20372 at commit [`57722cf`](https://github.com/apache/spark/commit/57722cfaa035dc63da21c6bd442d995b8a0bcf0a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20372 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86591/
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20372 Merged build finished. Test PASSed.
Github user ash211 commented on the issue: https://github.com/apache/spark/pull/20372 Tagging folks who have touched this code recently: @vgankidi @ericl @davies

This seems to provide a more compact packing in every scenario, which should improve execution times. One risk is that individual partitions are no longer always contiguous ranges of files in order; some of them now have a gap. In the test this is the `(file1, file6)` partition. Anything that depends on the previous behavior could now break, though I don't think anything should require this partition ordering.
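To make the gap concrete, here is a minimal Python sketch of how a non-contiguous partition such as `(file1, file6)` can arise. It is not the code from this PR or from `DataSourceScanExec.scala`: the function names `next_fit` and `two_ended`, the file sizes, and the 10-unit limit are all hypothetical, and the sketch ignores the per-file open cost that Spark adds when sizing partitions. It only contrasts an in-order packing with one that tops up each partition using the smallest remaining files.

```python
from collections import deque


def next_fit(sizes, max_bytes):
    """Pack files in order; every partition is a contiguous run of indices."""
    partitions, current, used = [], [], 0
    for idx, size in enumerate(sizes):
        # Close the current partition when the next file would overflow it.
        if current and used + size > max_bytes:
            partitions.append(current)
            current, used = [], 0
        current.append(idx)
        used += size
    if current:
        partitions.append(current)
    return partitions


def two_ended(sizes, max_bytes):
    """Start each partition with the largest remaining file, then top it up."""
    remaining = deque(range(len(sizes)))  # indices, largest file first
    partitions = []
    while remaining:
        first = remaining.popleft()
        current, used = [first], sizes[first]
        # Add further large files from the front while they fit.
        while remaining and used + sizes[remaining[0]] <= max_bytes:
            idx = remaining.popleft()
            current.append(idx)
            used += sizes[idx]
        # Fill the leftover space with the smallest files from the back.
        while remaining and used + sizes[remaining[-1]] <= max_bytes:
            idx = remaining.pop()
            current.append(idx)
            used += sizes[idx]
        partitions.append(current)
    return partitions


# Hypothetical sizes for file1..file6, sorted largest first (indices 0..5).
sizes = [7, 6, 5, 5, 4, 3]
print(next_fit(sizes, 10))   # [[0], [1], [2, 3], [4, 5]]
print(two_ended(sizes, 10))  # [[0, 5], [1, 4], [2, 3]]
```

With these made-up sizes the in-order packing needs four partitions, while the two-ended packing needs three, and its first partition pairs index 0 with index 5, skipping the files in between. That skipped range is the kind of gap described above.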
Github user vgankidi commented on the issue: https://github.com/apache/spark/pull/20372 I agree with @ash211. Applications shouldn't rely on the order of the files within a partition. This optimization looks good to me.
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20372 Please see https://spark.apache.org/contributing.html. Could you open a JIRA and update this PR?