[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/21638 Ideally the last test should have 50 partitions? is it because we really need the test data to be at least 50 bytes? ideally a multiple of 50, I guess. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user bomeng commented on the issue: https://github.com/apache/spark/pull/21638 Here is the test code, not sure it is right or not --- ``` test("Number of partitions") { sc = new SparkContext(new SparkConf().setAppName("test").setMaster("local") .set("spark.files.maxPartitionBytes", "10") .set("spark.files.openCostInBytes", "0") .set("spark.default.parallelism", "1")) val dir1 = Utils.createTempDir() val dirpath1 = dir1.getAbsolutePath val dir2 = Utils.createTempDir() val dirpath2 = dir2.getAbsolutePath val file1 = new File(dir1, "part-0") val file2 = new File(dir1, "part-1") Files.write("someline1 in file1\nsomeline2 in file1\nsomeline3 in file1", file1, StandardCharsets.UTF_8) Files.write("someline1 in file2\nsomeline2 in file2\nsomeline3 in file2", file2, StandardCharsets.UTF_8) assert(sc.binaryFiles(dirpath1, minPartitions = 1).getNumPartitions == 2) assert(sc.binaryFiles(dirpath1, minPartitions = 2).getNumPartitions == 2) assert(sc.binaryFiles(dirpath1, minPartitions = 50).getNumPartitions == 2) } ``` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21638 Yea, let's add a regression test. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21638 @bomeng Could you submit a follow-up PR to add a test case? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/21638 Merged to master --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/95295/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21638 **[Test build #95295 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95295/testReport)** for PR 21638 at commit [`5e46efb`](https://github.com/apache/spark/commit/5e46efb5f5ce86297c4aeb23bf934fd9942de3de). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21638 **[Test build #95295 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/95295/testReport)** for PR 21638 at commit [`5e46efb`](https://github.com/apache/spark/commit/5e46efb5f5ce86297c4aeb23bf934fd9942de3de). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2577/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21638 **[Test build #4290 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4290/testReport)** for PR 21638 at commit [`c24fbe5`](https://github.com/apache/spark/commit/c24fbe5cdf259814c30d9038fa3c35a2934ac39f). * This patch **fails Spark unit tests**. * This patch **does not merge cleanly**. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21638 **[Test build #4290 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4290/testReport)** for PR 21638 at commit [`c24fbe5`](https://github.com/apache/spark/commit/c24fbe5cdf259814c30d9038fa3c35a2934ac39f). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21638 Yea, it's internal to Spark. Might be good to keep it but that concern should be secondary IMHO. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/21638 Except for `binaryFiles`, everything else that needs to change is private to Spark. I know it's public in the bytecode, but only Java callers could accidentally exploit that. Still I don't personally care too much either way, as long as all the unused args are documented, I guess, for completeness. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user bomeng commented on the issue: https://github.com/apache/spark/pull/21638 Either way works for me, but I think since this is not a private method, so people may use it in their own approach. The minimal change will be the best. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/21638 Because this method is internal to Spark, why not just take out the parameter? Yes it's superfluous now, but it's been this way for a while, and seems perhaps better to avoid a behavior change. In fact you can pull a `minPartitions` parameter out of several private methods then. You can't remove the parameter to `binaryFiles`, sure, but it can be documented as doing nothing. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21638 **[Test build #93067 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93067/testReport)** for PR 21638 at commit [`c24fbe5`](https://github.com/apache/spark/commit/c24fbe5cdf259814c30d9038fa3c35a2934ac39f). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/983/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21638 **[Test build #93067 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93067/testReport)** for PR 21638 at commit [`c24fbe5`](https://github.com/apache/spark/commit/c24fbe5cdf259814c30d9038fa3c35a2934ac39f). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21638 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user bomeng commented on the issue: https://github.com/apache/spark/pull/21638 @HyukjinKwon please review. thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92350/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21638 **[Test build #92350 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92350/testReport)** for PR 21638 at commit [`c24fbe5`](https://github.com/apache/spark/commit/c24fbe5cdf259814c30d9038fa3c35a2934ac39f). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21638 **[Test build #92350 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92350/testReport)** for PR 21638 at commit [`c24fbe5`](https://github.com/apache/spark/commit/c24fbe5cdf259814c30d9038fa3c35a2934ac39f). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/490/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21638 Not sure yet but let's leave that out of this PR. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user MaxGekk commented on the issue: https://github.com/apache/spark/pull/21638 It seems there is similar code there: https://github.com/apache/spark/blob/e76b0124fbe463def00b1dffcfd8fd47e04772fe/sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala#L424-L433 . Should it be changed in the same way? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92309/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21638 **[Test build #92309 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92309/testReport)** for PR 21638 at commit [`0fc35d4`](https://github.com/apache/spark/commit/0fc35d4e0db34239cd3c52b0cf21445c59d2dede). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21638 **[Test build #92309 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92309/testReport)** for PR 21638 at commit [`0fc35d4`](https://github.com/apache/spark/commit/0fc35d4e0db34239cd3c52b0cf21445c59d2dede). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21638: [SPARK-22357][CORE] SparkContext.binaryFiles ignore minP...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21638 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/464/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org