[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-24 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 @cloud-fan , exception screenshot. Let me know if you want any change. ![image](https://user-images.githubusercontent.com/2989575/47471258-1793ce00-d83c-11e8-90bf-107865fc9032.png)

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-24 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97975/ Test PASSed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #97975 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97975/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-24 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #97975 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97975/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19788 BTW, let's add a config for this feature. We may enable adaptive execution by default in the future, and we should still allow users to run spark with legacy shuffle service. We should also throw

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #97787 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97787/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #97745 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97745/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19788 Hi @yucai , good points on the performance concerns. Let's go with the previous approach: https://github.com/apache/spark/pull/19788#issuecomment-366887404 sorry for the back and forth!

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Can one of the admins verify this patch?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Can one of the admins verify this patch?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Can one of the admins verify this patch?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #97745 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97745/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Can one of the admins verify this patch?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Can one of the admins verify this patch?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Build finished. Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/97715/ Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #97715 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97715/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #97715 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/97715/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-10-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Can one of the admins verify this patch?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-08-11 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 **Summary** One disk IO solution's performance seems not as good as current PR19877's implementation. **Benchmark** ```scala spark.range(1, 512000L, 1,

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-07-31 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19788 > One possible solution is to read all contiguous partition in one shot and then send each shuffle block one by one, how do you think? We may need benchmark performance in this way.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-07-30 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 @cloud-fan @gatorsmile I am trying the new method as suggested and I have a question. If we make it **purely server-side** optimization, for external shuffle service, it has no idea how

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-07-30 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 @gatorsmile @cloud-fan @carsonwang I will update it soon.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-07-30 Thread gatorsmile
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/19788 ping @yucai @carsonwang

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Can one of the admins verify this patch?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-29 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 Have synced with @cloud-fan , I will update this, thanks very much!

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-22 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19788 I think we can do this better, to make it a purely server-side optimization. The shuffle protocol can already fetch multiple blocks in one request, i.e. the `OpenBlocks` request. The

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-04 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 @cloud-fan , sorry for the late response, could you help take a look?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87947/ Test PASSed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-04 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test PASSed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87947 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87947/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-04 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87947 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87947/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87930/ Test PASSed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test PASSed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87930 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87930/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87930 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87930/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-01 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 Jenkins, retest this please

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-03-01 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 I did local UT, org.apache.spark.sql.FileBasedDataSourceSuite is good.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87814/ Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87814 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87814/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test PASSed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-28 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87812/ Test PASSed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87812 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87812/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87814 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87814/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-28 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87812 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87812/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-22 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 @cloud-fan if encryption is enabled `blockManager.serializerManager().encryptionEnabled() == true`, shall we disable this feature also?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87610/ Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-22 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87610 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87610/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-22 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87610 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87610/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87607/ Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87607 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87607/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-21 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87607 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87607/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-21 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 Good suggestion, thanks @cloud-fan! Let me update accordingly.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-19 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19788 The idea LGTM, but I think @JoshRosen has a valid concern. My 2 cents: 1. The concept of reading multiple reducer partitions in one shot was introduced by `ShuffleManager.getReader`. Although

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87404/ Test PASSed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-13 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Build finished. Test PASSed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87404 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87404/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-13 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87404 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87404/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-13 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/19788 Jenkins, retest this please

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-11 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 I did local UT, org.apache.spark.sql.FileBasedDataSourceSuite looks good, @jerryshao could you help re-trigger testing?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-11 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 retest this please.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-11 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 @JoshRosen this feature only happens in SparkSQL, and SparkSQL uses UnsafeRowSerializer, which supports relocation of serialized objects, so I think we only consider compression, right? I have added

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-11 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 retest this please

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87312/ Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87312 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87312/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87312 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87312/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87305/ Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-11 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87305 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87305/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-11 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #87305 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87305/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Can one of the admins verify this patch?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-03 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 Thanks @wangyum for this nice data support! Now we can see an obvious time reduction from this feature.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-02 Thread wangyum
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/19788 Thanks @yucai , It's a great improvement for many output files. The figure below is our comparison: **Before**:

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test PASSed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86991/ Test PASSed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #86991 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86991/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #86991 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86991/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #86939 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86939/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86939/ Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #86939 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86939/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #86938 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86938/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Merged build finished. Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86938/ Test FAILed.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-02-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19788 **[Test build #86938 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86938/testReport)** for PR 19788 at commit

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Can one of the admins verify this patch?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2018-01-18 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Can one of the admins verify this patch?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-12-14 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 I will update a new version.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-12-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19788 Can one of the admins verify this patch?

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-28 Thread JoshRosen
Github user JoshRosen commented on the issue: https://github.com/apache/spark/pull/19788 Is there an implicit assumption here that contiguous partitions' data can be decompressed / deserialized in a single stream? If the shuffled data is written with a non-relocatable serializer

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-27 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/19788 Sounds good. It would be great if we could document clearly that users who want to use adaptive execution have to update the external shuffle service.

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-27 Thread yucai
Github user yucai commented on the issue: https://github.com/apache/spark/pull/19788 @jerryshao @cloud-fan @gczsjdy Because this feature is only used in adaptive execution, how about this way: - Remove `spark.shuffle.continuousFetch` - When

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-26 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19788 Can we just add the `ContinuousShuffleBlockId` without adding new conf `spark.shuffle.continuousFetch`? While in classes related to shuffle read like `ShuffleBlockFetcherIterator`, we also pattern

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-26 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19788 @yucai I'm thinking of the necessity to add this new configuration `spark.shuffle.continuousFetch` like you mentioned above. This PR you proposed is actually a superset of previous way, it is

[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...

2017-11-26 Thread gczsjdy
Github user gczsjdy commented on the issue: https://github.com/apache/spark/pull/19788 What is the `external shuffle service` here? Can you explain a little bit?
