[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-08 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21212 @jinxing64 > I guess your concern is ArrayBuffer will do lots of copy as size of elements grows, and we don't need fast random access in ShuffleBlockFetcherIterator my concern wasn't

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-07 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/21212 Thanks for merging ! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail:

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-07 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21212 thanks, merging to master!

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90283/ Test PASSed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Merged build finished. Test PASSed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21212 **[Test build #90283 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90283/testReport)** for PR 21212 at commit

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2968/

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Merged build finished. Test PASSed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21212 **[Test build #90283 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90283/testReport)** for PR 21212 at commit

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Merged build finished. Test PASSed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2966/

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Merged build finished. Test FAILed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90280/ Test FAILed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21212 **[Test build #90280 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90280/testReport)** for PR 21212 at commit

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-06 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21212 **[Test build #90280 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90280/testReport)** for PR 21212 at commit

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Merged build finished. Test FAILed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90243/ Test FAILed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21212 **[Test build #90243 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90243/testReport)** for PR 21212 at commit

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-05 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/21212 @squito I used "YourKit Memory Inspections" to analyze the heap, but didn't find many duplicate objects in ArrayBuffer. I guess your concern is ArrayBuffer will do lots of copy as size

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2944/

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Merged build finished. Test PASSed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-05 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21212 **[Test build #90243 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90243/testReport)** for PR 21212 at commit

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-03 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21212
> do you mean optimize space usage for MapStatus when there are lots of consecutive empty-blocks?
Yea, something like doing an RLE for the size array in `CompressedMapStatus`. But this
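The run-length encoding (RLE) idea mentioned in this comment can be sketched roughly as follows. This is only an illustration in Python, not Spark's code: the helper names `rle_encode` and `rle_decode` and the sample sizes are hypothetical, and the thread does not say how such an encoding would be wired into `CompressedMapStatus`.

```python
from itertools import groupby
from typing import List, Tuple


def rle_encode(sizes: List[int]) -> List[Tuple[int, int]]:
    """Collapse consecutive equal sizes into (size, run_length) pairs."""
    return [(size, sum(1 for _ in run)) for size, run in groupby(sizes)]


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (size, run_length) pairs back into the original size list."""
    decoded: List[int] = []
    for size, length in runs:
        decoded.extend([size] * length)
    return decoded


# Hypothetical per-reducer block sizes with runs of empty (zero-sized) blocks.
sizes = [0, 0, 0, 0, 7, 0, 0, 3, 3, 0]
runs = rle_encode(sizes)
```

A size array dominated by long runs of zeros shrinks to a handful of pairs, while an array with few repeats may grow, so an encoding like this would only pay off when consecutive empty blocks are common.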

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90127/ Test PASSed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Merged build finished. Test PASSed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21212 **[Test build #90127 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90127/testReport)** for PR 21212 at commit

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-03 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21212 Can you add a test in MapOutputTrackerSuite and update the PR description to include all the changes? Overall this looks good.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-03 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21212 If you have a heap dump, there are tools that can check for wasted space in ArrayBuffer, e.g. [jxray](http://www.jxray.com/) or [YourKit Memory

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-03 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/21212
@squito @cloud-fan @jiangxb1987 Thanks a lot for the review.
> shall we also optimize the space usage for MapStatus
@cloud-fan do you mean optimize space usage for MapStatus when

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2867/

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Merged build finished. Test PASSed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-03 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21212 **[Test build #90127 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90127/testReport)** for PR 21212 at commit

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread squito
Github user squito commented on the issue: https://github.com/apache/spark/pull/21212 This makes sense to me. You should update the comment on `MapOutputTracker.getMapSizesByExecutorId` to mention that it excludes the zero-sized blocks, and also remove the filter in

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90041/ Test FAILed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Merged build finished. Test FAILed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21212 **[Test build #90041 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90041/testReport)** for PR 21212 at commit

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/21212 I think it's reasonable to filter out empty shuffle blocks. Shall we also optimize the space usage for `MapStatus`? Then we can also reduce network traffic.
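A minimal sketch of the filtering discussed in this thread, written in Python for illustration only. The names `MapStatusSketch` and `map_sizes_by_executor` are hypothetical stand-ins rather than Spark's API; the actual change concerns `MapOutputTracker`, whose code is not shown in this thread.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class MapStatusSketch:
    """Hypothetical stand-in for one map task's output status."""
    executor_id: str
    sizes: List[int]  # sizes[r] = bytes written for reduce partition r


def map_sizes_by_executor(
    statuses: List[MapStatusSketch], reduce_id: int
) -> Dict[str, List[Tuple[int, int]]]:
    """Group (map_id, size) pairs by executor, skipping zero-sized blocks."""
    by_executor: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for map_id, status in enumerate(statuses):
        size = status.sizes[reduce_id]
        if size != 0:  # empty blocks are dropped here, so callers never see them
            by_executor[status.executor_id].append((map_id, size))
    return dict(by_executor)


# Three hypothetical map outputs, each with two reduce partitions.
statuses = [
    MapStatusSketch("exec-1", [10, 0]),
    MapStatusSketch("exec-2", [0, 5]),
    MapStatusSketch("exec-1", [0, 0]),
]
blocks_for_reducer_0 = map_sizes_by_executor(statuses, reduce_id=0)
```

Filtering at conversion time means no pair is materialized for an empty block, so a later consumer would not need its own zero-size filter.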

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/21212 How much memory did the converted pairs consume? If the empty blocks are an issue, can we just clean them up?

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Merged build finished. Test PASSed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2804/

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21212 **[Test build #90041 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90041/testReport)** for PR 21212 at commit

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/21212 Jenkins, retest this please

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90032/ Test FAILed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Merged build finished. Test FAILed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-02 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21212 **[Test build #90032 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90032/testReport)** for PR 21212 at commit

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Merged build finished. Test PASSed.

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-01 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21212 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/2795/

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-01 Thread SparkQA
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21212 **[Test build #90032 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90032/testReport)** for PR 21212 at commit

[GitHub] spark issue #21212: [SPARK-24143] filter empty blocks when convert mapstatus...

2018-05-01 Thread jinxing64
Github user jinxing64 commented on the issue: https://github.com/apache/spark/pull/21212 @squito @cloud-fan @jiangxb1987 Do you think this makes sense?