[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-11 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19135 @cloud-fan Very sorry for the late reply. I updated the code following your suggestion. Does this need a performance test? If so, I will run one later. ---

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-06 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19135 is it better to do batch unrolling? i.e., we can check memory usage and request memory for like every 10 records, instead of doing it for every record. ---
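The batch-unrolling idea suggested above can be sketched roughly as follows. This is a hypothetical illustration only: `reserveUnrollMemory`, `memoryCheckPeriod`, and `BatchUnrollSketch` are made-up names standing in for Spark's actual `MemoryStore` internals, and the point is just that the reservation cost is paid once per batch of records instead of once per record.

```scala
// Sketch: amortize unroll-memory reservations over batches of records.
// All names here are illustrative, not the real MemoryStore API.
object BatchUnrollSketch {
  val memoryCheckPeriod = 10 // check/request memory every 10 records
  var memoryReserved = 0L

  // Stand-in for MemoryStore.reserveUnrollMemoryForThisTask; always succeeds here.
  def reserveUnrollMemory(bytes: Long): Boolean = {
    memoryReserved += bytes
    true
  }

  // Unroll an iterator, reserving memory only once per memoryCheckPeriod records.
  def unroll(records: Iterator[Array[Byte]]): Long = {
    var count = 0L
    var keepUnrolling = true
    while (records.hasNext && keepUnrolling) {
      val record = records.next()
      count += 1
      if (count % memoryCheckPeriod == 0) {
        // One reservation covers the whole batch, estimated from the last record's size.
        keepUnrolling = reserveUnrollMemory(record.length.toLong * memoryCheckPeriod)
      }
    }
    count
  }
}
```

With 25 records of 8 bytes each, only two reservations are made (at records 10 and 20), versus 25 with a per-record check.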

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-06 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19135 @jiangxb1987 OK, I can test it later. The picture below is from a kmeans run with the source data cached in off-heap memory; you can see the CPU time occupied by

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-06 Thread jiangxb1987
Github user jiangxb1987 commented on the issue: https://github.com/apache/spark/pull/19135 It would be great to test the perf on executors with various coresPerExecutor settings to ensure the code change doesn't introduce regressions. ---

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-05 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19135 Hi @cloud-fan, the previous version was written the same way as `putIteratorAsValues`. I have now modified the code so that each request asks for an additional `chunkSize` bytes of memory, because the size of
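The over-reservation pattern described in this comment can be illustrated with a small sketch. All names here (`ChunkedReserveSketch`, `reserveIfNecessary`, `chunkSize`) are hypothetical stand-ins, not the PR's actual code: the idea is that when used memory exceeds the reservation, the task requests the shortfall plus one extra `chunkSize`, so the next several records need no new request.

```scala
// Sketch: over-reserve by chunkSize bytes to amortize memory requests.
// Names are illustrative, not the actual MemoryStore internals.
object ChunkedReserveSketch {
  val chunkSize = 1024L // extra bytes requested beyond current usage
  var reserved = 0L     // total memory reserved so far
  var requests = 0      // how many reservation calls were made

  // Only issue a new request when usage outgrows the current reservation.
  def reserveIfNecessary(used: Long): Unit = {
    if (used > reserved) {
      requests += 1
      reserved = used + chunkSize // cover current usage plus one spare chunk
    }
  }
}
```

For example, 30 writes of 100 bytes each trigger only 3 reservation requests (at roughly 100, 1200, and 2300 bytes of usage) instead of 30, which is the amortization this thread is after.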

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-05 Thread cloud-fan
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/19135 Does this patch have any regressions? It seems to me that allocating more memory may starve other tasks/operators and reduce overall performance. ---

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-05 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19135 First, serialization did not take long, as you can see here: https://user-images.githubusercontent.com/12733256/30067330-1596eb1e-928d-11e7-818a-4a292e601a26.png

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-05 Thread jerryshao
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/19135 Sorry, I'm not so familiar with this part, but from the test results it seems the performance improved only a little. I suspect the way you generate the RDD, `0 until Integer.MAX_VALUE`, might take

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-05 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19135 Here https://github.com/apache/spark/pull/19135/files?diff=unified#diff-870cd3693df7a5add2ac3119d7d91d34L373, we call `reserveAdditionalMemoryIfNecessary()` for every record written. ---

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-05 Thread ConeyLiu
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19135 Hi @cloud-fan @jerryshao, would you mind taking a look? Thanks a lot. ---

[GitHub] spark issue #19135: [SPARK-21923][CORE]Avoid call reserveUnrollMemoryForThis...

2017-09-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19135 Can one of the admins verify this patch? ---