On Tue, 23 Feb 2021 08:06:14 GMT, Ralf Schmelter <rschmel...@openjdk.org> wrote:

>> Hi @schmelter-sap,
>> Thanks a lot for reviewing and benchmarking. 
>> 
>>> I've benchmarked the code on my machine (128GB memory, 56 logical CPUs) 
>>> with an example creating a 32 GB heap dump. I only saw a 10 percent 
>>> reduction in time, both using uncompressed and compressed dumps. Have you 
>>> seen better numbers in your benchmarks?
>>>
>>> And it seems to potentially use a lot more temporary memory. In my example 
>>> I had a 4 GB array in the heap and the new code allocated 4 GB of 
>>> additional memory to write this array. This could happen in more threads in 
>>> parallel, increasing the memory consumption even more.
>> 
>> I have done some preliminary tests on my machine (16 GB, 8 cores); the data 
>> are shown as follows:
>> `$ jmap -dump:file=dump4.bin,parallel=4 127420`
>> `Dumping heap to /home/lzang1/Source/jdk/dump4.bin ...`
>> `Heap dump file created [932950649 bytes in 0.591 secs]`
>> `$ jmap -dump:file=dump1.bin,parallel=1 127420`
>> `Dumping heap to /home/lzang1/Source/jdk/dump1.bin ...`
>> `Heap dump file created [932950739 bytes in 2.957 secs]`
>> 
>> But I have observed unstable results on a machine with more cores and larger 
>> RAM, plus a workload with higher heap usage. I suspect that may be related to 
>> the memory consumption you mentioned, and I am investigating ways to 
>> optimize it.
>> 
>>> If the above problems could be fixed, I would suggest to just use the 
>>> parallel code in all cases.
>> 
>> Thanks a lot! I will let you know when I make some progress on optimization.
>> 
>> BRs,
>> Lin
>
> Hi @linzang,
> 
> I've done more benchmarking using different numbers of threads for parallel 
> heap iteration and have found values which give at least a factor of 2 
> speedup (for gzipped dumps) or 1.6 (for unzipped dumps). For my scenario 
> using gzip compression about 10 percent of the available CPUs for parallel 
> iteration gave the best speedup, for the uncompressed one it was about 7 
> percent. 
> 
> Note that the baseline I compared against was not the parallel=1 case, but 
> the old code. The parallel=1 case was always 10 to 20 percent slower than the 
> old code.
> 
> Best regards,
> Ralf

Dear @ralf,
Thanks a lot for benchmarking it!
It is a little surprising to me that "parallel=1" is 10~20 percent slower than 
the old code. I believe this can be avoided with some revisions to the code. I 
have also found a potential memory leak in the implementation and am working on 
a fix.
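One way to keep the temporary memory bounded (a hypothetical sketch, not the 
PR's actual code) is to stream a large array through a fixed-size scratch 
buffer instead of materializing a second full-size copy, as in the 4 GB case 
mentioned above:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical sketch (not the PR's code): stream a large long[] through a
// fixed-size scratch buffer so the extra memory per dump thread stays
// constant, instead of allocating a second full-size copy of the array.
public class ChunkedWriter {
    private static final int CHUNK = 1 << 20; // 1 MiB scratch buffer

    static void writeLongs(long[] data, OutputStream out) throws IOException {
        byte[] buf = new byte[CHUNK];
        int pos = 0;
        for (long v : data) {
            if (pos + 8 > buf.length) {   // flush before the buffer overflows
                out.write(buf, 0, pos);
                pos = 0;
            }
            for (int s = 56; s >= 0; s -= 8) {
                buf[pos++] = (byte) (v >>> s);   // big-endian, as HPROF uses
            }
        }
        if (pos > 0) out.write(buf, 0, pos);  // flush the tail
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        writeLongs(new long[]{1L, -1L}, bos);
        System.out.println(bos.size());   // 16: two 8-byte big-endian longs
    }
}
```

With a 1 MiB buffer, the additional memory per dump thread no longer scales 
with the size of the largest array in the heap.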

> I've done more benchmarking using different numbers of threads for parallel 
> heap iteration and have found values which give at least a factor of 2 
> speedup (for gzipped dumps) or 1.6 (for unzipped dumps). For my scenario 
> using gzip compression about 10 percent of the available CPUs for parallel 
> iteration gave the best speedup, for the uncompressed one it was about 7 
> percent.

These data are really interesting to me: it seems that writing a gzipped dump 
is faster than an unzipped one. Is that because of the reduced disk writes, or 
something else? I will investigate it further.
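As a rough illustration of those ratios (a hypothetical sketch, not the PR's 
actual selection logic), the iteration thread count could be derived from the 
CPU count like this:

```java
// Hypothetical heuristic based on Ralf's numbers above: use ~10% of the
// available CPUs for gzipped dumps and ~7% for uncompressed ones, with a
// floor of one thread. Not the PR's actual policy.
public class DumpParallelism {
    static int parallelism(int cpus, boolean gzipped) {
        int percent = gzipped ? 10 : 7;
        return Math.max(1, cpus * percent / 100);
    }

    public static void main(String[] args) {
        // Example: a 56-CPU machine like the one used for benchmarking.
        System.out.println(parallelism(56, true));   // 5 threads, gzipped
        System.out.println(parallelism(56, false));  // 3 threads, uncompressed
    }
}
```

The floor of one thread keeps small machines from rounding down to zero.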
Thanks a lot!

BRs,
Lin

-------------

PR: https://git.openjdk.java.net/jdk/pull/2261
