Hi Ralf,
One part of this caught my eye and now I look at the webrev I have some
concerns. Introducing new threads to the VM is not something that should
be done lightly and it has to be done very carefully - I need to look
closer at this aspect. Further when using Mutexes/Monitors in such code
you have to be extremely careful about how (or even if) those
Mutex/Monitor get deleted. The code you have at present is not safe
because you cannot know when other threads have completely exited the
Monitor/Mutex code. The last thread to terminate will signal the
destructing thread (blocked in wait) then release the monitor, allowing
the destructing thread to acquire the monitor and then delete the _lock.
But at the point at which the monitor becomes free and the destructor
thread is unparked, the terminating thread may be context switched out
and remain inside the Monitor code. The destructor thread then deletes
the monitor and frees it. When the terminating thread resumes, if it
touches any memory associated with the Monitor it could SEGV.
To safely delete a Monitor/Mutex you have to know for certain that all
threads using it have completely ceased to use it. You cannot use that
Monitor/Mutex as the means for determining that. It is a non-trivial
problem to solve.
Cheers,
David
-----
On 11/02/2020 1:33 am, Schmelter, Ralf wrote:
Hi Yasumasa,
You can use `DCmdArgument<jlong>` for -gz option.
That is what I originally tried. But then you always have to supply a
compression level (just specifying -gz doesn't work). Since I would expect most
users never caring about the compression level, I switched to a string option,
which can handle this pattern.
_nr_of_threads, _id_to_write, _current in CompressionBackend should be added
`volatile` at least.
I don't think that is needed. Apart from the initialization, they are only
changed under lock protection.
BTW how much processing time is different between single threaded and multi
threaded?
I've benchmarked an example, which creates a ~31 GB uncompressed hprof file,
with a VM which doesn't use any background threads. Here are the size of the
create files, the compression level and the time spend:
Uncompressed, 31.6 G, 71 sec
gzipped level 1, 7.57 G, 463 sec (x6.5)
gzipped level 3, 7.10 G, 609 sec (x8.6)
gzipped level 6, 6.49 G, 1415 sec (x19.9)
So even the fastest gzip compression makes writing the dump at least 5 times as
slow.
Also I want to know what number is set to ParallelGCThreads.
ParallelGCThreads seems to affect to thread num for GZip compression.
Originally, I've tried to use the WorkGang (CollectedHeap::
get_safepoint_workers()) of the GC to do the work. But this wouldn't work
because Shenandoah could not iterate the heap from a worker thread. So I've
opted to start the needed threads itself for the time of the heap dump. I've
used ParallelGCThreads as the maximum number of threads, since this is what
would be used for a GC too. So it should not clog up the machine more than a
GC. Maybe it would be even better to additionally limit the threads by the
compression level.
Best regards,
Ralf Schmelter
-----Original Message-----
From: Yasumasa Suenaga <[email protected]>
Sent: Samstag, 8. Februar 2020 14:46
To: Schmelter, Ralf <[email protected]>; OpenJDK Serviceability
<[email protected]>
Cc: [email protected]
Subject: Re: RFR(L) 8237354: Add option to jcmd to write a gzipped heap dump
Hi Ralf,
- diagnosticCommand.cpp
You can use `DCmdArgument<jlong>` for -gz option.
If you want to use lesser type (e.g. int, unsigned char), I guess you need
to modify GenDCmdArgument class.
- heapDumper.cpp
_nr_of_threads, _id_to_write, _current in CompressionBackend should be added
`volatile` at least.
(Other values need to be checked)
BTW how much processing time is different between single threaded and multi
threaded?
Also I want to know what number is set to ParallelGCThreads.
ParallelGCThreads seems to affect to thread num for GZip compression.
Thanks,
Yasumasa