Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

臧琳 Sat, 09 May 2020 00:50:09 -0700

Dear All, 
       May I ask your help again for review the latest change?  Thanks!


BRs,
Lin

On 2020/4/28, 1:54 PM, "linzang(臧琳)" <[email protected]> wrote:

    Hi Stefan, 
      >>  - Adding Atomic::load/store.
      >>  - Removing the time measurement in the run_task. I renamed G1's 
function 
      >>  to run_task_timed. If we need this outside of G1, we can rethink the 
API 
      >>  at that point.
       >>  - ZGC style cleanups
       Thanks for revising the patch,  they are all good to me, and I have made 
a tiny change based on it: 
           http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04/ 
           
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_04-delta/
      it reduce the scope of mutex in ParHeapInspectTask, and delete 
unnecessary comments.

    BRs,
    Lin

    On 2020/4/27, 4:34 PM, "Stefan Karlsson" <[email protected]> wrote:

        Hi Lin,

        On 2020-04-26 05:10, linzang(臧琳) wrote:
        > Hi Stefan and Paul，
        >      I have made a new patch based on your comments and Stefan's Poc 
code:
        >      Webrev: 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/
        >      Delta(based on Stefan's change:) : 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/

        Thanks for providing a delta patch. It makes it much easier to look at, 
        and more likely for reviewers to continue reviewing.

        I'm going to continue focusing on the GC parts, and leave the rest to 
        others to review.

        > 
        >      And Here are main changed I made and want to discuss with you:
        >      1.  changed"parallelThreadNum=" to "parallel=" for jmap -histo 
options.
        >      2.  Add logic to test where parallelHeapInspection is fail, in 
heapInspection.cpp
        >            This is because the parHeapInspectTask create thread local 
KlassInfoTable in it's work() method, and this may fail because of native OOM, 
in this case, the parallel should fail and serial heap inspection can be tried.
        >            One more thing I want discuss with you is about the member 
"_success" of parHeapInspectTask, when native OOM happenes, it is set to false. 
And since this "set" operation can be conducted in multiple threads, should it 
be atomic ops?  IMO, this is not necessary because "_success" can only be set 
to false, and there is no way to change it from back to true after the 
ParHeapInspectTask instance is created, so it is save to be non-atomic, do you 
agree with that?

        In these situations you should be using the Atomic::load/store 
        primitives. We're moving toward a later C++ standard were data races 
are 
        considered undefined behavior.

        >     3. make CollectedHeap::run_task() be an abstract virtual func, so 
that every subclass of collectedHeap should support it, so later implementation 
of new collectedHeap will not miss the "parallel" features.
        >           The problem I want to discuss with you is about epsilonHeap 
and SerialHeap, as they may not need parallel heap iteration, so I only make 
task->work(0), in case the run_task() is invoked someway in future. Another way 
is to left run_task()  unimplemented, which one do you think is better?

        I don't have a strong opinion about this.

          And also please help take a look at the zHeap, as there is a class 
        zTask that wrap the abstractGangTask, and the collectedHeap::run_task() 
        only accept  AbstraceGangTask* as argument, so I made a delegate class 
        to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp.
        > 
        >        There maybe other better ways to sovle the above problems, 
welcome for any comments, Thanks!

        I've created a few cleanups and changes on top of your latest patch:

        https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta
        https://cr.openjdk.java.net/~stefank/8215624/webrev.02

        - Adding Atomic::load/store.
        - Removing the time measurement in the run_task. I renamed G1's 
function 
        to run_task_timed. If we need this outside of G1, we can rethink the 
API 
        at that point.
        - ZGC style cleanups

        Thanks,
        StefanK

        > 
        > BRs,
        > Lin
        > 
        > On 2020/4/23, 11:08 AM, "linzang(臧琳)" <[email protected]> wrote:
        > 
        >      Thanks Paul! I agree with using "parallel", will make the update 
in next patch, Thanks for help update the CSR.
        > 
        >      BRs,
        >      Lin
        > 
        >      On 2020/4/23, 4:42 AM, "Hohensee, Paul" <[email protected]> 
wrote:
        > 
        >          For the interface, I'd use "parallel" instead of 
"parallelThreadNum". All the other options are lower case, and it's a lot 
easier to type "parallel". I took the liberty of updating the CSR. If you're ok 
with it, you might want to change variable names and such, plus of course 
JMap.usage.
        > 
        >          Thanks,
        >          Paul
        > 
        >          On 4/22/20, 2:29 AM, "serviceability-dev on behalf of 
linzang(臧琳)" <[email protected] on behalf of 
[email protected]> wrote:
        > 
        >              Dear Stefan,
        > 
        >                      Thanks a lot! I agree with you to decouple the 
heap inspection code with GC's.
        >                      I will start  from your POC code, may discuss 
with you later.
        > 
        > 
        >              BRs,
        >              Lin
        > 
        >              On 2020/4/22, 5:14 PM, "Stefan Karlsson" 
<[email protected]> wrote:
        > 
        >                  Hi Lin,
        > 
        >                  I took a look at this earlier and saw that the heap 
inspection code is
        >                  strongly coupled with the CollectedHeap and 
G1CollectedHeap. I'd prefer
        >                  if we'd abstract this away, so that the GCs only 
provide a "parallel
        >                  object iteration" interface, and the heap inspection 
code is kept elsewhere.
        > 
        >                  I started experimenting with doing that, but other 
higher-priority (to
        >                  me) tasks have had to take precedence.
        > 
        >                  I've uploaded my work-in-progress / proof-of-concept:
        >                    
https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/
        >                    
https://cr.openjdk.java.net/~stefank/8215624/webrev.01/
        > 
        >                  The current code doesn't handle the lifecycle 
(deletion) of the
        >                  ParallelObjectIterators. There's also code left 
unimplemented in around
        >                  CollectedHeap::run_task. However, I think this could 
work as a basis to
        >                  pull out the heap inspection code out of the GCs.
        > 
        >                  Thanks,
        >                  StefanK
        > 
        >                  On 2020-04-22 02:21, linzang(臧琳) wrote:
        >                  > Dear all,
        >                  >       May I ask you help to review? This RFR has 
been there for quite a while.
        >                  >       Thanks!
        >                  >
        >                  > BRs,
        >                  > Lin
        >                  >
        >                  > > On 2020/3/16, 5:18 PM, "linzang(臧琳)" 
<[email protected]> wrote:>
        >                  >
        >                  >>    Just update a new path, my preliminary measure 
show about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core 
platform with parallel thread number set to 4).
        >                  >>     webrev: 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/
        >                  >>     bug: 
https://bugs.openjdk.java.net/browse/JDK-8215624
        >                  >>     CSR: 
https://bugs.openjdk.java.net/browse/JDK-8239290
        >                  >>     BRs,
        >                  >>       Lin
        >                  >>       > On 2020/3/2, 9:56 PM, "linzang(臧琳)" 
<[email protected]> wrote:
        >                  >>       >
        >                  >>       >    Dear all,
        >                  >>       >          Let me try to ease the reviewing 
work by some explanation :P
        >                  >>       >          The patch's target is to speed 
up jmap -histo for heap iteration, from my experience it is necessary for large 
heap investigation. E.g in bigData scenario I have tried to conduct jmap -histo 
against 180GB heap, it does take quite a while.
        >                  >>       >          And if my understanding is 
corrent, even the jmap -histo without "live" option does heap inspection with 
heap lock acquired. so it is very likely to block mutator thread in 
allocation-sensitive scenario. I would say the faster the heap inspection does, 
the shorter the mutator be blocked. This is parallel iteration for jmap is 
necessary.
        >                  >>       >          I think the parallel heap 
inspection should be applied to all kind of heap. However, consider the heap 
layout are different for  GCs, much time is required to understand all kinds of 
the heap layout to make the whole change. IMO, It is not wise to have a huge 
patch for the whole solution at once, and it is even harder to review it. So I 
plan to implement it incrementally, the first patch (this one) is going to 
confirm the implemention detail of how jmap accept the new option, passes it to 
attachListener of the jvm process and then how to make the parallel inspection 
closure be generic enough to make it easy to extend to different heap layout. 
And also how to implement the heap inspection in specific gc's heap. This patch 
use G1's heap as the begining.
        >                  >>       >          This patch actually do several 
things:
        >                  >>       >          1. Add an option 
"parallelThreadNum=<N>" to jmap -histo, the default behavior is to set N to 0, 
means let's JVM decide how many threads to use for heap inspection. Set this 
option to 1 will disable parallel heap inspection. (more details in CSR: 
https://bugs.openjdk.java.net/browse/JDK-8239290)
        >                  >>       >          2. Make a change in how Jmap 
passing arguments, changes in 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html,
 originally it pass options as separate arguments to attachListener, this patch 
change to that all options be compose to a single string. So the arg_count_max 
in attachListener.hpp do not need to be changed, and hence avoid the 
compatibility issue, as disscussed at 
https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html
        >                  >>       >         3. Add an abstract class 
ParHeapInspectTask in heapInspection.hpp / heapInspection.cpp, It's work(uint 
worker_id) method prepares the data structure (KlassInfoTable) need for every 
parallel worker thread, and then call do_object_iterate_parallel() which is 
heap specific implementation. I also added some machenism in KlassInfoTable to 
support parallel iteration, such as merge().
        >                  >>       >        4. In specific heap (G1 in this 
patch), create a subclass of ParHeapInspectTask, implement the 
do_object_iterate_parallel() for parallel heap inspection. For G1, it simply 
invoke g1CollectedHeap's object_iterate_parallel().
        >                  >>       >        5. Add related test.
        >                  >>       >        6. it may be easy to extend this 
patch for other kinds of heap by creating subclass of ParHeapInspectTask and 
implement the do_object_iterate_parallel().
        >                  >>       >
        >                  >>       >    Hope these info could help on code 
review and initate the discussion :-)
        >                  >>       >    Thanks!
        >                  >>       >
        >                  >>       >    BRs,
        >                  >>       >    Lin
        >                  >>       >    >On 2020/2/19, 9:40 AM, "linzang(臧琳)" 
<[email protected]> wrote:.
        >                  >>       >    >
        >                  >>       >    >  Re-post this RFR with correct 
enhancement number to make it trackable.
        >                  >>       >    >  please ignore the previous wrong 
post. sorry for troubles.
        >                  >>       >    >
        >                  >>       >    >   webrev: 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/
        >                  >>       >    >    Hi bug: 
https://bugs.openjdk.java.net/browse/JDK-8215624
        >                  >>       >    >    CSR: 
https://bugs.openjdk.java.net/browse/JDK-8239290
        >                  >>       >    >    --------------
        >                  >>       >    >    Lin
        >                  >>       >    >    >Hi Lin,
        >                  >   >     >    >    >
        >                  >>       >    >    >Could you, please, re-post your 
RFR with the right enhancement number in
        >                  >>       >    >    >the message subject?
        >                  >>       >    >    >It will be more trackable this 
way.
        >                  >>       >    >    >
        >                  >>       >    >    >Thanks,
        >                  >>       >    >    >Serguei
        >                  >>       >    >    >
        >                  >>       >    >    >
        >                  >>       >    >    >On 2/17/20 10:29 PM, linzang(臧琳) 
wrote:
        >                  >>       >    >    >> Dear David,
        >                  >>       >    >    >>        Thanks a lot!
        >                  >>       >    >    >>       I have updated the 
refined code to 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/.
        >                  >>       >    >    >>        IMHO the parallel heap 
inspection can be extended to all kinds of heap as long as the heap layout can 
support parallel iteration.
        >                  >>       >    >    >>        Maybe we can firstly 
use this webrev to discuss how to implement it, because I am not sure my 
current implementation is an appropriate way to communicate with collectedHeap, 
then we can extend the solution to other kinds of heap.
        >                  >>       >    >    >>
        >                  >>       >    >    >> Thanks,
        >                  >>       >    >    >> --------------
        >                  >>       >    >    >> Lin
        >                  >>       >    >    >>> Hi Lin,
        >                  >>       >    >    >>>
        >                  >>       >    >    >>> Adding in hotspot-gc-dev as 
they need to see how this interacts with GC
        >                  >>       >    >    >>> worker threads, and whether 
it needs to be extended beyond G1.
        >                  >>       >    >    >>>
        >                  >>       >    >   >>> I happened to spot one nit 
when browsing:
        >                  >>       >    >    >>>
        >                  >>       >    >    >>> 
src/hotspot/share/gc/shared/collectedHeap.hpp
        >                  >>       >    >    >>>
        >                  >>       >    >    >>> +   virtual bool 
run_par_heap_inspect_task(KlassInfoTable* cit,
        >                  >>       >    >    >>> +                             
             BoolObjectClosure* filter,
        >                  >>       >    >    >>> +                             
             size_t* missed_count,
        >                  >>       >    >    >>> +                             
             size_t thread_num) {
        >                  >>       >    >    >>> +     return NULL;
        >                  >>       >    >    >>>
        >                  >>       >    >    >>> s/NULL/false/
        >                  >>       >    >    >>>
        >                  >>       >    >    >>> Cheers,
        >                  >>       >    >    >>> David
        >                  >   >     >    >    >>>
        >                  >>       >    >    >>> On 18/02/2020 2:15 pm, 
linzang(臧琳) wrote:
        >                  >>       >    >    >>>> Dear All,
        >                  >>       >    >    >>>>         May I ask your help 
to review the follow changes:
        >                  >>       >    >    >>>>         webrev:
        >                  >>       >    >    >>>> 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/
        >                  >>       >    >    >>>>      bug: 
https://bugs.openjdk.java.net/browse/JDK-8215624
        >                  >>       >    >    >>>>      related CSR: 
https://bugs.openjdk.java.net/browse/JDK-8239290
        >                  >>       >    >    >>>>         This patch enable 
parallel heap inspection of G1 for jmap histo.
        >                  >>       >    >    >>>>         my simple test shown 
it can speed up 2x of jmap -histo with
        >                  >>       >    >    >>>> parallelThreadNum set to 2 
for heap at ~500M on 4-core platform.
        >                  >>       >    >    >>>>
        >                  >>       >    >    >>>> 
------------------------------------------------------------------------
        >                  >>       >    >    >>>> BRs,
        >                  >>       >    >    >>>> Lin
        >                  >>       >    >    >> >
        >                  >>       >    >    >
        >                  >
        >                  >
        >                  >
        > 
        > 
        > 
        > 
        > 
        >

Re: RFR(L): 8215624: add parallel heap inspection support for jmap histo(G1)(Internet mail)

Reply via email to