Hi Lin,

On 2020-04-26 05:10, linzang(臧琳) wrote:
Hi Stefan and Paul,
     I have made a new patch based on your comments and Stefan's Poc code:
     Webrev: http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03/
     Delta(based on Stefan's change:) : 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_03-delta/webrev_03-delta/

Thanks for providing a delta patch. It makes it much easier to look at, and more likely for reviewers to continue reviewing.

I'm going to continue focusing on the GC parts, and leave the rest to others to review.


     And Here are main changed I made and want to discuss with you:
     1.  changed"parallelThreadNum=" to "parallel=" for jmap -histo options.
     2.  Add logic to test where parallelHeapInspection is fail, in 
heapInspection.cpp
           This is because the parHeapInspectTask create thread local 
KlassInfoTable in it's work() method, and this may fail because of native OOM, 
in this case, the parallel should fail and serial heap inspection can be tried.
           One more thing I want discuss with you is about the member "_success" of 
parHeapInspectTask, when native OOM happenes, it is set to false. And since this "set" operation 
can be conducted in multiple threads, should it be atomic ops?  IMO, this is not necessary because 
"_success" can only be set to false, and there is no way to change it from back to true after the 
ParHeapInspectTask instance is created, so it is save to be non-atomic, do you agree with that?

In these situations you should be using the Atomic::load/store primitives. We're moving toward a later C++ standard were data races are considered undefined behavior.

    3. make CollectedHeap::run_task() be an abstract virtual func, so that every subclass 
of collectedHeap should support it, so later implementation of new collectedHeap will not 
miss the "parallel" features.
          The problem I want to discuss with you is about epsilonHeap and 
SerialHeap, as they may not need parallel heap iteration, so I only make 
task->work(0), in case the run_task() is invoked someway in future. Another way 
is to left run_task()  unimplemented, which one do you think is better?

I don't have a strong opinion about this.

And also please help take a look at the zHeap, as there is a class zTask that wrap the abstractGangTask, and the collectedHeap::run_task() only accept AbstraceGangTask* as argument, so I made a delegate class to adapt it , please see src/hotspot/share/gc/z/zHeap.cpp.

       There maybe other better ways to sovle the above problems, welcome for 
any comments, Thanks!

I've created a few cleanups and changes on top of your latest patch:

https://cr.openjdk.java.net/~stefank/8215624/webrev.02.delta
https://cr.openjdk.java.net/~stefank/8215624/webrev.02

- Adding Atomic::load/store.
- Removing the time measurement in the run_task. I renamed G1's function to run_task_timed. If we need this outside of G1, we can rethink the API at that point.
- ZGC style cleanups

Thanks,
StefanK


BRs,
Lin

On 2020/4/23, 11:08 AM, "linzang(臧琳)" <linz...@tencent.com> wrote:

     Thanks Paul! I agree with using "parallel", will make the update in next 
patch, Thanks for help update the CSR.

     BRs,
     Lin

     On 2020/4/23, 4:42 AM, "Hohensee, Paul" <hohen...@amazon.com> wrote:

         For the interface, I'd use "parallel" instead of "parallelThreadNum". All the 
other options are lower case, and it's a lot easier to type "parallel". I took the liberty of 
updating the CSR. If you're ok with it, you might want to change variable names and such, plus of course 
JMap.usage.

         Thanks,
         Paul

         On 4/22/20, 2:29 AM, "serviceability-dev on behalf of linzang(臧琳)" 
<serviceability-dev-boun...@openjdk.java.net on behalf of linz...@tencent.com> wrote:

             Dear Stefan,

                     Thanks a lot! I agree with you to decouple the heap 
inspection code with GC's.
                     I will start  from your POC code, may discuss with you 
later.


             BRs,
             Lin

             On 2020/4/22, 5:14 PM, "Stefan Karlsson" 
<stefan.karls...@oracle.com> wrote:

                 Hi Lin,

                 I took a look at this earlier and saw that the heap inspection 
code is
                 strongly coupled with the CollectedHeap and G1CollectedHeap. 
I'd prefer
                 if we'd abstract this away, so that the GCs only provide a 
"parallel
                 object iteration" interface, and the heap inspection code is 
kept elsewhere.

                 I started experimenting with doing that, but other 
higher-priority (to
                 me) tasks have had to take precedence.

                 I've uploaded my work-in-progress / proof-of-concept:
                   https://cr.openjdk.java.net/~stefank/8215624/webrev.01.delta/
                   https://cr.openjdk.java.net/~stefank/8215624/webrev.01/

                 The current code doesn't handle the lifecycle (deletion) of the
                 ParallelObjectIterators. There's also code left unimplemented 
in around
                 CollectedHeap::run_task. However, I think this could work as a 
basis to
                 pull out the heap inspection code out of the GCs.

                 Thanks,
                 StefanK

                 On 2020-04-22 02:21, linzang(臧琳) wrote:
                 > Dear all,
                 >       May I ask you help to review? This RFR has been there 
for quite a while.
                 >       Thanks!
                 >
                 > BRs,
                 > Lin
                 >
                 > > On 2020/3/16, 5:18 PM, "linzang(臧琳)" <linz...@tencent.com> 
wrote:>
                 >
                 >>    Just update a new path, my preliminary measure show 
about 3.5x speedup of jmap histo on a nearly full 4GB G1 heap (8-core platform with 
parallel thread number set to 4).
                 >>     webrev: 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_02/
                 >>     bug: https://bugs.openjdk.java.net/browse/JDK-8215624
                 >>     CSR: https://bugs.openjdk.java.net/browse/JDK-8239290
                 >>     BRs,
                 >>       Lin
                 >>       > On 2020/3/2, 9:56 PM, "linzang(臧琳)" 
<linz...@tencent.com> wrote:
                 >>       >
                 >>       >    Dear all,
                 >>       >          Let me try to ease the reviewing work by 
some explanation :P
                 >>       >          The patch's target is to speed up jmap 
-histo for heap iteration, from my experience it is necessary for large heap 
investigation. E.g in bigData scenario I have tried to conduct jmap -histo against 180GB 
heap, it does take quite a while.
                 >>       >          And if my understanding is corrent, even the jmap 
-histo without "live" option does heap inspection with heap lock acquired. so it is very 
likely to block mutator thread in allocation-sensitive scenario. I would say the faster the heap 
inspection does, the shorter the mutator be blocked. This is parallel iteration for jmap is 
necessary.
                 >>       >          I think the parallel heap inspection 
should be applied to all kind of heap. However, consider the heap layout are different 
for  GCs, much time is required to understand all kinds of the heap layout to make the 
whole change. IMO, It is not wise to have a huge patch for the whole solution at once, 
and it is even harder to review it. So I plan to implement it incrementally, the first 
patch (this one) is going to confirm the implemention detail of how jmap accept the new 
option, passes it to attachListener of the jvm process and then how to make the parallel 
inspection closure be generic enough to make it easy to extend to different heap layout. 
And also how to implement the heap inspection in specific gc's heap. This patch use G1's 
heap as the begining.
                 >>       >          This patch actually do several things:
                 >>       >          1. Add an option "parallelThreadNum=<N>" 
to jmap -histo, the default behavior is to set N to 0, means let's JVM decide how many threads to use 
for heap inspection. Set this option to 1 will disable parallel heap inspection. (more details in CSR: 
https://bugs.openjdk.java.net/browse/JDK-8239290)
                 >>       >          2. Make a change in how Jmap passing 
arguments, changes in 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/src/jdk.jcmd/share/classes/sun/tools/jmap/JMap.java.udiff.html,
 originally it pass options as separate arguments to attachListener, this patch change 
to that all options be compose to a single string. So the arg_count_max in 
attachListener.hpp do not need to be changed, and hence avoid the compatibility issue, 
as disscussed at 
https://mail.openjdk.java.net/pipermail/serviceability-dev/2019-March/027334.html
                 >>       >         3. Add an abstract class ParHeapInspectTask 
in heapInspection.hpp / heapInspection.cpp, It's work(uint worker_id) method prepares 
the data structure (KlassInfoTable) need for every parallel worker thread, and then call 
do_object_iterate_parallel() which is heap specific implementation. I also added some 
machenism in KlassInfoTable to support parallel iteration, such as merge().
                 >>       >        4. In specific heap (G1 in this patch), 
create a subclass of ParHeapInspectTask, implement the do_object_iterate_parallel() for 
parallel heap inspection. For G1, it simply invoke g1CollectedHeap's 
object_iterate_parallel().
                 >>       >        5. Add related test.
                 >>       >        6. it may be easy to extend this patch for 
other kinds of heap by creating subclass of ParHeapInspectTask and implement the 
do_object_iterate_parallel().
                 >>       >
                 >>       >    Hope these info could help on code review and 
initate the discussion :-)
                 >>       >    Thanks!
                 >>       >
                 >>       >    BRs,
                 >>       >    Lin
                 >>       >    >On 2020/2/19, 9:40 AM, "linzang(臧琳)" 
<linz...@tencent.com> wrote:.
                 >>       >    >
                 >>       >    >  Re-post this RFR with correct enhancement 
number to make it trackable.
                 >>       >    >  please ignore the previous wrong post. sorry 
for troubles.
                 >>       >    >
                 >>       >    >   webrev: 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215624/webrev_01/
                 >>       >    >    Hi bug: 
https://bugs.openjdk.java.net/browse/JDK-8215624
                 >>       >    >    CSR: 
https://bugs.openjdk.java.net/browse/JDK-8239290
                 >>       >    >    --------------
                 >>       >    >    Lin
                 >>       >    >    >Hi Lin,
                 >   >     >    >    >
                 >>       >    >    >Could you, please, re-post your RFR with 
the right enhancement number in
                 >>       >    >    >the message subject?
                 >>       >    >    >It will be more trackable this way.
                 >>       >    >    >
                 >>       >    >    >Thanks,
                 >>       >    >    >Serguei
                 >>       >    >    >
                 >>       >    >    >
                 >>       >    >    >On 2/17/20 10:29 PM, linzang(臧琳) wrote:
                 >>       >    >    >> Dear David,
                 >>       >    >    >>        Thanks a lot!
                 >>       >    >    >>       I have updated the refined code to 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_01/.
                 >>       >    >    >>        IMHO the parallel heap inspection 
can be extended to all kinds of heap as long as the heap layout can support parallel iteration.
                 >>       >    >    >>        Maybe we can firstly use this 
webrev to discuss how to implement it, because I am not sure my current implementation is an 
appropriate way to communicate with collectedHeap, then we can extend the solution to other kinds 
of heap.
                 >>       >    >    >>
                 >>       >    >    >> Thanks,
                 >>       >    >    >> --------------
                 >>       >    >    >> Lin
                 >>       >    >    >>> Hi Lin,
                 >>       >    >    >>>
                 >>       >    >    >>> Adding in hotspot-gc-dev as they need 
to see how this interacts with GC
                 >>       >    >    >>> worker threads, and whether it needs to 
be extended beyond G1.
                 >>       >    >    >>>
                 >>       >    >   >>> I happened to spot one nit when browsing:
                 >>       >    >    >>>
                 >>       >    >    >>> 
src/hotspot/share/gc/shared/collectedHeap.hpp
                 >>       >    >    >>>
                 >>       >    >    >>> +   virtual bool 
run_par_heap_inspect_task(KlassInfoTable* cit,
                 >>       >    >    >>> +                                       
   BoolObjectClosure* filter,
                 >>       >    >    >>> +                                       
   size_t* missed_count,
                 >>       >    >    >>> +                                       
   size_t thread_num) {
                 >>       >    >    >>> +     return NULL;
                 >>       >    >    >>>
                 >>       >    >    >>> s/NULL/false/
                 >>       >    >    >>>
                 >>       >    >    >>> Cheers,
                 >>       >    >    >>> David
                 >   >     >    >    >>>
                 >>       >    >    >>> On 18/02/2020 2:15 pm, linzang(臧琳) 
wrote:
                 >>       >    >    >>>> Dear All,
                 >>       >    >    >>>>         May I ask your help to review 
the follow changes:
                 >>       >    >    >>>>         webrev:
                 >>       >    >    >>>> 
http://cr.openjdk.java.net/~lzang/jmap-8214535/8215264/webrev_00/
                 >>       >    >    >>>>      bug: 
https://bugs.openjdk.java.net/browse/JDK-8215624
                 >>       >    >    >>>>      related CSR: 
https://bugs.openjdk.java.net/browse/JDK-8239290
                 >>       >    >    >>>>         This patch enable parallel 
heap inspection of G1 for jmap histo.
                 >>       >    >    >>>>         my simple test shown it can 
speed up 2x of jmap -histo with
                 >>       >    >    >>>> parallelThreadNum set to 2 for heap at 
~500M on 4-core platform.
                 >>       >    >    >>>>
                 >>       >    >    >>>> 
------------------------------------------------------------------------
                 >>       >    >    >>>> BRs,
                 >>       >    >    >>>> Lin
                 >>       >    >    >> >
                 >>       >    >    >
                 >
                 >
                 >






Reply via email to