Re: RFR: JDK-8306441: Two-stage Segmented Heap Dump [v14]

2023-07-05 Thread Alex Menkov
On Fri, 30 Jun 2023 22:23:19 GMT, Alex Menkov  wrote:

>> Yi Yang has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   memory leak
>
> I think commit "use HandshakeClosure instead of VMOperation" is a wrong way 
> to go.
> It restricts use of parallel dumping only to attach case.
> I have pending changes in heap dumper to support virtual threads and I'm 
> going switch heap dumper to always use DumpMerger.

> Hi @alexmenkov,
> 
> > It restricts use of parallel dumping only to attach case.
> 
> I'm not sure what you mean. Using handshake won't limit it to only attach 
> cases; it can also be used in other cases like HeapDumpBeforeFullGC. The only 
> difference is that previously, VMThread merged the dump files, and now it's 
> the Attach listener that merges them. For the file merging process, my 
> candidate options are as follows:
> 
> 1. Execute with VMThread outside the safepoint, which will block VMThread 
> from executing other vmoperations.
> 
> 2. Execute with Attach listener thread outside the safepoint, which will 
> block Attach listener from processing requests like jcmd/jstack.
> 
> 3. Create a new temporary thread to execute the file merging outside the 
> safepoint, which will have some resource consumption.
> 
> 
> I don't have a strong motivation to use 2, and 3 may be a good solution with 
> the availability of virtual threads. If you have any concerns, we can 
> consider using the most conservative option 1 to simplify the review process, 
> and then optimize the file merging process in a follow-up patch.

I mean that you can't be sure that you can use attach listener thread.
My concerns about attach listener thread are:
- AttachListener can be disabled at all or fail to initialize;
- attach listener thread may be not yet available when we need to perform heap 
dump;
- need to ensure attach listener thread can't be blocked (for example waiting 
for next command)

I think it makes sense to go option 1 now and add optimizations as follow-up 
changes (they will be much smaller and easier to review).

-

PR Comment: https://git.openjdk.org/jdk/pull/13667#issuecomment-1622516883


Re: RFR: JDK-8306441: Two-stage Segmented Heap Dump [v14]

2023-07-02 Thread Yi Yang
On Fri, 30 Jun 2023 22:23:19 GMT, Alex Menkov  wrote:

>> Yi Yang has updated the pull request incrementally with one additional 
>> commit since the last revision:
>> 
>>   memory leak
>
> I think commit "use HandshakeClosure instead of VMOperation" is a wrong way 
> to go.
> It restricts use of parallel dumping only to attach case.
> I have pending changes in heap dumper to support virtual threads and I'm 
> going switch heap dumper to always use DumpMerger.

Hi @alexmenkov,

> It restricts use of parallel dumping only to attach case.

I'm not sure what you mean. Using handshake won't limit it to only attach 
cases; it can also be used in other cases like HeapDumpBeforeFullGC. The only 
difference is that previously, VMThread merged the dump files, and now it's the 
Attach listener that merges them. For the file merging process, my candidate 
options are as follows:

1. Execute with VMThread outside the safepoint, which will block VMThread from 
executing other vmoperations.
2. Execute with Attach listener thread outside the safepoint, which will block 
Attach listener from processing requests like jcmd/jstack.
3. Create a new temporary thread to execute the file merging outside the 
safepoint, which will have some resource consumption.

I don't have a strong motivation to use 2, and 3 may be a good solution with 
the availability of virtual threads. If you have any concerns, we can consider 
using the most conservative option 1 to simplify the review process, and then 
optimize the file merging process in a follow-up patch.

-

PR Comment: https://git.openjdk.org/jdk/pull/13667#issuecomment-1617115036


Re: RFR: JDK-8306441: Two-stage Segmented Heap Dump [v14]

2023-06-30 Thread Alex Menkov
On Wed, 28 Jun 2023 08:04:21 GMT, Yi Yang  wrote:

>> ### Motivation and proposal
>> Hi, heap dump brings about pauses for application's execution(STW), this is 
>> a well-known pain. JDK-8252842 have added parallel support to heapdump in an 
>> attempt to alleviate this issue. However, all concurrent threads 
>> competitively write heap data to the same file, and more memory is required 
>> to maintain the concurrent buffer queue. In experiments, we did not feel a 
>> significant performance improvement from that.
>> 
>> The minor-pause solution, which is presented in this PR, is a two-stage 
>> segmented heap dump:
>> 
>> 1. Stage One(STW): Concurrent threads directly write data to multiple heap 
>> files.
>> 2. Stage Two(Non-STW): Merge multiple heap files into one complete heap dump 
>> file.
>> 
>> Now concurrent worker threads are not required to maintain a buffer queue, 
>> which would result in more memory overhead, nor do they need to compete for 
>> locks. The changes in the overall design are as follows:
>> 
>> ![image](https://github.com/openjdk/jdk/assets/5010047/86d2717d-c621-446f-98c2-cce761ec8db5)
>> Fig1. Before
>> 
>> ![image](https://github.com/openjdk/jdk/assets/5010047/c912c516-83b6-4343-a2e8-5d5bdb99cb87)
>> Fig2. After this patch
>> 
>> ### Performance evaluation
>> | memory | numOfThread | STW | Total  | Compression |
>> | --- | - | -- |  |  |
>> | 8g | 1 thread | 15.612 secs | 15.612 secs | N |
>> | 8g | 32 thread |  2.5617250 secs | 14.498 secs | N |
>> | 8g | 32 thread | 2.3084878 secs | 3.198 secs | Compress1 |
>> | 8g | 32 thread | 10.9355128 secs | 21.882 secs | Compress2 |
>> | 8g | 96 thread | 2.6790452 secs | 14.012 secs | N |
>> | 8g | 96 thread | 2.3044796 secs | 3.589 secs | Compress1 |
>> | 8g | 96 thread | 9.7585151 secs | 20.219 secs| Compress2 |
>> | 16g | 1 thread | 26.278 secs | 26.278 secs | N |
>> | 16g | 32 thread |  5.2313740 secs | 26.417 secs | N |
>> | 16g | 32 thread | 5.6946983 secs | 6.538 secs | Compress1 |
>> | 16g | 32 thread | 21.8211105 secs | 41.133 secs | Compress2 |
>> | 16g | 96 thread | 6.2445556 secs | 27.141 secs | N |
>> | 16g | 96 thread | 4.6007096 secs | 6.259 secs | Compress1 |
>> | 16g | 96 thread | 19.2965783 secs |  39.007 secs | Compress2 |
>> | 32g | 1 thread | 48.149 secs | 48.149 secs | N |
>> | 32g | 32 thread | 10.7734677 secs | 61.643 secs | N |
>> | 32g | 32 thread | 10.1642097 secs | 10.903 secs | Compress1 |
>> | 32g | 32 thread | 43.8407607 secs | 88.152 secs | Compress2 |
>> | 32g | 96 thread | 13.1522042 ...
>
> Yi Yang has updated the pull request incrementally with one additional commit 
> since the last revision:
> 
>   memory leak

I think commit "use HandshakeClosure instead of VMOperation" is a wrong way to 
go.
It restricts use of parallel dumping only to attach case.
I have pending changes in heap dumper to support virtual threads and I'm going 
switch heap dumper to always use DumpMerger.

-

PR Comment: https://git.openjdk.org/jdk/pull/13667#issuecomment-1615243585


Re: RFR: JDK-8306441: Two-stage Segmented Heap Dump [v14]

2023-06-28 Thread Yi Yang
> ### Motivation and proposal
> Hi, heap dump brings about pauses for application's execution(STW), this is a 
> well-known pain. JDK-8252842 have added parallel support to heapdump in an 
> attempt to alleviate this issue. However, all concurrent threads 
> competitively write heap data to the same file, and more memory is required 
> to maintain the concurrent buffer queue. In experiments, we did not feel a 
> significant performance improvement from that.
> 
> The minor-pause solution, which is presented in this PR, is a two-stage 
> segmented heap dump:
> 
> 1. Stage One(STW): Concurrent threads directly write data to multiple heap 
> files.
> 2. Stage Two(Non-STW): Merge multiple heap files into one complete heap dump 
> file.
> 
> Now concurrent worker threads are not required to maintain a buffer queue, 
> which would result in more memory overhead, nor do they need to compete for 
> locks. The changes in the overall design are as follows:
> 
> ![image](https://github.com/openjdk/jdk/assets/5010047/86d2717d-c621-446f-98c2-cce761ec8db5)
> Fig1. Before
> 
> ![image](https://github.com/openjdk/jdk/assets/5010047/c912c516-83b6-4343-a2e8-5d5bdb99cb87)
> Fig1. After this patch
> 
> ### Performance evaluation
> | memory | numOfThread | STW | Total  | Compression |
> | --- | - | -- |  |  |
> | 8g | 1 thread | 15.612 secs | 15.612 secs | N |
> | 8g | 32 thread |  2.5617250 secs | 14.498 secs | N |
> | 8g | 32 thread | 2.3084878 secs | 3.198 secs | Compress1 |
> | 8g | 32 thread | 10.9355128 secs | 21.882 secs | Compress2 |
> | 8g | 96 thread | 2.6790452 secs | 14.012 secs | N |
> | 8g | 96 thread | 2.3044796 secs | 3.589 secs | Compress1 |
> | 8g | 96 thread | 9.7585151 secs | 20.219 secs| Compress2 |
> | 16g | 1 thread | 26.278 secs | 26.278 secs | N |
> | 16g | 32 thread |  5.2313740 secs | 26.417 secs | N |
> | 16g | 32 thread | 5.6946983 secs | 6.538 secs | Compress1 |
> | 16g | 32 thread | 21.8211105 secs | 41.133 secs | Compress2 |
> | 16g | 96 thread | 6.2445556 secs | 27.141 secs | N |
> | 16g | 96 thread | 4.6007096 secs | 6.259 secs | Compress1 |
> | 16g | 96 thread | 19.2965783 secs |  39.007 secs | Compress2 |
> | 32g | 1 thread | 48.149 secs | 48.149 secs | N |
> | 32g | 32 thread | 10.7734677 secs | 61.643 secs | N |
> | 32g | 32 thread | 10.1642097 secs | 10.903 secs | Compress1 |
> | 32g | 32 thread | 43.8407607 secs | 88.152 secs | Compress2 |
> | 32g | 96 thread | 13.1522042 secs | 61.432 secs | N |
> | 32g | 96 thread | 9.0954641 secs | 9.885 secs | C...

Yi Yang has updated the pull request incrementally with one additional commit 
since the last revision:

  memory leak

-

Changes:
  - all: https://git.openjdk.org/jdk/pull/13667/files
  - new: https://git.openjdk.org/jdk/pull/13667/files/ecab3116..2012b5ca

Webrevs:
 - full: https://webrevs.openjdk.org/?repo=jdk=13667=13
 - incr: https://webrevs.openjdk.org/?repo=jdk=13667=12-13

  Stats: 52 lines in 1 file changed: 22 ins; 13 del; 17 mod
  Patch: https://git.openjdk.org/jdk/pull/13667.diff
  Fetch: git fetch https://git.openjdk.org/jdk.git pull/13667/head:pull/13667

PR: https://git.openjdk.org/jdk/pull/13667