On Tue, 9 Mar 2021 13:57:24 GMT, Lin Zang <lz...@openjdk.org> wrote:

>> I have updated the patch to reduce the memory consumption issue.
>> As mentioned in the CR, an internal buffer is used for the segmented heap
>> dump. The dumped object data are first written into this buffer, and
>> flush() is called once the size is known. When the internal buffer is
>> full, the current implementation does the following:
>> 
>> - allocate a larger buffer, copy the data from the old buffer into the new
>> one, and then use it as the internal buffer.
>> 
>> This can cause large memory consumption because the old buffer's data are
>> copied, and the old buffer cannot be freed until the next GC.
>> 
>> For example, if the internal buffer's length is 1MB, then when it is full a
>> new 2MB buffer is allocated, so 3MB of memory is actually taken (old buffer
>> + new buffer). For the ~4GB large array this keeps allocating new buffers
>> and copying, which is both CPU- and memory-intensive and causes the timeout
>> issue.
>> 
>> This patch optimizes that by keeping an array list of byte[]. When the old
>> buffer is full, it is saved into the list and a new one is created and used
>> as the internal buffer. In this case, the above example takes 2MB (1MB for
>> the old buffer, saved in the list, and 1MB for the new buffer).
>> 
>> Together with the "write-through" mode introduced in this PR, by which all
>> arrays are written through to the underlying stream so that no extra buffer
>> is required, the PR helps fix the LargeArray issue and also saves memory.
>> 
>> Thanks!
>> Lin
>
> As discussed in CR https://bugs.openjdk.java.net/browse/JDK-8262386, the
> byte[] list is more of an optimization. I will revert it in this PR and
> create a separate CR and PR for it.
> 
> Thanks,
> Lin
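
For reference, the segment-list buffering described in the quoted message
above works roughly as sketched below. This is an illustrative sketch only;
the class, field, and method names are made up and are not the actual PR code.

    import java.util.ArrayList;
    import java.util.List;

    // Sketch of the segment-list idea: when the active buffer fills up, it is
    // parked in the list and a fresh buffer of the same size becomes the
    // active one, so nothing is copied and no doubled allocation is needed.
    class SegmentListBuffer {
        private static final int BUF_SIZE = 1024 * 1024;  // 1MB per segment
        private final List<byte[]> fullSegments = new ArrayList<>();
        private byte[] current = new byte[BUF_SIZE];
        private int pos = 0;

        void write(byte[] data, int off, int len) {
            while (len > 0) {
                if (pos == current.length) {
                    fullSegments.add(current);   // keep the full buffer as-is
                    current = new byte[BUF_SIZE];
                    pos = 0;
                }
                int n = Math.min(len, current.length - pos);
                System.arraycopy(data, off, current, pos, n);
                pos += n;
                off += n;
                len -= n;
            }
        }

        // Total number of bytes buffered so far across all segments.
        long size() {
            return (long) fullSegments.size() * BUF_SIZE + pos;
        }
    }

With this scheme the 1MB example above peaks at about 2MB (the parked buffer
plus the new active buffer) instead of 3MB.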

Hi Lin,

One concern I have is the naming used in the fix.
First, the term 'write-through' you use is unusual and confusing.
Would it be better to say 'direct' or 'unbuffered' output instead?

 612         // Now the total size of data to dump is known and can be filled to segment header.
 613         // Enable write-through mode to avoid internal buffer copies.
 614         if (useSegmentedHeapDump) {
 615             long longBytes = length * typeSize + headerSize;
 616             int bytesToWrite = (int) (longBytes);
 617             hprofBufferedOut.fillSegmentSizeAndEnableWriteThrough(bytesToWrite);
 618         }

Why 'longBytes'? The scope of this local variable is very short, and it is
clear that its type is long. Why not something like 'size' or 'totalSize'?
There is no need for the parentheses around 'longBytes' at L616, and there is
no need for one more local variable 'bytesToWrite'.
Something like the following is clearer:
  int size = (int)(length * typeSize + headerSize);
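
Applied to the snippet above, the whole block would then read roughly like
this (just a sketch):

  // Now the total size of data to dump is known and can be filled to segment header.
  // Enable write-through mode to avoid internal buffer copies.
  if (useSegmentedHeapDump) {
      int size = (int) (length * typeSize + headerSize);
      hprofBufferedOut.fillSegmentSizeAndEnableWriteThrough(size);
  }
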
I agree with Chris: it is confusing that the decision about write-through mode
is made in this method. The method fillSegmentSizeAndEnableWriteThrough() is
called for every array when useSegmentedHeapDump == true, so all segmented
output for arrays is direct (in write-through mode).
Does this new dimension really need to be introduced?
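
Just to make that concern concrete: one purely hypothetical alternative would
be to let the buffered output switch to direct (unbuffered) writes by itself
once the segment size has been filled in, so the array-dumping code never has
to know about a separate mode. The names below are invented, not the PR's API.

    import java.io.IOException;
    import java.io.OutputStream;

    // Hypothetical sketch only; names are invented.
    class SegmentedOutput {
        private final OutputStream out;
        private boolean direct = false;

        SegmentedOutput(OutputStream out) {
            this.out = out;
        }

        // Once the caller supplies the final segment size, the writer itself
        // decides to forward all further writes straight to the stream.
        void fillSegmentSize(int size) throws IOException {
            // ... patch 'size' into the pending segment header ...
            direct = true;
        }

        void write(byte[] b, int off, int len) throws IOException {
            if (direct) {
                out.write(b, off, len);  // no internal buffering needed
            } else {
                // ... append to the internal buffer ...
            }
        }
    }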

Thanks,
Serguei

-------------

PR: https://git.openjdk.java.net/jdk/pull/2803
