On Tue, 9 Mar 2021 13:57:24 GMT, Lin Zang <lz...@openjdk.org> wrote: >> Update a new patch that reduce the memory consumption issue. >> As mentioned in the CR, there is internal buffer used for segmented heap >> dump. The dumped object data firstly write into this buffer and then >> flush() when the size is known. when the internal buffer is full, the >> current implementation do: >> >> - new a larger buffer, copy data from old buffer into the new one, and then >> use it as the internal buffer. >> >> This could cause large memory consumption because the old buffer data are >> copied, and also the old buffer can not be "free" until next GC. >> >> For example, if the internel buffer's length is 1MB, when it is full, a new >> 2MB buffer is allocated so there is actually 3MB memory taken (old buffer + >> new buffer). And in this case, for the ~4GB large array, it keeps generating >> new buffers and do copying, which takes both CPU and memory intensively and >> cause the timeout issue. >> >> This patch optimize it by creating a array list of byte[]. when old buffer >> is full, it saved into the list and the new one is created and used as the >> internal buffer. In this case, the above example takes 2MB(1MB for old, >> saved in the list; and 1MB for the new buffer) >> >> Together with the "write through" mode introduced in this PR, by which all >> arrays are write through to underlying stream and hence no extra buffer >> requried. The PR could help fix the LargeArray issue and also save memory. >> >> Thanks! >> Lin > > As discussed in CR https://bugs.openjdk.java.net/browse/JDK-8262386, the > byte[] list is much more like an optimization. Revert it in the PR and I will > create a separate CR and PR for it. > > Thanks, > Lin
Hi Lin, One concern I have is the naming used in the fix. First, the term write-through you use is unusual and confusing. Would it better to say 'direct or unbuffered output' instead? 612 // Now the total size of data to dump is known and can be filled to segment header. 613 // Enable write-through mode to avoid internal buffer copies. 614 if (useSegmentedHeapDump) { 615 long longBytes = length * typeSize + headerSize; 616 int bytesToWrite = (int) (longBytes); 617 hprofBufferedOut.fillSegmentSizeAndEnableWriteThrough(bytesToWrite); 618 } Why 'longBytes' ? The scope of this local variable is very short, and it is clear that its type is long. Why not something like 'size' or 'totalSize' ? There is no need in parentheses around the 'longBytes' at L616. Also, there is no need to for one more local variable 'bytesToWrite'. Something like below is more clear: int size = (int)(length * typeSize + headerSize); I agree with Chris, it is confusing why the decision about write-trough mode is made in this method. The method fillSegmentSizeAndEnableWriteThrough() is called for any array if useSegmentedHeapDump == true, so any segmented output for arrays is direct (in write-through mode). Is this new dimension really needs to be introduced? Thanks, Serguei ------------- PR: https://git.openjdk.java.net/jdk/pull/2803