LakshSingla commented on PR #13755:
URL: https://github.com/apache/druid/pull/13755#issuecomment-1453045496

   After debugging, it was found that there can be the following places of 
memory leaks (before this patch):
   1. The SuperSorter holds the PartitionedOutputChannels' map which contains 
the writableChannel and the memory allocator, which are not useful once the 
channel has been written to. Therefore while storing the 
PartitionedOutputChannels, we convert them to readOnly() which throws away the 
reference to the writableChannel and the memory allocator allowing them to be 
GCed and thereby reducing the footprint of the SuperSorter. Attached below is a 
heap dump before the PR
   <img width="764" alt="Screenshot 2023-03-03 at 11 52 00 AM" 
src="https://user-images.githubusercontent.com/30999375/222646939-d73a486f-05c2-4391-8290-f4833ffa8421.png";>
   
   2. The output channels created by `ComposingOutputChannelFactory` contain a 
collection of writableChannels and the readableChannelSuppliers (from which it 
is composed off). Even if we throw away the reference to the frame memory 
allocators of the original channel, the readableChannelSuppliers still hold the 
reference to the memory allocators of the individual output channels. To 
alleviate this, while building the readableChannelSuppliers in the 
ComposingOutputChannelFactory, we only get the readOnly() version of the output 
channel.
   
   3. Another potential memory hog is when the output channels created by the 
`ComposingOutputChannelFactory` contain more than one outputChannel. Once an 
output channel is exhausted in the composition, we move on to the next output 
channel and never write to the older one again. However, we still hold the 
reference to the memory allocator of the older channels because the code 
assumes that we can write on it again (till the composition itself is marked as 
readOnly()). Therefore multiple memory allocators can be held up for a 
composition, even though we require one at a time. While the future ones are 
created lazily, the older ones are not closed, and this PR addresses that as 
well.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to