sodonnel opened a new pull request #2849:
URL: https://github.com/apache/hadoop/pull/2849


   This is a relatively simple change to reduce the memory used by the 
Directory Scanner and also simplify the logic in the ScanInfo object.
   
   This change ensures the same File object is re-used for all blocks in a 
directory. Previously a large part of the path was repeated for each block file.
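
   As a rough illustration of the idea (not the code in this PR), the sketch
below keeps one shared File for the directory and only a short file name per
block. The names SharedDirSketch, BlockEntry and entriesFor, and the example
path, are hypothetical; the real ScanInfo fields may well differ.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SharedDirSketch {

  /** Hypothetical stand-in for a per-block record; not the real ScanInfo. */
  static final class BlockEntry {
    private final File dir;        // one instance shared by all blocks in the directory
    private final String fileName; // only the short, per-block part of the path

    BlockEntry(File dir, String fileName) {
      this.dir = dir;
      this.fileName = fileName;
    }

    File getDir() {
      return dir;
    }

    /** Rebuilds the full path on demand instead of storing it for every block. */
    File getBlockFile() {
      return new File(dir, fileName);
    }
  }

  /** Builds the entries for one directory, passing the same File to each of them. */
  static List<BlockEntry> entriesFor(File dir, List<String> fileNames) {
    List<BlockEntry> entries = new ArrayList<>();
    for (String name : fileNames) {
      entries.add(new BlockEntry(dir, name)); // no per-block copy of the directory path
    }
    return entries;
  }

  public static void main(String[] args) {
    // Made-up directory path, used only for illustration.
    File dir = new File("/data/current/finalized/subdir0/subdir1");
    List<BlockEntry> entries = entriesFor(dir, Arrays.asList("blk_1001", "blk_1002"));

    // Both entries point at the very same File object for the directory.
    System.out.println(entries.get(0).getDir() == entries.get(1).getDir()); // prints true
    System.out.println(entries.get(1).getBlockFile().getName()); // prints blk_1002
  }
}
```

   In this sketch the long directory prefix is held once per directory, so the
per-block cost is roughly a reference plus the short file name.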
   
   Aside from that, the logic of the directory scanner remains the same.
   
   Comparing heap dumps, the memory used by 100K blocks drops from ~35MB to 
~19MB, which is equivalent to going from ~350MB to ~190MB per 1M blocks. This 
is a reduction of about 46%.

