[ https://issues.apache.org/jira/browse/HDFS-15937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen O'Donnell updated HDFS-15937:
-------------------------------------
    Fix Version/s: 3.2.3
                   3.1.5
                   3.4.0
                   3.3.1

> Reduce memory used during datanode layout upgrade
> -------------------------------------------------
>
>                 Key: HDFS-15937
>                 URL: https://issues.apache.org/jira/browse/HDFS-15937
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 3.3.0, 3.1.4, 3.2.2, 3.4.0
>            Reporter: Stephen O'Donnell
>            Assignee: Stephen O'Donnell
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.3.1, 3.4.0, 3.1.5, 3.2.3
>
>         Attachments: heap-dump-after.png, heap-dump-before.png
>
>          Time Spent: 2h
>  Remaining Estimate: 0h
>
> When the datanode block layout is upgraded from -56 (256x256) to -57 (32x32),
> we have found the datanode uses a lot more memory than usual.
>
> For each volume, the blocks are scanned and a list is created holding a
> series of LinkArgs objects. This object contains a File object for the block
> source and destination. The File object stores the path as a string, eg:
>
> /data01/dfs/dn/current/BP-586623041-127.0.0.1-1617017575175/current/finalized/subdir0/subdir0/blk_1073741825_1001.meta
> /data01/dfs/dn/current/BP-586623041-127.0.0.1-1617017575175/current/finalized/subdir0/subdir0/blk_1073741825
>
> This string is repeated for every block and meta file on the DN, and much of
> it is the same each time, leading to a large amount of memory use.
>
> If we change the LinkArgs to store:
> * the src path without the block, eg
> /data01/dfs/dn/previous.tmp/BP-586623041-127.0.0.1-1617017575175/current/finalized/subdir0/subdir0
> * the dest path without the block, eg
> /data01/dfs/dn/current/BP-586623041-127.0.0.1-1617017575175/current/finalized/subdir0/subdir10
> * the block / meta file name, eg blk_12345678_1001 or blk_12345678_1001.meta
>
> and then ensure we reuse the same File object for repeated src and dest paths,
> we can save most of the memory without reworking the logic of the code.
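The restructuring described above can be sketched roughly as follows. This is an illustrative sketch, not the actual HDFS patch: the class name `LinkArgsSketch`, the field names, and the `cachedDstDir` helper are all hypothetical, showing only the idea of storing shared directory `File` objects plus a short per-entry file name, with a `HashMap` interning the (at most 32x32 = 1024) destination directories.

```java
import java.io.File;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the memory-saving LinkArgs layout described above.
// Names are illustrative; the real class lives in the datanode upgrade code.
public class LinkArgsSketch {

    // One entry per block/meta file: directory File objects are shared,
    // so only the short file name is stored per entry.
    public static class LinkArgs {
        public final File srcDir;   // shared across all blocks in one source subdir
        public final File dstDir;   // shared via the cache below (<= 1024 distinct)
        public final String name;   // e.g. blk_12345678_1001 or blk_12345678_1001.meta

        public LinkArgs(File srcDir, File dstDir, String name) {
            this.srcDir = srcDir;
            this.dstDir = dstDir;
            this.name = name;
        }

        // Full paths are materialized on demand, not stored per entry.
        public File src() { return new File(srcDir, name); }
        public File dst() { return new File(dstDir, name); }
    }

    // Cache of destination directory File objects keyed by path string, so
    // every block hashed to the same 32x32 subdir reuses one File instance.
    private final Map<String, File> dstDirCache = new HashMap<>();

    public File cachedDstDir(String path) {
        return dstDirCache.computeIfAbsent(path, File::new);
    }

    public static void main(String[] args) {
        LinkArgsSketch sketch = new LinkArgsSketch();
        File srcDir = new File("/data01/dfs/dn/previous.tmp/BP-1/current/finalized/subdir0/subdir0");
        String dstPath = "/data01/dfs/dn/current/BP-1/current/finalized/subdir0/subdir10";

        LinkArgs block = new LinkArgs(srcDir, sketch.cachedDstDir(dstPath), "blk_12345678_1001");
        LinkArgs meta  = new LinkArgs(srcDir, sketch.cachedDstDir(dstPath), "blk_12345678_1001.meta");

        // Both entries share the same directory File objects ...
        assert block.dstDir == meta.dstDir;
        assert block.srcDir == meta.srcDir;
        // ... but still resolve to distinct full per-file paths on demand.
        System.out.println(block.dst());
        System.out.println(meta.src());
    }
}
```

With this layout the long directory prefix is held once per subdir rather than once per block, which is where the bulk of the saving comes from.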
> The current logic works along the source paths recursively, so you can easily
> re-use the src path object.
>
> For the destination path, there are only 32x32 (1024) distinct paths, so we
> can simply cache them in a HashMap and look up the re-usable object each time.
>
> I tested locally by generating 100k block files and attempting the layout
> upgrade. A heap dump showed the 100k blocks using about 140MB of heap. That
> is close to 1.5GB per 1M blocks.
>
> After the change outlined above, the same 100k blocks used about 20MB of heap,
> so 200MB per million blocks.
>
> A general DN sizing recommendation is 1GB of heap per 1M blocks, so the
> upgrade should be able to happen within the pre-upgrade heap.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org