[ https://issues.apache.org/jira/browse/HDFS-14531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16858709#comment-16858709 ]
Nathan Roberts commented on HDFS-14531:
---------------------------------------

Actually, maybe disabling the DirectoryScanner is more than a workaround. Maybe that should be the default. What is this really protecting against these days? For large disks it's super expensive memory-wise, and if there are enough blocks or enough system memory pressure it can cause tons of I/O as well.

> Datanode's ScanInfo requires excessive memory
> ---------------------------------------------
>
>                 Key: HDFS-14531
>                 URL: https://issues.apache.org/jira/browse/HDFS-14531
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>   Affects Versions: 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Priority: Major
>         Attachments: Screen Shot 2019-05-31 at 12.25.54 PM.png
>
>
> The DirectoryScanner's ScanInfo map consumes ~4.5X as much memory as the
> replica map. For 1.1M replicas: the replica map is ~91M while the scan info
> is ~405M.
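
As a rough check of the figures quoted in the description, the arithmetic can be sketched as below. This assumes "~91M" and "~405M" mean 91 * 10^6 and 405 * 10^6 bytes (the issue does not state the unit), and the variable names are illustrative only, not taken from the Hadoop code.

```python
# Figures quoted in the issue description.
replicas = 1_100_000
replica_map_bytes = 91 * 10**6   # assumption: "~91M" read as 91 * 10^6 bytes
scan_info_bytes = 405 * 10**6    # assumption: "~405M" read as 405 * 10^6 bytes

# Ratio of ScanInfo map size to replica map size.
ratio = scan_info_bytes / replica_map_bytes

# Approximate per-replica cost of each structure, in bytes.
per_replica_map = replica_map_bytes / replicas
per_replica_scan = scan_info_bytes / replicas

print(f"ratio: {ratio:.2f}x")
print(f"replica map: {per_replica_map:.0f} bytes per replica")
print(f"scan info:   {per_replica_scan:.0f} bytes per replica")
```

Under those assumptions the ratio comes out near 4.45, consistent with the "~4.5X" in the description, and the scan info costs a few hundred bytes per replica versus under a hundred for the replica map.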