[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887592#comment-16887592 ]
Yiqun Lin commented on HDFS-14313:
----------------------------------

I have two comments on the core implementation of ReplicaCachingGetSpaceUsed.

{code}
@Override
public synchronized Set<? extends Replica> deepCopyReplica(String bpid)
    throws IOException {
  Set<? extends Replica> replicas =
      new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET
          : volumeMap.replicas(bpid));
  return replicas;
}
{code}

Why not use the dataset lock here? {{volumeMap}} is made thread-safe by the dataset lock in the FsDatasetImpl operations; a {{synchronized}} method synchronizes on a different monitor, so it cannot prevent that race condition. Also, it would be better to wrap the result in {{Collections.unmodifiableSet}} so that the replica info cannot be modified outside. (A sketch combining both suggestions appears at the end of this message.)

{code}
if (CollectionUtils.isNotEmpty(replicaInfos)) {
+  for (ReplicaInfo replicaInfo : replicaInfos) {
+    if (Objects.equals(replicaInfo.getVolume().getStorageID(),
+        volume.getStorageID())) {
+      dfsUsed += replicaInfo.getBytesOnDisk();
+      dfsUsed += replicaInfo.getMetadataLength();
+      count++;
+    }
+  }
+}
{code}

I think {{replicaInfo.getBlockDataLength()}} will be more accurate than {{replicaInfo.getBytesOnDisk()}}: it gets the length from the actual block file, while {{getBytesOnDisk()}} only returns an internal field of the replica. Will take a more detailed review soon.

> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory
> instead of df/du
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-14313
>                 URL: https://issues.apache.org/jira/browse/HDFS-14313
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, performance
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>            Reporter: Lisheng Sun
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch,
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch,
> HDFS-14313.005.patch, HDFS-14313.006.patch
>
>
> There are two existing ways of getting the used space, DU and DF, and both are insufficient:
> # Running DU across lots of disks is very expensive, and running all of the processes at the same time creates a noticeable IO spike.
> # Running DF is inaccurate when the disk is shared by multiple datanodes or other servers.
> Getting the HDFS used space from the in-memory FsDatasetImpl#volumeMap ReplicaInfos has very small overhead and is accurate.
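To make the first comment concrete, here is a minimal sketch of what {{deepCopyReplica}} could look like with both suggestions applied. This is only an illustration, not the committed patch; it assumes the {{datasetLock}} ({{AutoCloseableLock}}) field that FsDatasetImpl already uses to guard {{volumeMap}} in its other operations:

{code}
// Sketch only: acquire the dataset lock instead of relying on a
// synchronized method, and hand back an unmodifiable copy.
@Override
public Set<? extends Replica> deepCopyReplica(String bpid)
    throws IOException {
  try (AutoCloseableLock lock = datasetLock.acquire()) {
    Set<? extends Replica> replicas =
        new HashSet<>(volumeMap.replicas(bpid) == null
            ? Collections.EMPTY_SET : volumeMap.replicas(bpid));
    // Callers only get a read-only view, so replica info cannot be
    // modified outside of the dataset lock.
    return Collections.unmodifiableSet(replicas);
  }
}
{code}

The second comment would only swap the accessor in the accumulation loop, e.g.:

{code}
// getBlockDataLength() reads the length of the actual block file on
// disk, while getBytesOnDisk() returns a cached in-memory value.
dfsUsed += replicaInfo.getBlockDataLength();
dfsUsed += replicaInfo.getMetadataLength();
{code}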