[ https://issues.apache.org/jira/browse/HDFS-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16887592#comment-16887592 ]

Yiqun Lin commented on HDFS-14313:
----------------------------------

I have two comments on the core implementation of 
ReplicaCachingGetSpaceUsed.

{code}
@Override
public synchronized Set<? extends Replica> deepCopyReplica(String bpid)
    throws IOException {
  Set<? extends Replica> replicas =
      new HashSet<>(volumeMap.replicas(bpid) == null ? Collections.EMPTY_SET
          : volumeMap.replicas(bpid));
  return replicas;
}
{code}
Why not use the dataset lock? The volumeMap here is made thread-safe by the 
dataset lock that FsDatasetImpl operations take, so synchronizing on this 
method alone cannot prevent the race condition. Also, it would be better to 
return {{Collections.unmodifiableSet}} so that the replica info cannot be 
modified outside.
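
For illustration, a rough sketch of the suggested shape (assuming the 
{{AutoCloseableLock}}-based {{datasetLock}} that other FsDatasetImpl 
operations use; exact names may differ in the patch):

{code}
@Override
public Set<? extends Replica> deepCopyReplica(String bpid) throws IOException {
  // Take the dataset lock rather than synchronized, so this reader is
  // serialized against the FsDatasetImpl writers that mutate volumeMap.
  try (AutoCloseableLock lock = datasetLock.acquire()) {
    Collection<ReplicaInfo> infos = volumeMap.replicas(bpid);
    Set<ReplicaInfo> replicas = new HashSet<>(
        infos == null ? Collections.<ReplicaInfo>emptySet() : infos);
    // Hand out a read-only view so callers cannot modify the replica info.
    return Collections.unmodifiableSet(replicas);
  }
}
{code}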

{code}
if (CollectionUtils.isNotEmpty(replicaInfos)) {
  for (ReplicaInfo replicaInfo : replicaInfos) {
    if (Objects.equals(replicaInfo.getVolume().getStorageID(),
        volume.getStorageID())) {
      dfsUsed += replicaInfo.getBytesOnDisk();
      dfsUsed += replicaInfo.getMetadataLength();
      count++;
    }
  }
}
{code}
I think {{replicaInfo.getBlockDataLength}} will be more accurate than 
{{replicaInfo.getBytesOnDisk}}: it reads the length from the actual block 
file on disk, while {{getBytesOnDisk}} returns a value cached in the 
replica's in-memory state.
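
A minimal sketch of the suggested change inside the loop above (surrounding 
accumulation code unchanged):

{code}
// Prefer the actual block file length on disk over the cached value.
dfsUsed += replicaInfo.getBlockDataLength();
dfsUsed += replicaInfo.getMetadataLength();
{code}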

Will take a detailed review soon.


> Get hdfs used space from FsDatasetImpl#volumeMap#ReplicaInfo in memory  
> instead of df/du
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-14313
>                 URL: https://issues.apache.org/jira/browse/HDFS-14313
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode, performance
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0, 2.9.0, 3.0.0, 3.1.0
>            Reporter: Lisheng Sun
>            Assignee: Lisheng Sun
>            Priority: Major
>         Attachments: HDFS-14313.000.patch, HDFS-14313.001.patch, 
> HDFS-14313.002.patch, HDFS-14313.003.patch, HDFS-14313.004.patch, 
> HDFS-14313.005.patch, HDFS-14313.006.patch
>
>
> There are two existing ways of getting used space, DU and DF, and both are 
> insufficient.
>  # Running DU across lots of disks is very expensive, and running all of 
> the processes at the same time creates a noticeable IO spike.
>  # Running DF is inaccurate when the disk is shared by multiple datanodes 
> or other services.
>  Getting HDFS used space from the in-memory 
> FsDatasetImpl#volumeMap#ReplicaInfos is very cheap and accurate. 


