[ https://issues.apache.org/jira/browse/HDDS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134151#comment-17134151 ]

Istvan Fajth edited comment on HDDS-3721 at 6/12/20, 11:48 AM:
---------------------------------------------------------------

At first I thought this one is simply a client-side problem, but after going 
into the details a bit, I realised that there might be a reason why HDFS has 
this on the server side. I started to look into that, but then I had to put 
this one aside for a while.

The benefit of approaching this from the client side is that it stays on the 
client side and avoids a heavy implementation on the OM side. On the other 
hand, it is painfully slow, and the runtime scales with the number of elements 
in a directory: it was running for ~25 seconds on a folder with 82k files in 
3.5k subfolders.
The problem with approaching this from the client side is that it leads to 4 
calls per subdirectory (14k calls in this case): 1 READ_BUCKET, then 1 
GET_FILE_STATUS (to see whether the path is a file or a directory), then, if 
it is a directory, 1 READ_BUCKET again, and finally a LIST_STATUS. These calls 
can hardly be controlled or throttled by the server side, as they come from 
the client side, and possibly from multiple clients at the same time.
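Just to illustrate the call pattern, here is a rough sketch of what the 
client-side traversal boils down to (the method below is purely illustrative, 
not the actual change; the comments map each FileSystem call to the OM 
requests listed above):
{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrative sketch only: a plain recursive walk over the FileSystem API.
// Per subdirectory this comes down to the requests mentioned above:
// getFileStatus() -> READ_BUCKET + GET_FILE_STATUS, and
// listStatus()    -> READ_BUCKET + LIST_STATUS on the OM side.
public final class ClientSideSummary {

  /** Returns {length, replicated size} for the subtree under path. */
  static long[] summarize(FileSystem fs, Path path) throws IOException {
    FileStatus status = fs.getFileStatus(path);      // is it a file or a dir?
    if (status.isFile()) {
      return new long[] {status.getLen(),
          status.getLen() * status.getReplication()};
    }
    long length = 0;
    long spaceConsumed = 0;
    for (FileStatus child : fs.listStatus(path)) {   // list the directory
      if (child.isDirectory()) {
        long[] sub = summarize(fs, child.getPath()); // 4 more OM calls
        length += sub[0];
        spaceConsumed += sub[1];
      } else {
        length += child.getLen();
        spaceConsumed += child.getLen() * child.getReplication();
      }
    }
    return new long[] {length, spaceConsumed};
  }
}
{code}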


The benefit of having something similar in the OM API is that there is just 
one call, and we can do throttling and any kind of optimisation on the OM side 
as needed; we might ultimately even cache the values if that becomes necessary.
The problem with this approach is that it possibly requires a lock, and it is 
an operation that may block OM for too long... I am unsure though whether we 
even need the read lock.
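For comparison, a purely hypothetical sketch of the OM-side shape (none of the 
request or helper names below exist today; they are only here to show where 
the single call, the long-running part, and the locking question would sit):
{code}
// Hypothetical sketch only: the helpers below do not exist in OM today, they
// just illustrate the shape of a server-side getContentSummary.
interface KeySummarySource {
  // placeholder for iterating OM key metadata under a key prefix
  Iterable<KeyEntry> keysUnderPrefix(String volume, String bucket,
      String keyPrefix);
  void acquireBucketReadLock(String volume, String bucket);
  void releaseBucketReadLock(String volume, String bucket);
}

final class KeyEntry {            // placeholder for the key metadata we need
  final long dataSize;
  final int replicationFactor;
  KeyEntry(long dataSize, int replicationFactor) {
    this.dataSize = dataSize;
    this.replicationFactor = replicationFactor;
  }
}

final class ServerSideSummary {
  /** Replicated size of everything under keyPrefix: one call for the client. */
  static long replicatedSize(KeySummarySource om, String volume, String bucket,
      String keyPrefix) {
    long spaceConsumed = 0;
    // The open question from above: do we even need the read lock for a pure
    // read like this?
    om.acquireBucketReadLock(volume, bucket);
    try {
      // Single pass over every key under the prefix: cheap for the client,
      // but this is the part that scales with the subtree size on OM.
      for (KeyEntry key : om.keysUnderPrefix(volume, bucket, keyPrefix)) {
        spaceConsumed += key.dataSize * key.replicationFactor;
      }
    } finally {
      om.releaseBucketReadLock(volume, bucket);
    }
    return spaceConsumed;
  }
}
{code}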


[~arp], can you give some insight into why you would like to avoid 
implementing this on the OM side, and perhaps why it was implemented on the 
server side for HDFS in the end?


> Implement getContentSummary to provide replicated size properly to dfs -du 
> command
> ----------------------------------------------------------------------------------
>
>                 Key: HDDS-3721
>                 URL: https://issues.apache.org/jira/browse/HDDS-3721
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: Istvan Fajth
>            Assignee: Istvan Fajth
>            Priority: Major
>              Labels: Triaged
>
> Currently, when you run the hdfs dfs -du command against a path on Ozone, it 
> uses the default implementation from the FileSystem class in the Hadoop 
> project, which does not take the replication factor into account. In 
> DistributedFileSystem and in a couple of other FileSystem implementations 
> there is an override that calculates the full replicated size properly.
> Currently the output looks like this for a folder whose files have a 
> replication factor of 3:
> {code}
> hdfs dfs -du -s -h o3fs://perfbucket.volume.ozone1/terasort/datagen
> 931.3 G  931.3 G  o3fs://perfbucket.volume.ozone1/terasort/datagen
> {code}
> In Ozone's case as well, the command should report the replicated size as 
> the second number, so something around 2.7 TB in this case.
> To get there, we should implement getContentSummary and calculate the 
> replicated size in the response properly.
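> As a rough illustration of the direction, something along these lines (the 
> traversal and the class placement are just a sketch, not the final patch; 
> today this would still have to walk the tree from the client side):
> {code}
> // Sketch of a getContentSummary override inside the Ozone FileSystem
> // implementation; types come from org.apache.hadoop.fs. The override fills
> // spaceConsumed with the replicated size, which is what -du shows as the
> // second column. (Directory count is left out of this sketch for brevity.)
> @Override
> public ContentSummary getContentSummary(Path f) throws IOException {
>   long length = 0;
>   long spaceConsumed = 0;
>   long fileCount = 0;
>   RemoteIterator<LocatedFileStatus> files = listFiles(f, true);
>   while (files.hasNext()) {
>     LocatedFileStatus file = files.next();
>     length += file.getLen();
>     // raw length multiplied by the replication factor of the file
>     spaceConsumed += file.getLen() * file.getReplication();
>     fileCount++;
>   }
>   return new ContentSummary.Builder()
>       .length(length)
>       .fileCount(fileCount)
>       .spaceConsumed(spaceConsumed)
>       .build();
> }
> {code}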


