[ 
https://issues.apache.org/jira/browse/HDFS-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14741632#comment-14741632
 ] 

Joep Rottinghuis commented on HDFS-8898:
----------------------------------------

So it sounds like we're discussing two things here:
1) Getting the quota itself for a directory that a user has access to. There 
seem to be few security concerns with this.
2) Getting the quota, and the "ContentSummary" / count / usage for a directory 
that a user has access to, even if they might not have access to all the 
sub-directories. This is where [~jlowe] pointed out that there could be a 
potential security implication.
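
The difference between the two cases can be sketched with a toy in-memory 
namespace. This is not HDFS code: `Dir`, `get_quota`, and 
`get_content_summary` are hypothetical names, and the permission model is 
reduced to a set of allowed users per directory.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set, Tuple


@dataclass
class Dir:
    """Toy directory node (hypothetical; not an HDFS class)."""
    readers: Set[str]                 # users allowed to read this directory
    quota: Optional[int] = None       # namespace quota, None if unset
    files: int = 0                    # files directly in this directory
    children: Dict[str, "Dir"] = field(default_factory=dict)


def get_quota(d: Dir, user: str) -> Optional[int]:
    """Case 1: read only the quota; checks the directory itself, no traversal."""
    if user not in d.readers:
        raise PermissionError("no access to directory")
    return d.quota


def get_content_summary(d: Dir, user: str) -> Tuple[int, int]:
    """Case 2: count (files, directories) recursively, checking every sub-directory."""
    if user not in d.readers:
        raise PermissionError("no access to directory")
    files, dirs = d.files, 1
    for child in d.children.values():
        f, n = get_content_summary(child, user)
        files += f
        dirs += n
    return files, dirs


# /user/joep hosts a sub-directory in which another user created a private one.
home = Dir(readers={"joep"}, quota=1000, files=3)
home.children["shared"] = Dir(readers={"joep", "other"}, files=2)
home.children["shared"].children["private"] = Dir(readers={"other"}, files=5)
```

In this model `get_quota(home, "joep")` succeeds without touching any child, 
while `get_content_summary(home, "joep")` raises `PermissionError` at the 
unreadable sub-directory, which mirrors the situation in the issue description.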

Even with yielding the NN lock, it seems the NN can still hold the lock for 
~1 sec per 10M files in a sub-directory while checking the entire 
sub-directory tree for permissions.
To address the potential security implications for 2) we could either make this 
a cluster-wide (final) config value, or we could use an extended attribute on 
the directory itself to allow or disallow traversal of that particular 
directory.
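
A minimal sketch of how such an opt-in might gate the traversal, assuming a 
toy namespace rather than the NameNode's real data structures. The config 
flag, the attribute name `user.allow.unchecked.summary`, and the `Node` and 
`count_files` names are all hypothetical and do not exist in HDFS.

```python
from dataclasses import dataclass, field
from typing import Dict, Set

# Hypothetical names: neither this config flag nor this attribute exists in HDFS.
CLUSTER_ALLOW_UNCHECKED_SUMMARY = False        # stand-in for a cluster-wide config
XATTR_ALLOW = "user.allow.unchecked.summary"   # stand-in for a per-directory xattr


@dataclass
class Node:
    readers: Set[str]                          # users allowed to read this directory
    files: int = 0                             # files directly in this directory
    xattrs: Dict[str, str] = field(default_factory=dict)
    children: Dict[str, "Node"] = field(default_factory=dict)


def count_files(root: Node, user: str) -> int:
    """Count files under root; skip sub-directory checks only when opted in."""
    # Access to the top-level directory is always required.
    if user not in root.readers:
        raise PermissionError("no access to top-level directory")
    unchecked = (
        CLUSTER_ALLOW_UNCHECKED_SUMMARY
        or root.xattrs.get(XATTR_ALLOW) == "true"
    )

    def walk(n: Node) -> int:
        # Without the opt-in, every directory in the tree is permission-checked.
        if not unchecked and user not in n.readers:
            raise PermissionError("no access to sub-directory")
        return n.files + sum(walk(c) for c in n.children.values())

    return walk(root)
```

With the attribute unset, a single unreadable child makes the call fail; 
setting it on the top-level directory lets the owner obtain the count without 
the per-sub-directory checks. The sketch only shows the gating decision and 
says nothing about how long the lock would be held.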

1) would give a huge performance boost for the cases when people just want to 
know what the quota is.
2) would give a huge performance boost for the cases when people want to know a 
quota plus what's left for large directories relatively high in the directory 
structure (let alone / on a huge namespace of many tens of millions of files).

> Create API and command-line argument to get quota without need to get file 
> and directory counts
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-8898
>                 URL: https://issues.apache.org/jira/browse/HDFS-8898
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: fs
>            Reporter: Joep Rottinghuis
>
> On large directory structures it takes significant time to iterate through 
> the file and directory counts recursively to get a complete ContentSummary.
> When you want to just check for the quota on a higher level directory it 
> would be good to have an option to skip the file and directory counts.
> Moreover, currently one can only check the quota if you have access to all 
> the directories underneath. For example, if I have a large home directory 
> under /user/joep and I host some files for another user in a sub-directory, 
> the moment they create an unreadable sub-directory under my home I can no 
> longer check what my quota is. Understood that I cannot check the current 
> file counts unless I can iterate through all the usage, but for 
> administrative purposes it is nice to be able to get the current quota 
> setting on a directory without the need to iterate through and run into 
> permission issues on sub-directories.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
