[ https://issues.apache.org/jira/browse/HDFS-1377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eli Collins updated HDFS-1377:
------------------------------

    Fix Version/s: 0.20.204.0

It is already checked into branch-0.20-security and branch-0.20-security-204. Updating the fix version to match. It would be great if this were done for the other jiras in the branch.

> Quota bug for partial blocks allows quotas to be violated
> ----------------------------------------------------------
>
>                 Key: HDFS-1377
>                 URL: https://issues.apache.org/jira/browse/HDFS-1377
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.20.1, 0.20.2, 0.21.0, 0.22.0, 0.23.0
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>            Priority: Blocker
>             Fix For: 0.20.3, 0.20.204.0, 0.21.1, Federation Branch, 0.22.0, 0.23.0
>
>         Attachments: HDFS-1377.patch, hdfs-1377-1.patch, hdfs-1377-b20-1.patch, hdfs-1377-b20-2.patch, hdfs-1377-b20-3.patch
>
>
> There's a bug in the quota code that causes quotas not to be respected when a file is not an exact multiple of the block size. Here's an example:
> {code}
> $ hadoop fs -mkdir /test
> $ hadoop dfsadmin -setSpaceQuota 384M /test
> $ ls dir/ | wc -l        # dir contains 101 files
> 101
> $ du -ms dir             # each is 3mb
> 304     dir
> $ hadoop fs -put dir /test
> $ hadoop fs -count -q /test
>         none     inf     402653184     -550502400     2     101     317718528 hdfs://haus01.sf.cloudera.com:10020/test
> $ hadoop fs -stat "%o %r" /test/dir/f30
> 134217728 3              # three 128mb blocks
> {code}
> INodeDirectoryWithQuota caches the number of bytes consumed by its children in {{diskspace}}. The quota adjustment code has a bug that causes {{diskspace}} to get updated incorrectly when a file is not an exact multiple of the block size (the value ends up negative).
> This causes the quota checking code to think that the files in the directory consume less space than they actually do, so verifyQuota does not throw a QuotaExceededException even when the directory is over quota. However the bug isn't visible to users because {{fs count -q}} reports the numbers generated by INode#getContentSummary, which adds up the sizes of the blocks rather than using the cached INodeDirectoryWithQuota#diskspace value.
> In FSDirectory#addBlock the disk space consumed is set conservatively to the full block size * the number of replicas:
> {code}
> updateCount(inodes, inodes.length-1, 0,
>     fileNode.getPreferredBlockSize()*fileNode.getReplication(), true);
> {code}
> In FSNameSystem#addStoredBlock we adjust for this conservative estimate by subtracting out the difference between the conservative estimate and the number of bytes actually stored:
> {code}
> //Updated space consumed if required.
> INodeFile file = (storedBlock != null) ? storedBlock.getINode() : null;
> long diff = (file == null) ? 0 :
>     (file.getPreferredBlockSize() - storedBlock.getNumBytes());
>
> if (diff > 0 && file.isUnderConstruction() &&
>     cursize < storedBlock.getNumBytes()) {
>   ...
>   dir.updateSpaceConsumed(path, 0, -diff*file.getReplication());
> {code}
> We do the same in FSDirectory#replaceNode when completing the file, but at file granularity (I believe the intent here is to correct for cases where there is a failure replicating blocks followed by recovery). Since oldnode is under construction, INodeFile#diskspaceConsumed will use the preferred block size (vs. Block#getNumBytes, which is used by newnode), so we will again subtract out the difference between the full block size and the number of bytes actually stored:
> {code}
> long dsOld = oldnode.diskspaceConsumed();
> ...
> //check if disk space needs to be updated.
> long dsNew = 0;
> if (updateDiskspace && (dsNew = newnode.diskspaceConsumed()) != dsOld) {
>   try {
>     updateSpaceConsumed(path, 0, dsNew-dsOld);
>   ...
> {code}
> So in the above example we started with diskspace at 384mb (3 * 128mb) and then we subtract 375mb (to reflect that only 9mb raw was actually used) twice, so each file ends up contributing -366mb (384mb minus 2 * 375mb) to the directory's diskspace. This is why the cached value goes negative and yet we can still write more files.
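> To make the double subtraction concrete, here is a small standalone calculation (just a sketch in plain Java, not HDFS code; the constants mirror the example above: 128mb preferred block size, 3mb files, replication 3):
> {code}
> public class QuotaDoubleSubtraction {
>   public static void main(String[] args) {
>     long preferredBlockSize = 128L * 1024 * 1024; // 128mb preferred block size
>     long actualBytes = 3L * 1024 * 1024;          // each file stores only 3mb
>     short replication = 3;
>
>     // FSDirectory#addBlock charges the full block size per replica up front.
>     long diskspace = preferredBlockSize * replication;            // 384mb
>
>     // FSNameSystem#addStoredBlock subtracts the unused tail of the block ...
>     long diff = (preferredBlockSize - actualBytes) * replication; // 375mb
>     diskspace -= diff;                                            // 9mb, correct
>
>     // ... and FSDirectory#replaceNode subtracts the same tail a second time.
>     diskspace -= diff;                                            // -366mb
>
>     System.out.println((diskspace / (1024 * 1024)) + "mb");       // prints -366mb
>   }
> }
> {code}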
> So a directory with lots of single-block files (for files with multiple blocks, only the final partial block ends up subtracting from the diskspace used) ends up having a quota that's way off.
> I think the fix is to make the diskspaceConsumed calculations in FSDirectory#replaceNode not differ when the old and new INode have the same blocks. I'll work on a patch which also adds a quota test for files that are not an exact multiple of the block size and warns in INodeDirectory#computeContentSummary if the computed size does not match the cached value.
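> To illustrate that direction, here is a rough sketch (illustrative only, not the actual patch; {{diskspaceConsumed(Block[])}} is assumed here as a helper that sums the actual sizes of the given blocks and may not exist with this exact signature). The idea is to compute dsOld over the same block list that newnode carries, so identical blocks yield a zero delta:
> {code}
> // Sketch only: derive dsOld from the same blocks as dsNew so that when the
> // old and new INode share blocks there is no second, bogus subtraction.
> long dsOld = oldnode.diskspaceConsumed(newnode.getBlocks());
> long dsNew = newnode.diskspaceConsumed();
> if (updateDiskspace && dsNew != dsOld) {
>   updateSpaceConsumed(path, 0, dsNew - dsOld);
> }
> {code}
> With both sides computed from the same blocks, the adjustment in replaceNode becomes a no-op in the common case, leaving the correction in FSNameSystem#addStoredBlock as the place where the conservative estimate is trimmed.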