[ https://issues.apache.org/jira/browse/HDFS-10529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrew Wang updated HDFS-10529:
-------------------------------
    Target Version/s: 3.0.0-alpha2  (was: 2.8.0, 3.0.0-alpha1)

> Df reports incorrect usage when appending less than block size
> --------------------------------------------------------------
>
>                 Key: HDFS-10529
>                 URL: https://issues.apache.org/jira/browse/HDFS-10529
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.2, 3.0.0-alpha1
>            Reporter: Pranav Prakash
>            Assignee: Pranav Prakash
>            Priority: Minor
>              Labels: datanode, fs, hdfs
>         Attachments: HDFS-10529.000.patch
>
> Steps to recreate issue:
> 1. Create a 100MB file on an HDFS cluster with a 128MB block size and replication factor 3
> 2. Append 100MB to the file
> 3. Df reports around 900MB even though it should only be around 600MB.
> Looking at the blocks confirms that df is incorrect, as there exist only two blocks on each DN -- a 128MB block and a 72MB block.
> This issue seems to arise because BlockPoolSlice does not account for the delta increase in dfsUsage when an append happens to a partially-filled block, and instead naively adds the total block size. For instance, in the example scenario, when a block is "filled" from 100MB to 128MB, addFinalizedBlock() in BlockPoolSlice adds the size of the newly created block into the total instead of accounting for the difference/delta in block size between old and new. This has the effect of double-counting the old partially-filled block: it is counted once when it is first created (in the example scenario, when the 100MB file is created) and again when it becomes part of the filled block (in the example scenario, when the 128MB block is formed from the initial 100MB block). Thus the perceived size becomes 100MB + 128MB + 72MB = 300MB for each DN, or 900MB across the cluster.
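The double-counting described above can be illustrated with a minimal sketch. This is not the actual BlockPoolSlice code; the class `VolumeUsage` and its methods are hypothetical stand-ins for the dfsUsage bookkeeping, written only to show the difference between adding the full finalized-block size and adding the delta over bytes already counted.

```java
// Illustrative sketch only -- VolumeUsage and its methods are hypothetical
// stand-ins for BlockPoolSlice's dfsUsage accounting, not HDFS source.
public class VolumeUsage {
    private long dfsUsed = 0;

    // Buggy behavior described in the report: on finalize, add the full
    // block size, double-counting bytes already counted when the
    // partially-filled replica was first created.
    void addFinalizedBlockBuggy(long newBlockBytes) {
        dfsUsed += newBlockBytes;
    }

    // Fixed behavior: add only the delta between the finalized block and
    // the bytes already counted for the pre-existing partial block.
    void addFinalizedBlockFixed(long newBlockBytes, long previouslyCountedBytes) {
        dfsUsed += newBlockBytes - previouslyCountedBytes;
    }

    long getDfsUsed() { return dfsUsed; }

    public static void main(String[] args) {
        final long MB = 1024L * 1024L;

        // Scenario from the report, per DataNode: create a 100MB file,
        // then append 100MB with a 128MB block size, ending with one
        // 128MB block and one 72MB block on disk.
        VolumeUsage buggy = new VolumeUsage();
        buggy.addFinalizedBlockBuggy(100 * MB); // initial 100MB block created
        buggy.addFinalizedBlockBuggy(128 * MB); // block "filled" to 128MB on append
        buggy.addFinalizedBlockBuggy(72 * MB);  // remaining 72MB block
        System.out.println("buggy per-DN MB: " + buggy.getDfsUsed() / MB);  // 300

        VolumeUsage fixed = new VolumeUsage();
        fixed.addFinalizedBlockFixed(100 * MB, 0);        // new block: full size
        fixed.addFinalizedBlockFixed(128 * MB, 100 * MB); // append: delta only (28MB)
        fixed.addFinalizedBlockFixed(72 * MB, 0);         // new block: full size
        System.out.println("fixed per-DN MB: " + fixed.getDfsUsed() / MB);  // 200
    }
}
```

With the delta accounting, each DN reports 200MB, giving the expected ~600MB across a replication-factor-3 cluster instead of the 900MB the report observes.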
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)