[ https://issues.apache.org/jira/browse/HDFS-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13572926#comment-13572926 ]

Todd Lipcon commented on HDFS-4258:
-----------------------------------

I'm not sure the example given in the description can actually happen -- 
HDFS-3031 may have already fixed this case in its patch. The block comment that 
patch added describes the three possible cases on addBlock():

{code}
+        // The block that the client claims is the current last block
+        // doesn't match up with what we think is the last block. There are
+        // three possibilities:
+        // 1) This is the first block allocation of an append() pipeline
+        //    which started appending exactly at a block boundary.
+        //    In this case, the client isn't passed the previous block,
+        //    so it makes the allocateBlock() call with previous=null.
+        //    We can distinguish this since the last block of the file
+        //    will be exactly a full block.
+        // 2) This is a retry from a client that missed the response of a
+        //    prior getAdditionalBlock() call, perhaps because of a network
+        //    timeout, or because of an HA failover. In that case, we know
+        //    by the fact that the client is re-issuing the RPC that it
+        //    never began to write to the old block. Hence it is safe to
+        //    abandon it and allocate a new one.
+        // 3) This is an entirely bogus request/bug -- we should error out
+        //    rather than potentially appending a new block with an empty
+        //    one in the middle, etc
{code}

and it has code to detect case 3, which is what we seem to be talking about 
here.
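
Purely for illustration, here is a rough sketch (plain Java with made-up names -- 
this is not the actual FSNamesystem code) of how those three cases can be 
distinguished from state the NN already has:

{code}
// Hypothetical sketch of the case analysis described in the quoted comment.
// All class, method, and parameter names here are invented for the example.
enum AddBlockCase { APPEND_AT_BLOCK_BOUNDARY, RETRIED_RPC, BOGUS_REQUEST }

final class AddBlockCaseClassifier {
  /**
   * @param previousFromClient block id the client passed as "previous", or null
   * @param penultimateBlockId id of the block before the current last block, or -1
   * @param lastBlockBytes     bytes currently in the last block
   * @param preferredBlockSize the file's configured block size
   */
  static AddBlockCase classify(Long previousFromClient,
                               long penultimateBlockId,
                               long lastBlockBytes,
                               long preferredBlockSize) {
    if (previousFromClient == null && lastBlockBytes == preferredBlockSize) {
      // Case 1: append() started exactly at a block boundary; the client had
      // no previous block to report and the last block is exactly full.
      return AddBlockCase.APPEND_AT_BLOCK_BOUNDARY;
    }
    if (previousFromClient != null && previousFromClient == penultimateBlockId) {
      // Case 2: a retry of an earlier getAdditionalBlock(); the client never
      // wrote to the block allocated last time, so it is safe to abandon it.
      return AddBlockCase.RETRIED_RPC;
    }
    // Case 3: the request is inconsistent with the file's state -- error out
    // rather than risk an empty block in the middle of the file.
    return AddBlockCase.BOGUS_REQUEST;
  }
}
{code}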

Also, if the main goal is to deal with this issue, it seems like you don't need 
anything nearly as complex -- you would only need inode IDs in the 
FileUnderConstruction structure, just a UUID for each open file. That wouldn't 
have the memory concerns, and it would address the problem as it's been 
described, right?
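
For what it's worth, a minimal sketch of that narrower idea (hypothetical names, 
not the real FileUnderConstruction or lease classes) -- the point is just that 
the writer's handle survives a rename because the path is never the lookup key:

{code}
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical: track open (under-construction) files by a stable UUID, not by path.
final class UnderConstructionFiles {
  static final class OpenFile {
    final UUID fileId = UUID.randomUUID(); // stable for the life of the open file
    volatile String currentPath;           // updated on rename; never used as the key
    OpenFile(String path) { this.currentPath = path; }
  }

  private final Map<UUID, OpenFile> openById = new ConcurrentHashMap<>();

  UUID open(String path) {
    OpenFile f = new OpenFile(path);
    openById.put(f.fileId, f);
    return f.fileId;                       // client keeps this id for later RPCs
  }

  void rename(UUID fileId, String newPath) {
    OpenFile f = openById.get(fileId);
    if (f != null) {
      f.currentPath = newPath;             // the open-file entry follows the rename
    }
  }

  OpenFile lookup(UUID fileId) {
    return openById.get(fileId);           // addBlock/complete resolve by id, not path
  }
}
{code}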

To be clear, I'm not against adding inode numbers, but I agree it would be nice 
to write up what the use cases are and make sure that the benefits outweigh the 
cost. I agree we can squeeze the inum into some extra bits in memory, but there 
are plenty of other things that would be nice to squeeze in as well (e.g. 
xattrs, hierarchical storage classes, etc.) -- if we put all of them in, it 
could really hurt.
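
Just to make the "extra bits" point concrete, a purely illustrative packing 
(field widths invented for the example) of an inum into spare bits of a word the 
inode already stores might look like:

{code}
// Illustrative only: pack a 16-bit field and a 48-bit inode number into one long.
final class PackedInodeBits {
  private static final int INUM_BITS = 48;
  private static final long INUM_MASK = (1L << INUM_BITS) - 1;

  static long pack(int flags, long inodeNumber) {
    return ((long) (flags & 0xFFFF) << INUM_BITS) | (inodeNumber & INUM_MASK);
  }

  static long inodeNumber(long packed) {
    return packed & INUM_MASK;
  }

  static int flags(long packed) {
    return (int) (packed >>> INUM_BITS) & 0xFFFF;
  }
}
{code}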

Another use case that I see for inums, if they're guaranteed not to be recycled 
(not sure if that's the case), would be modification checks in tools like 
distcp. We currently can't rely on just the file name and size, since the file 
could have been replaced, but the combination of inum and size would be unique.
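
A sketch of that distcp-style check, with a hypothetical FileKey type standing 
in for whatever the tool would actually record, and assuming inums are indeed 
never recycled:

{code}
import java.util.Objects;

// Hypothetical: a file is considered unchanged only if both its inode number
// (assumed never recycled) and its length match what was recorded last time.
final class FileKey {
  final long inodeNumber;
  final long length;

  FileKey(long inodeNumber, long length) {
    this.inodeNumber = inodeNumber;
    this.length = length;
  }

  // The path alone is not enough: the file at this name may have been replaced.
  boolean unchangedSince(FileKey previous) {
    return previous != null
        && previous.inodeNumber == inodeNumber
        && previous.length == length;
  }

  @Override public boolean equals(Object o) {
    if (!(o instanceof FileKey)) return false;
    FileKey k = (FileKey) o;
    return k.inodeNumber == inodeNumber && k.length == length;
  }

  @Override public int hashCode() { return Objects.hash(inodeNumber, length); }
}
{code}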
                
> Rename of Being Written Files
> -----------------------------
>
>                 Key: HDFS-4258
>                 URL: https://issues.apache.org/jira/browse/HDFS-4258
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client, namenode
>    Affects Versions: 3.0.0
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Brandon Li
>         Attachments: HDFS-4258.patch, HDFS-4258.patch, HDFS-4258.patch, 
> HDFS-4258.patch
>
>
> When a file being written, or one of its ancestor directories, is renamed, the 
> path in the file lease is also renamed.  Then the writer of the file will 
> usually fail since the file path in the writer is not updated.
> Moreover, I think there is a bug as follows:
> # Client writes 0's to F_0="/foo/file" and writes 1's to F_1="/bar/file" at 
> the same time.
> # Rename /bar to /baz
> # Rename /foo to /bar
> Then, writing to F_0 will fail since /foo/file does not exist anymore, but 
> writing to F_1 may succeed since /bar/file exists as a different file.  In 
> that case, the content of /bar/file could be partly 0's and partly 1's.
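
For reference, an illustrative (unverified) sketch of the scenario above using 
only the standard FileSystem API; it assumes a running HDFS as the default 
filesystem, and the expected failure point follows the description rather than 
an actual test:

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RenameWhileWriting {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());

    // 1. Open writers for F_0=/foo/file (0's) and F_1=/bar/file (1's).
    FSDataOutputStream f0 = fs.create(new Path("/foo/file"));
    FSDataOutputStream f1 = fs.create(new Path("/bar/file"));

    // 2. Rename /bar to /baz.
    fs.rename(new Path("/bar"), new Path("/baz"));
    // 3. Rename /foo to /bar.
    fs.rename(new Path("/foo"), new Path("/bar"));

    // /foo/file no longer exists, so f0 should eventually fail; f1 may keep
    // writing to /bar/file, which is now a different file than the one it opened.
    f0.write(new byte[]{0, 0, 0});
    f1.write(new byte[]{1, 1, 1});
    f1.close();
    f0.close(); // expected to fail per the description above
  }
}
{code}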
