[ 
https://issues.apache.org/jira/browse/HDFS-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867750#action_12867750
 ] 

Todd Lipcon commented on HDFS-1142:
-----------------------------------

Hi Konstantin. Thanks for the detailed response.

bq. Suppose there is only one new client, and the old owner had died already. 
The client tries create(). This triggers lease recovery on NN, which starts the 
recovery under HDFS_NameNode, and throws RecoveryInProgressException back to 
the client. The client retries as expected, and the next time gets 
AlreadyBeingCreatedException. Thinking that somebody else got lucky before him 
the client bails out, which is not right as there is nobody else competing for 
the file. 

What if we specifically compare the lease holder to the special HDFS_NameNode 
value and, in that case, throw RecoveryInProgressException instead of 
AlreadyBeingCreatedException?
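
To make the idea concrete, here is a rough, self-contained sketch of the check 
I have in mind. It is not the actual FSNamesystem code: the class name, the 
method checkExistingLease, and the two stand-in exception classes are 
hypothetical and exist only so the example compiles on its own. The one thing 
taken from the discussion is the special holder value HDFS_NameNode.

```java
import java.io.IOException;

public class LeaseCheckSketch {
    // Special holder name the NN takes over while it runs lease recovery
    // (value taken from the discussion; the constant name is an assumption).
    static final String NAMENODE_LEASE_HOLDER = "HDFS_NameNode";

    // Stand-ins for the real HDFS exceptions so the sketch is self-contained.
    public static class AlreadyBeingCreatedException extends IOException {
        public AlreadyBeingCreatedException(String msg) { super(msg); }
    }

    public static class RecoveryInProgressException extends IOException {
        public RecoveryInProgressException(String msg) { super(msg); }
    }

    /**
     * Hypothetical check run when a client calls create() or append() on a
     * file. currentHolder is null when nobody holds a lease on the file.
     */
    public static void checkExistingLease(String src, String currentHolder)
            throws IOException {
        if (currentHolder == null) {
            return; // no lease on the file, nothing to reject
        }
        if (NAMENODE_LEASE_HOLDER.equals(currentHolder)) {
            // Recovery is still underway: tell the client to retry later
            // instead of claiming that another writer owns the file.
            throw new RecoveryInProgressException(
                "Recovery in progress, file " + src + " lease held by "
                + currentHolder);
        }
        // A real client holds the lease: the file is genuinely in use.
        throw new AlreadyBeingCreatedException(
            "File " + src + " is already being created by " + currentHolder);
    }

    public static void main(String[] args) {
        String[] holders = { null, NAMENODE_LEASE_HOLDER, "DFSClient_1" };
        for (String holder : holders) {
            try {
                checkExistingLease("/user/todd/file", holder);
                System.out.println(holder + " -> ok");
            } catch (IOException e) {
                System.out.println(holder + " -> "
                    + e.getClass().getSimpleName());
            }
        }
    }
}
```

With a check like this, a lone client that retries create() during recovery 
keeps seeing RecoveryInProgressException until recovery finishes, rather than 
bailing out on AlreadyBeingCreatedException.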

bq. Does that make sense? I don't see a problem here. Do you have failing 
tests because of that?

Yes - please see the new test case included in the patch above. The issue is 
that the client can continue to do things like completeFile or allocate new 
blocks while recovery is underway.

bq. For future reference it is very undesirable to declare public methods in 
FSNamesystem to provide access to them from tests

I agree. However, to spy on commitBlockSynchronization with Mockito, using a 
trampoline class like NameNodeAdapter would not work. If you agree with my 
points above, I can see whether the spy call can be moved into NameNodeAdapter 
itself.

BTW, isn't this the point of the "Private" InterfaceAudience annotation?


Let me know if you agree with the above idea (throwing 
RecoveryInProgressException when the lease is held by HDFS_NameNode).

> Lease recovery doesn't reassign lease when triggered by append()
> ----------------------------------------------------------------
>
>                 Key: HDFS-1142
>                 URL: https://issues.apache.org/jira/browse/HDFS-1142
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.21.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>         Attachments: hdfs-1142.txt, hdfs-1142.txt
>
>
> If a soft lease has expired and another writer calls append(), it triggers 
> lease recovery but doesn't reassign the lease to a new owner. Therefore, the 
> old writer can continue to allocate new blocks, try to steal back the lease, 
> etc. This is for the testRecoveryOnBlockBoundary case of HDFS-1139.
