[ https://issues.apache.org/jira/browse/HDFS-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13688511#comment-13688511 ]
Suresh Srinivas commented on HDFS-4849: --------------------------------------- bq. I thought Suresh had some concrete suggestions, which I could have quickly corrected. But it is something else. I did have concrete points. But you fail to see it and you are free to blow it away as fluff! I disagree with you terming the new functionality as idempotent. I have tried to explain it many ways. I will given another last shot. # If an operation is idempotent, it can be retried any number of times. There is no retry cache required. If an operation is non-idempotent, the only way you can handle them is using retry cache. The retry cache only works for a single client, and only when the operations are retried. Idempotent operations work for any client, irrespective of whether it is a retried request or new request. # What you have in this patch is, code equivalent to retry cache, done using lease. That is the reason why, this functionality only works for a single client and not for multiple clients. I have already posted some comments about the code snippet you posted and I have described why a client using a file system instance does not have any issue without this patch and why this patch introduces *data corruption* issues. Lets complete that thread. As I said earlier, I agree that this is an important feature. In the interest of making progress on this issue, here are the next steps: # I disagree that this makes create and append idempotent. As long as it is termed as enable retries from a single client, my concern will be addressed. We should also address multi threaded client issue and *data corruption* issue, that I have brought up earlier. Here are comments I have on the current patch: # Given this is equivalent to retry cache and we are just sending the same result back for retried method call, lets not have audit log event. It is easy to turn it off. Also doing this will make this a compatible change. # Currently the checkIfRetry for non retried create calls depends on catching an Exception. This is expensive. Please use another variant of method that does not depend on Exception and just gets true or false. # One big problem with the current approach is (I have not looked at all the create flags): #* create with overwrite enabled is already idempotent. This patch uses retry cache for that and changes the semantics. So we should not change that behavior. #* Please run some benchmark to make sure the impact on create call (non-retried) is quantified. # Another suggestion. Move checkIfRetry to the else clause in the following code snippet: {code} if (myFile == null) { if (!create) { throw new FileNotFoundException("failed to overwrite or append to non-existent file " + src + " on client " + clientMachine); } } else { // checkIfRetry here {code} > Idempotent create and append operations. > ---------------------------------------- > > Key: HDFS-4849 > URL: https://issues.apache.org/jira/browse/HDFS-4849 > Project: Hadoop HDFS > Issue Type: Improvement > Components: namenode > Affects Versions: 2.0.4-alpha > Reporter: Konstantin Shvachko > Assignee: Konstantin Shvachko > Priority: Blocker > Attachments: idempotentCreate.patch, idempotentCreate.patch, > idempotentCreate.patch > > > create, append and delete operations can be made idempotent. This will reduce > chances for a job or other app failures when NN fails over. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira