[ https://issues.apache.org/jira/browse/HDFS-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686225#comment-13686225 ]
Konstantin Shvachko commented on HDFS-4849:
-------------------------------------------

Suresh, I was referring to a different reply of mine. There is a hyperlink pointing directly to it. I'll repeat the facts below:
- DFSClient.clientName is defined through configuration via the taskId field.
-- This is the only way to guarantee uniqueness of the clientName now.
-- If two clients are created within the same thread, they are likely to have different names because clientName includes a Random value, but this is not guaranteed.
- HDFS has an option to turn FileSystem caching off, which is another way to avoid multi-threaded clientName collisions.
- Multiple threads using the same cached FileSystem have an issue even now, without retries. Suppose one creates a file and then spawns several threads that start creating blocks simultaneously on the same file. The same problem, right?

I am in favour of making clientName unique, and I can file a jira if you want me to.
I also agree we should improve the consistency of create. We have a few jiras for that, like HDFS-4437 and HDFS-4874.

> allow retry for a given call ID in a small window of time... This still may
> not be the correct solution.

I am also hesitant about this approach. My main concern is that you need to make this cache persistent, which in turn will lead to the requirement of idempotent cache-persisting operations. But I would rather discuss it further in HDFS-4872.
Create (append) is much simpler because the NN already keeps the state of the file in the form of its lease and INodeUnderConstruction. addBlock() was made idempotent, so create can be too, which I think my patch achieves.

As for the audit log: do you mean the one at the end of {{startFileInt()}}?
{code}
logAuditEvent(true, "create", src, null, stat);
{code}
Is there a requirement for the audit log to print the file creation only once?

> Idempotent create and append operations.
> ----------------------------------------
>
>                 Key: HDFS-4849
>                 URL: https://issues.apache.org/jira/browse/HDFS-4849
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.0.4-alpha
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>         Attachments: idempotentCreate.patch, idempotentCreate.patch
>
>
> create, append and delete operations can be made idempotent. This will reduce
> chances for a job or other app failures when NN fails over.
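
To make the idea of an idempotent create concrete, here is a minimal toy sketch. It is not the NameNode code and not the attached patch. It assumes a retried create can be recognised because the file is still under construction, holds no blocks yet, and its lease belongs to the same clientName. The class IdempotentCreateSketch, the OpenFile record, and both method signatures are hypothetical names introduced only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory namespace. Hypothetical; not the NameNode or the attached patch. */
class IdempotentCreateSketch {

    /** Hypothetical stand-in for a file under construction together with its lease. */
    static final class OpenFile {
        final long fileId;
        final String leaseHolder; // the clientName that created the file
        int blockCount = 0;

        OpenFile(long fileId, String leaseHolder) {
            this.fileId = fileId;
            this.leaseHolder = leaseHolder;
        }
    }

    private final Map<String, OpenFile> underConstruction = new HashMap<>();
    private long nextFileId = 1;

    /**
     * Creates src for clientName and returns its file id.
     * Assumption: a repeated call from the same lease holder on a file that
     * still has no blocks is a retry, so it returns the original file id.
     */
    long create(String src, String clientName) {
        OpenFile existing = underConstruction.get(src);
        if (existing == null) {
            OpenFile created = new OpenFile(nextFileId++, clientName);
            underConstruction.put(src, created);
            return created.fileId;
        }
        if (existing.leaseHolder.equals(clientName) && existing.blockCount == 0) {
            return existing.fileId; // same holder, nothing written yet: treat as a retry
        }
        throw new IllegalStateException(
                src + " is already being created by " + existing.leaseHolder);
    }

    /** Records one more block on src; only the lease holder may add blocks. */
    void addBlock(String src, String clientName) {
        OpenFile file = underConstruction.get(src);
        if (file == null || !file.leaseHolder.equals(clientName)) {
            throw new IllegalStateException("no lease on " + src + " for " + clientName);
        }
        file.blockCount++;
    }
}
```

In this sketch the retry check relies entirely on clientName, so two clients that happened to share a name would be indistinguishable. That is one way to see why clientName uniqueness matters in the discussion above.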