[ 
https://issues.apache.org/jira/browse/HDFS-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13686225#comment-13686225
 ] 

Konstantin Shvachko commented on HDFS-4849:
-------------------------------------------

Suresh, I was referring to a different reply of mine. There is a hyperlink pointing 
directly to it.
I'll repeat the facts below:
- DFSClient.clientName is defined through configuration via the taskId field. 
-- This is currently the only way to guarantee uniqueness of the clientName.
-- If two clients are created within the same thread, they are likely, but not 
guaranteed, to have different names because clientName includes a Random value.
- HDFS has an option to turn FileSystem caching off, which is another way to 
avoid multi-threaded clientName collisions.
- Multiple threads using the same cached FileSystem have an issue even now, 
without retries.
Suppose one creates a file and then spawns several threads that start creating 
blocks simultaneously on the same file. The same problem, right?
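
To make the clientName point concrete, here is a minimal sketch, not the actual DFSClient code. The class ClientNames, both method names, and the exact name layout (task id, random int, thread id) are assumptions for illustration only. It contrasts a name that relies on a random value with one that is unique within a JVM by construction.

```java
import java.util.Random;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical helper for illustration; not part of Hadoop.
final class ClientNames {
    // Monotonic counter: never repeats within one JVM.
    private static final AtomicLong COUNTER = new AtomicLong();
    // Random per-process token, so that names from different JVMs
    // are extremely unlikely to coincide.
    private static final String PROCESS_TOKEN = UUID.randomUUID().toString();

    private ClientNames() {}

    // Assumed layout: task id, a random int, and the current thread id.
    // Two clients created in the same thread differ only in the random
    // part, so a collision is unlikely but possible.
    static String randomName(String taskId, Random rnd) {
        return "DFSClient_" + taskId + "_" + rnd.nextInt()
            + "_" + Thread.currentThread().getId();
    }

    // Alternative: replace the random part with a per-process token
    // plus a counter, so two calls in one JVM never return the same name.
    static String uniqueName(String taskId) {
        return "DFSClient_" + taskId + "_" + PROCESS_TOKEN
            + "_" + COUNTER.incrementAndGet();
    }
}
```

Calling randomName twice in one thread with Random instances that happen to produce the same value yields identical names, whereas uniqueName cannot repeat within a process.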

I am in favour of making clientName unique, can file a jira if you want me to.
I also agree we should improve consistency of create. We have a few jiras for 
that, like HDFS-4437, HDFS-4874.

>  allow retry for a given call ID in a small window of time... This still may 
> not be the correct solution.

I am also hesitant about this approach. My main concern is that you need to make 
this cache persistent, which in turn will lead to the requirement of idempotent 
cache-persisting-operations. But I would rather further discuss it in HDFS-4872.

Create (append) is much simpler because NN already keeps the state of the file 
in the form of its lease and INodeUnderConstruction. addBlock() was made 
idempotent, so create can be too, which I think my patch achieves.
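
As a rough illustration of the idea only, and not the attached patch, the toy model below treats a repeated create from the current lease holder as a successful retry. ToyNamespace, its fields, and the exception choices are all hypothetical; the real NN state (lease, INodeUnderConstruction) is reduced to a single map from path to holder name.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model for illustration; not NameNode code.
final class ToyNamespace {
    // path -> clientName holding the lease on a file under construction
    private final Map<String, String> underConstruction = new HashMap<>();
    // paths of files that have been completed
    private final Set<String> completed = new HashSet<>();

    // Returns true if the file was newly created, false if the call was
    // recognized as a retry by the client that already holds the lease.
    synchronized boolean create(String path, String clientName) {
        if (completed.contains(path)) {
            throw new IllegalStateException("file already exists: " + path);
        }
        String holder = underConstruction.get(path);
        if (holder == null) {
            underConstruction.put(path, clientName);
            return true;
        }
        if (holder.equals(clientName)) {
            return false; // same lease holder: treat as a retried create
        }
        throw new IllegalStateException(
            path + " is being created by another client: " + holder);
    }

    // Completes the file and releases the lease.
    synchronized void complete(String path, String clientName) {
        if (!clientName.equals(underConstruction.get(path))) {
            throw new IllegalStateException("lease mismatch on " + path);
        }
        underConstruction.remove(path);
        completed.add(path);
    }
}
```

The sketch also shows why clientName uniqueness matters here: two clients that share a name are indistinguishable to the holder check, so the second one would be treated as a retry of the first.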

As for the audit log: do you mean the one at the end of {{startFileInt()}}?
{code}
    logAuditEvent(true, "create", src, null, stat);
{code}
Is there a requirement for the audit log to record a file create only once?
                
> Idempotent create and append operations.
> ----------------------------------------
>
>                 Key: HDFS-4849
>                 URL: https://issues.apache.org/jira/browse/HDFS-4849
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.0.4-alpha
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>            Priority: Blocker
>         Attachments: idempotentCreate.patch, idempotentCreate.patch
>
>
> create, append and delete operations can be made idempotent. This will reduce 
> chances for a job or other app failures when NN fails over.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
