[ 
https://issues.apache.org/jira/browse/HDFS-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401135#comment-13401135
 ] 

Kihwal Lee commented on HDFS-3541:
----------------------------------

The patch looks okay but I was wondering whether the test can be improved. 

The test in the current patch does not directly recreate the original race 
condition. An artificial deadlock could probably be created with a thread 
that, inside a {{synchronized(datanode.data)}} block, sleeps and then kills 
the writer. While it is sleeping, another thread could try to close the 
{{DFSOutputStream}}. The close should fail once the writer (i.e. the 
{{DataXceiver}} thread) is killed and its streams are closed. After that, we 
could verify that the block is not finalized, which tells us the 
{{PacketResponder}} thread did not finalize it. 
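
To make the ordering concrete, here is a rough sketch using plain Java threads 
only. It does not use any Hadoop classes: {{RaceSketch}}, {{datasetLock}} 
(standing in for {{datanode.data}}), and the writer/killer/closer threads are 
all hypothetical stand-ins, and a real test would presumably drive a mini 
cluster instead. 

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Plain-Java stand-in for the proposed test; none of these names are Hadoop APIs.
class RaceSketch {
    static final class Result {
        final boolean closeFailed;
        final boolean finalized;

        Result(boolean closeFailed, boolean finalized) {
            this.closeFailed = closeFailed;
            this.finalized = finalized;
        }
    }

    static Result run(long sleepMillis) {
        final Object datasetLock = new Object(); // assumption: plays the role of datanode.data
        final AtomicBoolean writerKilled = new AtomicBoolean(false);
        final AtomicBoolean closeFailed = new AtomicBoolean(false);
        final AtomicBoolean finalized = new AtomicBoolean(false);
        final CountDownLatch lockHeld = new CountDownLatch(1);

        // Stand-in for the DataXceiver thread: blocks until it is interrupted ("killed").
        final Thread writer = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                writerKilled.set(true); // streams would be closed here
            }
        }, "writer");

        // Holds the lock, sleeps, then kills the writer while still inside the lock.
        final Thread killer = new Thread(() -> {
            synchronized (datasetLock) {
                lockHeld.countDown();
                try {
                    Thread.sleep(sleepMillis);
                    writer.interrupt();
                    writer.join(); // writer never takes the lock, so this cannot deadlock
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "killer");

        // Stand-in for closing the stream: the finalize step needs the same lock.
        final Thread closer = new Thread(() -> {
            try {
                lockHeld.await(); // start only once the killer owns the lock
            } catch (InterruptedException e) {
                return;
            }
            synchronized (datasetLock) {
                if (writerKilled.get()) {
                    closeFailed.set(true); // writer is gone, so close fails
                } else {
                    finalized.set(true);   // would be the unwanted finalize
                }
            }
        }, "closer");

        writer.start();
        killer.start();
        closer.start();
        try {
            killer.join();
            closer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return new Result(closeFailed.get(), finalized.get());
    }

    public static void main(String[] args) {
        Result r = run(200);
        System.out.println("closeFailed=" + r.closeFailed + " finalized=" + r.finalized);
    }
}
```

In this sketch the closer can only enter the lock after the writer has been 
killed, so the check at the end is that the close failed and nothing was 
finalized. 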

Does it make sense?

                
> Deadlock between recovery, xceiver and packet responder
> -------------------------------------------------------
>
>                 Key: HDFS-3541
>                 URL: https://issues.apache.org/jira/browse/HDFS-3541
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.23.3, 2.0.1-alpha
>            Reporter: suja s
>            Assignee: Vinay
>         Attachments: DN_dump.rar, HDFS-3541.patch
>
>
> Block recovery was initiated while a write was in progress on the DataNode 
> side. Found a deadlock between recovery, xceiver and packet responder.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
