[ 
https://issues.apache.org/jira/browse/HDFS-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao updated HDFS-3948:
----------------------------

    Attachment: HDFS-3948-regenerate-exception.patch

I also hit this exception in HDFS-3616 and HDFS-4067. After checking the code, 
I suspect it is caused by the following sequence:

1. An FSDataOutputStream instance (out4) is created through 
WebHdfsFileSystem#create, in order to create and write a new file.

2. The request is redirected to a DN, where DFSClient#create is called to 
create the file on the NN via RPC.

3. By this time, the test has called MiniDFSCluster#shutdownNameNode, and in 
NameNode#stop() the FSNamesystem has already been shut down (which closes the 
FSEditLog), but the RPC server has not been stopped yet.

4. The RPC request from the DN reaches the NN, and FSEditLog#logEdit is called 
for the creation. At this point, however, the FSEditLog has already been closed 
and FSEditLog#editLogStream has been set to null.

Therefore, if assertions are enabled, a "bad state: CLOSED" error is 
eventually returned to the client (the case in HDFS-3948); if assertions are 
disabled, an NPE is returned instead because FSEditLog#editLogStream is null, 
as reported in HDFS-3822.
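
To make the ordering in steps 3 and 4 concrete, here is a minimal 
single-threaded sketch of it. This is not the HDFS code: the EditLog class and 
the lateCreate method are hypothetical stand-ins, and a boolean flag replaces 
the JVM's -ea switch so that both outcomes can be shown in one run.

```java
import java.util.ArrayList;
import java.util.List;

class ShutdownOrderSketch {

    /** Hypothetical stand-in for FSEditLog; only models open/closed state. */
    static class EditLog {
        // Stands in for FSEditLog#editLogStream.
        private List<String> editLogStream = new ArrayList<>();
        private boolean closed = false;

        void close() {
            closed = true;
            editLogStream = null; // the stream reference is dropped on close
        }

        void logEdit(String op, boolean assertionsEnabled) {
            // Models the state assertion; the flag replaces -ea (assumption).
            if (assertionsEnabled && closed) {
                throw new AssertionError("bad state: CLOSED");
            }
            editLogStream.add(op); // NPE here when the log is already closed
        }
    }

    /** Replays steps 3 and 4: the log is closed, then a late create arrives. */
    static String lateCreate(boolean assertionsEnabled) {
        EditLog editLog = new EditLog();
        boolean rpcServerRunning = true;

        // Step 3: the namesystem is shut down first, closing the edit log,
        // while the RPC server is still accepting requests.
        editLog.close();

        // Step 4: the create request from the DN is still served.
        try {
            if (rpcServerRunning) {
                editLog.logEdit("create /file", assertionsEnabled);
            }
            return "ok";
        } catch (AssertionError e) {
            return "AssertionError: " + e.getMessage();
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(lateCreate(true));  // AssertionError: bad state: CLOSED
        System.out.println(lateCreate(false)); // NullPointerException
    }
}
```

The two printed results correspond to the two failure modes described above: 
the assertion failure seen in HDFS-3948 and the NPE reported in HDFS-3822.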

The attached patch reproduces the exception.
                
> TestWebHDFS#testNamenodeRestart is racy 
> ----------------------------------------
>
>                 Key: HDFS-3948
>                 URL: https://issues.apache.org/jira/browse/HDFS-3948
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.0.0-alpha
>            Reporter: Eli Collins
>         Attachments: HDFS-3948-regenerate-exception.patch
>
>
> After fixing HDFS-3936 I noticed that TestWebHDFS#testNamenodeRestart fails 
> when run in a loop; on my system it takes about 40 runs. WebHdfsFileSystem#close 
> is racing with the restart, resulting in an add block after the edit log is 
> closed.
