[ https://issues.apache.org/jira/browse/HDFS-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13272985#comment-13272985 ]
Hadoop QA commented on HDFS-3031:
---------------------------------

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12526461/hdfs-3031.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 4 new or modified test files.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 eclipse:eclipse. The patch built with eclipse:eclipse.
+1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
      org.apache.hadoop.hdfs.server.namenode.ha.TestFailureToReadEdits
+1 contrib tests. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2419//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2419//console

This message is automatically generated.

> HA: Error (failed to close file) when uploading large file + kill active NN + manual failover
> ---------------------------------------------------------------------------------------------
>
>                 Key: HDFS-3031
>                 URL: https://issues.apache.org/jira/browse/HDFS-3031
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: ha
>    Affects Versions: 0.24.0
>            Reporter: Stephen Chu
>            Assignee: Todd Lipcon
>         Attachments: hdfs-3031.txt, hdfs-3031.txt, hdfs-3031.txt, styx01_killNNfailover, styx01_uploadLargeFile
>
>
> I executed section 3.4 of Todd's HA test plan.
> https://issues.apache.org/jira/browse/HDFS-1623
> 1. A large file upload is started.
> 2. While the file is being uploaded, the administrator kills the first NN and performs a failover.
> 3. After the file finishes being uploaded, it is verified for correct length and contents.
> For the test, I have a vm_template styx01:/home/schu/centos64-2-5.5.qcow2.
> styx01 hosted the active NN and styx02 hosted the standby NN.
> In the log files I attached, you can see that on styx01 I began the file upload:
> hadoop fs -put centos64-2.5.5.qcow2
> After waiting several seconds, I kill -9'd the active NN on styx01 and manually failed over to the NN on styx02. I ran into the exception below (the rest of the stacktrace is in the attached file styx01_uploadLargeFile).
> 12/02/29 14:12:52 WARN retry.RetryInvocationHandler: A failover has occurred since the start of this method invocation attempt.
> put: Failed on local exception: java.io.EOFException; Host Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: ""styx01.sf.cloudera.com":12020;
> 12/02/29 14:12:52 ERROR hdfs.DFSClient: Failed to close file /user/schu/centos64-2-5.5.qcow2._COPYING_
> java.io.IOException: Failed on local exception: java.io.EOFException; Host Details : local host is: "styx01.sf.cloudera.com/172.29.5.192"; destination host is: ""styx01.sf.cloudera.com":12020;
>         at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1145)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:188)
>         at $Proxy9.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:302)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
>         at $Proxy10.addBlock(Unknown Source)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1097)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:973)
>         at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:455)
> Caused by: java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:375)
>         at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:830)
>         at org.apache.hadoop.ipc.Client$Connection.run(Client.java:762)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
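The stack trace shows the failing addBlock RPC going through a dynamic proxy (`$Proxy10.addBlock`) backed by `org.apache.hadoop.io.retry.RetryInvocationHandler`, which is the layer expected to retry against the new active NN after a failover. The sketch below is a hypothetical, heavily simplified illustration of that pattern (a `java.lang.reflect.Proxy` that catches a connection failure, switches targets, and retries once); the interface name, method, and failover policy are invented for illustration and are not the real Hadoop implementation, which the bug shows did not mask this particular EOFException.

```java
import java.io.EOFException;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

public class FailoverRetrySketch {

    // Hypothetical stand-in for the NN RPC interface; not the real protocol.
    interface NamenodeProtocol {
        String addBlock(String path) throws Exception;
    }

    // Wraps two targets (active, standby) behind one proxy. On an
    // EOFException from the active, fail over to the standby and retry once.
    static NamenodeProtocol retryingProxy(NamenodeProtocol active, NamenodeProtocol standby) {
        InvocationHandler handler = new InvocationHandler() {
            private NamenodeProtocol current = active;

            @Override
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                try {
                    return method.invoke(current, args);
                } catch (InvocationTargetException e) {
                    if (e.getCause() instanceof EOFException) {
                        // Connection to the (killed) active NN died:
                        // fail over and reissue the same call.
                        current = standby;
                        return method.invoke(current, args);
                    }
                    throw e.getCause();
                }
            }
        };
        return (NamenodeProtocol) Proxy.newProxyInstance(
                NamenodeProtocol.class.getClassLoader(),
                new Class<?>[] { NamenodeProtocol.class }, handler);
    }

    public static void main(String[] args) throws Exception {
        // Simulated dead active NN: every call fails like a severed connection.
        NamenodeProtocol deadActive = path -> { throw new EOFException("connection reset"); };
        // Simulated standby that has become active and serves the request.
        NamenodeProtocol standby = path -> "blk_0001@" + path;

        NamenodeProtocol client = retryingProxy(deadActive, standby);
        // The call succeeds transparently after failing over to the standby.
        System.out.println(client.addBlock("/user/schu/centos64-2-5.5.qcow2"));
        // prints blk_0001@/user/schu/centos64-2-5.5.qcow2
    }
}
```

In the reported failure the retry layer logged the "A failover has occurred" warning but still surfaced the EOFException to DFSClient, which is why the put failed to close the file instead of completing against the new active NN.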