[ https://issues.apache.org/jira/browse/HDFS-4723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell updated HDFS-4723:
---------------------------------

    Attachment: 4723-branch-2.patch
    
> Occasional failure in TestDFSClientRetries#testGetFileChecksum because the number of available xcievers is set too low
> ----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-4723
>                 URL: https://issues.apache.org/jira/browse/HDFS-4723
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 3.0.0, 2.0.4-alpha
>            Reporter: Andrew Purtell
>         Attachments: 4723-branch-2.patch
>
>
> Occasional failure in TestDFSClientRetries#testGetFileChecksum because the number of available xcievers is set too low.
> {noformat}
> 2013-04-21 18:48:28,273 WARN  datanode.DataNode (DataXceiverServer.java:run(161)) - 127.0.0.1:37608:DataXceiverServer: java.io.IOException: Xceiver count 3 exceeds the limit of concurrent xcievers: 2
>       at org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:143)
>       at java.lang.Thread.run(Thread.java:662)
> 2013-04-21 18:48:28,274 INFO  datanode.DataNode (DataXceiver.java:writeBlock(453)) - Datanode 2 got response for connect ack  from downstream datanode with firstbadlink as 127.0.0.1:37608
> 2013-04-21 18:48:28,276 INFO  datanode.DataNode (DataXceiver.java:writeBlock(491)) - Datanode 2 forwarding connect ack to upstream firstbadlink is 127.0.0.1:37608
> 2013-04-21 18:48:28,276 ERROR datanode.DataNode (DataXceiver.java:writeBlock(477)) - DataNode{data=FSDataset{dirpath='[/home/ec2-user/jenkins/workspace/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data3/current, /home/ec2-user/jenkins/workspace/hadoop/hadoop-hdfs-project/hadoop-hdfs/target/test/data/dfs/data/data4/current]'}, localName='127.0.0.1:33298', storageID='DS-1506063529-10.174.86.97-33298-1366570107286', xmitsInProgress=0}:Exception transfering block BP-2121022065-10.174.86.97-1366570107029:blk_6876843860808656778_1071 to mirror 127.0.0.1:37608: java.io.EOFException: Premature EOF: no length prefix available
> 2013-04-21 18:48:28,276 INFO  hdfs.DFSClient (DFSOutputStream.java:createBlockOutputStream(1105)) - Exception in createBlockOutputStream
> java.io.IOException: Bad connect ack with firstBadLink as 127.0.0.1:37608
>       at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1096)
>       at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1019)
>       at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:464)
> 2013-04-21 18:48:28,276 INFO  datanode.DataNode (DataXceiver.java:writeBlock(537)) - opWriteBlock BP-2121022065-10.174.86.97-1366570107029:blk_6876843860808656778_1071 received exception java.io.EOFException: Premature EOF: no length prefix available
> 2013-04-21 18:48:28,277 INFO  datanode.DataNode (BlockReceiver.java:receiveBlock(674)) - Exception for BP-2121022065-10.174.86.97-1366570107029:blk_6876843860808656778_1071
> java.io.IOException: Premature EOF from inputStream
>       at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
>       at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:414)
>       at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:644)
>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:506)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:219)
>       at java.lang.Thread.run(Thread.java:662)
> 2013-04-21 18:48:28,277 INFO  hdfs.DFSClient (DFSOutputStream.java:nextBlockOutputStream(1022)) - Abandoning BP-2121022065-10.174.86.97-1366570107029:blk_6876843860808656778_1071
> 2013-04-21 18:48:28,277 ERROR datanode.DataNode (DataXceiver.java:run(223)) - 127.0.0.1:33298:DataXceiver error processing WRITE_BLOCK operation  src: /127.0.0.1:55182 dest: /127.0.0.1:33298
> java.io.EOFException: Premature EOF: no length prefix available
>       at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1340)
>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:448)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:219)
>       at java.lang.Thread.run(Thread.java:662)
> 2013-04-21 18:48:28,277 INFO  datanode.DataNode (BlockReceiver.java:run(950)) - PacketResponder: BP-2121022065-10.174.86.97-1366570107029:blk_6876843860808656778_1071, type=HAS_DOWNSTREAM_IN_PIPELINE
> java.io.EOFException: Premature EOF: no length prefix available
>       at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:1340)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:116)
>       at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:894)
>       at java.lang.Thread.run(Thread.java:662)
> 2013-04-21 18:48:28,278 INFO  datanode.DataNode (BlockReceiver.java:run(962)) - PacketResponder: BP-2121022065-10.174.86.97-1366570107029:blk_6876843860808656778_1071, type=HAS_DOWNSTREAM_IN_PIPELINE: Thread is interrupted.
> 2013-04-21 18:48:28,278 INFO  datanode.DataNode (BlockReceiver.java:run(1043)) - PacketResponder: BP-2121022065-10.174.86.97-1366570107029:blk_6876843860808656778_1071, type=HAS_DOWNSTREAM_IN_PIPELINE terminating
> 2013-04-21 18:48:28,278 INFO  datanode.DataNode (DataXceiver.java:writeBlock(537)) - opWriteBlock BP-2121022065-10.174.86.97-1366570107029:blk_6876843860808656778_1071 received exception java.io.IOException: Premature EOF from inputStream
> 2013-04-21 18:48:28,278 ERROR datanode.DataNode (DataXceiver.java:run(223)) - 127.0.0.1:58102:DataXceiver error processing WRITE_BLOCK operation  src: /127.0.0.1:47124 dest: /127.0.0.1:58102
> java.io.IOException: Premature EOF from inputStream
>       at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:194)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
>       at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:414)
>       at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:644)
>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:506)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:98)
>       at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:65)
>       at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:219)
>       at java.lang.Thread.run(Thread.java:662)
> 2013-04-21 18:48:28,279 INFO  hdfs.DFSClient (DFSOutputStream.java:nextBlockOutputStream(1025)) - Excluding datanode 127.0.0.1:37608
> {noformat}
> As a consequence of this failure one datanode has been excluded, and from that point on there are too few datanodes available to place the required replicas:
> {noformat}
> 2013-04-21 18:48:54,288 WARN  blockmanagement.BlockPlacementPolicy (BlockPlacementPolicyDefault.java:chooseTarget(232)) - Not able to place enough replicas, still in need of 1 to reach 3
> ...
> {noformat}
> and the test eventually times out.
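> The attached 4723-branch-2.patch is not inlined in this message, but the summary points at the shape of the fix: give the test's DataNodes enough transfer threads ("xceivers") that a three-node write pipeline is not rejected. Below is a minimal sketch of that idea, assuming the test drives a MiniDFSCluster and using branch-2's DFSConfigKeys constant for dfs.datanode.max.transfer.threads (deprecated alias: dfs.datanode.max.xcievers). The class name and the value 4 are illustrative assumptions, not taken from the patch:
> {code:java}
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hdfs.DFSConfigKeys;
> import org.apache.hadoop.hdfs.HdfsConfiguration;
> import org.apache.hadoop.hdfs.MiniDFSCluster;
> 
> // Hypothetical harness, not the attached patch.
> public class XceiverLimitSketch {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = new HdfsConfiguration();
>     // "dfs.datanode.max.transfer.threads": the log above shows a limit of 2
>     // being exceeded by a third concurrent xceiver, so any limit >= 3 avoids
>     // that particular rejection. The value 4 is an illustrative choice.
>     conf.setInt(DFSConfigKeys.DFS_DATANODE_MAX_RECEIVER_THREADS_KEY, 4);
>     MiniDFSCluster cluster = new MiniDFSCluster.Builder(conf)
>         .numDataNodes(3)
>         .build();
>     try {
>       cluster.waitActive();
>       // ... exercise getFileChecksum() and the retry path against
>       // cluster.getFileSystem() here ...
>     } finally {
>       cluster.shutdown();
>     }
>   }
> }
> {code}
> The value the patch actually sets should be taken from the attachment itself.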

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira