[ 
https://issues.apache.org/jira/browse/HDFS-15837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17286111#comment-17286111
 ] 

Kihwal Lee commented on HDFS-15837:
-----------------------------------

The successful pipeline update with 3072 bytes was at 03:42:40.  The 
replication of a 3554 byte block and the corruption detection came about 4 minutes later. 
The client must have written more after the recovery and then closed the file. 
Replication happens only after the file is closed.  It looks like the file was 
closed at around 03:46:40.

I don't think the size difference observed on the datanodes is an anomaly. Clients 
usually write more data after a pipeline recovery. Does the file still exist? 
What size do you see when you run "hadoop fs -ls /path_to_the_file"?  Is the 
block appearing as missing on the UI? 
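
For example (the path below is just a placeholder for the actual file), something 
like this would show the NN's view of the file size and whether any blocks are 
reported missing or corrupt:
{noformat}
hadoop fs -ls /path_to_the_file
hdfs fsck /path_to_the_file -files -blocks -locations
{noformat}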

If there was a corruption issue, I don't think it was because of the size.  The 
fact that the replication happened means the NN agreed with the size 
3554. If the datanode had an unexpected replica size, the transfer wouldn't 
have happened.  The corruption report came from the destination node for the 
replication, 172.21.234.181.  The replication source does not perform any 
checks, and there is no feedback mechanism for reporting errors back to the 
source node. So even if corruption was detected during the replication, only 
the target (172.21.234.181) would notice it.  Check the log of 172.21.234.181 
at around 03:46:43. It must have detected the corruption and logged it.
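
For example, on 172.21.234.181 something along these lines should surface the 
relevant entries (the datanode log location is just an assumption; it varies by 
install):
{noformat}
grep blk_1342440165 /var/log/hadoop-hdfs/hadoop-hdfs-datanode-*.log | grep -i -e corrupt -e checksum
{noformat}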

The replica on the source (172.21.226.26) must be the only copy, and if it is 
corrupt, the block will show up as missing. If the file still exists, you can get to 
the node and run the "hdfs debug verifyMeta" command against the block file and the 
meta file to manually check their integrity.
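
A rough sketch, using the block file path seen in the logs above; the meta file 
name (generation stamp suffix) is an assumption, and the replica will likely have 
moved out of the rbw/ directory into a finalized/ subdirectory once the file was 
closed, so locate the actual files on the node first:
{noformat}
find /mnt/hdfs/current -name 'blk_1342440165*'
hdfs debug verifyMeta -block /mnt/hdfs/current/.../blk_1342440165 -meta /mnt/hdfs/current/.../blk_1342440165_272645069.meta
{noformat}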

If it is really corrupt, it could be due to a local issue on 172.21.226.26.  
Extreme load can cause write failures and pipeline updates, but we haven't seen 
such a condition cause data corruption.

> Incorrect bytes causing block corruption after update pipeline and recovery 
> failure
> -----------------------------------------------------------------------------------
>
>                 Key: HDFS-15837
>                 URL: https://issues.apache.org/jira/browse/HDFS-15837
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode, hdfs
>    Affects Versions: 2.8.5
>            Reporter: Udit Mehrotra
>            Priority: Major
>
> We are seeing cases of HDFS blocks being marked as *bad* after the initial 
> block receive fails during *update pipeline* and the subsequent *HDFS* *recovery* 
> for the block fails as well. Here is the life cycle of a block 
> *{{blk_1342440165_272630578}}* that was ultimately marked as corrupt:
> 1. The block creation starts at the name node as part of the *update pipeline* 
> process:
> {noformat}
> 2021-01-25 03:41:17,335 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem (IPC Server handler 61 on 
> 8020): updatePipeline(blk_1342440165_272500939 => blk_1342440165_272630578) 
> success{noformat}
> 2. The block receiver on the data node fails with a socket timeout exception, 
> and so do the retries:
> {noformat}
> 2021-01-25 03:42:22,525 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (PacketResponder: 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578, 
> type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=1:[172.21.246.239:50010]): 
> PacketResponder: 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578, 
> type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=1:[172.21.246.239:50010]
> java.net.SocketTimeoutException: 65000 millis timeout while waiting for 
> channel to be ready for read. ch : java.nio.channels.SocketChannel[connected 
> local=/172.21.226.26:56294 remote=/172.21.246.239:50010]
>     at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
>     at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
>     at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
>     at 
> org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
>     at java.io.FilterInputStream.read(FilterInputStream.java:83)
>     at java.io.FilterInputStream.read(FilterInputStream.java:83)
>     at 
> org.apache.hadoop.hdfs.protocolPB.PBHelperClient.vintPrefixed(PBHelperClient.java:400)
>     at 
> org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:213)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1305)
>     at java.lang.Thread.run(Thread.java:748)
> 2021-01-25 03:42:22,526 WARN org.apache.hadoop.hdfs.server.datanode.DataNode 
> (PacketResponder: 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578, 
> type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=1:[172.21.246.239:50010]): 
> IOException in BlockReceiver.run(): 
> java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>     at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:468)
>     at 
> org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
>     at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>     at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
>     at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
>     at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>     at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>     at java.io.DataOutputStream.flush(DataOutputStream.java:123)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstreamUnprotected(BlockReceiver.java:1552)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstream(BlockReceiver.java:1489)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1402)
>     at java.lang.Thread.run(Thread.java:748)
> 2021-01-25 03:42:22,526 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (PacketResponder: 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578, 
> type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=1:[172.21.246.239:50010]): 
> PacketResponder: 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578, 
> type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=1:[172.21.246.239:50010]
> java.io.IOException: Connection reset by peer
>     at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>     at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>     at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>     at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>     at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:468)
>     at 
> org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
>     at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>     at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
>     at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
>     at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>     at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140)
>     at java.io.DataOutputStream.flush(DataOutputStream.java:123)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstreamUnprotected(BlockReceiver.java:1552)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.sendAckUpstream(BlockReceiver.java:1489)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1402)
>     at java.lang.Thread.run(Thread.java:748){noformat}
> 3. The data node terminates the receiving of the block and initiates recovery:
> {noformat}
> 2021-01-25 03:42:22,526 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (PacketResponder: 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578, 
> type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=1:[172.21.246.239:50010]): 
> PacketResponder: 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578, 
> type=HAS_DOWNSTREAM_IN_PIPELINE, downstreams=1:[172.21.246.239:50010] 
> terminating
> 2021-01-25 03:42:22,529 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (DataXceiver for client DFSClient_NONMAPREDUCE_390070548_42 at 
> /172.21.240.31:60430 [Receiving block 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578]): 
> Receiving BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578 
> src: /172.21.240.31:60430 dest: /172.21.226.26:50010
> 2021-01-25 03:42:22,529 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
> (DataXceiver for client DFSClient_NONMAPREDUCE_390070548_42 at 
> /172.21.240.31:60430 [Receiving block 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578]): Recover 
> RBW replica 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578{noformat}
> 4. The recovery on the datanode for *{{blk_1342440165_272630578}}* fails with:
> {noformat}
> 2021-01-25 03:42:34,223 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (DataXceiver for client DFSClient_NONMAPREDUCE_390070548_42 at 
> /172.21.240.31:45918 [Receiving block 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272500939]): 
> Exception for 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578
> java.nio.channels.ClosedByInterruptException
>     at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>     at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:269)
>     at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.adjustCrcChannelPosition(FsDatasetImpl.java:1716)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.adjustCrcFilePosition(BlockReceiver.java:1052)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:733)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:923)
>     at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:854)
>     at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:166)
>     at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:103)
>     at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:288)
>     at java.lang.Thread.run(Thread.java:748)
> 2021-01-25 03:42:34,223 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (DataXceiver for client DFSClient_NONMAPREDUCE_390070548_42 at 
> /172.21.240.31:45918 [Receiving block 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272500939]): 
> opWriteBlock 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578 received 
> exception java.nio.channels.ClosedByInterruptException
> 2021-01-25 03:42:34,223 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode 
> (DataXceiver for client DFSClient_NONMAPREDUCE_390070548_42 at 
> /172.21.240.31:45918 [Receiving block 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272500939]): 
> ip-172-21-226-26.ec2.internal:50010:DataXceiver error processing WRITE_BLOCK 
> operation  src: /172.21.240.31:45918 dst: /172.21.226.26:50010
> java.nio.channels.ClosedByInterruptException
>     at 
> java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:202)
>     at sun.nio.ch.FileChannelImpl.position(FileChannelImpl.java:269)
>     at 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.adjustCrcChannelPosition(FsDatasetImpl.java:1716)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.adjustCrcFilePosition(BlockReceiver.java:1052)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:733)
>     at 
> org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:923)
>     at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:854)
>     at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:166)
>     at 
> org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:103)
>     at 
> org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:288)
>     at java.lang.Thread.run(Thread.java:748){noformat}
>  
> 5. This is followed by the resetting of bytes on the data node to {{3072}}, 
> the correct length, which is expected behavior based on 
> https://issues.apache.org/jira/browse/hdfs-11472:
> {noformat}
> 2021-01-25 03:42:40,007 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
> (DataXceiver for client DFSClient_NONMAPREDUCE_390070548_42 at 
> /172.21.240.31:60430 [Receiving block 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578]): At 
> ip-172-21-226-26.ec2.internal:50010, Recovering ReplicaBeingWritten, 
> blk_1342440165_272630578, RBW
>   getNumBytes()     = 3072
>   getBytesOnDisk()  = 2678
>   getVisibleLength()= 3072
>   getVolume()       = /mnt/hdfs/current
>   getBlockFile()    = 
> /mnt/hdfs/current/BP-908477295-172.21.224.178-1606768078949/current/rbw/blk_1342440165
>   bytesAcked=3072
>   bytesOnDisk=2678
> 2021-01-25 03:42:40,011 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (DataXceiver for client 
> DFSClient_attempt_16089610225137_1287341_r_000139_0_392781198_30 at 
> /172.21.244.117:34354 [Receiving block 
> BP-908477295-172.21.224.178-1606768078949:blk_1342762434_272648220]): 
> Receiving BP-908477295-172.21.224.178-1606768078949:blk_1342762434_272648220 
> src: /172.21.244.117:34354 dest: /172.21.226.26:50010
> 2021-01-25 03:42:40,013 INFO 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
> (DataXceiver for client DFSClient_NONMAPREDUCE_390070548_42 at 
> /172.21.240.31:60430 [Receiving block 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272630578]): 
> Resetting bytesOnDisk to match blockDataLength (=3072) for replica 
> ReplicaBeingWritten, blk_1342440165_272630578, RBW
>   getNumBytes()     = 3072
>   getBytesOnDisk()  = 2678
>   getVisibleLength()= 3072
>   getVolume()       = /mnt/hdfs/current
>   getBlockFile()    = 
> /mnt/hdfs/current/BP-908477295-172.21.224.178-1606768078949/current/rbw/blk_1342440165
>   bytesAcked=3072
>   bytesOnDisk=2678{noformat}
> 6. Since recovery failed, the Name Node restarts the {{updatePipeline}} for 
> the block by increasing the generation stamp:
> {noformat}
> 2021-01-25 03:42:40,034 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem (IPC Server handler 15 on 
> 8020): updatePipeline(blk_1342440165_272630578, newGS=272645069, 
> newLength=3072, newNodes=[172.21.226.26:50010], 
> client=DFSClient_NONMAPREDUCE_390070548_42)
> 2021-01-25 03:42:40,034 INFO 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem (IPC Server handler 15 on 
> 8020): updatePipeline(blk_1342440165_272630578 => blk_1342440165_272645069) 
> success{noformat}
> Notice how the correct *{{newLength}}*, i.e. *{{3072}}*, is sent in the above 
> log from the Name Node.
> 7. The data node then receives the new generation of this block correctly:
> {noformat}
> 2021-01-25 03:46:40,066 INFO 
> org.apache.hadoop.hdfs.server.datanode.DataNode.clienttrace (PacketResponder: 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272645069, 
> type=LAST_IN_PIPELINE): src: /172.21.240.31:60430, dest: 
> /172.21.226.26:50010, bytes: 3554, op: HDFS_WRITE, cliID: 
> DFSClient_NONMAPREDUCE_390070548_42, offset: 0, srvID: 
> 1a2ff556-73f5-4e10-b018-a0c2105b242e, blockid: 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272645069, duration: 
> 240052941459
> 2021-01-25 03:46:40,066 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (PacketResponder: 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272645069, 
> type=LAST_IN_PIPELINE): PacketResponder: 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272645069, 
> type=LAST_IN_PIPELINE terminating{noformat}
> But here notice how bytes is *3554*, which is different from what was sent 
> from the Name Node, i.e. *3072*.
> 8. The data node then successfully replicates this block downstream to 
> another node:
> {noformat}
> 2021-01-25 03:46:43,093 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (BP-908477295-172.21.224.178-1606768078949 heartbeating to 
> /172.21.238.225:8020): DatanodeRegistration(172.21.226.26:50010, 
> datanodeUuid=1a2ff556-73f5-4e10-b018-a0c2105b242e, infoPort=50075, 
> infoSecurePort=0, ipcPort=50020, 
> storageInfo=lv=-57;cid=CID-7608840c-3634-4377-9ba0-1c710e6d08d5;nsid=2011779031;c=1606768078949)
>  Starting thread to transfer 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272645069 to 
> 172.21.234.181:50010 
> 2021-01-25 03:46:43,094 INFO org.apache.hadoop.hdfs.server.datanode.DataNode 
> (org.apache.hadoop.hdfs.server.datanode.DataNode$DataTransfer@5cb29d58): 
> DataTransfer, at ip-172-21-226-26.ec2.internal:50010: Transmitted 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272645069 
> (numBytes=3554) to /172.21.234.181:5001{noformat}
> 9. However, immediately after the successful transmission and replication, 
> the name node declares the block as corrupt:
> {noformat}
> 2021-01-25 03:46:43,106 INFO org.apache.hadoop.hdfs.StateChange (IPC Server 
> handler 46 on 8020): *DIR* reportBadBlocks for block: 
> BP-908477295-172.21.224.178-1606768078949:blk_1342440165_272645069 on 
> datanode: 172.21.226.26:50010{noformat}
>  
> Any suggestions on what could cause this difference between the bytes 
> communicated by the name node (3072) vs. what is written on the data node 
> (3554)?
>  


