[ https://issues.apache.org/jira/browse/HDFS-17176?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ke Han updated HDFS-17176:
--------------------------
Description:
When upgrading from 3.2.4 to 3.3.6, I encountered the following error while shutting down the old-version cluster.
{code:java}
2023-09-01 21:40:50,765 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Exception for BP-1860374779-192.168.127.2-1693604406482:blk_-9223372036854775792_1002
java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:212)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:211)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:528)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:971)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:908)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:292)
        at java.lang.Thread.run(Thread.java:750)
2023-09-01 21:40:50,769 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1860374779-192.168.127.2-1693604406482:blk_-9223372036854775792_1002, type=LAST_IN_PIPELINE: Thread is interrupted.
2023-09-01 21:40:50,769 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: PacketResponder: BP-1860374779-192.168.127.2-1693604406482:blk_-9223372036854775792_1002, type=LAST_IN_PIPELINE terminating
2023-09-01 21:40:50,770 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: opWriteBlock BP-1860374779-192.168.127.2-1693604406482:blk_-9223372036854775792_1002 received exception java.io.IOException: Premature EOF from inputStream
2023-09-01 21:40:50,783 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: f3e74e86e41c:9866:DataXceiver error processing WRITE_BLOCK operation src: /192.168.127.2:58644 dst: /192.168.127.4:9866
java.io.IOException: Premature EOF from inputStream
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:212)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:211)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
        at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:528)
        at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:971)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:908)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:173)
        at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:107)
        at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:292)
        at java.lang.Thread.run(Thread.java:750)
{code}
This error can be deterministically reproduced by executing the following command sequence:
1) Start up a 4-node 3.2.4 HDFS cluster: 1 NN, 1 SNN, 2 DN.
2) Execute the following commands:
{code:java}
dfs -mkdir /lSFVKIFi
dfs -put -f /tmp/hdfs/xbutTMQg/GbkVxPvqoc /lSFVKIFi/
dfs -mkdir /lSFVKIFi/PWXVE
ec -enablePolicy -policy XOR-2-1-1024k
dfsadmin -refreshNodes
dfs -mv /lSFVKIFi/GbkVxPvqoc /lSFVKIFi/PWXVE
dfsadmin -clrQuota /lSFVKIFi/
dfs -expunge -immediate
ec -setPolicy -path /lSFVKIFi/ -policy XOR-2-1-1024k
dfs -put -f -d /tmp/hdfs/LNSEzfJm/z /lSFVKIFi/PWXVE
{code}
3) Shut down the cluster.
4) Start up the new-version cluster (this step is not strictly necessary).
The error appears on one of the DataNodes. I have attached my configuration file, data folder, and the error logs.

> IOException: Premature EOF from inputStream when shutting down 3.2.4 cluster
> ----------------------------------------------------------------------------
>
>                 Key: HDFS-17176
>                 URL: https://issues.apache.org/jira/browse/HDFS-17176
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ke Han
>            Priority: Major
>         Attachments: hdfs-site.xml, hdfs.tar.gz, persistent.tar-2.gz
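For convenience, the reproduction sequence from the description can be wrapped in a small shell script. This is a sketch under the assumption that the listed commands are `hdfs dfs` / `hdfs ec` / `hdfs dfsadmin` subcommands run from a client node of the 3.2.4 cluster; the paths and policy name are taken verbatim from the report, and the sample input files are assumed to exist.

```shell
#!/usr/bin/env sh
# Replays the reported reproduction sequence (steps 1-2 of the description).
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to actually
# execute them against a running 3.2.4 cluster with the `hdfs` CLI on PATH.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "hdfs $*"
  else
    hdfs "$@"
  fi
}

run dfs -mkdir /lSFVKIFi
run dfs -put -f /tmp/hdfs/xbutTMQg/GbkVxPvqoc /lSFVKIFi/
run dfs -mkdir /lSFVKIFi/PWXVE
run ec -enablePolicy -policy XOR-2-1-1024k
run dfsadmin -refreshNodes
run dfs -mv /lSFVKIFi/GbkVxPvqoc /lSFVKIFi/PWXVE
run dfsadmin -clrQuota /lSFVKIFi/
run dfs -expunge -immediate
run ec -setPolicy -path /lSFVKIFi/ -policy XOR-2-1-1024k
run dfs -put -f -d /tmp/hdfs/LNSEzfJm/z /lSFVKIFi/PWXVE
```

After the last `put`, shutting down the cluster (step 3) should surface the Premature EOF error on one of the DataNodes.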
--
This message was sent by Atlassian Jira
(v8.20.10#820010)