[ https://issues.apache.org/jira/browse/HDFS-9874?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183159#comment-15183159 ]
Daryn Sharp commented on HDFS-9874:
-----------------------------------

The synchronization on {{FsDatasetImpl#stopAllDataxceiverThreads}} is a bit concerning. Stopping xceiver threads uses a default timeout of one minute. That's a long time for the DN to block if threads don't exit immediately.

The iteration of replicas might not be safe. The correct locking model isn't immediately clear, but {{ReplicaMap#replicas}} has the following comment, which other code doesn't appear to follow:
{noformat}
  /**
   * Get a collection of the replicas for given block pool
   * This method is <b>not synchronized</b>. It needs to be synchronized
   * externally using the mutex, both for getting the replicas
   * values from the map and iterating over it. Mutex can be accessed using
   * {@link #getMutext()} method.
{noformat}
Might need to consider forcibly decrementing the ref and interrupting with no timeout.

For the test, I'd assert the volume actually has a non-zero ref count before trying to interrupt. Instead of triggering an async check and sleeping, which inevitably creates flaky race conditions, the disk check should be invoked non-async. Should verify that the client stream fails after the volume is failed.

> Long living DataXceiver threads cause volume shutdown to block.
> ---------------------------------------------------------------
>
>                 Key: HDFS-9874
>                 URL: https://issues.apache.org/jira/browse/HDFS-9874
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.7.0
>            Reporter: Rushabh S Shah
>            Assignee: Rushabh S Shah
>            Priority: Critical
>         Attachments: HDFS-9874-trunk.patch
>
>
> One of the failed volume shutdowns took 3 days to complete.
> Below are the relevant datanode logs while shutting down a volume (due to disk failure):
> {noformat}
> 2016-02-21 10:12:55,333 [Thread-49277] WARN impl.FsDatasetImpl: Removing failed volume volumeA/current:
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: volumeA/current/BP-1788428031-nnIp-1351700107344/current/finalized
>         at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:194)
>         at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
>         at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:108)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.checkDirs(BlockPoolSlice.java:308)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.checkDirs(FsVolumeImpl.java:786)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.checkDirs(FsVolumeList.java:242)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.checkDataDir(FsDatasetImpl.java:2011)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.checkDiskError(DataNode.java:3145)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.access$800(DataNode.java:243)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode$7.run(DataNode.java:3178)
>         at java.lang.Thread.run(Thread.java:745)
> 2016-02-21 10:12:55,334 [Thread-49277] INFO datanode.BlockScanner: Removing scanner for volume volumeA (StorageID DS-cd2ea223-bab3-4361-a567-5f3f27a5dd23)
> 2016-02-21 10:12:55,334 [VolumeScannerThread(volumeA)] INFO datanode.VolumeScanner: VolumeScanner(volumeA, DS-cd2ea223-bab3-4361-a567-5f3f27a5dd23) exiting.
> 2016-02-21 10:12:55,335 [VolumeScannerThread(volumeA)] WARN datanode.VolumeScanner: VolumeScanner(volumeA, DS-cd2ea223-bab3-4361-a567-5f3f27a5dd23): error saving org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl$BlockIteratorImpl@4169ad8b.
> java.io.FileNotFoundException: volumeA/current/BP-1788428031-nnIp-1351700107344/scanner.cursor.tmp (Read-only file system)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl$BlockIteratorImpl.save(FsVolumeImpl.java:669)
>         at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.saveBlockIterator(VolumeScanner.java:314)
>         at org.apache.hadoop.hdfs.server.datanode.VolumeScanner.run(VolumeScanner.java:633)
> 2016-02-24 16:05:53,285 [Thread-49277] WARN impl.FsDatasetImpl: Failed to delete old dfsUsed file in volumeA/current/BP-1788428031-nnIp-1351700107344/current
> 2016-02-24 16:05:53,286 [Thread-49277] WARN impl.FsDatasetImpl: Failed to write dfsUsed to volumeA/current/BP-1788428031-nnIp-1351700107344/current/dfsUsed
> java.io.FileNotFoundException: volumeA/current/BP-1788428031-nnIp-1351700107344/current/dfsUsed (Read-only file system)
>         at java.io.FileOutputStream.open(Native Method)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
>         at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.saveDfsUsed(BlockPoolSlice.java:247)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.shutdown(BlockPoolSlice.java:698)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.shutdown(FsVolumeImpl.java:815)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.removeVolume(FsVolumeList.java:328)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.checkDirs(FsVolumeList.java:250)
>         at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.checkDataDir(FsDatasetImpl.java:2011)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.checkDiskError(DataNode.java:3145)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.access$800(DataNode.java:243)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode$7.run(DataNode.java:3178)
>         at java.lang.Thread.run(Thread.java:745)
> 2016-02-24 16:05:53,286 [Thread-49277] INFO impl.FsDatasetImpl: Removed volume: volumeA/current
> 2016-02-24 16:05:53,286 [Thread-49277] WARN impl.FsDatasetImpl: Completed checkDirs. Found 1 failure volumes.
> 2016-02-24 16:05:53,286 [Thread-49277] INFO datanode.DataNode: Deactivating volumes (clear failure=false): volumeA
> 2016-02-24 16:05:53,286 [Thread-49277] INFO impl.FsDatasetImpl: Removing volumeA from FsDataset.
> 2016-02-24 16:05:53,342 [Thread-49277] INFO common.Storage: Removing block level storage: volumeA/current/BP-1788428031-nnIp-1351700107344
> 2016-02-24 16:05:53,345 [Thread-49277] WARN datanode.DataNode: DataNode.handleDiskError: Keep Running: true
> {noformat}
> The datanode waits for the reference count to become zero before shutting down the volume.
> {code:title=FsVolumeImpl.java|borderStyle=solid}
> while (this.reference.getReferenceCount() > 0) {
>   if (FsDatasetImpl.LOG.isDebugEnabled()) {
>     FsDatasetImpl.LOG.debug(String.format(
>         "The reference count for %s is %d, wait to be 0.",
>         this, reference.getReferenceCount()));
>   }
>   try {
>     Thread.sleep(SLEEP_MILLIS);
>   } catch (InterruptedException e) {
>     throw new IOException(e);
>   }
> }
> {code}
> Just before the datanode logged the following line,
> {noformat}
> 2016-02-24 16:05:53,285 [Thread-49277] WARN impl.FsDatasetImpl: Failed to delete old dfsUsed file in volumeA/current/BP-1788428031-nnIp-1351700107344/current
> {noformat}
> I saw the following stack trace:
> {noformat}
> 2016-02-24 16:05:53,017 [DataXceiver for client DFSClient_NONMAPREDUCE_1651663681_1 at /upStreamDNIp:55710 [Receiving block BP-1788428031-nnIp-1351700107344:blk_7059462432_1109832821906]] INFO datanode.DataNode: Exception for BP-1788428031-nnIp-1351700107344:blk_7059462432_1109832909736
> java.io.IOException: Premature EOF from inputStream
>         at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:895)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:801)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
>         at java.lang.Thread.run(Thread.java:745)
> 2016-02-24 16:05:53,018 [PacketResponder: BP-1788428031-nnIp-1351700107344:blk_7059462432_1109832909736, type=LAST_IN_PIPELINE, downstreams=0:[]] INFO datanode.DataNode: PacketResponder: BP-1788428031-nnIp-1351700107344:blk_7059462432_1109832909736, type=LAST_IN_PIPELINE, downstreams=0:[]: Thread is interrupted.
> 2016-02-24 16:05:53,018 [PacketResponder: BP-1788428031-nnIp-1351700107344:blk_7059462432_1109832909736, type=LAST_IN_PIPELINE, downstreams=0:[]] INFO datanode.DataNode: PacketResponder: BP-1788428031-nnIp-1351700107344:blk_7059462432_1109832909736, type=LAST_IN_PIPELINE, downstreams=0:[] terminating
> 2016-02-24 16:05:53,018 [DataXceiver for client DFSClient_NONMAPREDUCE_1651663681_1 at /upStreamDNIp:55710 [Receiving block BP-1788428031-nnIp-1351700107344:blk_7059462432_1109832821906]] INFO datanode.DataNode: opWriteBlock BP-1788428031-nnIp-1351700107344:blk_7059462432_1109832909736 received exception java.io.IOException: Premature EOF from inputStream
> 2016-02-24 16:05:53,018 [DataXceiver for client DFSClient_NONMAPREDUCE_1651663681_1 at /upStreamDNIp:55710 [Receiving block BP-1788428031-nnIp-1351700107344:blk_7059462432_1109832821906]] ERROR datanode.DataNode: thisDNName:1004:DataXceiver error processing WRITE_BLOCK operation src: /upStreamDNIp:55710 dst: /thisDNIp:1004
> java.io.IOException: Premature EOF from inputStream
>         at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
>         at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:895)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:801)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
>         at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
>         at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> On tracking the block blk_7059462432_1109832821906, I see that the block was created on 2016-02-17 15:06:28,256:
> {noformat}
> 2016-02-17 15:06:28,928 [DataXceiver for client DFSClient_NONMAPREDUCE_1651663681_1 at /upStreamDNIp:55590 [Receiving block BP-1788428031-nnIp-1351700107344:blk_7059462432_1109832821906]] INFO datanode.DataNode: Receiving BP-1788428031-nnIp-1351700107344:blk_7059462432_1109832821906 src: /upStreamDNIp:55590 dest: /thisDNIp:1004
> {noformat}
> The job which created this file had been running for more than 7 days, and the client eventually failed to renew its delegation token, so the lease manager failed to renew the lease for this file.
> Once commitBlockSynchronization kicked in, it tried to recover the block, and eventually the DataXceiver thread died, decrementing the reference count.
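To make the locking and interruption points in the comment above concrete, here is a rough, self-contained sketch: iterate the replica map only while holding its mutex, and stop writer threads by interrupting them and forcibly releasing their volume reference rather than polling for up to the one-minute xceiver-stop timeout. This is illustrative code only, not the attached patch and not real Hadoop classes; every name in it ({{ReplicaMapSketch}}, {{stopWriters}}, {{volumeRefCount}}) is invented for the example.
{code:java}
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Standalone illustration only; none of these types exist in Hadoop.
public class VolumeShutdownSketch {

  // Stand-in for a replica map whose javadoc requires external
  // synchronization: callers hold the mutex for both reads and iteration,
  // as the quoted ReplicaMap comment asks.
  static class ReplicaMapSketch {
    private final Object mutex = new Object();
    private final Map<Long, String> replicas = new HashMap<>();

    void add(long blockId, String volume) {
      synchronized (mutex) {
        replicas.put(blockId, volume);
      }
    }

    // Takes a snapshot while holding the mutex so iteration is safe even if
    // other threads mutate the map concurrently.
    List<Long> replicasOnVolume(String volume) {
      List<Long> result = new ArrayList<>();
      synchronized (mutex) {
        for (Map.Entry<Long, String> e : replicas.entrySet()) {
          if (e.getValue().equals(volume)) {
            result.add(e.getKey());
          }
        }
      }
      return result;
    }
  }

  // Stand-in for the per-volume reference count that the FsVolumeImpl loop
  // quoted above polls before allowing shutdown to proceed.
  static final AtomicInteger volumeRefCount = new AtomicInteger(0);

  // Stop writer threads by interrupting them and forcibly dropping their
  // volume references, instead of joining each one with a long timeout.
  static void stopWriters(List<Thread> writers) {
    for (Thread t : writers) {
      t.interrupt();
      volumeRefCount.decrementAndGet();
    }
  }

  public static void main(String[] args) throws InterruptedException {
    ReplicaMapSketch map = new ReplicaMapSketch();
    map.add(7059462432L, "volumeA");
    volumeRefCount.incrementAndGet();

    // Simulates a long-living DataXceiver that would otherwise hold the
    // volume reference indefinitely.
    Thread writer = new Thread(() -> {
      try {
        Thread.sleep(Long.MAX_VALUE);
      } catch (InterruptedException ignored) {
        // A real xceiver would close its streams and release the volume here.
      }
    });
    writer.start();

    System.out.println("replicas on volumeA: " + map.replicasOnVolume("volumeA"));
    stopWriters(Arrays.asList(writer));
    writer.join();
    System.out.println("volume ref count after stop: " + volumeRefCount.get());
  }
}
{code}
Presumably the same shape would apply inside {{FsDatasetImpl#stopAllDataxceiverThreads}}: snapshot the affected replicas under the {{ReplicaMap}} mutex, then interrupt the xceivers without a bounded join so volume removal is not held up by the timeout.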