[jira] [Commented] (HDFS-7492) If multiple threads call FsVolumeList#checkDirs at the same time, we should only do checkDirs once and give the results to all waiting threads
[ https://issues.apache.org/jira/browse/HDFS-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14803419#comment-14803419 ]

Colin Patrick McCabe commented on HDFS-7492:
--------------------------------------------

[~eclark], also check HDFS-8845 for another improvement in this area.

> If multiple threads call FsVolumeList#checkDirs at the same time, we should
> only do checkDirs once and give the results to all waiting threads
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-7492
>                 URL: https://issues.apache.org/jira/browse/HDFS-7492
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>            Reporter: Colin Patrick McCabe
>            Assignee: Elliott Clark
>            Priority: Minor
>
> checkDirs is called when we encounter certain I/O errors. It's rare to get
> just a single I/O error... normally you start getting many errors when a disk
> is going bad. For this reason, we shouldn't start a new checkDirs scan for
> each error. Instead, if multiple threads call FsVolumeList#checkDirs at
> around the same time, we should only do checkDirs once and give the results
> to all the waiting threads.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
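The coalescing the description asks for is a single-flight pattern: the first caller runs the scan, and callers that arrive while it is in flight wait and share its result. A minimal dependency-free sketch of that idea (class and method names are illustrative, not the HDFS implementation; the real patch works on FsVolumeList's volume set):

```java
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: coalesce concurrent checkDirs calls so only one thread scans
// at a time while the others wait for, and reuse, its result.
class CheckDirsCoalescer {
    private boolean checkInProgress = false;  // a scan is currently running
    private long completedScans = 0;          // generation counter
    private List<String> lastResult = Collections.emptyList();
    private final AtomicInteger scans = new AtomicInteger();

    // Stand-in for the expensive per-volume directory scan.
    private List<String> doScan() {
        scans.incrementAndGet();
        return Collections.emptyList();       // pretend no volumes failed
    }

    public List<String> checkDirs() throws InterruptedException {
        synchronized (this) {
            long myGeneration = completedScans;
            while (checkInProgress) {
                wait();                        // releases the monitor
            }
            if (completedScans > myGeneration) {
                return lastResult;             // another thread's scan covered us
            }
            checkInProgress = true;            // we become the scanning thread
        }
        boolean ok = false;
        List<String> result = null;
        try {
            result = doScan();                 // expensive I/O, no lock held
            ok = true;
        } finally {
            synchronized (this) {
                if (ok) {
                    lastResult = result;
                    completedScans++;
                }
                checkInProgress = false;
                notifyAll();                   // waiters pick up the result
            }
        }
        return result;
    }

    public int scanCount() { return scans.get(); }
}
```

Note the scan itself runs outside the monitor, which is exactly the property the production thread dump below lacks: there, checkDirs holds the FsVolumeList lock for the duration of the disk walk, so every xceiver that needs a volume blocks behind it.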
[jira] [Commented] (HDFS-7492) If multiple threads call FsVolumeList#checkDirs at the same time, we should only do checkDirs once and give the results to all waiting threads
[ https://issues.apache.org/jira/browse/HDFS-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14746491#comment-14746491 ]

Elliott Clark commented on HDFS-7492:
--------------------------------------

I'm going to grab this one. We're seeing this in production.

There's an unrelated issue with one datanode locking up (still heartbeating to the NN but unable to make progress on anything that hits disk). All datanodes talking to the bad node throw a bunch of IOExceptions, so a significant portion of the cluster runs checkDiskError while the network issue is going on. FsDatasetImpl.checkDirs holds a lock, so all new xceivers are blocked by the checkDiskError. This causes more timeouts and basically serializes all block reads and writes until everything on the cluster settles down.

{code}
"DataXceiver for client unix:/mnt/d2/hdfs-socket/dn.50010 [Passing file descriptors for block BP-1735829752-10.210.49.21-1437433901380:blk_1121816087_48310306]" #85474 daemon prio=5 os_prio=0 tid=0x7f10910b2800 nid=0x5d44f waiting for monitor entry [0x7f1072c06000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockFileNoExistsCheck(FsDatasetImpl.java:606)
	- waiting to lock <0x0007015a3fe8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.getBlockInputStream(FsDatasetImpl.java:618)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.requestShortCircuitFdsForRead(DataNode.java:1524)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.requestShortCircuitFds(DataXceiver.java:287)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opRequestShortCircuitFds(Receiver.java:185)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:89)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
	at java.lang.Thread.run(Thread.java:745)

"DataXceiver for client DFSClient_NONMAPREDUCE_-1067692187_1 at /10.210.65.21:33560 [Receiving block BP-1735829752-10.210.49.21-1437433901380:blk_1121839247_48333595]" #85463 daemon prio=5 os_prio=0 tid=0x7f108933d800 nid=0x5d28e waiting for monitor entry [0x7f1072904000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.getNextVolume(FsVolumeList.java:63)
	- waiting to lock <0x0007015a4030> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:1084)
	- locked <0x0007015a3fe8> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.createRbw(FsDatasetImpl.java:114)
	at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.<init>(BlockReceiver.java:183)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:615)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
	at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
	at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:235)
	at java.lang.Thread.run(Thread.java:745)

"Thread-13149" #13302 daemon prio=5 os_prio=0 tid=0x7f10884a9000 nid=0xe9e7 runnable [0x7f1076e6]
   java.lang.Thread.State: RUNNABLE
	at java.io.UnixFileSystem.createDirectory(Native Method)
	at java.io.File.mkdir(File.java:1316)
	at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsCheck(DiskChecker.java:67)
	at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:104)
	at org.apache.hadoop.util.DiskChecker.checkDirs(DiskChecker.java:88)
	at org.apache.hadoop.util.DiskChecker.checkDirs(DiskChecker.java:91)
	at org.apache.hadoop.util.DiskChecker.checkDirs(DiskChecker.java:91)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.checkDirs(BlockPoolSlice.java:300)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.checkDirs(FsVolumeImpl.java:307)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList.checkDirs(FsVolumeList.java:183)
	- locked <0x0007015a4030> (a org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList)
	at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl.checkDataDir(FsDatasetImpl.java:1743)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.checkDiskError(DataNode.java:3002)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.access$800(DataNode.java:240)
	at org.apach
{code}
[jira] [Commented] (HDFS-7492) If multiple threads call FsVolumeList#checkDirs at the same time, we should only do checkDirs once and give the results to all waiting threads
[ https://issues.apache.org/jira/browse/HDFS-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14238565#comment-14238565 ]

Colin Patrick McCabe commented on HDFS-7492:
--------------------------------------------

I think a Guava cache could be useful for this. Perhaps we could set the refresh time really low (or even to 0). The goal is to avoid having 1000 "do checkDirs" requests pile up when a hard disk generates 1000 I/O errors.
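The cache idea above treats the checkDirs result as a value with a very short lifetime: a burst of I/O errors within the window triggers at most one scan, and the rest reuse the cached result. A dependency-free sketch of that time-based suppression (illustrative names, not the HDFS source; a Guava LoadingCache with a short refresh would play the same role, and the clock is passed in explicitly just to keep the sketch deterministic):

```java
import java.util.Collections;
import java.util.List;

// Sketch: suppress repeat scans within a freshness window, so a storm of
// I/O errors produces one directory scan instead of one per error.
class CachedDiskChecker {
    private final long windowMillis;      // how long a scan result stays fresh
    private boolean hasResult = false;
    private long lastScanTime = 0;
    private List<String> cached = Collections.emptyList();
    private int scans = 0;                // observable in tests

    CachedDiskChecker(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Stand-in for the real directory scan.
    private List<String> scanOnce() {
        scans++;
        return Collections.emptyList();   // pretend no volumes failed
    }

    public synchronized List<String> checkDirs(long nowMillis) {
        if (hasResult && nowMillis - lastScanTime < windowMillis) {
            return cached;                // within the window: reuse the result
        }
        cached = scanOnce();
        lastScanTime = nowMillis;
        hasResult = true;
        return cached;
    }

    public synchronized int scanCount() { return scans; }
}
```

Setting the window to 0 degenerates to one scan per call, which is why the comment suggests keeping the refresh time low but nonzero when errors arrive in bursts.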