[ https://issues.apache.org/jira/browse/HDFS-14333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Stephen O'Donnell updated HDFS-14333:
-------------------------------------
    Attachment: HDFS-14333.005.patch

Datanode fails to start if any disk has errors during Namenode registration
----------------------------------------------------------------------------

                 Key: HDFS-14333
                 URL: https://issues.apache.org/jira/browse/HDFS-14333
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: datanode
    Affects Versions: 3.3.0
            Reporter: Stephen O'Donnell
            Assignee: Stephen O'Donnell
            Priority: Major
             Fix For: 3.3.0
         Attachments: HADOOP-16119.poc.patch, HDFS-14333.001.patch, HDFS-14333.002.patch, HDFS-14333.003.patch, HDFS-14333.004.patch, HDFS-14333.005.patch

This is closely related to HDFS-9908, where it was reported that a datanode would fail to start if an IO error occurred on a single disk when running du during Datanode registration. That Jira was closed as a result of HADOOP-12973, which refactored how du is called and prevents any exception from being thrown. However, this problem can still occur if the volume has errors (e.g. permission problems or filesystem corruption) when the disk is scanned to load all the replicas. The method chain is:

DataNode.initBlockPool -> FsDatasetImpl.addBlockPool -> FsVolumeList.getAllVolumesMap -> throws an exception which goes unhandled.

The DN logs will contain a stack trace for the problem volume, so the workaround is to remove the volume from the DN config and the DN will start, but the logs are a little confusing, so it's not always obvious what the issue is.

These are cut-down logs from an occurrence of this issue:

{code}
2019-03-01 08:58:24,830 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Scanning block pool BP-240961797-x.x.x.x-1392827522027 on volume /data/18/dfs/dn/current...
...
2019-03-01 08:58:27,029 WARN org.apache.hadoop.fs.CachingGetSpaceUsed: Could not get disk usage information
ExitCodeException exitCode=1: du: cannot read directory `/data/18/dfs/dn/current/BP-240961797-x.x.x.x-1392827522027/current/finalized/subdir149/subdir215': Permission denied
du: cannot read directory `/data/18/dfs/dn/current/BP-240961797-x.x.x.x-1392827522027/current/finalized/subdir149/subdir213': Permission denied
du: cannot read directory `/data/18/dfs/dn/current/BP-240961797-x.x.x.x-1392827522027/current/finalized/subdir97/subdir25': Permission denied
        at org.apache.hadoop.util.Shell.runCommand(Shell.java:601)
        at org.apache.hadoop.util.Shell.run(Shell.java:504)
        at org.apache.hadoop.fs.DU$DUShell.startRefresh(DU.java:61)
        at org.apache.hadoop.fs.DU.refresh(DU.java:53)
        at org.apache.hadoop.fs.CachingGetSpaceUsed.init(CachingGetSpaceUsed.java:84)
        at org.apache.hadoop.fs.GetSpaceUsed$Builder.build(GetSpaceUsed.java:166)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.<init>(BlockPoolSlice.java:145)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.addBlockPool(FsVolumeImpl.java:881)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$2.run(FsVolumeList.java:412)
...
2019-03-01 08:58:27,043 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time taken to scan block pool BP-240961797-x.x.x.x-1392827522027 on /data/18/dfs/dn/current: 2202ms
{code}

So we can see that a du error occurred, was logged but not re-thrown (due to HADOOP-12973), and the block pool scan completed.
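For context, what HADOOP-12973 effectively did is make the du refresh log the shell failure and carry on with the last known value, rather than propagate it. The following is only a minimal sketch of that log-and-continue pattern; the class and method names are hypothetical, not the real org.apache.hadoop.fs.CachingGetSpaceUsed implementation:

{code}
import java.io.IOException;

// Minimal sketch of the log-and-continue pattern described above.
// Hypothetical names; not the actual CachingGetSpaceUsed code.
class CachedDiskUsage {
  private volatile long usedBytes;   // last successfully computed value

  // Periodic refresh: a failing 'du' is logged but no longer aborts the caller,
  // which is why the "Scanning block pool" phase completes despite the error.
  void refresh() {
    try {
      usedBytes = runDu();
    } catch (IOException e) {
      System.err.println("Could not get disk usage information: " + e.getMessage());
      // Keep the previous usedBytes value; the underlying volume problem remains.
    }
  }

  // Placeholder for shelling out to 'du -sk <dir>'; throws if the directory
  // cannot be read (e.g. permission denied).
  private long runDu() throws IOException {
    throw new IOException("du: cannot read directory: Permission denied");
  }

  long getUsed() {
    return usedBytes;
  }
}
{code}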
However, later in the 'add replicas to map' logic, we got another exception stemming from the same problem:

{code}
2019-03-01 08:58:27,564 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Adding replicas to map for block pool BP-240961797-x.x.x.x-1392827522027 on volume /data/18/dfs/dn/current...
...
2019-03-01 08:58:31,155 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Caught exception while adding replicas from /data/18/dfs/dn/current. Will throw later.
java.io.IOException: Invalid directory or I/O error occurred for dir: /data/18/dfs/dn/current/BP-240961797-x.x.x.x-1392827522027/current/finalized/subdir149/subdir215
        at org.apache.hadoop.fs.FileUtil.listFiles(FileUtil.java:1167)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:445)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.addToReplicasMap(BlockPoolSlice.java:448)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.BlockPoolSlice.getVolumeMap(BlockPoolSlice.java:342)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeImpl.getVolumeMap(FsVolumeImpl.java:861)
        at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsVolumeList$1.run(FsVolumeList.java:191)

< The message "2019-03-01 08:59:00,989 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Time to add replicas to map for block pool BP-240961797-x.x.x.x-1392827522027 on volume xxx" did not appear for this volume as it failed >
{code}

The exception is re-thrown, so the DN fails registration and then retries. It then finds all volumes already locked and exits with an 'all volumes failed' error.

I believe we should handle the failing volume like a runtime volume failure and only abort the DN if too many volumes have failed.

I will post a patch for this.
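In rough terms, the intended handling looks something like the sketch below: catch the per-volume IOException while building the replica map, record the volume as failed, and only abort once more volumes have failed than the configured tolerance allows. This is a simplified illustration under assumed names (VolumeMapLoader, Volume, failedVolumesTolerated), not the actual FsVolumeList/FsDatasetImpl code in the attached patch:

{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Simplified illustration of treating a scan failure like a runtime volume
// failure. All names are hypothetical; see the attached patch for the real change.
class VolumeMapLoader {

  interface Volume {
    String getPath();
    void addReplicasToMap() throws IOException;   // may throw on a bad directory
  }

  // Mirrors the spirit of dfs.datanode.failed.volumes.tolerated.
  private final int failedVolumesTolerated;

  VolumeMapLoader(int failedVolumesTolerated) {
    this.failedVolumesTolerated = failedVolumesTolerated;
  }

  void loadAllVolumes(Collection<Volume> volumes) throws IOException {
    List<Volume> failedVolumes = new ArrayList<>();
    for (Volume v : volumes) {
      try {
        v.addReplicasToMap();
      } catch (IOException e) {
        // Record the failure and keep going instead of re-throwing immediately,
        // so one bad disk does not stop the whole datanode from starting.
        failedVolumes.add(v);
      }
    }
    if (failedVolumes.size() > failedVolumesTolerated) {
      throw new IOException("Too many failed volumes: " + failedVolumes.size());
    }
    // Otherwise the failed volumes would be marked failed/removed and the
    // datanode continues with the remaining healthy volumes.
  }
}
{code}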