[ https://issues.apache.org/jira/browse/HDFS-8299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523920#comment-14523920 ]
Tsz Wo Nicholas Sze commented on HDFS-8299:
-------------------------------------------

Hi Hari,

HDFS currently does not support read-only file systems. I agree it would be good to support it.

> HDFS reporting missing blocks when they are actually present due to read-only filesystem
> ----------------------------------------------------------------------------------------
>
>                 Key: HDFS-8299
>                 URL: https://issues.apache.org/jira/browse/HDFS-8299
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>    Affects Versions: 2.6.0
>         Environment: HDP 2.2
>            Reporter: Hari Sekhon
>            Priority: Critical
>         Attachments: datanode.log
>
>
> Fsck shows missing blocks when the blocks can be found on a datanode's
> filesystem and the datanode has been restarted to try to get it to recognize
> that the blocks are indeed present and hence report them to the NameNode in a
> block report.
> Fsck output showing an example "missing" block:
> {code}/apps/hive/warehouse/<custom_scrubbed>.db/someTable/000000_0: CORRUPT blockpool BP-120244285-<ip>-1417023863606 block blk_1075202330
>  MISSING 1 blocks of total size 3260848 B
> 0. BP-120244285-<ip>-1417023863606:blk_1075202330_1484191 len=3260848 MISSING!{code}
> The block is definitely present on more than one datanode, however. Here is
> the output from one of them, which I restarted to try to get it to report the
> block to the NameNode:
> {code}# ll /archive1/dn/current/BP-120244285-<ip>-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330*
> -rw-r--r-- 1 hdfs 499 3260848 Apr 27 15:02 /archive1/dn/current/BP-120244285-<ip>-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330
> -rw-r--r-- 1 hdfs 499 25483 Apr 27 15:02 /archive1/dn/current/BP-120244285-<ip>-1417023863606/current/finalized/subdir22/subdir73/blk_1075202330_1484191.meta{code}
> It's worth noting that this is on HDFS tiered storage, on an archive tier
> backed by a networked block device that may have become temporarily
> unavailable but is available now. See also feature request HDFS-8297 for an
> online rescan, which would avoid having to restart datanodes.
> The datanode log (which I am attaching) shows that this happens because the
> datanode fails to get a write lock on the filesystem. I think it would be
> better to be able to serve those blocks read-only, however, since the current
> behavior causes client-visible data unavailability when the data could in
> fact be read.
> {code}2015-04-30 14:11:08,235 WARN  datanode.DataNode (DataNode.java:checkStorageLocations(2284)) - Invalid dfs.datanode.data.dir /archive1/dn :
> org.apache.hadoop.util.DiskChecker$DiskErrorException: Directory is not writable: /archive1/dn
>         at org.apache.hadoop.util.DiskChecker.checkAccessByFileMethods(DiskChecker.java:193)
>         at org.apache.hadoop.util.DiskChecker.checkDirAccess(DiskChecker.java:174)
>         at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:157)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2239)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2281)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2263)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2155)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2202)
>         at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2378)
>         at org.apache.hadoop.hdfs.server.datanode.SecureDataNodeStarter.start(SecureDataNodeStarter.java:78)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.commons.daemon.support.DaemonLoader.start(DaemonLoader.java:243)
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
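
The stack trace shows the data directory being rejected outright because it is not writable. As a rough illustration of the distinction the report asks for, the sketch below classifies a directory as read-write, read-only, or unusable rather than treating every non-writable directory as invalid. It is not Hadoop code and does not reflect how DiskChecker is implemented: the class StorageDirCheck, the enum Access, and the method classify are hypothetical names, and the check is best-effort (an actual write can still fail later).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical sketch, not part of Hadoop: three-way check of a data directory. */
final class StorageDirCheck {

    /** How a data directory could be treated at startup (assumed categories). */
    enum Access { READ_WRITE, READ_ONLY, UNUSABLE }

    /**
     * Classifies a directory by the access the current process has to it.
     * A directory that can be listed and traversed but not written to is
     * reported as READ_ONLY instead of being rejected.
     */
    static Access classify(Path dir) {
        // Without read and traverse permission, existing blocks cannot be served.
        if (!Files.isDirectory(dir) || !Files.isReadable(dir) || !Files.isExecutable(dir)) {
            return Access.UNUSABLE;
        }
        // Best-effort writability check; a later write may still fail.
        return Files.isWritable(dir) ? Access.READ_WRITE : Access.READ_ONLY;
    }

    public static void main(String[] args) {
        // Usage: java StorageDirCheck /archive1/dn [more directories...]
        for (String arg : args) {
            Path dir = Paths.get(arg);
            System.out.println(dir + " -> " + classify(dir));
        }
    }
}
```

Under these assumptions, a directory such as /archive1/dn on a read-only mount would be classified READ_ONLY, which is the case the reporter would like the datanode to keep serving reads from.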