[ https://issues.apache.org/jira/browse/HDFS-11701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jitendra Nath Pandey reassigned HDFS-11701:
-------------------------------------------

    Assignee: Lokesh Jain

> NPE from Unresolved Host causes permanent DFSInputStream failures
> -----------------------------------------------------------------
>
>                 Key: HDFS-11701
>                 URL: https://issues.apache.org/jira/browse/HDFS-11701
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.6.0
>         Environment: AWS CentOS Linux running HBase CDH 5.9.0 and HDFS CDH 5.9.0
>            Reporter: James Moore
>            Assignee: Lokesh Jain
>
> We recently encountered the following NPE due to the DFSInputStream storing
> old cached block locations from hosts which could no longer resolve.
> {quote}
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hdfs.DFSClient.isLocalAddress(DFSClient.java:1122)
>     at org.apache.hadoop.hdfs.shortcircuit.DomainSocketFactory.getPathInfo(DomainSocketFactory.java:148)
>     at org.apache.hadoop.hdfs.BlockReaderFactory.getBlockReaderLocal(BlockReaderFactory.java:474)
>     at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:354)
>     at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:662)
>     at org.apache.hadoop.hdfs.DFSInputStream.seekToNewSource(DFSInputStream.java:1613)
>     at org.apache.hadoop.fs.FSDataInputStream.seekToNewSource(FSDataInputStream.java:127)
> ~HBase related stack frames trimmed~
> {quote}
> After investigating, the DFSInputStream appears to have been open for upwards
> of 3-4 weeks and had cached block locations from decommissioned nodes that no
> longer resolve in DNS and had been shut down and removed from the cluster 2
> weeks prior. If the DFSInputStream had refreshed its block locations from
> the name node, it would have received alternative block locations which would
> not contain the decommissioned data nodes.
> Because the above NPE leaves the non-resolving data node in the list of block
> locations, the DFSInputStream never refreshes the block locations, and all
> attempts to open a BlockReader for the given blocks will fail.
>
> In our case, we resolved the NPE by closing and re-opening every
> DFSInputStream in the cluster to force a purge of the block locations cache.
>
> Ideally, the DFSInputStream would re-fetch all block locations for a host
> which can't be resolved in DNS, or at least for the blocks requested.
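
The failure mode described above can be reproduced with the JDK alone. The
sketch below shows that an unresolved InetSocketAddress returns null from
getAddress(), which is presumably what DFSClient.isLocalAddress dereferences
in the trace (an assumption; the Hadoop source is not quoted in this issue).
The class UnresolvedHostSketch, the methods isLocalAddressSafe and
shouldRefetchLocations, the host name, and the port are all hypothetical and
only illustrate one possible guard; this is not the patch for HDFS-11701.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.NetworkInterface;
import java.net.SocketException;

public class UnresolvedHostSketch {

    /**
     * Hypothetical null-safe locality check (not the actual DFSClient code).
     * An address that cannot be resolved is treated as "not local" instead
     * of being dereferenced.
     */
    static boolean isLocalAddressSafe(InetSocketAddress target) {
        if (target == null || target.isUnresolved()) {
            return false; // getAddress() would be null here
        }
        InetAddress addr = target.getAddress();
        if (addr.isAnyLocalAddress() || addr.isLoopbackAddress()) {
            return true;
        }
        try {
            // Local if some network interface on this machine owns the address.
            return NetworkInterface.getByInetAddress(addr) != null;
        } catch (SocketException e) {
            return false;
        }
    }

    /**
     * Hypothetical signal that a cached block location is stale and the
     * caller should ask the name node for fresh locations.
     */
    static boolean shouldRefetchLocations(InetSocketAddress target) {
        return target == null || target.isUnresolved();
    }

    public static void main(String[] args) {
        // Stand-in for a cached location of a decommissioned data node.
        // createUnresolved performs no DNS lookup; host and port are made up.
        InetSocketAddress stale =
            InetSocketAddress.createUnresolved("decommissioned-node.invalid", 50010);

        System.out.println("getAddress(): " + stale.getAddress());
        System.out.println("isUnresolved(): " + stale.isUnresolved());
        System.out.println("local? " + isLocalAddressSafe(stale));
        System.out.println("refetch? " + shouldRefetchLocations(stale));

        // A resolved loopback address for comparison.
        InetSocketAddress loopback =
            new InetSocketAddress(InetAddress.getLoopbackAddress(), 50010);
        System.out.println("loopback local? " + isLocalAddressSafe(loopback));
    }
}
```

Returning false for an unresolved host only avoids the NPE. The cached
location would still need to be dropped or re-fetched from the name node, as
the reporter suggests, for reads to recover without re-opening the stream.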