Dear Wiki user, You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.
The "TroubleShooting" page has been changed by SteveLoughran:
http://wiki.apache.org/hadoop/TroubleShooting?action=diff&rev1=13&rev2=14

Comment:
x-ref DataNode text to the page, mention resolv.conf.

There are a number of possible causes for this.

 * The NameNode may be overloaded. Check the logs for messages that say "discarding calls..."
- * There may not be enough (any) DataNodes for the data to be written. Again, check the logs.
+ * There may not be enough (any) DataNode nodes running for the data to be written. Again, check the logs.
- * The DataNodes on which the blocks were stored might be down.
+ * Every DataNode on which the blocks were stored might be down (or not connected to the NameNode; it is impossible to distinguish the two).

=== Error message: Could not obtain block ===

@@ -62, +62 @@

{{{
java.io.IOException: No live nodes contain current block
}}}
- There are no live DataNodes containing a copy of the block of the file you are looking for. Bring up any nodes that are down, or skip that block.
+ There are no live DataNode nodes containing a copy of the block of the file you are looking for. Bring up any nodes that are down, or skip that block.

== Reduce hangs ==

This can be a DNS issue. Two problems which have been encountered in practice are:

 * Machines with multiple NICs. In this case, set {{{ dfs.datanode.dns.interface }}} (in {{{ hdfs-site.xml }}}) and {{{ mapred.datanode.dns.interface }}} (in {{{ mapred-site.xml }}}) to the name of the network interface used by Hadoop (something like {{{ eth0 }}} under Linux).
- * Badly formatted or incorrect hosts files ({{{ /etc/hosts }}} under Linux) can wreak havoc. Any DNS problem will hobble Hadoop, so ensure that names can be resolved correctly.
+ * Badly formatted or incorrect hosts and DNS files ({{{ /etc/hosts }}} and {{{ /etc/resolv.conf }}} under Linux) can wreak havoc. Any DNS problem will hobble Hadoop, so ensure that names can be resolved correctly.
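The interface keys mentioned above can be set explicitly in the site configuration. This is a sketch only: it assumes a Linux host whose Hadoop-facing interface is {{{ eth0 }}}; substitute the interface name your cluster actually uses.

{{{
<!-- hdfs-site.xml: derive the DataNode's advertised hostname
     from a specific network interface, not whichever NIC the
     OS happens to pick -->
<property>
  <name>dfs.datanode.dns.interface</name>
  <value>eth0</value>
</property>
}}}

The equivalent value goes into {{{ mapred.datanode.dns.interface }}} in {{{ mapred-site.xml }}}. For the hosts-file problem, a minimal well-formed {{{ /etc/hosts }}} maps each node's stable IP address to the one name the rest of the cluster uses for it; the names and addresses below are examples only:

{{{
127.0.0.1    localhost
192.168.0.10 namenode-1
192.168.0.11 datanode-1
192.168.0.12 datanode-2
}}}

Whatever naming scheme is used, every node should resolve every other node to the same address, and reverse lookups should return the same names.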
== Error message saying a file "Could only be replicated to 0 nodes instead of 1" ==
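When diagnosing this error, a useful first step is to ask the NameNode how many DataNodes it currently considers live. The commands below are a sketch only: they assume a running cluster with the Hadoop scripts on the path, the file path is a made-up example, and the exact output format varies between versions.

{{{
# Summarise the DataNodes the NameNode sees as live or dead
hadoop dfsadmin -report

# Inspect a path for missing or under-replicated blocks
# (/user/example/file is a placeholder path)
hadoop fsck /user/example/file -files -blocks -locations
}}}

If the report shows zero live DataNodes, the causes listed above (no DataNodes running, or DataNodes unable to connect to the NameNode) are the place to start looking.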