Kihwal Lee created HDFS-5850:
--------------------------------
Summary: DNS Issues during TrashEmptier initialization can
silently leave it non-functional
Key: HDFS-5850
URL: https://issues.apache.org/jira/browse/HDFS-5850
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Kihwal Lee
Priority: Critical
[~knoguchi] once noticed that the trash directories of a restarted cluster were
not being cleaned up. It turned out that this was caused by a transient DNS
problem during initialization.
The TrashEmptier thread in the namenode is actually a FileSystem client running
in a loop, which makes RPC calls to the namenode itself in order to list, rename
and delete trash files. In a secure setup, the client needs to construct the
correct service principal name (SPN) for the namenode before making an RPC
connection. If there is a DNS issue at that moment, the SPN ends up containing
the IP address instead of the FQDN. Since the KDC does not recognize this SPN,
TrashEmptier does not work from that point on. I verified that the TrashEmptier
thread was asking the KDC for a service ticket for the SPN with the IP address.
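
To make the failure mode concrete, here is a minimal Java sketch. It is not
Hadoop's actual code path: the class SpnSketch, the helpers buildSpn and
looksLikeIpLiteral, the "nn" service name, the host names and the EXAMPLE.COM
realm are all hypothetical and chosen only for illustration. It shows how an
SPN built from an unresolved address differs from one built from the FQDN, and
how such an SPN could be detected before it is cached.

```java
import java.util.Locale;
import java.util.regex.Pattern;

public class SpnSketch {
    // Hypothetical check: dotted-quad IPv4 literal (octet ranges are not validated).
    private static final Pattern IPV4 =
        Pattern.compile("^\\d{1,3}(\\.\\d{1,3}){3}$");

    // True if the host string looks like an IP literal rather than a DNS name.
    // A colon is treated as an IPv6 literal, since DNS names never contain one.
    public static boolean looksLikeIpLiteral(String host) {
        return IPV4.matcher(host).matches() || host.contains(":");
    }

    // Assemble a Kerberos-style SPN of the form service/host@REALM.
    public static String buildSpn(String service, String host, String realm) {
        return service + "/" + host.toLowerCase(Locale.ROOT) + "@" + realm;
    }

    public static void main(String[] args) {
        // Normal case: DNS resolved, so the host part is the FQDN.
        String good = buildSpn("nn", "nn1.example.com", "EXAMPLE.COM");
        // Failure case from this report: only the IP address was available.
        String bad = buildSpn("nn", "10.1.2.3", "EXAMPLE.COM");

        System.out.println(good);
        System.out.println(bad);

        // The KDC has no principal registered under the IP-based name,
        // so a client could flag it instead of silently keeping it.
        String hostPart = bad.substring(bad.indexOf('/') + 1, bad.indexOf('@'));
        if (looksLikeIpLiteral(hostPart)) {
            System.out.println("SPN contains an IP address: " + bad);
        }
    }
}
```

Whether the right response is to retry the lookup, fail loudly, or rebuild the
client is a design decision this sketch does not settle; it only shows where
the IP-based SPN could be noticed.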