[ https://issues.apache.org/jira/browse/HADOOP-11802?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eric Payne reassigned HADOOP-11802:
-----------------------------------

    Assignee: Eric Payne

> DomainSocketWatcher#watcherThread encounters IllegalStateException in finally block when calling sendCallback
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-11802
>                 URL: https://issues.apache.org/jira/browse/HADOOP-11802
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.7.0
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>
> In the main finally block of {{DomainSocketWatcher#watcherThread}}, the
> call to {{sendCallback}} can encounter an {{IllegalStateException}} and
> leave some cleanup tasks undone.
> {code}
>       } finally {
>         lock.lock();
>         try {
>           kick(); // allow the handler for notificationSockets[0] to read a byte
>           for (Entry entry : entries.values()) {
>             // We do not remove from entries as we iterate, because that can
>             // cause a ConcurrentModificationException.
>             sendCallback("close", entries, fdSet, entry.getDomainSocket().fd);
>           }
>           entries.clear();
>           fdSet.close();
>         } finally {
>           lock.unlock();
>         }
>       }
> {code}
> The exception causes {{watcherThread}} to skip the calls to
> {{entries.clear()}} and {{fdSet.close()}}.
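One way to picture the problem and a possible remedy is the sketch below. It is not the patch for this issue and does not use the real Hadoop classes: `FakeWatcherState`, `closeAll`, and the `Consumer` callback are hypothetical stand-ins for `entries`, `fdSet`, and `sendCallback`. The sketch only illustrates the general idea of catching a failure from each per-entry callback and placing the cleanup in a `finally` block, so that one throwing handler cannot skip the clear and close steps.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class GuardedCleanupSketch {
    /** Hypothetical stand-in for the watcher's state; not the real Hadoop class. */
    static final class FakeWatcherState {
        final List<Integer> entries = new ArrayList<>(); // plays the role of "entries"
        boolean fdSetClosed = false;                     // plays the role of "fdSet.close()"
    }

    /**
     * Invokes the callback once per entry, counts callbacks that throw,
     * and always performs the cleanup afterwards.
     */
    static int closeAll(FakeWatcherState state, Consumer<Integer> callback) {
        int failures = 0;
        try {
            for (Integer fd : state.entries) {
                try {
                    callback.accept(fd); // stand-in for sendCallback("close", ...)
                } catch (RuntimeException e) {
                    // Record the failure and continue with the remaining entries.
                    failures++;
                }
            }
        } finally {
            // Cleanup runs whether or not any callback threw.
            state.entries.clear();
            state.fdSetClosed = true;
        }
        return failures;
    }

    public static void main(String[] args) {
        FakeWatcherState state = new FakeWatcherState();
        state.entries.add(3);
        state.entries.add(4);
        int failures = closeAll(state, fd -> {
            if (fd == 3) {
                throw new IllegalStateException("failed to remove " + fd);
            }
        });
        System.out.println("failures=" + failures
            + " entriesEmpty=" + state.entries.isEmpty()
            + " fdSetClosed=" + state.fdSetClosed);
    }
}
```

Whether swallowing the exception per entry is appropriate for the real watcher thread, or whether it should be logged and rethrown after cleanup, is a design decision this sketch does not settle.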
> {code}
> 2015-04-02 11:48:09,941 [DataXceiver for client unix:/home/gs/var/run/hdfs/dn_socket [Waiting for operation #1]] INFO DataNode.clienttrace: cliID: DFSClient_NONMAPREDUCE_-807148576_1, src: 127.0.0.1, dest: 127.0.0.1, op: REQUEST_SHORT_CIRCUIT_SHM, shmId: n/a, srvID: e6b6cdd7-1bf8-415f-a412-32d8493554df, success: false
> 2015-04-02 11:48:09,941 [Thread-14] ERROR unix.DomainSocketWatcher: Thread[Thread-14,5,main] terminating on unexpected exception
> java.lang.IllegalStateException: failed to remove b845649551b6b1eab5c17f630e42489d
>         at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
>         at org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry.removeShm(ShortCircuitRegistry.java:119)
>         at org.apache.hadoop.hdfs.server.datanode.ShortCircuitRegistry$RegisteredShm.handle(ShortCircuitRegistry.java:102)
>         at org.apache.hadoop.net.unix.DomainSocketWatcher.sendCallback(DomainSocketWatcher.java:402)
>         at org.apache.hadoop.net.unix.DomainSocketWatcher.access$1100(DomainSocketWatcher.java:52)
>         at org.apache.hadoop.net.unix.DomainSocketWatcher$2.run(DomainSocketWatcher.java:522)
>         at java.lang.Thread.run(Thread.java:722)
> {code}
> Please note that this is not a duplicate of HADOOP-11333, HADOOP-11604, or
> HADOOP-10404. The cluster installation is running code with all of these
> fixes.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)