[ 
https://issues.apache.org/jira/browse/HADOOP-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12483788
 ] 

Konstantin Shvachko commented on HADOOP-1153:
---------------------------------------------

DataXceiveServer should either declare
        boolean shouldListen = true;
as volatile or use DataNode.shouldRun instead
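
To illustrate the visibility concern, here is a minimal sketch of a listener loop controlled by a volatile flag. ListenerSketch, stop() and stopsPromptly() are hypothetical names used only for illustration; this is not the actual DataXceiveServer code. Without volatile, the Java memory model does not guarantee that the looping thread ever observes the write made by the thread that requests the stop.

```java
// Hypothetical stand-in for a listener thread; NOT the real DataXceiveServer.
class ListenerSketch implements Runnable {
    // volatile makes a write by the stopping thread visible to the loop below.
    private volatile boolean shouldListen = true;

    @Override
    public void run() {
        while (shouldListen) {
            Thread.yield(); // stands in for accept() and request handling
        }
    }

    // Called from another thread to request shutdown.
    void stop() {
        shouldListen = false;
    }

    // Starts the listener, asks it to stop, and reports whether it
    // exited within the given timeout.
    static boolean stopsPromptly(long timeoutMillis) {
        ListenerSketch listener = new ListenerSketch();
        Thread t = new Thread(listener, "listener-sketch");
        t.start();
        listener.stop();
        try {
            t.join(timeoutMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt
        }
        return !t.isAlive();
    }
}
```

A real listener blocked in accept() would also need its socket closed (or the thread interrupted) to wake up; the flag alone only helps once the loop condition is re-evaluated.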

==========
DataNode.register() should loop on
      while( shouldRun ) {
instead of
      while( true ) {

==========
The DataNode thread itself is interrupted in shutdownAll(), but we never call 
it.
Who is interrupting the main data-node thread?

==========
Even if it is interrupted, the RPC will ignore this interrupt:
RPC.waitForProxy()
    while (true) {
      try {
.................
      } catch (InterruptedException ie) {
        // IGNORE
      }
    }
Maybe this is one of the main problems with all our Mini clusters?
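
For comparison, a retry loop can keep retrying on ordinary failures while still letting an interrupt end the wait. The sketch below is an assumption about what such a loop might look like, not Hadoop's RPC.waitForProxy(); RetrySketch, waitFor() and interruptEndsWait() are hypothetical names.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

class RetrySketch {
    // Hypothetical helper: retries until the attempt succeeds, but
    // propagates InterruptedException instead of ignoring it.
    static <T> T waitFor(Callable<T> attempt, long pauseMillis)
            throws InterruptedException {
        while (true) {
            try {
                return attempt.call();
            } catch (InterruptedException ie) {
                throw ie; // do not swallow: let the caller shut down
            } catch (Exception e) {
                // treated as "not ready yet"; fall through and retry
            }
            Thread.sleep(pauseMillis); // also throws if the thread is interrupted
        }
    }

    // Runs a wait that can never succeed in a worker thread, interrupts the
    // worker, and reports whether the worker exited within five seconds.
    static boolean interruptEndsWait() {
        Callable<String> neverReady = () -> {
            throw new IOException("not ready");
        };
        Thread worker = new Thread(() -> {
            try {
                waitFor(neverReady, 50);
            } catch (InterruptedException ie) {
                // interrupt observed: leave the loop and let the thread end
            }
        }, "retry-sketch");
        worker.start();
        worker.interrupt();
        try {
            worker.join(5000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }
}
```

With the empty catch block quoted above, the same interrupt would be dropped and the worker would keep looping.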

==========
DataNode.runAndWait() calls join() and catches InterruptedException
      try {
        t.join();
      } catch (InterruptedException e) {
        if (Thread.currentThread().isInterrupted()) {
          // did someone knock?
          return;
        }
      }
Here is what the documentation on join() says:
void java.lang.Thread.join()

Waits for this thread to die.

Throws: InterruptedException if another thread has interrupted the current 
thread.
The interrupted status of the current thread is cleared when this exception is 
thrown.

Does it make any sense to check isInterrupted()?
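
The documented behaviour can be checked with a small experiment. In this sketch (JoinInterruptSketch and statusInsideCatch() are hypothetical names), the current thread sets its own interrupt flag, calls join() on a thread that is still alive, and records what isInterrupted() returns inside the catch block.

```java
class JoinInterruptSketch {
    // Returns the value of isInterrupted() as observed inside the
    // catch (InterruptedException) block around join().
    static boolean statusInsideCatch() {
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // woken up by the cleanup below
            }
        }, "sleeper-sketch");
        sleeper.setDaemon(true);
        sleeper.start();

        // Set our own interrupt flag so that join() throws right away.
        Thread.currentThread().interrupt();

        boolean observed;
        try {
            sleeper.join();
            observed = Thread.currentThread().isInterrupted();
        } catch (InterruptedException e) {
            // The interrupted status was cleared when the exception was thrown.
            observed = Thread.currentThread().isInterrupted();
        }

        sleeper.interrupt(); // cleanup: let the helper thread finish
        return observed;
    }
}
```

Since the status is cleared on the throw, the check in the catch block would only succeed if a second interrupt arrived in between. Code that needs the flag to survive usually restores it with Thread.currentThread().interrupt() inside the catch block.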

==========
The NameNode should also be checked to confirm that it
- closes all files
- closes all sockets
- correctly handles InterruptedException



> DataNode and FSNamesystem don't shutdown cleanly
> ------------------------------------------------
>
>                 Key: HADOOP-1153
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1153
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>    Affects Versions: 0.12.1
>            Reporter: Nigel Daley
>             Fix For: 0.13.0
>
>         Attachments: 1153.patch
>
>
> The DataNode and FSNamesystem don't interrupt their threads when shutting 
> down.  This causes threads to stay around which is a problem if tests are 
> starting and stopping these servers many times in the same process.
