[ https://issues.apache.org/jira/browse/HDFS-1218?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12880989#action_12880989 ]
sam rash commented on HDFS-1218:
--------------------------------

I realize the Hadoop code already swallows InterruptedException frequently, but I think you can change the trend here:

{code}
// wait for all acks to be received back from datanodes
synchronized (ackQueue) {
  if (!closed && ackQueue.size() != 0) {
    try {
      ackQueue.wait();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // add this
    }
    continue;
  }
}
{code}

Otherwise it's very easy for a thread that I own and manage, and that writes through a DFSOutputStream, to have an interrupt swallowed: when I check Thread.currentThread().isInterrupted() to see whether one of my other threads has interrupted me, I will not see it. (The crux here is that swallowing interrupts in threads that Hadoop controls is less harmful -- this code runs directly in the client thread when you call sync()/close().)

> 20 append: Blocks recovered on startup should be treated with lower priority during block synchronization
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1218
>                 URL: https://issues.apache.org/jira/browse/HDFS-1218
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: data-node
>    Affects Versions: 0.20-append
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.20-append
>
>         Attachments: hdfs-1281.txt
>
>
> When a datanode experiences power loss, it can come back up with truncated replicas (due to local FS journal replay). Those replicas should not be allowed to truncate the block during block synchronization if there are other replicas from DNs that have _not_ restarted.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
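To make the failure mode in the comment concrete, here is a minimal, self-contained sketch (plain Java, not HDFS code). The methods swallowingWait() and restoringWait() are hypothetical stand-ins for a blocking client call such as DFSOutputStream.sync()/close(); only their catch blocks differ:

{code}
public class InterruptVisibilityDemo {

  private static final Object lock = new Object();

  // Swallows the interrupt: the thread's interrupted flag is silently lost.
  static void swallowingWait() {
    synchronized (lock) {
      try {
        lock.wait(1000);
      } catch (InterruptedException e) {
        // swallowed -- the interrupt status is discarded here
      }
    }
  }

  // Restores the interrupt so the caller can still observe it.
  static void restoringWait() {
    synchronized (lock) {
      try {
        lock.wait(1000);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // re-assert the flag
      }
    }
  }

  public static void main(String[] args) throws Exception {
    Thread t1 = new Thread(() -> {
      swallowingWait();
      // prints false: the thread cannot tell it was interrupted
      System.out.println("after swallow, interrupted = "
          + Thread.currentThread().isInterrupted());
    });
    t1.start();
    t1.interrupt();
    t1.join();

    Thread t2 = new Thread(() -> {
      restoringWait();
      // prints true: the interrupt survives the blocking call
      System.out.println("after restore, interrupted = "
          + Thread.currentThread().isInterrupted());
    });
    t2.start();
    t2.interrupt();
    t2.join();
  }
}
{code}

The swallowing variant leaves Thread.currentThread().isInterrupted() false after the call returns, while the restoring variant leaves it true -- which is why re-asserting the flag in the ackQueue wait loop matters for client code.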