[ 
https://issues.apache.org/jira/browse/HDFS-1955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Foley updated HDFS-1955:
-----------------------------

    Attachment: hdfs-1955_2.patch

bq. Should this call getNumEditStreams() [instead of "(editStreams == null)"], 
or maybe even better yet isOpen() which returns false if there are no edit 
streams?

No, the point of this insertion is solely to prevent an NPE in the rare case 
where (as the comment notes) an error occurs on one or more storage directories 
before editStreams has even been initialized.  The null check is efficient and 
sufficient for that purpose.
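
For illustration only, here is a minimal sketch of that kind of guard.  It is 
not the code in the patch: the class NullGuardSketch, the methods open() and 
dropStreamsFor(), and the use of plain strings for streams are all hypothetical. 
The only name taken from the discussion is the editStreams field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for the real edit log class; not the patch's code.
class NullGuardSketch {
    // Assumed to stay null until the log is opened.
    private List<String> editStreams;

    void open(List<String> dirs) {
        editStreams = new ArrayList<>(dirs);
    }

    // Drops the streams that belong to failed directories.
    // Returns how many streams were dropped.
    int dropStreamsFor(List<String> failedDirs) {
        if (editStreams == null) {
            // Error reported before initialization: nothing to drop, no NPE.
            return 0;
        }
        int dropped = 0;
        for (Iterator<String> it = editStreams.iterator(); it.hasNext();) {
            if (failedDirs.contains(it.next())) {
                it.remove();
                dropped++;
            }
        }
        return dropped;
    }

    public static void main(String[] args) {
        NullGuardSketch log = new NullGuardSketch();
        // Safe even though open() has not been called yet.
        System.out.println(log.dropStreamsFor(Arrays.asList("dirA")));
    }
}
```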

bq. Would it make sense to move the call to reportErrorOnDirectories inside the 
if? Other callers of the method tend to not unconditionally call the method.

Agreed.  New patch contains this change.

bq. is there a race condition... Can isAlive return false because the thread 
already terminated before waitForThreads is invoked? I ask because won't the 
thread be left in limbo? In which case, should the while be a do-while?

We talked, and noted that Java's Thread.join() is not the same as 
pthread_join().  There's no race, nor any other issue, because both .isAlive() 
and .join() can be called on an already-terminated thread without any exception 
being thrown.  The only purpose of the loop is to deal with the possibility 
that an interrupt may be received while this method is blocked on the join() 
call.  It doesn't matter whether the termination condition is checked at the 
beginning or the end of the loop.  So the existing code is acceptable.
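
A minimal sketch of the pattern described above, assuming a hypothetical helper 
named joinUninterruptibly() (this is not the waitForThreads() code from the 
patch).  It shows that isAlive() and join() are harmless on a thread that has 
already terminated, and that the loop exists only to retry after an 
InterruptedException.  Restoring the interrupt flag at the end is a choice made 
for this sketch, not something stated in the discussion.

```java
// Hypothetical illustration; not the patch's waitForThreads().
class JoinLoopSketch {
    // Waits until t has terminated, retrying if join() is interrupted.
    static void joinUninterruptibly(Thread t) {
        boolean interrupted = false;
        while (t.isAlive()) {          // false at once if t already finished
            try {
                t.join();              // returns immediately on a dead thread
            } catch (InterruptedException ie) {
                interrupted = true;    // remember the interrupt and retry
            }
        }
        if (interrupted) {
            // Sketch-only choice: re-assert the interrupt for the caller.
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        Thread worker = new Thread(() -> { });
        worker.start();
        joinUninterruptibly(worker);   // waits for termination
        joinUninterruptibly(worker);   // second call on a dead thread is a no-op
        System.out.println("alive=" + worker.isAlive());
    }
}
```

With a while loop the body is simply skipped for a thread that has already 
terminated; with a do-while, join() would be called once and return 
immediately.  Either way the method returns only after the thread is dead.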

> HDFS-1826 made FSImage.doUpgrade() too fault-tolerant
> -----------------------------------------------------
>
>                 Key: HDFS-1955
>                 URL: https://issues.apache.org/jira/browse/HDFS-1955
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.22.0, 0.23.0
>            Reporter: Matt Foley
>            Assignee: Matt Foley
>         Attachments: hdfs-1955_1.patch, hdfs-1955_1.patch, hdfs-1955_2.patch
>
>
> Prior to HDFS-1826, doUpgrade() would fail if any of the storage directories 
> failed to successfully write the new fsimage or edits files.
> Now it appears to "succeed" even if some or all of the individual directories 
> fail.
> There is some discussion about whether doUpgrade() should have some fault 
> tolerance, but for now make it fail on any single storage directory failure, 
> as before.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira