[ https://issues.apache.org/jira/browse/HDFS-5138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883394#comment-13883394 ]

Aaron T. Myers commented on HDFS-5138:
--------------------------------------

bq. Aaron T. Myers, we last talked about this over the phone on Friday Jan 16th, 
right? I did tell you that the JournalNode could potentially lose edit logs.

There must have been some misunderstanding because I'm pretty sure I told you 
that I didn't think that was possible. :) Anyway, see below...

bq. Is that correct? Did you check it? Java File#renameTo() is 
platform-dependent. The following code always renames the directories (on my Mac):

I did, at least on Linux. In the code example you have above, try putting a 
child file or directory under the directory f2 and see if it still works. The 
concern is about losing edit logs when a rename overwrites a directory that 
already has contents, so by definition there will be some files in the 
directory being renamed to.
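
A minimal sketch of that experiment follows. It is not code from the patch: 
the class name, the source directory f1, and the child file name are 
hypothetical placeholders, and only f2 comes from the discussion above. Since 
File#renameTo() is platform-dependent, the two results should be read as 
observations about the machine it runs on.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RenameOntoDirCheck {
    /**
     * Creates sibling directories f1 and f2 in a fresh temp directory,
     * optionally puts one child file under f2, and returns the result of
     * f1.renameTo(f2). The temp directory is left behind for inspection.
     */
    public static boolean tryRename(boolean targetHasChild) {
        try {
            Path base = Files.createTempDirectory("rename-check");
            File f1 = Files.createDirectory(base.resolve("f1")).toFile();
            File f2 = Files.createDirectory(base.resolve("f2")).toFile();
            if (targetHasChild) {
                // Hypothetical stand-in for an edit log segment.
                Files.createFile(f2.toPath().resolve("child"));
            }
            return f1.renameTo(f2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("rename onto empty f2:     " + tryRename(false));
        System.out.println("rename onto non-empty f2: " + tryRename(true));
    }
}
```

If the second line prints false, the rename did not replace the non-empty 
directory on that platform, which is the case that matters here.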

bq. Related question. Let's say even if the rename fails, how does the user 
recover from that condition? I brought up several scenarios related to that in 
preupgrade, upgrade, and finalize. How do we handle finalize being done 
successfully on one namenode and not the other?

Finalize is actually rather easy, since it's idempotent. The preupgrade and 
upgrade failure scenarios should both be handled either manually or by the 
storage recovery process, which currently should happen on JN restart, but I 
agree could be improved. Let's continue discussion of this over on HDFS-5840.
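
To illustrate why idempotence makes the finalize case easy, here is a minimal 
sketch, not the actual HDFS implementation. It assumes that finalizing amounts 
to deleting a "previous" snapshot directory under a storage directory; the 
class name, method name, and that layout are all assumptions made for the 
example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class IdempotentFinalizeSketch {
    /**
     * Removes storageDir/previous if it exists.
     * Returns true if there was something to remove, false if the
     * directory was already gone (i.e. finalize had already happened).
     */
    public static boolean finalizeUpgrade(Path storageDir) {
        Path previous = storageDir.resolve("previous"); // assumed layout
        if (!Files.exists(previous)) {
            return false; // already finalized: repeating the call is a no-op
        }
        try (Stream<Path> walk = Files.walk(previous)) {
            // Reverse order visits children before their parent directories.
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("finalize-sketch");
        Files.createFile(Files.createDirectory(dir.resolve("previous")).resolve("VERSION"));
        System.out.println("first call removed snapshot:  " + finalizeUpgrade(dir));
        System.out.println("second call removed snapshot: " + finalizeUpgrade(dir));
    }
}
```

Because a repeated call finds nothing left to delete, a node that missed or 
only partly completed the finalize can simply be asked to finalize again.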

> Support HDFS upgrade in HA
> --------------------------
>
>                 Key: HDFS-5138
>                 URL: https://issues.apache.org/jira/browse/HDFS-5138
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.1.1-beta
>            Reporter: Kihwal Lee
>            Assignee: Aaron T. Myers
>            Priority: Blocker
>             Fix For: 3.0.0
>
>         Attachments: HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
> HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
> HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, HDFS-5138.patch, 
> hdfs-5138-branch-2.txt
>
>
> With HA enabled, NN won't start with "-upgrade". Since there has been a layout 
> version change between 2.0.x and 2.1.x, starting NN in upgrade mode was 
> necessary when deploying 2.1.x to an existing 2.0.x cluster. But the only way 
> to get around this was to disable HA and upgrade. 
> The NN and the cluster cannot be flipped back to HA until the upgrade is 
> finalized. If HA is disabled only on NN for layout upgrade and HA is turned 
> back on without involving DNs, things will work, but finalizeUpgrade won't 
> work (the NN is in HA and it cannot be in upgrade mode) and DN's upgrade 
> snapshots won't get removed.
> We will need a different way of doing layout upgrade and upgrade snapshot.  
> I am marking this as a 2.1.1-beta blocker based on feedback from others.  If 
> there is a reasonable workaround that does not increase maintenance window 
> greatly, we can lower its priority from blocker to critical.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
