[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-01-27 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13883273#comment-13883273
 ] 

Aaron T. Myers commented on HDFS-5840:
--

>From Suresh:

I am adding information about the design, the way I understand it. Let me know 
if I got it wrong.
*Upgrade preparation:*
# New bits are installed on the cluster nodes.
# The cluster is brought down.

*Upgrade:* For HA setup, choose one of the namenodes to initiate upgrade on and 
start it with -upgrade flag.
# NN performs preupgrade for all non shared storage directories by moving 
current to previous.tmp and creating new current.
#* Failure here is fine. NN start up fails. Next attempt at upgrade the storage 
directories are recovered.
# NN performs preupgrade of shared edits (NFS/JournalNodes) over RPC. 
JournalNodes current moved to previous.tmp and new current is created.
#* If one of the JN preupgrade fails and upgrade is reattempted, editlog 
directory could be lost on the JN. Restarting the JN does not fix the issue.
# NN performs upgrade of non shared edits by writing new CTIME to current and 
moving previous.tmp to previous.
#* If one of the JN preupgrade fails and upgrade is reattempted, editlog 
directory could be lost on the JN. Restarting the JN does not fix the issue.
# NN performs upgrade of shared edits (NFS/JournalNodes) over RPC. JournalNodes 
current has new CTIM and previous.tmp is moved to previous.
# We need to document that all the JournalNodes must be up. If a JN is 
irrecoverably lost, configuration must be changed to exclude the JN.

*Rollback:* NN is started with rollback flag
# For all the non shared directories, the NN checks for canRollBack, 
essentially ensures that previous directory with the right layout version 
exists.
# For all the shared directories, the NN checks for canRollBack, essentially 
ensures that previous directory with the right layout version exists.
# NN performs rollback for shared directories (moving previous to current)
#* If rollback of one of the JN fails, then directories are in inconsistent 
state. I think any attempt at retrying rollback will fail and will require 
manually moving files around. I do not think restarting JN fixes this.
# We need to document that all the JournalNodes must be up. If a JN is 
irrecoverably lost, configuration must be changed to exclude the JN.

*Finalize:* DFSAdmin command is run to finalize the upgrade.
# Active NN performs finalizing of editlog. If JN's fail to finalize, active NN 
fails to finalize. However it is possible that standby finalizes, leaving the 
cluster in an inconsistent state.
# We need to document that all the JournalNodes must be up. If a JN is 
irrecoverably lost, configuration must be changed to exclude the JN.

Comments on the code in the patch (this is almost complete):
Comments:
# Minor nit: there are some white space changes
# assertAllResultsEqual - for loop can just start with i = 1? Also if the 
collection objects is of size zero or one, the method can return early. Is 
there a need to do object.toArray() for these early checks? With that, perhaps 
the findbugs exclude may not be necessary.
# Unit test can be added for methods isAtLeastOneActive, 
getRpcAddressesForNameserviceId and getProxiesForAllNameNodesInNameservice (I 
am okay if this is done in a separate jira)
# Finalizing upgrade is quite tricky. Consider the following scenarios:
#* One NN is active and the other is standby - works fine
#* One NN is active and the other is down or all NNs - finalize command throws 
exception and the user will not know if it has succeeded or failed and what to 
do next
#* No active NN - throws an exception cannot finalize with no active
#* BlockPoolSliceStorage.java change seems unnecessary
# Why is {{throw new AssertionError("Unreachable code.");}} in 
QuorumJournalManager.java methods?
# FSImage#doRollBack() - when canRollBack is false after checking if non-share 
directories can rollback, an exception must be immediately thrown, instead of 
checking shared editlog. Also printing Log.info when storages can be rolled 
back will help in debugging.
# FSEditlog#canRollBackSharedLog should accept StorageInfo instead of Storage
# QuorumJournalManager#canRollBack and getJournalCTime can throw AssertionError 
(from DFSUtil.assertAllResultsEqual()). Is that the right exception to expose 
or IOException?
# Namenode startup throws AssertionError with -rollback option. I think we 
should throw IOException, which is how all the other failures are indicated.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.

[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-02-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13900258#comment-13900258
 ] 

Hadoop QA commented on HDFS-5840:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12628708/HDFS-5840.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6140//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6140//console

This message is automatically generated.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-02-20 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907687#comment-13907687
 ] 

Suresh Srinivas commented on HDFS-5840:
---

[~atm], sorry for the late reply. I had lost track of this.

{quote}
As for handling the partial upgrade failure as you've described, I'd like to 
add one more RPC call to the JournalManager to initiate analysis/recovery of 
the storage dirs upon first contact, and then refactor the contents of 
FSImage#recoverStorageDirs into NNUpgradeUtil just like was done with the other 
upgrade-related procedures. If this sounds OK to you, I'll go ahead and add 
that stuff and appropriate tests.
{quote}
Why not always recover in preupgrade/upgrade step, instead of adding another 
RPC?

With rolling upgrade getting ready, some of the functionality added in that may 
be useful. For partial failures related to JournalNodes, the choice made in 
that feature to make the operation to rollback JournalNode idempotent. It looks 
like lot of rolling upgrade related code can be leveraged here, since upgrade 
is a special case of rolling upgrade. Should we explore that?

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-02-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911031#comment-13911031
 ] 

Suresh Srinivas commented on HDFS-5840:
---

[~atm], realized my previous comment is incorrect. We cannot use the rolling 
upgrade functionality. We should perhaps fix the issues that this jira was 
targeting.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-02-24 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13911059#comment-13911059
 ] 

Aaron T. Myers commented on HDFS-5840:
--

bq. Why not always recover in preupgrade step, instead of adding another RPC?

That seems fine to me, and would certainly make this patch more 
straightforward. I'll work on that.

bq. Aaron T. Myers, realized my previous comment is incorrect. We cannot use 
the rolling upgrade functionality. We should perhaps fix the issues that this 
jira was targeting.

I agree with this as well.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-05 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921307#comment-13921307
 ] 

Suresh Srinivas commented on HDFS-5840:
---

[~atm], this needs to be fixed in 2.4. Are you going to be working on this?

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-05 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13921486#comment-13921486
 ] 

Aaron T. Myers commented on HDFS-5840:
--

Hey Suresh, I haven't had a chance to get to this since I posted that initial 
patch. I can hopefully get something out late this week or early next, but if 
someone else beats me to it I won't object, and will happily review something.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-12 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13932001#comment-13932001
 ] 

Suresh Srinivas commented on HDFS-5840:
---

[~atm], most of us are swamped with wrapping up rolling upgrades and testing 
it. Can you please look into this?

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-14 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13935327#comment-13935327
 ] 

Aaron T. Myers commented on HDFS-5840:
--

Sorry, got swamped this week. Will try to get to it early next.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-22 Thread Arun C Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944061#comment-13944061
 ] 

Arun C Murthy commented on HDFS-5840:
-

[~sureshms] / [~atm] - Should this be a blocker for 2.4? Thanks.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-23 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13944588#comment-13944588
 ] 

Suresh Srinivas commented on HDFS-5840:
---

[~atm], any updates on this. [~jingzhao], found some issues and posted comments 
on HDFS-5138 as well.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-24 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945683#comment-13945683
 ] 

Suresh Srinivas commented on HDFS-5840:
---

[~jingzhao], thanks for doing the patch.

Should JN throw an exception, if pre-upgrade or upgrade is retried, that 
upgrade is already in progress, restart the journal node before attempting 
namenode upgrade again?

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.001.patch, HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-24 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945728#comment-13945728
 ] 

Jing Zhao commented on HDFS-5840:
-

Thanks for the comments, Suresh! Will update the patch to address the comments.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 3.0.0
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>Priority: Blocker
> Fix For: 3.0.0
>
> Attachments: HDFS-5840.001.patch, HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13945899#comment-13945899
 ] 

Hadoop QA commented on HDFS-5840:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636424/HDFS-5840.001.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestSafeMode
  org.apache.hadoop.hdfs.qjournal.TestNNWithQJM

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6479//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6479//console

This message is automatically generated.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Jing Zhao
>Priority: Blocker
> Attachments: HDFS-5840.001.patch, HDFS-5840.002.patch, HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-24 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946080#comment-13946080
 ] 

Hadoop QA commented on HDFS-5840:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12636462/HDFS-5840.002.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/6481//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6481//console

This message is automatically generated.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Jing Zhao
>Priority: Blocker
> Attachments: HDFS-5840.001.patch, HDFS-5840.002.patch, HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-24 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946099#comment-13946099
 ] 

Aaron T. Myers commented on HDFS-5840:
--

Thanks a lot for taking this across the finish line, Jing. The updated patch 
you posted looks pretty good to me.

Suresh, does it look OK to you as well? If so, +1 from my perspective.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Jing Zhao
>Priority: Blocker
> Attachments: HDFS-5840.001.patch, HDFS-5840.002.patch, HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-24 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946237#comment-13946237
 ] 

Jing Zhao commented on HDFS-5840:
-

Thanks for the review, Suresh and Aaron! +1 for the 003 patch. I will commit it 
soon.

> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Jing Zhao
>Priority: Blocker
> Attachments: HDFS-5840.001.patch, HDFS-5840.002.patch, 
> HDFS-5840.003.patch, HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946277#comment-13946277
 ] 

Hudson commented on HDFS-5840:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #5397 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/5397/])
HDFS-5840. Follow-up to HDFS-5138 to improve error handling during partial 
upgrade failures. Contributed by Aaron T. Myers, Suresh Srinivas, and Jing 
Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1581260)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/client/QuorumJournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JNStorage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/Journal.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNUpgradeUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournal.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestHAStateTransitions.java


> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Jing Zhao
>Priority: Blocker
> Fix For: 2.4.0
>
> Attachments: HDFS-5840.001.patch, HDFS-5840.002.patch, 
> HDFS-5840.003.patch, HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946402#comment-13946402
 ] 

Hudson commented on HDFS-5840:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #520 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/520/])
HDFS-5840. Follow-up to HDFS-5138 to improve error handling during partial 
upgrade failures. Contributed by Aaron T. Myers, Suresh Srinivas, and Jing 
Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1581260)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/client/QuorumJournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JNStorage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/Journal.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNUpgradeUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournal.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestHAStateTransitions.java


> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Jing Zhao
>Priority: Blocker
> Fix For: 2.4.0
>
> Attachments: HDFS-5840.001.patch, HDFS-5840.002.patch, 
> HDFS-5840.003.patch, HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946562#comment-13946562
 ] 

Hudson commented on HDFS-5840:
--

FAILURE: Integrated in Hadoop-Hdfs-trunk #1712 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1712/])
HDFS-5840. Follow-up to HDFS-5138 to improve error handling during partial 
upgrade failures. Contributed by Aaron T. Myers, Suresh Srinivas, and Jing 
Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1581260)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/client/QuorumJournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JNStorage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/Journal.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNUpgradeUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournal.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestHAStateTransitions.java


> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Jing Zhao
>Priority: Blocker
> Fix For: 2.4.0
>
> Attachments: HDFS-5840.001.patch, HDFS-5840.002.patch, 
> HDFS-5840.003.patch, HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (HDFS-5840) Follow-up to HDFS-5138 to improve error handling during partial upgrade failures

2014-03-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5840?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13946534#comment-13946534
 ] 

Hudson commented on HDFS-5840:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1737 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1737/])
HDFS-5840. Follow-up to HDFS-5138 to improve error handling during partial 
upgrade failures. Contributed by Aaron T. Myers, Suresh Srinivas, and Jing 
Zhao. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1581260)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/client/QuorumJournalManager.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JNStorage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/Journal.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/qjournal/server/JournalNode.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLog.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImage.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NNUpgradeUtil.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/site/apt/HDFSHighAvailabilityWithQJM.apt.vm
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/qjournal/server/TestJournal.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestHAStateTransitions.java


> Follow-up to HDFS-5138 to improve error handling during partial upgrade 
> failures
> 
>
> Key: HDFS-5840
> URL: https://issues.apache.org/jira/browse/HDFS-5840
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: ha, journal-node, namenode
>Affects Versions: 3.0.0, 2.4.0
>Reporter: Aaron T. Myers
>Assignee: Jing Zhao
>Priority: Blocker
> Fix For: 2.4.0
>
> Attachments: HDFS-5840.001.patch, HDFS-5840.002.patch, 
> HDFS-5840.003.patch, HDFS-5840.patch
>
>
> Suresh posted some good comment in HDFS-5138 after that patch had already 
> been committed to trunk. This JIRA is to address those. See the first comment 
> of this JIRA for the full content of the review.



--
This message was sent by Atlassian JIRA
(v6.2#6252)