[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-12-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13845293#comment-13845293
 ] 

Hudson commented on HDFS-5504:
--

FAILURE: Integrated in Hadoop-Yarn-trunk #418 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/418/])
Move 
HDFS-5257,HDFS-5427,HDFS-5443,HDFS-5476,HDFS-5425,HDFS-5474,HDFS-5504,HDFS-5428 
into branch-2.3 section. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550011)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 2.2.0
Reporter: Vinay
Assignee: Vinay
 Fix For: 2.3.0

 Attachments: HDFS-5504.patch, HDFS-5504.patch


 1. HA installation, standby NN is down.
 2. Delete snapshot is called; it deletes the blocks from the blocksmap and 
 all datanodes. Log sync also happened.
 3. Before the next log roll, the NN crashed.
 4. When the namenode restarts, it loads the fsimage and finalized edits from 
 shared storage and sets the safemode threshold, which still includes the 
 blocks from the deleted snapshot (the delete edit has not been read yet, 
 because the namenode restarted before the last edits segment was finalized).
 5. When it becomes active, it finalizes the edits and reads the delete 
 snapshot edits_op, but at this point it does not reduce the safemode count, 
 so it stays in safemode.
 6. On the next restart, as the edits are already finalized, it reads them at 
 startup and sets the safemode threshold correctly.
 So one more restart will bring the NN out of safemode.
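
 The counter drift in the steps above can be sketched in a few lines of Java. 
 This is a simplified model, not the NameNode code: the class SafeModeSketch, 
 its methods, the block counts (100 in the fsimage, 20 owned by the snapshot) 
 and the 0.999 threshold are all assumptions made up for illustration.

```java
/**
 * Simplified, hypothetical model of NameNode safemode block accounting.
 * Names and numbers are illustrative; this is not the FSNamesystem code.
 */
final class SafeModeSketch {
    private final double thresholdPct; // fraction of blocks that must be reported
    private int blockTotal;            // blocks the NN expects, per the loaded namespace
    private int blockSafe;             // blocks reported with enough replicas

    SafeModeSketch(int blockTotal, double thresholdPct) {
        this.blockTotal = blockTotal;
        this.thresholdPct = thresholdPct;
    }

    /** Datanode block reports raise the safe count. */
    void reportSafeBlocks(int count) {
        blockSafe += count;
    }

    /** Adjusts both counters when an edit op removes blocks. */
    void adjustBlockTotals(int deltaSafe, int deltaTotal) {
        blockSafe += deltaSafe;
        blockTotal += deltaTotal;
    }

    boolean inSafeMode() {
        int blockThreshold = (int) (blockTotal * thresholdPct);
        return blockSafe < blockThreshold;
    }

    /**
     * Replays the scenario: 100 blocks in the fsimage, 20 of them owned by a
     * snapshot whose deletion is only recorded in the unfinalized edits segment.
     */
    static boolean stuckInSafeMode(boolean adjustOnReplay) {
        SafeModeSketch nn = new SafeModeSketch(100, 0.999);
        nn.reportSafeBlocks(80);          // datanodes no longer hold the 20 deleted blocks
        if (adjustOnReplay) {
            nn.adjustBlockTotals(0, -20); // delete-snapshot op lowers the expected total
        }
        return nn.inSafeMode();
    }
}
```

 With adjustOnReplay set to false the model stays in safemode (80 safe blocks 
 against a threshold of 99); with true the expected total drops to 80 and the 
 model leaves safemode.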



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)


[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-12-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13845389#comment-13845389
 ] 

Hudson commented on HDFS-5504:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1609 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1609/])
Move 
HDFS-5257,HDFS-5427,HDFS-5443,HDFS-5476,HDFS-5425,HDFS-5474,HDFS-5504,HDFS-5428 
into branch-2.3 section. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550011)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 2.2.0
Reporter: Vinay
Assignee: Vinay
 Fix For: 2.3.0

 Attachments: HDFS-5504.patch, HDFS-5504.patch




[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-12-11 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13845444#comment-13845444
 ] 

Hudson commented on HDFS-5504:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1635 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1635/])
Move 
HDFS-5257,HDFS-5427,HDFS-5443,HDFS-5476,HDFS-5425,HDFS-5474,HDFS-5504,HDFS-5428 
into branch-2.3 section. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550011)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 2.2.0
Reporter: Vinay
Assignee: Vinay
 Fix For: 2.3.0

 Attachments: HDFS-5504.patch, HDFS-5504.patch




[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-12-10 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13844901#comment-13844901
 ] 

Hudson commented on HDFS-5504:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4859 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4859/])
Move 
HDFS-5257,HDFS-5427,HDFS-5443,HDFS-5476,HDFS-5425,HDFS-5474,HDFS-5504,HDFS-5428 
into branch-2.3 section. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1550011)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt


 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.2.0
Reporter: Vinay
Assignee: Vinay
Priority: Blocker
 Fix For: 2.4.0

 Attachments: HDFS-5504.patch, HDFS-5504.patch




[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-11-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822319#comment-13822319
 ] 

Hudson commented on HDFS-5504:
--

SUCCESS: Integrated in Hadoop-Yarn-trunk #391 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/391/])
HDFS-5504. In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode 
threshold, leads to NN safemode. Contributed by Vinay. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1541773)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java


 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.2.0
Reporter: Vinay
Assignee: Vinay
 Fix For: 2.3.0

 Attachments: HDFS-5504.patch, HDFS-5504.patch





--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-11-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822410#comment-13822410
 ] 

Hudson commented on HDFS-5504:
--

FAILURE: Integrated in Hadoop-Mapreduce-trunk #1608 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1608/])
HDFS-5504. In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode 
threshold, leads to NN safemode. Contributed by Vinay. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1541773)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java


 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.2.0
Reporter: Vinay
Assignee: Vinay
 Fix For: 2.3.0

 Attachments: HDFS-5504.patch, HDFS-5504.patch




[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-11-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822435#comment-13822435
 ] 

Hudson commented on HDFS-5504:
--

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1582 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1582/])
HDFS-5504. In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode 
threshold, leads to NN safemode. Contributed by Vinay. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1541773)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java


 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.2.0
Reporter: Vinay
Assignee: Vinay
 Fix For: 2.3.0

 Attachments: HDFS-5504.patch, HDFS-5504.patch




[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-11-13 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821354#comment-13821354
 ] 

Hadoop QA commented on HDFS-5504:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12613578/HDFS-5504.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5421//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5421//console

This message is automatically generated.

 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.2.0
Reporter: Vinay
Assignee: Vinay
 Attachments: HDFS-5504.patch, HDFS-5504.patch




[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-11-13 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822045#comment-13822045
 ] 

Jing Zhao commented on HDFS-5504:
-

+1.  I will commit the patch shortly.

 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.2.0
Reporter: Vinay
Assignee: Vinay
 Attachments: HDFS-5504.patch, HDFS-5504.patch




[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-11-13 Thread Vinay (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822058#comment-13822058
 ] 

Vinay commented on HDFS-5504:
-

Thanks Jing for the review and commit

 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.2.0
Reporter: Vinay
Assignee: Vinay
 Fix For: 2.3.0

 Attachments: HDFS-5504.patch, HDFS-5504.patch




[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-11-13 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822089#comment-13822089
 ] 

Hudson commented on HDFS-5504:
--

SUCCESS: Integrated in Hadoop-trunk-Commit #4733 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/4733/])
HDFS-5504. In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode 
threshold, leads to NN safemode. Contributed by Vinay. (jing9: 
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1541773)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSEditLogLoader.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/snapshot/TestSnapshotDeletion.java


 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.2.0
Reporter: Vinay
Assignee: Vinay
 Fix For: 2.3.0

 Attachments: HDFS-5504.patch, HDFS-5504.patch




[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-11-12 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820426#comment-13820426
 ] 

Hadoop QA commented on HDFS-5504:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12613367/HDFS-5504.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/5399//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5399//console

This message is automatically generated.

 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.2.0
Reporter: Vinay
Assignee: Vinay
 Attachments: HDFS-5504.patch




[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-11-12 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820638#comment-13820638
 ] 

Jing Zhao commented on HDFS-5504:
-

The patch looks good to me. One minor point: removePathAndBlocks already holds 
the FSNS write lock, and with the patch we will acquire the FSNS write lock 
again inside removePathAndBlocks when calling removeBlocks. Can we avoid the 
double locking here and still reuse the code? Maybe we can define a new method 
just to reuse the following code:
{code}
for (int i = 0; i < BLOCK_DELETION_INCREMENT && iter.hasNext(); i++) {
  Block b = iter.next();
  if (trackBlockCounts) {
    BlockInfo bi = getStoredBlock(b);
    if (bi.isComplete()) {
      numRemovedComplete++;
      if (bi.numNodes() >= blockManager.minReplication) {
        numRemovedSafe++;
      }
    }
  }
  blockManager.removeBlock(b);
}
{code}
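
The double locking described above, and one way to share the loop without it, 
can be sketched as follows. This is only an illustration under assumptions: 
LockReuseSketch and all of its members are hypothetical, block IDs are plain 
longs, and the lock is modelled with java.util.concurrent's 
ReentrantReadWriteLock. The helper removeBatchLocked expects the caller to hold 
the write lock already, so both entry points can call it.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for the namesystem; block IDs are plain longs. */
final class LockReuseSketch {
    static final int BLOCK_DELETION_INCREMENT = 1000;

    private final ReentrantReadWriteLock fsLock = new ReentrantReadWriteLock();
    final Set<Long> blocksMap = new HashSet<>();
    int maxWriteHoldCount = 0; // deepest write-lock nesting seen during removal

    /** Shared loop body; the caller must already hold the write lock. */
    private void removeBatchLocked(Iterator<Long> iter) {
        if (!fsLock.isWriteLockedByCurrentThread()) {
            throw new IllegalStateException("write lock not held");
        }
        maxWriteHoldCount = Math.max(maxWriteHoldCount, fsLock.getWriteHoldCount());
        for (int i = 0; i < BLOCK_DELETION_INCREMENT && iter.hasNext(); i++) {
            blocksMap.remove(iter.next());
        }
    }

    /** Takes and releases the write lock once per batch. */
    void removeBlocks(List<Long> toDelete) {
        Iterator<Long> iter = toDelete.iterator();
        while (iter.hasNext()) {
            fsLock.writeLock().lock();
            try {
                removeBatchLocked(iter);
            } finally {
                fsLock.writeLock().unlock();
            }
        }
    }

    /** Double locking: holds the write lock, then re-enters it via removeBlocks. */
    void removePathAndBlocksNested(List<Long> toDelete) {
        fsLock.writeLock().lock();
        try {
            removeBlocks(toDelete);
        } finally {
            fsLock.writeLock().unlock();
        }
    }

    /** Reuses the helper directly, so the write lock is taken only once. */
    void removePathAndBlocksShared(List<Long> toDelete) {
        fsLock.writeLock().lock();
        try {
            Iterator<Long> iter = toDelete.iterator();
            while (iter.hasNext()) {
                removeBatchLocked(iter);
            }
        } finally {
            fsLock.writeLock().unlock();
        }
    }
}
```

Because the write lock is reentrant, the nested variant still works; it just 
reaches a hold count of 2, while the shared-helper variant stays at 1.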

 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.2.0
Reporter: Vinay
Assignee: Vinay
 Attachments: HDFS-5504.patch




[jira] [Commented] (HDFS-5504) In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, leads to NN safemode.

2013-11-12 Thread Vinay (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820767#comment-13820767
 ] 

Vinay commented on HDFS-5504:
-

Hi Jing,
Thanks for reviewing the patch.
I thought about keeping the old code in {{removePathAndBlocks()}}. Looking at 
the code, this locking happens only while loading edits, which occurs only at 
startup or while tailing edits in the SNN. So locking and unlocking again for 
every 1000 blocks should not be a problem, in my opinion.

If an update is still required, I will upload a patch addressing this.

 In HA mode, OP_DELETE_SNAPSHOT is not decrementing the safemode threshold, 
 leads to NN safemode.
 

 Key: HDFS-5504
 URL: https://issues.apache.org/jira/browse/HDFS-5504
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: snapshots
Affects Versions: 3.0.0, 2.2.0
Reporter: Vinay
Assignee: Vinay
 Attachments: HDFS-5504.patch

