[jira] [Commented] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem

2012-09-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13464677#comment-13464677
 ] 

Hudson commented on HDFS-3860:
--

Integrated in Hadoop-Hdfs-0.23-Build #387 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/387/])
svn merge -c 1378228 FIXES: HDFS-3860. HeartbeatManager#Monitor may wrongly 
hold the writelock of namesystem. Contributed by Jing Zhao. (Revision 1390632)

 Result = UNSTABLE
bobby : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1390632
Files : 
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java


> HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
> -
>
> Key: HDFS-3860
> URL: https://issues.apache.org/jira/browse/HDFS-3860
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 0.23.4, 2.0.2-alpha
>
> Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch
>
>
> In HeartbeatManager#heartbeatCheck, if some dead datanode is found, the 
> monitor thread will acquire the write lock of namesystem, and recheck the 
> safemode. If it is in safemode, the monitor thread will return from the 
> heartbeatCheck function without releasing the write lock. This may cause the 
> monitor thread to wrongly hold the write lock forever.
> The attached test case tries to simulate this bad scenario.
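
A minimal sketch of the lock-leak pattern described above, assuming a dead
datanode has already been found. ToyNamesystem, HeartbeatCheckSketch, and the
raceWindow hook are hypothetical names for illustration, not the real
FSNamesystem or HeartbeatManager API, and the try/finally form is one common
way to guard such a path, not necessarily what HDFS-3860.patch does. The
raceWindow hook stands in for another thread putting the namenode into
safemode between the first check and the recheck under the write lock.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for the namesystem lock; not the real FSNamesystem API.
class ToyNamesystem {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private volatile boolean safeMode;

    void writeLock() { lock.writeLock().lock(); }
    void writeUnlock() { lock.writeLock().unlock(); }
    boolean isInSafeMode() { return safeMode; }
    void setSafeMode(boolean on) { safeMode = on; }
    boolean heldByCurrentThread() { return lock.isWriteLockedByCurrentThread(); }
}

class HeartbeatCheckSketch {
    // Buggy shape: the return after the recheck skips writeUnlock().
    static void leakyCheck(ToyNamesystem ns, Runnable raceWindow) {
        if (ns.isInSafeMode()) {
            return;                 // early exit, no lock taken yet
        }
        raceWindow.run();           // stands in for another thread acting here
        ns.writeLock();
        if (ns.isInSafeMode()) {
            return;                 // BUG: write lock is still held
        }
        // ... remove the dead datanode here ...
        ns.writeUnlock();
    }

    // Guarded shape: finally releases the lock on every exit path.
    static void guardedCheck(ToyNamesystem ns, Runnable raceWindow) {
        if (ns.isInSafeMode()) {
            return;
        }
        raceWindow.run();
        ns.writeLock();
        try {
            if (ns.isInSafeMode()) {
                return;
            }
            // ... remove the dead datanode here ...
        } finally {
            ns.writeUnlock();
        }
    }

    public static void main(String[] args) {
        ToyNamesystem ns = new ToyNamesystem();
        leakyCheck(ns, () -> ns.setSafeMode(true));
        System.out.println("leaky still holds lock: " + ns.heldByCurrentThread());
        ns.writeUnlock();           // clean up the leaked hold
        ns.setSafeMode(false);
        guardedCheck(ns, () -> ns.setSafeMode(true));
        System.out.println("guarded still holds lock: " + ns.heldByCurrentThread());
    }
}
```

In the leaky version the calling thread still owns the write lock after the
method returns, so any other thread that needs the lock would block. In the
guarded version the finally block runs on the safemode return as well.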

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem

2012-08-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444078#comment-13444078
 ] 

Hudson commented on HDFS-3860:
--

Integrated in Hadoop-Mapreduce-trunk #1180 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1180/])
HDFS-3860. HeartbeatManager#Monitor may wrongly hold the writelock of 
namesystem. Contributed by Jing Zhao. (Revision 1378228)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1378228
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java


> HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
> -
>
> Key: HDFS-3860
> URL: https://issues.apache.org/jira/browse/HDFS-3860
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch



[jira] [Commented] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem

2012-08-29 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13444024#comment-13444024
 ] 

Hudson commented on HDFS-3860:
--

Integrated in Hadoop-Hdfs-trunk #1149 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1149/])
HDFS-3860. HeartbeatManager#Monitor may wrongly hold the writelock of 
namesystem. Contributed by Jing Zhao. (Revision 1378228)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1378228
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java


> HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
> -
>
> Key: HDFS-3860
> URL: https://issues.apache.org/jira/browse/HDFS-3860
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch



[jira] [Commented] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem

2012-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443367#comment-13443367
 ] 

Hudson commented on HDFS-3860:
--

Integrated in Hadoop-Hdfs-trunk-Commit #2715 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2715/])
HDFS-3860. HeartbeatManager#Monitor may wrongly hold the writelock of 
namesystem. Contributed by Jing Zhao. (Revision 1378228)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1378228
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java


> HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
> -
>
> Key: HDFS-3860
> URL: https://issues.apache.org/jira/browse/HDFS-3860
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch



[jira] [Commented] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem

2012-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443353#comment-13443353
 ] 

Hudson commented on HDFS-3860:
--

Integrated in Hadoop-Common-trunk-Commit #2651 (See 
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2651/])
HDFS-3860. HeartbeatManager#Monitor may wrongly hold the writelock of 
namesystem. Contributed by Jing Zhao. (Revision 1378228)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1378228
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java


> HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
> -
>
> Key: HDFS-3860
> URL: https://issues.apache.org/jira/browse/HDFS-3860
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch



[jira] [Commented] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem

2012-08-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443338#comment-13443338
 ] 

Hudson commented on HDFS-3860:
--

Integrated in Hadoop-Mapreduce-trunk-Commit #2680 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2680/])
HDFS-3860. HeartbeatManager#Monitor may wrongly hold the writelock of 
namesystem. Contributed by Jing Zhao. (Revision 1378228)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1378228
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/HeartbeatManager.java


> HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
> -
>
> Key: HDFS-3860
> URL: https://issues.apache.org/jira/browse/HDFS-3860
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch



[jira] [Commented] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem

2012-08-28 Thread Jing Zhao (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443292#comment-13443292
 ] 

Jing Zhao commented on HDFS-3860:
-

I just checked all the invocations of namesystem#writelock / writeunlock, and 
did not find similar problems. I will check other similar code too.

> HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
> -
>
> Key: HDFS-3860
> URL: https://issues.apache.org/jira/browse/HDFS-3860
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch



[jira] [Commented] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem

2012-08-28 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443289#comment-13443289
 ] 

Suresh Srinivas commented on HDFS-3860:
---

Thanks Aaron for committing the patch.

bq. BTW could you please also ensure that this pattern of code is not repeated 
in any other places.
Going back to my previous comment, Jing, if possible can you also see if there 
are other such issues.

> HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
> -
>
> Key: HDFS-3860
> URL: https://issues.apache.org/jira/browse/HDFS-3860
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 2.2.0-alpha
>
> Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch



[jira] [Commented] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem

2012-08-28 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443271#comment-13443271
 ] 

Aaron T. Myers commented on HDFS-3860:
--

Oof, good catch, Jing. Fortunately this case seems like it would be pretty 
tough to hit, since if the NN is in SM then HeartbeatManager#heartbeatCheck 
will return early, so to hit this the NN would have to enter SM in a very short 
window of time. Still certainly worth fixing, though.

The patch looks good to me. The findbugs warning is unrelated and 
TestHftpDelegationToken is known to currently be failing.

+1, I'll commit this momentarily.

> HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
> -
>
> Key: HDFS-3860
> URL: https://issues.apache.org/jira/browse/HDFS-3860
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch



[jira] [Commented] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem

2012-08-28 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443105#comment-13443105
 ] 

Hadoop QA commented on HDFS-3860:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12542695/HDFS-3860.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 eclipse:eclipse.  The patch built with eclipse:eclipse.

-1 findbugs.  The patch appears to introduce 1 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestHftpDelegationToken

+1 contrib tests.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3106//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3106//artifact/trunk/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3106//console

This message is automatically generated.

> HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
> -
>
> Key: HDFS-3860
> URL: https://issues.apache.org/jira/browse/HDFS-3860
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch



[jira] [Commented] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem

2012-08-28 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443056#comment-13443056
 ] 

Suresh Srinivas commented on HDFS-3860:
---

Jing, nice find. Submitting the patch.

> HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
> -
>
> Key: HDFS-3860
> URL: https://issues.apache.org/jira/browse/HDFS-3860
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch



[jira] [Commented] (HDFS-3860) HeartbeatManager#Monitor may wrongly hold the writelock of namesystem

2012-08-28 Thread Suresh Srinivas (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13443058#comment-13443058
 ] 

Suresh Srinivas commented on HDFS-3860:
---

BTW could you please also ensure that this pattern of code is not repeated in 
any other places.

> HeartbeatManager#Monitor may wrongly hold the writelock of namesystem
> -
>
> Key: HDFS-3860
> URL: https://issues.apache.org/jira/browse/HDFS-3860
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Attachments: HDFS-3860.patch, HDFS-heartbeat-testcase.patch
