[jira] [Updated] (HDFS-9865) TestBlockReplacement fails intermittently in trunk

2016-03-07 Thread Masatake Iwasaki (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Masatake Iwasaki updated HDFS-9865:
---
       Resolution: Fixed
     Hadoop Flags: Reviewed
    Fix Version/s: 2.7.3, 2.8.0
 Target Version/s: 2.7.3
           Status: Resolved  (was: Patch Available)

+1. Committed to branch-2.7 and above. Thanks, [~linyiqun].

> TestBlockReplacement fails intermittently in trunk
> --
>
> Key: HDFS-9865
> URL: https://issues.apache.org/jira/browse/HDFS-9865
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Affects Versions: 2.7.1
> Reporter: Lin Yiqun
> Assignee: Lin Yiqun
> Fix For: 2.8.0, 2.7.3
>
> Attachments: HDFS-9865.001.patch, HDFS-9865.002.patch
>
>
> The test case {{TestBlockReplacement}} fails intermittently. The unit test 
> log always shows the following:
> {code}
> org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement
> testDeletedBlockWhenAddBlockIsInEdit(org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement)  Time elapsed: 8.764 sec  <<< FAILURE!
> java.lang.AssertionError: The block should be only on 1 datanode  expected:<1> but was:<2>
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.junit.Assert.failNotEquals(Assert.java:743)
>   at org.junit.Assert.assertEquals(Assert.java:118)
>   at org.junit.Assert.assertEquals(Assert.java:555)
>   at org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement.testDeletedBlockWhenAddBlockIsInEdit(TestBlockReplacement.java:436)
> {code}
> The cause is that the block has not been completely deleted when 
> testDeletedBlockWhenAddBlockIsInEdit checks it, so the datanode count is 
> wrong. The fixed time spent waiting for FsDatasetAsyncDiskService to delete 
> the block is not a reliable value. 
> {code}
> LOG.info("replaceBlock:  " + replaceBlock(block,
>   (DatanodeInfo)sourceDnDesc, (DatanodeInfo)sourceDnDesc,
>   (DatanodeInfo)destDnDesc));
> // Waiting for the FsDatasetAsyncDiskService to delete the block
> Thread.sleep(3000);
> {code}
> When I reduce this time to 1 second, the test always fails. The 3 seconds 
> used in the test is not a reliable value either. We should change this logic 
> to something more robust, such as waiting for the block to be replicated, 
> as is done in testDecommision.
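
For illustration only, the dynamic wait proposed above can be sketched as a
polling loop that re-checks a condition until it holds or a timeout expires,
instead of sleeping for a fixed 3 seconds. This is not taken from the attached
patches: the names PollingWait, waitFor and replicaCount are hypothetical, and
the background thread merely stands in for the asynchronous block deletion.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class PollingWait {

  /**
   * Polls the condition until it holds or the timeout elapses.
   * Returns true if the condition became true, false on timeout or interrupt.
   * (Hypothetical helper, not part of Hadoop.)
   */
  public static boolean waitFor(BooleanSupplier condition,
                                long pollIntervalMs, long timeoutMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(pollIntervalMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt status and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) {
    // Stand-in for the number of datanodes still holding the block.
    AtomicInteger replicaCount = new AtomicInteger(2);

    // Simulates the asynchronous deletion finishing after an unknown delay.
    Thread deleter = new Thread(() -> {
      try {
        Thread.sleep(200);
      } catch (InterruptedException e) {
        return;
      }
      replicaCount.set(1);
    });
    deleter.start();

    // Wait only as long as needed, with an upper bound of 5 seconds.
    boolean deleted = waitFor(() -> replicaCount.get() == 1, 50, 5000);
    System.out.println("block on exactly 1 datanode: " + deleted);
  }
}
```

With this shape the test returns as soon as the deletion is observed, and the
timeout only bounds the worst case rather than setting the normal run time.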



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HDFS-9865) TestBlockReplacement fails intermittently in trunk

2016-03-07 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-9865:

Attachment: HDFS-9865.002.patch

> TestBlockReplacement fails intermittently in trunk
> --
>
> Key: HDFS-9865
> URL: https://issues.apache.org/jira/browse/HDFS-9865
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Affects Versions: 2.7.1
> Reporter: Lin Yiqun
> Assignee: Lin Yiqun
> Attachments: HDFS-9865.001.patch, HDFS-9865.002.patch





[jira] [Updated] (HDFS-9865) TestBlockReplacement fails intermittently in trunk

2016-02-26 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-9865:

Attachment: HDFS-9865.001.patch

> TestBlockReplacement fails intermittently in trunk
> --
>
> Key: HDFS-9865
> URL: https://issues.apache.org/jira/browse/HDFS-9865
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Affects Versions: 2.7.1
> Reporter: Lin Yiqun
> Assignee: Lin Yiqun
> Attachments: HDFS-9865.001.patch





[jira] [Updated] (HDFS-9865) TestBlockReplacement fails intermittently in trunk

2016-02-26 Thread Lin Yiqun (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lin Yiqun updated HDFS-9865:

Status: Patch Available  (was: Open)

Attached an initial patch that replaces the fixed wait with a dynamic one. 
Kindly review, thanks.

> TestBlockReplacement fails intermittently in trunk
> --
>
> Key: HDFS-9865
> URL: https://issues.apache.org/jira/browse/HDFS-9865
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Affects Versions: 2.7.1
> Reporter: Lin Yiqun
> Assignee: Lin Yiqun


