[jira] [Updated] (HDFS-10426) TestPendingInvalidateBlock failed in trunk
[ https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Masatake Iwasaki updated HDFS-10426:
------------------------------------
       Resolution: Fixed
    Fix Version/s: 3.0.0-alpha2
                   2.8.0
           Status: Resolved  (was: Patch Available)

Committed. Thanks, [~linyiqun].

> TestPendingInvalidateBlock failed in trunk
> ------------------------------------------
>
>                 Key: HDFS-10426
>                 URL: https://issues.apache.org/jira/browse/HDFS-10426
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>             Fix For: 2.8.0, 3.0.0-alpha2
>
>         Attachments: HDFS-10426.001.patch, HDFS-10426.002.patch,
> HDFS-10426.003.patch, HDFS-10426.004.patch, HDFS-10426.005.patch,
> HDFS-10426.006.patch
>
>
> The test {{TestPendingInvalidateBlock}} fails intermittently. The stack info:
> {code}
> org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock
> testPendingDeletion(org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock)
>   Time elapsed: 7.703 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<1>
>     at org.junit.Assert.fail(Assert.java:88)
>     at org.junit.Assert.failNotEquals(Assert.java:743)
>     at org.junit.Assert.assertEquals(Assert.java:118)
>     at org.junit.Assert.assertEquals(Assert.java:555)
>     at org.junit.Assert.assertEquals(Assert.java:542)
>     at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock.testPendingDeletion(TestPendingInvalidateBlock.java:92)
> {code}
> It looks like the {{invalidateBlock}} had already been removed before we did the check:
> {code}
> // restart NN
> cluster.restartNameNode(true);
> dfs.delete(foo, true);
> Assert.assertEquals(0, cluster.getNamesystem().getBlocksTotal());
> Assert.assertEquals(REPLICATION, cluster.getNamesystem()
>     .getPendingDeletionBlocks());
> Assert.assertEquals(REPLICATION,
>     dfs.getPendingDeletionBlocksCount());
> {code}
> I looked into the related configurations and found that the property
> {{dfs.namenode.replication.interval}} is set to just 1 second in this test.
> If the delete operation is slow and the delay of
> {{dfs.namenode.startup.delay.block.deletion.sec}} has already elapsed, the
> check can fail. As the stack info above shows, the failed test took 7.7 s,
> which is more than 5+1 seconds.
> One way to improve this:
> * Increase the value of {{dfs.namenode.replication.interval}}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
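The timing argument in the issue description (7.7 s elapsed versus the "5+1 second" window) can be written down as a small model. This is only a sketch, not Hadoop code: the class and method names are hypothetical, and the 5 second startup delay and 1 second monitor interval are read off the "5+1" in the description rather than taken from the test source.

```java
// Toy model of the timing window from the issue description; not Hadoop code.
// All names here are hypothetical.
final class DeletionWindow {
    private DeletionWindow() {}

    /**
     * Startup delay plus one full monitor interval, in seconds. This mirrors
     * the "5+1" sum used in the description.
     */
    static double windowSec(double startupDelaySec, double intervalSec) {
        return startupDelaySec + intervalSec;
    }

    /**
     * True if a check made at elapsedSec falls outside the window, so the
     * pending deletions may already have been handed out by the monitor.
     */
    static boolean checkIsTooLate(double elapsedSec,
                                  double startupDelaySec,
                                  double intervalSec) {
        return elapsedSec > windowSec(startupDelaySec, intervalSec);
    }
}
```

With the numbers from the report, `DeletionWindow.checkIsTooLate(7.703, 5, 1)` is true, which matches the observation that the assertion saw fewer pending blocks than expected. Note that 7.703 s is the elapsed time of the whole test, so the comparison is only a rough indication.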
[jira] [Updated] (HDFS-10426) TestPendingInvalidateBlock failed in trunk
[ https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiqun Lin updated HDFS-10426:
-----------------------------
    Attachment: HDFS-10426.006.patch
[jira] [Updated] (HDFS-10426) TestPendingInvalidateBlock failed in trunk
[ https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiqun Lin updated HDFS-10426:
-----------------------------
    Attachment: HDFS-10426.005.patch

Re-attached the same patch to kick Jenkins.
[jira] [Updated] (HDFS-10426) TestPendingInvalidateBlock failed in trunk
[ https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiqun Lin updated HDFS-10426:
-----------------------------
    Attachment:     (was: HDFS-10426.005.patch)
[jira] [Updated] (HDFS-10426) TestPendingInvalidateBlock failed in trunk
[ https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiqun Lin updated HDFS-10426:
-----------------------------
    Attachment: HDFS-10426.005.patch

Thanks [~iwasakims] for the comments. They look good to me. I think we can use the line {{Mockito.doReturn(1L).when(mockIb).getInvalidationDelay()}} here to delay the block deletion, as the test {{testPendingDeleteUnknownBlocks}} already does.
Attached a new patch to address the comments.
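The idea in the comment above, forcing {{getInvalidationDelay()}} to report a positive delay so that pending deletions stay queued while the test inspects them, can be illustrated without Hadoop or Mockito. The sketch below is a toy stand-in under stated assumptions: `PendingDeletions` and `HeldPendingDeletions` are hypothetical classes, and a hand-written subclass replaces the Mockito stub so the example has no dependencies.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical toy queue of blocks awaiting deletion; not the Hadoop class.
class PendingDeletions {
    private final Queue<String> pending = new ArrayDeque<>();

    void add(String blockId) {
        pending.add(blockId);
    }

    int size() {
        return pending.size();
    }

    /** Remaining delay in ms before deletions may be dispatched. */
    long getInvalidationDelay() {
        return 0L;
    }

    /** One monitor tick: dispatches all pending blocks only if no delay remains. */
    List<String> tick() {
        List<String> dispatched = new ArrayList<>();
        if (getInvalidationDelay() > 0) {
            return dispatched; // still delayed, nothing leaves the queue
        }
        while (!pending.isEmpty()) {
            dispatched.add(pending.poll());
        }
        return dispatched;
    }
}

// Test double: reports a 1 ms delay while 'hold' is set, like the stubbed call.
class HeldPendingDeletions extends PendingDeletions {
    volatile boolean hold = true;

    @Override
    long getInvalidationDelay() {
        return hold ? 1L : 0L;
    }
}
```

A test built this way can assert on the pending count while `hold` is true, then clear `hold` and let the next tick drain the queue, so the assertion no longer depends on how fast the delete operation runs.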
[jira] [Updated] (HDFS-10426) TestPendingInvalidateBlock failed in trunk
[ https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiqun Lin updated HDFS-10426:
-----------------------------
    Attachment: HDFS-10426.004.patch

I made an improvement based on my last patch. I added a new method for setting the existing variable {{blocksInvalidateWorkPct}}, so that we can control block deletion in BlockManager. We can also reuse it in BlockManager tests in the future.
Hi [~iwasakims], could you take a quick look at this? Thanks.
[jira] [Updated] (HDFS-10426) TestPendingInvalidateBlock failed in trunk
[ https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiqun Lin updated HDFS-10426:
-----------------------------
    Attachment: HDFS-10426.003.patch

Thanks [~iwasakims] for the comments.
{quote}
In order to fix #2, we should wait for replication count of the block reaches to 2 before delete.
{quote}
I think the method {{DataNodeTestUtils.setHeartbeatsDisabledForTests}} makes sense for this. We can disable heartbeats for the test here.
{quote}
ReplicationMonitor kicks in between delete and checking pending deletion blocks count.
{quote}
I'd like to add a test flag in {{BlockManager#ReplicationMonitor}}, which would need only a small change. However, I haven't found a way to control this without adding any code. I would be glad to hear any suggestions.

The other intermittent failures seem to happen because the pending deletion blocks had not been completely deleted when we did the check. We can add some retries for that.
Attached a new patch.
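The "add some retries" suggestion amounts to polling the condition until it holds or a timeout expires, instead of asserting once. A minimal, dependency-free sketch follows; `Poll.waitFor` is a hypothetical helper written for this illustration and is not taken from any of the attached patches.

```java
import java.util.function.BooleanSupplier;

// Hypothetical polling helper for tests; not part of Hadoop.
final class Poll {
    private Poll() {}

    /**
     * Re-evaluates 'check' every intervalMs until it returns true or
     * timeoutMs has passed. Returns true if the condition was met, false on
     * timeout or if the waiting thread was interrupted.
     */
    static boolean waitFor(BooleanSupplier check, long intervalMs, long timeoutMs) {
        final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!check.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out before the condition held
            }
            try {
                Thread.sleep(intervalMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

In a test, the supplier would wrap the count being checked, for example a lambda that compares the pending deletion count against zero, and the test would assert on the boolean result. The trade-off is a slower failure when something is truly broken, in exchange for tolerance of scheduling jitter.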
[jira] [Updated] (HDFS-10426) TestPendingInvalidateBlock failed in trunk
[ https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiqun Lin updated HDFS-10426:
-----------------------------
    Attachment: HDFS-10426.002.patch

Thanks [~iwasakims] for the comments. I tested the approach you suggested, but found that it still does not affect {{cluster.getNamesystem().getPendingDeletionBlocks()}}, and the invalidateBlocks are still removed. {{DataNodeTestUtils#setHeartbeatsDisabledForTests}} controls the blocks scheduled to the datanodes, so it makes sense for {{dfs.getPendingDeletionBlocksCount()}}.

So I think a better way is to dynamically adjust the {{pendingPeriodInMs}} time after the NameNode starts up. Before the check, we can set it to MAX_VALUE so that the invalidateBlocks are not processed. After that, we can reset it to -1. I will post a patch that uses this approach.
[jira] [Updated] (HDFS-10426) TestPendingInvalidateBlock failed in trunk
[ https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiqun Lin updated HDFS-10426:
-----------------------------
    Status: Patch Available  (was: Open)

Attached a simple patch for this.
[jira] [Updated] (HDFS-10426) TestPendingInvalidateBlock failed in trunk
[ https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiqun Lin updated HDFS-10426:
-----------------------------
    Attachment: HDFS-10426.001.patch
[jira] [Updated] (HDFS-10426) TestPendingInvalidateBlock failed in trunk
[ https://issues.apache.org/jira/browse/HDFS-10426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yiqun Lin updated HDFS-10426:
-----------------------------
    Description:
The test {{TestPendingInvalidateBlock}} fails intermittently. The stack info:
{code}
org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock
testPendingDeletion(org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock)
  Time elapsed: 7.703 sec  <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<1>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:743)
    at org.junit.Assert.assertEquals(Assert.java:118)
    at org.junit.Assert.assertEquals(Assert.java:555)
    at org.junit.Assert.assertEquals(Assert.java:542)
    at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock.testPendingDeletion(TestPendingInvalidateBlock.java:92)
{code}
It looks like the {{invalidateBlock}} had already been removed before we did the check:
{code}
// restart NN
cluster.restartNameNode(true);
dfs.delete(foo, true);
Assert.assertEquals(0, cluster.getNamesystem().getBlocksTotal());
Assert.assertEquals(REPLICATION, cluster.getNamesystem()
    .getPendingDeletionBlocks());
Assert.assertEquals(REPLICATION,
    dfs.getPendingDeletionBlocksCount());
{code}
I looked into the related configurations and found that the property {{dfs.namenode.replication.interval}} is set to just 1 second in this test. If the delete operation is slow and the delay of {{dfs.namenode.startup.delay.block.deletion.sec}} has already elapsed, the check can fail. As the stack info above shows, the failed test took 7.7 s, which is more than 5+1 seconds.
One way to improve this:
* Increase the value of {{dfs.namenode.replication.interval}}

  was:
The test {{TestPendingInvalidateBlock}} fails intermittently. The stack info:
{code}
org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock
testPendingDeletion(org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock)
  Time elapsed: 7.703 sec  <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<1>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:743)
    at org.junit.Assert.assertEquals(Assert.java:118)
    at org.junit.Assert.assertEquals(Assert.java:555)
    at org.junit.Assert.assertEquals(Assert.java:542)
    at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock.testPendingDeletion(TestPendingInvalidateBlock.java:92)
{code}
It looks like the {{invalidateBlock}} had already been removed before we did the check:
{code}
// restart NN
cluster.restartNameNode(true);
dfs.delete(foo, true);
Assert.assertEquals(0, cluster.getNamesystem().getBlocksTotal());
Assert.assertEquals(REPLICATION, cluster.getNamesystem()
    .getPendingDeletionBlocks());
Assert.assertEquals(REPLICATION,
    dfs.getPendingDeletionBlocksCount());
{code}
I looked into the related configurations and found that the property {{dfs.namenode.replication.interval}} is set to just 1 second in this test. If the delete operation is slow and the delay of {{dfs.namenode.startup.delay.block.deletion.sec}} has already elapsed, the check can fail. As the stack info above shows, the failed test took 7.7 s, which is more than 5+1 seconds.
Two changes would improve this:
* Increase the value of {{dfs.namenode.startup.delay.block.deletion.sec}}
* Increase the value of {{dfs.namenode.replication.interval}}


> TestPendingInvalidateBlock failed in trunk
> ------------------------------------------
>
>                 Key: HDFS-10426
>                 URL: https://issues.apache.org/jira/browse/HDFS-10426
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>            Reporter: Yiqun Lin
>            Assignee: Yiqun Lin
>
> The test {{TestPendingInvalidateBlock}} fails intermittently. The stack info:
> {code}
> org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock
> testPendingDeletion(org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock)
>   Time elapsed: 7.703 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<1>
>     at org.junit.Assert.fail(Assert.java:88)
>     at org.junit.Assert.failNotEquals(Assert.java:743)
>     at org.junit.Assert.assertEquals(Assert.java:118)
>     at org.junit.Assert.assertEquals(Assert.java:555)
>     at org.junit.Assert.assertEquals(Assert.java:542)
>     at org.apache.hadoop.hdfs.server.blockmanagement.TestPendingInvalidateBlock.testPendingDeletion(TestPendingInvalidateBlock.java:92)
> {code}
> It looks like the {{invalidateBlock}} had already been removed before we did the check:
> {code}
> // restart NN
> cluster.restartNameNode(true);
> dfs.delete(foo, true);
> Assert.assertEquals(0, cluster.getNamesystem().getBlocksTotal());
>