[jira] [Updated] (HDFS-9313) Possible NullPointerException in BlockManager if no excess replica can be chosen
[ https://issues.apache.org/jira/browse/HDFS-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Junping Du updated HDFS-9313:
    Target Version/s: 2.6.4

> Possible NullPointerException in BlockManager if no excess replica can be chosen
>
>                 Key: HDFS-9313
>                 URL: https://issues.apache.org/jira/browse/HDFS-9313
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>             Fix For: 2.8.0, 2.7.3, 2.6.4
>
>         Attachments: HDFS-9313-2.patch, HDFS-9313.branch26.patch, HDFS-9313.branch27.patch, HDFS-9313.patch
>
> HDFS-8647 makes it easier to reason about various block placement scenarios.
> Here is one possible case where BlockManager won't be able to find the excess
> replica to delete: when the storage policy changes at around the same time the
> balancer moves the block. When this happens, it causes a NullPointerException.
> {noformat}
> java.lang.NullPointerException
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy.adjustSetsWithChosenReplica(BlockPlacementPolicy.java:156)
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicyDefault.chooseReplicasToDelete(BlockPlacementPolicyDefault.java:978)
> {noformat}
> Note that this has not been seen in any production cluster; it was found by
> new unit tests. In addition, the issue existed before HDFS-8647.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HDFS-9313) Possible NullPointerException in BlockManager if no excess replica can be chosen
[ https://issues.apache.org/jira/browse/HDFS-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhe Zhang updated HDFS-9313:
    Fix Version/s: 2.6.4
                   2.7.3
[jira] [Updated] (HDFS-9313) Possible NullPointerException in BlockManager if no excess replica can be chosen
[ https://issues.apache.org/jira/browse/HDFS-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Chen updated HDFS-9313:
    Attachment: HDFS-9313.branch27.patch
                HDFS-9313.branch26.patch
[jira] [Updated] (HDFS-9313) Possible NullPointerException in BlockManager if no excess replica can be chosen
[ https://issues.apache.org/jira/browse/HDFS-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9313:
    Resolution: Fixed
    Hadoop Flags: Reviewed
    Fix Version/s: 2.8.0
    Status: Resolved  (was: Patch Available)

Committed to trunk and branch-2. Thanks [~zhz], [~liuml07], [~walter.k.su] and [~brahmareddy].
[jira] [Updated] (HDFS-9313) Possible NullPointerException in BlockManager if no excess replica can be chosen
[ https://issues.apache.org/jira/browse/HDFS-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9313:
    Attachment: HDFS-9313-2.patch

New patch to fix the issue Mingliang pointed out. Thanks Mingliang, Walter, Brahma.
[jira] [Updated] (HDFS-9313) Possible NullPointerException in BlockManager if no excess replica can be chosen
[ https://issues.apache.org/jira/browse/HDFS-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9313:
    Assignee: Ming Ma
    Status: Patch Available  (was: Open)
[jira] [Updated] (HDFS-9313) Possible NullPointerException in BlockManager if no excess replica can be chosen
[ https://issues.apache.org/jira/browse/HDFS-9313?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ming Ma updated HDFS-9313:
    Attachment: HDFS-9313.patch

Here is the patch that illustrates the scenario. It is better to guard against this. In addition, for this specific test scenario, {{BlockPlacementPolicyDefault}} should have been able to delete excessSSD. We can fix it separately.
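
For readers following the thread, the kind of guard suggested above might look roughly like the sketch below. It is a simplified, self-contained model and not the code from any of the attached patches: the class ExcessReplicaSketch, the Replica type, and both method bodies are hypothetical stand-ins, and only the method names chooseReplicaToDelete and chooseReplicasToDelete echo the stack trace in the issue description. The assumption is that the chooser returns null when no replica matches the excess storage types (for example, after the storage policy changed while the balancer moved the block), and that the caller should stop instead of dereferencing the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExcessReplicaSketch {

    /** Hypothetical stand-in for a replica location; not the real HDFS type. */
    public static final class Replica {
        public final String node;
        public final String storageType;

        public Replica(String node, String storageType) {
            this.node = node;
            this.storageType = storageType;
        }
    }

    /**
     * Hypothetical chooser: returns the first replica whose storage type is
     * listed as excess, or null if no replica qualifies.
     */
    public static Replica chooseReplicaToDelete(List<Replica> candidates,
                                                List<String> excessTypes) {
        for (Replica r : candidates) {
            if (excessTypes.contains(r.storageType)) {
                return r;
            }
        }
        return null;
    }

    /**
     * Picks replicas to delete until only 'expected' remain. The null check
     * is the guard: without it, the bookkeeping below would throw a
     * NullPointerException when no excess replica can be chosen.
     */
    public static List<Replica> chooseReplicasToDelete(List<Replica> candidates,
                                                       int expected,
                                                       List<String> excessTypes) {
        List<Replica> remaining = new ArrayList<>(candidates);
        List<String> types = new ArrayList<>(excessTypes);
        List<Replica> chosen = new ArrayList<>();
        while (remaining.size() > expected) {
            Replica cur = chooseReplicaToDelete(remaining, types);
            if (cur == null) {
                // No replica matches the excess storage types; give up
                // for now rather than dereference null.
                break;
            }
            remaining.remove(cur);
            types.remove(cur.storageType);
            chosen.add(cur);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Replica> replicas = Arrays.asList(
            new Replica("dn1", "DISK"), new Replica("dn2", "DISK"),
            new Replica("dn3", "DISK"), new Replica("dn4", "DISK"));
        // The excess type is SSD, but every replica sits on DISK.
        List<Replica> chosen =
            chooseReplicasToDelete(replicas, 3, Arrays.asList("SSD"));
        System.out.println("replicas chosen for deletion: " + chosen.size());
    }
}
```

In this model the mismatched case simply returns an empty list, leaving the over-replicated block to be reconsidered later. Whether the real code should log, retry, or fall back to another storage type is a separate design question, as the comment about excessSSD above notes.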