[ 
https://issues.apache.org/jira/browse/HDFS-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17375421#comment-17375421
 ] 

Daniel Ma edited comment on HDFS-15796 at 7/6/21, 9:57 AM:
-----------------------------------------------------------

[~sodonnell]

Thanks for reviewing. Actually, you missed the for loop here:
{code:java}
synchronized (pendingReconstruction) {
  List<DatanodeStorageInfo> targets = pendingReconstruction
      .getTargets(rw.getBlock());
  if (targets != null) {
    for (DatanodeStorageInfo dn : targets) {
      if (!excludedNodes.contains(dn.getDatanodeDescriptor())) {
        excludedNodes.add(dn.getDatanodeDescriptor());
      }
    }
  }
}
{code}
The problem happens when the code above iterates over the DataNodes stored in 
the pendingReconstruction object while the DataNode list is also being modified 
elsewhere.

In other words, if you modify a List (add or remove an element) while iterating 
over it, a ConcurrentModificationException will be thrown.
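
To illustrate the failure mode described above, here is a minimal, single-threaded sketch. It is not the code from the attached patch and does not use any Hadoop classes; the class and method names (CmeDemo, iterateWhileModifying, iterateOverSnapshot) and the "dn1"-style strings are hypothetical stand-ins. It only shows that ArrayList's fail-fast iterator throws as soon as the list is structurally modified during iteration, and that iterating over a copy avoids this.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class; none of these names come from the Hadoop code base.
class CmeDemo {

  // Iterates over the list while structurally modifying it, which is the
  // pattern that trips ArrayList's fail-fast iterator. Returns true if a
  // ConcurrentModificationException was observed.
  static boolean iterateWhileModifying(List<String> nodes) {
    try {
      for (String node : nodes) {
        // Structural modification during iteration; the next call to
        // Iterator.next() detects the changed modification count.
        nodes.add(node + "-copy");
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  // Iterates over a snapshot of the list, so changes to the original list
  // cannot invalidate the iterator. Returns the number of elements visited.
  static int iterateOverSnapshot(List<String> nodes) {
    List<String> snapshot = new ArrayList<>(nodes);
    int visited = 0;
    for (String node : snapshot) {
      nodes.add(node + "-copy");
      visited++;
    }
    return visited;
  }

  public static void main(String[] args) {
    List<String> a = new ArrayList<>(Arrays.asList("dn1", "dn2", "dn3"));
    System.out.println("exception seen: " + iterateWhileModifying(a));

    List<String> b = new ArrayList<>(Arrays.asList("dn1", "dn2", "dn3"));
    System.out.println("visited via snapshot: " + iterateOverSnapshot(b));
  }
}
```

Note that in a multi-threaded setting the snapshot copy is itself an iteration over the list, so it would still need to be taken under the same lock that guards the writers.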


> ConcurrentModificationException error happens on NameNode occasionally
> ----------------------------------------------------------------------
>
>                 Key: HDFS-15796
>                 URL: https://issues.apache.org/jira/browse/HDFS-15796
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.1.1
>            Reporter: Daniel Ma
>            Priority: Critical
>         Attachments: 0001-HDFS-15796.patch
>
>
> ConcurrentModificationException error happens on NameNode occasionally.
>  
> {code:java}
> 2021-01-23 20:21:18,107 | ERROR | RedundancyMonitor | RedundancyMonitor thread received Runtime exception.  | BlockManager.java:4746
> java.util.ConcurrentModificationException
>       at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
>       at java.util.ArrayList$Itr.next(ArrayList.java:859)
>       at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReconstructionWorkForBlocks(BlockManager.java:1907)
>       at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockReconstructionWork(BlockManager.java:1859)
>       at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4862)
>       at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4729)
>       at java.lang.Thread.run(Thread.java:748)
> {code}
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
