[ 
https://issues.apache.org/jira/browse/HDFS-15796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17376419#comment-17376419
 ] 

Stephen O'Donnell commented on HDFS-15796:
------------------------------------------

Looking at the changes in this area, I think the problem was caused by 
HDFS-15159. However, I don't think the solution in the patch is the best 
way to fix this. It means that anyone who calls getTargets(...) in the 
future will need to know to synchronise around the result, which could 
lead to another bug like this one.

A better approach may be to return a new ArrayList from getTargets, e.g.:

{code}
  List<DatanodeStorageInfo> getTargets(BlockInfo block) {
    synchronized (pendingReconstructions) {
      PendingBlockInfo found = pendingReconstructions.get(block);
      if (found != null) {
        return new ArrayList<>(found.targets);  // changed line here
      }
    }
    return null;
  }
{code}

That way, the caller iterates over its own copy, so it doesn't matter if 
something else changes the underlying list before the copy is used.
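
To illustrate the difference outside of HDFS, here is a minimal standalone 
sketch. The class PendingTargetsSketch, its String-based map and the method 
names getTargetsLive / getTargetsCopy are hypothetical stand-ins made up for 
this example; they are not the real PendingReconstructionBlocks code. It only 
shows that iterating the live list fails once the list is modified, while 
iterating a copy taken under the lock does not.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the pending reconstruction map (not the HDFS class).
class PendingTargetsSketch {
  private final Map<String, List<String>> pending = new HashMap<>();

  // Records a target for a block, creating the list on first use.
  void add(String block, String target) {
    synchronized (pending) {
      pending.computeIfAbsent(block, k -> new ArrayList<>()).add(target);
    }
  }

  // Returns the internal list itself: the lock is released on return,
  // so the caller would have to synchronise again while iterating.
  List<String> getTargetsLive(String block) {
    synchronized (pending) {
      return pending.get(block);
    }
  }

  // Returns a copy made while the lock is held; later changes to the
  // internal list do not affect the copy.
  List<String> getTargetsCopy(String block) {
    synchronized (pending) {
      List<String> found = pending.get(block);
      return found == null ? null : new ArrayList<>(found);
    }
  }

  // Iterates over the given list while adding a target to blk_1 on each step.
  // Returns true if the iteration hit a ConcurrentModificationException.
  static boolean failsWhenModifiedDuringIteration(
      PendingTargetsSketch p, List<String> targets) {
    try {
      for (String ignored : targets) {
        p.add("blk_1", "dn-extra");
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    PendingTargetsSketch p = new PendingTargetsSketch();
    p.add("blk_1", "dn-1");
    p.add("blk_1", "dn-2");
    System.out.println("live list fails: "
        + failsWhenModifiedDuringIteration(p, p.getTargetsLive("blk_1")));
    System.out.println("copy fails: "
        + failsWhenModifiedDuringIteration(p, p.getTargetsCopy("blk_1")));
  }
}
```

The sketch modifies the list from the same thread to keep the failure 
deterministic; in the NameNode the modification would come from another 
thread, but the iterator check that throws is the same one. The cost of the 
copy is one small allocation per call, which seems acceptable if the target 
lists are short.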

> ConcurrentModificationException error happens on NameNode occasionally
> ----------------------------------------------------------------------
>
>                 Key: HDFS-15796
>                 URL: https://issues.apache.org/jira/browse/HDFS-15796
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs
>    Affects Versions: 3.1.1
>            Reporter: Daniel Ma
>            Priority: Critical
>         Attachments: 0001-HDFS-15796.patch
>
>
> ConcurrentModificationException error happens on NameNode occasionally.
>  
> {code:java}
> 2021-01-23 20:21:18,107 | ERROR | RedundancyMonitor | RedundancyMonitor thread received Runtime exception.  | BlockManager.java:4746
> java.util.ConcurrentModificationException
>       at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
>       at java.util.ArrayList$Itr.next(ArrayList.java:859)
>       at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeReconstructionWorkForBlocks(BlockManager.java:1907)
>       at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeBlockReconstructionWork(BlockManager.java:1859)
>       at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.computeDatanodeWork(BlockManager.java:4862)
>       at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager$RedundancyMonitor.run(BlockManager.java:4729)
>       at java.lang.Thread.run(Thread.java:748)
> {code}
>  
>  


