[ https://issues.apache.org/jira/browse/HDFS-15963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364000#comment-17364000 ]
Kihwal Lee commented on HDFS-15963:
-----------------------------------

We hit this in 2.10 recently. I've cherry-picked it to branch-2.10 with minor conflicts. All new test cases pass.

> Unreleased volume references cause an infinite loop
> ---------------------------------------------------
>
>                 Key: HDFS-15963
>                 URL: https://issues.apache.org/jira/browse/HDFS-15963
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: datanode
>            Reporter: Shuyan Zhang
>            Assignee: Shuyan Zhang
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.3.1, 3.4.0
>
>         Attachments: HDFS-15963.001.patch, HDFS-15963.002.patch,
>                      HDFS-15963.003.patch
>
>          Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> When BlockSender throws an exception because the meta-data cannot be found,
> the volume reference obtained by the thread is not released, which causes the
> thread trying to remove the volume to wait and fall into an infinite loop.
> {code:java}
> boolean checkVolumesRemoved() {
>   Iterator<FsVolumeImpl> it = volumesBeingRemoved.iterator();
>   while (it.hasNext()) {
>     FsVolumeImpl volume = it.next();
>     if (!volume.checkClosed()) {
>       return false;
>     }
>     it.remove();
>   }
>   return true;
> }
>
> boolean checkClosed() {
>   // always be true.
>   if (this.reference.getReferenceCount() > 0) {
>     FsDatasetImpl.LOG.debug("The reference count for {} is {}, wait to be 0.",
>         this, reference.getReferenceCount());
>     return false;
>   }
>   return true;
> }
> {code}
> At the same time, because the thread has been holding checkDirsLock when
> removing the volume, other threads trying to acquire the same lock will be
> permanently blocked.
> Similar problems also occur in RamDiskAsyncLazyPersistService and
> FsDatasetAsyncDiskService.
> This patch releases the three previously unreleased volume references.
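
For readers skimming the thread, the failure mode in the description can be sketched roughly as below. This is not the patch itself: `Volume`, `openLeaky`, and `openFixed` are hypothetical stand-ins (assumptions made for illustration) for the reference-counted volume and the BlockSender setup path. The sketch only shows why a reference that is not released on the exception path keeps `checkClosed()` returning false forever, and how releasing it before rethrowing avoids that.

```java
import java.io.Closeable;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; all names here are hypothetical, not Hadoop APIs.
final class VolumeReferenceSketch {

    /** Hypothetical stand-in for a reference-counted volume. */
    static final class Volume {
        private final AtomicInteger referenceCount = new AtomicInteger();

        /** Takes a reference; closing the returned handle releases it. */
        Closeable obtainReference() {
            referenceCount.incrementAndGet();
            return () -> { referenceCount.decrementAndGet(); };
        }

        /** Mirrors the check in the description: closed only at count 0. */
        boolean checkClosed() {
            return referenceCount.get() == 0;
        }
    }

    /** Buggy shape: the exception escapes while the reference is still held. */
    static Closeable openLeaky(Volume volume, boolean metaMissing) throws IOException {
        Closeable ref = volume.obtainReference();
        if (metaMissing) {
            throw new FileNotFoundException("meta file not found"); // ref leaks
        }
        return ref; // on success the caller owns and later closes the reference
    }

    /** Fixed shape: release the reference on the failure path, then rethrow. */
    static Closeable openFixed(Volume volume, boolean metaMissing) throws IOException {
        Closeable ref = volume.obtainReference();
        try {
            if (metaMissing) {
                throw new FileNotFoundException("meta file not found");
            }
            return ref;
        } catch (IOException e) {
            ref.close(); // count goes back down before the error propagates
            throw e;
        }
    }

    public static void main(String[] args) {
        Volume leaky = new Volume();
        try {
            openLeaky(leaky, true);
        } catch (IOException expected) {
            // the reference taken inside openLeaky was never released
        }
        System.out.println("leaky volume closed? " + leaky.checkClosed());

        Volume fixed = new Volume();
        try {
            openFixed(fixed, true);
        } catch (IOException expected) {
            // the reference was released before the exception propagated
        }
        System.out.println("fixed volume closed? " + fixed.checkClosed());
    }
}
```

With the leaky shape, a remover polling `checkClosed()` in a loop would never see the count reach zero, which matches the infinite loop described above.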