[ https://issues.apache.org/jira/browse/HDFS-14126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang resolved HDFS-14126.
------------------------------------
    Resolution: Duplicate

I believe HDFS-14476 solves the same issue, so I'll mark this as a duplicate.

> DataNode DirectoryScanner holding global lock for too long
> -----------------------------------------------------------
>
>                 Key: HDFS-14126
>                 URL: https://issues.apache.org/jira/browse/HDFS-14126
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 3.0.0
>            Reporter: Wei-Chiu Chuang
>            Priority: Major
>
> I've got a Hadoop 3 based cluster set up, and this DN has just 434 thousand blocks.
> And yet, DirectoryScanner holds the fsdataset lock for 2.7 seconds:
> {quote}
> 2018-12-03 21:33:09,130 INFO org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool BP-4588049-10.17.XXX-XX-XXXX281857726 Total blocks: 434401, missing metadata files:0, missing block files:0, missing blocks in memory:0, mismatched blocks:0
> 2018-12-03 21:33:09,131 WARN org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Lock held time above threshold: lock identifier: org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl lockHeldTimeMs=2710 ms. Suppressed 0 lock warnings. The stack trace is:
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
> org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
> org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:473)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:373)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:318)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> {quote}
> Log messages like this repeat every several hours (6, to be exact). I am not sure whether this is a performance regression, or whether the lock hold time is simply now being logged in Hadoop 3. [~vagarychen] or [~templedf], do you know?
> There's no log in the DN to indicate any sort of JVM GC going on. Plus, the DN's heap size is set to several GB.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
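
[Editor's note] For readers unfamiliar with the lock instrumentation that produced the WARN line in the quoted log, the stack trace points at Hadoop's InstrumentedLock being checked on release through an AutoCloseableLock. Below is a minimal, hypothetical sketch of that pattern; the class and field names are illustrative and are not the actual FsDatasetImpl/InstrumentedLock code.

{code:java}
// Sketch only: a simplified stand-in for the InstrumentedLock/AutoCloseableLock
// pair seen in the stack trace above. Names here are hypothetical.
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class WarnOnLongHoldLock implements AutoCloseable {
  private final Lock lock = new ReentrantLock();
  private final long warnThresholdMs;
  private long acquiredAtMs;

  public WarnOnLongHoldLock(long warnThresholdMs) {
    this.warnThresholdMs = warnThresholdMs;
  }

  /** Acquire the lock and remember when it was taken. */
  public WarnOnLongHoldLock acquire() {
    lock.lock();
    acquiredAtMs = System.currentTimeMillis();
    return this;
  }

  /** Release the lock; warn if it was held longer than the threshold. */
  @Override
  public void close() {
    long heldMs = System.currentTimeMillis() - acquiredAtMs;
    lock.unlock();
    if (heldMs > warnThresholdMs) {
      System.err.println("Lock held time above threshold: lockHeldTimeMs=" + heldMs + " ms");
    }
  }

  public static void main(String[] args) throws Exception {
    WarnOnLongHoldLock datasetLock = new WarnOnLongHoldLock(300);
    // A scan-like operation holding the lock for its whole pass, analogous to
    // DirectoryScanner.scan() holding the FsDatasetImpl lock while reconciling.
    try (WarnOnLongHoldLock ignored = datasetLock.acquire()) {
      Thread.sleep(500); // simulate a long reconcile under the lock
    }
  }
}
{code}

Running the sketch prints the warning because the simulated 500 ms hold exceeds the 300 ms threshold, which mirrors how the 2710 ms hold by DirectoryScanner.scan() tripped the warning in the quoted log.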