[ https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406064#comment-17406064 ]
shaik Mahaboob commented on HDFS-15406:
---------------------------------------

Hello,

Currently my local datanode, which is installed through Docker, is generating the WARN below while a Spark job writes a file. After this WARN message appears in my Hadoop logs, the Spark job is getting killed. Any pointers to resolve this would be highly appreciated.

====
namenode | 21/08/27 20:02:48 INFO namenode.FSEditLog: Number of transactions: 74 Total time for transactions(ms): 37 Number of transactions batched in Syncs: 41 Number of syncs: 30 SyncTimes(ms): 686
datanode | 21/08/27 20:09:14 WARN impl.FsDatasetImpl: Lock held time above threshold: lock identifier: org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl lockHeldTimeMs=374 ms. Suppressed 0 lock warnings. The stack trace is: java.lang.Thread.getStackTrace(Thread.java:1556)
datanode | org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
datanode | org.apache.hadoop.hdfs.InstrumentedLock.logWarning(InstrumentedLock.java:145)
datanode | org.apache.hadoop.hdfs.InstrumentedLock.check(InstrumentedLock.java:181)
datanode | org.apache.hadoop.hdfs.InstrumentedLock.unlock(InstrumentedLock.java:135)
datanode | org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
datanode | org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
datanode | org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:261)
datanode | org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:580)
datanode | org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:145)
datanode | org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:100)
datanode | org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:288)
datanode | java.lang.Thread.run(Thread.java:745)
===========

Regards,
Mahaboob
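The WARN in the log above comes from the pattern of wrapping a lock so that its hold time is measured on release and a warning is logged when it exceeds a threshold. The sketch below illustrates that general pattern only; the class and method names are illustrative, not Hadoop's actual InstrumentedLock/AutoCloseableLock implementation.

```java
import java.util.concurrent.locks.ReentrantLock;

/**
 * Minimal sketch of an instrumented lock: remember when the lock was
 * acquired, and on release log a warning if it was held longer than a
 * configured threshold. Names are hypothetical, for illustration only.
 */
public class TimedLock implements AutoCloseable {
    private final ReentrantLock lock = new ReentrantLock();
    private final long warnThresholdMs;
    private long acquiredAtMs;
    private long warningsIssued = 0;

    public TimedLock(long warnThresholdMs) {
        this.warnThresholdMs = warnThresholdMs;
    }

    /** Acquire the lock and record the acquisition time. */
    public TimedLock acquire() {
        lock.lock();
        acquiredAtMs = System.currentTimeMillis();
        return this;
    }

    /** Release the lock; warn if it was held above the threshold. */
    @Override
    public void close() {
        long heldMs = System.currentTimeMillis() - acquiredAtMs;
        lock.unlock();
        if (heldMs > warnThresholdMs) {
            warningsIssued++;
            System.err.println("WARN Lock held time above threshold:"
                + " lockHeldTimeMs=" + heldMs + " ms.");
        }
    }

    public long getWarningsIssued() {
        return warningsIssued;
    }

    public static void main(String[] args) throws InterruptedException {
        TimedLock dataSetLock = new TimedLock(50); // warn if held > 50 ms

        // Short critical section: released quickly, no warning.
        try (TimedLock l = dataSetLock.acquire()) {
            // fast work
        }

        // Long critical section: exceeds the threshold and logs a warning,
        // analogous to the datanode WARN above.
        try (TimedLock l = dataSetLock.acquire()) {
            Thread.sleep(120);
        }

        System.out.println("warnings=" + dataSetLock.getWarningsIssued());
    }
}
```

The warning itself is benign reporting: it indicates some operation held the FsDatasetImpl lock longer than expected, which can stall other xceiver threads but does not by itself kill a job.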
> Improve the speed of Datanode Block Scan
> ----------------------------------------
>
>                 Key: HDFS-15406
>                 URL: https://issues.apache.org/jira/browse/HDFS-15406
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Hemanth Boyina
>            Assignee: Hemanth Boyina
>            Priority: Major
>             Fix For: 3.2.2, 3.3.1, 3.4.0
>
>         Attachments: HDFS-15406.001.patch, HDFS-15406.002.patch
>
>
> In our customer cluster we have approx. 10M blocks in one datanode.
> For the Datanode to scan all the blocks, it has taken nearly 5 mins.
> {code:java}
> 2020-06-10 12:17:06,869 | INFO | java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty queue] | BlockPool BP-1104115233-**.**.**.**-1571300215588 Total blocks: 11149530, missing metadata files:472, missing block files:472, missing blocks in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
> 2020-06-10 12:17:06,869 | WARN | java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty queue] | Lock held time above threshold: lock identifier: org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings.
> The stack trace is:
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
> org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
> org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> | InstrumentedLock.java:143 {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
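The quoted description shows the DirectoryScanner holding the FsDatasetImpl lock for roughly 330 seconds while reconciling about 11M blocks, stalling everything else that needs that lock. A common remedy for this class of problem is to stop holding the lock across the entire scan and instead process blocks in bounded batches, releasing the lock between batches. The sketch below illustrates that general batching idea only; it is a hypothetical example, not the actual HDFS-15406 patch, and the names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Illustrative sketch (not the HDFS-15406 patch): scan a large block
 * list in fixed-size batches, acquiring the dataset lock per batch so
 * that no single acquisition is held long enough to trip the
 * "Lock held time above threshold" warning.
 */
public class BatchedScan {
    private static final int BATCH_SIZE = 1000;

    /** Scans all blocks; returns how many lock acquisitions were used. */
    public static int scanInBatches(List<String> blocks, ReentrantLock datasetLock) {
        int acquisitions = 0;
        for (int start = 0; start < blocks.size(); start += BATCH_SIZE) {
            datasetLock.lock(); // hold the lock for one batch only
            acquisitions++;
            try {
                int end = Math.min(start + BATCH_SIZE, blocks.size());
                for (int i = start; i < end; i++) {
                    // reconcile blocks.get(i) against on-disk state here
                }
            } finally {
                // release between batches so other threads (e.g. readers
                // and writers) can make progress during a long scan
                datasetLock.unlock();
            }
        }
        return acquisitions;
    }

    public static void main(String[] args) {
        List<String> blocks = new ArrayList<>();
        for (int i = 0; i < 4500; i++) {
            blocks.add("blk_" + i);
        }
        // 4500 blocks in batches of 1000 -> 5 acquisitions
        System.out.println("acquisitions=" + scanInBatches(blocks, new ReentrantLock()));
    }
}
```

The trade-off is that the dataset can change between batches, so a real scanner must tolerate blocks appearing or disappearing mid-scan; the benefit is that per-acquisition hold time is bounded by the batch size rather than by the total block count.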