[ https://issues.apache.org/jira/browse/HDFS-15406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17406064#comment-17406064 ]

shaik Mahaboob commented on HDFS-15406:
---------------------------------------

Hello,

Currently my local datanode, which is installed through Docker, is generating
the WARN below while a Spark job is writing a file.

After the WARN message below appears in my Hadoop logs, the Spark job is
getting killed. Any pointers to resolve this would be highly appreciated.

====

namenode        | 21/08/27 20:02:48 INFO namenode.FSEditLog: Number of
transactions: 74 Total time for transactions(ms): 37 Number of transactions
batched in Syncs: 41 Number of syncs: 30 SyncTimes(ms): 686

datanode        | 21/08/27 20:09:14 WARN impl.FsDatasetImpl: Lock held time
above threshold: lock identifier:
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl
lockHeldTimeMs=374 ms. Suppressed 0 lock warnings. The stack trace is:
java.lang.Thread.getStackTrace(Thread.java:1556)

datanode        |
org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)

datanode        |
org.apache.hadoop.hdfs.InstrumentedLock.logWarning(InstrumentedLock.java:145)

datanode        |
org.apache.hadoop.hdfs.InstrumentedLock.check(InstrumentedLock.java:181)

datanode        |
org.apache.hadoop.hdfs.InstrumentedLock.unlock(InstrumentedLock.java:135)

datanode        |
org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)

datanode        |
org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)

datanode        |
org.apache.hadoop.hdfs.server.datanode.BlockSender.<init>(BlockSender.java:261)

datanode        |
org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(DataXceiver.java:580)

datanode        |
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(Receiver.java:145)

datanode        |
org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:100)

datanode        |
org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:288)

datanode        | java.lang.Thread.run(Thread.java:745)

===========
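
As far as I can tell, this WARN is emitted by the instrumented FsDatasetImpl
dataset lock: on unlock it checks how long the lock was held and logs a warning
when the hold time exceeds a configured threshold (374 ms in the trace above).
A minimal, simplified sketch of that idea, not the actual
org.apache.hadoop.hdfs.InstrumentedLock implementation:

{code:java}
// Simplified sketch of a lock that warns when it was held too long.
// All names here are made up for illustration; only the idea matches
// the "Lock held time above threshold" WARN quoted above.
import java.util.concurrent.locks.ReentrantLock;

public class HoldTimeWarningLock {
    private final ReentrantLock lock = new ReentrantLock();
    private final long warnThresholdMs;   // hold time above this triggers a WARN
    private long acquiredAtNanos;

    public HoldTimeWarningLock(long warnThresholdMs) {
        this.warnThresholdMs = warnThresholdMs;
    }

    public void lock() {
        lock.lock();
        acquiredAtNanos = System.nanoTime();   // remember when we got the lock
    }

    public void unlock() {
        long heldMs = (System.nanoTime() - acquiredAtNanos) / 1_000_000;
        lock.unlock();
        if (heldMs > warnThresholdMs) {
            // The real class also keeps a stack trace and suppression counters.
            System.err.println("WARN Lock held time above threshold: "
                + "lockHeldTimeMs=" + heldMs + " ms");
        }
    }
}
{code}

So the WARN itself is diagnostic: it reports that some code path (here the
BlockSender constructor, per the trace) held the dataset lock longer than the
threshold; it is not itself the failure.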



Regards

Mahaboob..


> Improve the speed of Datanode Block Scan
> ----------------------------------------
>
>                 Key: HDFS-15406
>                 URL: https://issues.apache.org/jira/browse/HDFS-15406
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Hemanth Boyina
>            Assignee: Hemanth Boyina
>            Priority: Major
>             Fix For: 3.2.2, 3.3.1, 3.4.0
>
>         Attachments: HDFS-15406.001.patch, HDFS-15406.002.patch
>
>
> In our customer cluster we have approx 10M blocks in one datanode.
> For the Datanode to scan all the blocks, it has taken nearly 5 mins.
> {code:java}
> 2020-06-10 12:17:06,869 | INFO  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | BlockPool BP-1104115233-**.**.**.**-1571300215588 Total blocks: 
> 11149530, missing metadata files:472, missing block files:472, missing blocks 
> in memory:0, mismatched blocks:0 | DirectoryScanner.java:473
> 2020-06-10 12:17:06,869 | WARN  | 
> java.util.concurrent.ThreadPoolExecutor$Worker@3b4bea70[State = -1, empty 
> queue] | Lock held time above threshold: lock identifier: 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl 
> lockHeldTimeMs=329854 ms. Suppressed 0 lock warnings. The stack trace is: 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
> org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
> org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:475)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
>  | InstrumentedLock.java:143 {code}
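
The quoted log shows the DirectoryScanner holding the FsDatasetImpl lock for
roughly 330 seconds while reconciling about 11M blocks. One general way to keep
a long scan from holding a dataset-wide lock for its whole duration is to
reconcile blocks in small batches and release the lock between batches; the
sketch below illustrates only that general idea (all class and method names are
hypothetical, and this is not the HDFS-15406 patch itself):

{code:java}
// Hypothetical sketch: scan blocks in batches so the dataset lock is held
// only briefly per batch instead of for the entire multi-minute scan.
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class BatchedScanSketch {
    private final ReentrantLock datasetLock = new ReentrantLock();

    void scan(List<String> blockIds, int batchSize) {
        for (int i = 0; i < blockIds.size(); i += batchSize) {
            List<String> batch =
                blockIds.subList(i, Math.min(i + batchSize, blockIds.size()));
            datasetLock.lock();           // hold the lock only for this batch
            try {
                for (String blockId : batch) {
                    reconcile(blockId);   // compare on-disk state vs. in-memory state
                }
            } finally {
                datasetLock.unlock();     // let readers/writers in before the next batch
            }
        }
    }

    private void reconcile(String blockId) {
        // placeholder for the per-block disk vs. memory comparison
    }
}
{code}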


