[ https://issues.apache.org/jira/browse/HADOOP-4780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12653780#action_12653780 ]
he yongqiang commented on HADOOP-4780:
--------------------------------------
It seems that FileUtil.getDU(new File(baseDir.toString())) is quite
time-consuming. Maybe we could simply remove the call to FileUtil.getDU()
from DistributedCache.getLocalCache. I don't think anything depends on this
statement, so removing it should be harmless.
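
If the size is still needed (for example for a cache-size limit check, which is an assumption here; the thread does not say what the result is used for), one alternative to dropping the call outright is to keep a running total per base directory and update it when entries are added or removed, so nothing has to walk the tree on each task start. The sketch below is only an illustration: CacheSizeTracker and its methods are hypothetical names, not part of Hadoop.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not a Hadoop class: keeps a running byte count per
// cache base directory so the directory tree is not re-walked on every call.
class CacheSizeTracker {
    private final Map<String, Long> bytesByBaseDir = new HashMap<>();

    // Record that `bytes` were localized under `baseDir`.
    synchronized void added(String baseDir, long bytes) {
        bytesByBaseDir.merge(baseDir, bytes, Long::sum);
    }

    // Record that `bytes` were deleted from under `baseDir`.
    synchronized void removed(String baseDir, long bytes) {
        bytesByBaseDir.merge(baseDir, -bytes, Long::sum);
    }

    // Current tracked size; 0 for a base directory that was never seen.
    synchronized long size(String baseDir) {
        return bytesByBaseDir.getOrDefault(baseDir, 0L);
    }

    // True when the tracked size exceeds the given limit in bytes.
    synchronized boolean overLimit(String baseDir, long limitBytes) {
        return size(baseDir) > limitBytes;
    }
}
```

The trade-off is that the tracked total can drift from the real on-disk size if files change outside the code that updates the tracker, so an occasional full recount may still be wanted.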
> Task Tracker burns a lot of cpu in calling getLocalCache
> ---------------------------------------------------------
>
> Key: HADOOP-4780
> URL: https://issues.apache.org/jira/browse/HADOOP-4780
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.18.2
> Reporter: Runping Qi
>
> I noticed that a task tracker often maxes out up to 6 CPUs.
> During that time, iostat shows that the majority of the usage is system CPU.
> That situation can last for quite a long time.
> During that time, I saw a number of threads in the following state:
> java.lang.Thread.State: RUNNABLE
> at java.io.UnixFileSystem.getBooleanAttributes0(Native Method)
> at java.io.UnixFileSystem.getBooleanAttributes(UnixFileSystem.java:228)
> at java.io.File.exists(File.java:733)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:399)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.fs.FileUtil.getDU(FileUtil.java:407)
> at org.apache.hadoop.filecache.DistributedCache.getLocalCache(DistributedCache.java:176)
> at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:140)
> I suspect that getLocalCache is too expensive,
> and calling it on every task initialization seems wasteful.
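
The repeated getDU frames in the trace above suggest a recursive directory walk. The following is a minimal sketch of such a walk, written for illustration and not copied from the Hadoop source; DuWalkSketch, diskUsage and the visited counter are hypothetical names. It shows why the cost grows with the number of entries under the base directory: every file and directory is checked individually, and each check goes to the file system.

```java
import java.io.File;

// Illustrative sketch only, not the Hadoop implementation.
class DuWalkSketch {
    // Number of entries touched since the counter was last reset.
    static long visited = 0;

    // Recursively sums the lengths of regular files under `f`.
    // Directory entries themselves contribute 0 bytes in this sketch.
    static long diskUsage(File f) {
        visited++;
        if (!f.exists()) {
            return 0;               // file-system check for every entry
        }
        if (!f.isDirectory()) {
            return f.length();      // further file-system calls per file
        }
        long size = 0;
        File[] children = f.listFiles();
        if (children != null) {     // null if the directory cannot be read
            for (File child : children) {
                size += diskUsage(child);
            }
        }
        return size;
    }
}
```

With a large cache directory, running a walk like this once per task initialization repeats the same per-entry checks each time, which would be consistent with the high system CPU reported above.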