[ https://issues.apache.org/jira/browse/MAPREDUCE-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873109#action_12873109 ]

Scott Chen commented on MAPREDUCE-1823:
---------------------------------------

Here's the corresponding jstack:
{code}
        at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
        at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
        at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
        - locked <0x00002aaab7e19810> (a sun.nio.ch.Util$1)
        - locked <0x00002aaab7e197f8> (a java.util.Collections$UnmodifiableSet)
        - locked <0x00002aaab7e19468> (a sun.nio.ch.EPollSelectorImpl)
        at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
        at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:332)
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
        at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
        at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
        - locked <0x00002aaae427a320> (a java.io.BufferedInputStream)
        at java.io.DataInputStream.readShort(DataInputStream.java:295)
        at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1436)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1698)
        - locked <0x00002aaae4264f38> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1815)
        - locked <0x00002aaae4264f38> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
        at java.io.DataInputStream.read(DataInputStream.java:83)
        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
        at org.apache.hadoop.util.LineReader.readLine(LineReader.java:187)
        at org.apache.hadoop.fs.HarFileSystem.fileStatusInIndex(HarFileSystem.java:441)
        at org.apache.hadoop.fs.HarFileSystem.getFileStatus(HarFileSystem.java:616)
        at org.apache.hadoop.raid.RaidNode.getParityFile(RaidNode.java:541)
        at org.apache.hadoop.raid.RaidNode.getParityFile(RaidNode.java:561)
        at org.apache.hadoop.raid.RaidNode.recurse(RaidNode.java:639)
        at org.apache.hadoop.raid.RaidNode.recurse(RaidNode.java:655)
        at org.apache.hadoop.raid.RaidNode.recurse(RaidNode.java:655)
        at org.apache.hadoop.raid.RaidNode.selectFiles(RaidNode.java:594)
        at org.apache.hadoop.raid.RaidNode.access$300(RaidNode.java:63)
        at org.apache.hadoop.raid.RaidNode$TriggerMonitor.doProcess(RaidNode.java:374)
        at org.apache.hadoop.raid.RaidNode$TriggerMonitor.run(RaidNode.java:313)
        at java.lang.Thread.run(Thread.java:619)
{code}
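
In the trace, RaidNode.recurse reaches HarFileSystem.getFileStatus through getParityFile, and each call ends up reading the archive index over a socket. One possible way to cut the number of calls, shown here only as a rough sketch and not necessarily what a patch for this issue would do, is to remember the answer per path for the duration of one scan. StatusCache and slowLookup are hypothetical names; slowLookup stands in for the real getFileStatus call so that the example runs without Hadoop on the classpath. It assumes Java 8 or later and a single calling thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical helper: memoizes a slow per-path lookup for one scan. Not thread-safe. */
class StatusCache<V> {
    // Stand-in for the expensive call (e.g. HarFileSystem.getFileStatus); an assumption.
    private final Function<String, V> slowLookup;
    private final Map<String, V> cache = new HashMap<>();
    private int misses = 0;

    StatusCache(Function<String, V> slowLookup) {
        this.slowLookup = slowLookup;
    }

    /** Returns the cached value; calls the slow lookup only the first time a path is seen. */
    V get(String path) {
        // containsKey is used so that a null result ("no such file") is cached too.
        if (cache.containsKey(path)) {
            return cache.get(path);
        }
        misses++;
        V value = slowLookup.apply(path);
        cache.put(path, value);
        return value;
    }

    /** Number of times the slow lookup was actually invoked. */
    int misses() {
        return misses;
    }
}
```

A cache like this would have to be discarded at the end of each scan (or whenever the archive changes), otherwise stale entries could hide newly created parity files.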

> Reduce the number of calls of HarFileSystem.getFileStatus in RaidNode
> ---------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1823
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1823
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>
> RaidNode makes many calls to HarFileSystem.getFileStatus. This method
> fetches information from a DataNode, so it is slow and becomes the
> bottleneck of the RaidNode. It would be nice to make this more efficient.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
