[ https://issues.apache.org/jira/browse/MAPREDUCE-1823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12873109#action_12873109 ]
Scott Chen commented on MAPREDUCE-1823:
---------------------------------------

Here's the corresponding jstack:

{code}
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:215)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:65)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:69)
- locked <0x00002aaab7e19810> (a sun.nio.ch.Util$1)
- locked <0x00002aaab7e197f8> (a java.util.Collections$UnmodifiableSet)
- locked <0x00002aaab7e19468> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:80)
at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:332)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
- locked <0x00002aaae427a320> (a java.io.BufferedInputStream)
at java.io.DataInputStream.readShort(DataInputStream.java:295)
at org.apache.hadoop.hdfs.DFSClient$BlockReader.newBlockReader(DFSClient.java:1436)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:1698)
- locked <0x00002aaae4264f38> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:1815)
- locked <0x00002aaae4264f38> (a org.apache.hadoop.hdfs.DFSClient$DFSInputStream)
at java.io.DataInputStream.read(DataInputStream.java:83)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:134)
at org.apache.hadoop.util.LineReader.readLine(LineReader.java:187)
at org.apache.hadoop.fs.HarFileSystem.fileStatusInIndex(HarFileSystem.java:441)
at org.apache.hadoop.fs.HarFileSystem.getFileStatus(HarFileSystem.java:616)
at org.apache.hadoop.raid.RaidNode.getParityFile(RaidNode.java:541)
at org.apache.hadoop.raid.RaidNode.getParityFile(RaidNode.java:561)
at org.apache.hadoop.raid.RaidNode.recurse(RaidNode.java:639)
at org.apache.hadoop.raid.RaidNode.recurse(RaidNode.java:655)
at org.apache.hadoop.raid.RaidNode.recurse(RaidNode.java:655)
at org.apache.hadoop.raid.RaidNode.selectFiles(RaidNode.java:594)
at org.apache.hadoop.raid.RaidNode.access$300(RaidNode.java:63)
at org.apache.hadoop.raid.RaidNode$TriggerMonitor.doProcess(RaidNode.java:374)
at org.apache.hadoop.raid.RaidNode$TriggerMonitor.run(RaidNode.java:313)
at java.lang.Thread.run(Thread.java:619)
{code}

> Reduce the number of calls of HarFileSystem.getFileStatus in RaidNode
> ---------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1823
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1823
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>
> RaidNode makes many calls to HarFileSystem.getFileStatus. This method
> fetches information from the DataNodes, so it is slow, and it has become
> the bottleneck of the RaidNode. It would be nice if we could make this
> more efficient.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
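One way to cut the number of HarFileSystem.getFileStatus calls, sketched here only as an illustration (the class ParityFileCache and its use of plain String paths are hypothetical, not from any patch on this issue), is to memoize parity-file lookups in a small LRU cache so that repeated recursions over the same paths do not re-read the HAR index from the DataNodes:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: an LRU memo of parity-file lookups keyed by source
// path, so RaidNode.getParityFile need not call HarFileSystem.getFileStatus
// again for a path it has already resolved. Paths are plain Strings here
// for simplicity; a real version would cache FileStatus objects and expire
// entries when the underlying HAR archives are rewritten.
public class ParityFileCache {
  private static final int MAX_ENTRIES = 1024;

  // access-ordered LinkedHashMap evicts the least recently used entry
  // once the cache grows past MAX_ENTRIES
  private final Map<String, String> cache =
      new LinkedHashMap<String, String>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
          return size() > MAX_ENTRIES;
        }
      };

  /** Returns the cached parity path for srcPath, or null on a miss. */
  public synchronized String get(String srcPath) {
    return cache.get(srcPath);
  }

  /** Records the parity path resolved for srcPath. */
  public synchronized void put(String srcPath, String parityPath) {
    cache.put(srcPath, parityPath);
  }
}
```

The caller would consult the cache first and fall through to the existing getFileStatus-based lookup only on a miss; the synchronized accessors keep it safe for the TriggerMonitor thread alongside any other RaidNode threads.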