Dear Hadoop Users and Developers,

I was wondering if there is a plan to add a "file info cache" to DFSClient.
It could eliminate the network round trip to the NameNode when a file is opened, and I think it would greatly improve DFSClient's performance.

The code I was looking at is this:

-----------------------
DFSClient.java

/**
 * Grab the open-file info from namenode
 */
synchronized void openInfo() throws IOException {
  /* Maybe, we could add a file info cache here! */
  LocatedBlocks newInfo = callGetBlockLocations(src, 0, prefetchSize);
  if (newInfo == null) {
    throw new IOException("Cannot open filename " + src);
  }

  if (locatedBlocks != null) {
    Iterator<LocatedBlock> oldIter = locatedBlocks.getLocatedBlocks().iterator();
    Iterator<LocatedBlock> newIter = newInfo.getLocatedBlocks().iterator();
    while (oldIter.hasNext() && newIter.hasNext()) {
      if (! oldIter.next().getBlock().equals(newIter.next().getBlock())) {
        throw new IOException("Blocklist for " + src + " has changed!");
      }
    }
  }
  this.locatedBlocks = newInfo;
  this.currentNode = null;
}
-----------------------

Does anybody have an opinion on this matter?

Thank you in advance,
Taeho
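
P.S. To make the idea a bit more concrete, here is a rough sketch of the kind of cache I have in mind. It is not Hadoop code: the class name FileInfoCache, the TTL parameter, and the loader callback are all hypothetical, and the loader simply stands in for the callGetBlockLocations() call in openInfo(). The sketch assumes a simple time-based expiry is acceptable, which may not hold if block locations change often, since a cached entry can be stale until it expires or is invalidated.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical TTL cache keyed by file path; illustrative only, not part of Hadoop. */
final class FileInfoCache<V> {

    /** A cached value plus the time it was loaded. */
    private static final class Entry<V> {
        final V value;
        final long loadedAtMillis;

        Entry(V value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so expiry can be tested

    FileInfoCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /**
     * Returns the cached value for src if it is younger than the TTL.
     * Otherwise calls loader (assumed to wrap the NameNode RPC) and caches
     * the result. A null result is returned but not cached.
     */
    V get(String src, Function<String, V> loader) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(src);
        if (cached != null && now - cached.loadedAtMillis < ttlMillis) {
            return cached.value;
        }
        V fresh = loader.apply(src);
        if (fresh != null) {
            entries.put(src, new Entry<>(fresh, now));
        }
        return fresh;
    }

    /** Drops the entry for src, e.g. after a read fails on a cached location. */
    void invalidate(String src) {
        entries.remove(src);
    }
}
```

In openInfo(), the lookup would presumably become something like cache.get(src, s -> ...) with the existing RPC inside the loader, and invalidate(src) would be called whenever a read against a cached block location fails. Whether the saved round trip is worth the risk of stale block lists is exactly what I would like opinions on.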