Improve interface to FileSystem.getFileCacheHints
-------------------------------------------------
Key: HADOOP-1296
URL: https://issues.apache.org/jira/browse/HADOOP-1296
Project: Hadoop
Issue Type: Improvement
Components: fs
Reporter: Owen O'Malley
Assigned To: dhruba borthakur
The FileSystem interface offers only a very limited way to find where a
file's data is located. The current method looks like:
String[][] getFileCacheHints(Path file, long start, long len) throws IOException
which returns a list of "block info" entries, each of which is a list of
host names. Because the hints don't say where the block boundaries are,
map/reduce has to call the name node once for each split. I'd propose that
we fix the naming a bit and make it:
public class BlockInfo implements Writable {
  public long getStart();      // offset in the file where this block begins
  public String[] getHosts();  // names of the hosts that hold this block
}
BlockInfo[] getFileHints(Path file, long start, long len) throws IOException;
That way, map/reduce can query the entire file and get the locations in a
single call.
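
To make the intended usage concrete, here is a rough sketch of how a caller
might map each split to its hosts once it holds the whole-file result. It is
only an illustration: the BlockInfo class below is a hypothetical stand-in
(the Writable part is left out so it compiles without Hadoop), hostsForSplit
and the host names are invented for the example, and it assumes the returned
array is sorted by ascending start offset, which the proposal does not state.

```java
public class FileHintsSketch {
    // Hypothetical stand-in for the proposed BlockInfo (Writable omitted).
    static final class BlockInfo {
        private final long start;
        private final String[] hosts;

        BlockInfo(long start, String[] hosts) {
            this.start = start;
            this.hosts = hosts;
        }

        long getStart() { return start; }
        String[] getHosts() { return hosts; }
    }

    // Returns the hosts of the block that contains splitStart.
    // Assumes blocks is sorted by ascending start offset. Returns an empty
    // array if splitStart lies before the first block.
    static String[] hostsForSplit(BlockInfo[] blocks, long splitStart) {
        int lo = 0;
        int hi = blocks.length - 1;
        int found = -1;
        // Binary search for the last block whose start is <= splitStart.
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            if (blocks[mid].getStart() <= splitStart) {
                found = mid;
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return found < 0 ? new String[0] : blocks[found].getHosts();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // In the proposal this array would come from a single
        // getFileHints(file, 0, fileLength) call; here it is made up.
        BlockInfo[] blocks = {
            new BlockInfo(0, new String[] {"host1", "host2"}),
            new BlockInfo(64 * mb, new String[] {"host2", "host3"}),
            new BlockInfo(128 * mb, new String[] {"host1", "host3"}),
        };
        long[] splitStarts = {0, 64 * mb, 100 * mb, 128 * mb};
        for (long s : splitStarts) {
            System.out.println(s + " -> " + String.join(",", hostsForSplit(blocks, s)));
        }
    }
}
```

With the block start offsets available on the client, every split is resolved
locally after the one call, instead of going back to the name node per split.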