[ http://issues.apache.org/jira/browse/HADOOP-451?page=comments#action_12427953 ]

Doug Cutting commented on HADOOP-451:
-------------------------------------
> You will probably need to add a "setSplitClass" and "getSplitClass" to
> JobConf

The InputFormat is a Split factory, no? Who else would need to create Splits? Splits are passed in RPCs, and the RPC mechanism supports polymorphism. Splits are not written to files, so I see no reason to declare a fixed class per job. Am I missing something?

> Add a Split interface
> ---------------------
>
>                 Key: HADOOP-451
>                 URL: http://issues.apache.org/jira/browse/HADOOP-451
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Doug Cutting
>             Fix For: 0.6.0
>
>
> The InputFormat interface has a method:
>   FileSplit[] getSplits();
> This should change to:
>   Split[] getSplits();
> The Split interface would look like:
>   public interface Split extends Writable {
>     /** Returns a list of hosts that contain this split.
>         This is only used to optimize task placement, so this may be empty. */
>     String[] getLocations(FileSystem fs);
>     /** The relative, estimated cost of operating on this. Typically the size
>         of the data in the split.
>         Used to prioritize tasks in a job (high-cost tasks are run first). */
>     long getCost();
>   }
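
To illustrate the point about polymorphism, here is a rough, self-contained sketch of how a Split could travel over the wire without a per-job "split class" setting: the sender writes the concrete class name ahead of the split's fields, and the receiver instantiates that class reflectively. This is not Hadoop's actual RPC code. The nested Writable interface is a stand-in so the sketch compiles without Hadoop, FileRangeSplit and the encode/decode helpers are hypothetical names, and getLocations() omits the FileSystem argument from the proposal for brevity.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class SplitSketch {
    /** Stand-in for Hadoop's Writable so this sketch has no Hadoop dependency. */
    public interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Simplified Split: the FileSystem argument from the proposal is omitted. */
    public interface Split extends Writable {
        String[] getLocations();
        long getCost();
    }

    /** Hypothetical concrete split covering a byte range of one file. */
    public static class FileRangeSplit implements Split {
        private String path = "";
        private long start;
        private long length;

        /** No-arg constructor, required for reflective creation in decode(). */
        public FileRangeSplit() {}

        public FileRangeSplit(String path, long start, long length) {
            this.path = path;
            this.start = start;
            this.length = length;
        }

        /** Empty is allowed: locations are only a task-placement hint. */
        public String[] getLocations() { return new String[0]; }

        /** Cost is estimated as the number of bytes in the split. */
        public long getCost() { return length; }

        public void write(DataOutput out) throws IOException {
            out.writeUTF(path);
            out.writeLong(start);
            out.writeLong(length);
        }

        public void readFields(DataInput in) throws IOException {
            path = in.readUTF();
            start = in.readLong();
            length = in.readLong();
        }
    }

    /** Sender side: write the concrete class name, then the split's own fields. */
    public static byte[] encode(Split split) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeUTF(split.getClass().getName());
        split.write(out);
        out.flush();
        return bytes.toByteArray();
    }

    /** Receiver side: read the class name, instantiate it, then fill in its fields. */
    public static Split decode(byte[] data)
            throws IOException, ReflectiveOperationException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        Class<? extends Split> cls =
                Class.forName(in.readUTF()).asSubclass(Split.class);
        Split split = cls.getDeclaredConstructor().newInstance();
        split.readFields(in);
        return split;
    }

    public static void main(String[] args) throws Exception {
        Split sent = new FileRangeSplit("/data/part-0", 0L, 1024L);
        Split received = decode(encode(sent));
        System.out.println(received.getClass().getSimpleName()
                + " cost=" + received.getCost());
    }
}
```

In this sketch only the InputFormat (the Split factory) ever names a concrete Split class; the receiving side learns the class from the message itself, which is why a fixed class per job looks unnecessary as long as splits are only passed in RPCs and never written to files.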