[ https://issues.apache.org/jira/browse/MAPREDUCE-326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12829679#action_12829679 ]
Doug Cutting commented on MAPREDUCE-326:
----------------------------------------

On MAPREDUCE-1126 I proposed the following as a pseudo low-level API:

{code}
final class Split {
  String[] locations;
  ByteBuffer data;
}

interface MapEmitter {        // implemented by kernel
  void emit(ByteBuffer datum, int partition);
}

interface CombineEmitter {    // implemented by kernel
  void emit(ByteBuffer datum);
}

interface Job {               // implemented by user code
  Split[] getSplits();
  void map(ByteBuffer splitData, MapEmitter out);
  int compare(ByteBuffer x, ByteBuffer y);
  boolean hasCombiner();
  void combine(Iterator<Iterator<ByteBuffer>> values, CombineEmitter out);
  void reduce(Iterator<Iterator<ByteBuffer>> values, String attemptName);
  void abandon(String[] attemptNames);
  void commit(String[] attemptNames);
}
{code}

This would need hooks for progress reporting, etc., so each Job method might also need a Context object through which implementations could report progress and otherwise interact with the runtime.

The more fundamental problem I see is that this is still very Java-centric. We'd like to make it easy for non-Java applications to submit jobs. So, rather than having each of these aspects of a Job be a Java method, each might instead be a command line. A Job would then be a set of command lines plus an archive file, unpacked wherever those commands run, that contains their implementations. This is like Pipes and Streaming, except that, instead of building these atop a Java API, the Java API would be built atop this.

The map command line, for example, would produce on standard output a sequence of output data entries, possibly intermixed with progress reports. It might generate records like:

{code}
byte[] datum;
int partition;
float progress;
{code}

We could use Avro as the format for this output. The reduce command might consume, from standard input, a series of records of the form:

{code}
boolean isFirst;
byte[] datum;
{code}

Currently the output of Java-based map sub-processes and the input of reduce sub-processes do not need to cross a process boundary before they enter the shuffle; with this proposal, they would. But with the Linux splice system call one can implement zero-copy data transfer between processes, so, in theory, this architecture need not impact Java performance.

Also, not only would this better support non-Java clients, it would also better support non-Java servers: we might eventually rewrite the TaskTracker and/or the JobTracker in a different language. So long as Java classes are fundamental to job submission, that is not possible.

Some rough sketches of these ideas follow.
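First, the Context object. To make the progress hook concrete, here is a rough sketch of what such a Context might look like; the interface name and methods are hypothetical, just one possible shape:

{code}
interface Context {                  // implemented by kernel
  void progress(float fractionComplete);            // report forward progress
  void status(String message);                      // human-readable status string
  void counter(String group, String name, long by); // bump a named counter
}

// Each Job method above would then grow a trailing parameter, e.g.:
//   void map(ByteBuffer splitData, MapEmitter out, Context context);
{code}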
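For the command-line formulation, submitting a job might amount to shipping an archive plus a handful of command strings. Purely as illustration (the key names and paths here are made up, not a proposed format), a client could describe a job like this:

{code}
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, language-neutral job description: an archive unpacked in
// each task's working directory, plus one command line per Job aspect above.
public class CommandJobSketch {
  public static Map<String, String> describe() {
    Map<String, String> job = new LinkedHashMap<String, String>();
    job.put("archive", "wordcount.tgz");           // unpacked wherever commands run
    job.put("splits.command", "./bin/get-splits"); // emits Split entries
    job.put("map.command", "./bin/map");
    job.put("compare.command", "./bin/compare");
    job.put("combine.command", "./bin/combine");   // optional
    job.put("reduce.command", "./bin/reduce");
    return job;
  }
}
{code}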
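If Avro were the wire format, a map command that happened to be written in Java might encode each output entry on standard output roughly as follows. This is only a sketch: it assumes Avro's generic API (Schema.Parser, GenericDatumWriter, EncoderFactory) and the record layout above, and a real implementation would reuse the writer and encoder across records rather than flushing per entry:

{code}
import java.io.OutputStream;
import java.nio.ByteBuffer;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.EncoderFactory;

// Sketch: encode one map-output entry, per the record layout above.
public class MapOutputSketch {
  static final Schema SCHEMA = new Schema.Parser().parse(
      "{\"type\":\"record\",\"name\":\"MapOutput\",\"fields\":["
      + "{\"name\":\"datum\",\"type\":\"bytes\"},"
      + "{\"name\":\"partition\",\"type\":\"int\"},"
      + "{\"name\":\"progress\",\"type\":\"float\"}]}");

  public static void emit(OutputStream stdout, byte[] datum,
                          int partition, float progress) throws Exception {
    GenericRecord r = new GenericData.Record(SCHEMA);
    r.put("datum", ByteBuffer.wrap(datum));
    r.put("partition", partition);
    r.put("progress", progress);
    GenericDatumWriter<GenericRecord> writer =
        new GenericDatumWriter<GenericRecord>(SCHEMA);
    BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(stdout, null);
    writer.write(r, encoder);
    encoder.flush();
  }
}
{code}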
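On the reduce side, the isFirst flag is what lets a flat record stream be re-grouped into the Iterator<Iterator<ByteBuffer>> that reduce consumes: a true flag marks the first value of a new key group. A minimal sketch of that re-grouping follows; the Record class is a hypothetical decoded form of one input record, and groups are materialized eagerly here for brevity where a real implementation would stream them lazily:

{code}
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class ReduceInputSketch {
  // Hypothetical decoded form of one record read from standard input.
  static class Record {
    final boolean isFirst;
    final ByteBuffer datum;
    Record(boolean isFirst, ByteBuffer datum) {
      this.isFirst = isFirst;
      this.datum = datum;
    }
  }

  // Each record with isFirst == true starts a new key group.
  static List<List<ByteBuffer>> group(Iterator<Record> in) {
    List<List<ByteBuffer>> groups = new ArrayList<List<ByteBuffer>>();
    List<ByteBuffer> current = null;
    while (in.hasNext()) {
      Record r = in.next();
      if (r.isFirst || current == null) {
        current = new ArrayList<ByteBuffer>();
        groups.add(current);
      }
      current.add(r.datum);
    }
    return groups;
  }
}
{code}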
> The lowest level map-reduce APIs should be byte oriented
> --------------------------------------------------------
>
>                 Key: MAPREDUCE-326
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-326
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: eric baldeschwieler
>
> As discussed here:
> https://issues.apache.org/jira/browse/HADOOP-1986#action_12551237
>
> The templates, serializers, and other complexities that allow map-reduce to use arbitrary types complicate the design and lead to many object creations and other overhead that a byte-oriented design would not suffer. I believe the lowest-level implementation of Hadoop map-reduce should have byte-string-oriented APIs (for keys and values). Such an API would be more performant, simpler, and more easily cross-language.
>
> The existing API could be maintained as a thin layer on top of the leaner API.