[ http://issues.apache.org/jira/browse/HADOOP-645?page=comments#action_12445300 ]

Yoram Arnon commented on HADOOP-645:
------------------------------------

It's streaming-specific.

That said, with each map task normally working on one entire dfs block, but sometimes on an entire file (as in the case of gzipped data), generating one file per map will result in fairly large output files. If the input was a bunch of small files to begin with, the output is no worse than the input.

With iterative jobs in particular, where the job output is the input to the next job and is really temporary, it is very reasonable to skip shuffling and sorting the data if possible.

> Map-reduce task does not finish correctly when -reducer NONE is specified
> -------------------------------------------------------------------------
>
>                 Key: HADOOP-645
>                 URL: http://issues.apache.org/jira/browse/HADOOP-645
>             Project: Hadoop
>          Issue Type: Bug
>          Components: contrib/streaming
>   Affects Versions: 0.7.2
>            Reporter: dhruba borthakur
>         Assigned To: dhruba borthakur
>
> Map-reduce task does not finish correctly when -reducer NONE is specified.
> The NONE option means that the reducer should not be generating any output.
> Using this option causes an exception in the task tracker:
> java.lang.IllegalArgumentException: URI is not hierarchical
> TaskRunner:   at java.io.File.<init>(File.java:335)
> TaskRunner:   at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:583)
> TaskRunner:   at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:96)
> TaskRunner:   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:49)
> TaskRunner:   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:213)
> TaskRunner:   at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1240)
> TaskRunner: sideEffectURI_ file:output length 11
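
The exception in the quoted trace can be reproduced outside Hadoop. The sketch below is not the streaming code: the class `OpaqueFileUriDemo` and the helper `tryConvert` are hypothetical names made up for illustration, and how `PipeMapRed` actually builds its URI is not shown in this report. It only illustrates that a URI written as `file:output` (the value logged for `sideEffectURI_`) has no leading slash after the scheme, so `java.net.URI` treats it as opaque and the `java.io.File(URI)` constructor rejects it.

```java
import java.io.File;
import java.net.URI;

// Hypothetical stand-alone demo; not part of Hadoop.
public class OpaqueFileUriDemo {

    // Returns null if the URI can be turned into a File,
    // otherwise the message of the IllegalArgumentException.
    static String tryConvert(String uriText) {
        URI uri = URI.create(uriText);
        try {
            new File(uri);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // "file:output" has a scheme but no '/' after it, so it is opaque.
        System.out.println(URI.create("file:output").isOpaque());
        // The File(URI) constructor refuses opaque URIs.
        System.out.println(tryConvert("file:output"));
        // A hierarchical file URI with an absolute path is accepted.
        System.out.println(tryConvert("file:/tmp/output"));
    }
}
```

If this is indeed the cause, one plausible direction (an assumption, not the committed fix) is to resolve the output location to an absolute path before converting it to a `File`, or to skip the conversion when no side-effect file is expected.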