You need to add a call to MultipleOutputs.close() in your reducer's cleanup:
@Override
public void cleanup(Context context) throws IOException, InterruptedException {
    mos.close();
    // ... any other per-task cleanup ...
}
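For reference, here is a minimal end-to-end sketch of a reducer using the new-API MultipleOutputs; the class name, the Text/Text types, and the key-derived file name are illustrative assumptions, not details from the original post:

import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class SplitReducer extends Reducer<Text, Text, Text, Text> {

    private MultipleOutputs<Text, Text> mos;

    @Override
    protected void setup(Context context) {
        // Create the MultipleOutputs wrapper once per task attempt.
        mos = new MultipleOutputs<Text, Text>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text value : values) {
            // Base output path derived from the data, producing files
            // such as <key>-r-00000.
            mos.write(key, value, key.toString());
        }
    }

    @Override
    protected void cleanup(Context context)
            throws IOException, InterruptedException {
        // Without this call the underlying record writers are never
        // flushed or closed, and the generated files come out empty.
        mos.close();
    }
}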
On Fri, May 6, 2011 at 1:55 PM, Geoffry Roberts wrote:
> All,
>
> I am attempting to take a large file and split it up into a series of
> smaller files ...
All,
I am attempting to take a large file and split it up into a series of
smaller files. I want the smaller files to be named based on values taken
from the large file. I am using
org.apache.hadoop.mapreduce.lib.output.MultipleOutputs to do this.
The job runs without error and produces a set of ...
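For completeness, a hedged sketch of the driver-side configuration that usually goes with a reducer like the one sketched above (SplitDriver, the argument paths, and the use of LazyOutputFormat are illustrative assumptions, not details from the original post):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.LazyOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class SplitDriver {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "split large file");
        job.setJarByClass(SplitDriver.class);
        // Mapper omitted: anything that emits Text/Text pairs will do here.
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        job.setReducerClass(SplitReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        // LazyOutputFormat suppresses the usual empty part-r-NNNNN files,
        // so only the files named from the data remain in the output dir.
        LazyOutputFormat.setOutputFormatClass(job, TextOutputFormat.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}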
Yes. They can be used inside a mapper also.
See org.apache.hadoop.mapred.lib.TestMultipleOutputs or
org.apache.hadoop.mapreduce.lib.output.TestMRMultipleOutputs for some sample
code.
Thanks
Amareshwari
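As a quick illustration of map-side usage, here is a minimal sketch; the class, the "sideoutput" name, and the types are made-up examples, and "sideoutput" would have to be registered in the driver with MultipleOutputs.addNamedOutput():

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class SideOutputMapper extends Mapper<LongWritable, Text, Text, Text> {

    private MultipleOutputs<Text, Text> mos;

    @Override
    protected void setup(Context context) {
        mos = new MultipleOutputs<Text, Text>(context);
    }

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        // Records written through the context go to the reducers as usual.
        context.write(new Text("key"), line);
        // Records written through MultipleOutputs are written directly from
        // the map task into the named output, bypassing the shuffle.
        mos.write("sideoutput", new Text("key"), line);
    }

    @Override
    protected void cleanup(Context context)
            throws IOException, InterruptedException {
        mos.close();
    }
}

Note that this produces extra files from the map side; it does not change which reducer a record is sent to (partitioning still controls that).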
On 6/9/10 5:57 AM, "Torsten Curdt" wrote:
Can the MultipleOutputs also be used inside a mapper?
So basically I pipe data into different reducers from the mapper.
Of course I could run two separate jobs, but that would be very inefficient
as I would have to read through all the data twice.
cheers
--
Torsten
On Tue, Jun 8, 2010 at 06:22, Amareshwari wrote:
MultipleOutputs has been ported to the new API through
http://issues.apache.org/jira/browse/MAPREDUCE-370
See the discussion on that JIRA and the javadoc/testcase for examples of how to use it.
Thanks
Amareshwari
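As a rough sketch of how the old MultipleSequenceFileOutputFormat use case maps onto the new MultipleOutputs (the "errors" and "stats" names here are invented for illustration):

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

public class NamedSequenceOutputs {
    // Driver side: each named output carries its own output format and
    // key/value classes, so several sequence-file outputs can coexist
    // in a single job. Named output names must be alphanumeric.
    public static void configure(Job job) {
        MultipleOutputs.addNamedOutput(job, "errors",
                SequenceFileOutputFormat.class, Text.class, Text.class);
        MultipleOutputs.addNamedOutput(job, "stats",
                SequenceFileOutputFormat.class, Text.class, Text.class);
    }
}

In the reducer you would then create a MultipleOutputs instance in setup(), call something like mos.write("errors", key, value) to get sequence files such as errors-r-00000, and close it in cleanup() as shown earlier in the thread.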
On 6/7/10 8:08 PM, "Torsten Curdt" wrote:
I need to emit to different output files from a reducer.
The old API had MultipleSequenceFileOutputFormat.
Am I missing something or is this gone in the new API?
Are there any problems porting this over?
Or does it just need to be done?
cheers
--
Torsten