Hi all,
I am having issues using SequenceFileInputFormat to retrieve whole records.
I have one job that writes to a SequenceFile:
SequenceFileOutputFormat.setOutputPath(job, new Path("out/data"));
SequenceFileOutputFormat.setOutputCompressionType(job,
SequenceFile.CompressionType.NONE);
I
Tim,
Do you also set your I/O formats explicitly to SequenceFileInputFormat
and SequenceFileOutputFormat? Via job.setInputFormat/setOutputFormat I
mean.
Hadoop should not be splitting records across maps/mappers. There are
specific test cases that ensure this does not happen, so it would seem
Harsh, that was exactly the issue!
Thanks very much for your help
Tim
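For anyone who finds this thread with the same symptom, here is a minimal sketch of what setting the formats explicitly can look like with the newer org.apache.hadoop.mapreduce API, where the setters are named setInputFormatClass and setOutputFormatClass. The class name, method name, job name, and path parameters are hypothetical, Job.getInstance assumes a Hadoop release that provides it (older releases construct the job with new Job(conf, name)), and the mapper, reducer, and key/value classes are left out. It needs the Hadoop client libraries on the classpath.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical helper class; not taken from the original poster's code.
public class SequenceFileJobSetup {

    // Builds a job that reads and writes SequenceFiles.
    // inputDir and outputDir are illustrative parameters.
    public static Job configure(Configuration conf, String inputDir, String outputDir)
            throws IOException {
        // Assumption: Job.getInstance is available in the Hadoop version in use.
        Job job = Job.getInstance(conf, "sequence-file-example");

        // Without this call the job falls back to the default TextInputFormat,
        // which reads the file as lines of text instead of key/value records.
        job.setInputFormatClass(SequenceFileInputFormat.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);

        SequenceFileInputFormat.addInputPath(job, new Path(inputDir));
        SequenceFileOutputFormat.setOutputPath(job, new Path(outputDir));
        SequenceFileOutputFormat.setOutputCompressionType(
                job, SequenceFile.CompressionType.NONE);

        // Mapper, reducer, and output key/value classes still need to be set
        // by the caller before the job is submitted.
        return job;
    }
}
```

Setting the output path and compression type through SequenceFileOutputFormat does not by itself select that format for the job, which is why the explicit format calls matter.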
On 19 August 2011 15:15, Harsh J ha...@cloudera.com wrote:
> Tim,
> Do you also set your I/O formats explicitly to SequenceFileInputFormat
> and SequenceFileOutputFormat? Via job.setInputFormat/setOutputFormat I
> mean.
> Hadoop