SequenceFileInputFormat doesn't return whole records

2011-08-19 Thread Tim Fletcher
Hi all, I am having issues using SequenceFileInputFormat to retrieve whole records. I have one job that is used to write to a SequenceFile: SequenceFileOutputFormat.setOutputPath(job, new Path("out/data")); SequenceFileOutputFormat.setOutputCompressionType(job, SequenceFile.CompressionType.NONE); I

Re: SequenceFileInputFormat doesn't return whole records

2011-08-19 Thread Harsh J
Tim, do you also set your I/O formats explicitly to SequenceFileInputFormat and SequenceFileOutputFormat? Via job.setInputFormat/setOutputFormat, I mean. Hadoop should not be splitting records across maps/mappers. There are specific test cases that ensure this does not happen, so it would seem
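
For readers landing on this thread, here is a minimal sketch of what setting the I/O formats explicitly can look like. It is not Tim's actual job. It assumes the newer org.apache.hadoop.mapreduce API, where the setters are named setInputFormatClass/setOutputFormatClass; with the older JobConf API the equivalents are the setInputFormat/setOutputFormat calls mentioned above. The class name, the job name, the Text key/value types and the use of args[0]/args[1] for paths are all hypothetical, and Job.getInstance assumes a Hadoop 2.x or later client (older releases used new Job(conf, name)).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.SequenceFileOutputFormat;

// Hypothetical driver: reads a SequenceFile and writes one back unchanged.
public class SequenceFileRoundTrip {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "sequence-file-round-trip");
        job.setJarByClass(SequenceFileRoundTrip.class);

        // The point of the thread: name both formats explicitly.
        // If these calls are left out, the job uses the default text formats,
        // which treat the input as lines rather than as SequenceFile records.
        job.setInputFormatClass(SequenceFileInputFormat.class);
        job.setOutputFormatClass(SequenceFileOutputFormat.class);

        // Assumed key/value types; they must match what the input file holds.
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // No mapper or reducer is set, so the identity implementations run.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        SequenceFileOutputFormat.setOutputPath(job, new Path(args[1]));
        SequenceFileOutputFormat.setOutputCompressionType(
                job, SequenceFile.CompressionType.NONE);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Calling setOutputPath or setOutputCompressionType on SequenceFileOutputFormat only stores settings for that format; it does not select the format for the job, which is why the two explicit set...FormatClass calls are still needed.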

Re: SequenceFileInputFormat doesn't return whole records

2011-08-19 Thread Tim Fletcher
Harsh, that was exactly the issue! Thanks very much for your help. Tim. On 19 August 2011 15:15, Harsh J ha...@cloudera.com wrote: Tim, Do you also set your I/O formats explicitly to SequenceFileInputFormat and SequenceFileOutputFormat? Via job.setInputFormat/setOutputFormat I mean. Hadoop