How to split a sequence file

2012-09-11 Thread Jason Yang
Hi, I have a sequence file written by SequenceFileOutputFormat with key/value types of Text and BytesWritable, like below:

Text      BytesWritable
------------------------------------
id_A_01   7F2B3C687F2B3C687F2B3C68
id_A_02

Re: How to split a sequence file

2012-09-11 Thread Robert Dyer
If the file is pre-sorted, why not just make multiple sequence files - 1 for each split? Then you don't have to compute InputSplits because the physical files are already split.

On Tue, Sep 11, 2012 at 11:00 PM, Harsh J ha...@cloudera.com wrote:
> Hey Jason, Is the file pre-sorted? You could
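
The "one file per split" idea can be sketched roughly as follows. This is only a planning sketch in plain Java, not Hadoop code: it decides which output file each key from a pre-sorted input would go to. The grouping rule (`groupOf`, everything before the last underscore), the `part-*.seq` file names, and the sample key `id_B_01` are assumptions made up for illustration; the thread does not say how the records should be grouped. In a real job, each planned file would be written with a SequenceFile writer instead of being collected in a map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PreSplitSketch {

    // Hypothetical grouping rule: "id_A_01" -> "id_A"
    // (everything before the last '_'; keys without '_' form their own group).
    static String groupOf(String key) {
        int cut = key.lastIndexOf('_');
        return cut < 0 ? key : key.substring(0, cut);
    }

    // Plans one output file per group. The input keys are assumed to be
    // sorted already, so each group's keys are contiguous and the insertion
    // order of the map follows the key order.
    static Map<String, List<String>> planFiles(List<String> sortedKeys) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (String key : sortedKeys) {
            String fileName = "part-" + groupOf(key) + ".seq"; // hypothetical naming
            plan.computeIfAbsent(fileName, f -> new ArrayList<>()).add(key);
        }
        return plan;
    }

    public static void main(String[] args) {
        // id_A_01 and id_A_02 come from the question; id_B_01 is invented.
        List<String> keys = Arrays.asList("id_A_01", "id_A_02", "id_B_01");
        planFiles(keys).forEach((file, ks) -> System.out.println(file + " <- " + ks));
    }
}
```

If each physical file should then be handled by exactly one map task, one option is an input format subclass whose `isSplitable` method returns false, so that Hadoop does not split the files further.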