Hello Ranjith,

On Thu, Apr 7, 2011 at 10:26 AM, ranjith k <[email protected]> wrote:
> Hello.
>
> I need to create a custom input split. I need to split my input in to 50
> line for one input split. How can i do it.

Maybe you are looking for NLineInputFormat? It creates one input split
for every N lines, where N is configurable.

> And also there is an another problem for me. I have a file. But it is not in
> the form of text. It contain structure. I need to give one structure in to
> my map function as value. And the number of the record is my key. How can i
> achieve this. please help me.

You will need to implement a custom RecordReader for this. Basically,
you'll have to read your file and structure it to your specs using
low-level byte reads off a DFS input stream for your file.

Computing the record number in the same pass may not be possible if the
file/split is too large to be held in memory, but you could create a
SequenceFile out of this, with the record count as the key and a chunk
of records as the value.

--
Harsh J
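
P.S. Here is a rough sketch of the low-level byte reading mentioned above, in case it helps. It uses plain java.io only, so it runs without Hadoop. Everything specific in it is an assumption made for illustration: the fixed 12-byte record layout (an int followed by a double), the class name StructRecordReaderSketch, and the Sample type. Your actual structure will differ. The method names simply mirror the ones on Hadoop's RecordReader.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Sketch only: the record layout and all names here are hypothetical.
class StructRecordReaderSketch {
    // Assumed layout: 4-byte int id + 8-byte double value, big-endian.
    static final int RECORD_SIZE = 12;

    static final class Sample {
        final int id;
        final double value;
        Sample(int id, double value) { this.id = id; this.value = value; }
    }

    private final DataInputStream in;
    private final byte[] buf = new byte[RECORD_SIZE];
    private long key = -1;   // record number; the first record gets key 0
    private Sample value;

    StructRecordReaderSketch(InputStream raw) {
        this.in = new DataInputStream(raw);
    }

    // Reads one record; returns false at a clean end of input.
    // A truncated trailing record makes readFully throw EOFException.
    boolean nextKeyValue() throws IOException {
        int first = in.read();
        if (first < 0) {
            return false;
        }
        buf[0] = (byte) first;
        in.readFully(buf, 1, RECORD_SIZE - 1);
        ByteBuffer bb = ByteBuffer.wrap(buf); // ByteBuffer is big-endian by default
        value = new Sample(bb.getInt(), bb.getDouble());
        key++;
        return true;
    }

    long getCurrentKey() { return key; }

    Sample getCurrentValue() { return value; }

    public static void main(String[] args) throws IOException {
        // Build two sample records in memory instead of reading a real file.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(7);
        out.writeDouble(1.5);
        out.writeInt(9);
        out.writeDouble(-2.25);
        out.flush();

        StructRecordReaderSketch reader = new StructRecordReaderSketch(
                new ByteArrayInputStream(bytes.toByteArray()));
        while (reader.nextKeyValue()) {
            Sample s = reader.getCurrentValue();
            System.out.println(reader.getCurrentKey() + " -> id=" + s.id
                    + ", value=" + s.value);
        }
    }
}
```

In a real job this logic would sit inside a RecordReader returned by your own InputFormat, and the stream would come from FileSystem.open() on the split's file. Note that the counter above is local to one stream. If your records really are fixed-size, one option is to align splits to the record size and derive the record number from the byte offset divided by the record size, which avoids counting records across the whole file.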
