Hello Ranjith,

On Thu, Apr 7, 2011 at 10:26 AM, ranjith k <[email protected]> wrote:
> Hello.
>
> I need to create a custom input split. I need to split my input into
> input splits of 50 lines each. How can I do it?

Maybe you are looking for NLineInputFormat? It creates one input
split for every N lines of input, where N is configurable, so setting
N to 50 should give you 50 lines per split.
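
To make "one split per N lines" concrete, here is a minimal plain-Java
sketch of the grouping idea. It is not Hadoop's code; the class and
method names are made up for illustration, and it works on an
in-memory String with character offsets rather than on a file in DFS.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of Hadoop. */
public class NLineSplitSketch {

    /**
     * Returns one {startOffset, length} pair (in chars) for each group
     * of up to n lines in the given text.
     */
    public static List<int[]> splits(String text, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<int[]> result = new ArrayList<>();
        int start = 0;          // offset where the current split begins
        int linesInSplit = 0;   // lines collected for the current split
        int pos = 0;
        while (pos < text.length()) {
            int nl = text.indexOf('\n', pos);
            // A line ends just after its newline, or at end of text.
            pos = (nl < 0) ? text.length() : nl + 1;
            linesInSplit++;
            if (linesInSplit == n) {
                result.add(new int[] {start, pos - start});
                start = pos;
                linesInSplit = 0;
            }
        }
        if (linesInSplit > 0) {
            // The last split may hold fewer than n lines.
            result.add(new int[] {start, pos - start});
        }
        return result;
    }
}
```

With n = 50, a 120-line input would yield three splits of 50, 50 and
20 lines. In a real job you would not write this yourself; you would
set NLineInputFormat as the job's input format and configure the
lines-per-split value (how that is set depends on your Hadoop version,
so please check its documentation).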

> And there is another problem for me. I have a file, but it is not in
> the form of text. It contains structures. I need to give one structure to
> my map function as the value, with the record number as the key. How can I
> achieve this? Please help me.

You will need to implement a custom RecordReader for this; basically
you'll have to read your file and parse it into your structures using
low-level byte reads off a DFS input stream for your file. Computing
the record numbers in the same pass may not be possible if the
file/split is too large to be held in memory, but you could first
convert the file into a SequenceFile that has the record count as the
key and the corresponding chunk of records as the value.
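
As a rough sketch of the reading loop such a RecordReader would wrap,
here is a plain-Java version with no Hadoop dependency. It assumes
every structure has the same fixed size in bytes, which may not hold
for your file; the class name is hypothetical, and the method names
only mimic the shape of a RecordReader.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical sketch; assumes fixed-size binary records. */
public class FixedRecordReaderSketch {
    private final DataInputStream in;
    private final int recordSize;  // assumed size of one structure, in bytes
    private long key = -1;         // record number, counted from 0
    private byte[] value;          // raw bytes of the current structure

    public FixedRecordReaderSketch(InputStream in, int recordSize) {
        if (recordSize <= 0) {
            throw new IllegalArgumentException("recordSize must be positive");
        }
        this.in = new DataInputStream(in);
        this.recordSize = recordSize;
    }

    /** Advances to the next record; returns false at end of input. */
    public boolean nextKeyValue() throws IOException {
        int first = in.read();     // probe one byte to detect a clean end
        if (first < 0) {
            return false;
        }
        byte[] buf = new byte[recordSize];
        buf[0] = (byte) first;
        // Throws EOFException if the last record is truncated.
        in.readFully(buf, 1, recordSize - 1);
        key++;
        value = buf;
        return true;
    }

    public long getCurrentKey() {
        return key;
    }

    public byte[] getCurrentValue() {
        return value;
    }
}
```

The caller loops on nextKeyValue() and reads the key and value after
each successful call. Inside Hadoop, a reader that starts in the middle
of a file would also need to know how many records precede its split;
with fixed-size records that could be derived as the split's start
offset divided by the record size, provided splits begin on record
boundaries. Otherwise, treating the file as non-splittable is the
simpler route.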

-- 
Harsh J
