Re: Not allow file split

Rahul Sood Wed, 07 May 2008 07:17:24 -0700

You can implement a custom input format and a record reader. Assuming
your record data type is class RecType, the input format should subclass
FileInputFormat<LongWritable, RecType> and the record reader should
implement RecordReader<LongWritable, RecType>.


In this case the key could be the offset into the file, although it is
not very useful since you treat the entire file as one record. 

The isSplitable() method in the input format should return false.
The RecordReader.next( LongWritable pos, RecType val ) method should
read the entire file and set val to the file contents. This will ensure
that the entire file goes to one map task as a single record.
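
As a rough sketch of the next() logic described above: so that it compiles
without Hadoop on the classpath, SimpleRecordReader below is a hypothetical
stand-in for Hadoop's RecordReader interface, reduced to a single method, and
Offset and FileRecord are made-up holders playing the roles of LongWritable
and RecType. The point is only the "return the whole file once, then report
no more records" pattern, not a drop-in implementation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical stand-in for Hadoop's RecordReader, cut down to next()
// so this sketch has no Hadoop dependency.
interface SimpleRecordReader<K, V> {
    boolean next(K key, V value) throws IOException;
}

// Made-up mutable holders standing in for LongWritable and RecType.
final class Offset { long value; }
final class FileRecord { byte[] contents = new byte[0]; }

final class WholeFileReader implements SimpleRecordReader<Offset, FileRecord> {
    private final Path file;
    private boolean consumed = false; // true once the single record was returned

    WholeFileReader(Path file) { this.file = file; }

    @Override
    public boolean next(Offset key, FileRecord value) throws IOException {
        if (consumed) {
            return false;             // the file yields exactly one record
        }
        key.value = 0L;               // offset of the only record: start of file
        value.contents = Files.readAllBytes(file); // whole file becomes the value
        consumed = true;
        return true;
    }
}
```

The first call to next() fills val with the complete file and returns true;
every later call returns false, so the map task sees one record. In a real
job the reader would read through Hadoop's FileSystem rather than
java.nio.file, and the input format's isSplitable() would still need to
return false as described above.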

-Rahul Sood
[EMAIL PROTECTED]

> Hi all, I'm a newbie and I have the following problem.
> 
> I need to implement an InputFormat such that isSplitable always
> returns false, as shown in http://wiki.apache.org/hadoop/FAQ (question
> no. 10).
> And here is the problem.
> 
> I also have to implement the RecordReader interface to return the
> whole content of the input file, but I don't know how. I have found
> only examples that use the LineRecordReader.
> 
> Can someone help me?
> 
> Thanks
>
