As it happens, an implementation of a class that reads the whole file is
given in Tom's Hadoop book on pages 192-194.  It returns the file's contents
as a BytesWritable.
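
For what it's worth, the idea itself (one record per file, with the whole
file as the value) can be sketched without any Hadoop classes.  The snippet
below is plain Java, not the book's code and not a Hadoop InputFormat; the
class name WholeFileRecords and the method readWholeFiles are made up for
illustration.  It assumes a list file with one file name per line and that
each file fits in memory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: not part of Hadoop, only an illustration of
// "one record per file" using the Java standard library.
class WholeFileRecords {

    // Reads the list file (one file name per line) and returns one record
    // per listed file: key = file name, value = the file's complete bytes.
    public static Map<String, byte[]> readWholeFiles(Path listFile) throws IOException {
        Map<String, byte[]> records = new LinkedHashMap<>();
        for (String line : Files.readAllLines(listFile, StandardCharsets.UTF_8)) {
            String name = line.trim();
            if (name.isEmpty()) {
                continue; // skip blank lines in the list
            }
            // The whole file becomes a single value; it is never split into lines.
            records.put(name, Files.readAllBytes(Paths.get(name)));
        }
        return records;
    }
}
```

In a real job the read would go through Hadoop's FileSystem rather than
java.nio, and the bytes would be wrapped in a BytesWritable, but the shape
of the record is the same.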

On Sat, Jan 23, 2010 at 4:30 PM, Alex Kozlov <ale...@cloudera.com> wrote:

> By design, TextInputFormat splits the file into lines and passes each one
> as a record.
>
> If you override isSplitable() to return false, you will still get a bunch
> of line records; the only difference is that each file will be one split.
>
> If you want to get the contents of a single file, the best way is to put the
> files into a SequenceFile, one file per record, with the file name as the key,
> and read the file as bytes.
>
> Alternatively, you can pass a file where each line is a file name to a
> mapper and open the file explicitly within the mapper.
>
> On Sat, Jan 23, 2010 at 8:48 AM, prashant ullegaddi <
> prashullega...@gmail.com> wrote:
>
>> Why don't you extend FileInputFormat and override isSplitable()
>> <http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/mapred/FileInputFormat.html#isSplitable%28org.apache.hadoop.fs.FileSystem,%20org.apache.hadoop.fs.Path%29>
>> so that it returns false?
>>
>>
>> On Sat, Jan 23, 2010 at 10:05 PM, stolikp <stol...@o2.pl> wrote:
>>
>> >
>> > I've got some text files in my input directory and I want to pass each
>> > single text file (the whole file, not just a line) to a map (one file per
>> > map). How can I do this? TextInputFormat splits text into lines and I do
>> > not want this to happen.
>> > I tried:
>> >
>> > http://hadoop.apache.org/common/docs/r0.20./streaming.html#How+do+I+process+files%2C+one+per+map%3F
>> >
>> > but it doesn't work for me; the compiler doesn't know what
>> > NonSplitableTextInputFormat.class is.
>> > I'm using Hadoop 0.20.1.
>> > --
>> > View this message in context:
>> > http://old.nabble.com/Passing-whole-text-file-to-a-single-map-tp27287649p27287649.html
>> > Sent from the Hadoop core-user mailing list archive at Nabble.com.
>> >
>> >
>>
>>
>> --
>> Thanks,
>> Prashant Ullegaddi,
>> Search and Information Extraction Lab,
>> IIIT-Hyderabad, INDIA.
>>
>
>
