As it happens, Tom White's Hadoop book gives an implementation of a class that reads a whole file as a single record, on pages 192-194. It returns the file contents as a BytesWritable.
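To make the "one record per file" idea concrete, here is a minimal plain-Java sketch. It is not the book's code and does not use the Hadoop API: the class WholeFileRecords, the FileRecord type and the readWholeFiles method are hypothetical names, and it reads local files through java.nio rather than HDFS. It only shows the record shape you are after, with the file name as key and the entire contents as a byte array (the role BytesWritable plays in Hadoop), never split into lines.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Plain-Java illustration (not Hadoop API): one record per file. */
class WholeFileRecords {

    /** Hypothetical record type: key is the file name, value is the full contents. */
    static final class FileRecord {
        public final String name;
        public final byte[] contents;

        FileRecord(String name, byte[] contents) {
            this.name = name;
            this.contents = contents;
        }
    }

    /** Reads every listed file in full; a file is never broken into lines. */
    static List<FileRecord> readWholeFiles(List<Path> files) throws IOException {
        List<FileRecord> records = new ArrayList<>();
        for (Path file : files) {
            // Assumption: each file fits in memory, as it must for a whole-file record.
            byte[] contents = Files.readAllBytes(file);
            records.add(new FileRecord(file.getFileName().toString(), contents));
        }
        return records;
    }

    public static void main(String[] args) throws IOException {
        // Create a small two-line file to show that it arrives as a single record.
        Path dir = Files.createTempDirectory("whole-file-demo");
        Path sample = dir.resolve("a.txt");
        Files.write(sample, "line one\nline two\n".getBytes(StandardCharsets.UTF_8));

        for (FileRecord r : readWholeFiles(Collections.singletonList(sample))) {
            System.out.println(r.name + " -> " + r.contents.length + " bytes");
        }
    }
}
```

In a real job the same shape would come from a FileInputFormat subclass whose isSplitable returns false, paired with a record reader that reads the whole file into the value.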
On Sat, Jan 23, 2010 at 4:30 PM, Alex Kozlov <ale...@cloudera.com> wrote:

> By design, TextInputFormat splits the file into lines and passes
> each one as a record.
>
> If you override isSplitable(), it will still return a bunch of records.
> Each file will be a split.
>
> If you want to get the contents of a single file, the best way is to put the
> files into a SequenceFile, one file per key (the key can be the file name),
> and read the file as bytes.
>
> Alternatively, you can pass a file where each line is a file name to a
> mapper and open the file explicitly within the mapper.
>
> On Sat, Jan 23, 2010 at 8:48 AM, prashant ullegaddi
> <prashullega...@gmail.com> wrote:
>
>> Why don't you extend FileInputFormat and implement isSplitable
>> <http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/mapred/FileInputFormat.html#isSplitable%28org.apache.hadoop.fs.FileSystem,%20org.apache.hadoop.fs.Path%29>
>> so that it returns false?
>>
>> On Sat, Jan 23, 2010 at 10:05 PM, stolikp <stol...@o2.pl> wrote:
>>
>> > I've got some text files in my input directory and I want to pass each
>> > single text file (the whole file, not just a line) to a map (one file
>> > per map). How can I do this? TextInputFormat splits text into lines and
>> > I do not want this to happen.
>> > I tried:
>> >
>> > http://hadoop.apache.org/common/docs/r0.20./streaming.html#How+do+I+process+files%2C+one+per+map%3F
>> >
>> > but it doesn't work for me; the compiler doesn't know what
>> > NonSplitableTextInputFormat.class is.
>> > I'm using hadoop 0.20.1
>> > --
>> > View this message in context:
>> > http://old.nabble.com/Passing-whole-text-file-to-a-single-map-tp27287649p27287649.html
>> > Sent from the Hadoop core-user mailing list archive at Nabble.com.
>>
>> --
>> Thanks,
>> Prashant Ullegaddi,
>> Search and Information Extraction Lab,
>> IIIT-Hyderabad, INDIA.