If you use TextInputFormat, you'll get the following key<LongWritable>, value<Text> pairs in your mapper:
    file_position, your_input

Example:

    0, "0\t[356:0.3481597,359:0.3481597,358:0.3481597,361:0.3481597,360:0.3481597]"
    100, "8\t[356:0.34786037,359:0.34786037,358:0.34786037,361:0.34786037,360:0.34786037]"
    200, "25\t[284:0.34821576,286:0.34821576,287:0.34821576,288:0.34821576,289:0.34821576]"

Then just parse it out in your mapper.

-----Original Message-----
From: Pat Ferrel [mailto:pat.fer...@gmail.com]
Sent: Wednesday, December 12, 2012 7:50 AM
To: user@hadoop.apache.org
Subject: Hadoop 101

Stupid question for the day.

I have a file created by a Mahout job of the form:

    0	[356:0.3481597,359:0.3481597,358:0.3481597,361:0.3481597,360:0.3481597]
    8	[356:0.34786037,359:0.34786037,358:0.34786037,361:0.34786037,360:0.34786037]
    25	[284:0.34821576,286:0.34821576,287:0.34821576,288:0.34821576,289:0.34821576]
    28	[452:0.34802154,454:0.34802154,453:0.34802154,456:0.34802154,455:0.34802154]
    ...

If this were a SequenceFile I could read it and be merrily on my way, but it's a text file. The classes written are <LongWritable, VectorWritable> key, value pairs, but the file is tab-delimited text. I was hoping to do something like:

    SequenceFile.Reader reader = new SequenceFile.Reader(fs, inputFile, conf);
    Writable userId = new LongWritable();
    VectorWritable recommendations = new VectorWritable();
    while (reader.next(userId, recommendations)) {
        // do something with each pair
    }

But alas, Google fails me. How do you read key, value pairs from text files outside of a map or reduce?
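A minimal sketch of that parsing step in plain Java (the class and method names below are my own, not a Hadoop or Mahout API; nothing in the parsing itself is Hadoop-specific, so the same logic works inside a mapper or in a standalone program that reads the file line by line):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class VectorLineParser {

    // Parses one text line of the form:
    //   "8\t[356:0.34786037,359:0.34786037,...]"
    // into the numeric key (8) and an index -> weight map.
    public static Map.Entry<Long, Map<Integer, Double>> parse(String line) {
        // Key and value are separated by a single tab.
        String[] parts = line.split("\t", 2);
        long key = Long.parseLong(parts[0].trim());

        // Strip the surrounding [ and ] from the vector text.
        String body = parts[1].trim();
        body = body.substring(1, body.length() - 1);

        // Each comma-separated element is "index:weight".
        Map<Integer, Double> vector = new LinkedHashMap<>();
        for (String pair : body.split(",")) {
            String[] kv = pair.split(":");
            vector.put(Integer.parseInt(kv[0]), Double.parseDouble(kv[1]));
        }
        return Map.entry(key, vector);
    }

    public static void main(String[] args) {
        Map.Entry<Long, Map<Integer, Double>> result =
                parse("8\t[356:0.34786037,359:0.34786037]");
        System.out.println(result.getKey());            // prints 8
        System.out.println(result.getValue().get(356)); // prints 0.34786037
    }
}
```

Inside a mapper you would apply the same splitting to the Text value and ignore the LongWritable byte-offset key; outside MapReduce you can feed it lines from any BufferedReader over the file.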