Hi Tariq,

I assume the mapper actually being run is IdentityMapper rather than your XPTMapper class. Can you share your main class?
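One possible explanation for identity-style output, even when the driver does set XPTMapper, is a plain Java overload. In the code quoted further down, the class is declared with IntWritable as its input key while map() takes a LongWritable, so that method may not override the inherited one at all. The sketch below reproduces the effect without Hadoop. BaseMapper, MismatchedMapper, MatchingMapper and feed are hypothetical stand-ins invented for illustration, not Hadoop classes, and whether this is what happened in the actual job is an assumption until the main class is seen.

```java
import java.util.ArrayList;
import java.util.List;

class OverloadDemo {

    // Hypothetical stand-in for a framework base class whose default
    // map() is an identity function: it emits key and value unchanged.
    static class BaseMapper<KI, VI> {
        final List<String> out = new ArrayList<>();

        protected void map(KI key, VI value) {
            out.add(key + "\t" + value);
        }

        // The "framework" only ever calls map(KI, VI).
        final void run(KI key, VI value) {
            map(key, value);
        }
    }

    // Declared input key type is Integer, but map() takes a Long.
    // This is an overload, not an override, so run() never reaches it.
    static class MismatchedMapper extends BaseMapper<Integer, String> {
        public void map(Long key, String value) {
            if (value.startsWith("TT")) {
                out.add(value.substring(0, 7));
            }
        }
    }

    // Declared types agree with the method, and @Override makes the
    // compiler check that the method really overrides the base one.
    static class MatchingMapper extends BaseMapper<Long, String> {
        @Override
        protected void map(Long key, String value) {
            if (value.startsWith("TT")) {
                out.add(value.substring(0, 7));
            }
        }
    }

    // Feeds one record through the raw type, the way a framework that
    // works on erased types would, and returns everything emitted so far.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List<String> feed(BaseMapper mapper, long offset, String line) {
        mapper.run(offset, line);
        return mapper.out;
    }

    public static void main(String[] args) {
        String line = "TT12345XYZ";
        // Whole line comes back: the custom logic was silently bypassed.
        System.out.println(feed(new MismatchedMapper(), 0L, line));
        // Only the first seven characters come back.
        System.out.println(feed(new MatchingMapper(), 0L, line));
    }
}
```

If this is indeed the cause, annotating the map() method with @Override would turn the mismatch into a compile-time error instead of a silent fallback.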
If you are using TextInputFormat and reading from a file in HDFS, the input key type should be LongWritable, but your code declares IntWritable as the input key type. Please check that as well.

Regards
Bejoy KS

Sent from handheld, please excuse typos.

-----Original Message-----
From: Mohammad Tariq <donta...@gmail.com>
Date: Thu, 2 Aug 2012 15:48:42
To: <mapreduce-user@hadoop.apache.org>
Reply-To: mapreduce-user@hadoop.apache.org
Subject: Re: Reading fields from a Text line

Thanks for the response, Harsh and Sri. Actually, I was trying to prepare a template for my application, in which I read one line at a time, extract the first field from it and emit that extracted value from the mapper. I have these few lines of code for that:

public static class XPTMapper extends Mapper<IntWritable, Text, LongWritable, Text>{

    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException{

        Text word = new Text();
        String line = value.toString();
        if (!line.startsWith("TT")){
            context.setStatus("INVALID LINE..SKIPPING........");
        }else{
            String stdid = line.substring(0, 7);
            word.set(stdid);
            context.write(key, word);
        }
    }

But the output file contains all the rows of the input file, including the lines which I was expecting to get skipped. Also, I was expecting only the fields I am emitting, but the file contains entire lines. Could you guys please point out the mistake I might have made? (Pardon my ignorance, as I am not very good at MapReduce.) Many thanks.

Regards,
    Mohammad Tariq


On Thu, Aug 2, 2012 at 10:58 AM, Sriram Ramachandrasekaran <sri.ram...@gmail.com> wrote:
> Wouldn't it be better if you could skip those unwanted lines
> upfront (preprocess) and have a file which is ready to be processed by the MR
> system? In any case, more details are needed.
>
>
> On Thu, Aug 2, 2012 at 8:23 AM, Harsh J <ha...@cloudera.com> wrote:
>>
>> Mohammad,
>>
>> > But it seems I am not doing things in correct way. Need some guidance.
>>
>> What do you mean by the above? What is your written code exactly
>> expected to do and what is it not doing? Perhaps since you ask for a
>> code question here, can you share it with us (pastebin or gists,
>> etc.)?
>>
>> For skipping 8 lines, if you are using splits, you need to detect
>> within the mapper or your record reader if the map task filesplit has
>> an offset of 0 and skip 8 line reads if so (because it's the first split
>> of some file).
>>
>> On Thu, Aug 2, 2012 at 1:54 AM, Mohammad Tariq <donta...@gmail.com> wrote:
>> > Hello list,
>> >
>> >        I have a flat file in which data is stored as lines of 107
>> > bytes each. I need to skip the first 8 lines (as they don't contain any
>> > valuable info). Thereafter, I have to read each line and extract the
>> > information from them, but not the line as a whole. Each line is
>> > composed of several fields without any delimiter between them. For
>> > example, the first field is of 8 bytes, second of 2 bytes and so on. I
>> > was trying to read each line as a Text value, convert it into a string
>> > and use the String.substring() method to extract the value of each field.
>> > But it seems I am not doing things in correct way. Need some
>> > guidance. Many thanks.
>> >
>> > Regards,
>> >     Mohammad Tariq
>>
>>
>>
>> --
>> Harsh J
>
>
>
>
> --
> It's just about how deep your longing is!
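
For the original question (fixed-width lines with no delimiters and 8 header lines to skip), the field extraction can be sketched in plain Java as below. Only the first two field widths (8 and 2) and the header count (8) come from the thread. FixedWidthDemo, splitFields and isHeader are hypothetical names, and the sample data is made up. The header check is also a different technique from the split-offset test suggested above: it assumes every line, header lines included, occupies the same number of bytes once the line terminator is counted, so that a line's byte offset (the key TextInputFormat supplies) identifies its line number.

```java
import java.util.ArrayList;
import java.util.List;

class FixedWidthDemo {

    // Number of leading lines to skip, as stated in the thread.
    static final int HEADER_LINES = 8;

    // Field widths in order. Only these first two are given in the
    // thread; the rest of the record layout is not known here.
    static final int[] WIDTHS = {8, 2};

    // Cuts a line into consecutive fields of the given widths.
    // Works on characters, so it matches byte widths only for
    // single-byte encodings. Stops early if the line is too short.
    static List<String> splitFields(String line, int[] widths) {
        List<String> fields = new ArrayList<>();
        int pos = 0;
        for (int width : widths) {
            if (pos + width > line.length()) {
                break;
            }
            fields.add(line.substring(pos, pos + width));
            pos += width;
        }
        return fields;
    }

    // True when byteOffset is the start of one of the first
    // HEADER_LINES lines. ASSUMPTION: all lines have the same length,
    // and bytesPerLine includes the line terminator.
    static boolean isHeader(long byteOffset, int bytesPerLine) {
        return byteOffset < (long) HEADER_LINES * bytesPerLine;
    }

    public static void main(String[] args) {
        // Made-up record: an 8-character field followed by a 2-character field.
        System.out.println(splitFields("TT123456ABremainder", WIDTHS));
        // 108 = 107 data bytes plus an assumed one-byte newline.
        System.out.println(isHeader(7L * 108, 108));
        System.out.println(isHeader(8L * 108, 108));
    }
}
```

Inside a mapper, the same two helpers would be applied to the key (as the byte offset) and to value.toString().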