Seems like StreamInputFormat has not yet been ported to the new API. That's why you are not able to set it as the InputFormatClass. You can file a JIRA for this issue.
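Until that is fixed, one option is to fall back to the old `org.apache.hadoop.mapred` API, where `StreamInputFormat` can still be set on a `JobConf`. A rough, untested sketch (tag names and paths are copied from your driver below; the class name `XmlDriver` and job name are just placeholders):

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.streaming.StreamInputFormat;

public class XmlDriver {
    public static void main(String[] args) throws Exception {
        JobConf conf = new JobConf(XmlDriver.class);
        conf.setJobName("xml-demo");

        // Tell StreamXmlRecordReader which tags delimit one record.
        conf.set("stream.recordreader.class",
                "org.apache.hadoop.streaming.StreamXmlRecordReader");
        conf.set("stream.recordreader.begin", "<info>");
        conf.set("stream.recordreader.end", "</info>");

        // Old API: the input format is set on the JobConf itself.
        conf.setInputFormat(StreamInputFormat.class);

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(conf, new Path("/mapin/demo.xml"));
        FileOutputFormat.setOutputPath(conf, new Path("/mapout/demo"));

        JobClient.runJob(conf);
    }
}
```

Note this needs the hadoop-streaming jar on the classpath, since that is where `StreamInputFormat` and `StreamXmlRecordReader` live.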
On Tue, Jun 19, 2012 at 4:49 PM, Mohammad Tariq <donta...@gmail.com> wrote:
> My driver function looks like this -
>
> public static void main(String[] args) throws IOException,
>         InterruptedException, ClassNotFoundException {
>     // TODO Auto-generated method stub
>
>     Configuration conf = new Configuration();
>     Job job = new Job();
>     conf.set("stream.recordreader.class",
>             "org.apache.hadoop.streaming.StreamXmlRecordReader");
>     conf.set("stream.recordreader.begin", "<info>");
>     conf.set("stream.recordreader.end", "</info>");
>     job.setInputFormatClass(StreamInputFormat.class);
>     job.setOutputKeyClass(Text.class);
>     job.setOutputValueClass(IntWritable.class);
>     FileInputFormat.addInputPath(job, new Path("/mapin/demo.xml"));
>     FileOutputFormat.setOutputPath(job, new Path("/mapout/demo"));
>     job.waitForCompletion(true);
> }
>
> Could you please point out my mistake?
>
> Regards,
> Mohammad Tariq
>
>
> On Tue, Jun 19, 2012 at 4:35 PM, Mohammad Tariq <donta...@gmail.com> wrote:
> > Hello Madhu,
> >
> > Thanks for the response. Actually I was trying to use the
> > new API (Job). Have you tried that? I was not able to set the
> > InputFormat using the Job API.
> >
> > Regards,
> > Mohammad Tariq
> >
> >
> > On Tue, Jun 19, 2012 at 4:28 PM, madhu phatak <phatak....@gmail.com> wrote:
> >> Hi,
> >> Set the following properties in the driver class:
> >>
> >> jobConf.set("stream.recordreader.class",
> >>         "org.apache.hadoop.streaming.StreamXmlRecordReader");
> >> jobConf.set("stream.recordreader.begin", "start-tag");
> >> jobConf.set("stream.recordreader.end", "end-tag");
> >> jobConf.setInputFormat(StreamInputFormat.class);
> >>
> >> In the Mapper, the xml record will come as the key, of type Text, so your
> >> mapper will look like
> >>
> >> public class MyMapper<K,V> implements Mapper<Text,Text,K,V>
> >>
> >>
> >> On Tue, Jun 19, 2012 at 2:49 AM, Mohammad Tariq <donta...@gmail.com> wrote:
> >>>
> >>> Hello list,
> >>>
> >>> Could anyone who has written MapReduce jobs to process xml
> >>> documents stored in their cluster using "StreamXmlRecordReader" share
> >>> his/her experience? Or if you can provide me some pointers
> >>> addressing that. Many thanks.
> >>>
> >>> Regards,
> >>> Mohammad Tariq
> >>
> >>
> >>
> >> --
> >> https://github.com/zinnia-phatak-dev/Nectar
>
--
https://github.com/zinnia-phatak-dev/Nectar
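For anyone reading the archives: filling out Madhu's one-line mapper signature, an old-API mapper that consumes `StreamXmlRecordReader` output might look like the sketch below. Each whole XML record arrives as the key (a `Text`); the value is empty. The class name `XmlRecordMapper` and the record-counting logic are illustrative only, not from the thread:

```java
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

public class XmlRecordMapper extends MapReduceBase
        implements Mapper<Text, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);

    @Override
    public void map(Text key, Text value,
                    OutputCollector<Text, IntWritable> output,
                    Reporter reporter) throws IOException {
        // 'key' holds one complete <info>...</info> block as raw text.
        // Here we just emit one count per record; real code would parse
        // the XML (e.g. with a SAX or DOM parser) before collecting.
        output.collect(new Text("records"), ONE);
    }
}
```

Pair this with `conf.setMapperClass(XmlRecordMapper.class)` in the old-API driver, and set the output key/value classes to match (`Text`/`IntWritable`).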