Yeah, I solved the problem. I had to move the configuration options from the
main function of my code into the workflow.xml file. The source of the
confusion was the "Quick Start" documentation; I strongly feel it needs to be
clear about this.
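
For anyone hitting the same thing: the key/value and format classes set on the JobConf in main() are never seen when Oozie launches the job, so they have to be declared as properties in the <map-reduce> action's <configuration> element. A rough sketch of the extra properties, assuming the old mapred API names that match the WordCount classes below (Text/IntWritable output, text input/output formats); check the names against your Hadoop version:

```xml
<!-- Sketch: add inside <configuration> of the map-reduce action,
     alongside mapred.mapper.class / mapred.reducer.class. -->
<property>
    <name>mapred.output.key.class</name>
    <value>org.apache.hadoop.io.Text</value>
</property>
<property>
    <name>mapred.output.value.class</name>
    <value>org.apache.hadoop.io.IntWritable</value>
</property>
<property>
    <name>mapred.input.format.class</name>
    <value>org.apache.hadoop.mapred.TextInputFormat</value>
</property>
<property>
    <name>mapred.output.format.class</name>
    <value>org.apache.hadoop.mapred.TextOutputFormat</value>
</property>
```

Without these, Hadoop falls back to its defaults (LongWritable keys), which is exactly the "Type mismatch in key from map" error below.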

On Wed, May 23, 2012 at 10:30 PM, Alejandro Abdelnur <[email protected]> wrote:

> Hi Anshul,
>
> It looks like you are setting the key/value classes for the
> input/output/intermediate-output and Hadoop is trying to use the default
> ones.
>
> thx
>
> On Tue, May 22, 2012 at 11:21 PM, Anshul Singhle
> <[email protected]> wrote:
>
> > Hi all,
> > I tried running the wordcount example on Oozie and I'm getting the
> > following exception in my Hadoop log:
> > ERROR org.apache.hadoop.security.UserGroupInformation:
> > PriviledgedActionException as:cloudera (auth:SIMPLE)
> > cause:java.io.IOException: Type mismatch in key from map: expected
> > org.apache.hadoop.io.LongWritable, recieved org.apache.hadoop.io.Text
> > 2012-05-22 18:58:27,832 WARN org.apache.hadoop.mapred.Child: Error running child
> > java.io.IOException: Type mismatch in key from map: expected
> > org.apache.hadoop.io.LongWritable, recieved org.apache.hadoop.io.Text
> >        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:871)
> >        at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:499)
> >        at WordCount$Map.map(WordCount.java:22)
> >        at WordCount$Map.map(WordCount.java:12)
> >        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> >        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:391)
> >        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:325)
> >        at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
> >        at java.security.AccessController.doPrivileged(Native Method)
> >        at javax.security.auth.Subject.doAs(Subject.java:396)
> >        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
> >        at org.apache.hadoop.mapred.Child.main(Child.java:264)
> >
> > WordCount.java is copied verbatim from the Apache MapReduce tutorial. I
> > removed the combiner, which solved the problem for some people over on
> > Stack Overflow, but I'm still getting the error. I ran the same jar
> > directly in Hadoop and got the correct output with no error. Posting
> > WordCount.java for completeness.
> >
> >        import java.io.IOException;
> >        import java.util.*;
> >
> >        import org.apache.hadoop.fs.Path;
> >        import org.apache.hadoop.conf.*;
> >        import org.apache.hadoop.io.*;
> >        import org.apache.hadoop.mapred.*;
> >        import org.apache.hadoop.util.*;
> >
> >        public class WordCount {
> >
> >           public static class Map extends MapReduceBase implements
> > Mapper<LongWritable, Text, Text, IntWritable> {
> >             private final static IntWritable one = new IntWritable(1);
> >             private Text word = new Text();
> >
> >             public void map(LongWritable key, Text value,
> > OutputCollector<Text, IntWritable> output, Reporter reporter) throws
> > IOException {
> >               String line = value.toString();
> >               System.out.println("here"+"\t"+ line + "\t" + key);
> >               StringTokenizer tokenizer = new StringTokenizer(line);
> >               while (tokenizer.hasMoreTokens()) {
> >                 word.set(tokenizer.nextToken());
> >                 output.collect(word, one);
> >               }
> >             }
> >           }
> >
> >           public static class Reduce extends MapReduceBase implements
> > Reducer<Text, IntWritable, Text, IntWritable> {
> >             public void reduce(Text key, Iterator<IntWritable> values,
> > OutputCollector<Text, IntWritable> output, Reporter reporter) throws
> > IOException {
> >               int sum = 0;
> >               while (values.hasNext()) {
> >                 sum += values.next().get();
> >               }
> >               output.collect(key, new IntWritable(sum));
> >             }
> >           }
> >
> >           public static void main(String[] args) throws Exception {
> >             JobConf conf = new JobConf(WordCount.class);
> >             //conf.setJobName("wordcount");
> >             //conf.setJar("wordcount.jar");
> >             conf.setOutputKeyClass(Text.class);
> >             conf.setOutputValueClass(IntWritable.class);
> >
> >             conf.setMapperClass(Map.class);
> >             conf.setReducerClass(Reduce.class);
> >
> >             conf.setInputFormat(TextInputFormat.class);
> >             conf.setOutputFormat(TextOutputFormat.class);
> >
> >             //FileInputFormat.setInputPaths(conf, new Path(args[0]));
> >             //FileOutputFormat.setOutputPath(conf, new Path(args[1]));
> >
> >             JobClient.runJob(conf);
> >           }
> >        }
> >
> > Note that if I change the mapper to map and the reducer to reduce, I don't
> > get the error, but I get the wrong output. I checked the input given to the
> > mapper, and it is apparently empty. Here is my workflow.xml:
> >
> > <workflow-app name='wordcount-wf' xmlns="uri:oozie:workflow:0.1">
> >  <start to='wordcount'/>
> >    <action name='wordcount'>
> >        <map-reduce>
> >            <job-tracker>${jobTracker}</job-tracker>
> >            <name-node>${nameNode}</name-node>
> >            <prepare>
> >            </prepare>
> >            <configuration>
> >
> >                <property>
> >                    <name>mapred.job.queue.name</name>
> >                    <value>${queueName}</value>
> >                </property>
> >                <property>
> >                    <name>mapred.mapper.class</name>
> >                    <value>WordCount$Map</value>
> >                </property>
> >                <property>
> >                    <name>mapred.reducer.class</name>
> >                    <value>WordCount$Reduce</value>
> >                </property>
> >                <property>
> >                    <name>mapred.input.dir</name>
> >                    <value>${inputDir}</value>
> >                </property>
> >                <property>
> >                    <name>mapred.output.dir</name>
> >                    <value>${outputDir}</value>
> >                </property>
> >            </configuration>
> >        </map-reduce>
> >        <ok to='end'/>
> >        <error to='end'/>
> >    </action>
> >    <kill name='kill'>
> >       <message>${wf:errorCode("wordcount")}</message>
> >    </kill>
> >    <end name='end'/>
> > </workflow-app>
> >
> > I'm running Oozie and Hadoop on a single VM from Cloudera.
> > Oozie version:
> > Oozie client build version: 2.3.2-cdh3u4
> > Hadoop version:
> > Subversion file:///data/1/tmp/topdir/BUILD/hadoop-0.20.2-cdh3u4 -r
> > 214dd731e3bdb687cb55988d3f47dd9e248c5690
> > Compiled by root on Mon May  7 14:03:02 PDT 2012
> > From source with checksum a60c9795e41a3248b212344fb131c12c
> >
>
>
>
> --
> Alejandro
>
