[jira] [Commented] (MAPREDUCE-4507) IdentityMapper is being triggered when the type of the Input Key at class level and method level has a conflict

Bejoy KS (JIRA) Thu, 02 Aug 2012 15:53:05 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-4507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427695#comment-13427695
 ]


Bejoy KS commented on MAPREDUCE-4507:
-------------------------------------

This piece of code will trigger the IdentityMapper. No compile-time errors are 
thrown even though the input key type does not match at the class and method 
levels.

Main Class
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;


public class WcNewMain extends Configured implements Tool
{
    public int run(String[] args) throws Exception
    {
        //getting configuration object and setting job name
        Configuration conf = getConf();
        Job job = new Job(conf, "Word Count ");

        //setting the class names
        job.setJarByClass(WcNewMain.class);
        job.setMapperClass(WcMapperNew.class);
        //job.setReducerClass(WordCountReducer.class);
        job.setNumReduceTasks(0);

        //setting the output data type classes
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job,
                new Path("hdfs://localhost:9000/userdata/bejoy/samples/wc/input"));
        FileOutputFormat.setOutputPath(job,
                new Path("hdfs://localhost:9000/userdata/bejoy/samples/wc/output"));

        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int res = ToolRunner.run(new Configuration(), new WcNewMain(), args);
        System.exit(res);
    }
}


{code}

Mapper Class
{code}
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WcMapperNew extends Mapper<IntWritable, Text, Text, IntWritable>
{
    //hadoop supported data types
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    //key type here is LongWritable although the class declares IntWritable
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException
    {
        //taking one line at a time and tokenizing the same
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);

        //iterating through all the words available in that line and
        //forming the key value pair
        while (tokenizer.hasMoreTokens())
        {
            word.set(tokenizer.nextToken());
            //sending to output collector which in turn passes the same to
            //reducer
            context.write(word, one);
        }
    }
}
{code}
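
As far as I understand it, this is ordinary Java overloading rather than 
anything the framework does on purpose. In the new API, Mapper.run() calls 
map(KEYIN, VALUEIN, Context), and the default map() writes the record through 
unchanged. A map() declared with a different key type is a separate overload, 
so the default one still runs. The sketch below reproduces this in plain Java 
without Hadoop; BaseMapper, MismatchedMapper, MatchingMapper and OverloadDemo 
are hypothetical stand-ins, not Hadoop classes.

```java
// Hypothetical plain-Java stand-ins; none of these are Hadoop classes.
class OverloadDemo {
    static String runMismatched() {
        return new MismatchedMapper().run(1, "hello");
    }

    static String runMatching() {
        return new MatchingMapper().run(1, "hello");
    }

    public static void main(String[] args) {
        System.out.println(runMismatched()); // prints "identity: 1=hello"
        System.out.println(runMatching());   // prints "custom: 1=hello"
    }
}

// Stand-in for the base mapper: the default map() passes the record through.
class BaseMapper<KEYIN, VALUEIN> {
    public String map(KEYIN key, VALUEIN value) {
        return "identity: " + key + "=" + value;
    }

    // Stand-in for the run loop: it always calls map(KEYIN, VALUEIN).
    public String run(KEYIN key, VALUEIN value) {
        return map(key, value);
    }
}

// Class-level key type is Integer, method-level key type is Long.
// This map() is an overload, so run() never reaches it.
class MismatchedMapper extends BaseMapper<Integer, String> {
    public String map(Long key, String value) {
        return "custom: " + key + "=" + value;
    }
}

// Key types agree, so this map() overrides the default one.
class MatchingMapper extends BaseMapper<Integer, String> {
    @Override
    public String map(Integer key, String value) {
        return "custom: " + key + "=" + value;
    }
}
```

Putting @Override on the map() in MismatchedMapper would make the compiler 
reject it, because that method does not override anything. The same annotation 
on WcMapperNew.map() should surface the mismatch at compile time.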
                
> IdentityMapper is being triggered when the type of the Input Key at class 
> level and method level has a conflict
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4507
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4507
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: mrv1
>    Affects Versions: 1.0.3
>         Environment: linux ubuntu
>            Reporter: Bejoy KS
>
> If we use the default InputFormat (TextInputFormat) but specify the key type 
> in the mapper as IntWritable instead of LongWritable, the framework is 
> supposed to throw a ClassCastException. Such an exception is thrown only if 
> the key types at the class level and the method level are the same 
> (IntWritable). But if we provide the input key type as IntWritable at the 
> class level and LongWritable at the method level (map method), the code 
> compiles fine instead of producing a compile-time error. In addition, on 
> execution the framework triggers the IdentityMapper instead of the custom 
> mapper provided with the configuration. In this case 'mapreduce.map.class' 
> in job.xml still shows the custom mapper; it should show IdentityMapper in 
> cases where IdentityMapper is triggered, to avoid confusion and to ease 
> debugging.
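
The ClassCastException case mentioned in the description (IntWritable at both 
class and method level) can be sketched in plain Java as well. The assumption 
here is that the caller is not type-checked against the mapper's key type, 
which a raw-typed call imitates. The cast then happens inside the 
compiler-generated bridge method of the overriding map(). PassThroughMapper, 
IntKeyMapper and CastDemo are hypothetical names, not Hadoop classes.

```java
// Hypothetical plain-Java stand-ins; none of these are Hadoop classes.
class CastDemo {
    // The raw-typed call imitates a caller that hands over whatever key type
    // the input produced (a Long here), without compile-time checking.
    @SuppressWarnings({"unchecked", "rawtypes"})
    static String feedLongKey(PassThroughMapper mapper) {
        try {
            return mapper.map(Long.valueOf(0L), "hello");
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        // Override declared with Integer: the bridge method casts Long to Integer.
        System.out.println(feedLongKey(new IntKeyMapper()));
        // No override: the default map() accepts any key after erasure.
        System.out.println(feedLongKey(new PassThroughMapper<Integer, String>()));
    }
}

// Stand-in for the base mapper with a pass-through default map().
class PassThroughMapper<KEYIN, VALUEIN> {
    public String map(KEYIN key, VALUEIN value) {
        return "identity: " + key + "=" + value;
    }
}

// Key type is Integer at both class level and method level.
class IntKeyMapper extends PassThroughMapper<Integer, String> {
    @Override
    public String map(Integer key, String value) {
        return "custom: " + key + "=" + value;
    }
}
```

With matching types the mismatch against the real key type fails loudly at 
run time, whereas the pass-through default never casts the key, so nothing 
fails.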

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
