Below are my simple mapper and partitioner classes, followed by the input file 
and, at the end of this message, the output displayed on the console.

My question is about the keys printed in the "key = ..." lines of the console 
output. Here are the first few lines of that output:

...

13/03/27 02:20:57 INFO mapred.MapTask: data buffer = 79691776/99614720
13/03/27 02:20:57 INFO mapred.MapTask: record buffer = 262144/327680
key = 0 value = 10    10
token[0] = 10 token[1] = 10
Printing Result in Partitioner = 0
IntPair in Mapper = 10-10
key = 6 value = 20    200
token[0] = 20 token[1] = 200
Printing Result in Partitioner = 0
IntPair in Mapper = 20-200 

Q1: Where do the values key = 0, key = 6, and so on come from, and how are 
they calculated?

Q2: After the first two lines of mapper output for each record, the output 
from the partitioner class is printed:
       Printing Result in Partitioner = 0
Is this because the mapper and the partitioner run in parallel?

I would really appreciate it if someone could take a quick look and shed some 
light on this.

*** Mapper class ***


import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SecondarySortMapper extends Mapper<LongWritable, Text, IntPair, IntWritable> {

    private String[] tokens = null;
    private IntWritable ONE = new IntWritable(1);

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {

        // Debug: show the input key and value handed to this map() call
        System.out.println("key = " + key.toString() + " value = " + value.toString());

        if (value != null) {
            // Split the line on whitespace into its two integer columns
            tokens = value.toString().split("\\s+");
            System.out.println("token[0] = " + tokens[0] + " token[1] = " + tokens[1]);
            ONE.set(Integer.parseInt(tokens[1]));
            IntPair ip = new IntPair(Integer.parseInt(tokens[0]), Integer.parseInt(tokens[1]));
            context.write(ip, ONE);
            System.out.println("IntPair in Mapper = " + ip.toString());
        }
    }
}

*** Partitioner class ***

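import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Partitioner;

public class SecondarySortPartitioner extends Partitioner<IntPair, IntWritable> {

    @Override
    public int getPartition(IntPair key, IntWritable value, int numOfPartitions) {
        // Partition on the first field only; mask the sign bit so a negative
        // hash code cannot produce a negative partition number
        int result = (key.getFirst().hashCode() & Integer.MAX_VALUE) % numOfPartitions;
        System.out.println("Printing Result in Partitioner = " + result);
        return result;
    }

}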
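
A note on why the partitioner might always print 0, offered as a minimal sketch rather than a confirmed diagnosis: any value modulo 1 is 0, so if the job runs with a single reduce task (an assumption, since the driver is not shown here), numOfPartitions is 1 and every key lands in partition 0. The sketch below uses a plain int in place of key.getFirst() and a hypothetical helper named partitionFor; neither is part of the classes above. It also masks the sign bit so that a negative hash code cannot yield a negative partition. As far as I understand, getPartition is invoked from inside context.write(), which would explain why its line appears between the two mapper printouts without any parallelism being involved.

```java
public class PartitionCheck {

    // Hypothetical stand-in for getPartition(): 'first' plays the role of
    // key.getFirst(), and its hash code is reduced modulo the partition count.
    static int partitionFor(int first, int numOfPartitions) {
        // Integer.hashCode(int) returns the int itself; the mask clears the sign bit
        return (Integer.hashCode(first) & Integer.MAX_VALUE) % numOfPartitions;
    }

    public static void main(String[] args) {
        int[] firsts = {10, 20, 30, 40, 50, 60};
        for (int first : firsts) {
            // With one partition the result is always 0; with three it varies
            System.out.println("first = " + first
                    + " partitions=1 -> " + partitionFor(first, 1)
                    + " partitions=3 -> " + partitionFor(first, 3));
        }
    }
}
```

With one partition every line reports 0, which matches the console output in this message; with three partitions the same keys spread over 0, 1 and 2.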


*** Input file ***

10    10
20    200
30    2500
40    400
50    500
60    1
10    10
30    2500
50    500
10    100
20    2000
30    25000
40    4000
50    5000
60    10
10    100
30    25000
50    5000



*** Console output ***
...

13/03/27 02:20:57 INFO mapred.MapTask: data buffer = 79691776/99614720
13/03/27 02:20:57 INFO mapred.MapTask: record buffer = 262144/327680
key = 0 value = 10    10
token[0] = 10 token[1] = 10
Printing Result in Partitioner = 0
IntPair in Mapper = 10-10
key = 6 value = 20    200
token[0] = 20 token[1] = 200
Printing Result in Partitioner = 0
IntPair in Mapper = 20-200
key = 13 value = 30    2500
token[0] = 30 token[1] = 2500
Printing Result in Partitioner = 0
IntPair in Mapper = 30-2500
key = 21 value = 40    400
token[0] = 40 token[1] = 400
Printing Result in Partitioner = 0
IntPair in Mapper = 40-400
key = 28 value = 50    500
token[0] = 50 token[1] = 500
Printing Result in Partitioner = 0
IntPair in Mapper = 50-500
key = 35 value = 60    1
token[0] = 60 token[1] = 1
Printing Result in Partitioner = 0
IntPair in Mapper = 60-1
key = 40 value = 10    10
token[0] = 10 token[1] = 10
Printing Result in Partitioner = 0
IntPair in Mapper = 10-10
key = 46 value = 30    2500
token[0] = 30 token[1] = 2500
Printing Result in Partitioner = 0
IntPair in Mapper = 30-2500
key = 54 value = 50    500
token[0] = 50 token[1] = 500
Printing Result in Partitioner = 0
IntPair in Mapper = 50-500
key = 61 value = 10    100
token[0] = 10 token[1] = 100
Printing Result in Partitioner = 0
IntPair in Mapper = 10-100
key = 68 value = 20    2000
token[0] = 20 token[1] = 2000
Printing Result in Partitioner = 0
IntPair in Mapper = 20-2000
key = 76 value = 30    25000
token[0] = 30 token[1] = 25000
Printing Result in Partitioner = 0
IntPair in Mapper = 30-25000
key = 85 value = 40    4000
token[0] = 40 token[1] = 4000
Printing Result in Partitioner = 0
IntPair in Mapper = 40-4000
key = 93 value = 50    5000
token[0] = 50 token[1] = 5000
Printing Result in Partitioner = 0
IntPair in Mapper = 50-5000
key = 101 value = 60    10
token[0] = 60 token[1] = 10
Printing Result in Partitioner = 0
IntPair in Mapper = 60-10
key = 107 value = 10    100
token[0] = 10 token[1] = 100
Printing Result in Partitioner = 0
IntPair in Mapper = 10-100
key = 114 value = 30    25000
token[0] = 30 token[1] = 25000
Printing Result in Partitioner = 0
IntPair in Mapper = 30-25000
key = 123 value = 50    5000
token[0] = 50 token[1] = 5000
Printing Result in Partitioner = 0
IntPair in Mapper = 50-5000
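
One observation on the key values, offered as a sketch under stated assumptions: with the default text input format, the LongWritable key passed to map() is the byte offset at which the line starts in the file. If the two columns in the input file are separated by a single tab and each line ends with a single newline (both assumptions; the whitespace shown in this message may not match the file exactly), then the running offsets come out as 0, 6, 13, 21, and so on, which matches the keys printed above. The class and method names below are hypothetical and exist only for this check.

```java
import java.nio.charset.StandardCharsets;

public class OffsetCheck {

    // Returns the byte offset at which each line of 'content' starts,
    // assuming lines are terminated by a single '\n' byte.
    static long[] lineOffsets(String content) {
        String[] lines = content.split("\n");
        long[] offsets = new long[lines.length];
        long pos = 0;
        for (int i = 0; i < lines.length; i++) {
            offsets[i] = pos;
            // Advance past the line's bytes plus one byte for the newline
            pos += lines[i].getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return offsets;
    }

    public static void main(String[] args) {
        // First four input lines, assuming a tab between the two columns
        String sample = "10\t10\n20\t200\n30\t2500\n40\t400\n";
        for (long offset : lineOffsets(sample)) {
            System.out.println("key = " + offset);
        }
    }
}
```

For the four sample lines this prints key = 0, key = 6, key = 13 and key = 21, the same sequence as the first four "key = ..." lines in the console output.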



Thanks
Sai
