Thanks Mahesh for your help.

I was wondering if you could provide some insight into the compare method 
below, which uses byte[] in the SecondarySort example:

public static class Comparator extends WritableComparator {
        public Comparator() {
            super(URICountKey.class);
        }

        public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
            return compareBytes(b1, s1, l1, b2, s2, l2);
        }
    }


My question: in the compareTo method that I have given below, we compare 
word1 and word2, which makes sense. But what about this byte[] comparison? 
Am I right in assuming that it converts each object's word1/word2/word3 to 
byte[] and compares them?
If so, is it done for performance reasons?
Could you please verify?
Thanks
Sai
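
To make the byte[] question concrete, here is a minimal plain-Java sketch with no Hadoop dependency. It assumes the key is serialized with writeUTF, as in the WordPairCountKey class quoted below; how URICountKey is actually serialized is not shown in this thread. The class RawCompareSketch and its compareBytes helper are hypothetical stand-ins that compare bytes lexicographically as unsigned values. One thing the sketch suggests: because writeUTF puts a 2-byte length in front of each string, raw byte order need not match the order given by String.compareTo.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical demo class; not part of the SecondarySort example.
final class RawCompareSketch {

    // Serializes two words the way the quoted write() method does (assumption).
    static byte[] serialize(String word1, String word2) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeUTF(word1); // 2-byte length prefix, then modified UTF-8 bytes
            out.writeUTF(word2);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    // Stand-in helper: lexicographic comparison of unsigned bytes,
    // with the shorter range sorting first when one is a prefix of the other.
    static int compareBytes(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int a = b1[s1 + i] & 0xff;
            int b = b2[s2 + i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        return l1 - l2;
    }

    // Compares two (word1, word2) keys on their serialized bytes only.
    static int rawCompare(String a1, String a2, String b1, String b2) {
        byte[] x = serialize(a1, a2);
        byte[] y = serialize(b1, b2);
        return compareBytes(x, 0, x.length, y, 0, y.length);
    }

    public static void main(String[] args) {
        // Same-length words: byte order agrees with String order.
        System.out.println(rawCompare("a", "b", "a", "c") < 0);   // true
        // Different lengths: the length prefix decides first, so "b" sorts before "aa".
        System.out.println(rawCompare("b", "d", "aa", "c") < 0);  // true
        System.out.println("b".compareTo("aa") > 0);              // true
    }
}
```

The point of comparing bytes directly is that no key objects have to be deserialized during the sort, which is where the performance benefit would come from; the cost is that the byte layout then defines the sort order.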


________________________________
 From: Mahesh Balija <balijamahesh....@gmail.com>
To: user@hadoop.apache.org; Sai Sai <saigr...@yahoo.in> 
Sent: Saturday, 23 February 2013 5:23 AM
Subject: Re: WordPairCount Mapreduce question.
 

Please check the in-line answers...


On Sat, Feb 23, 2013 at 6:22 PM, Sai Sai <saigr...@yahoo.in> wrote:


>
>Hello
>
>
>I have a question about how Mapreduce sorting works internally with multiple 
>columns.
>
>
>Below are my classes, which use the 2 columns of the input file given below.
>
>
>
>1st question: About the hashCode method: we are adding a "31 * " term, and I 
>am wondering why this is required. What does 31 refer to?
>
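This is how the hash code is usually calculated for any String instance: 
s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1], where n is the length of the 
String. Your key has 2 fields instead of n chars, so the same idea gives 
a * 31^0 + b * 31^1, where a = word1.hashCode() and b = word2.hashCode(). 
The multiplier 31 makes the result depend on the order of the fields.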
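
As a small illustration of the formula above, the following sketch recomputes the String hash by hand and shows the effect of the multiplier when two field hashes are combined. The class HashCodeSketch and its method names are made up for this example; only String.hashCode() and the combination used in the quoted key class come from the thread.

```java
// Hypothetical demo class; not part of the original WordPairCountKey code.
final class HashCodeSketch {

    // Recomputes s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1] using Horner's rule.
    // int overflow wraps around, exactly as it does in String.hashCode().
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i);
        }
        return h;
    }

    // Same combination as the hashCode() in the quoted key class.
    static int pairHash(String word1, String word2) {
        return word1.hashCode() + 31 * word2.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(polynomialHash("ab") == "ab".hashCode()); // true
        System.out.println(pairHash("a", "b")); // 3135
        System.out.println(pairHash("b", "a")); // 3105
    }
}
```

Without the multiplier, the pairs (a, b) and (b, a) would both hash to 97 + 98 = 195, so swapped pairs would always collide.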
 


>
>2nd question: what if my input file has 3 columns instead of 2 how would you 
>write a compare method and was wondering if anyone can map this to a real 
>world scenario it will be really helpful.
>
You would extend the same approach to the third column:
    public int compareTo(WordPairCountKey o) {
        int diff = word1.compareTo(o.word1);
        if (diff == 0) {
            diff = word2.compareTo(o.word2);
            if (diff == 0) {
                diff = word3.compareTo(o.word3);
            }
        }
        return diff;
    }
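
For reference, here is a self-contained sketch of the same three-column comparison that can be run without Hadoop. ThreeWordKey is a hypothetical stand-in for a WordPairCountKey with an added word3 field; the Writable methods (readFields, write) are left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical three-column key; Hadoop's WritableComparable is replaced by
// plain Comparable so the sketch runs on its own.
final class ThreeWordKey implements Comparable<ThreeWordKey> {
    final String word1;
    final String word2;
    final String word3;

    ThreeWordKey(String word1, String word2, String word3) {
        this.word1 = word1;
        this.word2 = word2;
        this.word3 = word3;
    }

    // Compare column by column; a later column only matters on a tie.
    @Override
    public int compareTo(ThreeWordKey o) {
        int diff = word1.compareTo(o.word1);
        if (diff == 0) {
            diff = word2.compareTo(o.word2);
            if (diff == 0) {
                diff = word3.compareTo(o.word3);
            }
        }
        return diff;
    }

    @Override
    public String toString() {
        return word1 + " " + word2 + " " + word3;
    }

    // Returns a sorted copy of the given keys.
    static List<ThreeWordKey> sorted(List<ThreeWordKey> keys) {
        List<ThreeWordKey> copy = new ArrayList<>(keys);
        Collections.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        List<ThreeWordKey> keys = new ArrayList<>();
        keys.add(new ThreeWordKey("a", "b", "z"));
        keys.add(new ThreeWordKey("a", "b", "c"));
        keys.add(new ThreeWordKey("a", "a", "z"));
        System.out.println(sorted(keys)); // [a a z, a b c, a b z]
    }
}
```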
    

>
>
>
>    @Override
>    public int compareTo(WordPairCountKey o) {
>        int diff = word1.compareTo(o.word1);
>        if (diff == 0) {
>            diff = word2.compareTo(o.word2);
>        }
>        return diff;
>    }
>    
>    @Override
>    public int hashCode() {
>        return word1.hashCode() + 31 * word2.hashCode();
>    }
>
>
>******************************
>
>Here is my input file wordpair.txt
>
>******************************
>
>a    b
>a    c
>a    b
>a    d
>b    d
>e    f
>b    d
>e    f
>b    d
>
>**********************************
>
>
>Here is my WordPairObject:
>
>*********************************
>
>public class WordPairCountKey implements WritableComparable<WordPairCountKey> {
>
>    private String word1;
>    private String word2;
>
>    @Override
>    public int compareTo(WordPairCountKey o) {
>        int diff = word1.compareTo(o.word1);
>        if (diff == 0) {
>            diff = word2.compareTo(o.word2);
>        }
>        return diff;
>    }
>    
>    @Override
>    public int hashCode() {
>        return word1.hashCode() + 31 * word2.hashCode();
>    }
>
>    
>    public String getWord1() {
>        return word1;
>    }
>
>    public void setWord1(String word1) {
>        this.word1 = word1;
>    }
>
>    public String getWord2() {
>        return word2;
>    }
>
>    public void setWord2(String word2) {
>        this.word2 = word2;
>    }
>
>    @Override
>    public void readFields(DataInput in) throws IOException {
>        word1 = in.readUTF();
>        word2 = in.readUTF();
>    }
>
>    @Override
>    public void write(DataOutput out) throws IOException {
>        out.writeUTF(word1);
>        out.writeUTF(word2);
>    }
>
>    
>    @Override
>    public String toString() {
>        return "[word1=" + word1 + ", word2=" + word2 + "]";
>    }
>
>}
>
>******************************
>
>Any help will be really appreciated.
>Thanks
>Sai
>
