Re: Help with Map Reduce

2009-04-19 Thread Brian Bockelman
Hey Reza, From reading your code, you are calling this for the key "sex": output.collect("The total population is: ", (actual population)) and, for every other key: output.collect("The total population is: ", 0). You probably only want to call the output collector in the first case, not every
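The suggestion above can be sketched in plain Java. This is only an illustration: `Collector`, `PopulationReduceSketch`, and `run` are hypothetical names standing in for Hadoop's output collector and Reza's reducer, which the thread does not show in full, and the key is assumed to be the string "sex".

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for Hadoop's OutputCollector, so the sketch runs without Hadoop.
interface Collector {
    void collect(String key, int value);
}

class PopulationReduceSketch {
    // Emits one record for the key of interest and nothing at all for any other key.
    static void reduce(String key, Iterator<Integer> values, Collector output) {
        if (!"sex".equals(key)) {
            return; // other keys: skip the collector instead of emitting a 0
        }
        int population = 0;
        while (values.hasNext()) {
            population += values.next();
        }
        output.collect("The total population is: ", population);
    }

    // Test helper (assumption): runs reduce once and returns what was emitted.
    static List<String> run(String key, List<Integer> values) {
        List<String> emitted = new ArrayList<>();
        reduce(key, values.iterator(), (k, v) -> emitted.add(k + v));
        return emitted;
    }
}
```

With this shape, a key other than "sex" produces no output record, so the "0" lines disappear from the result.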

Re: Help with Map Reduce

2009-04-19 Thread Reza
Brian: Thanks for your response. I have 8 total keys and values. The code I show below is part of the whole thing, just to illustrate my problem. I'm trying to handle each key with an if statement (as in the piece I showed before). Each of my 8 keys has its own if statement. However, it

Re: Help with Map Reduce

2009-04-19 Thread Brian Bockelman
Hm, I don't know how equals() is implemented for Text, but I'd try: key.toString().equals("sex") Brian On Apr 19, 2009, at 11:29 AM, Reza wrote: Brian: Thanks for your response. I have 8 total keys and values. The code I show below is part of the whole thing, just to illustrate my

Re: Help with Map Reduce

2009-04-19 Thread Tushar Jain
Could you try two things, just to further test your theory that the equals is what's failing: 1. Log the value of the key (so the result of key.toString()), just to see what it is. 2. Do as below: Text sexKey = new Text("sex"); if(key.equals(sexKey)) { . . } These 2 should at least help confirm or
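A minimal sketch of the logging step, using a plain String in place of key.toString() so it runs without Hadoop. `KeyDebugSketch`, `describe`, and `matches` are hypothetical names. The thread does not establish why the comparison fails; stray whitespace in the key is only one possibility that this kind of log line would expose.

```java
class KeyDebugSketch {
    // Wraps the key in brackets and reports its length so hidden whitespace shows up in a log.
    static String describe(String key) {
        return "key=[" + key + "] length=" + key.length();
    }

    // Compares after trimming; only appropriate if surrounding whitespace is not significant.
    static boolean matches(String key, String expected) {
        return key.trim().equals(expected);
    }
}
```

Logging `KeyDebugSketch.describe(key.toString())` inside the reducer would show at a glance whether the key is exactly the expected string.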

Are SequenceFiles split? If so, how?

2009-04-19 Thread Barnet Wagman
Suppose a SequenceFile (containing keys and values that are BytesWritable) is used as input. Will it be divided into InputSplits? If so, what criteria are used for splitting? I'm interested in this because I need to control the number of map tasks used, which (if I understand it correctly),
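As a rough sketch of how the split size is chosen for file-based input in the old org.apache.hadoop.mapred API: the size is the block size, capped by a goal size derived from the requested number of maps and floored by a configured minimum. The class and the `goalSize` helper below are hypothetical; the arithmetic follows FileInputFormat's split-size calculation as I understand it, so check it against the Hadoop version in use.

```java
class SplitSizeSketch {
    // Goal size: total input bytes divided by the requested number of map tasks.
    static long goalSize(long totalSize, int requestedMaps) {
        return totalSize / Math.max(requestedMaps, 1);
    }

    // Split size: at most one block, at most the goal size, and never below the minimum.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }
}
```

Under this calculation, asking for fewer maps than there are blocks does not enlarge the splits by itself; raising the minimum split size above the block size does.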

Re: Help with Map Reduce

2009-04-19 Thread Reza
Tushar: I also tried that to make sure I wasn't going crazy, but it still did not go through the if statement. I do not know why it is bypassing it. This is very weird to me. Unless I have missed something, the logic should be working. Reza Could you try two things, just to further test your

Performance question

2009-04-19 Thread Mark Kerzner
Hi, I ran a Hadoop MapReduce task in local mode, reading from and writing to HDFS, and it took 2.5 minutes. Essentially the same operations on the local file system without MapReduce took half a minute. Is this to be expected? It seemed that the system lost most of the time in the MapReduce

which is better Text or Custom Class

2009-04-19 Thread chintan bhatt
Hi all, I want to ask you about the performance difference between using the Text class and using a custom class that implements the Writable interface. Let's say, in the InvertedIndex problem, when I emit a token and a list of the document IDs that contain it, using Text we usually concatenate the list of
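For comparison, here is a minimal sketch of what the custom-class option might look like. `DocIdList` is a hypothetical class; it mirrors the two methods of Hadoop's Writable interface, write(DataOutput) and readFields(DataInput), but does not implement the interface so that it runs with the standard library alone. It says nothing about which option is faster; it only shows the binary layout (a count followed by fixed-width ints) that replaces string concatenation and parsing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class DocIdList {
    int[] docIds = new int[0];

    // Same signature as Writable.write: a length prefix, then each ID as a 4-byte int.
    void write(DataOutput out) throws IOException {
        out.writeInt(docIds.length);
        for (int id : docIds) {
            out.writeInt(id);
        }
    }

    // Same signature as Writable.readFields: reads back exactly what write produced.
    void readFields(DataInput in) throws IOException {
        int n = in.readInt();
        docIds = new int[n];
        for (int i = 0; i < n; i++) {
            docIds[i] = in.readInt();
        }
    }

    // Round-trip helpers (assumptions, for demonstration only).
    static byte[] toBytes(DocIdList list) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            list.write(new DataOutputStream(buf));
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static DocIdList fromBytes(byte[] bytes) {
        try {
            DocIdList list = new DocIdList();
            list.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return list;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real job the class would declare `implements Writable` and be used as the map output value type in place of Text.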