Re: Help on streaming jobs

2010-08-27 Thread Xin Feng
In both the mapper and the reducer, I used `while (fscanf(stdin, "%s\t%s", key, value) == 2) { }` to exhaust stdin. Is this the reason that only one reducer was initiated? By the way, a couple of mappers were always started. Xin On Sat, Aug 28, 2010 at 12:52 AM, Xin Feng wrote: > Did you mean that i

Re: Help on streaming jobs

2010-08-27 Thread Xin Feng
Did you mean that I should include `-D mapreduce.job.reduces=2`, since two tags exist? Xin On Sat, Aug 28, 2010 at 12:25 AM, Ken Goodhope wrote: > Your number of reducers is not set by your number of keys. If you had > an input with a million unique keys, would you expect a million > reduce
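[Editor's note: assuming the 0.21+ property name (older releases use `mapred.reduce.tasks`), the streaming invocation from the thread with two reducers would look roughly like the following; all paths are the placeholders used in the original post:]

```shell
# Generic options such as -D must come before the streaming-specific
# options (-input, -mapper, ...), or the parser will reject them.
path/to/hadoop jar path/to/streaming.jar \
  -D mapreduce.job.reduces=2 \
  -input path/to/input \
  -output path/to/output \
  -mapper my_own_mapper \
  -reducer my_own_reducer
```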

Re: Help on streaming jobs

2010-08-27 Thread Ken Goodhope
Your number of reducers is not set by your number of keys. If you had an input with a million unique keys, would you expect a million reducers, each processing one record? The number is set in the conf. It's the partitioner's job to divide the work among those reducers, and in this case since you did

Help on streaming jobs

2010-08-27 Thread Xin Feng
Hi, first post. I wrote my own mapper and reducer in C++. I tried submitting the streaming job using the following command: path/to/hadoop jar path/to/streaming.jar -input path/to/input -output path/to/output -mapper my_own_mapper -reducer my_own_reducer The result shows that only 1 reduce

Surge 2010 Early Registration ends Tuesday!

2010-08-27 Thread Jason Dixon
Early Bird Registration for Surge Scalability Conference 2010 ends next Tuesday, August 31. We have a killer lineup of speakers and architects from across the Internet. Listen to experts talk about the newest methods and technologies for scaling your Web presence. http://omniti.com/surge/2010/re