Re: Problem to deploy hadoop in ec2

2010-09-13 Thread Xin Feng
Hi Tao, This then comes down to the network configuration. Since your nodes are not on the same local network, the conf/slaves file and the /etc/hosts file should be tailored to reflect the actual IPs of your nodes. In addition, communication between the subnets must be properly configured (
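The advice above can be sketched as follows. This is a minimal illustration with placeholder hostnames and addresses, not the poster's actual setup; substitute your instances' real routable IPs:

```
# /etc/hosts on every node: map each node's hostname to an IP that is
# reachable from the other nodes (addresses below are placeholders)
10.0.1.10   master
10.0.2.21   slave1
10.0.2.22   slave2

# conf/slaves on the master: one worker hostname (or IP) per line
slave1
slave2
```

The key point is that every name a daemon binds to or connects to must resolve to an address the other nodes can actually route to, which on EC2 usually means being deliberate about public versus private addresses.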

Re: Problem to deploy hadoop in ec2

2010-09-13 Thread Xin Feng
Hi Tao, I think the first step is to make sure your Hadoop cluster is actually running and has adequate HDFS space. You can check the number of active nodes in the cluster and the size of the HDFS space in the web UI or via the command line. Xin On Mon, Sep 13, 2010 at 11:36 AM, Tao You wrote: >
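A command-line sketch of that check, assuming a 2010-era (0.20.x) installation with `hadoop` on the PATH; these need a running cluster, and the HDFS paths are placeholders:

```shell
# Report live datanodes plus configured, used, and remaining HDFS capacity
hadoop dfsadmin -report

# Sanity check that HDFS actually accepts writes and reads
echo "hello" > probe.txt
hadoop fs -put probe.txt /tmp/probe.txt
hadoop fs -cat /tmp/probe.txt
```

If `dfsadmin -report` shows zero live datanodes or no remaining capacity, no streaming job will get off the ground regardless of how the job itself is written.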

Re: Help on streaming jobs

2010-08-27 Thread Xin Feng
In both the mapper and the reducer, I used: while (fscanf(stdin, "%s\t%s", key, value) == 2) { } to exhaust stdin. Is this the reason that only 1 reducer was initiated? By the way, a couple of mappers were always initiated. Xin On Sat, Aug 28, 2010 at 12:52 AM, Xin Feng wrote: > Did
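For reference, a self-contained C++ sketch of the read-until-EOF loop a streaming mapper or reducer typically uses. The `read_records` helper is hypothetical (Hadoop streaming only requires reading lines from stdin and writing `key\tvalue` lines to stdout); splitting on the first tab is more robust than `%s`-style scanning, which breaks on keys or values containing spaces:

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Read tab-separated key/value records from a stream until EOF, the way a
// streaming task consumes its stdin. Hypothetical helper for illustration.
std::vector<std::pair<std::string, std::string>> read_records(std::istream& in) {
    std::vector<std::pair<std::string, std::string>> records;
    std::string line;
    while (std::getline(in, line)) {            // loop until stdin is exhausted
        std::string::size_type tab = line.find('\t');
        if (tab == std::string::npos) continue; // skip lines with no tab
        records.emplace_back(line.substr(0, tab), line.substr(tab + 1));
    }
    return records;
}
```

Note that how the loop reads stdin has no effect on how many reducers are launched; that is decided by the job configuration before any task runs.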

Re: Help on streaming jobs

2010-08-27 Thread Xin Feng
illion > reducers each processing one record?  The number is set in the conf. > It's the partitioner's job to divide the work among those reducers, and > in this case, since you didn't override the default of one, all work > went to the same reducer. > > On Friday, Augus
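Concretely, the reduce count can be overridden on the command line. A sketch using the 0.20-era property name `mapred.reduce.tasks`; the jar path and HDFS paths are placeholders, and this requires a running cluster:

```shell
# Request 4 reduce tasks instead of the default of one
hadoop jar $HADOOP_HOME/contrib/streaming/hadoop-streaming.jar \
    -D mapred.reduce.tasks=4 \
    -input path/to/input \
    -output path/to/output \
    -mapper my_own_mapper \
    -reducer my_own_reducer
```

The partitioner then hashes each map output key across those 4 reducers, so keys are spread out instead of all landing in a single reduce task.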

Help on streaming jobs

2010-08-27 Thread Xin Feng
Hi, First post. I wrote my own mapper and reducer in C++. I tried submitting the streaming job using the following command: path/to/hadoop jar path/to/streaming.jar -input path/to/input -output path/to/ouput -mapper my_own_mapper -reducer my_own_reducer The result shows that only 1 reduce
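One common addition to a command like the one above: when the mapper and reducer are locally compiled binaries rather than programs already installed on every node, they need to be shipped to the task nodes with `-file`. A sketch reusing the poster's placeholder paths; it requires a running cluster:

```shell
path/to/hadoop jar path/to/streaming.jar \
    -input path/to/input \
    -output path/to/output \
    -mapper my_own_mapper \
    -reducer my_own_reducer \
    -file my_own_mapper \
    -file my_own_reducer
```

Without `-file`, each task attempts to execute a binary that may not exist on its node, which typically surfaces as failed tasks rather than as the single-reducer behavior described here.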