Re: configuring different number of slaves for MR jobs
Thanks Suhas. I will try using HOD. My use case is research experiments that need a different set of slaves for each job run.

On Tue, Sep 27, 2011 at 1:03 PM, Vitthal "Suhas" Gogate <gog...@hortonworks.com> wrote:
> [quoted message appears in full below]
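For reference, a minimal HOD session of the kind Suhas describes might look like this (the cluster directory, node count, and job jar are placeholders):

  # Allocate a cluster of the desired size; HOD generates a Hadoop
  # configuration for the provisioned cluster under the given directory.
  hod allocate -d ~/hod-clusters/exp1 -n 10

  # Point the job at the HOD-provisioned cluster via --config.
  hadoop --config ~/hod-clusters/exp1 jar hadoop-examples.jar wordcount in out

  # Release the nodes when the run is finished.
  hod deallocate -d ~/hod-clusters/exp1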
Re: configuring different number of slaves for MR jobs
The slaves file is used only by control scripts like {start/stop}-dfs.sh and {start/stop}-mapred.sh to start the datanodes and task trackers on a specified set of slave machines. It cannot be used effectively to change the size of the cluster for each M/R job (unless you want to restart the task trackers with a different set of slaves before every M/R job :)

You can use the Hadoop JobTracker schedulers (Capacity/Fair-share) to allocate and share cluster capacity effectively. There is also the option of using HOD (Hadoop On Demand) to dynamically allocate a cluster with the required number of nodes; it is typically used by QA/RE folks for testing purposes. Again, in production, resizing the HDFS cluster is not easy, since the nodes hold the data.

--Suhas

On Tue, Sep 27, 2011 at 8:50 AM, bikash sharma wrote:
> [original question appears in full below]
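If restarting between runs is acceptable for experiments, the control scripts do accept an alternate configuration directory, so one rough sketch (directory names are hypothetical) is to keep per-experiment conf dirs that differ only in their slaves file:

  # conf.5nodes/ and conf.10nodes/ are copies of the stock conf dir
  # whose slaves files list different machines. HDFS is left running
  # untouched, since the datanodes hold the data.
  bin/stop-mapred.sh --config /path/to/conf.5nodes
  bin/start-mapred.sh --config /path/to/conf.10nodes

For sharing a fixed-size cluster instead, the Fair Scheduler on the 0.20/1.x line is enabled by setting mapred.jobtracker.taskScheduler to org.apache.hadoop.mapred.FairScheduler in mapred-site.xml, assuming the fair scheduler jar is on the JobTracker classpath.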
configuring different number of slaves for MR jobs
Hi -- Can we specify a different set of slaves for each MapReduce job run? I tried using the --config option to point at a configuration directory with a different slaves file. However, the job does not use the selected set of slaves, only the one initially configured.

Any help?

Thanks,
Bikash
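For context, the attempt looked roughly like this (paths and job names are hypothetical); as Suhas explains above in the thread, the slaves file in the alternate directory is read only by the start/stop scripts, never at job-submission time, which is why the cluster size does not change:

  # --config here only changes client-side settings, e.g. which
  # JobTracker and NameNode the client contacts; it does not
  # re-read the slaves file or resize the running cluster.
  hadoop --config /path/to/conf.alt jar myjob.jar MyJob input output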