Re: configuring different number of slaves for MR jobs

2011-09-27 Thread bikash sharma
Thanks, Suhas. I will try using HOD. My use case is research experiments that
need a different set of slaves for each job run.
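
For reference, a minimal HOD session for this kind of run might look like the
following (node count, directory names, and the job jar are illustrative;
command syntax is per the HOD user guide for 0.20-era Hadoop):

    # Allocate a 10-node cluster; HOD writes the generated hadoop-site.xml
    # into the given cluster directory.
    hod allocate -d ~/hod-clusters/run1 -n 10

    # Run a job against the allocated cluster by pointing --config at the
    # directory HOD populated.
    hadoop --config ~/hod-clusters/run1 jar my-job.jar MyJob input output

    # Tear the cluster down when the run is finished.
    hod deallocate -d ~/hod-clusters/run1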

On Tue, Sep 27, 2011 at 1:03 PM, Vitthal "Suhas" Gogate <gog...@hortonworks.com> wrote:

> The slaves file is used only by the control scripts ({start/stop}-dfs.sh and
> {start/stop}-mapred.sh) to start the data nodes and task trackers on a
> specified set of slave machines. It cannot be used effectively to change the
> size of the cluster for each M/R job (unless you want to restart the task
> trackers with a different set of slaves before every M/R job :)
>
> You can use the Hadoop JobTracker schedulers (Capacity Scheduler or Fair
> Scheduler) to allocate and share cluster capacity effectively. There is also
> the option of using HOD (Hadoop On Demand) to dynamically allocate a cluster
> with the required number of nodes; it is typically used by QA/RE folks for
> testing. Note that resizing an HDFS cluster in production is not easy, since
> the nodes hold the data.
>
> --Suhas
>
> On Tue, Sep 27, 2011 at 8:50 AM, bikash sharma wrote:
>
> > Hi -- Can we specify a different set of slaves for each MapReduce job run?
> > I tried using the --config option and specified a different set of slaves
> > in the slaves config file. However, the job does not use the selected set
> > of slaves, but the one initially configured.
> >
> > Any help?
> >
> > Thanks,
> > Bikash
> >
>


Re: configuring different number of slaves for MR jobs

2011-09-27 Thread Vitthal "Suhas" Gogate
The slaves file is used only by the control scripts ({start/stop}-dfs.sh and
{start/stop}-mapred.sh) to start the data nodes and task trackers on a
specified set of slave machines. It cannot be used effectively to change the
size of the cluster for each M/R job (unless you want to restart the task
trackers with a different set of slaves before every M/R job :)
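
For illustration, the control-script flow looks roughly like this (hostnames
are placeholders):

    # conf/slaves lists one worker hostname per line
    $ cat conf/slaves
    worker01
    worker02
    worker03

    # The control scripts ssh to each listed host to start the daemons:
    $ bin/start-dfs.sh      # NameNode locally, DataNodes on the slaves
    $ bin/start-mapred.sh   # JobTracker locally, TaskTrackers on the slaves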

You can use the Hadoop JobTracker schedulers (Capacity Scheduler or Fair
Scheduler) to allocate and share cluster capacity effectively. There is also
the option of using HOD (Hadoop On Demand) to dynamically allocate a cluster
with the required number of nodes; it is typically used by QA/RE folks for
testing. Note that resizing an HDFS cluster in production is not easy, since
the nodes hold the data.
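
For example, switching the JobTracker from the default FIFO scheduler to the
Fair Scheduler is a mapred-site.xml change (property name per the 0.20-era
scheduler documentation; the Capacity Scheduler would use
org.apache.hadoop.mapred.CapacityTaskScheduler instead):

    <!-- mapred-site.xml: replace the default FIFO scheduler -->
    <property>
      <name>mapred.jobtracker.taskScheduler</name>
      <value>org.apache.hadoop.mapred.FairScheduler</value>
    </property>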

--Suhas

On Tue, Sep 27, 2011 at 8:50 AM, bikash sharma wrote:

> Hi -- Can we specify a different set of slaves for each MapReduce job run?
> I tried using the --config option and specified a different set of slaves
> in the slaves config file. However, the job does not use the selected set
> of slaves, but the one initially configured.
>
> Any help?
>
> Thanks,
> Bikash
>


configuring different number of slaves for MR jobs

2011-09-27 Thread bikash sharma
Hi -- Can we specify a different set of slaves for each MapReduce job run?
I tried using the --config option and specified a different set of slaves in
the slaves config file. However, the job does not use the selected set of
slaves, but the one initially configured.
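
In case it helps, this is roughly what I tried (directory and jar names are
illustrative):

    # alternate configuration directory containing its own slaves file
    $ ls my-conf/
    core-site.xml  hdfs-site.xml  mapred-site.xml  slaves

    # point the hadoop command at it for a single job run
    $ hadoop --config my-conf jar my-job.jar MyJob input output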

Any help?

Thanks,
Bikash