Re: Is there a way to limit # of hadoop tasks per user at runtime?

2013-05-23 Thread Amal G Jose
You can also use the capacity scheduler. With it you can create several
queues, each with a specific capacity. You can then submit jobs to a specific
queue at runtime, or you can configure it as direct submission.
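
As a rough illustration of submitting to a specific queue at runtime, the
sketch below builds a "hadoop jar" command line that sets the job's queue
property. The jar name, main class, queue name, and paths are hypothetical.
The property names are mapred.job.queue.name for MR1 and
mapreduce.job.queuename for MR2, and the -D option is only honored if the
job's main class parses generic options (for example through ToolRunner).

```python
import shlex


def queue_submit_command(jar, main_class, queue, args=(), mr2=False):
    """Build a 'hadoop jar' command line that targets a scheduler queue."""
    # MR1 and MR2 use different property names for the target queue.
    prop = "mapreduce.job.queuename" if mr2 else "mapred.job.queue.name"
    cmd = ["hadoop", "jar", jar, main_class, f"-D{prop}={queue}", *args]
    # Quote each part so the result can be pasted into a shell.
    return " ".join(shlex.quote(part) for part in cmd)


if __name__ == "__main__":
    # Hypothetical jar, class, queue, and paths, for illustration only.
    print(queue_submit_command(
        "analytics.jar", "com.example.WordCount", "research", ["/in", "/out"]))
```

The queue itself still has to exist in the scheduler configuration; this only
shows how a job can be pointed at it when it is launched.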


On Wed, May 22, 2013 at 3:27 AM, Sandy Ryza sandy.r...@cloudera.com wrote:

 Hi Mehmet,

 Are you using MR1 or MR2?

 The fair scheduler is present in both versions, though it is configured
 slightly differently in each. It lets you limit the number of map and reduce
 tasks in a queue. The configuration can be updated at runtime by modifying the
 scheduler's allocations file. It can also automatically map jobs to queues
 based on the user who submitted them.

 Here are links to documentation in MR1 and MR2:
 http://hadoop.apache.org/docs/stable/fair_scheduler.html

 http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html

 -Sandy



 On Tue, May 21, 2013 at 2:43 PM, Mehmet Belgin 
 mehmet.bel...@oit.gatech.edu wrote:

 Hi Everyone,

 I was wondering if there is a way to limit the number of tasks
 (map+reduce) *per user* at runtime? Using an environment variable perhaps?
 I am asking this from a resource provisioning perspective. I am trying to
 come up with an N-token licensing system for multiple users to use our
 limited Hadoop resources simultaneously. That is, when user A checks out 6
 tokens, he/she can only run 6 Hadoop tasks.

 If there is no such thing in hadoop, has anyone tried to integrate hadoop
 with torque/moab (or any other RM or scheduler)? Any advice in that
 direction will be appreciated :)

 Thanks in advance,
 -Mehmet

Re: Is there a way to limit # of hadoop tasks per user at runtime?

2013-05-23 Thread Harsh J
The only pain point I'd find with CS in a multi-user environment is its
reliance on queue configs. It's non-trivial to configure a queue per user,
as CS doesn't provide any user-level settings (it wasn't designed for that
initially). In FS you get user-level limit settings for free, and you can
also specify pools (for users, or more generally for a property, such as
queues).


-- 
Harsh J


Re: Is there a way to limit # of hadoop tasks per user at runtime?

2013-05-21 Thread Sandy Ryza
Hi Mehmet,

Are you using MR1 or MR2?

The fair scheduler is present in both versions, though it is configured
slightly differently in each. It lets you limit the number of map and reduce
tasks in a queue. The configuration can be updated at runtime by modifying the
scheduler's allocations file. It can also automatically map jobs to queues
based on the user who submitted them.

Here are links to documentation in MR1 and MR2:
http://hadoop.apache.org/docs/stable/fair_scheduler.html
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html

-Sandy
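
To make the allocations-file approach concrete, here is a minimal sketch that
turns per-user token counts into an MR1-style allocations file with one pool
per user. It assumes the MR1 fair scheduler's <pool> element with <maxMaps>
and <maxReduces> children, and that pools are named after the submitting user
(the default pool name property); check both against the documentation for
your Hadoop version. The user names, the token counts, and the way tokens are
split between maps and reduces are hypothetical choices, needed because the
scheduler caps maps and reduces separately rather than as one total.

```python
import xml.etree.ElementTree as ET


def build_allocations(tokens_by_user, reduce_fraction=0.25):
    """Return allocations-file XML capping each user's pool at its tokens."""
    root = ET.Element("allocations")
    for user, tokens in sorted(tokens_by_user.items()):
        if tokens < 2:
            # With fewer than 2 tokens, maps or reduces would be capped at 0.
            raise ValueError(f"{user}: need at least 2 tokens")
        # Assumed split: give a fraction of the tokens to reduces (at least
        # one), and the remainder to maps, so the two caps sum to the tokens.
        max_reduces = max(1, int(tokens * reduce_fraction))
        max_maps = tokens - max_reduces
        pool = ET.SubElement(root, "pool", name=user)
        ET.SubElement(pool, "maxMaps").text = str(max_maps)
        ET.SubElement(pool, "maxReduces").text = str(max_reduces)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical token checkouts: userA holds 6 tokens, userB holds 10.
    print(build_allocations({"userA": 6, "userB": 10}))
```

Writing the result over the scheduler's allocations file whenever a user
checks tokens in or out would change the limits at runtime, as described
above. The MR2 fair scheduler uses a different allocation format, so this
sketch would need adapting there.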


