Re: Understanding fair schedulers

2012-01-25 Thread Srinivas Surasani
Praveenesh,

You can try setting "mapred.fairscheduler.pool" to your pool name when
running the job. By default, mapred.fairscheduler.poolnameproperty is set to
user.name (each job run by a user is allocated to that user's named pool), and
you can also change this property to group.name.
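
As a rough sketch of where these settings live: the scheduler properties normally go in mapred-site.xml on the JobTracker. The property names below are the ones I know from the Hadoop 0.20.x/1.x fair scheduler documentation; the allocation file path is a made-up example, so adjust it to your installation.

```xml
<!-- mapred-site.xml on the JobTracker (sketch; the file path is hypothetical) -->
<configuration>
  <property>
    <!-- switch the JobTracker to the fair scheduler -->
    <name>mapred.jobtracker.taskScheduler</name>
    <value>org.apache.hadoop.mapred.FairScheduler</value>
  </property>
  <property>
    <!-- where the pool definitions (allocation file) are read from -->
    <name>mapred.fairscheduler.allocation.file</name>
    <value>/etc/hadoop/conf/fair-scheduler.xml</value>
  </property>
  <property>
    <!-- jobconf property whose value names the job's pool; user.name is the default -->
    <name>mapred.fairscheduler.poolnameproperty</name>
    <value>user.name</value>
  </property>
</configuration>
```

A single job can still choose its pool explicitly by setting mapred.fairscheduler.pool in its own configuration; as far as I know, that setting takes precedence over the poolnameproperty lookup.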

Srinivas --

Also, you can set

On Wed, Jan 25, 2012 at 6:24 AM, praveenesh kumar wrote:

> Understanding Fair Schedulers better.
>
> Can we create multiple pools in the Fair Scheduler? I guess yes. Please
> correct me.
>
> Suppose I have 2 pools in my fair-scheduler.xml
>
> 1. Hadoop-users : Min map : 10, Max map : 50, Min Reduce : 10, Max Reduce :
> 50
> 2. Admin-users: Min map : 20, Max map : 80, Min Reduce : 20, Max Reduce :
> 80
>
> I have 5 users who will be using these pools. How do I allocate specific
> pools to specific users?
>
> Suppose I want user1 and user2 to use the "Hadoop-users" pool and user3,
> user4 and user5 to use "Admin-users".
>
> In http://hadoop.apache.org/common/docs/r0.20.205.0/fair_scheduler.html
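> they have mentioned allocations something like this.
>
> <?xml version="1.0"?>
> <allocations>
>   <pool name="sample_pool">
>     <minMaps>5</minMaps>
>     <minReduces>5</minReduces>
>     <maxMaps>25</maxMaps>
>     <maxReduces>25</maxReduces>
>     <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
>   </pool>
>   <user name="sample_user">
>     <maxRunningJobs>6</maxRunningJobs>
>   </user>
>   <userMaxJobsDefault>3</userMaxJobsDefault>
>   <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
> </allocations>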
>
> I tried creating more pools and that works, but how do I allocate users to
> specific pools?
>
> Thanks,
> Praveenesh
>


Re: Understanding fair schedulers

2012-01-25 Thread praveenesh kumar
I am running Pig jobs; how can I specify which pool they should run in?
Also, do you mean that pool allocation is done per job, not per user?




Re: Understanding fair schedulers

2012-01-25 Thread Harsh J
Set the property in Pig with the 'set' command or other ways:
http://pig.apache.org/docs/r0.9.1/cmds.html#set or
http://pig.apache.org/docs/r0.9.1/start.html#properties

As Srinivas covered earlier, pool allocation can be done per-user if
you set the scheduler poolnameproperty to "user.name". Per group if
you set the property to "group.name".

Then you can provide per-poolname config overrides via the "pool"
element config described in
http://hadoop.apache.org/common/docs/current/fair_scheduler.html#Allocation+File+%28fair-scheduler.xml%29




-- 
Harsh J
Customer Ops. Engineer, Cloudera


Re: Understanding fair schedulers

2012-01-25 Thread praveenesh kumar
I am looking for a solution where we can do this permanently, without
specifying these things inside jobs.
I want to keep these things hidden from the end user.
The end user would just write Pig scripts, and all the jobs submitted by a
particular user would be submitted to their respective pool automatically.

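What I am doing right now is something like this:

<allocations>
  <pool name="...">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <user name="...">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>

  <pool name="...">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>192</maxMaps>
    <maxReduces>96</maxReduces>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
  </pool>
  <user name="...">
    <maxRunningJobs>6</maxRunningJobs>
  </user>
  <userMaxJobsDefault>3</userMaxJobsDefault>
  <fairSharePreemptionTimeout>600</fairSharePreemptionTimeout>
</allocations>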


By doing this, I am able to see different pools per user, without
mentioning anything inside the jobs.
Jobs go to the respective pools automatically.

But what I wanted to know is: is this the right way to do it?

Thanks,
Praveenesh




Re: Understanding fair schedulers

2012-01-25 Thread praveenesh kumar
Also, with the above-mentioned method, my problem is that I end up with one
pool per user (that's obviously not a good way of configuring schedulers).
How can I allocate multiple users to one pool in the XML properties, so
that I don't have to pass any options inside my code?

Thanks,
Praveenesh



Re: Understanding fair schedulers

2012-01-25 Thread Harsh J
A solution would be to place your users into groups, and use the
group.name identifier as the poolnameproperty. Would this work for
you instead?
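
A minimal sketch of that change, assuming the scheduler is configured through mapred-site.xml on the JobTracker (the JobTracker probably needs a restart to pick it up):

```xml
<!-- mapred-site.xml fragment: derive the pool name from the submitter's group -->
<property>
  <name>mapred.fairscheduler.poolnameproperty</name>
  <value>group.name</value>
</property>
```

With this in place, jobs from users who share a group should land in the same pool without anything being set in the Pig scripts. As far as I know, group.name reflects the submitting user's first (primary) group, so check which group that is for each user.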




-- 
Harsh J
Customer Ops. Engineer, Cloudera


Re: Understanding fair schedulers

2012-01-25 Thread praveenesh kumar
Then in that case, will I be using a group name tag in the allocations file,
like this, inside each pool?

<group name="ABC">
  <maxRunningJobs>6</maxRunningJobs>
</group>

Thanks,
Praveenesh



Re: Understanding fair schedulers

2012-01-25 Thread Harsh J
Not exactly. See, the poolnameproperty being group.name will map the
group name to a pool name. So you only need to use <pool name="ABC">
for configuring a group "ABC". Does that make sense?
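
To make this concrete for the two pools from the original question, an allocation file could look roughly like the sketch below. It assumes poolnameproperty is set to group.name and that Unix groups named exactly "Hadoop-users" and "Admin-users" exist (both are assumptions; use your real group names). The limits are the ones quoted in the question.

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml (sketch). Each pool name must match a group name exactly;
     the two group names here are hypothetical. -->
<allocations>
  <pool name="Hadoop-users">
    <minMaps>10</minMaps>
    <minReduces>10</minReduces>
    <maxMaps>50</maxMaps>
    <maxReduces>50</maxReduces>
  </pool>
  <pool name="Admin-users">
    <minMaps>20</minMaps>
    <minReduces>20</minReduces>
    <maxMaps>80</maxMaps>
    <maxReduces>80</maxReduces>
  </pool>
</allocations>
```

Putting user1 and user2 in the first group and user3, user4 and user5 in the second would then route their jobs to the matching pools, with no per-user or per-group elements needed beyond the pool definitions.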


Re: Understanding fair schedulers

2012-01-25 Thread praveenesh kumar
Okay, got it: the pool name is the same as the group name.
