Configuring capacity scheduler?

2015-09-17 Thread Chetna C
Hi All,
   Is there any way to configure Capacity Scheduler capacities in terms
of absolute resources, i.e. x MB, y cores, z disks? I can see that this
feature is available for the Fair Scheduler.
   If anyone has configured the scheduler like this, please guide me.

Thanks,
Chetna Chaudhari


Re: Configuring capacity scheduler?

2015-09-17 Thread YIMEN GAEL
Hello Chetna,

Please kindly read the official documentation:

http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html

Regards


Fwd: Concurrency control

2015-09-17 Thread Laxman Ch
Hi,

In YARN, do we have any way to control the amount of resources (vcores,
memory) used by an application SIMULTANEOUSLY?

- In my cluster, I noticed that a large, long-running MR app occupied
all the slots of the queue, blocking other apps from getting started.
- I'm using the Capacity Scheduler (with hierarchical queues and
preemption disabled).
- I'm using Hadoop version 2.6.0.
- I did some googling around this and went through the configuration
docs, but I'm not able to find anything that matches my requirement.

If needed, I can provide more details on the use case and problem.

-- 
Thanks,
Laxman


Re: Configuring capacity scheduler?

2015-09-17 Thread Chetna C
Hi Yimen,
  I read the documentation, but I couldn't find a way to configure
queues in terms of resources. The documentation states that the only way
to configure them is in terms of percentages. I would like something like
queue1 -> resource allocation -> 81920 MB, 20 vcores
instead of
queue1 -> resource allocation -> 25%.



Thanks,
Chetna Chaudhari



Re: Concurrency control

2015-09-17 Thread Naganarasimha Garla
Hi Laxman,
Yes, if cgroups are enabled and
"yarn.scheduler.capacity.resource-calculator" is configured to
DominantResourceCalculator, then both CPU and memory can be controlled.
Please kindly refer further to the official documentation:
http://hadoop.apache.org/docs/r1.2.1/capacity_scheduler.html
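
For illustration, a minimal sketch of the relevant settings (property
names as shipped with Hadoop 2.6; treat this as an outline to adapt, not
a verified configuration):

  <!-- capacity-scheduler.xml: consider both memory and vcores when
       allocating, instead of memory only -->
  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
  </property>

  <!-- yarn-site.xml: run containers under LinuxContainerExecutor with
       the cgroups resources handler so CPU limits are enforced -->
  <property>
    <name>yarn.nodemanager.container-executor.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
  </property>
  <property>
    <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
  </property>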

But if you can say more about the problem, then we can suggest an ideal
configuration. It seems the capacity configuration and the splitting of
the queues is not done right, or you might look at the Fair Scheduler if
you want more fairness in container allocation across different apps.



Re: Concurrency control

2015-09-17 Thread Laxman Ch
Yes, I'm already using cgroups. Cgroups help in controlling resources at
the container level. But my requirement is more about controlling the
concurrent resource usage of an application at the whole-cluster level.

And yes, we do configure queues properly. But that won't help.

For example, I have an application with a requirement of 1000 vcores.
But I want to restrict this application from going beyond 100 vcores at
any point of time in the cluster/queue. This makes the application run
longer even when my cluster is free, but I will be able to meet the
guaranteed SLAs of the other applications.

I hope this helps to clarify my question.

And thanks, Narasimha, for the quick response.



-- 
Thanks,
Laxman


Re: Configuring capacity scheduler?

2015-09-17 Thread Laxman Ch
Hi Chetna,

All Capacity Scheduler queue configurations are in terms of percentages
only, not absolute values (as you asked). This is done to auto-scale the
queues when new nodes are added to the cluster. The Capacity Scheduler
enforces the following:

- The sum of all allocations at any given level of the queue tree must
be 100.
- Applications can be scheduled only on leaf queues.
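
For example, a minimal capacity-scheduler.xml sketch of the 25% case you
mentioned (the queue names are illustrative):

  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>queue1,queue2</value>
  </property>
  <!-- the capacities of root's children must sum to 100 -->
  <property>
    <name>yarn.scheduler.capacity.root.queue1.capacity</name>
    <value>25</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.queue2.capacity</name>
    <value>75</value>
  </property>

On a cluster totalling 327680 MB and 80 vcores, queue1's 25% works out
to exactly your 81920 MB and 20 vcores, but those absolute numbers grow
automatically as nodes are added.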




-- 
Thanks,
Laxman


Re: Concurrency control

2015-09-17 Thread Laxman Ch
No, Naga. That won't help.

I am running two applications (app1: 100 vcores, app2: 100 vcores) as
the same user in the same queue (capacity = 100 vcores). In this
scenario, if app1 starts first, occupies all the slots, and runs long,
then app2 will starve for a long time.

Let me reiterate my problem statement. I want "to control the amount of
resources (vcores, memory) used by an application SIMULTANEOUSLY".



-- 
Thanks,
Laxman


Re: Concurrency control

2015-09-17 Thread Naganarasimha Garla
Hi Laxman,
For the example you have stated, maybe we can do the following things:
1. Create/modify the queue with capacity and max capacity set such that
it is equivalent to 100 vcores. Since there is then no elasticity, the
given application will not use resources beyond the configured capacity.
2. Set yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent
so that each active user is assured the minimum guaranteed resources.
The default value of 100 implies no user limits are imposed.

Additionally, we can think of
"yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage",
which will enforce strict CPU usage for a given container if required.
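
A rough sketch of those settings (the queue name and the numbers are
illustrative; property names as in Hadoop 2.6):

  <!-- capacity-scheduler.xml: pin maximum-capacity to capacity so the
       queue cannot elastically grow beyond its guaranteed share -->
  <property>
    <name>yarn.scheduler.capacity.root.queue1.capacity</name>
    <value>10</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.queue1.maximum-capacity</name>
    <value>10</value>
  </property>
  <!-- guarantee each active user at least 50% of the queue -->
  <property>
    <name>yarn.scheduler.capacity.root.queue1.minimum-user-limit-percent</name>
    <value>50</value>
  </property>

  <!-- yarn-site.xml: hard-cap each container's CPU at its vcore share -->
  <property>
    <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
    <value>true</value>
  </property>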

+ Naga




hadoop cluster with mixed servers(different memory, speed, etc)

2015-09-17 Thread Demai Ni
Hi folks,

I am wondering how a Hadoop cluster handles commodity hardware with
different speeds and capacities.

This situation is happening and will probably become very common soon: a
cluster starts with 100 machines, and a couple of years later another
100 machines are added. With Moore's law as an indicator, the new and
old machines are at least one generation apart. The situation gets even
more complex if the 'new' 100 join the cluster gradually. How does
Hadoop handle this situation and avoid the weakest-link problem?

thanks

Demai


Re: Configuring capacity scheduler?

2015-09-17 Thread Chetna C
Hi Laxman,
 Thanks for the reply. That means in my case, where I want to configure
queues in terms of absolute resources, I should go with the Fair
Scheduler.
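
For reference, a minimal sketch of the Fair Scheduler side (the queue
name and the minResources value are illustrative; maxResources matches
the example above):

  <!-- yarn-site.xml: switch the ResourceManager to the Fair Scheduler -->
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>

  <!-- fair-scheduler.xml (allocation file): absolute caps per queue -->
  <allocations>
    <queue name="queue1">
      <minResources>40960 mb,10 vcores</minResources>
      <maxResources>81920 mb,20 vcores</maxResources>
    </queue>
  </allocations>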

Thanks,
Chetna Chaudhari
