I also had this issue, and it was resolved by changing settings in
yarn-site.xml and capacity-scheduler.xml. The amount of memory (and the number
of virtual CPUs) allocated to your jobs is controlled by settings in
yarn-site.xml. And I suspect your jobs are getting stuck in the ACCEPTED state
instead of moving to RUNNING because the default value (0.1) of
yarn.scheduler.capacity.maximum-am-resource-percent in capacity-scheduler.xml
is too low. For example, here are the values I'm using in my staging
cluster (just two m3.medium EC2 instances), where I typically request 256 MB
per container.
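To make the effect of that setting concrete, here is a rough back-of-the-envelope
calculation (the 1024 MB ApplicationMaster size is just an assumption for
illustration; the per-node memory matches the yarn-site.xml below):

  total NodeManager memory   = 2 nodes x 3072 MB = 6144 MB
  AM budget at 0.1 (default) = 0.1 x 6144 MB ≈  614 MB  (barely room for one 1024 MB AM)
  AM budget at 0.5           = 0.5 x 6144 MB = 3072 MB  (room for about three 1024 MB AMs)

With the default 0.1, any job beyond the first has no AM budget left and sits in
ACCEPTED; raising it to 0.5 lets several ApplicationMasters run concurrently.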
-----------------------------------------------------------------------------
yarn-site.xml
-----------------------------------------------------------------------------
<?xml version="1.0"?>
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>128</value>
<description>Minimum limit of memory to allocate to each container request
at the Resource Manager.</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>512</value>
<description>Maximum limit of memory to allocate to each container request
at the Resource Manager.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
<description>The minimum allocation for every container request at the RM,
in terms of virtual CPU cores. Requests lower than this won't take effect, and
will be allocated the minimum.</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>2</value>
<description>The maximum allocation for every container request at the RM,
in terms of virtual CPU cores. Requests higher than this won't take effect, and
will get capped to this value.</description>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>3072</value>
<description>Physical memory, in MB, to be made available to running
containers</description>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>8</value>
<description>Number of CPU cores that can be allocated for
containers.</description>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.1</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>******.amazonaws.com</value>
</property>
<property>
<name>yarn.nodemanager.log.retain-seconds</name>
<value>86400</value>
</property>
</configuration>
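One related point, in case the numbers on the YARN web UI look surprising: the
scheduler normalizes each container request up to a multiple of
yarn.scheduler.minimum-allocation-mb (standard YARN behaviour, not something
specific to this config), so with the stock 1024 MB minimum even a small
request can show up as a full 1 GB container. For example:

  request 200 MB, minimum-allocation-mb = 128   ->  granted 256 MB   (rounded up to 2 x 128)
  request 128 MB, minimum-allocation-mb = 1024  ->  granted 1024 MB  (rounded up to the minimum)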
-----------------------------------------------------------------------------
capacity-scheduler.xml
-----------------------------------------------------------------------------
<configuration>
<property>
<name>yarn.scheduler.capacity.maximum-applications</name>
<value>10000</value>
<description>
Maximum number of applications that can be pending and running.
</description>
</property>
<!-- Changed by MM from 0.1 (default) to 0.5, as our Samza jobs typically have
just one AppMaster and one job container. -->
<property>
<name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
<value>0.5</value>
<description>
Maximum percent of resources in the cluster which can be used to run
application masters i.e. controls number of concurrent running
applications.
</description>
</property>
<property>
<name>yarn.scheduler.capacity.resource-calculator</name>
<value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
<description>
The ResourceCalculator implementation to be used to compare
Resources in the scheduler.
The default i.e. DefaultResourceCalculator only uses Memory while
DominantResourceCalculator uses dominant-resource to compare
multi-dimensional resources such as Memory, CPU etc.
</description>
</property>
<property>
<name>yarn.scheduler.capacity.root.queues</name>
<value>default</value>
<description>
The queues at this level (root is the root queue).
</description>
</property>
<property>
<name>yarn.scheduler.capacity.root.default.capacity</name>
<value>100</value>
<description>Default queue target capacity.</description>
</property>
<property>
<name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
<value>1</value>
<description>
Default queue user limit a percentage from 0.0 to 1.0.
</description>
</property>
<property>
<name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
<value>100</value>
<description>
The maximum capacity of the default queue.
</description>
</property>
<property>
<name>yarn.scheduler.capacity.root.default.state</name>
<value>RUNNING</value>
<description>
The state of the default queue. State can be one of RUNNING or STOPPED.
</description>
</property>
<property>
<name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
<value>*</value>
<description>
The ACL of who can submit jobs to the default queue.
</description>
</property>
<property>
<name>yarn.scheduler.capacity.root.default.acl_administer_queue</name>
<value>*</value>
<description>
The ACL of who can administer jobs on the default queue.
</description>
</property>
<property>
<name>yarn.scheduler.capacity.node-locality-delay</name>
<value>40</value>
<description>
Number of missed scheduling opportunities after which the CapacityScheduler
attempts to schedule rack-local containers. Typically this should be set to
the number of nodes in the cluster. By default it is set to 40, approximately
the number of nodes in one rack.
</description>
</property>
</configuration>
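For completeness, the per-job side of this is the Samza job properties file
that actually requests the memory. A minimal sketch (the file name is made up;
the two property names are the Samza 0.9 settings mentioned further down this
thread, and the values just need to fall inside the 128-512 MB allocation range
configured in yarn-site.xml above):
-----------------------------------------------------------------------------
samza-job.properties (excerpt, hypothetical)
-----------------------------------------------------------------------------
# Memory requested for each Samza task container; must be within
# yarn.scheduler.minimum/maximum-allocation-mb (128-512 MB here).
yarn.container.memory.mb=256

# Memory requested for the job's ApplicationMaster container.
yarn.am.container.memory.mb=256
-----------------------------------------------------------------------------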
On September 15, 2015 at 5:05:28 AM, Jordi Blasi Uribarri
([email protected]) wrote:
I have tried changing the configuration of all the jobs to this:
yarn.container.memory.mb=128
yarn.am.container.memory.mb=128
and on the startup I can see:
2015-09-15 12:40:18 ClientHelper [INFO] set memory request to 128 for
application_1442313590092_0002
On the Hadoop web interface I see that every job is still getting 2 GB each.
In fact, only two of the jobs are in the RUNNING state, while the rest are ACCEPTED.
Any ideas?
Thanks,
Jordi
-----Original Message-----
From: Yan Fang [mailto:[email protected]]
Sent: Friday, September 11, 2015 20:56
To: [email protected]
Subject: Re: memory limits
Hi Jordi,
I believe you can change the memory with *yarn.container.memory.mb*; the default
is 1024. And *yarn.am.container.memory.mb* is for the AM memory.
See
http://samza.apache.org/learn/documentation/0.9/jobs/configuration-table.html
Thanks,
Fang, Yan
[email protected]
On Fri, Sep 11, 2015 at 4:21 AM, Jordi Blasi Uribarri <[email protected]>
wrote:
> Hi,
>
> I am trying to implement an environment that requires multiple
> combined Samza jobs for different tasks. I see that there is a limit
> to the number of jobs that can run at the same time, as they each block
> 1 GB of RAM.
> I understand that this is a reasonable limit in a production
> environment (as long as we are speaking of Big Data, we need big
> amounts of resources ☺) but my lab does not have that much RAM. Is there
> a way to reduce this limit so I can test it properly? I am using Samza 0.9.
>
> Thanks in advance,
>
> Jordi
> ________________________________
> Jordi Blasi Uribarri
> R&D&i Area
>
> [email protected]
> Oficina Bilbao
>
>