Hi Malcolm,

I think this is because in YARN 2.6 the FifoScheduler only accounts for
memory when computing 'maximumAllocation':
https://github.com/apache/hadoop/blob/branch-2.6.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218

This was changed starting with 2.7.0:
https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218
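
Paraphrasing the relevant lines (a rough sketch of the difference, not a
verbatim copy of the Hadoop source; the class and method names here are just
for illustration), the gist is:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.conf.YarnConfiguration;
import org.apache.hadoop.yarn.util.resource.Resources;

class MaxAllocationSketch {
  // 2.6 behavior (sketch): only memory is read from configuration, so the
  // resulting Resource's vcores fall back to 1 -- which matches the
  // "max-cpu: 1" your AM is reporting.
  static Resource maxAllocation26(Configuration conf) {
    return Resources.createResource(
        conf.getInt(YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB,
            YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_MB));
  }

  // 2.7 behavior (sketch): vcores are read from
  // yarn.scheduler.maximum-allocation-vcores as well.
  static Resource maxAllocation27(Configuration conf) {
    return Resources.createResource(
        conf.getInt(YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_MB,
            YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_MB),
        conf.getInt(YarnConfiguration.RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES,
            YarnConfiguration.DEFAULT_RM_SCHEDULER_MAXIMUM_ALLOCATION_VCORES));
  }
}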

So upgrading will likely fix this issue. For reference, at LinkedIn we are
running YARN 2.7.2 with the CapacityScheduler
<https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html>
and DominantResourceCalculator to account for vcore allocations in
scheduling.
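
If you go that route, the switch is roughly the following (a sketch assuming
the standard yarn-site.xml / capacity-scheduler.xml layout; adjust for your
deployment):

<!-- yarn-site.xml: use the CapacityScheduler instead of FIFO -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

<!-- capacity-scheduler.xml: have the scheduler account for vcores, not just memory -->
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>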

- Prateek

On Mon, Apr 1, 2019 at 3:00 PM Malcolm McFarland <mmcfarl...@cavulus.com>
wrote:

> Hi Prateek,
>
> This still seems to manifest the same problem. Since this appears to be
> something in the Hadoop codebase, I've emailed the hadoop-dev mailing list
> about it.
>
> Cheers,
> Malcolm
>
> On Mon, Apr 1, 2019 at 1:51 PM Prateek Maheshwari <prateek...@gmail.com>
> wrote:
>
> > Hi Malcolm,
> >
> > Yes, the AM is just reporting what the RM specified as the maximum allowed
> > request size.
> >
> > I think 'yarn.scheduler.maximum-allocation-vcores' needs to be no greater
> > than 'yarn.nodemanager.resource.cpu-vcores', since a container must fit on
> > a single NM. Maybe the RM detected this and decided to default to 1? Can
> > you try setting maximum-allocation-vcores lower?
> >
> > - Prateek
> >
> > On Mon, Apr 1, 2019 at 11:59 AM Malcolm McFarland <mmcfarl...@cavulus.com>
> > wrote:
> >
> > > One other detail: I'm running YARN on ECS in AWS. Has anybody seen
> > > issues with core allocation in this environment? I'm seeing this in
> > > the samza log:
> > >
> > > "Got AM register response. The YARN RM supports container requests
> > > with max-mem: 14336, max-cpu: 1"
> > >
> > > How does Samza determine this? Looking at the Samza source on GitHub,
> > > it appears to be information that's passed back to the AM when it
> > > starts up.
> > >
> > > Cheers,
> > > Malcolm
> > >
> > > On Mon, Apr 1, 2019 at 10:44 AM Malcolm McFarland
> > > <mmcfarl...@cavulus.com> wrote:
> > > >
> > > > Hi Prateek,
> > > >
> > > > Sorry, meant to include these versions with my email; I'm running
> > > > Samza 0.14 and Hadoop 2.6.1. I'm running three containers across 3
> > > > node managers, each with 16GB and 8 vcores. The other two containers
> > > > are requesting 1 vcore each; even with the AMs running, that should be
> > > > 4 for them in total, leaving plenty of processing power available.
> > > >
> > > > The error is in the application attempt diagnostics field: "The YARN
> > > > cluster is unable to run your job due to unsatisfiable resource
> > > > requirements. You asked for mem: 2048, and cpu: 2." I do not see this
> > > > error with the same memory request, but a cpu count request of 1.
> > > >
> > > > Here are the configuration options pertaining to resource allocation:
> > > >
> > > > <?xml version="1.0"?>
> > > > <configuration>
> > > >   <property>
> > > >     <name>yarn.resourcemanager.scheduler.class</name>
> > > >     <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value>
> > > >   </property>
> > > >   <property>
> > > >     <name>yarn.nodemanager.vmem-check-enabled</name>
> > > >     <value>false</value>
> > > >   </property>
> > > >   <property>
> > > >     <name>yarn.nodemanager.vmem-pmem-ratio</name>
> > > >     <value>2.1</value>
> > > >   </property>
> > > >   <property>
> > > >     <name>yarn.nodemanager.resource.memory-mb</name>
> > > >     <value>14336</value>
> > > >   </property>
> > > >   <property>
> > > >     <name>yarn.scheduler.minimum-allocation-mb</name>
> > > >     <value>256</value>
> > > >   </property>
> > > >   <property>
> > > >     <name>yarn.scheduler.maximum-allocation-mb</name>
> > > >     <value>14336</value>
> > > >   </property>
> > > >   <property>
> > > >     <name>yarn.scheduler.minimum-allocation-vcores</name>
> > > >     <value>1</value>
> > > >   </property>
> > > >   <property>
> > > >     <name>yarn.scheduler.maximum-allocation-vcores</name>
> > > >     <value>16</value>
> > > >   </property>
> > > >   <property>
> > > >     <name>yarn.nodemanager.resource.cpu-vcores</name>
> > > >     <value>8</value>
> > > >   </property>
> > > >   <property>
> > > >     <name>yarn.resourcemanager.cluster-id</name>
> > > >     <value>processor-cluster</value>
> > > >   </property>
> > > > </configuration>
> > > >
> > > > Cheers,
> > > > Malcolm
> > > >
> > > > On Mon, Apr 1, 2019 at 10:25 AM Prateek Maheshwari <prateek...@gmail.com>
> > > > wrote:
> > > > >
> > > > > Hi Malcolm,
> > > > >
> > > > > Just setting that configuration should be sufficient. We haven't seen
> > > > > this issue before. What Samza/YARN versions are you using? Can you also
> > > > > include the logs from where you get the error and your yarn configuration?
> > > > >
> > > > > - Prateek
> > > > >
> > > > > On Mon, Apr 1, 2019 at 2:33 AM Malcolm McFarland <mmcfarl...@cavulus.com>
> > > > > wrote:
> > > > >
> > > > > > Hey Folks,
> > > > > >
> > > > > > I'm having some issues getting multiple cores for containers in YARN.
> > > > > > I seem to have my YARN settings correct, and the RM interface says
> > > > > > that I have 24 vcores available. However, when I set the
> > > > > > cluster-manager.container.cpu.cores Samza setting to anything other
> > > > > > than 1, I get a message about how the container is requesting more
> > > > > > resources than it can allocate. With 1 core, everything is fine. Is
> > > > > > there another Samza option I need to set?
> > > > > >
> > > > > > Cheers,
> > > > > > Malcolm
> > > > > >
> > > > > >
> > > > > > --
> > > > > > Malcolm McFarland
> > > > > > Cavulus
> > > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > Malcolm McFarland
> > > > Cavulus
> > > > 1-800-760-6915
> > > > mmcfarl...@cavulus.com
> > > >
> > >
> > >
> > >
> > > --
> > > Malcolm McFarland
> > > Cavulus
> > > 1-800-760-6915
> > > mmcfarl...@cavulus.com
> > >
> > >
> >
>
>
> --
> Malcolm McFarland
> Cavulus
> 1-800-760-6915
> mmcfarl...@cavulus.com
>
>
