Hey Prateek,

The upgrade to Hadoop 2.7.6 went fine; everything seems to be working, and
access to S3 via an access key/secret pair is working as well. However, my
containers are still only being allocated 1 vcore, despite my requesting
more than that. Once again, I have a 3-node cluster that should have 24
vcores available; on the YARN side, I have these options set:

yarn.nodemanager.resource.cpu-vcores=8
yarn.scheduler.minimum-allocation-vcores=1
yarn.scheduler.maximum-allocation-vcores=4
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator

And on the Samza side, I'm setting:

cluster-manager.container.cpu.cores=2

However, YARN is still reporting that the running container has only 1
vcore assigned. Do you have any other suggestions for options to tweak?
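
In case I've put any of these in the wrong file, my understanding (which
may be off) is that the first three belong in yarn-site.xml and the
resource-calculator in capacity-scheduler.xml; roughly:

<!-- yarn-site.xml (sketch) -->
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>8</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>1</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>4</value>
</property>

<!-- capacity-scheduler.xml (sketch) -->
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>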

Cheers,
Malcolm


On Mon, Apr 1, 2019 at 5:28 PM Malcolm McFarland <mmcfarl...@cavulus.com>
wrote:

> One more thing -- fwiw, I actually also came across the possibility that I
> would need to use the DominantResourceCalculator, but as you point out,
> this doesn't seem to be available in Hadoop 2.6.
>
>
> On Mon, Apr 1, 2019 at 5:27 PM Malcolm McFarland <mmcfarl...@cavulus.com>
> wrote:
>
>> That's quite helpful! I actually initially tried using a version of
>> Hadoop > 2.6.x; when I did, it seemed like the AWS credentials in YARN
>> (fs.s3a.access.key, fs.s3a.secret.key) weren't being accessed, as I
>> received lots of "No AWS Credentials
>> provided by DefaultAWSCredentialsProviderChain" messages. I found a
>> way around this by providing the credentials to the AM directly via
>> yarn.am.opts=-Daws.accessKeyId=<key> -Daws.secretKey=<secret>, but
>> since this seemed very workaround-ish, I just assumed that I would
>> eventually hit other problems using a version of Hadoop not pinned in
>> the Samza repo. If you're running 2.7.x at LinkedIn, however, I'll
>> give it a shot again.
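>>
>> For reference, the shape of what I was setting looked roughly like this
>> (a sketch, with the actual key values elided):
>>
>> <!-- s3a credential properties (sketch) -->
>> <property>
>>   <name>fs.s3a.access.key</name>
>>   <value>...</value>
>> </property>
>> <property>
>>   <name>fs.s3a.secret.key</name>
>>   <value>...</value>
>> </property>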
>>
>> Have you done any AWS credential integration, and if so, did you need
>> to do anything special to get it to work?
>>
>> Cheers,
>> Malcolm
>>
>>
>>
>> On Mon, Apr 1, 2019 at 5:20 PM Prateek Maheshwari <prateek...@gmail.com>
>> wrote:
>> >
>> > Hi Malcolm,
>> >
>> > I think this is because in YARN 2.6 the FifoScheduler only accounts for
>> > memory for 'maximumAllocation':
>> >
>> > https://github.com/apache/hadoop/blob/branch-2.6.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218
>> >
>> > This was changed as early as 2.7.0:
>> >
>> > https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218
>> >
>> > So upgrading will likely fix this issue. For reference, at LinkedIn we
>> > are running YARN 2.7.2 with the CapacityScheduler
>> > (https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html)
>> > and DominantResourceCalculator to account for vcore allocations in
>> > scheduling.
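>> >
>> > If it helps, enabling that combination looks roughly like this (a
>> > sketch, not our exact configs):
>> >
>> > <!-- yarn-site.xml -->
>> > <property>
>> >   <name>yarn.resourcemanager.scheduler.class</name>
>> >   <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
>> > </property>
>> >
>> > <!-- capacity-scheduler.xml -->
>> > <property>
>> >   <name>yarn.scheduler.capacity.resource-calculator</name>
>> >   <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
>> > </property>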
>> >
>> > - Prateek
>> >
>> > On Mon, Apr 1, 2019 at 3:00 PM Malcolm McFarland <
>> mmcfarl...@cavulus.com>
>> > wrote:
>> >
>> > > Hi Prateek,
>> > >
>> > > This still seems to be manifesting the same problem. Since this seems
>> > > to be something in the Hadoop codebase, I've emailed the hadoop-dev
>> > > mailing list about it.
>> > >
>> > > Cheers,
>> > > Malcolm
>> > >
>> > > On Mon, Apr 1, 2019 at 1:51 PM Prateek Maheshwari <
>> prateek...@gmail.com>
>> > > wrote:
>> > >
>> > > > Hi Malcolm,
>> > > >
>> > > > Yes, the AM is just reporting what the RM specified as the maximum
>> > > > allowed request size.
>> > > >
>> > > > I think 'yarn.scheduler.maximum-allocation-vcores' needs to be less
>> > > > than 'yarn.nodemanager.resource.cpu-vcores', since a container must
>> > > > fit on a single NM. Maybe the RM detected this and decided to default
>> > > > to 1? Can you try setting maximum-allocation-vcores lower?
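>> > > > For example, something like this (value just for illustration):
>> > > >
>> > > > <property>
>> > > >   <name>yarn.scheduler.maximum-allocation-vcores</name>
>> > > >   <value>4</value>
>> > > > </property>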
>> > > >
>> > > > - Prateek
>> > > >
>> > > > On Mon, Apr 1, 2019 at 11:59 AM Malcolm McFarland <
>> > > mmcfarl...@cavulus.com>
>> > > > wrote:
>> > > >
>> > > > > One other detail: I'm running YARN on ECS in AWS. Has anybody seen
>> > > > > issues with core allocation in this environment? I'm seeing this in
>> > > > > the Samza log:
>> > > > >
>> > > > > "Got AM register response. The YARN RM supports container requests
>> > > > > with max-mem: 14336, max-cpu: 1"
>> > > > >
>> > > > > How does Samza determine this? Looking at the Samza source on
>> > > > > GitHub, it appears to be information that's passed back to the AM
>> > > > > when it registers with the RM at startup.
>> > > > >
>> > > > > Cheers,
>> > > > > Malcolm
>> > > > >
>> > > > > On Mon, Apr 1, 2019 at 10:44 AM Malcolm McFarland
>> > > > > <mmcfarl...@cavulus.com> wrote:
>> > > > > >
>> > > > > > Hi Prateek,
>> > > > > >
>> > > > > > Sorry, meant to include these versions with my email; I'm running
>> > > > > > Samza 0.14 and Hadoop 2.6.1. I'm running three containers across
>> > > > > > 3 node managers, each with 16GB and 8 vcores. The other two
>> > > > > > containers are requesting 1 vcore each; even with their AMs
>> > > > > > running, that should be 4 vcores for them in total, leaving
>> > > > > > plenty of processing power available.
>> > > > > >
>> > > > > > The error is in the application attempt diagnostics field: "The
>> > > > > > YARN cluster is unable to run your job due to unsatisfiable
>> > > > > > resource requirements. You asked for mem: 2048, and cpu: 2." I do
>> > > > > > not see this error with the same memory request, but a cpu count
>> > > > > > request of 1.
>> > > > > >
>> > > > > > Here are the configuration options pertaining to resource
>> > > > > > allocation:
>> > > > > >
>> > > > > > <?xml version="1.0"?>
>> > > > > > <configuration>
>> > > > > >   <property>
>> > > > > >     <name>yarn.resourcemanager.scheduler.class</name>
>> > > > > >     <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value>
>> > > > > >   </property>
>> > > > > >   <property>
>> > > > > >     <name>yarn.nodemanager.vmem-check-enabled</name>
>> > > > > >     <value>false</value>
>> > > > > >   </property>
>> > > > > >   <property>
>> > > > > >     <name>yarn.nodemanager.vmem-pmem-ratio</name>
>> > > > > >     <value>2.1</value>
>> > > > > >   </property>
>> > > > > >   <property>
>> > > > > >     <name>yarn.nodemanager.resource.memory-mb</name>
>> > > > > >     <value>14336</value>
>> > > > > >   </property>
>> > > > > >   <property>
>> > > > > >     <name>yarn.scheduler.minimum-allocation-mb</name>
>> > > > > >     <value>256</value>
>> > > > > >   </property>
>> > > > > >   <property>
>> > > > > >     <name>yarn.scheduler.maximum-allocation-mb</name>
>> > > > > >     <value>14336</value>
>> > > > > >   </property>
>> > > > > >   <property>
>> > > > > >     <name>yarn.scheduler.minimum-allocation-vcores</name>
>> > > > > >     <value>1</value>
>> > > > > >   </property>
>> > > > > >   <property>
>> > > > > >     <name>yarn.scheduler.maximum-allocation-vcores</name>
>> > > > > >     <value>16</value>
>> > > > > >   </property>
>> > > > > >   <property>
>> > > > > >     <name>yarn.nodemanager.resource.cpu-vcores</name>
>> > > > > >     <value>8</value>
>> > > > > >   </property>
>> > > > > >   <property>
>> > > > > >     <name>yarn.resourcemanager.cluster-id</name>
>> > > > > >     <value>processor-cluster</value>
>> > > > > >   </property>
>> > > > > > </configuration>
>> > > > > >
>> > > > > > Cheers,
>> > > > > > Malcolm
>> > > > > >
>> > > > > > On Mon, Apr 1, 2019 at 10:25 AM Prateek Maheshwari <
>> > > > prateek...@gmail.com>
>> > > > > wrote:
>> > > > > > >
>> > > > > > > Hi Malcolm,
>> > > > > > >
>> > > > > > > Just setting that configuration should be sufficient. We
>> > > > > > > haven't seen this issue before. What Samza/YARN versions are
>> > > > > > > you using? Can you also include the logs from where you get the
>> > > > > > > error and your YARN configuration?
>> > > > > > >
>> > > > > > > - Prateek
>> > > > > > >
>> > > > > > > On Mon, Apr 1, 2019 at 2:33 AM Malcolm McFarland <
>> > > > > mmcfarl...@cavulus.com>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Hey Folks,
>> > > > > > > >
>> > > > > > > > I'm having some issues getting multiple cores for containers
>> > > > > > > > in YARN. I seem to have my YARN settings correct, and the RM
>> > > > > > > > interface says that I have 24 vcores available. However, when
>> > > > > > > > I set the cluster-manager.container.cpu.cores Samza setting to
>> > > > > > > > anything other than 1, I get a message saying the container is
>> > > > > > > > requesting more resources than the cluster can allocate. With
>> > > > > > > > 1 core, everything is fine. Is there another Samza option I
>> > > > > > > > need to set?
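>> > > > > > > >
>> > > > > > > > For reference, the relevant bit of my job properties looks
>> > > > > > > > roughly like this (a sketch; other settings omitted and
>> > > > > > > > values illustrative):
>> > > > > > > >
>> > > > > > > > # Samza job properties (sketch)
>> > > > > > > > cluster-manager.container.cpu.cores=2
>> > > > > > > > cluster-manager.container.memory.mb=2048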
>> > > > > > > >
>> > > > > > > > Cheers,
>> > > > > > > > Malcolm


-- 
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarl...@cavulus.com

