And just to double check, you also changed the yarn.resourcemanager.scheduler.class to CapacityScheduler?
On Tue, Apr 2, 2019 at 9:49 AM Prateek Maheshwari <prateek...@gmail.com> wrote:

Is it still the same message from the AM? The one that says: "Got AM register response. The YARN RM supports container requests with max-mem: 14336, max-cpu: 1"

On Tue, Apr 2, 2019 at 12:09 AM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

Hey Prateek,

The upgrade to Hadoop 2.7.6 went fine; everything seems to be working, and access to S3 via an access key/secret pair is working as well. However, my requested tasks are still only getting allocated 1 core, despite requesting more than that. Once again, I have a 3-node cluster that should have 24 vcores available; on the YARN side, I have these options set:

yarn.nodemanager.resource.cpu-vcores=8
yarn.scheduler.minimum-allocation-vcores=1
yarn.scheduler.maximum-allocation-vcores=4
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator

And on the Samza side, I'm setting:

cluster-manager.container.cpu.cores=2

However, YARN is still telling me that the running task has 1 vcore assigned. Do you have any other suggestions for options to tweak?

Cheers,
Malcolm

On Mon, Apr 1, 2019 at 5:28 PM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

One more thing -- fwiw, I actually also came across the possibility that I would need to use the DominantResourceCalculator, but as you point out, this doesn't seem to be available in Hadoop 2.6.

On Mon, Apr 1, 2019 at 5:27 PM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

That's quite helpful! I actually initially tried using a version of Hadoop > 2.6.x; when I did, it seemed like the AWS credentials in YARN (fs.s3a.access.key, fs.s3a.secret.key) weren't being accessed, as I received lots of "No AWS Credentials provided by DefaultAWSCredentialsProviderChain" messages.
I found a way around this by providing the credentials to the AM directly via yarn.am.opts=-Daws.accessKeyId=<key> -Daws.secretKey=<secret>, but since this seemed very workaround-ish, I just assumed that I would eventually hit other problems using a version of Hadoop not pinned in the Samza repo. If you're running 2.7.x at LinkedIn, however, I'll give it a shot again.

Have you done any AWS credential integration, and if so, did you need to do anything special to get it to work?

Cheers,
Malcolm

On Mon, Apr 1, 2019 at 5:20 PM Prateek Maheshwari <prateek...@gmail.com> wrote:

Hi Malcolm,

I think this is because in YARN 2.6 the FifoScheduler only accounts for memory in 'maximumAllocation':
https://github.com/apache/hadoop/blob/branch-2.6.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218

This was changed as early as 2.7.0:
https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218

So upgrading will likely fix this issue. For reference, at LinkedIn we are running YARN 2.7.2 with the CapacityScheduler (https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html) and the DominantResourceCalculator to account for vcore allocations in scheduling.
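[Editor's note: the FifoScheduler behavior described above can be sketched in a few lines. This is an illustrative Python model, not Hadoop source; the names Resource, maximum_allocation_2_6, and maximum_allocation_2_7 are invented for this example.]

```python
# Sketch of the difference Prateek describes: in 2.6 the FifoScheduler
# builds its maximum allocation from the memory setting alone, so the
# vcore dimension falls back to a default of 1 -- which is what the AM
# then reports as "max-cpu: 1". In 2.7+ the configured vcore maximum is
# read as well.

from collections import namedtuple

Resource = namedtuple("Resource", ["memory_mb", "vcores"])

def maximum_allocation_2_6(conf):
    # memory-only: vcores silently default to 1
    return Resource(conf["yarn.scheduler.maximum-allocation-mb"], 1)

def maximum_allocation_2_7(conf):
    # both dimensions come from configuration
    return Resource(conf["yarn.scheduler.maximum-allocation-mb"],
                    conf["yarn.scheduler.maximum-allocation-vcores"])

conf = {
    "yarn.scheduler.maximum-allocation-mb": 14336,
    "yarn.scheduler.maximum-allocation-vcores": 16,
}

print(maximum_allocation_2_6(conf))  # Resource(memory_mb=14336, vcores=1)
print(maximum_allocation_2_7(conf))  # Resource(memory_mb=14336, vcores=16)
```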
- Prateek

On Mon, Apr 1, 2019 at 3:00 PM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

Hi Prateek,

This still seems to be manifesting with the same problem. Since this seems to be something in the Hadoop codebase, I've emailed the hadoop-dev mailing list about it.

Cheers,
Malcolm

On Mon, Apr 1, 2019 at 1:51 PM Prateek Maheshwari <prateek...@gmail.com> wrote:

Hi Malcolm,

Yes, the AM is just reporting what the RM specified as the maximum allowed request size.

I think 'yarn.scheduler.maximum-allocation-vcores' needs to be less than 'yarn.nodemanager.resource.cpu-vcores', since a container must fit on a single NM. Maybe the RM detected this and decided to default to 1? Can you try setting maximum-allocation-vcores lower?

- Prateek

On Mon, Apr 1, 2019 at 11:59 AM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

One other detail: I'm running YARN on ECS in AWS. Has anybody seen issues with core allocation in this environment? I'm seeing this in the Samza log:

"Got AM register response. The YARN RM supports container requests with max-mem: 14336, max-cpu: 1"

How does Samza determine this? Looking at the Samza source on GitHub, it appears to be information that's passed back to the AM when it starts up.
Cheers,
Malcolm

On Mon, Apr 1, 2019 at 10:44 AM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

Hi Prateek,

Sorry, meant to include these versions with my email; I'm running Samza 0.14 and Hadoop 2.6.1. I'm running three containers across 3 node managers, each with 16GB and 8 vcores. The other two containers are requesting 1 vcore each; even with the AMs running, that should be 4 for them in total, leaving plenty of processing power available.

The error is in the application attempt diagnostics field: "The YARN cluster is unable to run your job due to unsatisfiable resource requirements. You asked for mem: 2048, and cpu: 2." I do not see this error with the same memory request, but a cpu count request of 1.
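[Editor's note: the diagnostic quoted above suggests the AM simply compares each container request against the maximums the RM returned at registration ("max-mem: 14336, max-cpu: 1"). A hypothetical sketch of that check; the function and parameter names are invented for illustration.]

```python
# Hypothetical sketch of the validation behind the diagnostic above:
# a request is rejected if either dimension exceeds the RM's maximum.

def validate_request(mem_mb, vcores, max_mem_mb, max_vcores):
    if mem_mb > max_mem_mb or vcores > max_vcores:
        raise ValueError(
            "The YARN cluster is unable to run your job due to "
            "unsatisfiable resource requirements. "
            f"You asked for mem: {mem_mb}, and cpu: {vcores}.")

validate_request(2048, 1, 14336, 1)   # cpu: 1 fits under max-cpu: 1, no error
try:
    validate_request(2048, 2, 14336, 1)  # cpu: 2 exceeds max-cpu: 1
except ValueError as e:
    print(e)
```

This would explain why the job runs with cpu.cores=1 but fails with 2, regardless of how many vcores the cluster has in aggregate.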
Here are the configuration options pertaining to resource allocation:

<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>14336</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>256</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>14336</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>16</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>8</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>processor-cluster</value>
  </property>
</configuration>

Cheers,
Malcolm
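[Editor's note: the scheduler switch discussed elsewhere in the thread (CapacityScheduler plus DominantResourceCalculator on 2.7.x) might look like the fragment below. The scheduler class belongs in yarn-site.xml and the resource-calculator property in capacity-scheduler.xml; treat this as a sketch, not a verified configuration.]

  <!-- yarn-site.xml: use the CapacityScheduler instead of the FifoScheduler -->
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>

  <!-- capacity-scheduler.xml: account for vcores as well as memory -->
  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
  </property>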
On Mon, Apr 1, 2019 at 10:25 AM Prateek Maheshwari <prateek...@gmail.com> wrote:

Hi Malcolm,

Just setting that configuration should be sufficient. We haven't seen this issue before. What Samza/YARN versions are you using? Can you also include the logs from where you get the error and your yarn configuration?

- Prateek

On Mon, Apr 1, 2019 at 2:33 AM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

Hey Folks,

I'm having some issues getting multiple cores for containers in YARN. I seem to have my YARN settings correct, and the RM interface says that I have 24 vcores available. However, when I set the cluster-manager.container.cpu.cores Samza setting to anything other than 1, I get a message about how the container is requesting more resources than it can allocate. With 1 core, everything is fine. Is there another Samza option I need to set?

Cheers,
Malcolm

--
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarl...@cavulus.com

This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus.
Any unauthorized or improper disclosure, copying, distribution, or use of the contents of this message is prohibited. The information contained in this message is intended only for the personal and confidential use of the recipient(s) named above. If you have received this message in error, please notify the sender immediately and delete the original message.