Is it still the same message from the AM? The one that says: "Got AM register response. The YARN RM supports container requests with max-mem: 14336, max-cpu: 1"
On Tue, Apr 2, 2019 at 12:09 AM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

Hey Prateek,

The upgrade to Hadoop 2.7.6 went fine; everything seems to be working, and access to S3 via an access key/secret pair is working as well. However, my requested tasks are still only getting allocated 1 core, despite requesting more than that. Once again, I have a 3-node cluster that should have 24 vcores available; on the YARN side, I have these options set:

nodemanager.resource.cpu-vcores=8
yarn.scheduler.minimum-allocation-vcore=1
yarn.scheduler.maximum-allocation-vcores=4
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator

And on the Samza side, I'm setting:

cluster-manager.container.cpu.cores=2

However, YARN is still telling me that the running task has 1 vcore assigned. Do you have any other suggestions for options to tweak?

Cheers,
Malcolm

On Mon, Apr 1, 2019 at 5:28 PM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

One more thing -- fwiw, I actually also came across the possibility that I would need to use the DominantResourceCalculator, but as you point out, this doesn't seem to be available in Hadoop 2.6.

On Mon, Apr 1, 2019 at 5:27 PM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

That's quite helpful! I actually initially tried using a version of Hadoop > 2.6.x; when I did, it seemed like the AWS credentials in YARN (fs.s3a.access.key, fs.s3a.secret.key) weren't being accessed, as I received lots of "No AWS Credentials provided by DefaultAWSCredentialsProviderChain" messages.
I found a way around this by providing the credentials to the AM directly via yarn.am.opts=-Daws.accessKeyId=<key> -Daws.secretKey=<secret>, but since this seemed very workaround-ish, I just assumed that I would eventually hit other problems using a version of Hadoop not pinned in the Samza repo. If you're running 2.7.x at LinkedIn, however, I'll give it a shot again.

Have you done any AWS credential integration, and if so, did you need to do anything special to get it to work?

Cheers,
Malcolm

On Mon, Apr 1, 2019 at 5:20 PM Prateek Maheshwari <prateek...@gmail.com> wrote:

Hi Malcolm,

I think this is because in YARN 2.6 the FifoScheduler only accounts for memory for 'maximumAllocation':

https://github.com/apache/hadoop/blob/branch-2.6.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218

This has been changed as early as 2.7.0:

https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218

So upgrading will likely fix this issue. For reference, at LinkedIn we are running YARN 2.7.2 with the CapacityScheduler
(https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html)
and the DominantResourceCalculator to account for vcore allocations in scheduling.

- Prateek

On Mon, Apr 1, 2019 at 3:00 PM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

Hi Prateek,

This still seems to be manifesting with the same problem.
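Prateek's explanation of the 2.6-vs-2.7 FifoScheduler difference can be sketched as follows. This is a deliberately simplified model inferred from the discussion and the AM log line, not Hadoop's actual code: the assumption is that in 2.6.x the scheduler's reported maximum allocation was built from the configured maximum memory only (so vcores fell back to 1), while from 2.7.0 the configured maximum vcores is respected too.

```python
# Simplified model (NOT the actual Hadoop source) of the FifoScheduler
# 'maximumAllocation' behavior discussed above.

def max_allocation_2_6(conf):
    # Hypothesis: only yarn.scheduler.maximum-allocation-mb is consulted,
    # so the advertised vcore maximum bottoms out at 1.
    return {"mem_mb": conf["yarn.scheduler.maximum-allocation-mb"],
            "vcores": 1}

def max_allocation_2_7(conf):
    # Both the memory and vcore maxima come from configuration.
    return {"mem_mb": conf["yarn.scheduler.maximum-allocation-mb"],
            "vcores": conf["yarn.scheduler.maximum-allocation-vcores"]}

conf = {
    "yarn.scheduler.maximum-allocation-mb": 14336,
    "yarn.scheduler.maximum-allocation-vcores": 16,
}

# The 2.6 model reproduces the AM log line "max-mem: 14336, max-cpu: 1".
print(max_allocation_2_6(conf))  # {'mem_mb': 14336, 'vcores': 1}
print(max_allocation_2_7(conf))  # {'mem_mb': 14336, 'vcores': 16}
```

Under this model, any Samza request with cluster-manager.container.cpu.cores > 1 would exceed the advertised maximum on 2.6.x regardless of the vcore settings, which matches the behavior reported in the thread.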
Since this seems to be something in the Hadoop codebase, I've emailed the hadoop-dev mailing list about it.

Cheers,
Malcolm

On Mon, Apr 1, 2019 at 1:51 PM Prateek Maheshwari <prateek...@gmail.com> wrote:

Hi Malcolm,

Yes, the AM is just reporting what the RM specified as the maximum allowed request size.

I think 'yarn.scheduler.maximum-allocation-vcores' needs to be less than 'yarn.nodemanager.resource.cpu-vcores', since a container must fit on a single NM. Maybe the RM detected this and decided to default to 1? Can you try setting maximum-allocation-vcores lower?

- Prateek

On Mon, Apr 1, 2019 at 11:59 AM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

One other detail: I'm running YARN on ECS in AWS. Has anybody seen issues with core allocation in this environment? I'm seeing this in the Samza log:

"Got AM register response. The YARN RM supports container requests with max-mem: 14336, max-cpu: 1"

How does Samza determine this? Looking at the Samza source on GitHub, it appears to be information that's passed back to the AM when it starts up.

Cheers,
Malcolm

On Mon, Apr 1, 2019 at 10:44 AM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

Hi Prateek,

Sorry, meant to include these versions with my email; I'm running Samza 0.14 and Hadoop 2.6.1. I'm running three containers across 3 node managers, each with 16GB and 8 vcores.
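The single-NM constraint Prateek describes above can be sketched as a small model: a container request is only satisfiable if its vcore count is at or below both the scheduler's maximum-allocation-vcores and the vcores of a single NodeManager, so the effective ceiling is the smaller of the two. This is an illustration of the constraint, not YARN's actual validation code.

```python
# Minimal sketch of the fit check: a container must fit on a single
# NodeManager, so the effective vcore ceiling is the smaller of
# yarn.scheduler.maximum-allocation-vcores and
# yarn.nodemanager.resource.cpu-vcores. (Simplified; not YARN source.)

def request_fits(req_vcores, max_allocation_vcores, nm_vcores):
    effective_max = min(max_allocation_vcores, nm_vcores)
    return req_vcores <= effective_max

# With maximum-allocation-vcores=16 but only 8 vcores per NM (as in the
# config later in this thread), the effective ceiling is 8:
print(request_fits(2, 16, 8))   # True: a 2-vcore request fits on one NM
print(request_fits(12, 16, 8))  # False: 12 vcores can never fit on a single NM
```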
The other two containers are requesting 1 vcore each; even with the AMs running, that should be 4 for them in total, leaving plenty of processing power available.

The error is in the application attempt diagnostics field: "The YARN cluster is unable to run your job due to unsatisfiable resource requirements. You asked for mem: 2048, and cpu: 2." I do not see this error with the same memory request, but a cpu count request of 1.

Here are the configuration options pertaining to resource allocation:

<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>14336</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>256</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>14336</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>16</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>8</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>processor-cluster</value>
  </property>
</configuration>

Cheers,
Malcolm

On Mon, Apr 1, 2019 at 10:25 AM Prateek Maheshwari <prateek...@gmail.com> wrote:

Hi Malcolm,

Just setting that configuration should be sufficient. We haven't seen this issue before. What Samza/YARN versions are you using? Can you also include the logs from where you get the error and your yarn configuration?

- Prateek

On Mon, Apr 1, 2019 at 2:33 AM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

Hey Folks,

I'm having some issues getting multiple cores for containers in YARN. I seem to have my YARN settings correct, and the RM interface says that I have 24 vcores available. However, when I set the cluster-manager.container.cpu.cores Samza setting to anything other than 1, I get a message about how the container is requesting more resources than it can allocate. With 1 core, everything is fine. Is there another Samza option I need to set?
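The DominantResourceCalculator that comes up in this thread compares requests by whichever resource is the largest fraction of the available capacity, rather than by memory alone. A simplified sketch of that idea (not Hadoop's implementation; the numbers below reuse the 14336 MB / 8 vcore node from the config above):

```python
# Simplified sketch of the dominant-share idea behind the
# DominantResourceCalculator discussed in this thread (not Hadoop code).

def dominant_share(req_mem, req_vcores, cluster_mem, cluster_vcores):
    # The "dominant" resource is whichever share is larger.
    return max(req_mem / cluster_mem, req_vcores / cluster_vcores)

# The failing request from this thread, 2048 MB / 2 vcores, on a
# 14336 MB / 8 vcore node: memory share is ~0.14, vcore share is 0.25,
# so vcores are the dominant resource for this request.
print(dominant_share(2048, 2, 14336, 8))  # 0.25
```

This is why a memory-only calculator can consider such a request small while a dominant-resource calculator correctly weighs its vcore demand.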
Cheers,
Malcolm

--
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarl...@cavulus.com

This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus. Any unauthorized or improper disclosure, copying, distribution, or use of the contents of this message is prohibited. The information contained in this message is intended only for the personal and confidential use of the recipient(s) named above. If you have received this message in error, please notify the sender immediately and delete the original message.