And just to double check, you also changed the yarn.resourcemanager.scheduler.class to CapacityScheduler?
On Tue, Apr 2, 2019 at 9:49 AM Prateek Maheshwari <prateek...@gmail.com> wrote:

Is it still the same message from the AM? The one that says: "Got AM register response. The YARN RM supports container requests with max-mem: 14336, max-cpu: 1"

On Tue, Apr 2, 2019 at 12:09 AM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

Hey Prateek,

The upgrade to Hadoop 2.7.6 went fine; everything seems to be working, and access to S3 via an access key/secret pair is working as well. However, my requested tasks are still only getting allocated 1 core, despite requesting more than that. Once again, I have a 3-node cluster that should have 24 vcores available; on the YARN side, I have these options set:

yarn.nodemanager.resource.cpu-vcores=8
yarn.scheduler.minimum-allocation-vcores=1
yarn.scheduler.maximum-allocation-vcores=4
yarn.scheduler.capacity.resource-calculator=org.apache.hadoop.yarn.util.resource.DominantResourceCalculator

And on the Samza side, I'm setting:

cluster-manager.container.cpu.cores=2

However, YARN is still telling me that the running task has 1 vcore assigned. Do you have any other suggestions for options to tweak?

Cheers,
Malcolm

On Mon, Apr 1, 2019 at 5:28 PM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

One more thing -- fwiw, I actually also came across the possibility that I would need to use the DominantResourceCalculator, but as you point out, this doesn't seem to be available in Hadoop 2.6.

On Mon, Apr 1, 2019 at 5:27 PM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

That's quite helpful! I actually initially tried using a version of Hadoop > 2.6.x; when I did, it seemed like the AWS credentials in YARN (fs.s3a.access.key, fs.s3a.secret.key) weren't being accessed, as I received lots of "No AWS Credentials provided by DefaultAWSCredentialsProviderChain" messages.
I found a way around this by providing the credentials to the AM directly via yarn.am.opts=-Daws.accessKeyId=<key> -Daws.secretKey=<secret>, but since this seemed very workaround-ish, I just assumed that I would eventually hit other problems using a version of Hadoop not pinned in the Samza repo. If you're running 2.7.x at LinkedIn, however, I'll give it a shot again.

Have you done any AWS credential integration, and if so, did you need to do anything special to get it to work?

Cheers,
Malcolm

On Mon, Apr 1, 2019 at 5:20 PM Prateek Maheshwari <prateek...@gmail.com> wrote:

Hi Malcolm,

I think this is because in YARN 2.6 the FifoScheduler only accounts for memory in 'maximumAllocation':
https://github.com/apache/hadoop/blob/branch-2.6.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218

This was changed as early as 2.7.0:
https://github.com/apache/hadoop/blob/branch-2.7.0/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/scheduler/fifo/FifoScheduler.java#L218

So upgrading will likely fix this issue. For reference, at LinkedIn we are running YARN 2.7.2 with the CapacityScheduler (https://hadoop.apache.org/docs/r2.7.2/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html) and the DominantResourceCalculator to account for vcore allocations in scheduling.
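[Editor's note: the FifoScheduler behavior described above can be sketched in a few lines. This is an illustrative Python model, not Hadoop source; the names Resource, maximum_allocation_2_6, and maximum_allocation_2_7 are invented for this example.]

```python
# Sketch of the difference Prateek describes: in 2.6 the FifoScheduler
# builds its maximum allocation from the memory setting alone, so the
# vcore dimension falls back to a default of 1 -- which is what the AM
# then reports as "max-cpu: 1". In 2.7+ the configured vcore maximum is
# read as well.

from collections import namedtuple

Resource = namedtuple("Resource", ["memory_mb", "vcores"])

def maximum_allocation_2_6(conf):
    # memory-only: vcores silently default to 1
    return Resource(conf["yarn.scheduler.maximum-allocation-mb"], 1)

def maximum_allocation_2_7(conf):
    # both dimensions come from configuration
    return Resource(conf["yarn.scheduler.maximum-allocation-mb"],
                    conf["yarn.scheduler.maximum-allocation-vcores"])

conf = {
    "yarn.scheduler.maximum-allocation-mb": 14336,
    "yarn.scheduler.maximum-allocation-vcores": 16,
}

print(maximum_allocation_2_6(conf))  # Resource(memory_mb=14336, vcores=1)
print(maximum_allocation_2_7(conf))  # Resource(memory_mb=14336, vcores=16)
```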
- Prateek

On Mon, Apr 1, 2019 at 3:00 PM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

Hi Prateek,

This still seems to be manifesting with the same problem. Since this seems to be something in the Hadoop codebase, I've emailed the hadoop-dev mailing list about it.

Cheers,
Malcolm

On Mon, Apr 1, 2019 at 1:51 PM Prateek Maheshwari <prateek...@gmail.com> wrote:

Hi Malcolm,

Yes, the AM is just reporting what the RM specified as the maximum allowed request size.

I think 'yarn.scheduler.maximum-allocation-vcores' needs to be less than 'yarn.nodemanager.resource.cpu-vcores', since a container must fit on a single NM. Maybe the RM detected this and decided to default to 1? Can you try setting maximum-allocation-vcores lower?

- Prateek

On Mon, Apr 1, 2019 at 11:59 AM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

One other detail: I'm running YARN on ECS in AWS. Has anybody seen issues with core allocation in this environment? I'm seeing this in the Samza log:

"Got AM register response. The YARN RM supports container requests with max-mem: 14336, max-cpu: 1"

How does Samza determine this? Looking at the Samza source on GitHub, it appears to be information that's passed back to the AM when it starts up.
Cheers,
Malcolm

On Mon, Apr 1, 2019 at 10:44 AM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

Hi Prateek,

Sorry, meant to include these versions with my email; I'm running Samza 0.14 and Hadoop 2.6.1. I'm running three containers across 3 node managers, each with 16GB and 8 vcores. The other two containers are requesting 1 vcore each; even with the AMs running, that should be 4 for them in total, leaving plenty of processing power available.

The error is in the application attempt diagnostics field: "The YARN cluster is unable to run your job due to unsatisfiable resource requirements. You asked for mem: 2048, and cpu: 2." I do not see this error with the same memory request, but a cpu count request of 1.
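[Editor's note: the diagnostic quoted above suggests the AM simply compares each container request against the maximums the RM returned at registration ("max-mem: 14336, max-cpu: 1"). A hypothetical sketch of that check; the function and parameter names are invented for illustration.]

```python
# Hypothetical sketch of the validation behind the diagnostic above:
# a request is rejected if either dimension exceeds the RM's maximum.

def validate_request(mem_mb, vcores, max_mem_mb, max_vcores):
    if mem_mb > max_mem_mb or vcores > max_vcores:
        raise ValueError(
            "The YARN cluster is unable to run your job due to "
            "unsatisfiable resource requirements. "
            f"You asked for mem: {mem_mb}, and cpu: {vcores}.")

validate_request(2048, 1, 14336, 1)   # cpu: 1 fits under max-cpu: 1, no error
try:
    validate_request(2048, 2, 14336, 1)  # cpu: 2 exceeds max-cpu: 1
except ValueError as e:
    print(e)
```

This would explain why the job runs with cpu.cores=1 but fails with 2, regardless of how many vcores the cluster has in aggregate.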
Here are the configuration options pertaining to resource allocation:

<?xml version="1.0"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>2.1</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>14336</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>256</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>14336</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>16</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>8</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>processor-cluster</value>
  </property>
</configuration>

Cheers,
Malcolm
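[Editor's note: the scheduler switch discussed elsewhere in the thread (CapacityScheduler plus DominantResourceCalculator on 2.7.x) might look like the fragment below. The scheduler class belongs in yarn-site.xml and the resource-calculator property in capacity-scheduler.xml; treat this as a sketch, not a verified configuration.]

  <!-- yarn-site.xml: use the CapacityScheduler instead of the FifoScheduler -->
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  </property>

  <!-- capacity-scheduler.xml: account for vcores as well as memory -->
  <property>
    <name>yarn.scheduler.capacity.resource-calculator</name>
    <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
  </property>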
On Mon, Apr 1, 2019 at 10:25 AM Prateek Maheshwari <prateek...@gmail.com> wrote:

Hi Malcolm,

Just setting that configuration should be sufficient. We haven't seen this issue before. What Samza/YARN versions are you using? Can you also include the logs from where you get the error and your yarn configuration?

- Prateek

On Mon, Apr 1, 2019 at 2:33 AM Malcolm McFarland <mmcfarl...@cavulus.com> wrote:

Hey Folks,

I'm having some issues getting multiple cores for containers in YARN. I seem to have my YARN settings correct, and the RM interface says that I have 24 vcores available. However, when I set the cluster-manager.container.cpu.cores Samza setting to anything other than 1, I get a message about how the container is requesting more resources than it can allocate. With 1 core, everything is fine. Is there another Samza option I need to set?

Cheers,
Malcolm

--
Malcolm McFarland
Cavulus
1-800-760-6915
mmcfarl...@cavulus.com

This correspondence is from HealthPlanCRM, LLC, d/b/a Cavulus.
Any unauthorized or improper disclosure, copying, distribution, or use of the contents of this message is prohibited. The information contained in this message is intended only for the personal and confidential use of the recipient(s) named above. If you have received this message in error, please notify the sender immediately and delete the original message.