[GitHub] [samza] mynameborat commented on issue #981: Bugfix: Making KafkaSytemAdmin's metadataConsumer accesses thread-safe, enabling StreamRegexMonitors only when required

2019-04-01 Thread GitBox
mynameborat commented on issue #981: Bugfix: Making KafkaSytemAdmin's metadataConsumer accesses thread-safe, enabling StreamRegexMonitors only when required URL: https://github.com/apache/samza/pull/981#issuecomment-478844338 Can you create a JIRA and add explanation around the bug and why

[GitHub] [samza] rmatharu opened a new pull request #981: Bugfix: Making KafkaSytemAdmin's metadataConsumer accesses thread-safe, enabling StreamRegexMonitors only when required

2019-04-01 Thread GitBox
rmatharu opened a new pull request #981: Bugfix: Making KafkaSytemAdmin's metadataConsumer accesses thread-safe, enabling StreamRegexMonitors only when required URL: https://github.com/apache/samza/pull/981 This is an autom

[GitHub] [samza] mynameborat commented on issue #905: SAMZA-2055: Async high level api

2019-04-01 Thread GitBox
mynameborat commented on issue #905: SAMZA-2055: Async high level api URL: https://github.com/apache/samza/pull/905#issuecomment-478805961 * Addressed review comments and renamed StreamOperatorImpl to FlatmapOperatorImpl * Added sample AsyncApplicationExample to illustrate the async APIs

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Malcolm McFarland
One more thing -- fwiw, I actually also came across the possibility that I would need to use the DominantResourceCalculator, but as you point out, this doesn't seem to be available in Hadoop 2.6. On Mon, Apr 1, 2019 at 5:27 PM Malcolm McFarland wrote: > That's quite helpful! I actually initiall

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Malcolm McFarland
That's quite helpful! I actually initially tried using a version of Hadoop > 2.6.x; when I did, it seemed like the AWS credentials in YARN (fs.s3a.access.key, fs.s3a.secret.key) weren't being accessed, as I received lots of "No AWS Credentials provided by DefaultAWSCredentialsProviderChain" message

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Prateek Maheshwari
Hi Malcolm, I think this is because in YARN 2.6 the FifoScheduler only accounts for memory for 'maximumAllocation': https://github.com/apache/hadoop/blob/branch-2.6.2/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/r

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Malcolm McFarland
Hi Prateek, This still seems to be manifesting the same problem. Since this seems to be something in the Hadoop codebase, I've emailed the hadoop-dev mailing list about it. Cheers, Malcolm On Mon, Apr 1, 2019 at 1:51 PM Prateek Maheshwari wrote: > Hi Malcolm, > > Yes, the AM is just r

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Prateek Maheshwari
Hi Malcolm, Yes, the AM is just reporting what the RM specified as the maximum allowed request size. I think 'yarn.scheduler.maximum-allocation-vcores' needs to be less than 'yarn.nodemanager.resource.cpu-vcores', since a container must fit on a single NM. Maybe the RM detected this and decided t
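
The constraint described in the message above can be written as a small check. This is only a sketch, not Samza or YARN code: the function name check_vcore_request and its parameters are hypothetical, and it reads "less than" as "no larger than", which is an assumption. The three inputs stand for the values of cluster-manager.container.cpu.cores, yarn.scheduler.maximum-allocation-vcores, and yarn.nodemanager.resource.cpu-vcores.

```python
def check_vcore_request(requested_vcores, scheduler_max_vcores, nm_vcores):
    """Return a list of problems; an empty list means the request should fit.

    Hypothetical helper for illustration only. It does not query YARN.
    """
    problems = []
    # A container must fit on a single NodeManager, so the scheduler maximum
    # should not exceed what one NodeManager offers (assumed to mean <=).
    if scheduler_max_vcores > nm_vcores:
        problems.append(
            "scheduler maximum (%d) exceeds NodeManager vcores (%d)"
            % (scheduler_max_vcores, nm_vcores)
        )
    # The per-container request must not exceed the scheduler maximum.
    if requested_vcores > scheduler_max_vcores:
        problems.append(
            "requested vcores (%d) exceed scheduler maximum (%d)"
            % (requested_vcores, scheduler_max_vcores)
        )
    return problems


if __name__ == "__main__":
    # Values taken from the thread: 8 vcores per NM, RM reporting max-cpu 1.
    for problem in check_vcore_request(2, 1, 8):
        print(problem)
```

With the numbers reported in this thread (a maximum of 1 from the RM and 8 vcores per node manager), any request above 1 vcore fails the second check, which matches the symptom being discussed.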

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Malcolm McFarland
One other detail: I'm running YARN on ECS in AWS. Has anybody seen issues with core allocation in this environment? I'm seeing this in the Samza log: "Got AM register response. The YARN RM supports container requests with max-mem: 14336, max-cpu: 1" How does Samza determine this? Looking at the S
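
For anyone comparing this log line across runs, here is a minimal sketch of pulling the two limits out of it. The helper name parse_am_limits is hypothetical, and the pattern is inferred only from the single message quoted above, so it may not match the wording in other Samza versions.

```python
import re

# Pattern inferred from the one log line quoted in this thread (assumption).
_AM_LIMITS = re.compile(r"max-mem: (\d+), max-cpu: (\d+)")


def parse_am_limits(log_line):
    """Return (max_mem, max_cpu) as ints, or None if the line does not match."""
    match = _AM_LIMITS.search(log_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


if __name__ == "__main__":
    line = (
        "Got AM register response. The YARN RM supports container "
        "requests with max-mem: 14336, max-cpu: 1"
    )
    print(parse_am_limits(line))
```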

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Malcolm McFarland
Hi Prateek, Sorry, meant to include these versions with my email; I'm running Samza 0.14 and Hadoop 2.6.1. I'm running three containers across 3 node managers, each with 16GB and 8 vcores. The other two containers are requesting 1 vcore each; even with the AMs running, that should be 4 for them in

Re: Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Prateek Maheshwari
Hi Malcolm, Just setting that configuration should be sufficient. We haven't seen this issue before. What Samza/YARN versions are you using? Can you also include the logs from where you get the error and your yarn configuration? - Prateek On Mon, Apr 1, 2019 at 2:33 AM Malcolm McFarland wrote:

Running w/ multiple CPUs/container on YARN

2019-04-01 Thread Malcolm McFarland
Hey Folks, I'm having some issues getting multiple cores for containers in YARN. I seem to have my YARN settings correct, and the RM interface says that I have 24 vcores available. However, when I set the cluster-manager.container.cpu.cores Samza setting to anything other than 1, I get a message ab