Eric, sorry for the slow reply.
I'm afraid I'm not 100% sure anymore (this was at a client and I don't have
access to the system at the moment). I believe we submitted to foo_ultimo
first and then to foo_daily, but I've reached out to them to get feedback.

Cheers,
Lars

On Fri, Oct 11, 2019 at 8:21 PM epa...@apache.org <epa...@apache.org> wrote:
>
> > Submit a job to queue 1 which uses 100% of the cluster.
> > Submit a job to queue 2 which doesn't get allocated because there are
> > not enough resources for the AM.
>
> Lars,
> can you please correlate 'queue 1' and 'queue 2' with the config that
> you provided? That is, based on your configs, which queue was using
> 100% and which was being starved for resources?
>
> Thanks,
> -Eric
>
> On Friday, October 11, 2019, 6:41:08 AM CDT, Lars Francke
> <lars.fran...@gmail.com> wrote:
>
> Sunil,
>
> this is the full capacity-scheduler.xml file:
>
> <configuration xmlns:xi="http://www.w3.org/2001/XInclude">
>
> <property>
> <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
> <value>0.2</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.maximum-applications</name>
> <value>10000</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.node-locality-delay</name>
> <value>40</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.queue-mappings-override.enable</name>
> <value>false</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.resource-calculator</name>
> <value>org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.accessible-node-labels</name>
> <value>*</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.acl_administer_queue</name>
> <value>*</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.acl_submit_applications</name>
> <value>*</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.capacity</name>
> <value>100</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.default.acl_submit_applications</name>
> <value>*</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.default.capacity</name>
> <value>1</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.default.maximum-capacity</name>
> <value>100</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.default.priority</name>
> <value>0</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.default.state</name>
> <value>RUNNING</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
> <value>50</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.priority</name>
> <value>0</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.queues</name>
> <value>default,foo</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.acl_administer_queue</name>
> <value>*</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.acl_submit_applications</name>
> <value>*</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.capacity</name>
> <value>99</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.maximum-capacity</name>
> <value>100</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.minimum-user-limit-percent</name>
> <value>100</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.priority</name>
> <value>0</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.queues</name>
> <value>foo_daily,foo_ultimo</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_daily.acl_administer_queue</name>
> <value>*</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_daily.acl_submit_applications</name>
> <value>*</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_daily.capacity</name>
> <value>50</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_daily.maximum-capacity</name>
> <value>100</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_daily.minimum-user-limit-percent</name>
> <value>100</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_daily.ordering-policy</name>
> <value>fifo</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_daily.priority</name>
> <value>0</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_daily.state</name>
> <value>RUNNING</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_daily.user-limit-factor</name>
> <value>2</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_ultimo.acl_administer_queue</name>
> <value>*</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_ultimo.acl_submit_applications</name>
> <value>*</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_ultimo.capacity</name>
> <value>50</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_ultimo.maximum-capacity</name>
> <value>100</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_ultimo.minimum-user-limit-percent</name>
> <value>100</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_ultimo.ordering-policy</name>
> <value>fifo</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_ultimo.priority</name>
> <value>0</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_ultimo.state</name>
> <value>RUNNING</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.foo_ultimo.user-limit-factor</name>
> <value>2</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.state</name>
> <value>RUNNING</value>
> </property>
>
> <property>
> <name>yarn.scheduler.capacity.root.foo.user-limit-factor</name>
> <value>1</value>
> </property>
>
> </configuration>
>
> On Wed, Oct 9, 2019 at 10:56 PM Lars Francke <lars.fran...@gmail.com>
> wrote:
> > Sunil,
> >
> > thank you for the answer.
> >
> > This is HDP 3.1, based on Hadoop 3.1.1.
> > I believe no preemption defaults were changed. The only change was to
> > enable preemption (monitor.enabled) in Ambari, but I can get the full
> > XML for you if that's helpful. I'll get back to you on that.
> >
> > Cheers,
> > Lars
> >
> > On Wed, Oct 9, 2019 at 7:57 PM Sunil Govindan <sun...@apache.org> wrote:
> >> Hi,
> >>
> >> Your expectation for the scenario you explained is more or less correct.
> >> A few pieces of information are needed to get a clearer picture:
> >> 1. the Hadoop version
> >> 2. the capacity-scheduler.xml (to be precise, all preemption-related
> >> configs that were added)
> >>
> >> - Sunil
> >>
> >> On Wed, Oct 9, 2019 at 6:23 PM Lars Francke <lars.fran...@gmail.com>
> >> wrote:
> >>> Hi,
> >>>
> >>> I've got a question about behavior we're seeing.
> >>>
> >>> Two queues: preemption enabled, CapacityScheduler (happy to provide
> >>> more config if needed), 50% of resources to each.
> >>>
> >>> Submit a job to queue 1, which uses 100% of the cluster.
> >>> Submit a job to queue 2, which doesn't get allocated because there are
> >>> not enough resources for the AM.
> >>>
> >>> I'd have expected this to trigger preemption, but it doesn't happen.
> >>> Is this expected?
> >>>
> >>> Second question:
> >>> Once we get the second job running, it only gets three containers but
> >>> requested six. Utilization at this point is still "unfair", i.e. queue 1
> >>> has ~75% of the resources and queue 2 ~25%.
> >>> We saw two preemptions happening to get to this 25% utilization, but
> >>> then no more.
> >>>
> >>> Any idea why this could be happening?
> >>>
> >>> Thank you!
> >>>
> >>> Lars
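[Editor's note: the preemption-related configs Sunil asked about are never shown in the thread. For reference, these are the CapacityScheduler preemption properties that live in yarn-site.xml, with the stock Hadoop 3.1 defaults. The values below are the shipped defaults, not necessarily what this cluster ran; per Lars, only monitor.enable was changed via Ambari.]

```xml
<!-- yarn-site.xml: CapacityScheduler preemption knobs (Hadoop 3.1 defaults) -->
<property>
  <!-- Master switch; this is the setting Ambari's "enable preemption" toggles -->
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
</property>
<property>
  <!-- How often the preemption policy runs, in ms -->
  <name>yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval</name>
  <value>3000</value>
</property>
<property>
  <!-- Delay between marking a container for preemption and killing it, in ms -->
  <name>yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill</name>
  <value>15000</value>
</property>
<property>
  <!-- At most this fraction of total cluster resources is preempted per round -->
  <name>yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round</name>
  <value>0.1</value>
</property>
<property>
  <!-- Geometric dampening: only this fraction of the remaining imbalance is
       reclaimed per round, so preemption tails off before reaching exact fairness -->
  <name>yarn.resourcemanager.monitor.capacity.preemption.natural_termination_factor</name>
  <value>0.2</value>
</property>
<property>
  <!-- Dead zone: usage within this fraction above a queue's guaranteed
       capacity is ignored and never preempted -->
  <name>yarn.resourcemanager.monitor.capacity.preemption.max_ignored_over_capacity</name>
  <value>0.1</value>
</property>
```

The natural_termination_factor and max_ignored_over_capacity defaults are worth noting in the context of the questions above, since by design they make preemption converge toward, but stop short of, the configured capacities.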