[Yahoo-eng-team] [Bug 1341420] Re: gap between scheduler selection and claim causes spurious failures when the instance is the last one to fit
This is a really old bug and I don't think it applies to tripleo anymore (if it ever did). Setting to Invalid for tripleo.

** Changed in: tripleo
       Status: New => Invalid

--
You received this bug notification because you are a member of Yahoo! Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1341420

Title:
  gap between scheduler selection and claim causes spurious failures when the instance is the last one to fit

Status in OpenStack Compute (nova): Invalid
Status in tripleo: Invalid

Bug description:

There is a race between the scheduler in select_destinations, which selects a set of hosts, and the nova compute manager, which claims resources on those hosts when building the instance. The race is particularly noticeable with Ironic, where every request consumes a full host, but it can turn up with libvirt etc. too. Multiple schedulers will likely exacerbate this unless they run on a version of Python with randomised dictionary ordering, in which case they will actually make it better :). I've put https://review.openstack.org/106677 up to remove a comment that predates the introduction of this race.

One mitigating aspect: the filter scheduler's _schedule method attempts to select hosts randomly, to avoid returning the same host for repeated requests, but the default size of the subset it selects from is 1 - so when heat requests a single instance, the same candidate is chosen every time. Setting that number higher can stop all concurrent requests from hitting the same host, but the race remains, and it is still likely to fail fairly hard in near-capacity situations (e.g. deploying all machines in a cluster with Ironic and Heat). A toy sketch of this selection step follows.
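As an illustration of the paragraph above, here is a minimal, hypothetical stand-in for the scheduler's final host pick (not nova's actual _schedule code; only the behaviour of the subset-size option is taken from the description). With a subset size of 1, random.choice degenerates to "always take the best-weighed host", so concurrent requests that see the same weighed list all collide on it:

    import random

    # Hypothetical stand-in for the filter scheduler's final pick.
    # weighed_hosts is sorted best-first, as _schedule produces it.
    def pick_host(weighed_hosts, subset_size=1):
        candidates = weighed_hosts[:max(subset_size, 1)]
        return random.choice(candidates)

    hosts = ["node-a", "node-b", "node-c"]
    print([pick_host(hosts) for _ in range(5)])                 # always 'node-a'
    print([pick_host(hosts, subset_size=3) for _ in range(5)])  # spread, but still racy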
Folks wanting to reproduce this: take a decent-sized cloud - e.g. 5 or 10 hypervisor hosts (KVM is fine). Fill each hypervisor until it has capacity for exactly one more VM. Then deploy a bunch of VMs one at a time but very close together - e.g. use the Python API so the keystone credentials are cached, and boot 5 in a loop (a boot-loop sketch follows the chat log below). If using Ironic you will want https://review.openstack.org/106676 to let you see which host is being returned from the selection.

Possible fixes:
- have the scheduler be a bit smarter about returning hosts - e.g. track destination selection counts since the last refresh and weight hosts by that count as well
- reinstate actioning claims in the scheduler, allowing the audit to correct any claimed-but-not-started resource counts asynchronously
- special-case the retry behaviour when there are lots of resources available elsewhere in the cluster.

Stats-wise, I just tested a 29-instance deployment with Ironic and a heat stack, with 45 machines to deploy onto (so 45 hosts in the scheduler set), and 4 instances failed with this race - which means each was rescheduled and failed 3 times, or 12 cases of the scheduler racing *at minimum*.

background chat

15:43 < lifeless> mikal: around? I need to sanity check something
15:44 < lifeless> ulp, nope, am sure of it. filing a bug.
15:45 < mikal> lifeless: ok
15:46 < lifeless> mikal: oh, you're here, I will run it past you :)
15:46 < lifeless> mikal: if you have ~5m
15:46 < mikal> Sure
15:46 < lifeless> so, symptoms
15:46 < lifeless> nova boot <...> --num-instances 45 -> works fairly reliably. Some minor timeout related things to fix but nothing dramatic.
15:47 < lifeless> heat create-stack <...> with a stack with 45 instances in it -> about 50% of instances fail to come up
15:47 < lifeless> this is with Ironic
15:47 < mikal> Sure
15:47 < lifeless> the failure on all the instances is the retry-three-times failure-of-death
15:47 < lifeless> what I believe is happening is this
15:48 < lifeless> the scheduler is allocating the same weighed list of hosts for requests that happen close enough together
15:49 < lifeless> and I believe its able to do that because the target hosts (from select_destinations) need to actually hit the compute node manager and have
15:49 < lifeless> with rt.instance_claim(context, instance, limits):
15:49 < lifeless> happen in _build_and_run_instance
15:49 < lifeless> before the resource usage is assigned
15:49 < mikal> Is heat making 45 separate requests to the nova API?
15:49 < lifeless> eys
15:49 < lifeless> yes
15:49 < lifeless> thats the key difference
15:50 < lifeless> same flavour, same image
15:50 < openstackgerrit> Sam Morrison proposed a change to openstack/nova: Remove cell api overrides for lock and unlock https://review.openstack.org/89487
15:50 < mikal> And you have enough quota for these instances, right?
15:50 < lifeless> yes
15:51 < mikal> I'd have to dig deeper to have an answer, but it sure does seem worth filing a bug for
15:51 < lifeless> my theory is that there is enough time between select_destinations in the conductor, and _build_and_run_instance in compute for another request to come in the front door and be scheduled to the same host
15:51 < mikal> That seems possible to me
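A hypothetical sketch of the boot loop from the reproduction recipe above, written against the python-novaclient API of that era; the flavour and image names are placeholders, not values from the report. Constructing the client once means keystone is hit once and the token reused, so the five create calls land back to back:

    import os
    from novaclient import client

    # One client = one keystone auth; the cached token is reused, so the
    # boot requests below arrive at nova-api very close together.
    nova = client.Client("2",
                         os.environ["OS_USERNAME"],
                         os.environ["OS_PASSWORD"],
                         os.environ["OS_TENANT_NAME"],
                         os.environ["OS_AUTH_URL"])

    flavor = nova.flavors.find(name="baremetal")  # placeholder flavour name
    image = nova.images.find(name="test-image")   # placeholder image name

    for i in range(5):
        nova.servers.create(name="race-test-%d" % i,
                            image=image, flavor=flavor)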
[Yahoo-eng-team] [Bug 1341420] Re: gap between scheduler selection and claim causes spurious failures when the instance is the last one to fit
What you describe is fundamental to how nova works right now. We speculate in the scheduler, and if two requests race, we handle it with a reschedule. Nova specifically states that scheduling every last resource is out of scope; when you try to do that (which is often the use case for Ironic), you are likely to hit this race as you run out of capacity: https://github.com/openstack/nova/blob/master/doc/source/project_scope.rst#iaas-not-batch-processing

In the next few cycles we plan to move the claim process to the placement engine, which will eliminate most of these race-to-claim issues, and at that point things will be better for this kind of arrangement. Until then this is not a bug, because it is specifically how nova is designed to work.

** Changed in: nova
       Status: New => Invalid
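As a self-contained illustration of the speculate-then-reschedule behaviour described here (a toy model, not nova code): the scheduler only reads usage, so two requests arriving inside the selection-to-claim window both pick the last free host, and the loser's claim fails into a reschedule.

    # Toy model of the scheduler/claim race (hypothetical, not nova code).
    class Host:
        def __init__(self, name, capacity):
            self.name, self.capacity, self.claimed = name, capacity, 0

        def claim(self):
            # The compute manager's instance_claim happens here, long after
            # the scheduler made its choice from a stale usage snapshot.
            if self.claimed >= self.capacity:
                raise Exception("claim failed on %s -> reschedule" % self.name)
            self.claimed += 1

    def schedule(hosts):
        # The scheduler only reads usage; it reserves nothing.
        return next(h for h in hosts if h.claimed < h.capacity)

    hosts = [Host("ironic-node-1", capacity=1)]
    first = schedule(hosts)   # request A picks ironic-node-1
    second = schedule(hosts)  # request B arrives in the window, picks it too
    first.claim()             # A's claim succeeds on the compute node
    second.claim()            # raises: B is rescheduled (and may lose again)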
[Yahoo-eng-team] [Bug 1341420] Re: gap between scheduler selection and claim causes spurious failures when the instance is the last one to fit
The issue was reproduced in the ironic multinode job:
http://logs.openstack.org/75/427675/11/check/gate-tempest-dsvm-ironic-ipa-wholedisk-agent_ipmitool-tinyipa-multinode-ubuntu-xenial-nv/a66f08f/logs

Several instances were scheduled to the same ironic node - note that the two "Selected host" lines below come from two different requests (req-f8358730... and req-3ff28238...) yet pick the same (host, node) pair 92ms apart.

http://logs.openstack.org/75/427675/11/check/gate-tempest-dsvm-ironic-ipa-wholedisk-agent_ipmitool-tinyipa-multinode-ubuntu-xenial-nv/a66f08f/logs/screen-n-sch.txt.gz#_2017-02-03_18_46_35_112

2017-02-03 18:46:35.112 21443 DEBUG nova.scheduler.filter_scheduler [req-f8358730-fc5a-4f56-959c-af8742c8c3c2 tempest-ServersTestJSON-264305795 tempest-ServersTestJSON-264305795] Selected host: WeighedHost [host: (ubuntu-xenial-2-node-osic-cloud1-s3500-7106433, e85b2653-5563-4742-b23f-7ceb15b2032f) ram: 384MB disk: 10240MB io_ops: 0 instances: 0, weight: 2.0] _schedule /opt/stack/new/nova/nova/scheduler/filter_scheduler.py:127

http://logs.openstack.org/75/427675/11/check/gate-tempest-dsvm-ironic-ipa-wholedisk-agent_ipmitool-tinyipa-multinode-ubuntu-xenial-nv/a66f08f/logs/screen-n-sch.txt.gz#_2017-02-03_18_46_35_204

2017-02-03 18:46:35.204 21443 DEBUG nova.scheduler.filter_scheduler [req-3ff28238-f02b-4914-9ee1-dc07a15f6c9d tempest-ServerActionsTestJSON-1934209030 tempest-ServerActionsTestJSON-1934209030] Selected host: WeighedHost [host: (ubuntu-xenial-2-node-osic-cloud1-s3500-7106433, e85b2653-5563-4742-b23f-7ceb15b2032f) ram: 384MB disk: 10240MB io_ops: 0 instances: 0, weight: 2.0] _schedule /opt/stack/new/nova/nova/scheduler/filter_scheduler.py:127

2017-02-03 18:46:35.204 21443 DEBUG oslo_concurrency.lockutils [req-3ff28238-f02b-4914-9ee1-dc07a15f6c9d tempest-ServerActionsTestJSON-1934209030 tempest-ServerActionsTestJSON-1934209030] Lock "(u'ubuntu-xenial-2-node-osic-cloud1-s3500-7106433', u'e85b2653-5563-4742-b23f-7ceb15b2032f')" acquired by "nova.scheduler.host_manager._locked" :: waited 0.000s inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:270

** Changed in: nova
       Status: Invalid => New
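A quick, hypothetical helper for spotting this pattern in a (decompressed) screen-n-sch.txt: it groups "Selected host" lines by (host, node) and flags pairs chosen by more than one request id. Duplicates are only suspicious when close together in time, so treat this as a first-pass filter; the regex is written against the line format quoted above.

    import re
    import sys
    from collections import defaultdict

    # Matches the request id and the (host, node) pair in lines like:
    #   [req-...] Selected host: WeighedHost [host: (<host>, <node>) ram: ...
    PATTERN = re.compile(r"\[(req-[0-9a-f-]+) .*Selected host: WeighedHost "
                         r"\[host: \(([^,]+),\s*([^)]+)\)")

    selections = defaultdict(set)  # (host, node) -> request ids that chose it
    with open(sys.argv[1]) as log:
        for line in log:
            match = PATTERN.search(line)
            if match:
                req, host, node = match.groups()
                selections[(host, node)].add(req)

    for (host, node), reqs in sorted(selections.items()):
        if len(reqs) > 1:
            print("%s / %s selected by %d requests: %s"
                  % (host, node, len(reqs), ", ".join(sorted(reqs))))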
[Yahoo-eng-team] [Bug 1341420] Re: gap between scheduler selection and claim causes spurious failures when the instance is the last one to fit
That bug report is still accurate, but given that we are actively working towards new resource providers for the scheduler and discussing the implementation details, I would by far prefer to close this bug and leave a spec that better describes the problem.

** Changed in: nova
       Status: In Progress => Invalid
[Yahoo-eng-team] [Bug 1341420] Re: gap between scheduler selection and claim causes spurious failures when the instance is the last one to fit
** Also affects: tripleo
   Importance: Undecided
       Status: New
[Yahoo-eng-team] [Bug 1341420] Re: gap between scheduler selection and claim causes spurious failures when the instance is the last one to fit
This is not an Ironic bug - it is just triggered by using Ironic. This is a Nova scheduling bug (and a PITA to fix :().

** No longer affects: ironic
** Tags removed: ironic