[Yahoo-eng-team] [Bug 1838694] Re: glanceclient doesn't cleanup session it creates if one is not provided
https://review.opendev.org/#/c/674133/

** Project changed: glance => python-glanceclient

** Changed in: python-glanceclient
     Assignee: (unassigned) => Alex Schultz (alex-schultz)

** Changed in: python-glanceclient
       Status: New => In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1838694

Title:
  glanceclient doesn't cleanup session it creates if one is not provided

Status in Glance Client:
  In Progress

Bug description:
  If a session object is not provided to the glance client, the HTTPClient
  defined in glanceclient.common.http will create a session object. This
  session object leaks open connections because it is not properly closed
  when the object is no longer needed. This leads to a ResourceWarning
  about an unclosed socket:

    sys:1: ResourceWarning: unclosed

  Example code:

    $ cat g.py
    #!/usr/bin/python3 -Wd
    import glanceclient.common.http as h
    client = h.get_http_client(endpoint='https://192.168.24.2:13292',
                               token='',
                               cacert='/etc/pki/ca-trust/source/anchors/cm-local-ca.pem',
                               insecure=False)
    print(client.get('/v2/images'))

  Results in:

    $ ./g.py
    /usr/lib64/python3.6/importlib/_bootstrap_external.py:426: ImportWarning: Not importing directory /usr/lib/python3.6/site-packages/repoze: missing __init__
      _warnings.warn(msg.format(portions[0]), ImportWarning)
    /usr/lib64/python3.6/importlib/_bootstrap_external.py:426: ImportWarning: Not importing directory /usr/lib/python3.6/site-packages/paste: missing __init__
      _warnings.warn(msg.format(portions[0]), ImportWarning)
    /usr/lib/python3.6/site-packages/pytz/__init__.py:499: ResourceWarning: unclosed file <_io.TextIOWrapper name='/usr/share/zoneinfo/zone.tab' mode='r' encoding='UTF-8'>
      for l in open(os.path.join(_tzinfo_dir, 'zone.tab'))
    /usr/lib/python3.6/site-packages/eventlet/patcher.py:1: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
      import imp
    (, {'images': [{}], 'first': '/v2/images', 'schema': '/v2/schemas/images'})
    sys:1: ResourceWarning: unclosed

  This can be mitigated by adding a __del__ function to
  glanceclient.common.http.HTTPClient that closes the session.

To manage notifications about this bug go to:
https://bugs.launchpad.net/python-glanceclient/+bug/1838694/+subscriptions

-- 
Mailing list: https://launchpad.net/~yahoo-eng-team
Post to     : yahoo-eng-team@lists.launchpad.net
Unsubscribe : https://launchpad.net/~yahoo-eng-team
More help   : https://help.launchpad.net/ListHelp
[Yahoo-eng-team] [Bug 1838694] [NEW] glanceclient doesn't cleanup session it creates if one is not provided
Public bug reported:

If a session object is not provided to the glance client, the HTTPClient
defined in glanceclient.common.http will create a session object. This
session object leaks open connections because it is not properly closed
when the object is no longer needed. This leads to a ResourceWarning
about an unclosed socket:

  sys:1: ResourceWarning: unclosed

Example code:

  $ cat g.py
  #!/usr/bin/python3 -Wd
  import glanceclient.common.http as h
  client = h.get_http_client(endpoint='https://192.168.24.2:13292',
                             token='',
                             cacert='/etc/pki/ca-trust/source/anchors/cm-local-ca.pem',
                             insecure=False)
  print(client.get('/v2/images'))

Results in:

  $ ./g.py
  /usr/lib64/python3.6/importlib/_bootstrap_external.py:426: ImportWarning: Not importing directory /usr/lib/python3.6/site-packages/repoze: missing __init__
    _warnings.warn(msg.format(portions[0]), ImportWarning)
  /usr/lib64/python3.6/importlib/_bootstrap_external.py:426: ImportWarning: Not importing directory /usr/lib/python3.6/site-packages/paste: missing __init__
    _warnings.warn(msg.format(portions[0]), ImportWarning)
  /usr/lib/python3.6/site-packages/pytz/__init__.py:499: ResourceWarning: unclosed file <_io.TextIOWrapper name='/usr/share/zoneinfo/zone.tab' mode='r' encoding='UTF-8'>
    for l in open(os.path.join(_tzinfo_dir, 'zone.tab'))
  /usr/lib/python3.6/site-packages/eventlet/patcher.py:1: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
    import imp
  (, {'images': [{}], 'first': '/v2/images', 'schema': '/v2/schemas/images'})
  sys:1: ResourceWarning: unclosed

This can be mitigated by adding a __del__ function to
glanceclient.common.http.HTTPClient that closes the session.

** Affects: glance
   Importance: Undecided
       Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to Glance.
https://bugs.launchpad.net/bugs/1838694

Title:
  glanceclient doesn't cleanup session it creates if one is not provided

Status in Glance:
  New
To manage notifications about this bug go to:
https://bugs.launchpad.net/glance/+bug/1838694/+subscriptions
[Yahoo-eng-team] [Bug 1808951] Re: python3 + Fedora + SSL + wsgi nova deployment, nova api returns RecursionError: maximum recursion depth exceeded while calling a Python object
Adding tripleo because this is affecting our fedora28 containers when
deployed via an undercloud with ssl enabled.

** Also affects: tripleo
   Importance: Undecided
       Status: New

** Changed in: tripleo
       Status: New => Incomplete

** Changed in: tripleo
       Status: Incomplete => Triaged

** Changed in: tripleo
   Importance: Undecided => High

** Changed in: tripleo
    Milestone: None => stein-3

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1808951

Title:
  python3 + Fedora + SSL + wsgi nova deployment, nova api returns
  RecursionError: maximum recursion depth exceeded while calling a Python
  object

Status in OpenStack Compute (nova):
  New

Status in tripleo:
  Triaged

Bug description:
  Description:
  While testing python3 with Fedora in [1], I found an issue while
  running nova-api behind wsgi. It fails with the traceback below:

  2018-12-18 07:41:55.364 26870 INFO nova.api.openstack.requestlog [req-e1af4808-ecd8-47c7-9568-a5dd9691c2c9 - - - - -] 127.0.0.1 "GET /v2.1/servers/detail?all_tenants=True=True" status: 500 len: 0 microversion: - time: 0.007297
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack [req-e1af4808-ecd8-47c7-9568-a5dd9691c2c9 - - - - -] Caught error: maximum recursion depth exceeded while calling a Python object: RecursionError: maximum recursion depth exceeded while calling a Python object
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack Traceback (most recent call last):
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/nova/api/openstack/__init__.py", line 94, in __call__
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     return req.get_response(self.application)
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/webob/request.py", line 1313, in send
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     application, catch_exc_info=False)
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/webob/request.py", line 1277, in call_application
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     app_iter = application(self.environ, start_response)
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/webob/dec.py", line 129, in __call__
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     resp = self.call_func(req, *args, **kw)
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/webob/dec.py", line 193, in call_func
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     return self.func(req, *args, **kwargs)
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/nova/api/openstack/requestlog.py", line 92, in __call__
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     self._log_req(req, res, start)
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     self.force_reraise()
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     six.reraise(self.type_, self.value, self.tb)
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     raise value
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/nova/api/openstack/requestlog.py", line 87, in __call__
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     res = req.get_response(self.application)
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/webob/request.py", line 1313, in send
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     application, catch_exc_info=False)
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/webob/request.py", line 1277, in call_application
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     app_iter = application(self.environ, start_response)
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/webob/dec.py", line 143, in __call__
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     return resp(environ, start_response)
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack   File "/usr/lib/python3.6/site-packages/webob/dec.py", line 129, in __call__
  2018-12-18 07:41:55.364 26870 ERROR nova.api.openstack     resp = self.call_func(req, *args, **kw)
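The message is cut off above, but the repeating webob frames show the request being dispatched back into the same middleware stack until the interpreter's recursion limit is hit. Purely as a generic illustration of how a WSGI stack can recurse like this (this is not the nova code, and the class name is made up), a middleware whose wrapped application ends up pointing back at the middleware itself never bottoms out:

```python
class RequestLogMiddleware:
    """Toy WSGI middleware; illustrative only, not nova's requestlog."""

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        # Delegate to the wrapped app. If self.application is (or loops
        # back to) this middleware, each request re-enters __call__
        # until Python raises RecursionError.
        return self.application(environ, start_response)


def app(environ, start_response):
    start_response('200 OK', [])
    return [b'ok']


mw = RequestLogMiddleware(app)
mw.application = mw  # simulate an accidental self-reference in the stack

try:
    mw({}, lambda status, headers: None)
except RecursionError:
    # Surfaces to the client as a 500, like the log above.
    print("RecursionError: maximum recursion depth exceeded")
```

In a real deployment the loop is rarely this direct; it typically comes from mis-ordered or doubled-up wrapping of the application, which is why the traceback cycles through the same webob/dec.py frames.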
[Yahoo-eng-team] [Bug 1752896] Re: novncproxy in Newton uses outdated novnc 0.5 which breaks Nova noVNC consoles
From a tripleo standpoint, it's a packaging issue in RDO. Additionally,
Newton is basically EOL, so closing that out as Won't Fix.

** Changed in: tripleo
       Status: New => Won't Fix

** Changed in: tripleo
   Importance: Undecided => Medium

** Changed in: tripleo
    Milestone: None => rocky-1

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1752896

Title:
  novncproxy in Newton uses outdated novnc 0.5 which breaks Nova noVNC
  consoles

Status in OpenStack Compute (nova):
  In Progress

Status in tripleo:
  Won't Fix

Bug description:
  Delorean Newton (CentOS 7) ships with noVNC 0.5.2 in the Overcloud
  images. Even building an Overcloud image (DIB) produces an image with
  noVNC 0.5.2. The problem seems to be that CentOS 7 does not ship
  anything newer than 0.5.2. However, Red Hat Enterprise Linux 7 does
  indeed ship noVNC 0.6. In any case, Nova noVNC consoles in Newton don't
  work with noVNC 0.5.2. My workaround was to customize the Overcloud
  base image and replace the 0.5.2 RPM with a 0.6.2 RPM that I downloaded
  from a CentOS CI repository.

  Steps to reproduce
  ==================
  Follow the instructions from
  https://docs.openstack.org/tripleo-docs/latest/install/installation/installing.html
  to install an OpenStack Undercloud using Newton, and either download
  the Overcloud base images from
  https://images.rdoproject.org/newton/delorean/consistent/testing/ or
  build them yourself directly from the Undercloud. In either case, the
  Overcloud base image ships with noVNC 0.5.2-1 instead of 0.6.*.

  Expected result
  ===============
  A newer version of noVNC that does not break the Nova noVNC console.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1752896/+subscriptions
[Yahoo-eng-team] [Bug 1341420] Re: gap between scheduler selection and claim causes spurious failures when the instance is the last one to fit
This is a really old bug and I don't think it applies to tripleo anymore
(if it ever did). Setting to invalid for tripleo.

** Changed in: tripleo
       Status: New => Invalid

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1341420

Title:
  gap between scheduler selection and claim causes spurious failures when
  the instance is the last one to fit

Status in OpenStack Compute (nova):
  Invalid

Status in tripleo:
  Invalid

Bug description:
  There is a race between the scheduler in select_destinations, which
  selects a set of hosts, and the nova compute manager, which claims
  resources on those hosts when building the instance. The race is
  particularly noticeable with Ironic, where every request will consume a
  full host, but it can turn up on libvirt etc. too. Multiple schedulers
  will likely exacerbate this too, unless they are on a version of Python
  with randomised dictionary ordering, in which case they will make it
  better :).

  I've put https://review.openstack.org/106677 up to remove a comment
  which dates from before we introduced this race.

  One mitigating aspect: the filter scheduler's _schedule method attempts
  to randomly select hosts to avoid returning the same host in repeated
  requests, but the default minimum set it selects from is size 1 - so
  when heat requests a single instance, the same candidate is chosen
  every time. Setting that number higher can avoid all concurrent
  requests hitting the same host, but it will still be a race, and still
  likely to fail fairly hard in near-capacity situations (e.g. deploying
  all machines in a cluster with Ironic and Heat).

  Folk wanting to reproduce this: take a decent size cloud - e.g. 5 or 10
  hypervisor hosts (KVM is fine). Deploy up to 1 VM short of capacity on
  each hypervisor. Then deploy a bunch of VMs one at a time but very
  close together - e.g.
  use the python API to get cached keystone credentials, and boot 5 in a
  loop. If using Ironic you will want https://review.openstack.org/106676
  to let you see which host is being returned from the selection.

  Possible fixes:
  - have the scheduler be a bit smarter about returning hosts - e.g.
    track destination selection counts since the last refresh and weight
    hosts by that count as well
  - reinstate actioning claims into the scheduler, allowing the audit to
    correct any claimed-but-not-started resource counts asynchronously
  - special case the retry behaviour if there are lots of resources
    available elsewhere in the cluster.

  Stats wise, I just tested a 29 instance deployment with Ironic and a
  heat stack, with 45 machines to deploy onto (so 45 hosts in the
  scheduler set) and 4 failed with this race - which means they
  rescheduled and failed 3 times each - or 12 cases of scheduler racing
  *at minimum*.

  background chat:

  15:43 < lifeless> mikal: around? I need to sanity check something
  15:44 < lifeless> ulp, nope, am sure of it. filing a bug.
  15:45 < mikal> lifeless: ok
  15:46 < lifeless> mikal: oh, you're here, I will run it past you :)
  15:46 < lifeless> mikal: if you have ~5m
  15:46 < mikal> Sure
  15:46 < lifeless> so, symptoms
  15:46 < lifeless> nova boot <...> --num-instances 45 -> works fairly reliably. Some minor timeout related things to fix but nothing dramatic.
  15:47 < lifeless> heat create-stack <...> with a stack with 45 instances in it -> about 50% of instances fail to come up
  15:47 < lifeless> this is with Ironic
  15:47 < mikal> Sure
  15:47 < lifeless> the failure on all the instances is the retry-three-times failure-of-death
  15:47 < lifeless> what I believe is happening is this
  15:48 < lifeless> the scheduler is allocating the same weighed list of hosts for requests that happen close enough together
  15:49 < lifeless> and I believe its able to do that because the target hosts (from select_destinations) need to actually hit the compute node manager and have
  15:49 < lifeless> with rt.instance_claim(context, instance, limits):
  15:49 < lifeless> happen in _build_and_run_instance
  15:49 < lifeless> before the resource usage is assigned
  15:49 < mikal> Is heat making 45 separate requests to the nova API?
  15:49 < lifeless> eys
  15:49 < lifeless> yes
  15:49 < lifeless> thats the key difference
  15:50 < lifeless> same flavour, same image
  15:50 < openstackgerrit> Sam Morrison proposed a change to openstack/nova: Remove cell api overrides for lock and unlock  https://review.openstack.org/89487
  15:50 < mikal> And you have enough quota for these instances, right?
  15:50 < lifeless> yes
  15:51 < mikal> I'd have to dig deeper to have an answer, but it sure does seem worth filing a bug for
  15:51 < lifeless> my theory is that there is enough time between select_destinations in the conductor, and _build_and_run_instance in compute for another request to come in
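The select-then-claim race described above can be sketched in miniature. This is a toy model, not nova code: the host, `select_destination`, and `claim` functions stand in for `select_destinations` in the scheduler and `rt.instance_claim` in `_build_and_run_instance`. The essential point is that selection reads free capacity without reserving it, so two selections that run before either claim both pick the same last slot:

```python
class Host:
    """Toy hypervisor/Ironic node with a fixed number of instance slots."""
    def __init__(self, name, free_slots):
        self.name = name
        self.free_slots = free_slots


def select_destination(hosts):
    # Scheduler view: pick the host with the most free capacity,
    # WITHOUT reserving anything (as select_destinations does).
    return max(hosts, key=lambda h: h.free_slots)


def claim(host):
    # Compute-manager view: the claim only happens (and can only fail)
    # later, at build time.
    if host.free_slots <= 0:
        raise RuntimeError("claim failed on %s -> reschedule" % host.name)
    host.free_slots -= 1


hosts = [Host("ironic-node-a", 1), Host("ironic-node-b", 0)]

# Two requests arrive close together: both selections run before either
# claim, so both choose ironic-node-a for its single remaining slot.
first = select_destination(hosts)
second = select_destination(hosts)

claim(first)        # succeeds and consumes the last slot
try:
    claim(second)   # loses the race: the slot is already gone
except RuntimeError as e:
    print(e)        # the spurious failure that triggers a reschedule
```

With Ironic every instance fills a whole node, so near capacity almost every concurrent pair of requests collides like this, which matches the failure rates quoted in the chat log.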
[Yahoo-eng-team] [Bug 1649341] Re: Undercloud upgrade fails with "Cell mappings are not created, but required for Ocata"
** Changed in: tripleo
       Status: Fix Released => In Progress

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1649341

Title:
  Undercloud upgrade fails with "Cell mappings are not created, but
  required for Ocata"

Status in OpenStack Compute (nova):
  Fix Released

Status in puppet-nova:
  Fix Released

Status in tripleo:
  In Progress

Bug description:
  Trying to upgrade with recent trunk nova and puppet-nova gives this
  error:

    Notice: /Stage[main]/Nova::Db::Sync_api/Exec[nova-db-sync-api]/returns: error: Cell mappings are not created, but required for Ocata. Please run nova-manage db simple_cell_setup before continuing.
    Error: /usr/bin/nova-manage api_db sync returned 1 instead of one of [0]
    Error: /Stage[main]/Nova::Db::Sync_api/Exec[nova-db-sync-api]/returns: change from notrun to 0 failed: /usr/bin/nova-manage api_db sync returned 1 instead of one of [0]

  Debugging manually gives:

    $ sudo /usr/bin/nova-manage api_db sync
    error: Cell mappings are not created, but required for Ocata. Please run nova-manage db simple_cell_setup before continuing.

  but...

    $ sudo nova-manage db simple_cell_setup
    usage: nova-manage db [-h]
                          {archive_deleted_rows,null_instance_uuid_scan,online_data_migrations,sync,version}
                          ...
    nova-manage db: error: argument action: invalid choice: 'simple_cell_setup' (choose from 'archive_deleted_rows', 'null_instance_uuid_scan', 'online_data_migrations', 'sync', 'version')

  I tried adding openstack-nova* to the delorean-current whitelist, but
  with the latest nova packages there still appears to be this mismatch.

  [stack@instack /]$ rpm -qa | grep nova
  openstack-nova-conductor-15.0.0-0.20161212155146.909410c.el7.centos.noarch
  python-nova-15.0.0-0.20161212155146.909410c.el7.centos.noarch
  openstack-nova-scheduler-15.0.0-0.20161212155146.909410c.el7.centos.noarch
  puppet-nova-10.0.0-0.20161211003757.09b9f7b.el7.centos.noarch
  python2-novaclient-6.0.0-0.20161003181629.25117fa.el7.centos.noarch
  openstack-nova-api-15.0.0-0.20161212155146.909410c.el7.centos.noarch
  openstack-nova-cert-15.0.0-0.20161212155146.909410c.el7.centos.noarch
  openstack-nova-common-15.0.0-0.20161212155146.909410c.el7.centos.noarch
  openstack-nova-compute-15.0.0-0.20161212155146.909410c.el7.centos.noarch

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1649341/+subscriptions
[Yahoo-eng-team] [Bug 1624554] [NEW] database deadlock during tempest run
Public bug reported:

During our Puppet OpenStack CI testing with tempest, we received an
error indicating a DB deadlock in the nova api:

http://logs.openstack.org/78/370178/1/check/gate-puppet-openstack-integration-3-scenario001-tempest-ubuntu-xenial/8c9eb76/console.html#_2016-09-16_19_28_57_526498

Attached are the nova configurations and nova logs for this run.

** Affects: nova
   Importance: Undecided
       Status: New

** Attachment added: "nova-deadlock.tar"
   https://bugs.launchpad.net/bugs/1624554/+attachment/4742312/+files/nova-deadlock.tar

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1624554

Title:
  database deadlock during tempest run

Status in OpenStack Compute (nova):
  New

Bug description:
  During our Puppet OpenStack CI testing with tempest, we received an
  error indicating a DB deadlock in the nova api:

  http://logs.openstack.org/78/370178/1/check/gate-puppet-openstack-integration-3-scenario001-tempest-ubuntu-xenial/8c9eb76/console.html#_2016-09-16_19_28_57_526498

  Attached are the nova configurations and nova logs for this run.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1624554/+subscriptions
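Deadlocks like the one above are normally handled by retrying the whole transaction; OpenStack services use oslo.db's wrap_db_retry decorator for this. As a dependency-free sketch of the same retry-on-deadlock idea (the exception class and function names here are illustrative, and real nova code should use oslo_db.api.wrap_db_retry rather than this):

```python
import functools
import time


class DBDeadlock(Exception):
    """Stand-in for oslo_db.exception.DBDeadlock; illustrative only."""


def retry_on_deadlock(max_retries=5, delay=0.0):
    """Re-run a DB operation when the database picks this session as the
    deadlock victim -- a sketch of what oslo.db's wrap_db_retry does."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for _ in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    time.sleep(delay)  # real code backs off exponentially
            # Final attempt: let any deadlock propagate to the caller.
            return func(*args, **kwargs)
        return wrapper
    return decorator


calls = {"n": 0}


@retry_on_deadlock(max_retries=3)
def update_instance_row():
    # Fails twice with a deadlock, then succeeds -- mimicking two
    # transactions contending for the same rows during a tempest run.
    calls["n"] += 1
    if calls["n"] < 3:
        raise DBDeadlock()
    return "updated"
```

Retrying is only safe when the decorated function re-runs the entire transaction from the top, which is why the decorator sits on the whole DB API call rather than on an individual statement.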