Public bug reported:

The cellsv1 job has been failing nearly constantly over the last week or two due to a libvirt connection reset:
http://logs.openstack.org/36/536936/1/check/legacy-tempest-dsvm-cells/a9ff792/logs/libvirt/libvirtd.txt.gz#_2018-01-28_01_25_23_762

2018-01-28 01:25:23.762+0000: 3896: error : virKeepAliveTimerInternal:143 : internal error: connection closed due to keepalive timeout

http://logs.openstack.org/36/536936/1/check/legacy-tempest-dsvm-cells/a9ff792/logs/screen-n-cpu.txt.gz?level=TRACE#_2018-01-28_01_25_23_766

2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager [req-392410f9-c834-4bdc-a439-ac20476fe212 - -] Error updating resources for node ubuntu-xenial-inap-mtl01-0002208439.
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager Traceback (most recent call last):
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager   File "/opt/stack/new/nova/nova/compute/manager.py", line 6590, in update_available_resource_for_node
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager     rt.update_available_resource(context, nodename)
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager   File "/opt/stack/new/nova/nova/compute/resource_tracker.py", line 535, in update_available_resource
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager     resources = self.driver.get_available_resource(nodename)
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager   File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 5675, in get_available_resource
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager     data["vcpus_used"] = self._get_vcpu_used()
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager   File "/opt/stack/new/nova/nova/virt/libvirt/driver.py", line 5316, in _get_vcpu_used
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager     for guest in self._host.list_guests():
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager   File "/opt/stack/new/nova/nova/virt/libvirt/host.py", line 573, in list_guests
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager     only_running=only_running, only_guests=only_guests)]
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager   File "/opt/stack/new/nova/nova/virt/libvirt/host.py", line 593, in list_instance_domains
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager     alldoms = self.get_connection().listAllDomains(flags)
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager     result = proxy_call(self._autowrap, f, *args, **kwargs)
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager     rv = execute(f, *args, **kwargs)
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager     six.reraise(c, e, tb)
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager     rv = meth(*args, **kwargs)
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager   File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 4953, in listAllDomains
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager     raise libvirtError("virConnectListAllDomains() failed", conn=self)
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager libvirtError: Cannot recv data: Connection reset by peer
2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager
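The two log entries fit together: the libvirtd message means the daemon stopped getting keepalive responses from its client and closed the connection (governed by the keepalive_interval / keepalive_count settings in libvirtd.conf), and the next libvirt call nova-compute makes on that dead connection then fails with "Cannot recv data: Connection reset by peer", which here is the listAllDomains() call in the resource tracker traceback. As a minimal illustration with the libvirt python binding (a standalone sketch, not nova's connection-handling code):

    # Illustrative standalone sketch -- not nova's connection code.
    # Shows how a client registers a keepalive and how a connection
    # that libvirtd has already dropped surfaces as libvirtError on
    # the next API call.
    import libvirt

    # Keepalive needs a registered event loop implementation; a real
    # client must also run it (virEventRunDefaultImpl() in a loop, as
    # nova does in a dedicated thread) so ping replies actually go out.
    libvirt.virEventRegisterDefaultImpl()

    conn = libvirt.open('qemu:///system')

    # Client-side keepalive: ping every 5 seconds, give up after 5
    # unanswered pings. Illustrative values; the server side has its
    # own keepalive_interval / keepalive_count knobs in libvirtd.conf.
    conn.setKeepAlive(5, 5)

    try:
        # Same API the resource tracker ends up calling through
        # Host.list_instance_domains() in the traceback above.
        print([dom.name() for dom in conn.listAllDomains(0)])
    except libvirt.libvirtError as exc:
        # If libvirtd closed the connection (e.g. keepalive timeout),
        # this is where "Cannot recv data: Connection reset by peer"
        # appears.
        print('libvirt connection lost: %s' % exc)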
It seems to be totally random. I'm not sure what is different about this job running on stable vs master, but it doesn't appear to be an issue on master:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22libvirtError%3A%20Cannot%20recv%20data%3A%20Connection%20reset%20by%20peer%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22%20AND%20build_name%3A%5C%22legacy-tempest-dsvm-cells%5C%22&from=7d

** Affects: nova
     Importance: Undecided
         Status: New

** Tags: cells libvirt testing

https://bugs.launchpad.net/bugs/1745838

Title:
  legacy-tempest-dsvm-cells constantly failing on stable pike and ocata
  due to libvirt connection reset
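For readability, the logstash query behind that link is:

    message:"libvirtError: Cannot recv data: Connection reset by peer"
    AND tags:"screen-n-cpu.txt"
    AND build_name:"legacy-tempest-dsvm-cells"

restricted to hits from the last 7 days (from=7d).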
"/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call 2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager rv = execute(f, *args, **kwargs) 2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute 2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager six.reraise(c, e, tb) 2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker 2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager rv = meth(*args, **kwargs) 2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 4953, in listAllDomains 2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager raise libvirtError("virConnectListAllDomains() failed", conn=self) 2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager libvirtError: Cannot recv data: Connection reset by peer 2018-01-28 01:25:23.766 16360 ERROR nova.compute.manager It seems to be totally random. I'm not sure what is different about this job running on stable vs master, but it doesn't appear to be an issue on master: http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22libvirtError%3A%20Cannot%20recv%20data%3A%20Connection%20reset%20by%20peer%5C%22%20AND%20tags%3A%5C%22screen-n-cpu.txt%5C%22%20AND%20build_name%3A%5C %22legacy-tempest-dsvm-cells%5C%22&from=7d To manage notifications about this bug go to: https://bugs.launchpad.net/nova/+bug/1745838/+subscriptions -- Mailing list: https://launchpad.net/~yahoo-eng-team Post to : yahoo-eng-team@lists.launchpad.net Unsubscribe : https://launchpad.net/~yahoo-eng-team More help : https://help.launchpad.net/ListHelp