You can send SIGHUP to Neutron to make it reload its configuration without a restart. Enable debug in the configuration file and "kill -s SIGHUP <neutron>" will get you full debug logging on the fly.
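For example, on a kolla-ansible deployment it could look like this (a minimal sketch; the container name, config path, and PID handling are assumptions, adjust to your layout):

  # Enable debug logging in neutron.conf ([DEFAULT] section):
  #   debug = True
  # Then signal the running neutron-server to reload its config.
  # In a kolla-ansible container (service typically runs as PID 1):
  docker exec neutron_server kill -HUP 1
  # On a plain host install, signal the oldest neutron-server process:
  kill -s SIGHUP "$(pgrep -of neutron-server)"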
Tony

________________________________________
From: discuss <ovs-discuss-boun...@openvswitch.org> on behalf of Christian Stelter <refu...@last-refuge.net>
Sent: January 3, 2022 08:54 AM
To: ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] Timeout waiting for network-vif-plugged for instance with vm_state building and task_state spawning

Hi!

A small update after restarting the Neutron server (the problem has been gone since the restart last week, as in your case) and reading the posts/bug reports you sent.

We observed an additional phenomenon when running into the problem, one that makes us wonder whether OVN may also be part of it and that isn't mentioned in the posts/bugs: Neutron gets the port up/network-vif-plugged report from OVN twice, in at least one case with a delay of more than 5 hours.

We will have a closer look when this problem reoccurs and verify whether the multiple up/plugged reports are reproduced as well.

Kind regards,
Christian Stelter

On Wed, Dec 29, 2021 at 11:28 PM Tony Liu <tonyliu0...@hotmail.com> wrote:

I've been hitting this issue from time to time. There are discussions and bugs for it:

http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025941.html
http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025892.html
https://bugs.launchpad.net/heat/+bug/1694371
https://bugs.launchpad.net/nova/+bug/1944779

I don't think it's about OVN/OVS, because the port is up in the OVN NB, and restarting Neutron makes it work again. I suspect Neutron either doesn't pick the event up or doesn't send it out to Nova compute.

When I hit this issue I enabled debugging for Neutron and restarted it, but the restart made the problem go away. Keeping debug-level logging on until the next occurrence would dump far too much output, so it would be good for Neutron to support enabling debug logging on the fly. If it happens to you more frequently, you can try enabling debug and see whether that captures anything helpful.

Tony
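(For reference, a quick way to verify "the port is up in OVN NB" for a given port is with the OVN NB/SB CLIs; a minimal sketch, using the port UUID from the logs quoted below:)

  PORT=c641bae9-edcc-4e64-93ec-2dcbf5010009
  # "up" flag on the logical switch port in the northbound DB:
  ovn-nbctl --bare --columns=up list Logical_Switch_Port "$PORT"
  # chassis binding in the southbound DB, set once ovn-controller
  # claims the port on the compute node:
  ovn-sbctl --bare --columns=chassis find Port_Binding logical_port="$PORT"

If the port shows up=true and has a chassis binding, but Nova never receives network-vif-plugged, the stall would be in the Neutron-to-Nova notification path rather than in OVN itself.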
________________________________________
From: discuss <ovs-discuss-boun...@openvswitch.org> on behalf of Christian Stelter <refu...@last-refuge.net>
Sent: December 29, 2021 09:51 AM
To: ovs-discuss@openvswitch.org
Subject: [ovs-discuss] Timeout waiting for network-vif-plugged for instance with vm_state building and task_state spawning

Hi!

In our well-aged OpenStack installation (kolla-ansible deployed, Victoria release) we observe strange timeouts while creating new instances. It seems that the feedback from the ovn-controller(?) doesn't reach the nova-compute/libvirt process within 300 s. This does not happen all the time, but 1/3 to 1/5 of the launches have problems.

This is what we see in the logs:

--- nova-compute ---
2021-12-29 07:27:06.972 7 INFO nova.compute.claims [req-57321447-272f-4820-813d-925a43d69462 0b16e2645828414d86f094c2a7a693eb 66fcb198228e4d9d97f09dc7ce1d5130 - default default] [instance: aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] Claim successful on node compute07
2021-12-29 07:27:07.757 7 INFO nova.virt.libvirt.driver [req-57321447-272f-4820-813d-925a43d69462 0b16e2645828414d86f094c2a7a693eb 66fcb198228e4d9d97f09dc7ce1d5130 - default default] [instance: aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] Creating image
2021-12-29 07:27:12.514 7 INFO nova.compute.manager [req-01948206-8ff3-404c-b71f-86ed7d1791d4 - - - - -] [instance: aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] VM Started (Lifecycle Event)
2021-12-29 07:27:12.559 7 INFO nova.compute.manager [req-01948206-8ff3-404c-b71f-86ed7d1791d4 - - - - -] [instance: aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] VM Paused (Lifecycle Event)
2021-12-29 07:27:12.649 7 INFO nova.compute.manager [req-01948206-8ff3-404c-b71f-86ed7d1791d4 - - - - -] [instance: aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] During sync_power_state the instance has a pending task (spawning). Skip.
2021-12-29 07:32:12.518 7 WARNING nova.virt.libvirt.driver [req-57321447-272f-4820-813d-925a43d69462 0b16e2645828414d86f094c2a7a693eb 66fcb198228e4d9d97f09dc7ce1d5130 - default default] [instance: aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] Timeout waiting for [('network-vif-plugged', 'c641bae9-edcc-4e64-93ec-2dcbf5010009')] for instance with vm_state building and task_state spawning: eventlet.timeout.Timeout: 300 seconds
2021-12-29 07:32:12.522 7 INFO nova.compute.manager [req-01948206-8ff3-404c-b71f-86ed7d1791d4 - - - - -] [instance: aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] VM Resumed (Lifecycle Event)
2021-12-29 07:32:12.526 7 INFO nova.virt.libvirt.driver [-] [instance: aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] Instance spawned successfully.
2021-12-29 07:32:12.527 7 INFO nova.compute.manager [req-57321447-272f-4820-813d-925a43d69462 0b16e2645828414d86f094c2a7a693eb 66fcb198228e4d9d97f09dc7ce1d5130 - default default] [instance: aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] Took 304.77 seconds to spawn the instance on the hypervisor.
2021-12-29 07:32:12.613 7 INFO nova.compute.manager [req-01948206-8ff3-404c-b71f-86ed7d1791d4 - - - - -] [instance: aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] During sync_power_state the instance has a pending task (spawning). Skip.
2021-12-29 07:32:12.632 7 INFO nova.compute.manager [req-57321447-272f-4820-813d-925a43d69462 0b16e2645828414d86f094c2a7a693eb 66fcb198228e4d9d97f09dc7ce1d5130 - default default] [instance: aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] Took 305.70 seconds to build instance.
--- nova-compute ---
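(For context: the 300 second wait, and the instance still spawning once it expires, are governed by two nova.conf options on the compute node. A sketch of the relevant section; the exact values in this deployment are an assumption, inferred from the behaviour above:)

  [DEFAULT]
  # How long nova-compute waits for Neutron's network-vif-plugged
  # external event before giving up; 300 is the upstream default and
  # matches the timeout in the log above.
  vif_plugging_timeout = 300
  # Upstream this defaults to True, which would fail the boot on
  # timeout; the instance above spawned anyway, which suggests it is
  # set to False here.
  vif_plugging_is_fatal = False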
--- neutron-server ---
2021-12-29 07:27:09.115 23 INFO neutron.db.ovn_revision_numbers_db [req-cb418ddf-a068-4511-9fed-f0914945c76d 0b16e2645828414d86f094c2a7a693eb 66fcb198228e4d9d97f09dc7ce1d5130 - default default] Successfully bumped revision number for resource c641bae9-edcc-4e64-93ec-2dcbf5010009 (type: ports) to 1
2021-12-29 07:27:10.054 23 INFO neutron.db.ovn_revision_numbers_db [req-8e3f0f54-8164-4d6d-acab-3be912b0afb6 217d8068ede5469ebfbebaf3e2242840 8775aa7d41654c84b8b17b3a570adf4d - default default] Successfully bumped revision number for resource c641bae9-edcc-4e64-93ec-2dcbf5010009 (type: ports) to 2
2021-12-29 07:27:10.393 23 INFO neutron.db.ovn_revision_numbers_db [req-8e3f0f54-8164-4d6d-acab-3be912b0afb6 217d8068ede5469ebfbebaf3e2242840 8775aa7d41654c84b8b17b3a570adf4d - default default] Successfully bumped revision number for resource c641bae9-edcc-4e64-93ec-2dcbf5010009 (type: ports) to 2
2021-12-29 07:27:10.415 23 INFO neutron.wsgi [req-8e3f0f54-8164-4d6d-acab-3be912b0afb6 217d8068ede5469ebfbebaf3e2242840 8775aa7d41654c84b8b17b3a570adf4d - default default] 172.16.4.75,172.16.4.74 "PUT /v2.0/ports/c641bae9-edcc-4e64-93ec-2dcbf5010009 HTTP/1.1" status: 200 len: 1258 time: 1.2485251
2021-12-29 07:27:10.556 23 INFO neutron.notifiers.nova [-] Nova event response: {'name': 'network-changed', 'server_uuid': 'aef0a1b3-99d6-43e2-89db-b7bd4e0516d9', 'tag': 'c641bae9-edcc-4e64-93ec-2dcbf5010009', 'status': 'completed', 'code': 200}
--- neutron-server ---

--- ovn-controller ---
2021-12-29T07:27:11.448Z|00363|binding|INFO|Claiming lport c641bae9-edcc-4e64-93ec-2dcbf5010009 for this chassis.
2021-12-29T07:27:11.448Z|00364|binding|INFO|c641bae9-edcc-4e64-93ec-2dcbf5010009: Claiming fa:16:3e:62:2d:8a 192.168.0.237
--- ovn-controller ---

In one case we found an "OVN reports status up for port" log entry more than 5 hours after the initial creation request.

Does anyone have an idea what could be causing this? We are not sure whether OVN is the culprit or something in between. I didn't find any useful hints on the all-knowing trash heap, but maybe I just used the wrong keywords.

Versions used:
ovn-controller 20.03.2
Open vSwitch Library 2.13.3
OpenFlow versions 0x4:0x4
OpenStack Nova 22.2.1

Thanks in advance!

Kind regards,
Christian Stelter
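(One way to narrow down "OVN or something in between" the next time this happens: poll the Neutron port status and the OVN NB up flag side by side. A rough sketch; it assumes the openstack CLI and ovn-nbctl are available on a controller node, and <port-uuid> stands for the affected port:)

  PORT=<port-uuid>   # hypothetical placeholder for the affected port
  while true; do
      date -u
      printf 'neutron port status: '
      openstack port show "$PORT" -c status -f value
      printf 'ovn nb up flag:      '
      ovn-nbctl --bare --columns=up list Logical_Switch_Port "$PORT"
      sleep 10
  done

If OVN flips up to true promptly but the Neutron port stays DOWN, or goes ACTIVE without Nova ever receiving network-vif-plugged, the delay is on the Neutron/Nova side, which would match Tony's observation above.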