Re: [ovs-discuss] Timeout waiting for network-vif-plugged for instance with vm_state building and task_state spawning

2022-01-03 Thread Tony Liu
You can send SIGHUP to Neutron to apply configuration changes without restarting it.
Enable debug in the configuration and "kill -s SIGHUP <neutron-server-pid>" will get you
all the debug logs.
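For a kolla-ansible deployment the flow looks roughly like this; the config path and process-matching pattern are assumptions, not taken from the thread, so adjust them to your environment:

```shell
# Sketch: flip debug in neutron.conf, then SIGHUP neutron-server so it
# re-reads mutable options without a restart. The real config usually lives
# at /etc/kolla/neutron-server/neutron.conf (assumption); a temp file is
# used here so the sketch is self-contained.
conf=$(mktemp)
printf '[DEFAULT]\ndebug = False\n' > "$conf"
sed -i 's/^debug = False$/debug = True/' "$conf"
grep '^debug' "$conf"    # debug = True
# On the real host:
#   kill -s SIGHUP "$(pgrep -of neutron-server)"
```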

Tony

From: discuss on behalf of Christian Stelter
Sent: January 3, 2022 08:54 AM
To: ovs-discuss@openvswitch.org
Subject: Re: [ovs-discuss] Timeout waiting for network-vif-plugged for instance 
with vm_state building and task_state spawning

Hi!

A small update after restarting the neutron server (the problem has been gone since
the restart last week, as in your case) and reading the posts/bug reports you sent.

We observed an additional phenomenon when running into the problem, which makes
us wonder whether OVN may also be part of the problem; it isn't mentioned in the
posts/bugs.

Neutron gets the port up/network-vif-plugged report from OVN twice, and in at least
one case with a delay of more than 5 hours.

We will have a closer look when this problem reoccurs and verify whether the
multiple up/plugged reports are also reproduced.

Kind regards,

Christian Stelter

On Wed, Dec 29, 2021 at 11:28 PM Tony Liu <tonyliu0...@hotmail.com> wrote:
I've been hitting this issue from time to time. There are discussions and bugs 
for this.

http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025941.html
http://lists.openstack.org/pipermail/openstack-discuss/2021-November/025892.html
https://bugs.launchpad.net/heat/+bug/1694371
https://bugs.launchpad.net/nova/+bug/1944779

I don't think it's about OVN/OVS, because the port is up in OVN NB.
And restarting Neutron makes it work again. I suspect Neutron either
doesn't pick the event up or doesn't send it out to Nova compute.
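If you want to confirm the NB state yourself next time it happens, something like this should work on a node with the OVN tools installed; the port UUID is the one from the log excerpt below, so substitute your own:

```shell
# Query the 'up' column for the stuck port in the OVN northbound DB.
# Guarded so the snippet degrades gracefully where ovn-nbctl is absent.
port=c641bae9-edcc-4e64-93ec-2dcbf5010009
if command -v ovn-nbctl >/dev/null 2>&1; then
    ovn-nbctl --columns=up list Logical_Switch_Port "$port"
else
    echo "ovn-nbctl not available on this host"
fi
```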

When I hit this issue, I enable debugging for Neutron and restart it, but the
restart itself makes the problem go away. If I keep debug-level logging on until
the next occurrence, it dumps too much output. It would be good for Neutron to
support enabling debug logging on the fly.

If it happens to you more frequently, you can try to enable debug logging and
see if that captures anything helpful.


Tony


From: discuss <ovs-discuss-boun...@openvswitch.org> on behalf of Christian Stelter
<refu...@last-refuge.net>
Sent: December 29, 2021 09:51 AM
To: ovs-discuss@openvswitch.org
Subject: [ovs-discuss] Timeout waiting for network-vif-plugged for instance 
with vm_state building and task_state spawning

Hi!

In our well-aged OpenStack installation (Victoria release, deployed with
kolla-ansible) we observe strange timeouts while creating new instances. It seems
that the feedback from the OVN controller(?) doesn't reach the nova-compute/libvirt
process within 300 s.

This does not happen every time, but 1/3 to 1/5 of the launches have problems.
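For reference, the 300 s wait is governed by two nova-compute options. The values shown below are the upstream defaults, not read from this deployment; the fact that the instance still resumes and spawns after the timeout in the log below suggests `vif_plugging_is_fatal` is set to False here.

```ini
# nova.conf fragment on the compute nodes (upstream defaults, assumption):
[DEFAULT]
vif_plugging_timeout = 300    # seconds to wait for network-vif-plugged
vif_plugging_is_fatal = True  # if False, the boot continues after the timeout
```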

This is what we see in the logs:

---  nova-compute ---
2021-12-29 07:27:06.972 7 INFO nova.compute.claims 
[req-57321447-272f-4820-813d-925a43d69462 0b16e2645828414d86f094c2a7a693eb 
66fcb198228e4d9d97f09dc7ce1d5130 - default default] [instance: 
aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] Claim successful on node compute07
2021-12-29 07:27:07.757 7 INFO nova.virt.libvirt.driver 
[req-57321447-272f-4820-813d-925a43d69462 0b16e2645828414d86f094c2a7a693eb 
66fcb198228e4d9d97f09dc7ce1d5130 - default default] [instance: 
aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] Creating image
2021-12-29 07:27:12.514 7 INFO nova.compute.manager 
[req-01948206-8ff3-404c-b71f-86ed7d1791d4 - - - - -] [instance: 
aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] VM Started (Lifecycle Event)
2021-12-29 07:27:12.559 7 INFO nova.compute.manager 
[req-01948206-8ff3-404c-b71f-86ed7d1791d4 - - - - -] [instance: 
aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] VM Paused (Lifecycle Event)
2021-12-29 07:27:12.649 7 INFO nova.compute.manager 
[req-01948206-8ff3-404c-b71f-86ed7d1791d4 - - - - -] [instance: 
aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] During sync_power_state the instance has 
a pending task (spawning). Skip.
2021-12-29 07:32:12.518 7 WARNING nova.virt.libvirt.driver 
[req-57321447-272f-4820-813d-925a43d69462 0b16e2645828414d86f094c2a7a693eb 
66fcb198228e4d9d97f09dc7ce1d5130 - default default] [instance: 
aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] Timeout waiting for 
[('network-vif-plugged', 'c641bae9-edcc-4e64-93ec-2dcbf5010009')] for instance 
with vm_state building and task_state spawning: eventlet.timeout.Timeout: 300 
seconds
2021-12-29 07:32:12.522 7 INFO nova.compute.manager 
[req-01948206-8ff3-404c-b71f-86ed7d1791d4 - - - - -] [instance: 
aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] VM Resumed (Lifecycle Event)
2021-12-29 07:32:12.526 7 INFO nova.virt.libvirt.driver [-] [instance: 
aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] Instance spawned successfully.
2021-12-29 07:32:12.527 7 INFO nova.compute.manager 
[req-57321447-272f-4820-813d-925a43d69462 0b16e2645828414d86f094c2a7a693eb 
66fcb198228e4d9d97f09dc7ce1d5130 - default default] [instance: 
aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] Took 304.77 seconds to spawn the
instance on the hypervisor.
2021-12-29 07:32:12.613 7 INFO nova.compute.manager
[req-01948206-8ff3-404c-b71f-86ed7d1791d4 - - - - -] [instance:
aef0a1b3-99d6-43e2-89db-b7bd4e0516d9] During sync_power_state the instance has
a pending task (spawning). Skip.
2021-12-29 07:32:12.632 7 INFO nova.compute.manager
[req-57321447-272f-4820-813d-925a43d69462 0b16e2645828414d86f094c2a7a693eb
66fcb198228e4d9d97f09dc7ce1d5130 - default default] [instance: