** Description changed:

- A recent update to oslo.messaging to resolve bug 1789177 causes
- failures.
+ [Impact]
+ A recent update to oslo.messaging to resolve #1789177 causes failures.
  
  (Below comments copied from the original bug):
  
  After a partial upgrade (only one side, producers or consumers), there
  are a lot of MessageTimeout and DuplicateMessage errors in the logs.
  
  Downgrading back to 5.35.0-0ubuntu1~cloud0 fixed the problem.
  
  Right after restarting n-ovs-agent, I can see a lot of errors in the
  rabbitmq log [1], which are the same as the errors seen with the
  rabbitmq failover issue (the original issue of this LP).
  
  Then, after I upgraded oslo.messaging in the neutron-api unit and
  restarted neutron-server, the errors below were gone and I was able to
  create instances again.
  
  After upgrading oslo.messaging in n-ovs only, the exchanges the two
  sides use to communicate didn't match, because changing the exchanges
  they use depends on the publisher-consumer relation.
  
  So I think there are two ways:
  
  1. revert this patch for Q (original failover problem will be there)
  2. upgrade them with maintenance window
  
  Thanks a lot
  
  [1]
  ################################################################################
  =ERROR REPORT==== 3-Feb-2021::03:25:26 ===
  Channel error on connection <0.2379.1> (10.0.0.32:60430 -> 10.0.0.34:5672, vhost: 'openstack', user: 'neutron'), channel 1:
  {amqp_error,not_found,
              "no exchange 'reply_7da3cecc31b34bdeb96c866dc84e3044' in vhost 'openstack'",
              'basic.publish'}
  
  10.0.0.32 is neutron-api unit
+ 
+ [Test Case]
+ This SRU needs the following scenarios tested:
+ 
+ 1) partial upgrade of n-ovs at 5.35.0-0ubuntu3 [1] and n-api/n-gateway
+ at 5.35.0-0ubuntu1 - instance creation will be successful
+ 
+ 2) partial upgrade of n-api/n-gateway at 5.35.0-0ubuntu3 [1] and n-ovs
+ at 5.35.0-0ubuntu1 - instance creation will be successful
+ 
+ 3) partial upgrade of n-ovs at 5.35.0-0ubuntu2 [1] and n-api/n-gateway
+ at 5.35.0-0ubuntu3 - instance creation will fail (see regression
+ potential)
+ 
+ 4) partial upgrade of n-api/n-gateway at 5.35.0-0ubuntu3 [1] and n-ovs
+ at 5.35.0-0ubuntu2 - instance creation will fail (see regression
+ potential)
+ 
+ 5) test all neutron nodes at 5.35.0-0ubuntu3 - instance creation will be
+ successful
+ 
+ [1] and neutron* services restarted
+ 
+ [Regression Potential]
+ There is regression potential for clouds that have already upgraded to 5.35.0-0ubuntu2. This needs to be tested, but if a cloud has fully upgraded to 5.35.0-0ubuntu2, then the same disruption that this SRU is trying to solve may once again occur while some services are running 5.35.0-0ubuntu2 and some are running 5.35.0-0ubuntu3. Once that cloud is entirely at 5.35.0-0ubuntu3, messages will no longer time out.
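
The expected results in [Test Case] and [Regression Potential] can be summarised as a small lookup, which may help when scripting the verification. This is only a sketch: the helper name instance_creation_expected and the constants U1, U2, U3 are hypothetical and not part of oslo.messaging or neutron. It assumes that a failure is expected exactly when one side runs 5.35.0-0ubuntu2 and the other does not, which is how the description above reads; a cloud fully at 5.35.0-0ubuntu2 is assumed to work.

```python
# Hypothetical helper summarising the scenarios in [Test Case].
# Nothing here is an oslo.messaging or neutron API.

U1 = "5.35.0-0ubuntu1"
U2 = "5.35.0-0ubuntu2"  # assumed to be the update the description refers to
U3 = "5.35.0-0ubuntu3"  # the version proposed by this SRU


def instance_creation_expected(n_ovs: str, n_api: str) -> bool:
    """Return True if instance creation is expected to succeed.

    Assumption: the two sides only disagree on exchanges when exactly
    one of them runs 5.35.0-0ubuntu2.
    """
    return (n_ovs == U2) == (n_api == U2)


# (n-ovs version, n-api/n-gateway version) for scenarios 1) to 5).
# As written, scenarios 3) and 4) name the same version pair; they
# differ only in which side is described as being upgraded.
SCENARIOS = [
    (U3, U1),
    (U1, U3),
    (U2, U3),
    (U2, U3),
    (U3, U3),
]

if __name__ == "__main__":
    for number, (n_ovs, n_api) in enumerate(SCENARIOS, start=1):
        outcome = "success" if instance_creation_expected(n_ovs, n_api) else "failure"
        print(f"{number}) n-ovs={n_ovs} n-api={n_api}: expect {outcome}")
```

The results of an actual test run still have to come from creating instances on the cloud under test; the table only records what the SRU description predicts.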

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1914437

Title:
  [SRU] MessageTimeout and DuplicateMessage errors after udpate