The queue TTL is applied to both reply queues and fanout queues. I don’t think it should be applied to fanout queues; they should auto delete. I can understand the reason for having it on reply queues though, so maybe that would be a way forward?
Or am I missing something and it is needed on fanout queues too?

Cheers,
Sam

> On 25 Jul 2016, at 8:47 PM, Dmitry Mescheryakov <dmescherya...@mirantis.com> wrote:
>
> Sam,
>
> For your case I would suggest lowering rabbit_transient_queues_ttl until
> you are comfortable with the volume of messages that arrives during that
> time. Setting the parameter to 1 will essentially replicate the behaviour
> of auto_delete queues. But I would suggest not setting it that low, as
> otherwise your OpenStack will suffer from the original bug. A value like
> 20 seconds should probably work in most cases.
>
> I think that there is room for improvement here - we can delete reply and
> fanout queues on graceful shutdown. But I am not sure if it will be easy
> to implement, as it requires services (Nova, Neutron, etc.) to stop the
> RPC server on sigint and I don't know if they do it right now.
>
> I don't think we can make the sigkill case any better. Other than that,
> the issue could be investigated on the Neutron side; maybe the number of
> messages could be reduced there.
>
> Thanks,
>
> Dmitry
>
> 2016-07-25 9:27 GMT+03:00 Sam Morrison <sorri...@gmail.com>:
> We recently upgraded to Liberty and have come across some issues with
> queue build ups.
>
> This is due to changes in rabbit to set queue expiries as opposed to
> queue auto delete.
> See https://bugs.launchpad.net/oslo.messaging/+bug/1515278 for more
> information.
>
> The fix for this bug is in liberty and it does fix an issue, however it
> causes another one.
>
> Every time you restart something that has a fanout queue, e.g.
> cinder-scheduler or the neutron agents, you will have a queue in rabbit
> that is still bound to the rabbitmq exchange (and so still getting
> messages in) but has no consumers.
>
> The messages in these queues are basically rubbish and don’t need to
> exist. Rabbit will delete these queues after 10 mins (although the
> default in master is now changed to 30 mins).
>
> During this time the queue will grow and grow with messages. This sets
> off our nagios alerts and our ops guys have to deal with something that
> isn’t really an issue. They basically delete the queue.
>
> A bad scenario is when you make a change to your cloud that means all
> your 1000 neutron agents are restarted. This causes a couple of dead
> queues per agent to hang around (port updates and security group
> updates). We get around 25 messages / second on these queues, so you can
> see that after 10 minutes we have a ton of messages in these queues.
>
> 1000 x 2 x 25 x 600 = 30,000,000 messages in 10 minutes to be precise.
>
> Has anyone else been suffering with this before I raise a bug?
>
> Cheers,
> Sam
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators@lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
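PS. For anyone who wants to rerun the backlog estimate from the quoted message with a different TTL, here is a rough sketch in Python. The function name stale_backlog is made up for illustration, and it assumes every agent restarts at the same moment, the message rate stays constant, and each dead queue lives for exactly the TTL. Real numbers will differ, so treat it as a back-of-envelope figure only.

```python
def stale_backlog(agents, queues_per_agent, msgs_per_sec, ttl_seconds):
    """Estimate messages piled up in consumerless queues before they expire.

    Hypothetical helper: assumes a constant per-queue message rate and that
    each dead queue is removed exactly ttl_seconds after its consumer left.
    """
    return agents * queues_per_agent * msgs_per_sec * ttl_seconds


# Figures from the thread: 1000 neutron agents, 2 dead queues each,
# about 25 messages/second per queue.
for ttl in (600, 1800, 20):
    total = stale_backlog(1000, 2, 25, ttl)
    print(f"ttl={ttl:>4}s -> {total:,} messages")
```

With a TTL of 600 seconds this reproduces the 30,000,000 figure above. Under the same assumptions the 30 minute default would give three times that, and the suggested 20 seconds would give about 1,000,000.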