I agree that greater clarity on expectations around reliability is needed.

The drivers all differ in this regard.

As it stands today, the impl_rabbit driver only retries an RPC request if an exception occurs while sending it. However, messages are sent unconfirmed[1]. This means a message can be lost before the broker enqueues it, without the sender receiving any error or notification of the loss.
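To make the difference concrete, here is a toy model of unconfirmed versus confirmed sending. It is only a sketch: ToyBroker, send_unconfirmed and send_confirmed are hypothetical names invented for illustration, not part of impl_rabbit or any RabbitMQ client, and the "confirm" is modelled as a simple return value.

```python
import random


class ToyBroker:
    """Hypothetical in-memory stand-in for a broker (not a real client)."""

    def __init__(self, drop_rate, seed=0):
        self.queue = []
        self._rng = random.Random(seed)
        self._drop_rate = drop_rate

    def publish(self, msg):
        """Return True if the message was enqueued (the 'confirm'),
        False if it was lost before reaching the queue."""
        if self._rng.random() < self._drop_rate:
            return False
        self.queue.append(msg)
        return True


def send_unconfirmed(broker, msg):
    # Fire and forget: the result is ignored, so a lost message
    # goes unnoticed by the sender.
    broker.publish(msg)


def send_confirmed(broker, msg, max_attempts=5):
    # Resend until the broker confirms or the attempts run out.
    for _ in range(max_attempts):
        if broker.publish(msg):
            return True
    return False
```

In this model a confirm is never lost on its way back, so resending cannot create duplicates. With a real broker that can happen, so receivers would need to tolerate duplicate requests.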

Even if the requests are durably stored and/or replicated in a clustered RabbitMQ configuration, the reply queues are currently always auto-deleted and are not durable regardless of configuration, so replies may be lost on broker failure even if requests are not.

So I believe that various failures may cause an RPC request to fail (i.e. to time out). This does not seem to be universally expected, however, so I am not sure how many OpenStack services using oslo.messaging expect and handle such failures.
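For what it is worth, handling such a failure on the caller side might look roughly like the sketch below. RpcTimeout and call_with_retry are hypothetical names used only for illustration; RpcTimeout stands in for whatever timeout exception the messaging library raises, and rpc_call is any zero-argument callable that performs the request.

```python
import time


class RpcTimeout(Exception):
    """Hypothetical stand-in for a messaging library's timeout error."""


def call_with_retry(rpc_call, attempts=3, delay=0.0):
    """Invoke rpc_call, retrying when it times out.

    Re-raises the last RpcTimeout if every attempt times out.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            return rpc_call()
        except RpcTimeout as exc:
            last_error = exc
            # Pause before the next attempt, but not after the final one.
            if attempt < attempts - 1:
                time.sleep(delay)
    raise last_error
```

Note that a timeout does not tell the caller whether the request was lost or only the reply was, so retrying like this is only safe if the operation is idempotent.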

--Gordon

[1] The impl_qpid driver, by contrast, sends messages synchronously, i.e. blocking until confirmed. On the receive side, however, it does not use acknowledgements, so message loss is again possible.

_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to     : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
