Hi Raphael,

On Sun, Feb 27, 2011 at 11:18:35AM +0000, Raphael Cohn wrote:
> OpenStack's QueueService seems very interesting. As we have an existing
> message queue implementation, we'd be happy to help you guys out. We're
> about making messaging cloud-scale, so that everyone benefits.

Thank you! We're certainly looking to include as many community members
as we can to ensure this is a successful project. Your expertise and
participation would be very much appreciated!

> However, it worries us that you're planning to implement a REST API for
> messaging. Message queuing is fundamentally asynchronous; this is one of
> the reasons StormMQ got started, as we found that approaches that use it
> (eg SQS) suffer from some major weaknesses:-
> - They're too slow;
> - They can't handle sustained volumes
> - Higher-level needs, eg fanout, selective pub-sub and transactions, are
>   an awkward, if not impossible, fit

I certainly agree, HTTP is not an ideal protocol for high-performance
messaging. Some features may be awkward in HTTP, but almost anything is
possible. As you'll note on the queue service specification page, a
pluggable protocol is one of the main requirements. The REST API is the
first since this is the easiest protocol for most folks to understand
and get involved with, but it is by no means the primary or a
first-class protocol. For example, I mention other binary protocols to
look at implementing for higher performance once we get the REST API off
the ground.

HTTP though, if done correctly (pipelining, binary content-types, ...),
can provide decent throughput that is sufficient for a wide range of
applications. It will always be restricted by the plain-text
request/header envelope, and this is where binary protocols will excel.

Also, not all users and use cases of the queue service will need to
prioritize high throughput. The overhead of HTTP protocol parsing may be
insignificant for some, and the accessibility of the service via HTTP in
their environment (web apps, browser, etc.) may be much more important
than high throughput. Accessibility, especially now in a very RESTy
web/cloud world, is very important.

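To put a rough number on the envelope cost mentioned above, here is a
minimal Python sketch that measures how much of a single HTTP/1.1 PUT
request is plain-text envelope rather than payload. The path, host, and
header set are made-up placeholders, not the actual queue service API,
and real requests usually carry more headers (authentication, user agent,
and so on), so treat the result as a lower bound.

```python
def envelope_overhead(payload: bytes,
                      path: str = "/example-account/example-queue",
                      host: str = "queue.example.com") -> float:
    """Return the fraction of an HTTP/1.1 PUT request that is envelope.

    The path, host, and headers are hypothetical placeholders chosen only
    for illustration; they are not taken from the queue service spec.
    """
    # Request line plus a minimal header set, terminated by a blank line.
    head = (
        f"PUT {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/octet-stream\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("ascii")
    return len(head) / (len(head) + len(payload))


if __name__ == "__main__":
    for size in (16, 1024, 65536):
        share = envelope_overhead(b"x" * size)
        print(f"{size:>6} byte payload: {share:.1%} of the request is envelope")
```

For small messages the envelope dominates the bytes on the wire; for
large ones it becomes negligible, which is one reason the cost matters
more to some workloads than others.
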
> There are a horde of technical reasons why HTTP, superb as it is for
> request-response architectures, makes a poor backbone for messaging (some
> of the team behind StormMQ implemented one of the first banking-scale REST
> architectures).
>
> For example, implementations that need to send or consume lots of data,
> and are only interested in a subset whose filter criteria changes over
> time. Syslogging, for example. Imagine a dynamic cloud, where servers come
> and go - and centralised logging systems and alerts need no configuration,
> because they use queuing. Under load (eg hack attempts on your server
> firewalls generate 1000s of log messages) it mustn't fail, just go a bit
> slower. StormMQ use AMQP internally for our own log management for that
> reason.

Understood, and much of this can be accomplished with horizontally
scaling architectures. As I touched on before and mentioned on the wiki,
HTTP is only one interface in. The internal communication protocol for
scaling out zones and clusters will not be HTTP long term; it will
instead be a much more efficient, async, binary protocol. My current
thought is to use Google protocol buffers or Avro for this, but this is
up in the air (something we won't get to for at least a couple of
months). Since we're using Erlang, we may even use the native Erlang
message passing if we're on a trusted network.

> First up, AMQP isn't actually very complex at the level of an application
> developer. Indeed, with a good library (like ours) it's trivially easy.

Agreed, there are some great AMQP libraries out there that make it
seamless, but there are also some that do not. This wasn't my concern
with the complexity comment though.

> The apparent complexity comes because of unfamiliarity, both with concepts
> and with use; no different to HTTP when it first came in (and we saw a
> plethora of weird ways of using it and misunderstood criteria for headers,
> etc). AMQP's highly suited to high-latency, unreliable links.
> That's why Smith Electric vehicles use it to connect all their delivery
> trucks using dodgy 3G links - and still gather 10,000s of items of data a
> second. The AMQP protocol, particularly 1.0, makes it extremely clear how
> and when to recover from failure. Indeed, AMQP's approach is failure
> happens - so deal with it. HTTP on the other hand, has no such level of
> transactionality.

For the complexity concern, my main point is that in order to use a
queue, you need a channel, exchange, queue, and a binding between an
exchange/queue. This can be made fairly trivial by the libraries you
mentioned, but there are a lot of objects and relationships to keep in
sync in a distributed system. The OpenStack queue service takes a
fundamentally different approach and requires no queue setup before you
can put a message into it. Queues (and accounts) are transient: a queue
comes into existence when a message is inserted into it or when a
consumer is waiting on it, and it disappears when it is empty. This
allows you to easily create temporary queues without worrying about race
conditions between producers and consumers.

As for my comment on AMQP's suitability for high-latency or unreliable
links, it is primarily directed at the 7-way handshake for consumers and
the 4-way handshake for producers (both on top of one RTT for the TCP
handshake). Once these connections are established the protocol is very
efficient, but this doesn't help with unreliable links or environments
where persistent connections are hard or impossible to maintain. AMQP
will certainly work in these environments, but it seems much better
suited to reliable links where the handshake isn't required as often.

With the proposed OpenStack queue service REST API, there will only be
one RTT for both producers and consumers (on top of one RTT for TCP). A
producer will make a PUT request and receive a 201 Created response. A
consumer will make a GET or POST request and receive a response body.
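
As a rough illustration of the transient-queue behaviour and the
single-request model described above, here is a toy in-memory sketch in
Python. It is not the queue service implementation: the class and method
names are invented for the example, the return value for a missing queue
is an assumption, and the "queue exists while a consumer is waiting" case
is left out for brevity.

```python
from collections import deque


class TransientQueues:
    """Toy in-memory model of transient queues (illustration only)."""

    def __init__(self):
        # Maps (account, queue_name) -> deque of message bodies.
        # Nothing is created ahead of time: no exchange, binding, or setup.
        self._queues = {}

    def put(self, account, queue_name, body):
        """Producer side: models a PUT. The queue appears on first insert."""
        self._queues.setdefault((account, queue_name), deque()).append(body)
        return 201  # models the "201 Created" response for producers

    def get(self, account, queue_name):
        """Consumer side: models a GET. Returns a body, or None if no queue."""
        key = (account, queue_name)
        messages = self._queues.get(key)
        if messages is None:
            # Assumption: the real response for a missing queue is not
            # specified in this thread, so None stands in for it.
            return None
        body = messages.popleft()
        if not messages:
            # An emptied queue disappears instead of lingering.
            del self._queues[key]
        return body

    def exists(self, account, queue_name):
        return (account, queue_name) in self._queues


if __name__ == "__main__":
    service = TransientQueues()
    service.put("example-account", "jobs", b"first message")
    print(service.exists("example-account", "jobs"))
    print(service.get("example-account", "jobs"))
    print(service.exists("example-account", "jobs"))
```

Because every call carries the account and queue name, there is no
per-connection state to rebuild after a dropped link; a producer or
consumer can simply retry the single request.
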
All authentication, queue destination, and other metadata will be
included in the request, rather than building up a stateful channel
through the handshake.

Cloud, and especially mobile, use cases bring much higher latency than
is typically seen in clustered environments. Short-lived connections are
always possible depending on the developer or environment (not just due
to unreliable links; for example, connection caching may be difficult or
impossible). This is why an emphasis was put on stateless communication
with minimal round trips.

> Second up, more importantly, StormMQ do not provide a REST API as an
> alternative to AMQP. It's to provide features that are nothing to do with
> message queuing - dynamically slicing up your cloud, for instance, or
> managing environments to allow exact reproducibility or checking in to
> source control your config. We'd be interested in providing a REST API if
> there's the demand. AMQP does support multi-tenancy - we do it.

We plan to address these issues with this project, especially
multi-tenancy and multi-zone interaction. We need this to not only
handle the simple use cases, but to also run a public cloud service.

> To assist, pragmatically, we'd like to donate as open source our upcoming
> C and Java clients for AMQP 1.0, and help sponsor Python, Perl, PHP and
> Ruby ones off the C code, so that there is as wide as possible opportunity
> for people to use messaging.

Thanks! Before being able to fully leverage these, we'll also need an
AMQP binding, which, to be honest, I've given very little thought. Once
we have a solid queue "kernel" this will be easier, but I'm certainly
keeping AMQP semantics in mind. We are also using RabbitMQ for the Nova
project using the carrot Python module. It might be interesting to see
how your clients compare and if they may benefit that project.

> I'd strongly encourage you to get involved in the AMQP working group so if
> there's needs that are not met by AMQP, they can be addressed.
> The working group is really keen to encourage an open, widely adopted
> standard for AMQP; they'd like it to be the HTTP of messaging. Many of the
> features I see proposed for OpenStack are features in AMQP - and AMQP has
> spent a lot of time working out the kinks in edge cases and making sure
> they'd work with the legacy - JMS, TIBCO and the like.

I'll certainly consider it, but I'd first like to get a functional
service up and running to see how these ideas (distributed hashing,
stateless, transient queues) hold up, and then we can see what features,
if any, would make sense as an AMQP proposal.

Thanks again for your input! I'm looking forward to further discussion
and StormMQ's participation.

-Eric

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp