On 07.05.14 17:10, Matthew Howard wrote:
On Wednesday, May 7, 2014 12:57:23 AM UTC-4, Martin Krasser wrote:


    Please note that the primary use case for persistent channels is to
    deal with slow and/or temporarily available
    consumers/destinations. It is not optimized for high throughput
    (yet). In more detail, a persistent channel usually has a very high
    write rate (with up to 100k msgs/sec, provided by a Processor it
    uses internally) but only a moderate message delivery rate to
    consumers. If you need a persistent queue with a high message
    throughput, consider using a 3rd party messaging product.


Thanks, good to know. Just out of curiosity, would you also have any performance concerns with using a Processor/View combination acting in place of a 3rd party MQ?

A few performance numbers first. A simple performance test on my laptop (2013 MBP with SSD) gives:

- 120k msgs/sec write throughput for a Processor with LevelDB journal (with fsync=true)
- 70k msgs/sec write throughput for a Processor with Cassandra journal (single node Cassandra)
- 20k msgs/sec throughput of a PersistentChannel with LevelDB journal (with fsync=true)

An optimized implementation of a PersistentChannel should actually give a throughput comparable to the write throughput of a Processor. There are several reasons for the decreased throughput in the current implementation; one is that read and write ops are interleaved on the same journal (actor). I haven't measured the read throughput of a View yet, but if it is deployed on a different node than its corresponding processor (and assuming a sufficiently high message replication by the journal), I'd expect a significantly higher throughput from distributed processor/view combinations. There is also plenty of room for optimization here. Currently, views are pull-based. Later optimizations could additionally support push-based delivery of messages to views, making the write throughput of processors the only bottleneck.

I hope that gives an idea of what is possible with the current implementation (using the LevelDB or Cassandra journal; I didn't measure with others) and what one can expect from future optimizations. At the moment, akka-persistence is optimized for write throughput, as reads are only made during processor recovery (except for PersistentChannel). Furthermore, it is optimized for cases where the whole message history is kept, i.e. without frequent deletions such as those made when a message is delivered (by a PersistentChannel) or at regular intervals (by an application).
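To illustrate the "write-optimized, read only on recovery" pattern described above, here is a minimal sketch in plain Python. It is not Akka code and does not use any akka-persistence API; the class name `CounterProcessor`, the JSON-lines file format and the `demo` helper are assumptions made up for this example.

```python
import json
import os
import tempfile


class CounterProcessor:
    """Toy stateful 'processor' (hypothetical): state is rebuilt by replaying a journal."""

    def __init__(self, journal_path):
        self.journal_path = journal_path
        self.total = 0
        self.recover()

    def recover(self):
        # The only place the journal is read: replay every stored event in order.
        self.total = 0
        if os.path.exists(self.journal_path):
            with open(self.journal_path, "r", encoding="utf-8") as journal:
                for line in journal:
                    self.apply(json.loads(line))

    def apply(self, event):
        # State change derived from a single event.
        self.total += event["amount"]

    def handle(self, amount):
        # Normal operation only writes: append, flush and fsync, then update state.
        event = {"amount": amount}
        with open(self.journal_path, "a", encoding="utf-8") as journal:
            journal.write(json.dumps(event) + "\n")
            journal.flush()
            os.fsync(journal.fileno())
        self.apply(event)


def demo():
    path = os.path.join(tempfile.mkdtemp(), "journal.log")
    first = CounterProcessor(path)
    for amount in (5, 7, 30):
        first.handle(amount)
    # Simulate a restart: a fresh instance recovers its state from the journal alone.
    restarted = CounterProcessor(path)
    return first.total, restarted.total
```

Nothing is ever deleted from the journal in this sketch, which is the situation the current implementation is tuned for; a delete-after-delivery workload would add extra journal operations for every message.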

We were initially debating the alternatives to durable mailboxes - the OP was considering implementing a mailbox backed by rabbitmq/kafka/mongo/etc... To me that sounded like a lot of work to get right, and the complexities probably overlap a good bit with akka persistence. I've since seen that akka persistence is the reason for the deprecation of durable mailboxes. So maybe my question should be phrased as follows... given a data pipeline built in Akka with a succession of actors consuming, enriching/processing, and emitting data:

a) If you need persistence at key stages, what factors would lend themselves to introducing a 3rd party product rather than relying on akka persistence? Maybe queue size, throughput, complex distributed pub/sub or routing requirements...

If you need durability only for reliable delivery of messages (i.e. if you can throw away these messages from storage once delivered and processed), use a 3rd party messaging product. The main use case for akka-persistence is making stateful actors durable (at very high transaction rates).


b) If a 3rd party product wasn't needed, what might be a recommended way to implement a durable work queue (with some sort of flow control)?

See above.

Hope that helps.

Cheers,
Martin

I sort of threw out that suggestion of a Processor simply receiving/persisting work events, with a View consuming the journal (with some controlled replay) and forwarding to workers. But I could see some potential issues there (latency of the view, managing flow control in the view, not sure it is a proper use of a processor - does nothing really other than receive events).
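A rough model of that idea, written in plain Python rather than against the Akka API: a journal that only appends, and a pull-based view that reads ahead only while it has free capacity. The names `Journal`, `PullingView`, `max_in_flight` and `demo` are hypothetical, and the in-flight bookkeeping is kept in memory only, so this sketch does not address what happens to unacknowledged work after a crash.

```python
class Journal:
    """In-memory stand-in for a journal: an append-only list of events."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events)  # sequence number, starting at 1

    def read(self, from_seq_nr, max_count):
        # Return up to max_count (seq_nr, event) pairs starting at from_seq_nr.
        start = from_seq_nr - 1
        chunk = self._events[start:start + max_count]
        return list(enumerate(chunk, start=from_seq_nr))


class PullingView:
    """Pulls work from the journal, bounded by the number of unacknowledged items."""

    def __init__(self, journal, max_in_flight):
        self.journal = journal
        self.max_in_flight = max_in_flight
        self.next_seq_nr = 1
        self.in_flight = {}  # seq_nr -> event handed to a worker, not yet acked

    def poll(self):
        # Flow control: only read as many events as there is free capacity.
        capacity = self.max_in_flight - len(self.in_flight)
        if capacity <= 0:
            return []
        batch = self.journal.read(self.next_seq_nr, capacity)
        for seq_nr, event in batch:
            self.in_flight[seq_nr] = event
            self.next_seq_nr = seq_nr + 1
        return batch

    def ack(self, seq_nr):
        # A worker finished this item, which frees one slot.
        self.in_flight.pop(seq_nr, None)


def demo():
    journal = Journal()
    for n in range(1, 6):
        journal.append(f"job-{n}")
    view = PullingView(journal, max_in_flight=2)
    first = view.poll()    # fills both slots
    blocked = view.poll()  # no free capacity, nothing is read
    view.ack(1)            # one worker reports completion
    second = view.poll()   # exactly one more job is pulled
    return first, blocked, second
```

The polling interval, which this sketch leaves out, is where the view latency mentioned above would come from: a job appended just after a poll waits until the next one.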


--
Martin Krasser

blog:    http://krasserm.blogspot.com
code:    http://github.com/krasserm
twitter: http://twitter.com/mrt1nz
