Re: [akka-user] Re: Pulling Pattern vs Durable Mailboxes

Martin Krasser Wed, 07 May 2014 23:04:28 -0700


On 07.05.14 17:10, Matthew Howard wrote:

On Wednesday, May 7, 2014 12:57:23 AM UTC-4, Martin Krasser wrote:



    Please note that the primary use case for persistent channels is to
    deal with slow and/or temporarily unavailable
    consumers/destinations. It is not optimized for high throughput
    (yet). In more detail, a persistent channel usually has a very high
    write rate (with up to 100k msgs/sec, provided by a Processor it
    uses internally) but only a moderate message delivery rate to
    consumers. If you need a persistent queue with a high message
    throughput, consider using a 3rd party messaging product.

Thanks, good to know. Just out of curiosity, would you also have any performance concerns with using a Processor/View combination acting in place of a 3rd party MQ?

A few performance numbers first. A simple performance test on my laptop (2013 MBP with SSD) gives:

- 120k msgs/sec write throughput for a Processor with LevelDB journal (with fsync=true)
- 70k msgs/sec write throughput for a Processor with Cassandra journal (single node Cassandra)
- 20k msgs/sec throughput of a PersistentChannel with LevelDB journal (with fsync=true)

An optimized implementation of a PersistentChannel should actually give a throughput comparable to the write throughput of a Processor. There are several reasons for the decreased throughput in the current implementation; one is the interleaving of read and write ops on the same journal (actor). I didn't measure the read throughput of a View yet, but if it is deployed on a different node than its corresponding processor (and assuming a sufficiently high message replication by the journal) I'd expect a significantly higher throughput of distributed processor/view combinations. There is also plenty of room for optimization here. Currently, views are pull-based. Later optimizations could additionally support push-based delivery of messages to views, making the write throughput of processors the only bottleneck.
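To make the pull-based point concrete, here is a minimal, library-free Python model. It is not akka-persistence code: `Journal`, `PullView`, `read_from` and `batch_size` are made-up names for illustration only. It shows that a pull-based reader only sees new entries when it polls, so its lag depends on how often and how much it pulls, not only on the writer's rate.

```python
# Library-free model; none of these names are Akka APIs.
class Journal:
    """Append-only log shared by one writer and any number of readers."""

    def __init__(self):
        self._entries = []

    def append(self, payload):
        self._entries.append(payload)
        return len(self._entries)  # sequence number of the new entry, starting at 1

    def read_from(self, from_seq_nr, max_entries):
        start = from_seq_nr - 1
        return self._entries[start:start + max_entries]


class PullView:
    """Reader that only sees new entries when update() is called."""

    def __init__(self, journal, batch_size=100):
        self.journal = journal
        self.batch_size = batch_size
        self.last_seq_nr = 0  # highest sequence number consumed so far
        self.seen = []

    def update(self):
        """Pull the next batch; returns how many entries were consumed."""
        batch = self.journal.read_from(self.last_seq_nr + 1, self.batch_size)
        self.seen.extend(batch)
        self.last_seq_nr += len(batch)
        return len(batch)


journal = Journal()
view = PullView(journal, batch_size=2)
for i in range(5):
    journal.append(f"msg-{i}")

# Until the view polls, it lags behind by everything that was written.
lag_before = 5 - view.last_seq_nr

# Poll until an update returns nothing new.
while view.update():
    pass
```

A push-based variant would have the journal notify the view on every append, which removes the polling delay but requires the view to keep up with the writer or buffer.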

I hope that gives an idea of what is possible with the current implementation (using LevelDB or Cassandra journal, didn't measure with others) and what one can expect from future optimizations. At the moment, akka-persistence is optimized for write throughput, as reads are only made during processor recovery (except for PersistentChannel). Furthermore, it is optimized for cases where the whole message history is kept, i.e. without frequent deletions such as those made when a message is delivered (by a PersistentChannel) or at regular intervals (by an application).

We were initially debating the alternatives to durable mailboxes - the OP was considering implementing a mailbox backed by rabbitmq/kafka/mongo/etc... To me that sounded like a lot of work to get right, and the complexities probably overlap a good bit with akka persistence. I've since seen that akka persistence is the reason for the deprecation of durable mailboxes. So maybe my question should be phrased as follows... given a data pipeline built in Akka with a succession of actors consuming, enriching/processing, and emitting data:
a) If you need persistence at key stages, what factors would lend themselves to introducing a 3rd party product rather than rely on akka persistence? Maybe queue size, throughput, complex distributed pub/sub or routing requirements...

If you need durability only for reliable delivery of messages (i.e. if you can throw away these messages from storage once delivered and processed), use a 3rd party messaging product. The main use case for akka-persistence is making stateful actors durable (at very high transaction rates).
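As a rough illustration of what "making stateful actors durable" means, the sketch below persists every state change as an event and rebuilds state by replaying the full history on restart. It is a library-free Python model under stated assumptions, not the akka-persistence API: `Counter`, `handle` and the plain list standing in for a journal are hypothetical.

```python
class Counter:
    """Stateful object whose state is derived entirely from journaled events."""

    def __init__(self, journal):
        self.journal = journal  # a list standing in for durable append-only storage
        self.total = 0
        for event in journal:   # recovery: replay the full history
            self._apply(event)

    def _apply(self, event):
        self.total += event["amount"]

    def handle(self, amount):
        event = {"type": "added", "amount": amount}
        self.journal.append(event)  # persist the event first
        self._apply(event)          # then update in-memory state


storage = []
c1 = Counter(storage)
c1.handle(3)
c1.handle(4)

# Simulate a restart: a fresh instance recovers its state from the journal.
c2 = Counter(storage)
```

Note that nothing is ever deleted from `storage` here; that is the access pattern the journal is optimized for, in contrast to a queue that discards messages once they are delivered.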

b) If a 3rd party product wasn't needed, what might be a recommended way to implement a durable work queue (with some sort of flow control)?


See above.

Hope that helps.

Cheers,
Martin

I sort of threw out that suggestion of a Processor simply receiving/persisting work events, with a View consuming the journal (with some controlled replay) and forwarding to workers. But I could see some potential issues there (latency of the view, managing flow control in the view, not sure it is a proper use of a processor - does nothing really other than receive events).
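For reference, the idea of a journal reader that forwards work to pulling workers with flow control can be modelled roughly as follows. This is a library-free Python sketch, not Akka code; `WorkDispatcher`, `pull`, `ack` and `max_in_flight` are hypothetical names. Workers ask for work, and the dispatcher refuses to hand out more than a fixed number of unacknowledged items.

```python
class WorkDispatcher:
    """Hands out journaled work items on request, bounded by max_in_flight."""

    def __init__(self, journal, max_in_flight):
        self.journal = journal          # append-only list of work items
        self.max_in_flight = max_in_flight
        self.next_index = 0             # next journal position to hand out
        self.in_flight = {}             # index -> item, handed out but not yet acked

    def pull(self):
        """Called by a worker asking for work; returns (index, item) or None."""
        if len(self.in_flight) >= self.max_in_flight:
            return None                 # flow control: too much unacknowledged work
        if self.next_index >= len(self.journal):
            return None                 # nothing new in the journal
        index = self.next_index
        item = self.journal[index]
        self.in_flight[index] = item
        self.next_index += 1
        return index, item

    def ack(self, index):
        """Called by a worker when an item has been fully processed."""
        self.in_flight.pop(index, None)

    def unacked(self):
        """Indices that would need redelivery after a worker failure."""
        return sorted(self.in_flight)


journal = ["a", "b", "c", "d"]
dispatcher = WorkDispatcher(journal, max_in_flight=2)

first = dispatcher.pull()
second = dispatcher.pull()
third = dispatcher.pull()   # refused: two items are already in flight
dispatcher.ack(first[0])
fourth = dispatcher.pull()  # allowed again after the ack
```

The sketch keeps `next_index` and `in_flight` in memory only; a durable version would also have to persist the acknowledgements, which is exactly the delete-on-delivery pattern discussed above.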


--
Martin Krasser

blog:    http://krasserm.blogspot.com
code:    http://github.com/krasserm
twitter: http://twitter.com/mrt1nz

