On 07.05.14 17:10, Matthew Howard wrote:
On Wednesday, May 7, 2014 12:57:23 AM UTC-4, Martin Krasser wrote:
Please note that the primary use case for persistent channels is to
deal with slow and/or temporarily available
consumers/destinations. It is not optimized for high throughput
(yet). In more detail, a persistent channel usually has a very high
write rate (with up to 100k msgs/sec, provided by a Processor it
uses internally) but only a moderate message delivery rate to
consumers. If you need a persistent queue with a high message
throughput, consider using a 3rd party messaging product.
Thanks, good to know. Just out of curiosity, would you also have any
performance concerns with using a Processor/View combination acting in
place of a 3rd party MQ?
A few performance numbers first. A simple performance test on my laptop
(2013 MBP with SSD) gives:
- 120k msgs/sec write throughput for a Processor with LevelDB journal
(with fsync=true)
- 70k msgs/sec write throughput for a Processor with Cassandra journal
(single node Cassandra)
- 20k msgs/sec throughput of a PersistentChannel with LevelDB journal
(with fsync=true)
An optimized implementation of a PersistentChannel should actually give
a throughput comparable to the write throughput of a Processor. There
are several reasons for the decreased throughput in the current
implementation, one is interleaving read and write ops on the same
journal (actor). I haven't measured the read throughput of a View yet, but
if it is deployed on a different node from its corresponding processor (and
assuming a sufficiently high message replication by the journal) I'd
expect a significantly higher throughput of distributed processor/view
combinations. There is also plenty of room for optimization here. Currently,
views are pull-based. Later optimizations could additionally support
push-based delivery of messages to views, making the write throughput of
processors the only bottleneck.
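
To illustrate what "pull-based" means here, below is a minimal sketch in plain Python. It does not use the akka-persistence API; Journal, PullView, batch_size and update() are hypothetical names made up for this example. The view only sees new messages when it asks the journal for the next batch, so its read rate is bounded by how often and how much it pulls.

```python
from dataclasses import dataclass, field


@dataclass
class Journal:
    """In-memory stand-in for a journal: an append-only list of events."""
    events: list = field(default_factory=list)

    def append(self, event):
        self.events.append(event)
        return len(self.events)  # 1-based sequence number of the new event

    def read(self, from_seq_nr, max_count):
        """Return up to max_count events with sequence number >= from_seq_nr."""
        start = from_seq_nr - 1
        return self.events[start:start + max_count]


@dataclass
class PullView:
    """Pull-based reader: it sees new events only when update() is called."""
    journal: Journal
    batch_size: int = 2
    last_seq_nr: int = 0
    seen: list = field(default_factory=list)

    def update(self):
        # Ask the journal for the next batch after the last event we saw.
        batch = self.journal.read(self.last_seq_nr + 1, self.batch_size)
        self.seen.extend(batch)
        self.last_seq_nr += len(batch)
        return len(batch)


journal = Journal()
view = PullView(journal)
for i in range(5):
    journal.append(f"msg-{i}")

# The view catches up in batches until a pull returns nothing new.
pulled = []
while (n := view.update()) > 0:
    pulled.append(n)
```

A push-based variant would instead have the journal notify the view on each append, which removes the polling delay but needs its own back-pressure handling.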
I hope that gives an idea of what is possible with the current
implementation (using the LevelDB or Cassandra journal; I haven't measured
others) and what one can expect from future optimizations. At the
moment, akka-persistence is optimized for write-throughput, as reads are
only made during processor recovery (except for PersistentChannel).
Furthermore, it is optimized for cases where the whole message history
is kept, without frequent deletions such as those made when a message is
delivered (by a PersistentChannel) or at regular intervals (by an application).
We were initially debating the alternatives to durable mailboxes - the
OP was considering implementing a mailbox backed by
rabbitmq/kafka/mongo/etc... To me that sounded like a lot of work to
get right, and the complexities probably overlap a good bit with akka
persistence. I've since seen that akka persistence is the reason for
the deprecation of durable mailboxes. So maybe my question should be
phrased as follows... given a data pipeline built in Akka with a
succession of actors consuming, enriching/processing, and emitting data:
a) If you need persistence at key stages, what factors would lend
themselves to introducing a 3rd party product rather than relying on akka
persistence? Maybe queue size, throughput, complex distributed pub/sub
or routing requirements...
If you need durability only for reliable delivery of messages (i.e. if
you can throw away these messages from storage once delivered and
processed), use a 3rd party messaging product. The main use case for
akka-persistence is making stateful actors durable (at very high
transaction rates).
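
As a rough illustration of "making stateful actors durable", here is a minimal sketch in plain Python. It is not the akka-persistence API; DurableCounter and its event format are hypothetical. The idea shown is only the general one: every state change is written to a log first, the full history is kept, and state is rebuilt by replaying that history on recovery.

```python
class DurableCounter:
    """Toy stateful component whose state is derived from a kept event log."""

    def __init__(self, journal):
        self.journal = journal  # a shared list standing in for durable storage
        self.total = 0
        for event in journal:   # recovery: replay the whole history
            self._apply(event)

    def _apply(self, event):
        kind, amount = event
        if kind == "added":
            self.total += amount

    def add(self, amount):
        event = ("added", amount)
        self.journal.append(event)  # write the event first ...
        self._apply(event)          # ... then update in-memory state


log = []
counter = DurableCounter(log)
counter.add(3)
counter.add(4)

# Simulate a restart: a new instance recovers its state from the same log.
recovered = DurableCounter(log)
```

Nothing is ever deleted from the log in this sketch, which matches the write-optimized, keep-the-history usage described above rather than a deliver-and-discard queue.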
b) If a 3rd party product wasn't needed, what might be a recommended
way to implement a durable work queue (with some sort of flow control)?
See above.
Hope that helps.
Cheers,
Martin
I sort of threw out that suggestion of a Processor simply
receiving/persisting work events, with a View consuming the journal
(with some controlled replay) and forwarding to workers. But I could
see some potential issues there (latency of the view, managing flow
control in the view, not sure it is a proper use of a processor - does
nothing really other than receive events).
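
A minimal sketch of that idea with flow control, in plain Python and not tied to the Akka API. WorkDispatcher, max_in_flight, dispatch() and ack() are hypothetical names, and the list stands in for the journal. The reader hands out persisted work items in order and stops once a fixed number are unacknowledged.

```python
class WorkDispatcher:
    """Reads work items from a log and limits how many are unacknowledged."""

    def __init__(self, journal, max_in_flight):
        self.journal = journal            # list standing in for the journal
        self.max_in_flight = max_in_flight
        self.next_index = 0               # position of the next item to send
        self.in_flight = set()            # sequence numbers awaiting an ack

    def dispatch(self):
        """Hand out work items until the in-flight window is full."""
        sent = []
        while (len(self.in_flight) < self.max_in_flight
               and self.next_index < len(self.journal)):
            seq_nr = self.next_index + 1
            self.in_flight.add(seq_nr)
            sent.append((seq_nr, self.journal[self.next_index]))
            self.next_index += 1
        return sent

    def ack(self, seq_nr):
        """A worker reports completion, freeing one slot in the window."""
        self.in_flight.discard(seq_nr)


work_log = ["job-a", "job-b", "job-c"]
dispatcher = WorkDispatcher(work_log, max_in_flight=2)

first = dispatcher.dispatch()    # window fills with two items
blocked = dispatcher.dispatch()  # nothing more is sent until an ack arrives
dispatcher.ack(1)
second = dispatcher.dispatch()   # the freed slot lets the third item through
```

This leaves out redelivery after a crash; a real setup would also have to persist which items were acknowledged.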
--
Martin Krasser
blog: http://krasserm.blogspot.com
code: http://github.com/krasserm
twitter: http://twitter.com/mrt1nz