Hi Michael,
akka-persistence relies on asyncReplayMessages always completing the
returned Future, either with a result or with a failure (including a
timeout). As long as it does, a response to ReplayMessages is always
generated. The communication between the journal actor and persistent
actors (processors) is always local, so we can (at the moment) assume
that no request or response messages are lost.
The problem seems to lie in the cassandra journal implementation: there
appear to be failure scenarios in which the Future returned by
asyncReplayMessages is never completed, which shouldn't be the case.
Please open a ticket in akka-persistence-cassandra with information on
how to reproduce this failure scenario, if possible.
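To illustrate what "always completed, including on timeout" could look
like on the journal side, here is a minimal sketch using only
scala.concurrent. The names ReplayTimeout and withTimeout are hypothetical
helpers, not part of akka-persistence or akka-persistence-cassandra; a
plugin could wrap the Future it returns from asyncReplayMessages in
something along these lines.

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit, TimeoutException}
import scala.concurrent.{ExecutionContext, Future, Promise}
import scala.concurrent.duration._

// Hypothetical helper (an assumption, not an akka-persistence API).
object ReplayTimeout {
  // Single daemon thread used only to fire timeouts.
  private val scheduler = Executors.newSingleThreadScheduledExecutor(new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "replay-timeout")
      t.setDaemon(true)
      t
    }
  })

  // Returns a Future that completes with the outcome of `underlying`,
  // or fails with a TimeoutException if `underlying` is still pending
  // after `timeout`. Whichever happens first wins.
  def withTimeout[T](underlying: Future[T], timeout: FiniteDuration)
                    (implicit ec: ExecutionContext): Future[T] = {
    val promise = Promise[T]()
    val task = scheduler.schedule(new Runnable {
      def run(): Unit = {
        promise.tryFailure(new TimeoutException(s"replay not completed within $timeout"))
        ()
      }
    }, timeout.toMillis, TimeUnit.MILLISECONDS)
    underlying.onComplete { result =>
      task.cancel(false)          // no need to fire the timeout any more
      promise.tryComplete(result) // no-op if the timeout already won
    }
    promise.future
  }
}
```

With a wrapper like this, a driver call that never returns would still
surface as a failed Future, so the journal could answer ReplayMessages
with a failure instead of staying silent.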
Nevertheless, having a timeout mechanism directly in akka-persistence
would be a reasonable addition. If I remember correctly, this was
already discussed somewhere on akka-user or the issue tracker.
Thanks,
Martin
On 04.09.14 03:02, Michael Diamant wrote:
My team uses akka-persistence 2.3.3 and akka-persistence-cassandra
0.3.1. Recently, in production, my team observed a View that did not
appear to be polling as expected. The application had been running
for about 12 hours (and previously has run for much longer without
issue). Updates to Cassandra did not propagate to the consuming
application. The consumer did not emit any error level logging (in
production, logging is set to error). The application is run on
multiple nodes. Restarting one application instance fixed the issue
(i.e. the View read all events on start-up and continued polling as
expected).
Having limited instrumentation available, there is not much else that
I can specify with certainty about the running instances with
suspected broken Views. The View actor is created with the default
supervision strategy (i.e. restart on exception), which rules out the
scenario that the actor was stopped. Additionally, local tests were
performed to confirm this behavior in the event of an exception.
The hypothesis my team formed to explain the situation is that perhaps
a call to Cassandra via the akka-persistence-cassandra journal never
returned. There are several issues related to the DataStax driver
(e.g. https://datastax-oss.atlassian.net/browse/JAVA-268) that might
be at play here. These issues appear to be resolved in 2.0.4, while
akka-persistence-cassandra is compiled against 2.0.1. My team will
upgrade accordingly.
Assuming this is the issue, I want to voice my concern about how
akka-persistence handles journals that fail to return a response.
Following the code, akka.persistence.Recovery tells the journal to read:
    journal ! ReplayMessages(lastSequenceNr + 1L, toSnr, replayMax,
      processorId, self)
Then, based on the response type (success/failure), appropriate
callbacks are invoked until ultimately in View, onReplayComplete() is
invoked. This function is responsible for scheduling the next polling
attempt. If the journal fails to respond, then the View never seeks
to poll again because there is no timeout mechanism (that I am aware of).
If what I'm talking through holds water, would it make sense to
consider adding a timeout to the View to ensure it continues to
attempt polling for updates? It could also make sense to add a
policy for reporting an error when this stale condition is detected.
I'm happy to think through the proposed enhancements further should
the hypothesis be validated.
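As a rough illustration of the proposed View-side timeout, the stale
check could be as small as the sketch below. ReplayWatchdog is a
hypothetical helper, not part of akka-persistence, and how it would be
wired into View is an assumption: the idea is to call replayRequested
before sending ReplayMessages, replayCompleted from onReplayComplete(),
and isStale from a periodic tick.

```scala
import scala.concurrent.duration._

// Hypothetical helper (an assumption, not an akka-persistence API).
// Tracks whether a replay request has been outstanding for too long.
final class ReplayWatchdog(timeout: FiniteDuration) {
  // Timestamp (millis) of the outstanding replay request, if any.
  private var pendingSince: Option[Long] = None

  // Call just before the replay request is sent to the journal.
  def replayRequested(nowMillis: Long): Unit = pendingSince = Some(nowMillis)

  // Call when the journal has answered (success or failure).
  def replayCompleted(): Unit = pendingSince = None

  // True if a request is outstanding and has been for at least `timeout`.
  def isStale(nowMillis: Long): Boolean =
    pendingSince.exists(start => nowMillis - start >= timeout.toMillis)
}
```

When isStale returns true, the View could log at error level and
schedule the next polling attempt itself rather than waiting for a
response that may never arrive.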
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ:
http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google
Groups "Akka User List" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to akka-user+unsubscr...@googlegroups.com
<mailto:akka-user+unsubscr...@googlegroups.com>.
To post to this group, send email to akka-user@googlegroups.com
<mailto:akka-user@googlegroups.com>.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.
--
Martin Krasser
blog: http://krasserm.blogspot.com
code: http://github.com/krasserm
twitter: http://twitter.com/mrt1nz