Hi Michael,

akka-persistence relies on asyncReplayMessages to complete the returned Future either with a result or a failure in any case (incl. a timeout). In this case, a response to ReplayMessages will always be generated. The communication between the journal actor and persistent actors (processors) is always local, so we can (at the moment) assume no request or response messages are lost.

The problem more likely lies in the Cassandra journal implementation: there seem to be failure scenarios in which the Future returned by asyncReplayMessages is never completed, which shouldn't be the case. Please open a ticket in akka-persistence-cassandra, with information on how to reproduce the failure scenario if possible.
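
To illustrate the contract described above (the Future must always complete, with a timeout failure at the latest), here is a minimal sketch using only the Scala standard library. It is not code from akka-persistence or akka-persistence-cassandra; the names ReplayTimeout and completeWithin are hypothetical, and a journal plugin could apply something along these lines to the Future it returns from asyncReplayMessages.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit, TimeoutException}
import scala.concurrent.{Await, ExecutionContext, Future, Promise}
import scala.concurrent.duration._
import scala.util.Try

// Hypothetical helper, not part of akka-persistence or akka-persistence-cassandra.
object ReplayTimeout {

  // Returns a Future that completes with the outcome of `underlying`,
  // or fails with a TimeoutException if `underlying` has not completed
  // within `timeout`. Whichever happens first wins.
  def completeWithin[T](underlying: Future[T], timeout: FiniteDuration)(
      implicit scheduler: ScheduledExecutorService, ec: ExecutionContext): Future[T] = {
    val promise = Promise[T]()

    // Fail the promise when the timeout elapses (no effect if already completed).
    val timer = scheduler.schedule(new Runnable {
      def run(): Unit = {
        promise.tryFailure(new TimeoutException(s"no result after $timeout"))
        ()
      }
    }, timeout.toMillis, TimeUnit.MILLISECONDS)

    // Propagate the real outcome and cancel the pending timer.
    underlying.onComplete { result =>
      timer.cancel(false)
      promise.tryComplete(result)
    }

    promise.future
  }
}
```

With such a guard in place, a driver call that never returns would surface as a failed Future, so a failure response to ReplayMessages would still be generated.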

Nevertheless, having a timeout mechanism directly in akka-persistence would be a reasonable addition. If I remember correctly, this was already discussed somewhere on akka-user or the issue tracker.
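
As a rough model of what such a timeout could look like on the consuming side, the sketch below schedules the next poll either when the replay completes or when a timer expires, whichever comes first. It is plain Scala, not akka-persistence code: PollingLoop, replay, replayTimeout and onStale are all hypothetical names, and replay merely stands in for one replay request sent to the journal.

```scala
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
import java.util.concurrent.atomic.AtomicBoolean
import scala.concurrent.{ExecutionContext, Future}
import scala.concurrent.duration._

// Hypothetical model of a polling view; none of these names exist in akka-persistence.
final class PollingLoop(
    replay: () => Future[Unit],    // stands in for one replay request to the journal
    interval: FiniteDuration,      // delay between two polls
    replayTimeout: FiniteDuration, // how long to wait for the journal to respond
    onStale: () => Unit            // invoked when a replay did not respond in time
)(implicit scheduler: ScheduledExecutorService, ec: ExecutionContext) {

  def start(): Unit = poll()

  // Runs `body` once on the scheduler after `delay`.
  private def schedule(delay: FiniteDuration)(body: => Unit): Unit = {
    scheduler.schedule(new Runnable { def run(): Unit = body },
      delay.toMillis, TimeUnit.MILLISECONDS)
    ()
  }

  private def poll(): Unit = {
    // Ensures the next poll is scheduled exactly once per cycle,
    // either by the journal's response or by the timeout.
    val settled = new AtomicBoolean(false)
    def next(): Unit = schedule(interval)(poll())

    // Timeout path: report the stale condition and keep polling.
    schedule(replayTimeout) {
      if (settled.compareAndSet(false, true)) { onStale(); next() }
    }

    // Normal path: the replay finished (successfully or not) in time.
    replay().onComplete { _ =>
      if (settled.compareAndSet(false, true)) next()
    }
  }
}
```

The onStale callback is where an error could be logged, so that the condition is visible even when logging is set to error level.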

Thanks,
Martin

On 04.09.14 03:02, Michael Diamant wrote:
My team uses akka-persistence 2.3.3 and akka-persistence-cassandra 0.3.1. Recently, in production, my team observed a View that did not appear to be polling as expected. The application had been running for about 12 hours (and previously has run for much longer without issue). Updates to Cassandra did not propagate to the consuming application. The consumer did not emit any error level logging (in production, logging is set to error). The application is run on multiple nodes. Restarting one application instance fixed the issue (i.e. the View read all events on start-up and continued polling as expected).

Having limited instrumentation available, there is not much else that I can specify with certainty about the running instances with suspected broken Views. The View actor is created with the default supervision strategy (i.e. restart on exception), which rules out the scenario that the actor was stopped. Additionally, local tests were performed to confirm this behavior in the event of an exception.

The hypothesis my team formed to explain the situation is that perhaps a call to Cassandra via the akka-persistence-cassandra journal never returned. There are several issues related to the DataStax driver (e.g. https://datastax-oss.atlassian.net/browse/JAVA-268) that might be at play here. These issues appear to be resolved in 2.0.4, while akka-persistence-cassandra is compiled against 2.0.1. My team will upgrade accordingly.

Assuming this is the issue, I want to voice my concern about how akka-persistence handles journals that fail to return a response. Following the code, akka.persistence.Recovery tells the journal to read: journal ! ReplayMessages(lastSequenceNr + 1L, toSnr, replayMax, processorId, self)

Then, based on the response type (success/failure), appropriate callbacks are invoked until ultimately in View, onReplayComplete() is invoked. This function is responsible for scheduling the next polling attempt. If the journal fails to respond, then the View never seeks to poll again because there is no timeout mechanism (that I am aware of).

If what I'm talking through holds water, would it make sense to consider adding a timeout to the View to ensure it continues to attempt polling for updates? It could also make sense to instrument a policy for reporting an error when this stale condition is discovered. I'm happy to think through the proposed enhancements further should the hypothesis be validated.
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups "Akka User List" group. To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+unsubscr...@googlegroups.com <mailto:akka-user+unsubscr...@googlegroups.com>. To post to this group, send email to akka-user@googlegroups.com <mailto:akka-user@googlegroups.com>.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.

--
Martin Krasser

blog:    http://krasserm.blogspot.com
code:    http://github.com/krasserm
twitter: http://twitter.com/mrt1nz
