Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Hi, from your question it looks like you want to build up a persistent view by merging journal streams from multiple persistence ids. That is a common use case, and my experience is that it is a bit cumbersome, but doable today. However, you want strict replay ordering over multiple persistent actors. If you have a requirement of strict ordering across aggregate roots, that sounds like a design flaw in your application; are you perhaps dividing up your domain too granularly? In my view, your persistent actors should be your aggregate roots, period. Your persistent actor can of course have an eventual-consistency dependency on other actors, for deciding logic or validating input before persisting. That being said, for views there is often a need to merge streams of events from multiple journals to build up an aggregated view. But if your persistent actors are aggregate roots, then it does not make sense that the view would have any guarantee of the ordering. Events are things that happened in the past, so you don't need to validate them after the fact. Other types of ordering seem more like application-specific problems. Here are some suggestions:

1. *First come, first served ordering:* Set up a view aggregation actor that is fed events from multiple journal sources. Your aggregation actor is a persistent actor and will persist each message in the sequence it arrives. You now have strict ordering in your aggregation actor, and replays are guaranteed to be in the same order the events arrived. Of course this uses up extra storage, and you need to keep track of any implicit dependencies if you were to create multiple levels of these.

2. *Time-based ordering:* If it makes sense in your application and you trust the clock on your servers, you can relax your requirements and include a persist timestamp in your message when journaling.
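A minimal sketch of the time-based idea in plain Scala (not an Akka API; `Stamped`, `MergeBuffer`, and the watermark scheme are illustrative names of my own): tag each persisted event with a timestamp, buffer events arriving from several replay streams, and only emit, in timestamp order, those events that are at or below a watermark all sources have passed, so a slow source cannot deliver an older event after you have emitted newer ones.

```scala
// Sketch: merge events from multiple journal replays by persist timestamp.
// `Stamped` and `MergeBuffer` are illustrative names, not Akka APIs.
final case class Stamped[A](timestamp: Long, persistenceId: String, event: A)

final class MergeBuffer[A] {
  private var buffer = Vector.empty[Stamped[A]]

  // Add an event arriving from any source stream.
  def offer(e: Stamped[A]): Unit = buffer :+= e

  // Emit every buffered event whose timestamp is at or below `watermark`
  // (e.g. the minimum of the latest timestamps seen per source), sorted by
  // timestamp, with persistenceId as a deterministic tie-breaker.
  def emitUpTo(watermark: Long): Vector[Stamped[A]] = {
    val (ready, rest) = buffer.partition(_.timestamp <= watermark)
    buffer = rest
    ready.sortBy(e => (e.timestamp, e.persistenceId))
  }
}
```

The watermark is what makes the relaxed guarantee explicit: ordering is only as trustworthy as the server clocks, exactly as the caveat above says.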
When you replay messages from two sources (persistent views) you can merge events into an event-stream buffer that sorts events on the persist timestamp before emitting messages.

3. *Shared sequence ordering:* basically your original idea combined with an event-stream buffer. You include an extra field which carries sequence numbers fed from your sequence source, then replay into a journal stream buffer that makes sure events are emitted in the correct order. If you are thinking about a shared source of sequential ids, Twitter had something called Snowflake https://github.com/twitter/snowflake (written in Scala). The project is deprecated now, but the history and code are there.

/Magnus
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Hi Roland / list, I am looking into an addition/mutation to the persistence layer that allows storage of an aggregateId (more or less the whole 'tag' idea, without being able to have multiple tags to start out with), with a replay (for the view) based on that aggregateId (a bit like the DDD aggregate root). Replay is started with a message that contains a start sequence number and assumes (logically) that the sequence will go up. With regard to the aggregateId, replay is for all persistenceIds that have registered this aggregateId. If you wish to allow replay on the aggregate level, the sequence numbering should be on the aggregate level, with the side effect that the sequence numbering on the persistenceId level will go up but with 'gaps'. When you are not dependent on a gapless series of persistence events, that won't be an issue (just keep the last processed persistenceId sequence number for your snapshot, and it will still work). Any opinion on this? Does somebody have a use case that requires gapless persistenceId sequence numbers? Kind regards, Olger
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Well, I found that the sequence numbers are actually generated on a per-persistent-actor-instance basis. So that makes replay for a single aggregateId with limits on the sequence numbers a bit of an interesting challenge. Still interested in your opinions, as that will have an impact on the way to solve this (some kind of atomic sequence generator shared between aggregates?)
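A small sketch of the gap-tolerant bookkeeping Olger describes ("just keep the last processed persistenceId sequence number for your snapshot"): a view tracks the last processed sequence number per persistenceId, so gaps in any single id's numbering don't matter as long as each id replays in increasing order. Plain Scala; `ViewProgress` is an illustrative name, not an Akka class.

```scala
// Sketch: per-persistenceId progress tracking for a view snapshot.
// Gaps in a single id's sequence numbers are fine; we only require
// that each id's events arrive in increasing sequence order.
final class ViewProgress {
  private var lastSeqNr = Map.empty[String, Long]

  // Returns true if the event is new and should be applied to the view;
  // false if it is a duplicate (e.g. redelivered during replay).
  def shouldApply(persistenceId: String, seqNr: Long): Boolean =
    lastSeqNr.get(persistenceId) match {
      case Some(last) if seqNr <= last => false // already processed
      case _ =>
        lastSeqNr += persistenceId -> seqNr     // note: gaps are allowed
        true
    }

  // State to persist alongside the view's snapshot.
  def snapshotState: Map[String, Long] = lastSeqNr
}
```

On recovery the view restores `snapshotState` and resumes each persistenceId's replay from its stored number, which is exactly why gapless numbering is not required.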
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Hi Murali, I started development of an application based on Akka Persistence to implement CQRS concepts about a year ago. A lot of ideas came from different topics in this group. Recently I started to extract a small library from this application. The approach I took is to redundantly store all events in a global persistent actor in order to recreate views in a journal-agnostic way. This also allows for easy in-memory testing. There is a minor risk, in that storing the event twice can break consistency. I use the globally stored events only for (re)constructing views. Once there is a better solution, the global persistent actor can be deleted and the views can be reconstructed in a different manner. Even though this solution is not perfect, it might help your use case. I will add more tests and documentation over time as well, since currently all tests remain in the application code. https://github.com/Product-Foundry/akka-cqrs Hope this helps, Andre
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Hi Murali, the core team at Typesafe cannot work on this right now (we need to finish Streams and HTTP first and have some other obligations as well), but Akka is an open-source project and we very much welcome contributions of all kinds. In this case we should probably start by defining more closely which queries to (initially) support and how to model them in the various backends, so that we can get a feel for how we shall change the Journal SPI. Regards, Roland

On 27 Mar 2015 at 12:41, Ganta Murali Krishna gant...@gmail.com wrote: Hello Roland, any news on this, please? When can we expect the implementation, roughly? Your response will be really appreciated. Regards Murali

On Wednesday, 27 August 2014 20:04:30 UTC+5:30, rkuhn wrote: Dear hakkers, there have been several very interesting, educational and productive threads in the past weeks (e.g. here https://groups.google.com/d/msg/akka-user/SL5vEVW7aTo/KfqAXAmzol0J and here https://groups.google.com/d/msg/akka-user/4kbYcwWS2OI/hpmAkxnB9D4J). We have taken some time to distill the essential problems as well as discuss the proposed solutions, and below is my attempt at a summary. In the very likely case that I missed something, by all means please raise your voice. The intention for this thread is to end with a set of GitHub issues for making Akka Persistence as closely aligned with CQRS/ES principles as we can make it. As Greg and others have confirmed, the write-side (PersistentActor) is already doing a very good job, so we do not see a need to change anything at this point. My earlier proposal of adding specific topics as well as the discussed labels or tags all feel a bit wrong, since they benefit only the read-side and should therefore not be a concern/duty of the write-side.
On the read-side we came to the conclusion that PersistentView basically does nearly the right thing, but it focuses on the wrong aspect: it seems most suited to track a single PersistentActor with some slack, but also not with back-pressure as a first-class citizen (it is possible to achieve it, albeit not trivial). What we distilled as the core functionality for a read-side actor is the following:
- it can ask for a certain set of events
- it consumes the resulting event stream on its own schedule
- it can be stateful and persistent on its own
This does not preclude populating e.g. a graph database or a SQL store directly from the journal back-end via Spark, but we do see the need to allow Akka Actors to be used to implement such a projection. Starting from the bottom up, allowing the read-side to be a PersistentActor in itself means that receiving Events should not require a mixin trait like PersistentView. The next bullet point means that the Event stream must be properly back-pressured, and we have a technology under development that is predestined for such an endeavor: Akka Streams. So the proposal is that any Actor can obtain the ActorRef for a given Journal and send it a request for the event stream it wants, and in response it will get a message containing a stream (i.e. Flow) of events and some meta-information to go with it. The question that remains at this point is what exactly it means to “ask for a certain set of events”. In order to keep the number of abstractions minimal, the first use-case for this feature is the recovery of a PersistentActor.
Each Journal will probably support different kinds of queries, but it must for this use-case respond to

case class QueryByPersistenceId(id: String, fromSeqNr: Long, toSeqNr: Long)

with something like

case class EventStreamOffer(metadata: Metadata, stream: Flow[PersistentMsg])

The metadata allows the recipient to correlate this offer with the corresponding request, and it contains other information as we will see in the following. Another way to ask for events was discussed as Topics or Labels or Tags in the previous threads, and the idea was that the generated stream of all events was enriched by qualifiers that allow the Journal to construct a materialized view (e.g. a separate queue that copies all events of a given type). This view then has a name that is requested from the read-side in order to e.g. have an Actor that keeps track of certain aspects of all persistent ShoppingCarts in a retail application. As I said above, we think that this concern should be handled outside of the write-side because logically it does not belong there. Its closest cousin is the construction of an additional index or view within a SQL store, maintained by the RDBMS upon request from the DBA, but available to and relied upon by the read-side. We propose that this is also how this should work with Akka Persistence: the Journal is free to allow the configuration of materialized views that can be requested as event streams by name. The extraction of the indexing characteristics is performed by the Journal
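To make the shape of the proposed request/offer protocol concrete, here is a toy model in plain Scala. The `Metadata` fields (a correlation id plus the persistenceId) are my own assumption of what the correlation information might look like, and a `Vector` stands in for the real back-pressured Akka Streams `Flow`:

```scala
// Toy model of the proposed journal query protocol.
// In the real proposal the stream would be an Akka Streams Flow, delivered
// with back-pressure; a plain Vector stands in for it here.
final case class QueryByPersistenceId(id: String, fromSeqNr: Long, toSeqNr: Long)
final case class Metadata(correlationId: Long, persistenceId: String)
final case class EventStreamOffer[A](metadata: Metadata, stream: Vector[A])

// A trivial in-memory "journal" answering the query with the requested slice.
final class ToyJournal[A](events: Map[String, Vector[(Long, A)]]) {
  private var nextCorrelation = 0L

  def query(q: QueryByPersistenceId): EventStreamOffer[A] = {
    nextCorrelation += 1
    val slice = events.getOrElse(q.id, Vector.empty)
      .collect { case (seqNr, e) if seqNr >= q.fromSeqNr && seqNr <= q.toSeqNr => e }
    EventStreamOffer(Metadata(nextCorrelation, q.id), slice)
  }
}
```

The point of the metadata is visible even in the toy: an actor that has several outstanding queries can match each incoming offer back to the request that caused it.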
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Maybe in 'The Reactive Manifesto' v3.0: the human user, as an auxiliary part of a reactive system, should be responsive, resilient, elastic and message-driven too. He may be the weakest link in the chain.

On Wednesday, 7 January 2015 at 22:15:42 UTC+1, Greg Young wrote: "The consistency of the query model should be achieved as soon as possible and close to real-time." It really depends on the domain. I have worked in many situations where the data in question would be perfectly fine updated once per month. (e.g. adding a sold-out item to the shopping cart) This is a funny example, because it shows not that you need to update read models more quickly but that you need to get the whole business on board. Remember that computer systems are normally part of a larger system fulfilling business needs. It really is a mind shift moving to eventual consistency. In the example of adding a sold-out item... why stop it? Does it matter that we don't have any of this item? The real question is how quickly we can get it and if it's worth our while to do so. To be fair, 30 years ago these times were much, much higher than what we talk about today, and yet businesses still managed to work their way through things. For many of these types, allowing things to go incorrectly is actually a good thing (overbooked seats on an airline, overdraft charges at banks...). To really benefit from eventual consistency, the whole business process must recognize it. In terms of handling failures, they are normally handled in a reactive, not a preventative manner (like most business problems). Detect the failure, let a human deal with it. At the end of the day the primary role of the computer system is to take workload off of humans. You will hit the law of diminishing returns. Don't try to solve every problem :) Greg

On Wed, Jan 7, 2015 at 11:07 PM, Sebastian Bach sebastian@gmail.com wrote: Hi Roland, one thing to keep in mind in the CQRS/ES architecture is that not only does the query side depend on the command side (by following the event stream), but the command side also depends on the query side for validation of complex business rules. This has a deep impact on correctness and throughput. Validation checks on a potentially outdated query model in an eventually consistent architecture are a hard problem (e.g. adding a sold-out item to the shopping cart). The consistency of the query model should be achieved as soon as possible and close to real-time. A PersistentView in Akka has a default of 5s? On the other hand, the speed of validation depends on the speed of the queries. And the throughput depends on the validation speed. Thus, queries directly on the whole event stream are less useful than persistent projections. Keep up the good work :) Cheers Sebastian

On Tuesday, 7 October 2014 at 07:32:20 UTC+2, rkuhn wrote: Hi Vaughn, from our side nothing has happened yet: my conclusion is that this thread contains all the information we need when we start working on this. The reason why we are waiting is that this work will depend heavily upon Akka Streams, and therefore we are finishing those first, which should take roughly one month. Meanwhile, if use cases come up which could be used to refine the plans, please point them out here so that we can take all the inputs into account. Regards, Roland
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Thank you Greg. The mind shift from a preventive to a reactive workflow is not easy for users (humans), because it requires a change of habits. For many people computer systems are kind of authoritative. There is this wrong assumption (from early days of computation?) that a computer accepts only a valid input or returns an error. It was then only black or white, the golden era of transactions. But this was (as you pointed out) always to some degree an hypocrisy. Now we have this shades of gray and many users feel unsettled. This holds true for any kind of resource allocation application and the overbooking (or wrong booking) problem. Some of the users define taking workload off of them as avoiding of planning mistakes, like commit and forget. But the actual workflow seems to shift towards an iterative process of human-computer interaction, to some kind of react-react ping-pong. Best Regards Sebastian W dniu środa, 7 stycznia 2015 22:15:42 UTC+1 użytkownik Greg Young napisał: The consistency of the query model should be achieved as soon as possible and close to real-time. It really depends on the domain. I have worked in many situations where the data in question would be perfectly fine updated once per month. (e.g. adding a sold out item to the shopping cart). This is a funny example because it shows not that you need to update read models more quickly but that you need to get the whole business on board. Remember that computer systems are normally part of a larger system fulfilling business needs. It really is a mind shift moving to eventual consistency. In the example of adding a sold out item... why stop it? Does it matter that we don't have any of this item? The real question is how quickly we can get it and if its worth our while to do so. To be fair 30 years ago these times were much much higher than what we talk about today and yet businesses still managed to work their way through things. 
For many of these types allowing things to go incorrectly is actually a good thing (overbooked seats on an airline, overdraft charges at banks...). To really be benefiting from eventual consistency the whole business process must recognize it. In terms of handling failures they are normally handled in a reactive not a preventative manner (like most business problems). Detect the failure, let a human deal with it. At the end of the day the primary role of the computer system is to take workload off of humans. You will hit the law of diminishing returns. dont try to solve every problem :) Greg On Wed, Jan 7, 2015 at 11:07 PM, Sebastian Bach sebastian@gmail.com javascript: wrote: Hi Roland, one thing to keep in mind in the CQRS/ES architecture is that not only the query side depends on the command side (by following the event stream) but also the command side depends on the query side for validation of complex business rules. This has a deep impact on correctness and throughput. Validation checks on an potentially outdated query model in an eventually consistent architecture is a hard problem (e.g. adding a sold out item to the shopping cart). The consistency of the query model should be achieved as soon as possible and close to real-time. A PersistentView in Akka has a default of 5s? On the other hand the speed of validation depends on the speed of the queries. And the throughput depends on the validation speed. Thus, queries directly on the whole event stream are less useful than persistent projections. Keep up the good work :) Cheers Sebastian W dniu wtorek, 7 października 2014 07:32:20 UTC+2 użytkownik rkuhn napisał: Hi Vaughn, from our side nothing has happened yet: my conclusion is that this thread contains all the information we need when we start working on this. The reason why we are waiting is that this work will depend heavily upon Akka Streams and therefore we are finishing those first, which should take roughly one month. 
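Greg's "detect the failure, let a human deal with it" workflow can be made concrete. The following is a minimal, library-free sketch (all names, such as OversoldDetected and reconcile, are illustrative assumptions, not an Akka Persistence API): the command side optimistically accepts an order against a possibly stale view, and a projector later appends a compensating event instead of preventing the write up front.

```scala
// Minimal sketch of the "reactive, not preventative" handling discussed
// above. All names here are illustrative assumptions.
object ReactiveInventory {
  sealed trait Event
  final case class OrderAccepted(sku: String, qty: Int) extends Event
  final case class StockArrived(sku: String, qty: Int) extends Event
  // Compensating event, appended when the projection detects an oversell.
  final case class OversoldDetected(sku: String, short: Int) extends Event

  // The command side validates against a possibly stale view and never
  // blocks: it optimistically accepts the order (eventual consistency).
  def accept(journal: Vector[Event], sku: String, qty: Int): Vector[Event] =
    journal :+ OrderAccepted(sku, qty)

  // A projector folds the journal into stock-on-hand and, reactively,
  // emits a compensating event instead of preventing the write up front.
  def reconcile(journal: Vector[Event], sku: String): Vector[Event] = {
    val onHand = journal.foldLeft(0) {
      case (n, StockArrived(`sku`, q))  => n + q
      case (n, OrderAccepted(`sku`, q)) => n - q
      case (n, _)                       => n
    }
    if (onHand < 0) journal :+ OversoldDetected(sku, -onHand) else journal
  }
}
```

Here an order for 3 units against 2 on hand is accepted anyway, and reconciliation appends OversoldDetected("sku-1", 1) for a human (or another process) to resolve.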
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Usually it comes down to the realization that the computer is not the book of record. One of my favourites was being asked to build a fully consistent inventory system. I generally like to approach things with questions, the one i had was 'sure but how do we get people who are stealing stuff to appropriately check it out?' On 9 Jan 2015 12:02, Sebastian Bach sebastian.tomasz.b...@gmail.com wrote:
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
On Fri, Jan 9, 2015 at 11:56 AM, Greg Young gregoryyou...@gmail.com wrote: Usually it comes down to the realization that the computer is not the book of record. [...] +Long.MAX_VALUE on this. Building warehouse management systems is very eye-opening. On 9 Jan 2015 12:02, Sebastian Bach sebastian.tomasz.b...@gmail.com wrote:
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
That is a great point Greg. On Wed, Jan 7, 2015 at 10:15 PM, Greg Young gregoryyou...@gmail.com wrote: The consistency of the query model should be achieved as soon as possible and close to real-time. It really depends on the domain. [...]
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Hi Vaughn, from our side nothing has happened yet: my conclusion is that this thread contains all the information we need when we start working on this. The reason why we are waiting is that this work will depend heavily upon Akka Streams and therefore we are finishing those first, which should take roughly one month. Meanwhile, if use cases come up which could be used to refine the plans, please point them out here so that we can take all the inputs into account. Regards, Roland On 6 Oct 2014, at 20:09, Vaughn Vernon vver...@shiftmethod.com wrote: Hi Roland, It's been a month since the last update on this and I have lost track of the status. Can you provide an update on where this stands? Is there a more recent akka-persistence build that supports the conclusions reached in this discussion? If so, what is the release number? If not, when will the proposed features be released? Best, Vaughn On Fri, Sep 5, 2014 at 1:09 AM, Roland Kuhn goo...@rkuhn.info wrote: [...]
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Hi Markus, thanks for this very thoughtful contribution! [comments inline] On 31 Aug 2014, at 15:05, Markus H m...@heckelmann.de wrote: Hi Roland, sounds great that you are pushing for the whole CQRS story. I'm just experimenting with akka and CQRS and have no production experience, but I've been thinking about the concepts for some time. So please take my comments with a big grain of salt and forgive me for making it sound somewhat like a wish list. But I think if there is a time for wishes, it might be now. I feel the need to distinguish a little more between commands, queries, reads and writes. In a CQRS setup, I (conceptually) see three parts:
(1) the Command side: mainly eventsourced PersistentActors that build little islands of consistency for changing the application state
(2) the link between the Command and Query side: the possibility to make use of all or a subset of all events/messages written in (1) for building an optimized query side, or even for updating other islands of consistency on the Command side (other aggregates or bounded contexts) in (1)
(3) the Query side: keeping an eventually consistent view of the application state in any form that is suitable for fast application queries
From the 1-foot view, we write to (1) and read from (3), but each of (1), (2) and (3) has its own reads and writes:
(1) writes: messages produced by the application to be persisted; reads: consistent read of the persisted messages for actor replay, and the stream of all messages (I think that's what you mean by SuperscalableTopic)
(2) writes: at least conceptually, all messages of the all-messages stream of (1); reads: different subsets of the all-messages stream that make sense to different parts of our application
(3) writes: any query-optimized form of our data that was read off some sub-stream of (2); reads: whatever queries the query-side datastore allows (SQL, fulltext searches, graph walks etc.)
While (1) is the current akka persistence implementation plus a way to get all messages, (2) is more like the Event Bus (though I would name it differently) in this picture from the axonframework documentation (http://www.axonframework.org/docs/2.0/images/detailed-architecture-overview.png). (1) and (2) could be done by one product, like what eventstore does with its projections, or by different products, like Cassandra for (1) and Kafka for (2). (3) could be anything that holds data. Yes, this is a very good summary. Some more detail: On (1): As said before, the command side is fine as it is today to put messages into the datastore and get them out again for persistent actors. I definitely would consider replaying of messages for persistent actors part of the command side, since the command side in itself has stronger consistency requirements (consistency within an aggregate) than the query side. Additionally, as Ashley wrote, any actor should be able to push messages to the all-messages stream. In contrast to persistent actors I don't see any need for replay here. Therefore, and for other reasons (like message deduplication in (2)), I would like to propose adding an extra unique ID (UUID?) for any message handled by the command side, independent of an actor's persistenceId (which would be needed for replays nevertheless). Also, I see the need to provide some guarantees for the all-messages stream. I would consider an ordering guarantee for messages from the same actor and an otherwise (at least roughly) timestamp-based sorting a good compromise. This would also be comparable to the guarantees that akka provides for message sending. Ideally, the order stays the same for repeated reads of the all-messages stream. With the guarantees mentioned before, if the datastore keeps all messages of an actor on the same node, the all-messages stream could even be created per datastore node.
On (2): As mentioned above, I see the QueryByPersistenceId as part of (1) as it requires stronger consistency guarantees. All other QueryByWhatever are all about the question of how to retrieve the right subset of messages from the all-messages stream for the application and its domain. This of course differs by application and domain. Therefore I like Martin's QueryByStream(name, ...), where a stream is any subset of messages the application cares about. I also think it should not be up to the datastore to decide what streams to offer. I also can't really imagine how this would work in most datastores. While there might be some named index on top of JSON messages in MongoDB that can be served as a stream, I don't see how to create a stream/view/index in Key-Value stores or RDBMS where a message is probably persisted as a byte array without any knowledge of the application. To tell the datastore what
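Markus's unique-ID proposal can be sketched without any Akka machinery. This is a hedged illustration only (Journaled and DedupingConsumer are made-up names, not part of any Akka API): each journaled message carries a UUID independent of the actor's persistenceId, so a consumer of the all-messages stream can cheaply drop redeliveries.

```scala
import java.util.UUID

// Sketch of the proposal above: every journaled message carries a unique id
// independent of the actor's persistenceId, so downstream consumers of the
// all-messages stream can deduplicate. Names are illustrative assumptions.
final case class Journaled(id: UUID, persistenceId: String, payload: Any)

final class DedupingConsumer {
  private var seen = Set.empty[UUID]
  private var log  = Vector.empty[Journaled]

  // At-least-once delivery upstream means the same message may arrive twice;
  // the UUID makes the second delivery cheap to detect and drop.
  // Returns true when the message was new, false for a redelivery.
  def onMessage(m: Journaled): Boolean =
    if (seen(m.id)) false
    else { seen += m.id; log :+= m; true }

  def received: Vector[Journaled] = log
}
```

Note that this dedup works regardless of which actor produced the message, which is exactly why the id must be independent of the per-actor sequence numbers used for replay.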
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Attempting a second round-up of what shall go into tickets, in addition to my first summary we need to:
* predefine trait JournalQuery with minimal semantics (to make the Journal support discoverable at runtime)
* predefine queries for named streams since that is universally useful; these are separate from PersistenceID queries due to different consistency requirements
* add support for write-side tags (see below)
* add a comprehensive PersistenceTestKit which supports the fabrication of arbitrary event streams for both PersistentActor and read-side verification
Ashley, your challenge about considering non-ES write-sides is one that I think we might not take up: the scope of Akka Persistence is to support persistent Actors and their interactions, therefore I believe we should be opinionated about how we achieve that. If you want to use CQRS without ES then Akka might just not be for you (for some values of “you”, not necessarily you ;-) ). Now why tags? My previous conclusion was that burdening the write-side with generating them goes counter to the spirit of ES in that this tagging should well be possible after the fact. The problem is that that can be extremely costly, so spawning a particular query on the read-side should not implicitly replay all events of all time, as that has the potential of bringing down the whole system due to overload. I still think that stores might want to offer this feature under the covers—i.e. not accessible via Akka Persistence standard APIs—but for those that cannot we need to provide something else. The most prominent use of tags will probably be that each kind of PersistentActor has its own tag, solving the type issue as well (as brought up by Alex and Olger). In summary, write-side tags are just an optimization. Concerning the ability to publish to arbitrary topics from any Actor I am on the fence: this is a powerful feature that can be quite a burden to implement.
What we are defining here is—again—Akka Persistence, meaning that all things that are journaled are intended to stay there eternally. Using this to realize a (usually ephemeral) event bus is probably going to suffer from impedance mismatches, as witnessed by previous questions concerning the efficiency of deleting log entries—something that should really not be done in the intended use-cases. So, if an Actor wants to persist an Event to make it part of the journaled event stream, then I’d argue that that Actor is at least conceptually a PersistentActor. What is wrong with requiring it to be one also in practice? The only thing that we might want to add is that recovery (i.e. the write-side reading of events) can be opted out of. Thoughts? For everything going beyond the above I’d say we should wait and see what extensions are provided by Journal implementations and how well they work in practice. Regards, Roland On 27 Aug 2014, at 16:34, Roland Kuhn goo...@rkuhn.info wrote: Dear hakkers, there have been several very interesting, educational and productive threads in the past weeks (e.g. here https://groups.google.com/d/msg/akka-user/SL5vEVW7aTo/KfqAXAmzol0J and here https://groups.google.com/d/msg/akka-user/4kbYcwWS2OI/hpmAkxnB9D4J). We have taken some time to distill the essential problems as well as discuss the proposed solutions and below is my attempt at a summary. In the very likely case that I missed something, by all means please raise your voice. The intention for this thread is to end with a set of github issues for making Akka Persistence as closely aligned with CQRS/ES principles as we can make it. As Greg and others have confirmed, the write-side (PersistentActor) is already doing a very good job, so we do not see a need to change anything at this point. My earlier proposal of adding specific topics as well as the discussed labels or tags all feel a bit wrong since they benefit only the read-side and should therefore not be a concern/duty of the write-side.
On the read-side we came to the conclusion that PersistentView basically does nearly the right thing, but it focuses on the wrong aspect: it seems most suited to track a single PersistentActor with some slack, but also not with back-pressure as a first-class citizen (it is possible to achieve it, albeit not trivial). What we distilled as the core functionality for a read-side actor is the following:
* it can ask for a certain set of events
* it consumes the resulting event stream on its own schedule
* it can be stateful and persistent on its own
This does not preclude populating e.g. a graph database or a SQL store directly from the journal back-end via Spark, but we do see the need to allow Akka Actors to be used to implement such a projection. Starting from the bottom up, allowing the read-side to be a PersistentActor in itself means that receiving Events should not require a mixin trait like PersistentView. The next bullet point means that the Event stream must be properly back-pressured, and we have a technology under development that is predestined for such an endeavor: Akka Streams. So
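The core read-side functionality just listed can be sketched in plain Scala. This is an illustration under loud assumptions: JournalQuery is the trait name proposed in this thread, but the signature, the Ev type, and the demand-based pull loop are inventions standing in for the real (then-unreleased) Akka Streams based API. Back-pressure is modeled as the consumer explicitly requesting a bounded batch, in the Reactive Streams demand style.

```scala
// A library-free sketch of the read-side shape discussed above: a minimal
// JournalQuery (name from this thread; signature is an assumption) plus a
// consumer that pulls events at its own pace, i.e. back-pressure modeled
// as explicit demand rather than via Akka Streams.
final case class Ev(persistenceId: String, seqNr: Long, tag: String, payload: String)

trait JournalQuery {
  // Replay events carrying a given tag, starting at an offset, at most `max`.
  def byTag(tag: String, fromOffset: Int, max: Int): Vector[Ev]
}

final class InMemoryJournal(events: Vector[Ev]) extends JournalQuery {
  def byTag(tag: String, fromOffset: Int, max: Int): Vector[Ev] =
    events.filter(_.tag == tag).slice(fromOffset, fromOffset + max)
}

// The read-side actor of the summary: it asks for a certain set of events
// and consumes the resulting stream on its own schedule (`demand` at a time).
final class Projection(journal: JournalQuery, tag: String, demand: Int) {
  private var offset = 0
  var state = Vector.empty[String]

  def pull(): Unit = {
    val batch = journal.byTag(tag, offset, demand)
    offset += batch.size
    state ++= batch.map(_.payload)
  }
}
```

The consumer never receives more than it asked for, which is the essential property the thread wants from a streams-based query API.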
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
On 05.09.14 09:09, Roland Kuhn wrote:

Attempting a second round-up of what shall go into tickets. In addition to my first summary we need to:

* predefine a trait JournalQuery with minimal semantics (to make Journal query support discoverable at runtime)
* predefine queries for named streams, since that is universally useful; these are separate from persistenceId queries due to different consistency requirements
* add support for write-side tags (see below)
* add a comprehensive PersistenceTestKit which supports the fabrication of arbitrary event streams, for both PersistentActor testing and read-side verification

Ashley, your challenge about considering non-ES write-sides is one that I think we might not take up: the scope of Akka Persistence is to support persistent Actors and their interactions, therefore I believe we should be opinionated about how we achieve that. If you want to use CQRS without ES, then Akka might just not be for you (for some values of “you”, not necessarily you ;-) ).

Now why tags? My previous conclusion was that burdening the write-side with generating them runs counter to the spirit of ES, in that tagging should be possible after the fact. The problem is that doing it after the fact can be extremely costly: spawning a particular query on the read-side should not implicitly replay all events of all time, since that has the potential of bringing down the whole system due to overload. I still think that stores might want to offer this feature under the covers—i.e. not accessible via the standard Akka Persistence APIs—but for those that cannot, we need to provide something else. The most prominent use of tags will probably be that each kind of PersistentActor gets its own tag, which also solves the type issue (as brought up by Alex and Olger). In summary, write-side tags are just an optimization.

Concerning the ability to publish to arbitrary topics from any Actor, I am on the fence: this is a powerful feature that can be quite a burden to implement. What we are defining here is—again—Akka Persistence, meaning that everything journaled is intended to stay there eternally. Using this to realize a (usually ephemeral) event bus is probably going to suffer from impedance mismatches, as witnessed by previous questions concerning the efficiency of deleting log entries—something that should really not be done in the intended use-cases. So, if an Actor wants to persist an Event to make it part of the journaled event stream, then I’d argue that that Actor is at least conceptually a PersistentActor. What is wrong with requiring it to be one in practice as well? The only thing that we might want to add is that recovery (i.e. the write-side reading of events) can be opted out of. Thoughts?

Already doable with:

    override def preStart(): Unit = {
      self ! Recover(fromSnapshot = SnapshotSelectionCriteria.None, replayMax = 0L)
    }

but maybe you were more thinking about different traits for writing and recovery (?)

For everything going beyond the above I’d say we should wait and see what extensions are provided by Journal implementations and how well they work in practice. Regards, Roland

27 aug 2014 kl. 16:34 skrev Roland Kuhn goo...@rkuhn.info: [quoted original post elided; it appears below]
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Hi Roland, This will certainly simplify my code. So from my side it will be a good 'start' to experiment with these additions in practice and see what's missing. Kind regards, Olger

On Friday, September 5, 2014 9:49:20 AM UTC+2, Martin Krasser wrote: [quoted text elided; Martin's message appears in full above]
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
5 sep 2014 kl. 09:49 skrev Martin Krasser krass...@googlemail.com: [quoted round-up elided; it appears in full above]

Already doable with:

    override def preStart(): Unit = {
      self ! Recover(fromSnapshot = SnapshotSelectionCriteria.None, replayMax = 0L)
    }

but maybe you were more thinking about different traits for writing and recovery (?)

True, this looks good enough for now. If this surfaces a lot then we might think about adding a pure writer trait later; I think PersistentActor is unlikely to need to change for its intended purpose, so we should be good (considering source compatibility beyond 2.4.0). Regards, Roland
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Hi Martin and Alex, the point you raise is a good one. I did not include it in my first email because defining these common (and thereby de-facto required) queries is not a simple task: we should not include something that most journals will opt out of, and what we pick must be of general interest, because it presents a burden—at least morally—to every journal implementor.

My reasoning for keeping persistenceIds and arbitrary named streams separate is that persistenceIds must be supported in a fully linearizable fashion, whereas named streams do not necessarily have this requirement; on the write-side it might make sense to specify that a persistenceId is only ever written to from one actor at a time, which potentially simplifies the Journal implementation. This does not hold for named streams.

Concerning types, I assume there is an implicit expectation that they work like types in Java: when I persist a ShoppingCartCreated event I want to see it in the stream of all ShoppingCartEvents. This means that the Journal needs to understand subtype relationships, which in turn have to be lifted from the programming language used. It might be that this is possible, but at least it is not trivial. Is it reasonable to expect most Journal implementations to support this?

One query that we might want to include generically is the ability to ask for the merged streams of multiple persistenceIds—be that deterministic or not.

Another thought: there should probably be a trait JournalQuery from which the discussed case classes inherit, and we should specify that the Journal is obliged to reply to all such requests, explicitly denying those it does not implement.

Regards, Roland

28 aug 2014 kl. 11:12 skrev ahjohannessen ahjohannes...@gmail.com: Hi Martin, On Thursday, August 28, 2014 8:01:43 AM UTC+1, Martin Krasser wrote: In your summary, the only query command type pre-defined in akka-persistence is QueryByPersistenceId. I'd find it useful to further pre-define other query command types in akka-persistence to cover the most common use cases, such as:

- QueryByStreamDeterministic(name, from, to) (as a generalization of QueryKafkaTopic, and maybe also of QueryByPersistenceId)
- QueryByTypeDeterministic(type, from, to)
- QueryByStream(name, fromTime)
- QueryByType(type, fromTime)

Supporting these commands would still be optional, but it would give better guidance for plugin developers on which queries to support and, more importantly, make it easier for applications to switch from one plugin to another. Other, more specialized queries would still remain plugin-specific, such as QueryByProperty, QueryDynamic(queryString), etc.

I think it is a great idea to standardize on common cases such as those you line up, because it gives guidance and reduces one-off fragmentation among journal APIs. It is reasonable that akka-persistence sets some sort of standard with respect to general use cases of reading streams.

--
Read the docs: http://akka.io/docs/
Check the FAQ: http://doc.akka.io/docs/akka/current/additional/faq.html
Search the archives: https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to the Google Groups Akka User List group. To unsubscribe from this group and stop receiving emails from it, send an email to akka-user+unsubscr...@googlegroups.com. To post to this group, send email to akka-user@googlegroups.com. Visit this group at http://groups.google.com/group/akka-user. For more options, visit https://groups.google.com/d/optout.

Dr. Roland Kuhn, Akka Tech Lead, Typesafe – Reactive apps on the JVM. twitter: @rolandkuhn
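Roland's last point, a common JournalQuery trait where the Journal is obliged to reply to every request and explicitly deny those it does not implement, could be sketched roughly like this (QueryDenied and the handler are assumed shapes for illustration, not an actual Akka API):

```scala
// Sketch of the proposed contract: every query extends JournalQuery, and a
// journal answers each request either with a replay or an explicit denial.
// QueryDenied and handle are illustrative names, not part of any real API.
sealed trait JournalQuery
final case class QueryByPersistenceId(id: String, fromSeqNr: Long, toSeqNr: Long) extends JournalQuery
final case class QueryByStream(name: String, fromTime: Long) extends JournalQuery

final case class QueryDenied(query: JournalQuery, reason: String)

// A journal that only implements QueryByPersistenceId still replies to everything:
def handle(query: JournalQuery): Either[QueryDenied, String] = query match {
  case QueryByPersistenceId(id, from, to) => Right(s"replaying $id from $from to $to")
  case other => Left(QueryDenied(other, "not implemented by this journal"))
}
```

The point of the sealed trait plus mandatory reply is that a read-side actor can probe a journal at runtime and fall back gracefully instead of timing out on an unsupported query.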
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
On 28.08.14 12:00, Roland Kuhn wrote: My reasoning for keeping persistenceIds and arbitrary named streams separate is that persistenceIds must be supported in a fully linearizable fashion [...] This also does not hold for named streams.

I agree, makes sense to keep QueryByPersistenceId a separate type.

Concerning types I assume that there is an implicit expectation that they work like types in Java [...] Is it reasonable to expect most Journal implementations to support this?

I only included it in my proposal because it was discussed/requested very often, if I remember correctly. It may not be trivial to support, though. Fine for me if they're not pre-defined.

One query that we might want to include generically is the ability to ask for the merged streams of multiple persistenceIds—be that deterministic or not.

It can be pretty hard for some backend stores to support that in a scalable way (i.e. in a way that scales to a large number of persistenceIds), as you cannot pre-compute the results: the persistenceIds are query arguments. Having a QueryByStream(name, ...) in addition to QueryByPersistenceId is much less of a burden and serves a wide range of use cases.

Another thought: there should probably be a trait JournalQuery from which the discussed case classes inherit [...]

+1

28 aug 2014 kl. 11:12 skrev ahjohannessen ahjohannes...@gmail.com: [quoted message elided; it appears in full above]
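Martin's scalability concern can be made concrete with a small sketch: merging the replay streams of several persistenceIds into one deterministic order requires a total order over all events, and a naive implementation can only obtain it by materializing and sorting everything. Evt and the ordering key below are illustrative assumptions, not anything from the Journal SPI:

```scala
// Naive deterministic merge of per-persistenceId replay streams. The total
// order (timestamp, persistenceId, seqNr) is an assumption for illustration;
// the point is that the journal cannot pre-compute the result, because the
// set of persistenceIds only arrives as query arguments.
final case class Evt(persistenceId: String, seqNr: Long, timestamp: Long)

def mergeDeterministic(streams: Seq[Seq[Evt]]): Seq[Evt] =
  streams.flatten.sortBy(e => (e.timestamp, e.persistenceId, e.seqNr))
```

A pre-computed named stream (QueryByStream) avoids exactly this sort-at-query-time cost, which is why it is much less of a burden for the store.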
Re: [akka-user] Akka Persistence on the Query Side: The Conclusion
Hi Roland, On Thursday, August 28, 2014 11:00:17 AM UTC+1, rkuhn wrote: Concerning types I assume that there is an implicit expectation that they work like types in Java [...] Is it reasonable to expect most Journal implementations to support this?

Having a stream per event type is overkill, in my opinion. First and foremost, the *primary* need concerning types is to be able to group per persistent-actor type, not to differentiate individual events; the latter can be done afterwards with something like Martin's streamz.
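The grouping suggested here, one stream per persistent-actor type with individual event types filtered downstream, might look like this in plain Scala (the event names are invented examples, and a List stands in for a journal stream):

```scala
// One stream per persistent-actor type: every ShoppingCart event shows up in
// the ShoppingCartEvent stream; picking out a single event type is a plain
// downstream filter. All names here are invented for illustration.
sealed trait ShoppingCartEvent
final case class ShoppingCartCreated(cartId: String) extends ShoppingCartEvent
final case class ItemAdded(cartId: String, item: String) extends ShoppingCartEvent

val typeStream: List[ShoppingCartEvent] =
  List(ShoppingCartCreated("c1"), ItemAdded("c1", "book"), ShoppingCartCreated("c2"))

// Differentiating individual events after the fact:
val created = typeStream.collect { case e: ShoppingCartCreated => e }
```

With this division of labor the journal only needs to understand one coarse grouping, while subtype-level selection stays in application code and needs no knowledge of the language's type hierarchy.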
[akka-user] Akka Persistence on the Query Side: The Conclusion
Dear hakkers, there have been several very interesting, educational and productive threads in the past weeks (e.g. here https://groups.google.com/d/msg/akka-user/SL5vEVW7aTo/KfqAXAmzol0J and here https://groups.google.com/d/msg/akka-user/4kbYcwWS2OI/hpmAkxnB9D4J). We have taken some time to distill the essential problems as well as discuss the proposed solutions, and below is my attempt at a summary. In the very likely case that I missed something, by all means please raise your voice. The intention for this thread is to end with a set of github issues for making Akka Persistence as closely aligned with CQRS/ES principles as we can make it.

As Greg and others have confirmed, the write-side (PersistentActor) is already doing a very good job, so we do not see a need to change anything at this point. My earlier proposal of adding specific topics, as well as the discussed labels or tags, all feel a bit wrong since they benefit only the read-side and should therefore not be a concern/duty of the write-side.

On the read-side we came to the conclusion that PersistentView basically does nearly the right thing, but it focuses on the wrong aspect: it seems most suited to tracking a single PersistentActor with some slack, and not with back-pressure as a first-class citizen (it is possible to achieve it, albeit not trivial). What we distilled as the core functionality for a read-side actor is the following:

* it can ask for a certain set of events
* it consumes the resulting event stream on its own schedule
* it can be stateful and persistent on its own

This does not preclude populating e.g. a graph database or a SQL store directly from the journal back-end via Spark, but we do see the need to allow Akka Actors to be used to implement such a projection. Starting from the bottom up, allowing the read-side to be a PersistentActor in itself means that receiving Events should not require a mixin trait like PersistentView. The next bullet point means that the Event stream must be properly back-pressured, and we have a technology under development that is predestined for such an endeavor: Akka Streams.
So the proposal is that any Actor can obtain the ActorRef for a given Journal and send it a request for the event stream it wants; in response it will get a message containing a stream (i.e. Flow) of events and some meta-information to go with it. The question that remains at this point is what exactly it means to “ask for a certain set of events”.

In order to keep the number of abstractions minimal, the first use-case for this feature is the recovery of a PersistentActor. Each Journal will probably support different kinds of queries, but for this use-case it must respond to

    case class QueryByPersistenceId(id: String, fromSeqNr: Long, toSeqNr: Long)

with something like

    case class EventStreamOffer(metadata: Metadata, stream: Flow[PersistentMsg])

The metadata allows the recipient to correlate this offer with the corresponding request, and it contains other information as we will see in the following.

Another way to ask for events was discussed as Topics or Labels or Tags in the previous threads; the idea was that the generated stream of all events is enriched with qualifiers that allow the Journal to construct a materialized view (e.g. a separate queue that copies all events of a given type). This view then has a name that is requested from the read-side in order to e.g. have an Actor that keeps track of certain aspects of all persistent ShoppingCarts in a retail application. As I said above, we think that this concern should be handled outside of the write-side because logically it does not belong there. Its closest cousin is the construction of an additional index or view within a SQL store, maintained by the RDBMS upon request from the DBA, but available to and relied upon by the read-side. We propose that this is also how it should work with Akka Persistence: the Journal is free to allow the configuration of materialized views that can be requested as event streams by name.
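A toy model of the request/offer exchange described above, with an Iterator standing in for the back-pressured Flow and an assumed Metadata shape (a correlation id plus a determinism flag), might look like:

```scala
// Toy model of recovery via the proposed query protocol. Metadata's fields and
// the use of Iterator instead of an Akka Streams Flow are assumptions made for
// this sketch; only QueryByPersistenceId/EventStreamOffer come from the thread.
final case class QueryByPersistenceId(id: String, fromSeqNr: Long, toSeqNr: Long)
final case class Metadata(correlationId: String, deterministic: Boolean)
final case class EventStreamOffer[A](metadata: Metadata, stream: Iterator[A])

// The "journal" replies with a correlated, deterministic event stream offer:
def replay(q: QueryByPersistenceId, journal: Map[String, Vector[String]]): EventStreamOffer[String] = {
  val events = journal.getOrElse(q.id, Vector.empty)
    .slice((q.fromSeqNr - 1).toInt, q.toSeqNr.toInt) // sequence numbers start at 1
  EventStreamOffer(Metadata(correlationId = q.id, deterministic = true), events.iterator)
}
```

Because replay for a single persistenceId is linearizable, the offer can mark itself deterministic; a named-stream query against a scalable store would set that flag to false, as discussed below.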
The extraction of the indexing characteristics is performed by the Journal or its backing store, outside the scope of the Journal SPI; one example of doing it this way has already been implemented by Martin. We propose to access the auxiliary streams by something like

    case class QueryKafkaTopic(name: String, fromSeqNr: Long, toSeqNr: Long)

Sequence numbers are necessary for deterministic replay/consumption. We had long discussions about the scalability implications, which is the reason why we propose to leave such queries proprietary to the Journal backend. Assuming a perfectly scalable (but then of course not real-time linearizable) Journal, the query might allow only

    case class QuerySuperscalableTopic(name: String, fromTime: DateTime)

This will try to give you all events that were recorded after the given moment, but replay will not be deterministic and there will not be unique sequence numbers. These properties will be reflected in the Metadata that comes with the EventStreamOffer. The last way to ask for events is