The EventStore is queryable in flexible ways and so needs to have random access and indexing of events. In short it needs to be backed by a DB.
There are ML algorithms that do not need stored events, like online learners in the Kappa style. But existing Templates and the PIO architecture do not support Kappa yet. So unless you are developing your own Template, you will find that Kafka will not fill all the requirements. It is, however, an excellent way to manage streams, so I tend to see it as a source for events and have seen it used as such.

To be more precise about why your method may not work: not all events can be aged out. The item property events in PIO ($set, $unset, and $delete) are embedded in the stream but are only looked at in aggregate, since they record a set of changes to an object. If you drop one of these events, the current state of the object cannot be calculated with certainty. To truly use a streaming model we have to introduce the idea of watermarks that snapshot state in the stream so that any event can be dropped. This is how Kappa learners work, but for existing Lambda learners it is often not possible. Many of the Lambda learners can be converted, but not easily and not all.

On Jul 6, 2017, at 1:59 AM, Thomas POCREAU <[email protected]> wrote:

After some internal discussions, I realized I had misunderstood our needs. In fact, we will use the Kafka HDFS connector <http://docs.confluent.io/current/connect/connect-hdfs/docs/index.html> to dump expired data. So we will basically have Kafka for the fresh events and HDFS for the past events.

2017-07-06 7:06 GMT+02:00 Thomas POCREAU <[email protected]>:

Hi,

Thanks for your responses. Our goal is to use Kafka as our main event store for event sourcing. I'm pretty sure that Kafka can be used with an infinite retention time. We could use KStream and the Java SDK, but I would like to try implementing PStore on top of spark-streaming-kafka (https://spark.apache.org/docs/latest/streaming-kafka-0-10-integration.html).
My main concern is related to the HTTP interface for pushing events. We will probably use KStream or WebSockets to load events into the Kafka topic used by an app channel. Are you planning on supporting WebSockets as an alternative to batch import?

Regards,
Thomas Pocreau

On Jul 5, 2017, at 9:36 PM, Pat Ferrel <[email protected]> wrote:

No, we try not to fork :-) But it would be nice, as you say. It can be done with a small intermediary app that just reads from a Kafka topic and sends events to a localhost EventServer, which would allow events to be custom extracted from, say, log files (typical contents of Kafka). We’ve done this in non-PIO projects. The intermediary app should use Spark streaming. I may have a snippet of code around if you need it, but it just saves to micro-batch files. You’d have to use the PIO Java SDK to send them to the EventServer. A relatively simple thing.

Donald, what did you have in mind for deeper integration? I guess we could cut out the intermediate app and integrate into a new Kafka-aware EventServer endpoint where the raw topic input is stored in the EventStore. This would force any log filtering onto the Kafka source.

On Jul 5, 2017, at 10:20 AM, Donald Szeto <[email protected]> wrote:

Hi Thomas,

Supporting Kafka is definitely interesting and desirable. Are you looking to sink your Kafka messages to the event store for batch processing, or to stream process directly from Kafka? The latter would require more work because Apache PIO does not yet support streaming properly. Folks from ActionML might have a flavor of PIO that works with Kafka.

Regards,
Donald

On Tue, Jul 4, 2017 at 8:34 AM, Thomas POCREAU <[email protected]> wrote:

Hi,

Thanks a lot for this awesome project. I have a question regarding Kafka and its possible integration as an Event Store. Do you have any plans on this matter? Are you aware of anyone working on a similar topic?

Regards,
Thomas.
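
Illustrative note on the $set / $unset / $delete point made in the top reply: the sketch below shows why property events cannot simply be aged out of a stream, and how a state snapshot taken at a cut point makes the older events safe to drop. It is a minimal sketch, not PIO code. The function name `fold_properties` is hypothetical, the event dictionaries only mimic the field names of PIO event JSON (`event`, `entityId`, `properties`, `eventTime`), integer timestamps stand in for real ISO 8601 times, and the "snapshot" is one plausible reading of the watermark idea rather than an existing PIO feature.

```python
from typing import Any, Dict, Iterable, Optional

Event = Dict[str, Any]
State = Optional[Dict[str, Any]]


def fold_properties(events: Iterable[Event], initial: State = None) -> State:
    """Replay property events in time order and return the entity's state.

    Hypothetical helper: $set merges properties, $unset removes the listed
    keys, $delete removes the entity. None means "entity does not exist".
    """
    state: State = dict(initial) if initial is not None else None
    for e in sorted(events, key=lambda ev: ev["eventTime"]):
        props = e.get("properties", {})
        if e["event"] == "$set":
            state = {**(state or {}), **props}
        elif e["event"] == "$unset":
            if state is not None:
                for key in props:
                    state.pop(key, None)
        elif e["event"] == "$delete":
            state = None
    return state


# Made-up property events for a single item (assumption: integer eventTime).
events = [
    {"event": "$set", "entityId": "i1", "eventTime": 1,
     "properties": {"category": "books", "price": 10}},
    {"event": "$set", "entityId": "i1", "eventTime": 2,
     "properties": {"price": 12, "color": "red"}},
    {"event": "$unset", "entityId": "i1", "eventTime": 3,
     "properties": {"color": None}},
]

# State computed from the complete history.
full = fold_properties(events)

# Retention drops the oldest event: "category" is silently lost.
truncated = fold_properties(events[1:])

# Snapshot the state at the cut point, then replay only the newer events.
snapshot = fold_properties(events[:1])
resumed = fold_properties(events[1:], initial=snapshot)
```

Here `full` and `resumed` agree, while `truncated` is missing the `category` property. That is the uncertainty described above when an event is dropped without a snapshot to carry the earlier state forward.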
