On Tuesday, August 26, 2014, Martin Krasser
<krass...@googlemail.com> wrote:
On 26.08.14 16:44, Andrzej Dębski wrote:
My mind must have filtered out the possibility of making
snapshots using Views - thanks.
About partitions: I suspected as much. The only thing I am
wondering now is whether it is possible to create partitions
dynamically in Kafka. AFAIK the number of partitions is set
during topic creation (either programmatically via the API or
with the CLI tools), and there is a CLI tool you can use to
modify an existing topic:
https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool.
To keep the invariant "a PersistentActor is the only writer
to a partitioned journal topic" you would have to create
those partitions dynamically (usually you don't know up
front how many PersistentActors your system will have) on a
per-PersistentActor basis.
You're right. If you want to keep all data in Kafka without
ever deleting them, you'd need to add partitions dynamically
(which is currently possible with APIs that back the CLI). On
the other hand, using Kafka this way is the wrong approach
IMO. If you really need to keep the full event history, keep
old events on HDFS or wherever and only the more recent ones
in Kafka (where a full replay must first read from HDFS and
then from Kafka) or use a journal plugin that is explicitly
designed for long-term event storage.
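As a rough illustration of the replay order described above (archive first, then Kafka), here is a sketch in Python. The two lists merely stand in for an HDFS archive and a Kafka topic; `full_replay` and the `(sequence_nr, payload)` tuple layout are assumptions made for the example, not part of any plugin API. It also assumes that the archive and the retained log may overlap, so it skips events that were already delivered.

```python
from typing import Iterable, Iterator, Tuple

Event = Tuple[int, str]  # (sequence_nr, payload), hypothetical layout


def full_replay(archived: Iterable[Event], recent: Iterable[Event]) -> Iterator[Event]:
    """Yield archived events first, then recent ones, without duplicates."""
    last_seq = 0
    for source in (archived, recent):
        for seq, payload in source:
            # Skip anything at or below the last delivered sequence number.
            if seq > last_seq:
                last_seq = seq
                yield seq, payload


# Stand-ins for the long-term archive and the retained Kafka log.
archive = [(1, "a"), (2, "b"), (3, "c")]
kafka_log = [(3, "c"), (4, "d"), (5, "e")]
events = list(full_replay(archive, kafka_log))
```

The duplicate check relies on sequence numbers being strictly increasing per persistent actor, which holds as long as that actor is the only writer.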
The main reason why I developed the Kafka plugin was to
integrate my Akka applications in unified log processing
architectures as described in Jay Kreps' excellent article
<http://engineering.linkedin.com/distributed-systems/log-what-every-software-engineer-should-know-about-real-time-datas-unifying>.
Also mentioned in this article is a snapshotting strategy
that fits typical retention times in Kafka.
On the other hand maybe you are assuming that each actor is
writing to a different topic
yes, and the Kafka plugin is currently implemented that way.
- but I think this solution is not viable because
the number of topics is limited by ZK and other factors:
http://grokbase.com/t/kafka/users/133v60ng6v/limit-on-number-of-kafka-topic.
A more in-depth discussion about these limitations is given
at
http://www.quora.com/How-many-topics-can-be-created-in-Apache-Kafka
with a detailed comment from Jay. I'd say that if you
designed your application to run more than a few hundred
persistent actors, then the Kafka plugin is probably the
wrong choice. I tend to design my applications to have only a
small number of persistent actors (which is in contrast to
many other discussions on akka-user) which makes the Kafka
plugin a good candidate.
To recap, the Kafka plugin is a reasonable choice if
- frequent snapshotting is done by persistent actors (every
day or so)
- you don't have more than a few hundred persistent actors and
- your application is a component of a unified log processing
architecture (backed by Kafka)
The most interesting next Kafka plugin feature for me to
develop is an HDFS integration for long-term event storage
(and full event history replay). WDYT?
On Tuesday, 26 August 2014 at 15:28:47 UTC+2,
Martin Krasser wrote:
Hi Andrzej,
On 26.08.14 09:15, Andrzej Dębski wrote:
Hello
Lately I have been reading about the possibility of using
Apache Kafka as a journal/snapshot store for
akka-persistence.
I am aware of the plugin created by Martin Krasser:
https://github.com/krasserm/akka-persistence-kafka/ and
I have also read another thread about Kafka as a journal:
https://groups.google.com/forum/#!searchin/akka-user/kakfka/akka-user/iIHmvC6bVrI/zeZJtW0_6FwJ.
In both sources I linked, two ideas were presented:
1. Set log retention to 7 days, take snapshots every 3
days (example values)
2. Set log retention to unlimited.
Here is the first question: in the first case, wouldn't it
mean that persistent views would receive a skewed view of
the PersistentActor state (only events from the last 7 days)?
Is it really a viable solution? As far as I know, a
PersistentView can only receive events; it can't
receive snapshots from the corresponding PersistentActor
(which is good in the general case).
PersistentViews can create their own snapshots which are
isolated from the corresponding PersistentActor's snapshots.
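To illustrate why this works with limited retention, here is a small Python sketch. It is a toy model, not Akka code: `recover_view`, the running-total state, and the event layout are assumptions made for the example. The view starts from its own latest snapshot and applies only events with higher sequence numbers, so events dropped by retention before that snapshot are not needed.

```python
from typing import List, Optional, Tuple

Event = Tuple[int, int]     # (sequence_nr, amount), hypothetical layout
Snapshot = Tuple[int, int]  # (sequence_nr, state at that sequence_nr)


def recover_view(snapshot: Optional[Snapshot], retained: List[Event]) -> int:
    """Rebuild a running total from the view's snapshot plus newer events."""
    seq, state = snapshot if snapshot is not None else (0, 0)
    for event_seq, amount in retained:
        # Events at or below the snapshot's sequence number are already
        # reflected in the snapshot state.
        if event_seq > seq:
            state += amount
    return state


# Events 1..4 have already been removed by log retention.
retained_events = [(5, 10), (6, 20), (7, 30)]
view_snapshot = (5, 100)  # taken by the view itself after event 5

with_snapshot = recover_view(view_snapshot, retained_events)
without_snapshot = recover_view(None, retained_events)
```

In this toy model `with_snapshot` reflects the full history, while `without_snapshot` only covers the retained events, which is the skewed state the question is about.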
Second question (more directed to Martin): in the
thread I linked you wrote:
I don't go into Kafka partitioning details here
but it is possible to implement the journal driver
in a way that both a single persistent actor's data
are partitioned *and* kept in order
I am very interested in this idea. AFAIK it is not yet
implemented in the current plugin, but I was wondering if
you could share the high-level idea of how you would achieve
that (one persistent actor, multiple partitions,
ordering ensured).
The idea is to
- first write events 1 to n to partition 1
- then write events n+1 to 2n to partition 2
- then write events 2n+1 to 3n to partition 3
- ... and so on
This works because a PersistentActor is the only writer
to a partitioned journal topic. During replay, you first
replay partition 1, then partition 2 and so on. This
should be rather easy to implement in the Kafka journal,
I just haven't had time so far; pull requests are welcome
:) Btw, the Cassandra journal
<https://github.com/krasserm/akka-persistence-cassandra>
follows the very same strategy for scaling with data
volume (by using different partition keys).
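A minimal sketch of the scheme above, written in Python purely for illustration (this is not code from the Kafka or Cassandra plugin). The names `partition_for` and `replay`, the 0-based partition index, and the segment size `n = 3` are assumptions made for the example.

```python
from typing import Dict, Iterator, List


def partition_for(sequence_nr: int, n: int) -> int:
    """Map a 1-based sequence number to a 0-based partition index.

    Events 1..n go to partition 0, n+1..2n to partition 1, and so on.
    """
    if sequence_nr < 1 or n < 1:
        raise ValueError("sequence_nr and n must be positive")
    return (sequence_nr - 1) // n


def replay(partitions: Dict[int, List[str]]) -> Iterator[str]:
    """Replay partitions in index order, each one from start to end."""
    for index in sorted(partitions):
        yield from partitions[index]


# Hypothetical journal for one persistent actor with n = 3.
journal: Dict[int, List[str]] = {}
for seq in range(1, 8):
    journal.setdefault(partition_for(seq, 3), []).append(f"event-{seq}")

replayed = list(replay(journal))
```

Because a single writer fills one partition completely before moving on to the next, reading the partitions in index order yields the events in their original order.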
Cheers,
Martin
--
>>>>>>>>>> Read the docs: http://akka.io/docs/
>>>>>>>>>> Check the FAQ:
http://doc.akka.io/docs/akka/current/additional/faq.html
>>>>>>>>>> Search the archives:
https://groups.google.com/group/akka-user
---
You received this message because you are subscribed to
the Google Groups "Akka User List" group.
To unsubscribe from this group and stop receiving
emails from it, send an email to
akka-user+...@googlegroups.com.
To post to this group, send email to
akka...@googlegroups.com.
Visit this group at
http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.
--
Martin Krasser
blog:http://krasserm.blogspot.com
code:http://github.com/krasserm
twitter:http://twitter.com/mrt1nz
--
Studying for the Turing test