[akka-user] Re: Akka FSM unhandled event PersistentImpl(PrizeCreated(1,T119)) in state Oldest, where does the state come from?

2014-08-26 Thread Dennis Vriend
Hi,

 FSM can persist state by adding the Processor trait though, as in the 
example above.

Grtz,

Dennis

On Tuesday, 26 August 2014 21:30:17 UTC+2, Jeroen Gordijn wrote:
>
> Hi Dennis,
>
> FSM does not work with PersistentActor at this moment: 
> https://github.com/akka/akka/issues/15279
>
> Cheers,
> Jeroen
>
> On Monday, 25 August 2014 22:17:16 UTC+2, Dennis Vriend wrote:
>>
>> Hi,
>>
>>  I get the following warning:
>>
>>
>>



Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Martin Krasser


On 26.08.14 20:24, Andrzej Dębski wrote:


You're right. If you want to keep all data in Kafka without ever
deleting them, you'd need to add partitions dynamically (which is
currently possible with APIs that back the CLI). On the other
hand, using Kafka this way is the wrong approach IMO. If you
really need to keep the full event history, keep old events on
HDFS or wherever and only the more recent ones in Kafka (where a
full replay must first read from HDFS and then from Kafka) or use
a journal plugin that is explicitly designed for long-term event
storage.


That is what was worrying me about using Kafka in a situation 
where I would want to keep the events forever (or at least for an 
unknown amount of time). The thing that seemed nice is that I would 
have a journal/event store and a pub-sub solution implemented in one 
technology - basically I want to work around the current limitation of 
PersistentView. I wanted to use a Kafka topic and replay all events from 
the topic to dynamically added read models in my cluster. Maybe in 
this situation I should stick to distributed publish-subscribe in the 
cluster for current event-sending and Cassandra as the journal/snapshot 
store. I have not read that much about Cassandra and the way it stores 
data, so I do not know if reading all events would be easy.


That's a single table in Cassandra (some details about ordering here). 
One could derive further tables with a user-defined 
ordering/filtering/... from which multiple readers/subscribers could 
consume and derive read models. These derived tables are comparable to 
user-defined topics in the Kafka journal. Whether they are populated by 
the plugin during write transactions or later, by running separate 
transformation processes, is an implementation detail. The Kafka journal 
does the former; the latter gives more flexibility regarding new read 
model requirements (as no upfront knowledge is required of what to write 
to user-defined tables/topics).




The main reason why I developed the Kafka plugin was to integrate
my Akka applications in unified log processing architectures as
described in Jay Kreps' excellent article.
Also mentioned in this article is a snapshotting strategy that
fits typical retention times in Kafka.


Thanks for the link.

The most interesting next Kafka plugin feature for me to develop
is an HDFS integration for long-term event storage (and full event
history replay). WDYT?


That would be an interesting feature - it would certainly make the Akka + 
Kafka combination viable for more use cases.


On Tuesday, 26 August 2014 at 19:44:05 UTC+2, Martin Krasser wrote:



On 26.08.14 16:44, Andrzej Dębski wrote:

My mind must have filtered out the possibility of making
snapshots using Views - thanks.

About partitions: I suspected as much. The only thing that I am
wondering now is: if it is possible to dynamically create
partitions in Kafka? AFAIK the number of partitions is set during
topic creation (be it programmatically using API or CLI tools)
and there is CLI tool you can use to modify existing topic:

https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool.
To keep the invariant  " PersistentActor is the only writer to a
partitioned journal topic" you would have to create those
partitions dynamically (usually you don't know up front how many
PersistentActors your system will have) on per-PersistentActor basis.


You're right. If you want to keep all data in Kafka without ever
deleting them, you'd need to add partitions dynamically (which is
currently possible with APIs that back the CLI). On the other
hand, using Kafka this way is the wrong approach IMO. If you
really need to keep the full event history, keep old events on
HDFS or wherever and only the more recent ones in Kafka (where a
full replay must first read from HDFS and then from Kafka) or use
a journal plugin that is explicitly designed for long-term event
storage.

The main reason why I developed the Kafka plugin was to integrate
my Akka applications in unified log processing architectures as
described in Jay Kreps' excellent article.
Also mentioned in this article is a snapshotting strategy that
fits typical retention times in Kafka.



On the other hand maybe you are assuming that each actor is
writing to different topic


yes, and the Kafka plugin is currently implemented that way.

Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Martin Krasser
Whether to go for a 1:1 approach or a 1:n approach (or a partitioned m:n 
approach where m << n) really depends on the concrete use case and 
non-functional requirements. Your example might be a good candidate for 
a 1:1 approach (see also further comments inline) but there are also 
examples for which a 1:n or m:n approach is a better choice. Here are 
some general influencing factors:


- length of event history required to recover state: bank accounts need 
the full event history to be recovered but order management is an 
example where this is often not the case. Orders (trade orders in 
finance, lab orders during medical treatments, ...) usually have a 
limited validity so that you can recover active orders from a limited 
event history (last 10 days, for example) which should make migrations 
after code changes rather painless. BTW, having only a single persistent 
actor (or a few) that maintains state is comparable to the role of a 
"Business Logic Processor" in the LMAX architecture which originated 
from the high frequency trading domain.


- latency requirements: creating a new persistent actor has some 
overhead, not only memory but also bootstrap as its creation requires a 
roundtrip to the backend store. Re-activation of passivated actors that 
have been designed around a 1:1 approach may also be in conflict with 
low latency requirements. Good compromises can often be found by 
following an m:n approach in this case.


- write throughput: high write throughput can only be achieved by 
batching writes and batching is currently implemented on a per 
persistent actor basis. Throughput therefore scales better when having a 
small(er) number of actors. A large number of actors will create more 
but smaller batches, reducing throughput. This is however more a 
limitation of the current implementation of akka-persistence. Maybe a 
switch to batching on journal level is a good idea, so that a single 
write batch can contain events from several actors.


- ...

Even if you need to replay a long event history (for example after a 
code change), you can always do that in the background on a separate 
node until the new version of the persistent actor has caught up, and switch 
the application to it when done. You could even have both versions 
running at the same time for A/B testing for example. With a replay rate 
of 100k/sec you can replay a billion events within a few hours.
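
As a concrete illustration of the m:n idea, here is a minimal, hypothetical 
sketch (the AccountShard/AccountRouter names, the Deposit/Deposited messages 
and the modulo routing are assumptions made up for this example, not an API 
from any plugin): a fixed pool of persistent actors, each journaling events 
for many accounts, with commands routed by hashing the account id.

import akka.actor.{Actor, ActorRef, ActorSystem, Props}
import akka.persistence.PersistentActor

case class Deposit(accountId: String, amount: BigDecimal)
case class Deposited(accountId: String, amount: BigDecimal)

// One persistent actor per shard; each shard owns the balances of many accounts.
class AccountShard(shardId: Int) extends PersistentActor {
  override def persistenceId = s"account-shard-$shardId"

  private var balances = Map.empty[String, BigDecimal].withDefaultValue(BigDecimal(0))

  def receiveCommand = {
    case Deposit(id, amount) =>
      persist(Deposited(id, amount)) { evt =>
        balances += evt.accountId -> (balances(evt.accountId) + evt.amount)
      }
  }

  def receiveRecover = {
    case Deposited(id, amount) => balances += id -> (balances(id) + amount)
  }
}

// m shards serve n accounts (m << n); commands are routed by hashing the account id.
class AccountRouter(shards: Vector[ActorRef]) extends Actor {
  def receive = {
    case cmd @ Deposit(accountId, _) =>
      shards(math.abs(accountId.hashCode) % shards.size) forward cmd
  }
}

object BankingApp extends App {
  val system = ActorSystem("banking")
  val shards = Vector.tabulate(8)(i => system.actorOf(Props(new AccountShard(i)), s"shard-$i"))
  val router = system.actorOf(Props(new AccountRouter(shards)), "router")
  router ! Deposit("ACC-42", BigDecimal(100))
}

Routing by a stable hash keeps all events of one account in a single shard's 
journal stream, so per-account ordering is preserved while the number of 
persistent actors (and, with the Kafka plugin, topics) stays small.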


Further comments inline ...

On 26.08.14 20:34, Greg Young wrote:
OK for bank accounts there is some amount of state needed to verify a 
transaction. Let's propose that for now it's the branch you opened your 
account at, your current balance, your address and a risk 
classification as well as a customer profitability/loyalty score 
(these are all reasonable things to track in terms of deciding if a 
transaction should be accepted or not).


When validating commands, you only need to keep that part of application 
state within persistent actors for which you have strict consistency 
requirements. In the context of bank accounts, this is for sure the case for 
the balance, but not necessarily for customer profitability, loyalty 
score or whatever. These metrics may be calculated in the background, 
hence, having eventual read consistency for them should be sufficient. 
Consequently this state can be maintained elsewhere (as part of a 
separate read model) and requested from persistent actors during 
transaction validation. If you need further metrics in the future, new 
read models can be added and included into the validation workflow 
initiated by a persistent actor.
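
To illustrate that split, a hedged sketch (the Withdraw/LoyaltyQuery messages 
and the read-model actor are assumptions made up for this example): the 
persistent actor keeps only the strictly consistent balance, and asks a 
separately maintained read model for the eventually consistent loyalty score 
while validating a command.

import akka.actor.ActorRef
import akka.pattern.{ask, pipe}
import akka.persistence.PersistentActor
import akka.util.Timeout
import scala.concurrent.duration._

case class Withdraw(accountId: String, amount: BigDecimal)
case class Withdrawn(accountId: String, amount: BigDecimal)
case object Rejected
case class LoyaltyQuery(accountId: String)
case class LoyaltyScore(accountId: String, value: Int)
// Internal message: the command plus the read-model data gathered for validation.
case class Validate(cmd: Withdraw, score: LoyaltyScore, replyTo: ActorRef)

class AccountActor(loyaltyReadModel: ActorRef) extends PersistentActor {
  import context.dispatcher
  implicit val timeout = Timeout(3.seconds)

  override def persistenceId = "accounts"

  // Only the balance needs strict consistency, so only it lives in this actor.
  private var balances = Map.empty[String, BigDecimal].withDefaultValue(BigDecimal(0))

  def receiveCommand = {
    case cmd: Withdraw =>
      val replyTo = sender()
      // Loyalty/profitability is eventually consistent and kept in a separate read model.
      (loyaltyReadModel ? LoyaltyQuery(cmd.accountId))
        .mapTo[LoyaltyScore]
        .map(score => Validate(cmd, score, replyTo))
        .pipeTo(self)

    case Validate(cmd, score, replyTo) =>
      // The balance check happens here, inside the actor, so it stays strictly consistent.
      if (balances(cmd.accountId) >= cmd.amount && score.value >= 0)
        persist(Withdrawn(cmd.accountId, cmd.amount)) { evt =>
          balances += evt.accountId -> (balances(evt.accountId) - evt.amount)
          replyTo ! evt
        }
      else replyTo ! Rejected
  }

  def receiveRecover = {
    case Withdrawn(id, amount) => balances += id -> (balances(id) - amount)
  }
}

New validation inputs can then be added by introducing new read models and 
extending the Validate step, without growing the state that has to be 
snapshotted and migrated inside the persistent actor.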




I could keep millions of these inside of a single actor.

A few problems come up though:

Replaying this actor from events is very painful (millions possibly 
hundreds of millions of events and they must be processed serially) 
solution->snapshots?
Snapshots have all the same versioning issues people are used to with 
keeping state around. What happens when the state I am keeping changes 
say now I also need to keep avg+stddev of transaction amount or we 
found a bug in how we were maintaining the loyalty score (back to #1) 
this will invalidate my snapshot


See above, there's no need to keep all of that inside the persistent 
actor for strict read consistency. Allowing eventual consistency during 
command validation where possible not only makes the validation process 
more flexible (by just including new read models if required) but also 
reduces snapshot migration efforts (by simplifying the state structure 
inside persistent actors).


Furthermore, ensuring strict consistency for persistent actor state 
requires usage of persist() instead of persistAsync() which reduces 
throughput at least by a factor of 10. That may again be in conflict 
with write throughput requirements.
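
For readers who haven't used both, a small sketch of the difference (actor 
and message names assumed): persist() stashes new commands until the event 
handler has run, which preserves strict per-command consistency at the cost 
of throughput, while persistAsync() keeps accepting commands while journal 
writes are in flight.

import akka.persistence.PersistentActor

case class Cmd(payload: String)
case class Evt(payload: String)

class StrictWriter extends PersistentActor {
  override def persistenceId = "strict-writer"
  var state = List.empty[String]

  def receiveCommand = {
    case Cmd(p) =>
      // persist(): new commands are stashed until this handler has run,
      // so state is always updated in command order (lower throughput).
      persist(Evt(p)) { evt => state = evt.payload :: state }
  }
  def receiveRecover = { case Evt(p) => state = p :: state }
}

class HighThroughputWriter extends PersistentActor {
  override def persistenceId = "fast-writer"
  var state = List.empty[String]

  def receiveCommand = {
    case Cmd(p) =>
      // persistAsync(): the actor keeps handling commands while journal writes
      // are in flight; handlers still run in order, but commands arriving in
      // between may observe not-yet-updated state.
      persistAsync(Evt(p)) { evt => state = evt.payload :: state }
  }
  def receiveRecover = { case Evt(p) => state = p :: state }
}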


To conclude, I think there are use cases where a 1:1 approach makes 
sense but this shouldn't be a general recommendation IMO. It really 
depends on the specific functional and non-functional requirements.

[akka-user] Re: Akka FSM unhandled event PersistentImpl(PrizeCreated(1,T119)) in state Oldest, where does the state come from?

2014-08-26 Thread Jeroen Gordijn
Hi Dennis,

FSM does not work with PersistentActor at this 
moment: https://github.com/akka/akka/issues/15279

Cheers,
Jeroen

On Monday, 25 August 2014 22:17:16 UTC+2, Dennis Vriend wrote:
>
> Hi,
>
>  I get the following warning:
>
>
>



[akka-user] Re: Setting SO_TIMEOUT in Akka I/O

2014-08-26 Thread Mario Camou
Hi again,

Sorry to reopen this thread after so long, but I'm in another project with 
a similar problem.

One thing I'm noticing is that, even after the OS closes the connection 
(i.e., it doesn't appear in netstat -an) my handler isn't getting any of 
the Tcp.ConnectionClosed messages. Is that normal? It looks bizarre to me.

What I'm doing is starting a client that makes a connection, then doing a 
kill -9 to simulate the client crashing (or the client machine turning 
off). The socket stays there for a while (again checking via netstat -an) 
but eventually the OS closes it. Akka is not sending any message to 
indicate that the socket has been closed.

Any ideas short of doing what Roland suggested earlier? (i.e., modifying 
the protocol to periodically send some data).
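
One workaround that does not require changing the wire protocol, sketched 
here as an assumption rather than a confirmed fix for this case: have the 
connection-handler actor set a ReceiveTimeout and abort the connection when 
nothing has arrived for a while (the blocking-style SO_TIMEOUT has no direct 
equivalent in the NIO-based Akka I/O layer, so this only approximates it at 
the actor level).

import akka.actor.{Actor, ActorRef, ReceiveTimeout}
import akka.io.Tcp
import scala.concurrent.duration._

// Handler registered for one accepted connection (via Tcp.Register(handler)).
class ConnectionHandler(connection: ActorRef) extends Actor {
  // Short idle timeout until the first message arrives...
  context.setReceiveTimeout(5.seconds)

  def receive = {
    case Tcp.Received(data) =>
      // ...then relax it once the peer has proven to be alive.
      context.setReceiveTimeout(60.seconds)
      // handle data ...

    case ReceiveTimeout =>
      // No traffic for the configured interval: assume the peer is gone.
      connection ! Tcp.Abort
      context.stop(self)

    case _: Tcp.ConnectionClosed =>
      context.stop(self)
  }
}

Protocol-level heartbeats, as suggested earlier in the thread, remain the more 
reliable way to detect a silently dead peer.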

Thanks,
-Mario.

On Friday, December 27, 2013 9:26:46 PM UTC+1, Mario Camou wrote:
>
> Hi, 
>
> I’m using Akka I/O and am trying to set the SO_TIMEOUT socket value. 
>
> Looking at the docs, I see that the Tcp.Bind message receives a 
> Traversable[SocketOption]. However, looking at the docs for 
> akka.io.Inet.SocketOption, the only subclasses are Broadcast, KeepAlive, 
> OOBInline, ReceiveBufferSize, ReuseAddress, SendBufferSize, TcpNoDelay and 
> TrafficClass. The superclass for the only SoTimeout class I see 
> (akka.actor.IO.SoTimeout) is akka.actor.IO.SocketOption which I assume is 
> part of the old (Iteratee-based) Akka I/O. 
>
> So, how to set the SO_TIMEOUT using the new Akka I/O? Also, how to reset 
> the SO_TIMEOUT once the socket has been opened? 
>
> In my use case I need to set a relatively short SO_TIMEOUT value when the 
> socket is first connected, and bump it up a bit after receiving the first 
> actual message. 
>
> Thanks, 
> -Mario 
> — 
> I want to change the world, but they won’t give me the source code 
>
>



Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Greg Young
OK for bank accounts there is some amount of state needed to verify a
transaction. Let's propose that for now it's the branch you opened your
account at, your current balance, your address and a risk classification as
well as a customer profitability/loyalty score (these are all reasonable
things to track in terms of deciding if a transaction should be accepted or
not).

I could keep millions of these inside of a single actor.

A few problems come up though:

Replaying this actor from events is very painful (millions, possibly
hundreds of millions, of events, and they must be processed serially) -
solution -> snapshots?
Snapshots have all the same versioning issues people are used to with
keeping state around. What happens when the state I am keeping changes? Say
now I also need to keep avg+stddev of transaction amount, or we found a bug
in how we were maintaining the loyalty score (back to #1) - this will
invalidate my snapshot (requiring a full replay, or else you run into
another whole series of hokey problems trying to do "from here forward"
type things; imagine a new feature that relies on a 6 month moving average).






On Tue, Aug 26, 2014 at 2:15 PM, Martin Krasser 
wrote:

>
> On 26.08.14 20:12, Greg Young wrote:
>
> In particular I am interested in the associated state thats needed, I can
> see keeping it in a single actor but this does not turn out well at all for
> most production systems in particular as changes happen over time.
>
>
> I don't get your point. Please elaborate.
>
>
>
> On Tue, Aug 26, 2014 at 2:08 PM, Martin Krasser 
> wrote:
>
>>  See my eventsourced example(s), that I published 1-2 years ago, others
>> are closed source
>>
>>
>> On 26.08.14 20:06, Greg Young wrote:
>>
>> Love to see an example
>>
>> On Tuesday, August 26, 2014, Martin Krasser 
>> wrote:
>>
>>>
>>> On 26.08.14 19:56, Greg Young wrote:
>>>
>>> I'm curious how you would model say bank accounts with only a few
>>> hundred actors can you go into a bit of detail
>>>
>>>
>>> persistent-actor : bank-account = 1:n (instead of 1:1)
>>>
>>>
>>> On Tuesday, August 26, 2014, Martin Krasser 
>>> wrote:
>>>

 On 26.08.14 16:44, Andrzej Dębski wrote:

 My mind must have filtered out the possibility of making snapshots
 using Views - thanks.

  About partitions: I suspected as much. The only thing that I am
 wondering now is: if it is possible to dynamically create partitions in
 Kafka? AFAIK the number of partitions is set during topic creation (be it
 programmatically using API or CLI tools) and there is CLI tool you can use
 to modify existing topic:
 https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool.
 To keep the invariant  " PersistentActor is the only writer to a
 partitioned journal topic" you would have to create those partitions
 dynamically (usually you don't know up front how many PersistentActors your
 system will have) on per-PersistentActor basis.


 You're right. If you want to keep all data in Kafka without ever
 deleting them, you'd need to add partitions dynamically (which is currently
 possible with APIs that back the CLI). On the other hand, using Kafka this
 way is the wrong approach IMO. If you really need to keep the full event
 history, keep old events on HDFS or wherever and only the more recent ones
 in Kafka (where a full replay must first read from HDFS and then from
 Kafka) or use a journal plugin that is explicitly designed for long-term
 event storage.

 The main reason why I developed the Kafka plugin was to integrate my
 Akka applications in unified log processing architectures as described in
 Jay Kreps' excellent article.
 Also mentioned in this article is a snapshotting strategy that fits typical
 retention times in Kafka.


  On the other hand maybe you are assuming that each actor is writing
 to different topic


 yes, and the Kafka plugin is currently implemented that way.

  - but I think this solution is not viable because information about
 topics is limited by ZK and other factors:
 http://grokbase.com/t/kafka/users/133v60ng6v/limit-on-number-of-kafka-topic.


 A more in-depth discussion about these limitations is given at
 http://www.quora.com/How-many-topics-can-be-created-in-Apache-Kafka
 with a detailed comment from Jay. I'd say that if you designed your
 application to run more than a few hundred persistent actors, then the
 Kafka plugin is the probably wrong choice. I tend to design my applications
 to have only a small number of persistent actors (which is in contrast to
 many other discussions on akka-user) which makes the Kafka plugin a good
 candidate.

 To recap, the Kafka plugin is a reasonable choice if frequent snapshotting
 is done by persistent actors (every day or so), you don't have more than a
 few hundred persistent actors, and your application is a component of a
 unified log processing architecture (backed by Kafka).

Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Andrzej Dębski


> You're right. If you want to keep all data in Kafka without ever deleting 
> them, you'd need to add partitions dynamically (which is currently possible 
> with APIs that back the CLI). On the other hand, using Kafka this way is 
> the wrong approach IMO. If you really need to keep the full event history, 
> keep old events on HDFS or wherever and only the more recent ones in Kafka 
> (where a full replay must first read from HDFS and then from Kafka) or use 
> a journal plugin that is explicitly designed for long-term event storage. 
>

That is what was worrying me about using Kafka in a situation where I 
would want to keep the events forever (or at least for an unknown amount of 
time). The thing that seemed nice is that I would have a journal/event store 
and a pub-sub solution implemented in one technology - basically I want to 
work around the current limitation of PersistentView. I wanted to use a Kafka 
topic and replay all events from the topic to dynamically added read models 
in my cluster. Maybe in this situation I should stick to distributed 
publish-subscribe in the cluster for current event-sending and Cassandra as 
the journal/snapshot store. I have not read that much about Cassandra and the 
way it stores data, so I do not know if reading all events would be easy.

The main reason why I developed the Kafka plugin was to integrate my Akka 
> applications in unified log processing architectures as described in Jay 
> Kreps' excellent article.
> Also mentioned in this article is a snapshotting strategy that fits typical 
> retention times in Kafka.
>

Thanks for the link. 

The most interesting next Kafka plugin feature for me to develop is an HDFS 
> integration for long-term event storage (and full event history replay). 
> WDYT?
>

That would be an interesting feature - it would certainly make the Akka + 
Kafka combination viable for more use cases.

On Tuesday, 26 August 2014 at 19:44:05 UTC+2, Martin Krasser wrote:
>
>  
> On 26.08.14 16:44, Andrzej Dębski wrote:
>  
> My mind must have filtered out the possibility of making snapshots using 
> Views - thanks. 
>
>  About partitions: I suspected as much. The only thing that I am 
> wondering now is: if it is possible to dynamically create partitions in 
> Kafka? AFAIK the number of partitions is set during topic creation (be it 
> programmatically using API or CLI tools) and there is CLI tool you can use 
> to modify existing topic: 
> https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool.
>  
> To keep the invariant  " PersistentActor is the only writer to a 
> partitioned journal topic" you would have to create those partitions 
> dynamically (usually you don't know up front how many PersistentActors your 
> system will have) on per-PersistentActor basis.
>  
>
> You're right. If you want to keep all data in Kafka without ever deleting 
> them, you'd need to add partitions dynamically (which is currently possible 
> with APIs that back the CLI). On the other hand, using Kafka this way is 
> the wrong approach IMO. If you really need to keep the full event history, 
> keep old events on HDFS or wherever and only the more recent ones in Kafka 
> (where a full replay must first read from HDFS and then from Kafka) or use 
> a journal plugin that is explicitly designed for long-term event storage. 
>
> The main reason why I developed the Kafka plugin was to integrate my Akka 
> applications in unified log processing architectures as described in Jay 
> Kreps' excellent article.
> Also mentioned in this article is a snapshotting strategy that fits typical 
> retention times in Kafka.
>
>  
>  On the other hand maybe you are assuming that each actor is writing to 
> different topic 
>  
>
> yes, and the Kafka plugin is currently implemented that way.
>
>  - but I think this solution is not viable because information about 
> topics is limited by ZK and other factors: 
> http://grokbase.com/t/kafka/users/133v60ng6v/limit-on-number-of-kafka-topic.
>  
>
> A more in-depth discussion about these limitations is given at 
> http://www.quora.com/How-many-topics-can-be-created-in-Apache-Kafka with 
> a detailed comment from Jay. I'd say that if you designed your application 
> to run more than a few hundred persistent actors, then the Kafka plugin is 
> the probably wrong choice. I tend to design my applications to have only a 
> small number of persistent actors (which is in contrast to many other 
> discussions on akka-user) which makes the Kafka plugin a good candidate. 
>
> To recap, the Kafka plugin is a reasonable choice if
>
> - frequent snapshotting is done by persistent actors (every day or so)
> - you don't have more than a few hundred persistent actors and
> - your application is a component of a unified log processing architecture
> (backed by Kafka)

Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Martin Krasser


On 26.08.14 20:12, Greg Young wrote:
In particular I am interested in the associated state thats needed, I 
can see keeping it in a single actor but this does not turn out well 
at all for most production systems in particular as changes happen 
over time.


I don't get your point. Please elaborate.




On Tue, Aug 26, 2014 at 2:08 PM, Martin Krasser 
wrote:


See my eventsourced example(s), that I published 1-2 years ago,
others are closed source


On 26.08.14 20:06, Greg Young wrote:

Love to see an example

On Tuesday, August 26, 2014, Martin Krasser
wrote:


On 26.08.14 19:56, Greg Young wrote:

I'm curious how you would model say bank accounts with only
a few hundred actors can you go into a bit of detail


persistent-actor : bank-account = 1:n (instead of 1:1)



On Tuesday, August 26, 2014, Martin Krasser
 wrote:


On 26.08.14 16:44, Andrzej Dębski wrote:

My mind must have filtered out the possibility of
making snapshots using Views - thanks.

About partitions: I suspected as much. The only thing
that I am wondering now is: if it is possible to
dynamically create partitions in Kafka? AFAIK the
number of partitions is set during topic creation (be
it programmatically using API or CLI tools) and there
is CLI tool you can use to modify existing topic:

https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool.
To keep the invariant  " PersistentActor is the only
writer to a partitioned journal topic" you would have
to create those partitions dynamically (usually you
don't know up front how many PersistentActors your
system will have) on per-PersistentActor basis.


You're right. If you want to keep all data in Kafka
without ever deleting them, you'd need to add partitions
dynamically (which is currently possible with APIs that
back the CLI). On the other hand, using Kafka this way
is the wrong approach IMO. If you really need to keep
the full event history, keep old events on HDFS or
wherever and only the more recent ones in Kafka (where a
full replay must first read from HDFS and then from
Kafka) or use a journal plugin that is explicitly
designed for long-term event storage.

The main reason why I developed the Kafka plugin was to
integrate my Akka applications in unified log processing
architectures as described in Jay Kreps' excellent article.
Also mentioned in this article is a snapshotting
strategy that fits typical retention times in Kafka.



On the other hand maybe you are assuming that each
actor is writing to different topic


yes, and the Kafka plugin is currently implemented that way.


- but I think this solution is not viable because
information about topics is limited by ZK and other
factors:

http://grokbase.com/t/kafka/users/133v60ng6v/limit-on-number-of-kafka-topic.


A more in-depth discussion about these limitations is
given at
http://www.quora.com/How-many-topics-can-be-created-in-Apache-Kafka
with a detailed comment from Jay. I'd say that if you
designed your application to run more than a few hundred
persistent actors, then the Kafka plugin is the probably
wrong choice. I tend to design my applications to have
only a small number of persistent actors (which is in
contrast to many other discussions on akka-user) which
makes the Kafka plugin a good candidate.

To recap, the Kafka plugin is a reasonable choice if

- frequent snapshotting is done by persistent actors
(every day or so)
- you don't have more than a few hundred persistent
actors and
- your application is a component of a unified log
processing architecture (backed by Kafka)

The most interesting next Kafka plugin feature for me to
develop is an HDFS integration for long-term event
storage (and full event history replay). WDYT?



On Tuesday, 26 August 2014 at 15:28:47 UTC+2, Martin Krasser wrote:

Hi Andrzej,

On 26.08.14 09:15, Andrzej Dębski wrote:

Hello

Lately I have been reading about a possibility of using Apache
Kafka as journal/snapshot store for akka-persistence.

Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Greg Young
In particular I am interested in the associated state that's needed. I can
see keeping it in a single actor, but this does not turn out well at all for
most production systems, in particular as changes happen over time.


On Tue, Aug 26, 2014 at 2:08 PM, Martin Krasser 
wrote:

>  See my eventsourced example(s), that I published 1-2 years ago, others
> are closed source
>
>
> On 26.08.14 20:06, Greg Young wrote:
>
> Love to see an example
>
> On Tuesday, August 26, 2014, Martin Krasser 
> wrote:
>
>>
>> On 26.08.14 19:56, Greg Young wrote:
>>
>> I'm curious how you would model say bank accounts with only a few hundred
>> actors can you go into a bit of detail
>>
>>
>> persistent-actor : bank-account = 1:n (instead of 1:1)
>>
>>
>> On Tuesday, August 26, 2014, Martin Krasser 
>> wrote:
>>
>>>
>>> On 26.08.14 16:44, Andrzej Dębski wrote:
>>>
>>> My mind must have filtered out the possibility of making snapshots using
>>> Views - thanks.
>>>
>>>  About partitions: I suspected as much. The only thing that I am
>>> wondering now is: if it is possible to dynamically create partitions in
>>> Kafka? AFAIK the number of partitions is set during topic creation (be it
>>> programmatically using API or CLI tools) and there is CLI tool you can use
>>> to modify existing topic:
>>> https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool.
>>> To keep the invariant  " PersistentActor is the only writer to a
>>> partitioned journal topic" you would have to create those partitions
>>> dynamically (usually you don't know up front how many PersistentActors your
>>> system will have) on per-PersistentActor basis.
>>>
>>>
>>> You're right. If you want to keep all data in Kafka without ever
>>> deleting them, you'd need to add partitions dynamically (which is currently
>>> possible with APIs that back the CLI). On the other hand, using Kafka this
>>> way is the wrong approach IMO. If you really need to keep the full event
>>> history, keep old events on HDFS or wherever and only the more recent ones
>>> in Kafka (where a full replay must first read from HDFS and then from
>>> Kafka) or use a journal plugin that is explicitly designed for long-term
>>> event storage.
>>>
>>> The main reason why I developed the Kafka plugin was to integrate my
>>> Akka applications in unified log processing architectures as described in
>>> Jay Kreps' excellent article.
>>> Also mentioned in this article is a snapshotting strategy that fits typical
>>> retention times in Kafka.
>>>
>>>
>>>  On the other hand maybe you are assuming that each actor is writing to
>>> different topic
>>>
>>>
>>> yes, and the Kafka plugin is currently implemented that way.
>>>
>>>  - but I think this solution is not viable because information about
>>> topics is limited by ZK and other factors:
>>> http://grokbase.com/t/kafka/users/133v60ng6v/limit-on-number-of-kafka-topic.
>>>
>>>
>>> A more in-depth discussion about these limitations is given at
>>> http://www.quora.com/How-many-topics-can-be-created-in-Apache-Kafka
>>> with a detailed comment from Jay. I'd say that if you designed your
>>> application to run more than a few hundred persistent actors, then the
>>> Kafka plugin is the probably wrong choice. I tend to design my applications
>>> to have only a small number of persistent actors (which is in contrast to
>>> many other discussions on akka-user) which makes the Kafka plugin a good
>>> candidate.
>>>
>>> To recap, the Kafka plugin is a reasonable choice if
>>>
>>> - frequent snapshotting is done by persistent actors (every day or so)
>>> - you don't have more than a few hundred persistent actors and
>>> - your application is a component of a unified log processing
>>> architecture (backed by Kafka)
>>>
>>> The most interesting next Kafka plugin feature for me to develop is an
>>> HDFS integration for long-term event storage (and full event history
>>> replay). WDYT?
>>>
>>>
>>> On Tuesday, 26 August 2014 at 15:28:47 UTC+2, Martin Krasser wrote:

  Hi Andrzej,

 On 26.08.14 09:15, Andrzej Dębski wrote:

 Hello

  Lately I have been reading about a possibility of using Apache Kafka
 as journal/snapshot store for akka-persistence.

 I am aware of the plugin created by Martin Krasser:
 https://github.com/krasserm/akka-persistence-kafka/ and also I read
 other topic about Kafka as journal
 https://groups.google.com/forum/#!searchin/akka-user/kakfka/akka-user/iIHmvC6bVrI/zeZJtW0_6FwJ.

  In both sources I linked two ideas were presented:

  1. Set log retention to 7 days, take snapshots every 3 days (example
 values)
 2. Set log retention to unlimited.

  Here is the first question: in first case wouldn't it mean that
 persistent views would receive a skewed view of the PersistentActor state
 (only events from 7 days) - is it really a viable solution?

Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Martin Krasser
See my eventsourced example(s), that I published 1-2 years ago, others 
are closed source


On 26.08.14 20:06, Greg Young wrote:

Love to see an example

On Tuesday, August 26, 2014, Martin Krasser wrote:



On 26.08.14 19:56, Greg Young wrote:

I'm curious how you would model say bank accounts with only a few
hundred actors can you go into a bit of detail


persistent-actor : bank-account = 1:n (instead of 1:1)



On Tuesday, August 26, 2014, Martin Krasser
wrote:


On 26.08.14 16:44, Andrzej Dębski wrote:

My mind must have filtered out the possibility of making
snapshots using Views - thanks.

About partitions: I suspected as much. The only thing that I
am wondering now is: if it is possible to dynamically create
partitions in Kafka? AFAIK the number of partitions is set
during topic creation (be it programmatically using API or
CLI tools) and there is CLI tool you can use to modify
existing topic:

https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool.
To keep the invariant  " PersistentActor is the only writer
to a partitioned journal topic" you would have to create
those partitions dynamically (usually you don't know up
front how many PersistentActors your system will have) on
per-PersistentActor basis.


You're right. If you want to keep all data in Kafka without
ever deleting them, you'd need to add partitions dynamically
(which is currently possible with APIs that back the CLI). On
the other hand, using Kafka this way is the wrong approach
IMO. If you really need to keep the full event history, keep
old events on HDFS or wherever and only the more recent ones
in Kafka (where a full replay must first read from HDFS and
then from Kafka) or use a journal plugin that is explicitly
designed for long-term event storage.

The main reason why I developed the Kafka plugin was to
integrate my Akka applications in unified log processing
architectures as described in Jay Kreps' excellent article.
Also mentioned in this article is a snapshotting strategy
that fits typical retention times in Kafka.



On the other hand maybe you are assuming that each actor is
writing to different topic


yes, and the Kafka plugin is currently implemented that way.


- but I think this solution is not viable because
information about topics is limited by ZK and other factors:

http://grokbase.com/t/kafka/users/133v60ng6v/limit-on-number-of-kafka-topic.


A more in-depth discussion about these limitations is given
at
http://www.quora.com/How-many-topics-can-be-created-in-Apache-Kafka
with a detailed comment from Jay. I'd say that if you
designed your application to run more than a few hundred
persistent actors, then the Kafka plugin is the probably
wrong choice. I tend to design my applications to have only a
small number of persistent actors (which is in contrast to
many other discussions on akka-user) which makes the Kafka
plugin a good candidate.

To recap, the Kafka plugin is a reasonable choice if

- frequent snapshotting is done by persistent actors (every
day or so)
- you don't have more than a few hundred persistent actors and
- your application is a component of a unified log processing
architecture (backed by Kafka)

The most interesting next Kafka plugin feature for me to
develop is an HDFS integration for long-term event storage
(and full event history replay). WDYT?



On Tuesday, 26 August 2014 at 15:28:47 UTC+2, Martin Krasser wrote:

Hi Andrzej,

On 26.08.14 09:15, Andrzej Dębski wrote:

Hello

Lately I have been reading about a possibility of using
Apache Kafka as journal/snapshot store for
akka-persistence.

I am aware of the plugin created by Martin Krasser:
https://github.com/krasserm/akka-persistence-kafka/ and
also I read other topic about Kafka as journal

https://groups.google.com/forum/#!searchin/akka-user/kakfka/akka-user/iIHmvC6bVrI/zeZJtW0_6FwJ.

In both sources I linked two ideas were presented:

1. Set log retention to 7 days, take snapshots every 3
days (example values)
2. Set log retention to unlimited.

Here is the first question: in the first case wouldn't it mean that persistent
views would receive a skewed view of the PersistentActor state (only events
from 7 days) - is it really a viable solution?

Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Greg Young
Love to see an example

On Tuesday, August 26, 2014, Martin Krasser  wrote:

>
> On 26.08.14 19:56, Greg Young wrote:
>
> I'm curious how you would model say bank accounts with only a few hundred
> actors can you go into a bit of detail
>
>
> persistent-actor : bank-account = 1:n (instead of 1:1)
>
>
> On Tuesday, August 26, 2014, Martin Krasser wrote:
>
>>
>> On 26.08.14 16:44, Andrzej Dębski wrote:
>>
>> My mind must have filtered out the possibility of making snapshots using
>> Views - thanks.
>>
>>  About partitions: I suspected as much. The only thing that I am
>> wondering now is: if it is possible to dynamically create partitions in
>> Kafka? AFAIK the number of partitions is set during topic creation (be it
>> programmatically using API or CLI tools) and there is CLI tool you can use
>> to modify existing topic:
>> https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool.
>> To keep the invariant  " PersistentActor is the only writer to a
>> partitioned journal topic" you would have to create those partitions
>> dynamically (usually you don't know up front how many PersistentActors your
>> system will have) on per-PersistentActor basis.
>>
>>
>> You're right. If you want to keep all data in Kafka without ever deleting
>> them, you'd need to add partitions dynamically (which is currently possible
>> with APIs that back the CLI). On the other hand, using Kafka this way is
>> the wrong approach IMO. If you really need to keep the full event history,
>> keep old events on HDFS or wherever and only the more recent ones in Kafka
>> (where a full replay must first read from HDFS and then from Kafka) or use
>> a journal plugin that is explicitly designed for long-term event storage.
>>
>> The main reason why I developed the Kafka plugin was to integrate my Akka
>> applications in unified log processing architectures as described in Jay
>> Kreps' excellent article.
>> Also mentioned in this article is a snapshotting strategy that fits typical
>> retention times in Kafka.
>>
>>
>>  On the other hand maybe you are assuming that each actor is writing to
>> different topic
>>
>>
>> yes, and the Kafka plugin is currently implemented that way.
>>
>>  - but I think this solution is not viable because information about
>> topics is limited by ZK and other factors:
>> http://grokbase.com/t/kafka/users/133v60ng6v/limit-on-number-of-kafka-topic.
>>
>>
>> A more in-depth discussion about these limitations is given at
>> http://www.quora.com/How-many-topics-can-be-created-in-Apache-Kafka with
>> a detailed comment from Jay. I'd say that if you designed your application
>> to run more than a few hundred persistent actors, then the Kafka plugin is
>> the probably wrong choice. I tend to design my applications to have only a
>> small number of persistent actors (which is in contrast to many other
>> discussions on akka-user) which makes the Kafka plugin a good candidate.
>>
>> To recap, the Kafka plugin is a reasonable choice if
>>
>> - frequent snapshotting is done by persistent actors (every day or so)
>> - you don't have more than a few hundred persistent actors and
>> - your application is a component of a unified log processing
>> architecture (backed by Kafka)
>>
>> The most interesting next Kafka plugin feature for me to develop is an
>> HDFS integration for long-term event storage (and full event history
>> replay). WDYT?
>>
>>
>> On Tuesday, 26 August 2014 at 15:28:47 UTC+2, Martin Krasser wrote:
>>>
>>>  Hi Andrzej,
>>>
>>> On 26.08.14 09:15, Andrzej Dębski wrote:
>>>
>>> Hello
>>>
>>>  Lately I have been reading about a possibility of using Apache Kafka
>>> as journal/snapshot store for akka-persistence.
>>>
>>> I am aware of the plugin created by Martin Krasser:
>>> https://github.com/krasserm/akka-persistence-kafka/ and also I read
>>> other topic about Kafka as journal
>>> https://groups.google.com/forum/#!searchin/akka-user/kakfka/akka-user/iIHmvC6bVrI/zeZJtW0_6FwJ
>>> .
>>>
>>>  In both sources I linked two ideas were presented:
>>>
>>>  1. Set log retention to 7 days, take snapshots every 3 days (example
>>> values)
>>> 2. Set log retention to unlimited.
>>>
>>>  Here is the first question: in first case wouldn't it mean that
>>> persistent views would receive skewed view of the PersistentActor state
>>> (only events from 7 days) - is it really viable solution? As far as I know
>>> PersistentView can only receive events - it can't receive snapshots from
>>> corresponding PersistentActor (which is good in general case).
>>>
>>>
>>> PersistentViews can create their own snapshots which are isolated from
>>> the corresponding PersistentActor's snapshots.
>>>
>>>
>>>  Second question (more directed to Martin): in the thread I linked you
>>> wrote:
>>>
>>   I don't go into Kafka partitioning details here but it is possible to
>> implement the journal driver in a way that both a single persistent actor's
>> data are partitioned *and* kept in order

Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Martin Krasser


On 26.08.14 19:56, Greg Young wrote:
I'm curious how you would model say bank accounts with only a few 
hundred actors can you go into a bit of detail


persistent-actor : bank-account = 1:n (instead of 1:1)
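
A minimal sketch of what that 1:n ratio can look like (all names below are 
assumptions; this is not taken from the closed-source examples referenced 
elsewhere in this thread): one PersistentActor journals events for many 
accounts and snapshots the whole map to bound replay length.

import akka.persistence.{PersistentActor, SnapshotOffer}

case class AccountOpened(accountId: String, branch: String)
case class MoneyDeposited(accountId: String, amount: BigDecimal)

class Accounts extends PersistentActor {
  override def persistenceId = "accounts"

  // n accounts live inside one (1) persistent actor.
  var balances = Map.empty[String, BigDecimal]

  def receiveCommand = {
    case evt @ AccountOpened(id, _) =>
      persist(evt) { _ => balances += id -> BigDecimal(0) }
    case evt @ MoneyDeposited(id, amount) if balances.contains(id) =>
      persist(evt) { _ => balances += id -> (balances(id) + amount) }
    case "snap" =>
      saveSnapshot(balances) // bounds replay length for the whole account map
  }

  def receiveRecover = {
    case SnapshotOffer(_, snapshot: Map[String, BigDecimal] @unchecked) =>
      balances = snapshot
    case AccountOpened(id, _)       => balances += id -> BigDecimal(0)
    case MoneyDeposited(id, amount) => balances += id -> (balances(id) + amount)
  }
}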



On Tuesday, August 26, 2014, Martin Krasser wrote:



On 26.08.14 16:44, Andrzej Dębski wrote:

My mind must have filtered out the possibility of making
snapshots using Views - thanks.

About partitions: I suspected as much. The only thing that I am
wondering now is: if it is possible to dynamically create
partitions in Kafka? AFAIK the number of partitions is set during
topic creation (be it programmatically using API or CLI tools)
and there is CLI tool you can use to modify existing topic:

https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool.
To keep the invariant  " PersistentActor is the only writer to a
partitioned journal topic" you would have to create those
partitions dynamically (usually you don't know up front how many
PersistentActors your system will have) on per-PersistentActor basis.


You're right. If you want to keep all data in Kafka without ever
deleting them, you'd need to add partitions dynamically (which is
currently possible with APIs that back the CLI). On the other
hand, using Kafka this way is the wrong approach IMO. If you
really need to keep the full event history, keep old events on
HDFS or wherever and only the more recent ones in Kafka (where a
full replay must first read from HDFS and then from Kafka) or use
a journal plugin that is explicitly designed for long-term event
storage.

The main reason why I developed the Kafka plugin was to integrate
my Akka applications in unified log processing architectures as
described in Jay Kreps' excellent article.
Also mentioned in this article is a snapshotting strategy that
fits typical retention times in Kafka.



On the other hand maybe you are assuming that each actor is
writing to different topic


yes, and the Kafka plugin is currently implemented that way.


- but I think this solution is not viable because information
about topics is limited by ZK and other factors:
http://grokbase.com/t/kafka/users/133v60ng6v/limit-on-number-of-kafka-topic.


A more in-depth discussion about these limitations is given at
http://www.quora.com/How-many-topics-can-be-created-in-Apache-Kafka with
a detailed comment from Jay. I'd say that if you designed your
application to run more than a few hundred persistent actors, then
the Kafka plugin is the probably wrong choice. I tend to design my
applications to have only a small number of persistent actors
(which is in contrast to many other discussions on akka-user)
which makes the Kafka plugin a good candidate.

To recap, the Kafka plugin is a reasonable choice if

- frequent snapshotting is done by persistent actors (every day or so)
- you don't have more than a few hundred persistent actors and
- your application is a component of a unified log processing
architecture (backed by Kafka)

The most interesting next Kafka plugin feature for me to develop
is an HDFS integration for long-term event storage (and full event
history replay). WDYT?



On Tuesday, 26 August 2014 at 15:28:47 UTC+2, Martin Krasser wrote:

Hi Andrzej,

On 26.08.14 09:15, Andrzej Dębski wrote:

Hello

Lately I have been reading about a possibility of using
Apache Kafka as journal/snapshot store for akka-persistence.

I am aware of the plugin created by Martin Krasser:
https://github.com/krasserm/akka-persistence-kafka/ and also
I read other topic about Kafka as journal

https://groups.google.com/forum/#!searchin/akka-user/kakfka/akka-user/iIHmvC6bVrI/zeZJtW0_6FwJ.

In both sources I linked two ideas were presented:

1. Set log retention to 7 days, take snapshots every 3 days
(example values)
2. Set log retention to unlimited.

Here is the first question: in first case wouldn't it mean
that persistent views would receive skewed view of the
PersistentActor state (only events from 7 days) - is it
really viable solution? As far as I know PersistentView can
only receive events - it can't receive snapshots from
corresponding PersistentActor (which is good in general case).


PersistentViews can create their own snapshots which are
isolated from the corresponding PersistentActor's snapshots.



Second question (more directed to Martin): in the thread I linked you wrote: 
"I don't go into Kafka partitioning details here but it is possible to 
implement the journal driver in a way that both a single persistent actor's 
data are partitioned *and* kept in order".

Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Greg Young
I'm curious how you would model say bank accounts with only a few hundred
actors can you go into a bit of detail

On Tuesday, August 26, 2014, Martin Krasser  wrote:

>
> On 26.08.14 16:44, Andrzej Dębski wrote:
>
> My mind must have filtered out the possibility of making snapshots using
> Views - thanks.
>
>  About partitions: I suspected as much. The only thing that I am
> wondering now is: if it is possible to dynamically create partitions in
> Kafka? AFAIK the number of partitions is set during topic creation (be it
> programmatically using API or CLI tools) and there is CLI tool you can use
> to modify existing topic:
> https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool.
> To keep the invariant  " PersistentActor is the only writer to a
> partitioned journal topic" you would have to create those partitions
> dynamically (usually you don't know up front how many PersistentActors your
> system will have) on per-PersistentActor basis.
>
>
> You're right. If you want to keep all data in Kafka without ever deleting
> them, you'd need to add partitions dynamically (which is currently possible
> with APIs that back the CLI). On the other hand, using Kafka this way is
> the wrong approach IMO. If you really need to keep the full event history,
> keep old events on HDFS or wherever and only the more recent ones in Kafka
> (where a full replay must first read from HDFS and then from Kafka) or use
> a journal plugin that is explicitly designed for long-term event storage.
>
> The main reason why I developed the Kafka plugin was to integrate my Akka
> applications in unified log processing architectures as described in Jay
> Kreps' excellent article.
> Also mentioned in this article is a snapshotting strategy that fits typical
> retention times in Kafka.
>
>
>  On the other hand maybe you are assuming that each actor is writing to
> different topic
>
>
> yes, and the Kafka plugin is currently implemented that way.
>
>  - but I think this solution is not viable because information about
> topics is limited by ZK and other factors:
> http://grokbase.com/t/kafka/users/133v60ng6v/limit-on-number-of-kafka-topic.
>
>
> A more in-depth discussion about these limitations is given at
> http://www.quora.com/How-many-topics-can-be-created-in-Apache-Kafka with
> a detailed comment from Jay. I'd say that if you designed your application
> to run more than a few hundred persistent actors, then the Kafka plugin is
> the probably wrong choice. I tend to design my applications to have only a
> small number of persistent actors (which is in contrast to many other
> discussions on akka-user) which makes the Kafka plugin a good candidate.
>
> To recap, the Kafka plugin is a reasonable choice if
>
> - frequent snapshotting is done by persistent actors (every day or so)
> - you don't have more than a few hundred persistent actors and
> - your application is a component of a unified log processing architecture
> (backed by Kafka)
>
> The most interesting next Kafka plugin feature for me to develop is an
> HDFS integration for long-term event storage (and full event history
> replay). WDYT?
>
>
> On Tuesday, 26 August 2014 at 15:28:47 UTC+2, Martin Krasser wrote:
>>
>>  Hi Andrzej,
>>
>> On 26.08.14 09:15, Andrzej Dębski wrote:
>>
>> Hello
>>
>>  Lately I have been reading about a possibility of using Apache Kafka as
>> journal/snapshot store for akka-persistence.
>>
>> I am aware of the plugin created by Martin Krasser: https://github.com/
>> krasserm/akka-persistence-kafka/ and also I read other topic about Kafka
>> as journal https://groups.google.com/forum/#!searchin/akka-
>> user/kakfka/akka-user/iIHmvC6bVrI/zeZJtW0_6FwJ.
>>
>>  In both sources I linked two ideas were presented:
>>
>>  1. Set log retention to 7 days, take snapshots every 3 days (example
>> values)
>> 2. Set log retention to unlimited.
>>
>>  Here is the first question: in first case wouldn't it mean that
>> persistent views would receive skewed view of the PersistentActor state
>> (only events from 7 days) - is it really viable solution? As far as I know
>> PersistentView can only receive events - it can't receive snapshots from
>> corresponding PersistentActor (which is good in general case).
>>
>>
>> PersistentViews can create their own snapshots which are isolated from
>> the corresponding PersistentActor's snapshots.
>>
>>
>>  Second question (more directed to Martin): in the thread I linked you
>> wrote:
>>
>>   I don't go into Kafka partitioning details here but it is possible to
>>> implement the journal driver in a way that both a single persistent actor's
>>> data are partitioned *and* kept in order
>>>
>>
>>   I am very interested in this idea. AFAIK it is not yet implemented in
>> current plugin but I was wondering if you could share a high level idea of
>> how you would achieve that (one persistent actor, multiple partitions).
>

Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Martin Krasser


On 26.08.14 16:44, Andrzej Dębski wrote:
My mind must have filtered out the possibility of making snapshots 
using Views - thanks.


About partitions: I suspected as much. The only thing that I am 
wondering now is: if it is possible to dynamically create partitions 
in Kafka? AFAIK the number of partitions is set during topic creation 
(be it programmatically using API or CLI tools) and there is CLI tool 
you can use to modify existing topic: 
https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool. 
To keep the invariant  " PersistentActor is the only writer to a 
partitioned journal topic" you would have to create those partitions 
dynamically (usually you don't know up front how many PersistentActors 
your system will have) on per-PersistentActor basis.


You're right. If you want to keep all data in Kafka without ever 
deleting them, you'd need to add partitions dynamically (which is 
currently possible with APIs that back the CLI). On the other hand, 
using Kafka this way is the wrong approach IMO. If you really need to 
keep the full event history, keep old events on HDFS or wherever and 
only the more recent ones in Kafka (where a full replay must first read 
from HDFS and then from Kafka) or use a journal plugin that is 
explicitly designed for long-term event storage.


The main reason why I developed the Kafka plugin was to integrate my 
Akka applications in unified log processing architectures as described in 
Jay Kreps' excellent article. 
Also mentioned in this article is a snapshotting strategy that fits 
typical retention times in Kafka.




On the other hand maybe you are assuming that each actor is writing to 
different topic


yes, and the Kafka plugin is currently implemented that way.

- but I think this solution is not viable because information about 
topics is limited by ZK and other 
factors: http://grokbase.com/t/kafka/users/133v60ng6v/limit-on-number-of-kafka-topic.


A more in-depth discussion about these limitations is given at 
http://www.quora.com/How-many-topics-can-be-created-in-Apache-Kafka with 
a detailed comment from Jay. I'd say that if you designed your 
application to run more than a few hundred persistent actors, then the 
Kafka plugin is probably the wrong choice. I tend to design my 
applications to have only a small number of persistent actors (which is 
in contrast to many other discussions on akka-user) which makes the 
Kafka plugin a good candidate.


To recap, the Kafka plugin is a reasonable choice if

- frequent snapshotting is done by persistent actors (every day or so)
- you don't have more than a few hundred persistent actors and
- your application is a component of a unified log processing 
architecture (backed by Kafka)
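
For completeness, roughly how that looks when wired together. The 
kafka-journal / kafka-snapshot-store plugin ids below are assumptions (the 
plugin's reference.conf defines the authoritative keys), while 
akka.persistence.journal.plugin itself is standard akka-persistence 
configuration, and the daily "snap" message is just one way to get the 
frequent snapshotting mentioned in the recap above.

import akka.actor.{ActorSystem, Props}
import akka.persistence.{PersistentActor, RecoveryCompleted, SnapshotOffer}
import com.typesafe.config.ConfigFactory
import scala.concurrent.duration._

// A stand-in persistent actor that snapshots on demand (names are illustrative only).
class SnapshottingWriter extends PersistentActor {
  override def persistenceId = "snapshotting-writer"
  var eventCount = 0L
  def receiveCommand = {
    case "snap" => saveSnapshot(eventCount) // keeps replays bounded to recent events
    case cmd    => persist(cmd) { _ => eventCount += 1 }
  }
  def receiveRecover = {
    case SnapshotOffer(_, n: Long) => eventCount = n
    case RecoveryCompleted         => // nothing to do
    case _                         => eventCount += 1
  }
}

object KafkaJournalApp extends App {
  // Plugin ids are assumptions; check the plugin's reference.conf for the real ones.
  val config = ConfigFactory.parseString(
    """
      |akka.persistence.journal.plugin = "kafka-journal"
      |akka.persistence.snapshot-store.plugin = "kafka-snapshot-store"
    """.stripMargin).withFallback(ConfigFactory.load())

  val system = ActorSystem("app", config)
  import system.dispatcher

  val writer = system.actorOf(Props[SnapshottingWriter], "writer")
  // "Frequent snapshotting (every day or so)", as in the recap above.
  system.scheduler.schedule(1.day, 1.day, writer, "snap")
}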


The most interesting next Kafka plugin feature for me to develop is an 
HDFS integration for long-term event storage (and full event history 
replay). WDYT?




On Tuesday, 26 August 2014 at 15:28:47 UTC+2, Martin Krasser wrote:


Hi Andrzej,

On 26.08.14 09:15, Andrzej Dębski wrote:

Hello

Lately I have been reading about a possibility of using Apache
Kafka as journal/snapshot store for akka-persistence.

I am aware of the plugin created by Martin Krasser:
https://github.com/krasserm/akka-persistence-kafka/
and also I
read another topic about Kafka as journal:
https://groups.google.com/forum/#!searchin/akka-user/kakfka/akka-user/iIHmvC6bVrI/zeZJtW0_6FwJ.

In both sources I linked two ideas were presented:

1. Set log retention to 7 days, take snapshots every 3 days
(example values)
2. Set log retention to unlimited.

Here is the first question: in first case wouldn't it mean that
persistent views would receive skewed view of the PersistentActor
state (only events from 7 days) - is it really viable solution?
As far as I know PersistentView can only receive events - it
can't receive snapshots from corresponding PersistentActor (which
is good in general case).


PersistentViews can create their own snapshots which are isolated
from the corresponding PersistentActor's snapshots.
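
A hedged sketch of that (the event and identifier names are assumptions): the 
view derives its own state from the writer's stream and snapshots it under 
its own viewId, so a bounded Kafka retention only limits how far back the 
view has to rebuild beyond its last snapshot.

import akka.persistence.{PersistentView, SnapshotOffer}

// Event type journaled by the writer (assumed for this example).
case class MoneyDeposited(accountId: String, amount: BigDecimal)

class AccountStatsView extends PersistentView {
  override def persistenceId = "accounts"            // stream written by the PersistentActor
  override def viewId        = "accounts-stats-view" // this view's own snapshot identity

  var depositCount = 0L

  def receive = {
    case SnapshotOffer(_, snapshot: Long) =>
      depositCount = snapshot                         // recover from the view's own snapshot
    case MoneyDeposited(_, _) if isPersistent =>
      depositCount += 1
      if (depositCount % 1000 == 0)
        saveSnapshot(depositCount)                    // isolated from the writer's snapshots
  }
}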



Second question (more directed to Martin): in the thread I linked
you wrote:

 I don't go into Kafka partitioning details here but it is
possible to implement the journal driver in a way that both a
single persistent actor's data are partitioned *and* kept in
order


 I am very interested in this idea. AFAIK it is not yet
implemented in the current plugin but I was wondering if you could
share a high level idea of how you would achieve that (one persistent
actor, multiple partitions).

Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Greg Young
This is definitely an issue as many people with event sourced systems have 
millions of topics.



Re: [akka-user] Improving Akka Persistence wrt CQRS/ES/DDD

2014-08-26 Thread Ashley Aitken

On Thursday, 21 August 2014 16:01:26 UTC+8, rkuhn wrote:
>
>
> My initial impulse would be to postulate an isomorphism between Actor and 
> AR and then say that the Actor Model works just fine with only direct 
> message sends, so the same must hold for ARs and their commands/events.
>

Unfortunately, I am coming to believe this may not be correct.  Akka 
message passing and related functionality is fantastic for processing 
within an aggregate, and Akka Persistence is great for aggregate 
persistence.  However, it seems that most modern CQRS/ES implementations 
use and require a reliable EventBus and possibly a reliable CommandBus for 
the *write* side (with the EventBus and projections for the read side). 

I suggest Vaughn's creation of Number 9, AFAIK an implementation of 
reliable messaging to multiple actors (e.g. for events), is a symptom of 
this lack of a reliable EventBus but not the full solution that an EventBus 
(and CommandBus) would provide.  I believe a full CQRS/ES implementation 
may require reliable decoupled command delivery and reliable 
publish-subscribe for events.  

In theory if Client sends a command to A which in order to fulfill it will 
> need to send a command to B then A’s response to the Client will contain 
> B’s reply in some capacity. Whether there are cases where this structure 
> becomes impractical is something I cannot tell with my current knowledge. 
>

No, I believe there are use-cases where a process manager listens for an 
event without having first sent a command to the aggregate root producing 
the event.  An example could be a process manager listening for any 
ItemPicked events and starting a process to check the stock levels and 
possibly reorder, separate from the sales process.  Again this seems to 
require a reliable publish-subscribe facility for events.

Now I am really impressed with the functionality provided by Akka, 
including its routers and distributed pub-sub functionality. And, as we 
have seen, Akka Persistence and its distributed journal / store can 
enable AtLeastOnceDelivery for messages. I thus suggest that the 
distributed store could more generally enable other services to move up to 
a more reliable level of local and distributed functionality (IF NEEDED).  

Currently:  Akka Persistence provides PersistentActor and 
AtLeastOnceDelivery

Future?: Akka Persistence provides PersistentActor, AtLeastOnceDelivery, 
ReliableRouter, ReliableDistributedPubSub, and Reliable...?

Of course, if reliable routing and publish-subscribe can be implemented in 
another way in Akka that would be fine as well. Or perhaps the successful 
"let it crash" approach extends to messaging, i.e. "let it be lost."  If so 
I am keen to find out how to make a CQRS/ES implementation work effectively 
if/when commands are lost and if/when events are not delivered to those 
needing to be notified of them.

The required reliable functionality can obviously be added on top of Akka by 
using other services.  However, I think use of the distributed store that 
comes with Akka Persistence to implement these could remove the need for an 
extra infrastructure component.  And, if this is done with a store that may 
be central to enterprise infrastructure, e.g. Kafka, then that would also 
make integration easier.  

I see something like this was discussed early in 
2013: https://groups.google.com/d/topic/akka-user/cmDna0_Mo58/discussion

Cheers,
Ashley.




Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Andrzej Dębski
My mind must have filtered out the possibility of making snapshots using 
Views - thanks.

About partitions: I suspected as much. The only thing I am wondering 
about now is whether it is possible to dynamically create partitions in Kafka. 
AFAIK the number of partitions is set during topic creation (be it 
programmatically using the API or CLI tools) and there is a CLI tool you can use 
to modify an existing topic: 
https://cwiki.apache.org/confluence/display/KAFKA/Replication+tools#Replicationtools-5.AddPartitionTool.
 
To keep the invariant "a PersistentActor is the only writer to a 
partitioned journal topic" you would have to create those partitions 
dynamically (usually you don't know up front how many PersistentActors your 
system will have) on a per-PersistentActor basis.

On the other hand maybe you are assuming that each actor is writing to 
a different topic - but I think this solution is not viable because 
information about topics is limited by ZK and other 
factors: 
http://grokbase.com/t/kafka/users/133v60ng6v/limit-on-number-of-kafka-topic.

W dniu wtorek, 26 sierpnia 2014 15:28:47 UTC+2 użytkownik Martin Krasser 
napisał:
>
>  Hi Andrzej,
>
> On 26.08.14 09:15, Andrzej Dębski wrote:
>  
> Hello 
>
>  Lately I have been reading about a possibility of using Apache Kafka as 
> journal/snapshot store for akka-persistence. 
>
> I am aware of the plugin created by Martin Krasser: 
> https://github.com/krasserm/akka-persistence-kafka/ and also I read other 
> topic about Kafka as journal 
> https://groups.google.com/forum/#!searchin/akka-user/kakfka/akka-user/iIHmvC6bVrI/zeZJtW0_6FwJ
> .
>
>  In both sources I linked two ideas were presented:
>
>  1. Set log retention to 7 days, take snapshots every 3 days (example 
> values)
> 2. Set log retention to unlimited.
>
>  Here is the first question: in first case wouldn't it mean that 
> persistent views would receive skewed view of the PersistentActor state 
> (only events from 7 days) - is it really viable solution? As far as I know 
> PersistentView can only receive events - it can't receive snapshots from 
> corresponding PersistentActor (which is good in general case).
>  
>
> PersistentViews can create their own snapshots which are isolated from the 
> corresponding PersistentActor's snapshots.
>
>  
>  Second question (more directed to Martin): in the thread I linked you 
> wrote: 
>
>   I don't go into Kafka partitioning details here but it is possible to 
>> implement the journal driver in a way that both a single persistent actor's 
>> data are partitioned *and* kept in order
>>
>
>   I am very interested in this idea. AFAIK it is not yet implemented in 
> current plugin but I was wondering if you could share high level idea how 
> would you achieve that (one persistent actor, multiple partitions, ordering 
> ensured)?
>  
>
> The idea is to
>
> - first write events 1 to n to partition 1
> - then write events n+1 to 2n to partition 2
> - then write events 2n+1 to 3n to partition 3
> - ... and so on
>
> This works because a PersistentActor is the only writer to a partitioned 
> journal topic. During replay, you first replay partition 1, then partition 
> 2 and so on. This should be rather easy to implement in the Kafka journal, 
> just didn't have time so far; pull requests are welcome :) Btw, the Cassandra 
> journal  follows 
> the very same strategy for scaling with data volume (by using different 
> partition keys). 
>
> Cheers,
> Martin
>
>
> -- 
> Martin Krasser
>
> blog:http://krasserm.blogspot.com
> code:http://github.com/krasserm
> twitter: http://twitter.com/mrt1nz
>
> 



Re: [akka-user] Periodic "Disassociated" with remote system

2014-08-26 Thread Caoyuan
Hi Roland,

The cluster is based on https://github.com/wandoulabs/spray-socketio. We, 
Wandou Labs ( http://www.snappea.com/ ), are going to use it for at least 
10+ million persistent connections, from mobile devices to our service. 
These mobile devices can then share status, push messages, fire real-time 
events, virtually connect to each other, etc.

Feel free for more questions :-)

Regards,
Caoyuan Deng ( https://github.com/dcaoyuan )

On Tuesday, August 26, 2014 10:10:27 PM UTC+8, rkuhn wrote:
>
> Hi Caoyuan,
>
> 26 aug 2014 kl. 09:51 skrev Caoyuan >:
>
>
>
> On Monday, August 25, 2014 6:31:15 PM UTC+8, Akka Team wrote:
>>
>> Hi Caouyan,
>>
>> It is usually dangerous to set the heartbeat-pause to a lesser value than 
>> the heartbeat interval itself. If a heartbeat gets lost, then the next 
>> heartbeat will definitely not make the deadline. I recommend to set it to a 
>> larger value. Also, I would go with a lower heartbeat-interval setting, 10s 
>> seems more appropriate if you want low heartbeat traffic.
>>
>> -Endre
>>
>
> Got it now. Thanks. 
>
> BTW, our cluster has run for 15 days with 1 million long-connections, stable 
> and consistent.
>
>
> That’s great to hear, and it does make me a bit curious about the rest of 
> the story: care to share it privately or even publicly?
>
> Regards,
>
> Roland
>
>  
>
>>
>>
>> On Mon, Aug 25, 2014 at 9:31 AM, Caoyuan  wrote:
>>
>>> Update Aug 25, 2014:
>>>
>>> We changed 
>>> akka.remote.transport-failure-detector.acceptable-heartbeat-pause = 10 s 
>>> instead of 5 s, the WARN message gone. I guess the [Disassociated] WARN 
>>> might be caused by network delay or GC pause (Full GC lasts 3+ secs now on 
>>> our system) etc. The setting is
>>>
>>> akka.remote {
>>>  transport-failure-detector {
>>> heartbeat-interval = 30 s   # default 4s
>>>acceptable-heartbeat-pause = 10 s  # default 10s
>>>  }
>>> }
>>>
>>> But, that could not explain the periodic "Disassociated" WARN occurred 
>>> before, which, seems could not be recovered from Disassociated state.
>>>
>>> On Monday, August 11, 2014 12:08:00 AM UTC+8, Caoyuan wrote:

 We have an akka cluster with 10 nodes. it works almost smoothly except 
 periodic firing "Disassociated" WARN log, which seems cannot be recovered:

 The following is the log records.

 ..
 2014-08-10 00:00:09,253 WARN  a.remote.ReliableDeliverySupervisor 
 akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/
 reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%
 4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://
 ClusterSystem@10.0.65.3:2552] has failed, address is now gated for 
 [5000] ms. Reason is: [Disassociated].

 2014-08-10 00:00:44,292 WARN  a.remote.ReliableDeliverySupervisor 
 akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/
 reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%
 4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://
 ClusterSystem@10.0.65.3:2552] has failed, address is now gated for 
 [5000] ms. Reason is: [Disassociated].

 2014-08-10 00:01:49,332 WARN  a.remote.ReliableDeliverySupervisor 
 akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/
 reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%
 4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://
 ClusterSystem@10.0.65.3:2552] has failed, address is now gated for 
 [5000] ms. Reason is: [Disassociated].

 2014-08-10 00:02:24,373 WARN  a.remote.ReliableDeliverySupervisor 
 akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/
 reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%
 4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://
 ClusterSystem@10.0.65.3:2552] has failed, address is now gated for 
 [5000] ms. Reason is: [Disassociated].

 2014-08-10 00:02:59,412 WARN  a.remote.ReliableDeliverySupervisor 
 akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/
 reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%
 4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://
 ClusterSystem@10.0.65.3:2552] has failed, address is now gated for 
 [5000] ms. Reason is: [Disassociated].

 2014-08-10 00:03:34,452 WARN  a.remote.ReliableDeliverySupervisor 
 akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/
 reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%
 4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://
 ClusterSystem@10.0.65.3:2552] has failed, address is now gated for 
 [5000] ms. Reason is: [Disassociated].
 ..


 The warning continually occurred almost all day, with the period 35 
 seconds (30 + 5 s) or 65 seconds (30 + 30 + 5 s), which is exactly the 
 setting of akka.remote's transport failure detector:

 akka.remote {
  transport-failure-detect

Re: [akka-user] Re: Akka persistentChannel does not delete message from Journal upon confirm

2014-08-26 Thread 'Michael Overmeyer' via Akka User List
For what it's worth, I checked out the code that Manas provided 
(https://github.com/manasdebashiskar/akkapersistenceExample) and changed 
the build.scala to use Scala 2.10.4.
That itself changed nothing, and the problem still existed.

However, when I changed the build.scala to use Akka 2.3.5 (ignoring the 
deprecation warnings since Persistent Channels, etc, are now deprecated), 
this problem seems to have gone away.

On Wednesday, 26 March 2014 10:06:23 UTC-4, Martin Krasser wrote:
>
>  
> On 26.03.14 14:20, Manas Kar wrote:
>  
> Thanks Martin for helping me with this. I will keep an eye on this ticket. 
> I realized that I never thanked you for the awesome piece of work you have 
> done on Akka persistence to make our life so easy. 
>  
>
> Thank you!
>
>  
>  ...Manas
>
> On Wednesday, March 26, 2014 3:46:29 AM UTC-4, Martin Krasser wrote: 
>>
>>  Hi Manas,
>>
>> I can reproduce what you reported and need to investigate that further. I 
>> verified that the written messages are actually deleted and that there are 
>> no additional writes of messages or confirmation markers made by your app. 
>> I'm quite busy with other stuff at the moment, so it may take a while. 
>> Ticket created: https://www.assembla.com/spaces/akka/tickets/3962
>>
>> Thanks for testing and reporting
>>
>> Cheers,
>> Martin
>>
>> On 25.03.14 19:25, Manas Kar wrote:
>>  
>> Hi Martin,  
>>  To answer your previous question nothing happens if I restart my 
>> application. The persisted files don't get deleted. I also tested the code 
>> with a smaller size message and was able to see the same problem.
>>  I have created a very simple runnable application that uses 
>> PersistenceChannel to send messages to a IOWorker that immediately confirms 
>> the message. As stated above; this examples shows how the journal space 
>> keeps growing over time even though the message gets  confirmed and 
>> should be deleted after a while asynchronously.
>>
>>  The code is placed here 
>> . 
>>
>> Once again thanks for your help. I am sure it is something that I am 
>> doing wrong because of which only I see this and no-one else seems to be 
>> complaining about. 
>>
>>  ...Manas
>>
>>  
>> On Friday, March 21, 2014 3:49:17 AM UTC-4, Martin Krasser wrote: 
>>>
>>>  
>>> On 19.03.14 15:38, Manas Kar wrote:
>>>  
>>> Hi Martin, 
>>>  I have a process that is running since last Sunday evening using the 
>>> above code base(Given here 
>>> )
>>>  
>>> . The journal space has grown to be around 24 GB so far. I am still 
>>> continuing my run to see when the compaction will happen and the "marked 
>>> deleted" messages will be physically deleted. 
>>>
>>>
>>> What happens when you restart the application?
>>>
>>>  The only way my code is different than your example is that my message 
>>> has the size of 8 MB. Could that be playing some role? 
>>>  
>>>
>>> No.
>>>
>>>  
>>> Thanks 
>>> Manas
>>>
>>>
>>> On Wednesday, March 12, 2014 1:09:38 PM UTC-4, Manas Kar wrote: 

 Thanks Martin. 
  That explains what I am seeing.

 On Wednesday, March 12, 2014 12:15:15 PM UTC-4, Martin Krasser wrote: 
>
>  
> On 11.03.14 19:22, Manas Kar wrote:
>  
> Hi Martin,  
>  I observed something that kind of contradicts my previous statement. 
> It would be correct to say that the size of journal folder is 
> approximately 
> half the size of SumOf(messages).
>
>   My messages don't get replayed after the configurable times from 
> the channels. But the journal's are not physically deleted amounting to 
> large disk usage.
>  Please find my modified scala code that follows the example you 
> provided and still leaves residues in journal folder. 
>  
>
> LevelDB doesn't delete entries immediately, it rather marks them as 
> deleted and physically deletes them during compactions. Details at 
> http://leveldb.googlecode.com/svn/trunk/doc/impl.html.
>
>  
>  Thanks
> ...Manas
>
>
> On Tuesday, March 11, 2014 2:19:51 AM UTC-4, Manas Kar wrote: 
>>
>> Hi,  
>>  I am very excited about the latest and the biggest feature of 
>> Akka2.3.0 that is Persistence. 
>>  I have posted a question about Akka persistentChannel here. 
>> 
>>  
>>  Basically I have a persistentChannel that sends some IOWork to a 
>> destination actor. Destination actor tries to do the work and when 
>> successfully completes the work or the deadline has no time left it 
>> sends a 
>> "confirm" on  the ConfirmablePersistent message.
>>
>>  My expectation after reading the documentation was that my message 
>> w

Re: [akka-user] Periodic "Disassociated" with remote system

2014-08-26 Thread Roland Kuhn
Hi Caoyuan,

26 aug 2014 kl. 09:51 skrev Caoyuan :

> 
> 
> On Monday, August 25, 2014 6:31:15 PM UTC+8, Akka Team wrote:
> Hi Caouyan,
> 
> It is usually dangerous to set the heartbeat-pause to a lesser value than the 
> heartbeat interval itself. If a heartbeat gets lost, then the next heartbeat 
> will definitely not make the deadline. I recommend to set it to a larger 
> value. Also, I would go with a lower heartbeat-interval setting, 10s seems 
> more appropriate if you want low heartbeat traffic.
> 
> -Endre
> 
> Got it now. Thanks. 
> 
> BTW, our cluster has run for 15 days with 1 million long-connections, stable and 
> consistent.

That’s great to hear, and it does make me a bit curious about the rest of the 
story: care to share it privately or even publicly?

Regards,

Roland

>  
> 
> 
> On Mon, Aug 25, 2014 at 9:31 AM, Caoyuan  wrote:
> Update Aug 25, 2014:
> 
> We changed akka.remote.transport-failure-detector.acceptable-heartbeat-pause 
> = 10 s instead of 5 s, the WARN message gone. I guess the [Disassociated] 
> WARN might be caused by network delay or GC pause (Full GC lasts 3+ secs now 
> on our system) etc. The setting is
> 
> akka.remote {
>   transport-failure-detector {
> heartbeat-interval = 30 s   # default 4s
> acceptable-heartbeat-pause = 10 s  # default 10s
>   }
> }
> 
> But, that could not explain the periodic "Disassociated" WARN occurred 
> before, which, seems could not be recovered from Disassociated state.
> 
> On Monday, August 11, 2014 12:08:00 AM UTC+8, Caoyuan wrote:
> We have an akka cluster with 10 nodes. it works almost smoothly except 
> periodic firing "Disassociated" WARN log, which seems cannot be recovered:
> 
> The following is the log records.
> 
> ..
> 2014-08-10 00:00:09,253 WARN  a.remote.ReliableDeliverySupervisor 
> akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5
>  - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] 
> has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
> 
> 2014-08-10 00:00:44,292 WARN  a.remote.ReliableDeliverySupervisor 
> akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5
>  - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] 
> has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
> 
> 2014-08-10 00:01:49,332 WARN  a.remote.ReliableDeliverySupervisor 
> akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5
>  - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] 
> has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
> 
> 2014-08-10 00:02:24,373 WARN  a.remote.ReliableDeliverySupervisor 
> akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5
>  - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] 
> has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
> 
> 2014-08-10 00:02:59,412 WARN  a.remote.ReliableDeliverySupervisor 
> akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5
>  - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] 
> has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
> 
> 2014-08-10 00:03:34,452 WARN  a.remote.ReliableDeliverySupervisor 
> akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5
>  - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] 
> has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
> ..
> 
> 
> The warning continually occurred almost all day, with the period 35 seconds 
> (30 + 5 s) or 65 seconds (30 + 30 + 5 s), which is exactly the setting of 
> akka.remote's transport failure detector:
> 
> akka.remote {
>   transport-failure-detector {
> heartbeat-interval = 30 s   # default 4s
> acceptable-heartbeat-pause = 5 s  # default 10s
>   }
> 
> Where, the failure-detector mark it unavailable after heartbeat-interval + 
> acceptable-heartbeat-pause period (35 s).
> 
> We're using akka-2.3.3. the node which logged is at 10.0.69.169:2551, and the 
> remote node is at 10.0.65.3:2552
> 
> I tried to dig via the akka.remoting source code, but with no progressing.
> 
> Thoughts ?
> 
> -Caoyuan Deng
> 
> 
> 
> 

Re: [akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Martin Krasser

Hi Andrzej,

On 26.08.14 09:15, Andrzej Dębski wrote:

Hello

Lately I have been reading about a possibility of using Apache Kafka 
as journal/snapshot store for akka-persistence.


I am aware of the plugin created by Martin 
Krasser: https://github.com/krasserm/akka-persistence-kafka/ and I also 
read another topic about Kafka as a 
journal: https://groups.google.com/forum/#!searchin/akka-user/kakfka/akka-user/iIHmvC6bVrI/zeZJtW0_6FwJ.


In both sources I linked two ideas were presented:

1. Set log retention to 7 days, take snapshots every 3 days (example 
values)

2. Set log retention to unlimited.

Here is the first question: in the first case, wouldn't it mean that 
persistent views would receive a skewed view of the PersistentActor 
state (only events from the last 7 days) - is that really a viable solution? 
As far as I know a PersistentView can only receive events - it can't receive 
snapshots from the corresponding PersistentActor (which is good in the general 
case).


PersistentViews can create their own snapshots which are isolated from 
the corresponding PersistentActor's snapshots.
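
For illustration, here is a minimal sketch against the Akka 2.3.x PersistentView 
API (the class name and ids are made up, not from the plugin): the view keeps its 
own state, saves its own snapshots under its viewId and recovers from them 
independently of the PersistentActor's snapshots.

import akka.persistence.{ PersistentView, SnapshotOffer }

class CounterView extends PersistentView {
  override def persistenceId = "counter"      // the PersistentActor to follow
  override def viewId        = "counter-view" // the view's own id; its snapshots are stored under this

  private var eventCount = 0L

  def receive = {
    case SnapshotOffer(_, snapshot: Long) =>
      eventCount = snapshot                   // recover from the view's own snapshot
    case _ if isPersistent =>
      eventCount += 1
      if (eventCount % 1000 == 0) saveSnapshot(eventCount)
  }
}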




Second question (more directed to Martin): in the thread I linked you 
wrote:


 I don't go into Kafka partitioning details here but it is
possible to implement the journal driver in a way that both a
single persistent actor's data are partitioned *and* kept in order


 I am very interested in this idea. AFAIK it is not yet implemented in 
current plugin but I was wondering if you could share high level idea 
how would you achieve that (one persistent actor, multiple partitions, 
ordering ensured)?


The idea is to

- first write events 1 to n to partition 1
- then write events n+1 to 2n to partition 2
- then write events 2n+1 to 3n to partition 3
- ... and so on

This works because a PersistentActor is the only writer to a partitioned 
journal topic. During replay, you first replay partition 1, then 
partition 2 and so on. This should be rather easy to implement in the 
Kafka journal, just didn't have time so far; pull requests are welcome 
:) Btw, the Cassandra journal 
 follows the 
very same strategy for scaling with data volume (by using different 
partition keys).
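
As a rough sketch of that scheme (not the plugin's actual code; the Kafka producer 
and consumer calls are stubbed out, and the per-partition event count is just an 
example value for "n" above):

object PartitionedJournalSketch {
  val eventsPerPartition = 1000L // example value for "n"

  def partitionFor(sequenceNr: Long): Int =
    ((sequenceNr - 1) / eventsPerPartition).toInt

  // placeholders for the real Kafka producer/consumer calls
  def send(topic: String, partition: Int, event: Array[Byte]): Unit = ()
  def readPartition(topic: String, partition: Int): Iterator[Array[Byte]] = Iterator.empty

  // single writer: the event with a given sequenceNr goes to partition (sequenceNr - 1) / n
  def write(persistenceId: String, sequenceNr: Long, event: Array[Byte]): Unit =
    send(persistenceId, partitionFor(sequenceNr), event)

  // replay walks the partitions in ascending order, preserving the event order
  def replay(persistenceId: String, highestSequenceNr: Long)(handler: Array[Byte] => Unit): Unit =
    (0 to partitionFor(highestSequenceNr)).foreach { p =>
      readPartition(persistenceId, p).foreach(handler)
    }
}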


Cheers,
Martin




--
Martin Krasser

blog:http://krasserm.blogspot.com
code:http://github.com/krasserm
twitter: http://twitter.com/mrt1nz



[akka-user] Re: Getting camel message body as java.io.File when using file:/// endpoint

2014-08-26 Thread Slawek Rogulski
much appreciated

On Friday, 13 June 2014 01:27:11 UTC+8, Ian Hummel wrote:
>
> So it's been almost 1 year, but funny enough I had to go back and work on 
> this code a little more and finally found an answer.  Updating the thread 
> in the hopes that this is useful for future Google searches...
>
> Whenever you create an Actor that extends akka.camel.Consumer, it 
> registers itself (including the URI it wants to listen on) with 
> the ConsumerRegistrar.  This class hooks into the Camel Context and adds a 
> route from the Camel side of things to the actor.  The route gets created 
> by ConsumerActorRouteBuilder, which applies any route customizations via 
> the following code:
>
>   def configure(): Unit =
>     applyUserRouteCustomization(
>       settings.Conversions.apply(
>         endpointUri take endpointUri.indexOf(":"), // e.g. "http" from "http://whatever/..."
>         from(endpointUri).routeId(consumer.path.toString))).to(targetActorUri)
>
> settings.Conversions.apply will end up calling (in Camel.scala)
>
> conversions.get(s).fold(r)(r.convertBodyTo)
>
> So basically it reads any "statically" configured type conversions from 
> the settings file and hardcodes a conversion BEFORE your actor will receive 
> the CamelMessage.
>
> The issue I was running into was the default reference.conf file includes 
> this snippet:
>
> akka {
>   camel {
> ...
> conversions {
>   "file" = "java.io.InputStream"
> }
>   }
> }
>
> Effectively this means that the message body has been converted to a 
> java.io.InputStream before you even have a crack at it.  You can't recover 
> the java.io.File from the java.io.InputStream.  The solution is to just 
> override that key, using some dummy conversion like:
>
>   val confString = """
> akka.camel.streamingCache = false
> akka.camel.conversions {
>   "file" = "java.io.File"
> }
>   """
>   val conf = ConfigFactory.parseString(confString)
>
> Then your actors can do this:
>
>   def receive = {
> case msg: CamelMessage ⇒ {
>   val file = msg.bodyAs[File]
> }
>   }
>
> And everything just works.
>
>
> Cheers!
>
>
> On Wednesday, August 28, 2013 6:55:27 PM UTC-4, Ian Hummel wrote:
>>
>> Hello everyone,
>>
>> I'm using the file component in my Consumer actor and am trying to get 
>> the message body as a java.io.File object.  Reading 
>> http://people.apache.org/~dkulp/camel/file2.html leads me to believe 
>> that should be as simple as:
>>
>> import akka.actor.ActorSystem
>> import akka.actor.Props
>> import akka.actor.ActorLogging
>> import akka.camel.Consumer
>> import akka.camel.CamelMessage
>> import java.io.File
>>
>> class ExampleActor extends Consumer with ActorLogging {
>> def endpointUri = "file:///tmp/inbox?noop=true"
>>
>> def receive = {
>> case msg: CamelMessage => {
>> log.info("Received message {}", msg.bodyAs[File])
>> }
>> }
>> }
>>
>> object ExampleActor extends App {
>> val system = ActorSystem()
>> system.actorOf(Props[ExampleActor])
>> system.awaitTermination
>> }
>>
>> But whenever I do that I get the following exception:
>>
>> [ERROR] [08/28/2013 18:04:48.630] 
>> [default-akka.actor.default-dispatcher-2] [akka://default/user/$a] No type 
>> converter available to convert from type: 
>> org.apache.camel.converter.stream.InputStreamCache to the required type: 
>> java.io.File with value 
>> org.apache.camel.converter.stream.InputStreamCache@5620f476
>> org.apache.camel.NoTypeConversionAvailableException: No type converter 
>> available to convert from type: 
>> org.apache.camel.converter.stream.InputStreamCache to the required type: 
>> java.io.File with value 
>> org.apache.camel.converter.stream.InputStreamCache@5620f476
>> at 
>> org.apache.camel.impl.converter.BaseTypeConverterRegistry.mandatoryConvertTo(BaseTypeConverterRegistry.java:169)
>> at 
>> org.apache.camel.impl.converter.BaseTypeConverterRegistry.mandatoryConvertTo(BaseTypeConverterRegistry.java:142)
>> at akka.camel.CamelMessage.getBodyAs(CamelMessage.scala:98)
>> at akka.camel.CamelMessage.bodyAs(CamelMessage.scala:87)
>> at ExampleActor$$anonfun$receive$1.applyOrElse(ExampleActor.scala:13)
>> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>> at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>> at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>> at 
>> akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>> at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>> at 
>> scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>> at 
>> scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>> at 
>> scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>>
>> I have tried the same, simple code with stream caching disabled, but it 
>> doesn't make any difference.  Here's what I get in that case:
>>
>> [ERROR] [08/28/2013 18:37:56.97

Re: [akka-user] Programmatic mailbox configuration during runtime

2014-08-26 Thread √iktor Ҡlang
Superficially it sounds like your problem may be well-suited for the
upcoming Akka Streams—that could be something to keep in mind as well.


On Tue, Aug 26, 2014 at 11:48 AM, Christian Kreutzfeldt 
wrote:

> Hi Viktor
>
> Am Dienstag, 26. August 2014 11:41:46 UTC+2 schrieb √:
>>
>> Hi Christian,
>>
>>
>>
>> On Tue, Aug 26, 2014 at 11:34 AM, Christian Kreutzfeldt > > wrote:
>>
>>> Hi Viktor,
>>>
>>> thanks for the quick reply. What I am concerned about is the way how to
>>> read from websocket
>>>
>>
>> Read it from the websocket and store it into Kafka, and have the
>> Kafka-consumer actor reused for the work-distribution?
>>
> Yeah, thought about that as well but ... breaks my architecture somehow.
> But probably this is the best solution as it enables some other nice
> features like shutting down the processing actors while the source listener
> still receives data and inserts it into the kafka topic :-)
>
>
>>
>>> .. kafka is easy as I simply pull data and it does not matter when I do
>>> that, I always receive the next message. But when it comes to websockets,
>>> the server pushes data and I need to forward the data somehow. When I do
>>> not consume the message, it is lost. I use the WebSocketClient provided by
>>> the Jetty framework but that one internally spawns a thread reading from
>>> the socket forwarding data to a callback provided by me. As far as I
>>> understood, spawning threads inside actors isn't a great idea :-) ...or am
>>> I wrong? Do you have any suggestion how to solve this or point me to a
>>> resource which provides any information about how to handle similar
>>> problems ... unfortunately google does not help in that case :-(
>>>
>>
>> You could use the Play WebSocket API.
>>
> ;-) thanks!!
>
>>
>>
>>>
>>> Regarding the kafka consumer, actually I have an implementation which
>>> does exactly what you described: read single message from kafka, forward
>>> message, issue next read request via scheduler / directly
>>> getSelf().tell(..), wait for read request, read single message from kafka
>>>  I could keep on reading forever from kafka but that would block the
>>> actor. But unfortunately, I have to wait for the next read request to
>>> access kafka. Somehow it is a nice approach especially if a processing
>>> actor issues that message since it ensures that it has resources available
>>> for processing the next message. On the other hand it slows down the whole
>>> process and leaves gaps where nothing is processed. From a software
>>> architecture aspect this is a clean solution but I need to process data
>>> quite fast ;-)
>>>
>>
>> You could have the Kafka-actor batch-read up to X messages to not be
>> fully lock-step with the consumer-actor.
>>
> Mmmh, I should try that as well.
>
>>
>> --
>> Cheers,
>> √
>>
>
> Thanks for your quick replies!
>
> Best regards,
>   Christian
>
>



-- 
Cheers,
√



Re: [akka-user] Programmatic mailbox configuration during runtime

2014-08-26 Thread Christian Kreutzfeldt
Hi Viktor

Am Dienstag, 26. August 2014 11:41:46 UTC+2 schrieb √:
>
> Hi Christian,
>
>
> On Tue, Aug 26, 2014 at 11:34 AM, Christian Kreutzfeldt  > wrote:
>
>> Hi Viktor,
>>
>> thanks for the quick reply. What I am concerned about is the way how to 
>> read from websocket 
>>
>
> Read it from the websocket and store it into Kafka, and have the 
> Kafka-consumer actor reused for the work-distribution?
>
Yeah, thought about that as well but ... breaks my architecture somehow. 
But probably this is the best solution as it enables some other nice 
features like shutting down the processing actors while the source listener 
still receives data and inserts it into the kafka topic :-) 

 
>
>> .. kafka is easy as I simply pull data and it does not matter when I do 
>> that, I always receive the next message. But when it comes to websockets, 
>> the server pushes data and I need to forward the data somehow. When I do 
>> not consume the message, it is lost. I use the WebSocketClient provided by 
>> the Jetty framework but that one internally spawns a thread reading from 
>> the socket forwarding data to a callback provided by me. As far as I 
>> understood, spawning threads inside actors isn't a great idea :-) ...or am 
>> I wrong? Do you have any suggestion how to solve this or point me to a 
>> resource which provides any information about how to handle similar 
>> problems ... unfortunately google does not help in that case :-(
>>
>
> You could use the Play WebSocket API.
>
;-) thanks!! 

>  
>
>>
>> Regarding the kafka consumer, actually I have an implementation which 
>> does exactly what you described: read single message from kafka, forward 
>> message, issue next read request via scheduler / directly 
>> getSelf().tell(..), wait for read request, read single message from kafka 
>>  I could keep on reading forever from kafka but that would block the 
>> actor. But unfortunately, I have to wait for the next read request to 
>> access kafka. Somehow it is a nice approach especially if a processing 
>> actor issues that message since it ensures that it has resources available 
>> for processing the next message. On the other hand it slows down the whole 
>> process and leaves gaps where nothing is processed. From a software 
>> architecture aspect this is a clean solution but I need to process data 
>> quite fast ;-)
>>
>
> You could have the Kafka-actor batch-read up to X messages to not be fully 
> lock-step with the consumer-actor.
>
Mmmh, I should try that as well.  

>
> -- 
> Cheers,
> √
>

Thanks for your quick replies! 

Best regards,
  Christian 



Re: [akka-user] Programmatic mailbox configuration during runtime

2014-08-26 Thread √iktor Ҡlang
Hi Christian,


On Tue, Aug 26, 2014 at 11:34 AM, Christian Kreutzfeldt 
wrote:

> Hi Viktor,
>
> thanks for the quick reply. What I am concerned about is the way how to
> read from websocket
>

Read it from the websocket and store it into Kafka, and have the
Kafka-consumer actor reused for the work-distribution?


> .. kafka is easy as I simply pull data and it does not matter when I do
> that, I always receive the next message. But when it comes to websockets,
> the server pushes data and I need to forward the data somehow. When I do
> not consume the message, it is lost. I use the WebSocketClient provided by
> the Jetty framework but that one internally spawns a thread reading from
> the socket forwarding data to a callback provided by me. As far as I
> understood, spawning threads inside actors isn't a great idea :-) ...or am
> I wrong? Do you have any suggestion how to solve this or point me to a
> resource which provides any information about how to handle similar
> problems ... unfortunately google does not help in that case :-(
>

You could use the Play WebSocket API.


>
> Regarding the kafka consumer, actually I have an implementation which does
> exactly what you described: read single message from kafka, forward
> message, issue next read request via scheduler / directly
> getSelf().tell(..), wait for read request, read single message from kafka
>  I could keep on reading forever from kafka but that would block the
> actor. But unfortunately, I have to wait for the next read request to
> access kafka. Somehow it is a nice approach especially if a processing
> actor issues that message since it ensures that it has resources available
> for processing the next message. On the other hand it slows down the whole
> process and leaves gaps where nothing is processed. From a software
> architecture aspect this is a clean solution but I need to process data
> quite fast ;-)
>

You could have the Kafka-actor batch-read up to X messages to not be fully
lock-step with the consumer-actor.
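
A minimal sketch of that pull pattern (all names here are illustrative; fetchBatch 
stands in for whatever Kafka consumer API is used):

import akka.actor.{ Actor, ActorRef }

// Reads up to batchSize messages per ReadNext and forwards them, so the downstream
// consumer is not in lock-step with single messages.
class KafkaBatchReader(fetchBatch: Int => Seq[String], consumer: ActorRef, batchSize: Int) extends Actor {
  case object ReadNext

  override def preStart(): Unit = self ! ReadNext

  def receive = {
    case ReadNext =>
      fetchBatch(batchSize).foreach(consumer ! _)
      // keep pulling; alternatively wait for an ack from `consumer` to throttle
      self ! ReadNext
  }
}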


>
> Kind regards,
>   Christian
>
> Am Dienstag, 26. August 2014 11:22:17 UTC+2 schrieb √:
>>
>> Hi Christian,
>>
>> Reading from websockets or kafka is not something that should be done by
>> the mailbox, it's something an actor should do. Then your consumer-actor
>> (the one which you wanted to switch mailboxes on) can ask the
>> websocket/kafka actor for the next message (pull) or have the
>> websocket/kafka actor send the messages to the consumer-actor.
>>
>> Does that help?
>>
>>
>> On Tue, Aug 26, 2014 at 11:14 AM, Christian Kreutzfeldt > > wrote:
>>
>>> Hi
>>>
>>> I am working on a streaming project where I need to consume different
>>> source types on demand. To comply with the akka model I decided to
>>> implement dedicated mailbox types which
>>> establish a connection with the desired source and read data from it,
>>> eg. kafka topics or websockets. As I do not know which source to consume
>>> before starting the application I cannot
>>> provide the full configuration parameters but need to set them during
>>> runtime.
>>>
>>> What I would like to know is, if there is a chance to get configuration
>>> data into a mailbox when it is instantiated. The data must not be read from
>>> a configuration file but may be provided
>>> by the caller on demand.
>>>
>>> The only solution I actually thought of was to ramp up the actor +
>>> mailbox without any configuration and send a message holding the
>>> configuration right after its instantiation. The mailbox
>>> sees the message, extracts the information and configures the source
>>> listener as required. The message might be removed from the mailbox or be
>>> provided to the actor - depends on
>>> spec.
>>>
>>> In some way I dislike this solution as it makes me react to messages on
>>> two different layers: the mailbox and the actor.
>>>
>>> I tried to establish the source connection from within an actor,
>>> scheduled READ MESSAGES to itself and on receiving this message type the
>>> actor takes a single message from
>>> the source. The drawback of this solution is a reduced throughput as
>>> each READ MESSAGE must be awaited and then the source must be consumed. It
>>> allows to throttle the reading
>>> speed but it does not allow to speed things up arbitrarily as the READ
>>> MESSAGE must be issued .. which takes time :-(
>>>
>>> Aside from that I must provide another managing layer which handles the
>>> reading actors. Therefore I'd love to use the mailbox approach but find it
>>> quite annoying to send in control messages.
>>>
>>> With kind regards,
>>>   Christian
>>>

Re: [akka-user] Programmatic mailbox configuration during runtime

2014-08-26 Thread Christian Kreutzfeldt
Hi Viktor,

thanks for the quick reply. What I am concerned about is how to 
read from a websocket .. kafka is easy as I simply pull data and it does not 
matter when I do that, I always receive the next message. But when it comes 
to websockets, the server pushes data and I need to forward the data 
somehow. When I do not consume the message, it is lost. I use the 
WebSocketClient provided by the Jetty framework but that one internally 
spawns a thread reading from the socket forwarding data to a callback 
provided by me. As far as I understood, spawning threads inside actors 
isn't a great idea :-) ...or am I wrong? Do you have any suggestion how to 
solve this or point me to a resource which provides any information about 
how to handle similar problems ... unfortunately google does not help in 
that case :-(

Regarding the kafka consumer, actually I have an implementation which does 
exactly what you described: read single message from kafka, forward 
message, issue next read request via scheduler / directly 
getSelf().tell(..), wait for read request, read single message from kafka 
 I could keep on reading forever from kafka but that would block the 
actor. But unfortunately, I have to wait for the next read request to 
access kafka. Somehow it is a nice approach especially if a processing 
actor issues that message since it ensures that it has resources available 
for processing the next message. On the other hand it slows down the whole 
process and leaves gaps where nothing is processed. From a software 
architecture aspect this is a clean solution but I need to process data 
quite fast ;-)

Kind regards,
  Christian

Am Dienstag, 26. August 2014 11:22:17 UTC+2 schrieb √:
>
> Hi Christian,
>
> Reading from websockets or kafka is not something that should be done by 
> the mailbox, it's something an actor should do. Then your consumer-actor 
> (the one which you wanted to switch mailboxes on) can ask the 
> websocket/kafka actor for the next message (pull) or have the 
> websocket/kafka actor send the messages to the consumer-actor.
>
> Does that help?
>
>
> On Tue, Aug 26, 2014 at 11:14 AM, Christian Kreutzfeldt  > wrote:
>
>> Hi
>>
>> I am working on a streaming project where I need to consume different 
>> source types on demand. To comply with the akka model I decided to 
>> implement dedicated mailbox types which
>> establish a connection with the desired source and read data from it, eg. 
>> kafka topics or websockets. As I do not know which source to consume before 
>> starting the application I cannot
>> provide the full configuration parameters but need to set them during 
>> runtime. 
>>
>> What I would like to know is, if there is a chance to get configuration 
>> data into a mailbox when it is instantiated. The data must not be read from 
>> a configuration file but may be provided
>> by the caller on demand. 
>>
>> The only solution I actually thought of was to ramp up the actor + 
>> mailbox without any configuration and send a message holding the 
>> configuration right after its instantiation. The mailbox
>> sees the message, extracts the information and configures the source 
>> listener as required. The message might be removed from the mailbox or be 
>> provided to the actor - depends on 
>> spec. 
>>
>> In some way I dislike this solution as it makes me react to messages on 
>> two different layers: the mailbox and the actor. 
>>
>> I tried to establish the source connection from within an actor, 
>> scheduled READ MESSAGES to itself and on receiving this message type the 
>> actor takes a single message from
>> the source. The drawback of this solution is a reduced throughput as each 
>> READ MESSAGE must be awaited and then the source must be consumed. It 
>> allows to throttle the reading
>> speed but it does not allow to speed things up arbitrarily as the READ 
>> MESSAGE must be issued .. which takes time :-(
>>
>> Aside from that I must provide another managing layer which handles the 
>> reading actors. Therefore I'd love to use the mailbox approach but find it 
>> quite annoying to send in control messages.
>>
>> With kind regards,
>>   Christian
>>
>
>
>
> -- 
> Cheers,
> √
>  


[akka-user] Start actor with dynamically created dispatcher

2014-08-26 Thread volker . stampa
Hi,

is it possible to start an actor with a dynamically created dispatcher? I 
know that a dispatcher can be configured 

 per 
actor, however this dispatcher is created and configured statically. I 
rather look for something like:

def createExecutorService(localContext: ...): ExecutorService = ...
val localContext = ...
context.actorOf(someProps.withDispatcher(createExecutorService(localContext)))

The use-case is using secure impersonation when interacting with hadoop. 
When interacting with hadoop I dynamically create an actor (that internally 
spawns Futures on its dispatcher). For secure impersonation each code 
section that interacts with hadoop has to be wrapped in something like:

UserGroupInformation.createProxyUser(userName, UserGroupInformation.getLoginUser).doAs {
  new PrivilegedAction[A] {
    def run() = { ... }
  }
}

To avoid spreading this over the actor's code a dynamically created 
dispatcher could help (in this case the userName is the localContext). If 
that is configured with a ThreadFactory like this:

val tf: ThreadFactory = ... // the dispatcher's underlying thread factory
new ThreadFactory {
  override def newThread(r: Runnable) = {
    tf.newThread(new Runnable {
      override def run() = {
        UserGroupInformation.createProxyUser(userName, UserGroupInformation.getLoginUser)
          .doAs(new PrivilegedAction[Unit] {
            override def run() = r.run()
          })
      }
    })
  }
}

all code executed by the actor (and its spawned Futures) would 
automatically be executed on behalf of a given hadoop-user.
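
One possible way to plug such a wrapped factory into Akka is a custom 
ExecutorServiceConfigurator referenced from the dispatcher configuration. This is 
only a sketch under the assumption that the Akka 2.3.x akka.dispatch API is used; 
the class name, pool size and system property are made up, and it still only covers 
a statically known user name, not a per-actor one:

import java.security.PrivilegedAction
import java.util.concurrent.{ ExecutorService, Executors, ThreadFactory }

import akka.dispatch.{ DispatcherPrerequisites, ExecutorServiceConfigurator, ExecutorServiceFactory }
import com.typesafe.config.Config
import org.apache.hadoop.security.UserGroupInformation

class HadoopProxyConfigurator(config: Config, prerequisites: DispatcherPrerequisites)
  extends ExecutorServiceConfigurator(config, prerequisites) {

  // assumption: the proxy user is known statically, e.g. via a system property
  private val userName = sys.props.getOrElse("hadoop.proxy.user", "nobody")

  override def createExecutorServiceFactory(id: String, threadFactory: ThreadFactory): ExecutorServiceFactory =
    new ExecutorServiceFactory {
      override def createExecutorService: ExecutorService =
        Executors.newFixedThreadPool(4, new ThreadFactory {
          override def newThread(r: Runnable) = threadFactory.newThread(new Runnable {
            override def run(): Unit =
              UserGroupInformation
                .createProxyUser(userName, UserGroupInformation.getLoginUser)
                .doAs(new PrivilegedAction[Unit] { override def run(): Unit = r.run() })
          })
        })
    }
}

With a dispatcher section in the config whose executor setting is the fully 
qualified class name of HadoopProxyConfigurator, the actor could then be started 
with someProps.withDispatcher("my-hadoop-dispatcher") as usual (the dispatcher name 
is again made up).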

Thanks for any help/hints/pointers,
Volker



Re: [akka-user] Programmatic mailbox configuration during runtime

2014-08-26 Thread √iktor Ҡlang
Hi Christian,

Reading from websockets or kafka is not something that should be done by
the mailbox, it's something an actor should do. Then your consumer-actor
(the one which you wanted to switch mailboxes on) can ask the
websocket/kafka actor for the next message (pull) or have the
websocket/kafka actor send the messages to the consumer-actor.

Does that help?


On Tue, Aug 26, 2014 at 11:14 AM, Christian Kreutzfeldt 
wrote:

> Hi
>
> I am working on a streaming project where I need to consume different
> source types on demand. To comply with the akka model I decided to
> implement dedicated mailbox types which
> establish a connection with the desired source and read data from it, eg.
> kafka topics or websockets. As I do not know which source to consume before
> starting the application I cannot
> provide the full configuration parameters but need to set them during
> runtime.
>
> What I would like to know is, if there is a chance to get configuration
> data into a mailbox when it is instantiated. The data must not be read from
> a configuration file but may be provided
> by the caller on demand.
>
> The only solution I actually thought of was to ramp up the actor + mailbox
> without any configuration and send a message holding the configuration
> right after its instantiation. The mailbox
> sees the message, extracts the information and configures the source
> listener as required. The message might be removed from the mailbox or be
> provided to the actor - depends on
> spec.
>
> In some way I dislike this solution as it makes me react to messages on
> two different layers: the mailbox and the actor.
>
> I tried to establish the source connection from within an actor, scheduled
> READ MESSAGES to itself and on receiving this message type the actor takes
> a single message from
> the source. The drawback of this solution is a reduced throughput as each
> READ MESSAGE must be awaited and then the source must be consumed. It
> allows to throttle the reading
> speed but it does not allow to speed things up arbitrarily as the READ
> MESSAGE must be issued .. which takes time :-(
>
> Aside from that I must provide another managing layer which handles the
> reading actors. Therefore I'd love to use the mailbox approach but find it
> quite annoying to send in control messages.
>
> With kind regards,
>   Christian
>



-- 
Cheers,
√



[akka-user] Programmatic mailbox configuration during runtime

2014-08-26 Thread Christian Kreutzfeldt
Hi

I am working on a streaming project where I need to consume different 
source types on demand. To comply with the akka model I decided to 
implement dedicated mailbox types which
establish a connection with the desired source and read data from it, e.g. 
kafka topics or websockets. As I do not know which source to consume before 
starting the application, I cannot 
provide the full configuration parameters but need to set them at 
runtime. 

What I would like to know is whether there is a way to get configuration 
data into a mailbox when it is instantiated. The data must not be read from 
a configuration file but may be provided
by the caller on demand. 

The only solution I have thought of so far is to ramp up the actor + mailbox 
without any configuration and send a message holding the configuration 
right after its instantiation. The mailbox
sees the message, extracts the information and configures the source 
listener as required. The message might then be removed from the mailbox or be 
passed on to the actor, depending on the 
spec. 
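
To make the idea a bit more concrete, here is a rough sketch of the mailbox I 
have in mind (Configure, the params map and the source-listener setup are 
placeholders of mine, not an existing Akka API; the queue itself is just a 
plain unbounded queue):

import java.util.concurrent.ConcurrentLinkedQueue

import akka.actor.{ ActorRef, ActorSystem }
import akka.dispatch.{ Envelope, MailboxType, MessageQueue, ProducesMessageQueue }
import com.typesafe.config.Config

// control message the caller sends right after creating the actor
final case class Configure(params: Map[String, String])

class SourceMailbox(settings: ActorSystem.Settings, config: Config)
  extends MailboxType with ProducesMessageQueue[SourceMailbox.SourceQueue] {

  override def create(owner: Option[ActorRef], system: Option[ActorSystem]): MessageQueue =
    new SourceMailbox.SourceQueue
}

object SourceMailbox {
  class SourceQueue extends MessageQueue {
    private val queue = new ConcurrentLinkedQueue[Envelope]()
    @volatile private var sourceParams: Option[Map[String, String]] = None

    override def enqueue(receiver: ActorRef, handle: Envelope): Unit = handle.message match {
      case Configure(p) =>
        sourceParams = Some(p) // set up the source listener here (Kafka topic, websocket, ...)
        queue.offer(handle)    // or drop the envelope if the actor should not see the control message
      case _ =>
        queue.offer(handle)
    }

    override def dequeue(): Envelope = queue.poll()
    override def numberOfMessages: Int = queue.size
    override def hasMessages: Boolean = !queue.isEmpty
    override def cleanUp(owner: ActorRef, deadLetters: MessageQueue): Unit = {
      var envelope = dequeue()
      while (envelope ne null) {
        deadLetters.enqueue(owner, envelope)
        envelope = dequeue()
      }
    }
  }
}

The class would then be referenced from a mailbox configuration block via 
mailbox-type, as far as I understand the docs.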

In some way I dislike this solution as it makes me react to messages on two 
different layers: the mailbox and the actor. 

I also tried to establish the source connection from within an actor that 
schedules READ MESSAGE events to itself; on receiving such a message the actor 
takes a single message from
the source. The drawback of this solution is reduced throughput, as each 
READ MESSAGE must be awaited before the source can be consumed. It 
allows throttling of the reading
speed, but it does not allow speeding things up arbitrarily, since every READ 
MESSAGE must first be issued .. which takes time :-(
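
Roughly, that second approach currently looks like this (PollableSource and the 
fixed 50 ms delay are simplifications of my actual code):

import akka.actor.{ Actor, ActorRef }
import scala.concurrent.duration._

// stand-in for whatever is consumed (Kafka topic, websocket, ...)
trait PollableSource { def readOne(): Option[Any] }

case object ReadMessage

class PollingReader(source: PollableSource, target: ActorRef) extends Actor {
  import context.dispatcher

  override def preStart(): Unit = self ! ReadMessage

  def receive = {
    case ReadMessage =>
      source.readOne().foreach(target ! _) // take a single message from the source
      // issue the next READ MESSAGE; this throttles, but also caps the throughput
      context.system.scheduler.scheduleOnce(50.millis, self, ReadMessage)
  }
}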

Aside from that, I must provide another management layer which handles the 
reading actors. Therefore I'd love to use the mailbox approach, but find it 
quite annoying to have to send in control messages.

With kind regards,
  Christian

-- 
>>  Read the docs: http://akka.io/docs/
>>  Check the FAQ: 
>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>  Search the archives: https://groups.google.com/group/akka-user
--- 
You received this message because you are subscribed to the Google Groups "Akka 
User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to akka-user+unsubscr...@googlegroups.com.
To post to this group, send email to akka-user@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.


Re: [akka-user] Re: Periodic "Disassociated" with remote system

2014-08-26 Thread √iktor Ҡlang
On Aug 26, 2014 9:51 AM, "Caoyuan"  wrote:

>
>
> On Monday, August 25, 2014 6:31:15 PM UTC+8, Akka Team wrote:
>>
>> Hi Caouyan,
>>
>> It is usually dangerous to set the heartbeat-pause to a lesser value than
>> the heartbeat interval itself. If a heartbeat gets lost, then the next
>> heartbeat will definitely not make the deadline. I recommend to set it to a
>> larger value. Also, I would go with a lower heartbeat-interval setting, 10s
>> seems more appropriate if you want low heartbeat traffic.
>>
>> -Endre
>>
>
> Got it now. Thanks.
>
> BTW, our cluster has run for 15 days with 1 million long-lived connections, stable
> and consistent.
>

Awesome


>
>
>>
>>
>> On Mon, Aug 25, 2014 at 9:31 AM, Caoyuan  wrote:
>>
>>> Update Aug 25, 2014:
>>>
>>> We changed akka.remote.transport-failure-detector.acceptable-heartbeat-pause
>>> to 10 s instead of 5 s, and the WARN message is gone. I guess the [Disassociated]
>>> WARN might be caused by network delays or GC pauses (a full GC lasts 3+ secs
>>> now on our system) etc. The setting is:
>>>
>>> akka.remote {
>>>   transport-failure-detector {
>>>     heartbeat-interval = 30 s          # default 4s
>>>     acceptable-heartbeat-pause = 10 s  # default 10s
>>>   }
>>> }
>>>
>>> But that does not explain the periodic "Disassociated" WARN that occurred
>>> before, from which the system seemed unable to recover.
>>>
>>> On Monday, August 11, 2014 12:08:00 AM UTC+8, Caoyuan wrote:

 We have an Akka cluster with 10 nodes. It works almost smoothly except for
 a periodically firing "Disassociated" WARN log, from which it seems unable to recover:

 The following is the log records.

 ..
 2014-08-10 00:00:09,253 WARN  a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].

 2014-08-10 00:00:44,292 WARN  a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].

 2014-08-10 00:01:49,332 WARN  a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].

 2014-08-10 00:02:24,373 WARN  a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].

 2014-08-10 00:02:59,412 WARN  a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].

 2014-08-10 00:03:34,452 WARN  a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
 ..


 The warning occurred continually, almost all day, with a period of 35
 seconds (30 + 5 s) or 65 seconds (30 + 30 + 5 s), which matches exactly the
 settings of akka.remote's transport failure detector:

 akka.remote {
   transport-failure-detector {
     heartbeat-interval = 30 s         # default 4s
     acceptable-heartbeat-pause = 5 s  # default 10s
   }
 }

 where the failure detector marks a node unavailable after the heartbeat-interval
 + acceptable-heartbeat-pause period (35 s).

 We're using akka-2.3.3. The node which logged this is at 10.0.69.169:2551,
 and the remote node is at 10.0.65.3:2552.

 I tried to dig through the akka.remote source code, but made no progress.

 Thoughts ?

 -Caoyuan Deng


[akka-user] Akka persistence custom java plugin

2014-08-26 Thread Tim van Cann TH
I am currently in the process of writing my own plugin for the Akka 
SyncWriteJournal API to implement the connection with HSQLDB.

The problem is that I do not understand the requirements for the method 
doAsyncReplayMessages. It states that it needs to return a Future and that 
every replayed message should be passed to replayCallback.

Let's say that I have a query which returns a list of messages. Can anyone 
provide a minimal example (with explanation) of how to use replayCallback 
and the Future to implement the method correctly using that list? How do 
replayCallback and the Future work together, and what should be returned by 
doAsyncReplayMessages?
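
Something along these lines is how I picture it, but I am not sure it is 
correct (Scala for brevity; loadMessages stands in for my HSQLDB query, and 
the signature is only my reading of the API docs):

import akka.persistence.PersistentRepr
import scala.concurrent.{ ExecutionContext, Future }

// hypothetical helper that runs the HSQLDB query and maps rows to PersistentRepr
def loadMessages(persistenceId: String, fromSequenceNr: Long,
                 toSequenceNr: Long, max: Long): List[PersistentRepr] =
  List.empty // replace with the actual JDBC query

// my understanding of the contract: invoke the callback once per stored message,
// in ascending sequence-number order, and complete the returned Future when done
def replay(persistenceId: String, fromSequenceNr: Long, toSequenceNr: Long, max: Long)
          (replayCallback: PersistentRepr => Unit)
          (implicit ec: ExecutionContext): Future[Unit] =
  Future {
    loadMessages(persistenceId, fromSequenceNr, toSequenceNr, max).foreach(replayCallback)
  }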

Thanks!

-- 
>>  Read the docs: http://akka.io/docs/
>>  Check the FAQ: 
>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>  Search the archives: https://groups.google.com/group/akka-user
--- 
You received this message because you are subscribed to the Google Groups "Akka 
User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to akka-user+unsubscr...@googlegroups.com.
To post to this group, send email to akka-user@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.


Re: [akka-user] java.lang.UnsatisfiedLinkError: sun/misc/Unsafe.putOrderedObject

2014-08-26 Thread Carsten Saager
Is there a J9 1.6 version that is known to work - or is there a WebSphere 
(7.x) patch that enables Akka to run there?

-Carsten 

On Saturday, September 7, 2013 5:03:58 PM UTC+2, √ wrote:
>
> And, are you running on the latest 1.6 jre?
> On Sep 7, 2013 4:54 PM, "Roland Kuhn" > 
> wrote:
>
>> Hi Stjepan,
>>
>> Akka relies upon sun.misc.Unsafe independent of dispatcher configuration. 
>> I have just verified that it works on the 1.7 version of J9. Would it be 
>> possible to try the upgrade?
>>
>> Regards,
>>
>> Roland
>>
>> On 7 Sep 2013, at 13:00, Stjepan Buljat wrote:
>>
>> It seems that every other combination (dispatcher / executor) at some 
>> point uses sun.misc.Unsafe which poses a problem on IBM JVM.
>>
>> Since I need to run this code on IBM's JVM how can I bypass this?
>> This part of the application is just a client, I'm sending in 
>> fire-and-forget manner messages to remote actor... there's no need for 
>> handling responses.
>> I know I can expose REST service on the server side (using spray.io or 
>> similar) and remove akka dependency from the client all together but I like 
>> Akka's approach and would like to keep it this way.
>>
>> Any thoughts?
>>
>> Best,
>> Stjepan
>>
>>
>>
>> On Thursday, September 5, 2013 2:56:52 PM UTC+2, Stjepan Buljat wrote:
>>>
>>> This class is available but apparently there's something wrong with it :(
>>>
>>> I've tried the thread-pool-executor but a similar error happens; it also uses 
>>> Unsafe at some point.
>>>
>>> I'm going to play around with different dispatchers/executors to see if 
>>> there's one which works with IBM JVM.
>>>
>>> Thanks!
>>>
>>> On Thursday, September 5, 2013 1:50:44 PM UTC+2, Alec Zorab wrote:

 A quick googling confirms that. ForkJoinPool is probably, therefore, 
 not going to work for you. I would suggest trying to use a 
 threadpoolexecutor instead, which might give you slightly lower 
 performance 
 than FJP would, but is certainly better than not working at all!


 On 5 September 2013 12:48, Alec Zorab  wrote:

> I'm going to take a wild guess that sun.misc.Unsafe isn't available on 
> J9...
>
>
> On 5 September 2013 12:43, Stjepan Buljat  wrote:
>
>> Hi,
>>
>> I'm getting an exception while accessing a remote actor (stack trace 
>> below); this is running on Java 1.6 (IBM J9 VM).
>>
>> Using Akka 2.2.0.
>>
>> Any ideas? 
>>
>> Best,
>> Stjepan
>>
>> Stack trace:
>> Caused by: java.lang.UnsatisfiedLinkError: sun/misc/Unsafe.putOrderedObject(Ljava/lang/Object;JLjava/lang/Object;)V
>>  at scala.concurrent.forkjoin.ForkJoinPool.fullExternalPush(ForkJoinPool.java:1897)
>>  at scala.concurrent.forkjoin.ForkJoinPool.externalPush(ForkJoinPool.java:1834)
>>  at scala.concurrent.forkjoin.ForkJoinPool.execute(ForkJoinPool.java:2955)
>>  at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinPool.execute(AbstractDispatcher.scala:374)
>>  at akka.dispatch.ExecutorServiceDelegate$class.execute(ThreadPoolBuilder.scala:212)
>>  at akka.dispatch.Dispatcher$LazyExecutorServiceDelegate.execute(Dispatcher.scala:43)
>>  at akka.dispatch.Dispatcher.registerForExecution(Dispatcher.scala:118)
>>  at akka.dispatch.MessageDispatcher.attach(AbstractDispatcher.scala:134)
>>  at akka.actor.dungeon.Dispatch$class.start(Dispatch.scala:84)
>>  at akka.actor.ActorCell.start(ActorCell.scala:338)
>>  at akka.actor.LocalActorRef.start(ActorRef.scala:321)
>>  at akka.actor.LocalActorRefProvider.init(ActorRefProvider.scala:619)
>>  at akka.remote.RemoteActorRefProvider.init(RemoteActorRefProvider.scala:156)
>>  at akka.actor.ActorSystemImpl._start$lzycompute(ActorSystem.scala:579)
>>  at akka.actor.ActorSystemImpl._start(ActorSystem.scala:577)
>>  at akka.actor.ActorSystemImpl.start(ActorSystem.scala:588)
>>  at akka.actor.ActorSystem$.apply(ActorSystem.scala:111)
>>  at akka.actor.ActorSystem$.apply(ActorSystem.scala:104)
>>  at akka.actor.ActorSystem$.create(ActorSystem.scala:66)
>>  at akka.actor.ActorSystem.create(ActorSystem.scala)
>>  at hr.etna.baas.api.Baas.<init>(Baas.java:39)
>>  ... 15 more
>>
>> -- 
>> >> Read the docs: http://akka.io/docs/
>> >> Check the FAQ: http://akka.io/faq/
>> >> Search the archives: https://groups.google.com/
>> group/akka-user
>> --- 
>> You received this message because you are subscribed to the Google 
>> Groups "Akka User List" group.
>> To unsubscribe from this group and stop receiving emails from it, 
>> send an email to akka-user+...@googlegroups.com.
>> To post to this group, send email to akka...@googlegroups.com.
>> Visit this group at http://groups.google.com/group/akka-user.
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>
>


Re: [akka-user] Re: Periodic "Disassociated" with remote system

2014-08-26 Thread Caoyuan


On Monday, August 25, 2014 6:31:15 PM UTC+8, Akka Team wrote:
>
> Hi Caouyan,
>
> It is usually dangerous to set the heartbeat-pause to a lesser value than 
> the heartbeat interval itself. If a heartbeat gets lost, then the next 
> heartbeat will definitely not make the deadline. I recommend to set it to a 
> larger value. Also, I would go with a lower heartbeat-interval setting, 10s 
> seems more appropriate if you want low heartbeat traffic.
>
> -Endre
>

Got it now. Thanks. 
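If I read the advice correctly, the safer configuration would be something 
along these lines (values assumed from the recommendation above, not yet 
tested on our cluster):

akka.remote {
  transport-failure-detector {
    heartbeat-interval = 10 s           # lower interval, as recommended
    acceptable-heartbeat-pause = 30 s   # pause larger than the interval
  }
}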

BTW, our cluster has run for 15 days with 1 million long-lived connections, stable 
and consistent.
 

>
>
> On Mon, Aug 25, 2014 at 9:31 AM, Caoyuan > 
> wrote:
>
>> Update Aug 25, 2014:
>>
>> We changed 
>> akka.remote.transport-failure-detector.acceptable-heartbeat-pause to 10 s 
>> instead of 5 s, and the WARN message is gone. I guess the [Disassociated] WARN 
>> might be caused by network delays or GC pauses (a full GC lasts 3+ secs now on 
>> our system) etc. The setting is:
>>
>> akka.remote {
>>   transport-failure-detector {
>>     heartbeat-interval = 30 s          # default 4s
>>     acceptable-heartbeat-pause = 10 s  # default 10s
>>   }
>> }
>>
>> But that does not explain the periodic "Disassociated" WARN that occurred 
>> before, from which the system seemed unable to recover.
>>
>> On Monday, August 11, 2014 12:08:00 AM UTC+8, Caoyuan wrote:
>>>
>>> We have an Akka cluster with 10 nodes. It works almost smoothly except for 
>>> a periodically firing "Disassociated" WARN log, from which it seems unable to recover:
>>>
>>> The following is the log records.
>>>
>>> ..
>>> 2014-08-10 00:00:09,253 WARN  a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>>>
>>> 2014-08-10 00:00:44,292 WARN  a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>>>
>>> 2014-08-10 00:01:49,332 WARN  a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>>>
>>> 2014-08-10 00:02:24,373 WARN  a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>>>
>>> 2014-08-10 00:02:59,412 WARN  a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>>>
>>> 2014-08-10 00:03:34,452 WARN  a.remote.ReliableDeliverySupervisor akka.tcp://ClusterSystem@10.0.69.169:2551/system/endpointManager/reliableEndpointWriter-akka.tcp%3A%2F%2FClusterSystem%4010.0.65.3%3A2552-5 - Association with remote system [akka.tcp://ClusterSystem@10.0.65.3:2552] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
>>> ..
>>>
>>>
>>> The warning occurred continually, almost all day, with a period of 35 
>>> seconds (30 + 5 s) or 65 seconds (30 + 30 + 5 s), which matches exactly the 
>>> settings of akka.remote's transport failure detector:
>>>
>>> akka.remote {
>>>   transport-failure-detector {
>>>     heartbeat-interval = 30 s         # default 4s
>>>     acceptable-heartbeat-pause = 5 s  # default 10s
>>>   }
>>> }
>>>
>>> where the failure detector marks a node unavailable after the heartbeat-interval 
>>> + acceptable-heartbeat-pause period (35 s).
>>>
>>> We're using akka-2.3.3. The node which logged this is at 10.0.69.169:2551, 
>>> and the remote node is at 10.0.65.3:2552.
>>>
>>> I tried to dig through the akka.remote source code, but made no progress.
>>>
>>> Thoughts ?
>>>
>>> -Caoyuan Deng
>>>

[akka-user] Apache Kafka as journal - retention times/PersistentView and partitions

2014-08-26 Thread Andrzej Dębski
Hello

Lately I have been reading about the possibility of using Apache Kafka as a 
journal/snapshot store for akka-persistence. 

I am aware of the plugin created by Martin 
Krasser: https://github.com/krasserm/akka-persistence-kafka/ and I have also 
read another topic about Kafka as a 
journal: 
https://groups.google.com/forum/#!searchin/akka-user/kakfka/akka-user/iIHmvC6bVrI/zeZJtW0_6FwJ.

In both sources I linked, two ideas were presented:

1. Set log retention to 7 days, take snapshots every 3 days (example values)
2. Set log retention to unlimited.

Here is the first question: in the first case, wouldn't it mean that persistent 
views would receive a skewed view of the PersistentActor's state (only events 
from the last 7 days)? Is that really a viable solution? As far as I know, a 
PersistentView can only receive events; it can't receive snapshots from the 
corresponding PersistentActor (which is good in the general case).

Second question (more directed to Martin): in the thread I linked you 
wrote: 

> I don't go into Kafka partitioning details here but it is possible to 
> implement the journal driver in a way that both a single persistent actor's 
> data are partitioned *and* kept in order
>

I am very interested in this idea. AFAIK it is not yet implemented in the 
current plugin, but I was wondering if you could share the high-level idea of 
how you would achieve that (one persistent actor, multiple partitions, ordering 
ensured)?

-- 
>>  Read the docs: http://akka.io/docs/
>>  Check the FAQ: 
>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>  Search the archives: https://groups.google.com/group/akka-user
--- 
You received this message because you are subscribed to the Google Groups "Akka 
User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to akka-user+unsubscr...@googlegroups.com.
To post to this group, send email to akka-user@googlegroups.com.
Visit this group at http://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.