Re: [akka-user] Re: Cassandra Time Stamp Problem and How Akka can help?

2017-03-17 Thread Justin du coeur
On Fri, Mar 17, 2017 at 7:29 AM, kant kodali wrote:

> Do I need to specify how many nodes or shards I want to distribute to?
>

Not by number, but IIRC you can assign particular roles to nodes, and have
those roles determine what sorts of things get distributed to those nodes.

Note that you *do* implicitly have to decide up front how many shards to
break a given concept down into.  But a given node typically hosts a number
of shards, and the assignment of shards to nodes rebalances dynamically.
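
To make the split between "fixed number of shards" and "changing number of
nodes" concrete, here is a minimal sketch in plain Python, not Akka code. The
names `NUMBER_OF_SHARDS`, `shard_id` and `allocate` are assumptions made up
for this illustration, and the round-robin allocation is a stand-in for
whatever strategy the cluster actually uses.

```python
import zlib

# Assumption for this sketch: the shard count is chosen once, up front,
# and is identical on every node.
NUMBER_OF_SHARDS = 100


def shard_id(entity_id: str, number_of_shards: int = NUMBER_OF_SHARDS) -> int:
    """Map an entity id to a shard using a stable hash.

    zlib.crc32 is used instead of Python's built-in hash(), which is
    salted per process and would give different answers on different nodes.
    """
    return zlib.crc32(entity_id.encode("utf-8")) % number_of_shards


def allocate(shards, nodes):
    """Hypothetical allocation: spread shards round-robin over the live nodes."""
    ordered_nodes = sorted(nodes)
    return {
        shard: ordered_nodes[index % len(ordered_nodes)]
        for index, shard in enumerate(sorted(shards))
    }
```

The entity-to-shard mapping never changes when nodes join or leave; only the
shard-to-node mapping returned by `allocate` does.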


> Nodes can go up and down, right? Can the Akka cluster discover how many
> nodes are available at any given time?
>

That's essentially what clustering does, yes.  There's an underlying gossip
mechanism, so that all nodes have a rough idea of all of the others at any
given time.
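
As a rough illustration of how gossip spreads membership information, here is
a toy simulation in plain Python. It is a sketch under simplifying
assumptions: each node's view is a dict of node name to (status, version),
merging keeps the entry with the higher version, and peers are picked at
random. Akka's real protocol is more involved; the function names here are
invented for the example, and the sketch needs at least two nodes.

```python
import random


def merge(view_a, view_b):
    """Merge two membership views, keeping the higher-versioned entry per node."""
    merged = dict(view_a)
    for node, (status, version) in view_b.items():
        if node not in merged or version > merged[node][1]:
            merged[node] = (status, version)
    return merged


def gossip_round(views, rng):
    """Every node exchanges its view with one randomly chosen peer."""
    nodes = sorted(views)
    for node in nodes:
        peer = rng.choice([other for other in nodes if other != node])
        combined = merge(views[node], views[peer])
        views[node] = dict(combined)
        views[peer] = dict(combined)


def rounds_until_converged(views, rng, max_rounds=100):
    """Run gossip rounds until all views agree; return the round count or None."""
    for round_number in range(1, max_rounds + 1):
        gossip_round(views, rng)
        first = next(iter(views.values()))
        if all(view == first for view in views.values()):
            return round_number
    return None
```

Starting with each node knowing only itself, for example
`views = {n: {n: ("up", 1)} for n in "abcde"}`, a few rounds are normally
enough for every node to end up with the same view of all five members.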


> Also, why should I manually down the node? I know that there is a failure
> detector, so if the Akka cluster "thinks" a node is dead then why can't it
> simply distribute that region to other actors?
>

The problem is that there's a lot of judgement involved in deciding whether
a node is just temporarily unavailable due to a network failure or is
actually down.  Getting this wrong has *serious* consequences, and can lead
to data corruption.  Akka per se doesn't make that decision, although
Lightbend does sell a product named Split Brain Resolver that provides a
fairly sophisticated algorithm to make the decision.

Basically, Akka doesn't say whether it thinks a node is *down*; it only
knows that the node is currently unreachable.  You have to decide when that
actually means "down".
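
To give a feel for the kind of decision involved, here is a minimal sketch of
a "keep the majority" rule, which is one strategy a split brain resolver can
apply. This is an illustration in plain Python, not Akka's or Lightbend's
implementation: the function name, the string return values and the node
names are assumptions made up for the example, and a real resolver would also
wait for the membership view to be stable before acting.

```python
def keep_majority(all_members, reachable):
    """Decide what this side of a partition should do.

    Returns "down-unreachable" if this side should survive and down the
    nodes it cannot see, or "down-self" if this side should shut down.
    """
    all_members = set(all_members)
    if not all_members:
        raise ValueError("the cluster must have at least one member")
    reachable = set(reachable) & all_members
    unreachable = all_members - reachable

    if len(reachable) > len(unreachable):
        return "down-unreachable"
    if len(reachable) < len(unreachable):
        return "down-self"
    # Equal halves: keep the side holding the lowest-sorted member, so that
    # both sides reach opposite conclusions without talking to each other.
    return "down-unreachable" if min(all_members) in reachable else "down-self"
```

With five members, the side that still sees three of them keeps running and
downs the other two, while the side that sees only two shuts itself down. Both
sides reach a consistent outcome even though they cannot communicate.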

-- 
>>  Read the docs: http://akka.io/docs/
>>  Check the FAQ: 
>> http://doc.akka.io/docs/akka/current/additional/faq.html
>>  Search the archives: https://groups.google.com/group/akka-user
--- 
You received this message because you are subscribed to the Google Groups "Akka 
User List" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to akka-user+unsubscr...@googlegroups.com.
To post to this group, send email to akka-user@googlegroups.com.
Visit this group at https://groups.google.com/group/akka-user.
For more options, visit https://groups.google.com/d/optout.


Re: [akka-user] Re: Cassandra Time Stamp Problem and How Akka can help?

2017-03-17 Thread kant kodali
Do I need to specify how many nodes or shards I want to distribute to?
Nodes can go up and down, right? Can the Akka cluster discover how many
nodes are available at any given time? Also, why should I manually down the
node? I know that there is a failure detector, so if the Akka cluster
"thinks" a node is dead then why can't it simply distribute that region to
other actors?

Thanks!



Re: [akka-user] Re: Cassandra Time Stamp Problem and How Akka can help?

2017-03-16 Thread Justin du coeur
Look at it this way -- typically, you're generating events about some
*thing*, which corresponds to the key you're using in Cassandra.  That's
the "entity" I'm talking about, and typically it would have a single Actor
in Akka, to which all events are being sent.  That Actor mediates all the
Cassandra reads and writes about that entity without thread contention, so
you don't have to worry about race conditions.  If the entity isn't being
used continually, you can allow it to passivate (go to sleep and stop using
memory) after a timeout, and have it automatically revive (based on the
event history) when it is next needed.
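
The single-writer idea above can be sketched without Akka. In this
illustration (all class and method names are hypothetical), each entity id
gets exactly one writer object that handles messages one at a time and hands
out strictly increasing write timestamps, so two updates to the same key never
carry the same timestamp. The `writes` list merely stands in for Cassandra.

```python
class EntityWriter:
    """Hypothetical stand-in for the single actor that owns one entity."""

    def __init__(self, entity_id, clock):
        self.entity_id = entity_id
        self._clock = clock        # callable returning the time in microseconds
        self._last_timestamp = 0
        self.writes = []           # stands in for rows written to Cassandra

    def handle(self, state):
        # Messages for one entity are handled one at a time, so this
        # read-modify-write of _last_timestamp cannot race with another write.
        timestamp = max(self._clock(), self._last_timestamp + 1)
        self._last_timestamp = timestamp
        self.writes.append((timestamp, state))
        return timestamp


class Router:
    """Routes every message for an entity id to that entity's only writer."""

    def __init__(self, clock):
        self._clock = clock
        self._writers = {}

    def send(self, entity_id, state):
        writer = self._writers.get(entity_id)
        if writer is None:
            writer = self._writers[entity_id] = EntityWriter(entity_id, self._clock)
        return writer.handle(state)
```

Even with a clock that returns the same value twice, two consecutive calls to
`send` for the same entity id yield different, increasing timestamps, because
only one writer ever issues timestamps for that key.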

And yes, Akka Cluster Sharding is smart about dealing with a node dying --
so long as you "down" the node (telling Akka that yes, this node is
considered dead), it will move that node's shards to other nodes as necessary.
It's a pretty mature system for dealing with this sort of stuff.
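
What "moving the shard" amounts to can be sketched as follows. This is a toy
model in plain Python, assuming a simple shard-to-node table and a
least-loaded placement rule; the function name and data layout are invented
for the illustration and are not Akka's API.

```python
def reallocate_after_down(allocation, downed_node, live_nodes):
    """Return a new shard-to-node table with the downed node's shards moved.

    Each orphaned shard goes to whichever remaining node currently hosts
    the fewest shards (ties broken by node name).
    """
    live = sorted(set(live_nodes) - {downed_node})
    if not live:
        raise ValueError("no live nodes left to host shards")

    # Count how many shards each surviving node already hosts.
    load = {node: 0 for node in live}
    for node in allocation.values():
        if node in load:
            load[node] += 1

    new_allocation = dict(allocation)
    orphaned = sorted(s for s, n in allocation.items() if n == downed_node)
    for shard in orphaned:
        target = min(live, key=lambda node: (load[node], node))
        new_allocation[shard] = target
        load[target] += 1
    return new_allocation
```

Nothing in this table changes until the node is explicitly marked as down,
which mirrors the point above: an unreachable node keeps its shards until
someone (or some policy) declares it dead.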

I don't have a straightforward example myself (my system uses all of this,
but is fairly complex) -- anybody else have a good example to point to?


[akka-user] Re: Cassandra Time Stamp Problem and How Akka can help?

2017-03-15 Thread kant kodali
What is each Entity, if I may ask? By Entity do you mean Actor? If I shard
messages across a group of actors or actor systems through some
user-specified function, and an actor or actor system (a node) dies, does
Akka redirect that shard to other actors (more like rebalancing)? Is there a
simple example somewhere that I can take a look at, please?

Thanks!



On Tuesday, March 14, 2017 at 4:52:43 AM UTC-7, kant kodali wrote:
>
> Hi All,
>
> I have Kafka as my live streaming source of data (this data isn't really
> events but rather just messages with a state) and I want to insert this
> data into Cassandra, but I have the following problem.
>
> Cassandra uses a Last Write Wins strategy, using timestamps to resolve
> conflicting writes.
>
> By default, Cassandra enables server-side timestamps, and they are
> monotonic per node. In other words, two nodes can produce the same timestamp
> (although not often). So if there are two writes that go to two different
> coordinator nodes and are trying to update the same Cassandra partition, one
> write will overwrite the other (we cannot deterministically say which one).
> But from the user's perspective it would look like both writes were
> successful, although we lost the state of one write request (the widely
> known term for this anomaly is "lost updates").  So if one doesn't
> want this to happen, Cassandra recommends using client-side timestamps, but
> we can run into the same problem in the following scenario.
>
> Client-side Cassandra timestamps are monotonic per client (by client I
> mean a process that uses the Cassandra driver API), so if one has
> multiple processes, each with its own Cassandra driver instance, they
> can generate the same timestamp (although not often), and if they are trying
> to update the same Cassandra partition then we will run into the same
> problem as above. And multiple processes talking to Cassandra is very common
> in the industry. In my case these multiple processes will be Kafka
> consumers, which will consume data from Kafka and insert it into Cassandra.
>
> If one of the two contending writes fails and the other succeeds, such
> that the failed write can automatically retry using some mechanism in Akka,
> then that would be an acceptable solution, but how do we do that?
>
> I think there might be a nice reactive pattern using Akka, whether
> it is sharding or something else, that can help me solve this problem.
>
> Disclaimer: I am new to Akka and am putting in a lot of effort to learn
> as quickly as possible, so I will be open to and thankful for any new ideas
> on how to solve this problem in as scalable a way as possible.
>
> Thanks,
> kant
>
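
The lost-update scenario described in the quoted post can be reproduced with
a small model. This is a sketch in plain Python, not Cassandra code: `LwwCell`
is a made-up class, and the tie-break used here (on equal timestamps the
larger value wins) is an assumption chosen for the illustration. The point is
only that both writers are told "success" while one state silently vanishes.

```python
class LwwCell:
    """Toy model of one cell resolved by last-write-wins on the timestamp."""

    def __init__(self):
        self.timestamp = None
        self.value = None

    def write(self, timestamp, value):
        # Keep the write with the higher timestamp. On a timestamp tie,
        # this sketch assumes the larger value wins, regardless of which
        # write arrived first.
        if self.timestamp is None or (timestamp, value) > (self.timestamp, self.value):
            self.timestamp, self.value = timestamp, value
        # The writer is acknowledged either way; it cannot tell whether
        # its state survived.
        return True
```

If two consumers each write a different state with timestamp 1000, both calls
return `True`, but the cell ends up holding only one of the two states, and
the outcome is the same whichever write arrives first.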
