Hi Vikas,

Hard to remember what I was thinking about almost a year ago. It must have
been something like:

create a ConsistentHash[ActorRef]
<https://github.com/akka/akka/blob/master/akka-actor/src/main/scala/akka/routing/ConsistentHash.scala>
from the region actor references that come from the currentShardAllocations
parameter, and then pick the region to use with
consistentHash.nodeFor(shardId), given that the shardId for related Orders
and Customers is the same.

Since the node ring (the region actor refs) may change between invocations
of allocateShard for Orders and Customers, this is only a best-effort
approximation, but I think it should work rather well in practice.
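
Roughly like this (a minimal, untested sketch against the Akka 2.4
akka.cluster.sharding API, where the allocation methods return Futures;
the class name and the virtualNodesFactor default are just placeholders):

import scala.collection.immutable
import scala.concurrent.Future
import akka.actor.ActorRef
import akka.cluster.sharding.ShardCoordinator.ShardAllocationStrategy
import akka.cluster.sharding.ShardRegion.ShardId
import akka.routing.ConsistentHash

class ConsistentHashingAllocationStrategy(virtualNodesFactor: Int = 10)
    extends ShardAllocationStrategy {

  override def allocateShard(
      requester: ActorRef,
      shardId: ShardId,
      currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardId]])
      : Future[ActorRef] = {
    // Rebuild the ring from the regions known to this coordinator right now.
    // Sorting makes the ring deterministic, so the Customer and Order
    // coordinators pick the same region whenever they see the same regions.
    val regions = currentShardAllocations.keys.toVector.sortBy(_.path.toString)
    val ring = ConsistentHash(regions, virtualNodesFactor)
    // Same shardId => same region, as long as the ring looks the same
    Future.successful(ring.nodeFor(shardId))
  }

  override def rebalance(
      currentShardAllocations: Map[ActorRef, immutable.IndexedSeq[ShardId]],
      rebalanceInProgress: Set[ShardId]): Future[Set[ShardId]] =
    // Rebalancing would move shards away from their hash-ring home,
    // so this sketch simply never rebalances.
    Future.successful(Set.empty)
}

Then pass the same strategy (or identically configured instances) to
ClusterSharding.start for both the Customer and the Order entity types,
and make sure both extractShardId functions map related entities
(c123, c123-o0, c123-o1, ...) to the same shardId.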

Cheers,
Patrik

On Tue, Oct 20, 2015 at 5:16 AM, <vi...@eero.com> wrote:

> Hi Patrik,
>
> Would you be able to provide more info on how to do co-location with
> consistent hashing? That would be very helpful!
>
> Thanks,
> Vikas
>
> On Saturday, November 22, 2014 at 8:08:04 AM UTC-8, Patrik Nordwall wrote:
>>
>> Hi Moritz,
>>
>> On Wed, Nov 19, 2014 at 3:27 PM, Moritz Schallaböck <
>> moritz.sc...@agido.com> wrote:
>>
>>> Hello fellow hAkkers,
>>>
>>> we have multiple persistent actor types distributed using cluster
>>> sharding. Some of them logically belong together, let's say they're
>>> customers and their orders. Customers never talk to orders of other
>>> customers, and vice versa. Thus it makes sense to us to have these actors
>>> reside on the same cluster shard (and consequently, in the same VM).
>>>
>>> We implemented this by returning identical ShardIds for the customer
>>> c123 and its orders c123-o0, c123-o1, etc. But of course, this doesn't work
>>> like we thought it would. :) The ShardResolvers of two instances of
>>> ShardRegion operate independently, and we just end up with two shards --
>>> one for customers and one for orders -- which share a name but not
>>> necessarily a cluster host. I have seen this misunderstanding crop up a few
>>> times before on this list, which makes it slightly less embarrassing to
>>> admit the mistake. ;)
>>>
>>> We could stop using cluster sharding for the orders completely, and
>>> instead route all messages for the orders through the customers, which
>>> would restart the actors on demand. But that sounds like a lot of
>>> extraneous code: many other actors talk to the orders[0], and the customers
>>> shouldn't need to route these messages or worry about them; the customer
>>> actors need not even be alive for them. And we'd also have to worry about
>>> the other things that cluster sharding does: support for passivation of
>>> orders, gracefully handling rebalances of customers (killing all order
>>> actors when it happens, I guess), maybe other things.
>>>
>>> [0] I realize that this will lead to the question: if many other actors
>>> talk to the orders without involving the customers, why do you want them on
>>> the same host? Let's just assume for the sake of argument that circumstances
>>> make this a reasonable requirement, unless you're saying it's not a
>>> reasonable requirement under any circumstances.
>>>
>>> The alternative would involve writing a custom ShardAllocationStrategy
>>> that's shared among the customer and order ShardRegions. I suppose it would
>>> involve the following:
>>>  - maintain the associations between ShardRegion actorRef and ShardIds
>>> for each entity type;
>>>  1. for a new requested allocation for entity type X:
>>>  2. check if the same shardId is already allocated for any other entity
>>> type Y, yielding (at least one) associated shardRegionActorRefY
>>>  3. if so, determine if there is any shardRegionActorX for entity type X
>>> that's on the same host as shardRegionActorRefY
>>>  4. if so, allocate the shardId to shardRegionActorX (i.e. return it;
>>> optionally balance between several candidates)
>>>  5. otherwise, fall back to any other ShardAllocationStrategy (updating
>>> the associations based on its return value)
>>>
>>> Eugh. I feel dirty now. Apart from the general horrificness, I imagine
>>> step 3 is fraught with peril. And of course, the whole thing would need to
>>> be thread-safe because it will be accessed and modified concurrently by
>>> several ShardRegions. (Time to dust off ye olde ConcurrentHashMap.) The
>>> more I look at it, the more fragile and less feasible it seems.
>>>
>>
>> Yes, there are a lot of pitfalls in that approach.
>> One more that you perhaps didn't think of: the shardRegionActorX
>> for entity type X might have been allocated, and then later
>> shardRegionActorY for entity type Y is to be allocated by a coordinator
>> running in a different JVM (because of a crash). Then the shared
>> ShardAllocationStrategy has no information about the previous
>> shardRegionActorX.
>>
>> The design of cluster sharding is based on each entity type being
>> managed independently of other entity types. If Customer and Order have a
>> tight coupling, I think they should be modelled as one aggregate type.
>>
>>
>>>
>>> At the same time, having this sort of control over the clustering of
>>> several entity types does not seem particularly outrageous. Are we missing
>>> something?
>>>
>>
>> You could do a best-effort co-location of associated customer and order
>> shards by using consistent hashing in the ShardAllocationStrategy. I can
>> explain more if you find that interesting.
>>
>> Cheers,
>> Patrik
>>
>>
>>>
>>> Thanks as always for your thoughts,
>>> Moritz
>>>
>>



-- 

Patrik Nordwall
Typesafe <http://typesafe.com/> - Reactive apps on the JVM
Twitter: @patriknw
