Re: [Discuss] Cloud Controller Clustering Model

Akila Ravihansa Perera Thu, 27 Nov 2014 23:39:54 -0800

Hi,

>> According to this design, Autoscaler (AS)/Stratos Manager (SM) will talk to
>> Cloud Controller (CC) via the Cloud Controller Service endpoint exposed via
>> the load balancer.
>>
>> *Data Replication*
>> When a request comes into one of the CC instances it will execute the
>> necessary actions and update the data holder and/or topology which is in
>> memory. At this point the data holder changes will be replicated to other
>> instances using a distributed map. Once the coordinator receives the above
>> updates it will persist the changes to the registry database.
>>
>
> Are we sending a notification (cluster message) when the distributed map
> is updated?
>


This is handled by Hazelcast out of the box, right?


>> In this design we might not need to replicate the topology since it is
>> already there in the message broker. The idea is to let coordinator publish
>> the topology changes and the other members to listen to it.
>>
>
So that means worker nodes listen to the topology as well as cluster
messages? I think we need to clarify this model a bit more.


>
> This would add a latency for the events. What are the issues we would
> face, when each node sends out the event? Of course, the complete topology
> should only be sent out by the Coordinator.
>

Sending out multiple topology events (e.g. MemberActivated,
MemberTerminated) will trigger many listeners multiple times, and that's
probably not a good idea. Or did you mean something else here? Sorry, I'm a
bit confused.
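
To illustrate the concern, here is a minimal sketch in plain Java. The
names TopologyTopic and CcNode are hypothetical stand-ins, not actual
Stratos classes, and the topic is just an in-memory list rather than the
real message broker. It assumes every CC member observes the same change
and shows that gating on the coordinator role keeps it to one event.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the broker topic that carries topology events. */
final class TopologyTopic {
    final List<String> delivered = new ArrayList<>();

    void publish(String event) {
        delivered.add(event);
    }
}

/** Hypothetical CC member; only the coordinator forwards events to the topic. */
final class CcNode {
    private final boolean coordinator;
    private final TopologyTopic topic;

    CcNode(boolean coordinator, TopologyTopic topic) {
        this.coordinator = coordinator;
        this.topic = topic;
    }

    /** Called on every member when the replicated state changes. */
    void onTopologyChange(String event) {
        if (coordinator) {
            topic.publish(event); // workers stay silent, so listeners fire once
        }
    }
}
```

With three members and one coordinator, a single MemberActivated change
results in one published event. If all three published, every listener
would see it three times.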


> Also, we need to activate CC data publishers only when a node is the
> Coordinator.
>
> Further, only the Coordinator should react to the Instance status events
> etc. IMO.
>

I think this might result in an inconsistent state if the coordinator fails
while processing an instance status event (or any other event for that
matter). Perhaps we can implement a notifier cluster message to indicate
whether incoming events are processed successfully. If the coordinator
fails, the next elected coordinator should be able to pick up from the last
successful event handled.
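
A rough sketch of that idea in plain Java follows. All names here
(StatusEvent, ReplicatedCheckpoint, Coordinator) are hypothetical and not
taken from the Stratos code base. It assumes each event carries a
sequence number, and an AtomicLong stands in for a value that would be
replicated across CC members (for example, through the distributed map).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical instance status event with a monotonically increasing sequence number. */
final class StatusEvent {
    final long seq;
    final String type;

    StatusEvent(long seq, String type) {
        this.seq = seq;
        this.type = type;
    }
}

/** Stand-in for the "last successfully processed event" marker shared by all members. */
final class ReplicatedCheckpoint {
    private final AtomicLong lastProcessedSeq = new AtomicLong(0);

    long get() {
        return lastProcessedSeq.get();
    }

    void advanceTo(long seq) {
        lastProcessedSeq.set(seq);
    }
}

final class Coordinator {
    final String name;
    final List<String> handled = new ArrayList<>();
    private final ReplicatedCheckpoint checkpoint;

    Coordinator(String name, ReplicatedCheckpoint checkpoint) {
        this.name = name;
        this.checkpoint = checkpoint;
    }

    /**
     * Handles events newer than the checkpoint. A crash is simulated by
     * returning when failAtSeq is reached; pass -1 for no failure.
     */
    void process(List<StatusEvent> log, long failAtSeq) {
        for (StatusEvent e : log) {
            if (e.seq <= checkpoint.get()) {
                continue; // already handled by an earlier coordinator
            }
            if (e.seq == failAtSeq) {
                return; // simulated crash before this event completes
            }
            handled.add(e.type);
            checkpoint.advanceTo(e.seq); // the "processed successfully" notification
        }
    }
}
```

If the first coordinator fails on event 2, the next elected coordinator
skips event 1 and resumes from event 2. Note that a crash between
handling an event and advancing the checkpoint would cause that event to
be handled again, so the handlers would need to be idempotent.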


> There's a cache to hold the validated partitions of a Cartridge, we need
> to use a distributed hash map for that too.
>

+1


>> Please add your thoughts.
>>
>> Thanks
>>
>>
>> --
>> Imesh Gunaratne
>>
>> Technical Lead, WSO2
>> Committer & PMC Member, Apache Stratos
>>
>
>
>
> --
> Best Regards,
> Nirmal
>
> Nirmal Fernando.
> PPMC Member & Committer of Apache Stratos,
> Senior Software Engineer, WSO2 Inc.
>
> Blog: http://nirmalfdo.blogspot.com/
>



-- 
Akila Ravihansa Perera
Software Engineer, WSO2

Blog: http://ravihansa3000.blogspot.com
