On 1 November 2013 11:56, Mircea Markus <[email protected]> wrote:
>
> On Oct 31, 2013, at 10:20 PM, Sanne Grinovero <[email protected]> wrote:
>
>> On 31 October 2013 20:07, Mircea Markus <[email protected]> wrote:
>>>
>>> On Oct 31, 2013, at 3:45 PM, Dennis Reed <[email protected]> wrote:
>>>
>>>> On 10/31/2013 02:18 AM, Bela Ban wrote:
>>>>>
>>>>>> Also if we did have read only, what criteria would cause those nodes
>>>>>> to be writeable again?
>>>>> Once you become the primary partition, e.g. when a view is received
>>>>> where view.size() >= N, where N is a predefined threshold. The criterion
>>>>> can be different, as long as it is deterministic.
>>>>>
>>>>>> There is no guarantee when the other nodes
>>>>>> will ever come back up or if there will ever be additional ones anytime
>>>>>> soon.
>>>>> If a system picks the Primary Partition approach, then it can become
>>>>> completely inaccessible (read-only). In this case, I envisage that a
>>>>> sysadmin will be notified, who can then start additional nodes for the
>>>>> system to acquire the primary partition and become accessible again.
>>>>
>>>> There should be a way to manually modify the primary partition status.
>>>> So if the admin knows the nodes will never return, they can manually
>>>> enable the partition.
>>>
>>> The status will be exposed through JMX at any point, regardless of
>>> whether a split brain is going on or not.
>>>
>>>> Also, the PartitionContext should know whether the nodes left normally
>>>> or not.
>>>> If you have 5 nodes in a cluster and you shut down 3 of them, you'll
>>>> want the remaining two to remain available.
>>>> But if there was a network partition, you wouldn't. So it needs to know
>>>> the difference.
>>>
>>> Very good point again.
>>> Thank you Dennis!
>>
>> Let's clarify. If 3 nodes out of 5 are killed without a
>> reconfiguration, you do NOT want the remaining two to remain available
>> unless explicitly told so by an admin. It is not possible to
>> automatically make a distinction between 3 nodes being shut down vs. 3
>> crashed nodes.
>
> I'm not sure you can make this generalization: it's really up to the
> implementor of PartitionHandlingStrategy to decide that. Knowing whether it
> was a clean shutdown or not might be relevant to that decision. I think the
> focus of this functionality should be on how exactly to react to partitions
> happening, while providing the hooks for the user to make that decision and
> act on the system's availability.
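Just so we're picturing the same kind of hook, here is a very rough sketch
of what I have in mind; all names and signatures are purely illustrative,
not the API captured on the design wiki:

import java.util.List;

// Illustrative shapes only: a guess at the hooks being discussed,
// not the interfaces from the design wiki.
interface PartitionContext {
    List<String> currentMembers();   // members of the view we just received
    int expectedClusterSize();       // configured by the admin, adaptable at runtime
    void markUnavailable();          // e.g. refuse writes (or all access)
    void markAvailable();            // resume normal operation
}

interface PartitionHandlingStrategy {
    void onViewChange(PartitionContext ctx);
}

// Primary-partition style strategy: stay available only while a majority
// of the expected members is visible (Bela's view.size() >= N criterion).
class PrimaryPartitionStrategy implements PartitionHandlingStrategy {
    @Override
    public void onViewChange(PartitionContext ctx) {
        int threshold = ctx.expectedClusterSize() / 2 + 1; // simple majority
        if (ctx.currentMembers().size() >= threshold) {
            ctx.markAvailable();    // we are (again) the primary partition
        } else {
            ctx.markUnavailable();  // minority: wait for nodes or an admin override
        }
    }
}

The point being that Infinispan core only delivers the view-change callback;
the policy above is just one possible implementation of it.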
We're on the same page on that; I'm just stressing that there is no
automatic way to distinguish between a crash and an intentional shutdown if
we don't have the "clean shutdown" method that Bela also mentioned.

>
>> In our face to face meeting we did point out that an admin needs hooks
>> to be able to:
>> - specify how many nodes are expected in the full system (and adapt
>> dynamically)
>
> yes, that's a custom implementation of PartitionHandlingStrategy. One we
> might provide out of the box.

Right, it could be part of the default PartitionHandlingStrategy, but I
think all strategies might be interested in this, and it is the
responsibility of Infinispan (core) to also provide ways to administer the
expected view at runtime.

>> - some admin command to "clean shutdown" a node (which was also
>> discussed as a strong requirement in the scope of CacheStores, so I'm
>> assuming the operation is defined already)
>>
>> The design Wiki has captured the API we discussed around the
>> PartitionHandlingStrategy but is missing the details about these
>> operations, which should probably be added to the PartitionContext as
>> well.
>
> The PartitionContext allows a partition to be marked as unavailable, I think
> that should do.

You also need the "clean shutdown" operation, very likely implemented with
the RPC suggested by Bela.

>> Also, in the scope of CacheStore consistency we had discussed the need
>> to store which nodes are expected to be in the View: for example, when
>> the grid is started and all nodes are finding each other, the Cache
>> shall not be considered started until all required nodes have joined.
>
> the discussion is here:
> https://community.jboss.org/wiki/ControlledClusterShutdownWithDataRestoreFromPersistentStorage
>
>> Cheers,
>> Sanne
>
> Cheers,
> --
> Mircea Markus
> Infinispan lead (www.infinispan.org)
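PS: to make the admin-facing side of the above a bit more concrete as well,
this is roughly the kind of runtime control surface I picture; again, the
MBean name and operations are invented for the sake of the example, not
anything agreed on the wiki:

// Illustrative sketch of the runtime/admin hooks discussed above.
public interface PartitionAdminMXBean {

    // Let the admin adjust the expected cluster size without a restart,
    // so a strategy can recompute its majority threshold dynamically.
    int getExpectedClusterSize();
    void setExpectedClusterSize(int expectedClusterSize);

    // Current availability status, exposed whether or not a split brain
    // is ongoing, e.g. "AVAILABLE", "READ_ONLY", "UNAVAILABLE".
    String getAvailabilityStatus();

    // Manual override for Dennis' case: the admin knows the missing nodes
    // will never come back and force-enables the surviving partition.
    void forceAvailable();

    // Clean shutdown of this node: broadcast a "leaving on purpose" message
    // first, so the remaining members can tell a graceful leave from a crash.
    void cleanShutdown();
}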
