Exact same page, thanks for clarifications.

> On 1 Nov 2013, at 16:05, Sanne Grinovero <[email protected]> wrote:
>
>> On 1 November 2013 11:56, Mircea Markus <[email protected]> wrote:
>>
>>> On Oct 31, 2013, at 10:20 PM, Sanne Grinovero <[email protected]> wrote:
>>>
>>>> On 31 October 2013 20:07, Mircea Markus <[email protected]> wrote:
>>>>
>>>>> On Oct 31, 2013, at 3:45 PM, Dennis Reed <[email protected]> wrote:
>>>>>
>>>>>> On 10/31/2013 02:18 AM, Bela Ban wrote:
>>>>>>
>>>>>>> Also, if we did have read-only, what criteria would cause those nodes
>>>>>>> to be writeable again?
>>>>>> Once you become the primary partition, e.g. when a view is received
>>>>>> where view.size() >= N, with N a predefined threshold. The criterion can
>>>>>> be different, as long as it is deterministic.
>>>>>>
>>>>>>> There is no guarantee when the other nodes
>>>>>>> will ever come back up, or if there will ever be additional ones anytime
>>>>>>> soon.
>>>>>> If a system picks the Primary Partition approach, then it can become
>>>>>> completely inaccessible (read-only). In this case, I envisage that a
>>>>>> sysadmin will be notified, who can then start additional nodes for the
>>>>>> system to acquire the primary partition and become accessible again.
>>>>>
>>>>> There should be a way to manually modify the primary partition status.
>>>>> So if the admin knows the nodes will never return, they can manually
>>>>> enable the partition.
>>>>
>>>> The status will be exposed through JMX at any point, regardless of
>>>> whether there's a split brain going on or not.
>>>>
>>>>>
>>>>> Also, the PartitionContext should know whether the nodes left normally
>>>>> or not.
>>>>> If you have 5 nodes in a cluster, and you shut down 3 of them, you'll
>>>>> want the remaining two to remain available.
>>>>> But if there was a network partition, you wouldn't. So it needs to know
>>>>> the difference.
>>>>
>>>> Very good point again.
>>>> Thank you Dennis!
>>>
>>> Let's clarify. If 3 nodes out of 5 are killed without a
>>> reconfiguration, you do NOT want the remaining two to remain available
>>> unless explicitly told so by an admin. It is not possible to
>>> automatically make a distinction between 3 nodes being shut down vs. 3
>>> crashed nodes.
>>
>> I'm not sure you can make this generalization: it's really up to the
>> implementor of PartitionHandlingStrategy to decide that. Knowing whether it
>> was a clean shutdown or not might be relevant to that decision. I think the
>> focus of this functionality should be on how exactly to react to partitions
>> happening, while providing the hooks for the user to make that decision and
>> act on the system's availability.
>
> We're on the same page on that; I'm just stressing that there is no
> automatic way for us to make a distinction between a crash and an
> intentional shutdown if we don't have the "clean shutdown" method
> that Bela also reminded us of.
>
>>
>>>
>>> In our face-to-face meeting we did point out that an admin needs hooks
>>> to be able to:
>>> - specify how many nodes are expected in the full system (and adapt
>>> dynamically)
>>
>> Yes, that's a custom implementation of PartitionHandlingStrategy. One we
>> might provide out of the box.
>
> Right, it could be part of the default PartitionHandlingStrategy, but I
> think all strategies might be interested in this, and it's the
> responsibility of Infinispan (core) to also provide ways to administer the
> expected view at runtime.
>
>
>>> - some admin command to "clean shutdown" a node (which was also
>>> discussed as a strong requirement in the scope of CacheStores, so I'm
>>> assuming the operation is defined already)
>>>
>>> The design wiki has captured the API we discussed around the
>>> PartitionHandlingStrategy but is missing the details about these
>>> operations; those should probably be added to the PartitionContext as
>>> well.
>>
>> The PartitionContext allows a partition to be marked as unavailable; I think
>> that should do.
>
> You also need the "clean shutdown", very likely with the RPC suggested by
> Bela.
>
>
>>> Also, in the scope of CacheStore consistency we had discussed the need
>>> to store the set of nodes expected to be in the View: for example, when the
>>> grid is started and all nodes are finding each other, the Cache shall
>>> not be considered started until all required nodes have joined.
>>
>> The discussion is here:
>> https://community.jboss.org/wiki/ControlledClusterShutdownWithDataRestoreFromPersistentStorage
>>
>>>
>>> Cheers,
>>> Sanne
>>
>> Cheers,
>> --
>> Mircea Markus
>> Infinispan lead (www.infinispan.org)
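
For illustration of the contract being discussed above, here is a minimal, hypothetical
sketch of a primary-partition style strategy. The names PartitionHandlingStrategy and
PartitionContext come from the thread and the design wiki, but every method used below
(onViewChange, getCurrentMembers, getExpectedMembers, leftGracefully, markAvailable,
markUnavailable) and the class PrimaryPartitionStrategy are assumptions made up for the
example; this is not the actual Infinispan API.

// Hypothetical sketch only: interface names are from the design discussion,
// all method signatures are assumed for illustration.

import java.util.List;

interface PartitionContext {
    List<String> getCurrentMembers();    // members visible in the current view (assumed)
    List<String> getExpectedMembers();   // full membership configured by the admin (assumed)
    boolean leftGracefully(String node); // true if the node did a "clean shutdown" (assumed)
    void markAvailable();                // allow reads and writes in this partition (assumed)
    void markUnavailable();              // make this partition read-only/unavailable (assumed)
}

interface PartitionHandlingStrategy {
    void onViewChange(PartitionContext ctx);
}

/**
 * A primary-partition style strategy: the partition stays writable only while
 * it still sees a majority of the nodes that have not been cleanly shut down.
 */
class PrimaryPartitionStrategy implements PartitionHandlingStrategy {

    @Override
    public void onViewChange(PartitionContext ctx) {
        // Nodes that announced a clean shutdown do not count against us:
        // losing them is an intentional reconfiguration, not a split brain.
        long stillExpected = ctx.getExpectedMembers().stream()
                .filter(node -> !ctx.leftGracefully(node))
                .count();

        long visible = ctx.getCurrentMembers().size();

        // Majority rule: this partition is primary iff it sees more than half
        // of the nodes that are still expected to be running.
        if (visible * 2 > stillExpected) {
            ctx.markAvailable();
        } else {
            // Minority (or ambiguous) partition: refuse writes until an admin
            // intervenes, e.g. via the JMX hook mentioned in the thread.
            ctx.markUnavailable();
        }
    }
}

The leftGracefully() hook in this sketch captures exactly the distinction debated above:
nodes that performed a "clean shutdown" are treated as an intentional scale-down and do
not push the remaining members into read-only mode, while nodes that simply disappear do.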
_______________________________________________
infinispan-dev mailing list
[email protected]
https://lists.jboss.org/mailman/listinfo/infinispan-dev
