On 1 November 2013 11:56, Mircea Markus <[email protected]> wrote:
>
> On Oct 31, 2013, at 10:20 PM, Sanne Grinovero <[email protected]> wrote:
>
>> On 31 October 2013 20:07, Mircea Markus <[email protected]> wrote:
>>>
>>> On Oct 31, 2013, at 3:45 PM, Dennis Reed <[email protected]> wrote:
>>>
>>>> On 10/31/2013 02:18 AM, Bela Ban wrote:
>>>>>
>>>>>> Also if we did have read only, what criteria would cause those nodes
>>>>>> to be writeable again?
>>>>> Once you become the primary partition, e.g. when a view is received
>>>>> where view.size() >= N, where N is a predefined threshold. The criterion
>>>>> can be different, as long as it is deterministic.
>>>>>
>>>>>> There is no guarantee when the other nodes
>>>>>> will ever come back up or if there will ever be additional ones anytime
>>>>>> soon.
>>>>> If a system picks the Primary Partition approach, then it can become
>>>>> completely inaccessible (read-only). In this case, I envisage that a
>>>>> sysadmin will be notified, who can then start additional nodes for the
>>>>> system to acquire the primary partition and become accessible again.
>>>>
>>>> There should be a way to manually modify the primary partition status.
>>>> So if the admin knows the nodes will never return, they can manually
>>>> enable the partition.
>>>
>>> The status will be exposed through JMX at any point, regardless of
>>> whether a split brain is going on or not.
>>>
>>>> Also, the PartitionContext should know whether the nodes left normally
>>>> or not.
>>>> If you have 5 nodes in a cluster and you shut down 3 of them, you'll
>>>> want the remaining two to remain available.
>>>> But if there was a network partition, you wouldn't. So it needs to know
>>>> the difference.
>>>
>>> Very good point again.
>>> Thank you Dennis!
>>
>> Let's clarify. If 3 nodes out of 5 are killed without a
>> reconfiguration, you do NOT want the remaining two to remain available
>> unless explicitly told so by an admin. It is not possible to
>> automatically make a distinction between 3 nodes being shut down vs. 3
>> crashed nodes.
>
> I'm not sure you can make this generalization: it's really up to the
> implementor of PartitionHandlingStrategy to decide that. Knowing whether it
> was a clean shutdown or not might be relevant to that decision. I think the
> focus of this functionality should be on how exactly to react to partitions
> happening, while providing the hooks for the user to make that decision and
> act on the system's availability.
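Just so we're picturing the same kind of hook, here is a very rough sketch
of what I have in mind; all names and signatures are purely illustrative,
not the API captured on the design wiki:

import java.util.List;

// Illustrative shapes only: a guess at the hooks being discussed,
// not the interfaces from the design wiki.
interface PartitionContext {
    List<String> currentMembers();   // members of the view we just received
    int expectedClusterSize();       // configured by the admin, adaptable at runtime
    void markUnavailable();          // e.g. refuse writes (or all access)
    void markAvailable();            // resume normal operation
}

interface PartitionHandlingStrategy {
    void onViewChange(PartitionContext ctx);
}

// Primary-partition style strategy: stay available only while a majority
// of the expected members is visible (Bela's view.size() >= N criterion).
class PrimaryPartitionStrategy implements PartitionHandlingStrategy {
    @Override
    public void onViewChange(PartitionContext ctx) {
        int threshold = ctx.expectedClusterSize() / 2 + 1; // simple majority
        if (ctx.currentMembers().size() >= threshold) {
            ctx.markAvailable();    // we are (again) the primary partition
        } else {
            ctx.markUnavailable();  // minority: wait for nodes or an admin override
        }
    }
}

The point being that Infinispan core only delivers the view-change callback;
the policy above is just one possible implementation of it.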
We're on the same page on that; I'm just stressing that there is no
automatic way to distinguish between a crash and an intentional shutdown if
we don't have the "clean shutdown" method that Bela also mentioned.

>
>> In our face to face meeting we did point out that an admin needs hooks
>> to be able to:
>> - specify how many nodes are expected in the full system (and adapt
>> dynamically)
>
> yes, that's a custom implementation of PartitionHandlingStrategy. One we
> might provide out of the box.

Right, it could be part of the default PartitionHandlingStrategy, but I
think all strategies might be interested in this, and it is the
responsibility of Infinispan (core) to also provide ways to administer the
expected view at runtime.

>> - some admin command to "clean shutdown" a node (which was also
>> discussed as a strong requirement in the scope of CacheStores, so I'm
>> assuming the operation is defined already)
>>
>> The design Wiki has captured the API we discussed around the
>> PartitionHandlingStrategy but is missing the details about these
>> operations, which should probably be added to the PartitionContext as
>> well.
>
> The PartitionContext allows a partition to be marked as unavailable, I think
> that should do.

You also need the "clean shutdown" operation, very likely implemented with
the RPC suggested by Bela.

>> Also, in the scope of CacheStore consistency we had discussed the need
>> to store which nodes are expected to be in the View: for example, when
>> the grid is started and all nodes are finding each other, the Cache
>> shall not be considered started until all required nodes have joined.
>
> the discussion is here:
> https://community.jboss.org/wiki/ControlledClusterShutdownWithDataRestoreFromPersistentStorage
>
>> Cheers,
>> Sanne
>
> Cheers,
> --
> Mircea Markus
> Infinispan lead (www.infinispan.org)
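PS: to make the admin-facing side of the above a bit more concrete as well,
this is roughly the kind of runtime control surface I picture; again, the
MBean name and operations are invented for the sake of the example, not
anything agreed on the wiki:

// Illustrative sketch of the runtime/admin hooks discussed above.
public interface PartitionAdminMXBean {

    // Let the admin adjust the expected cluster size without a restart,
    // so a strategy can recompute its majority threshold dynamically.
    int getExpectedClusterSize();
    void setExpectedClusterSize(int expectedClusterSize);

    // Current availability status, exposed whether or not a split brain
    // is ongoing, e.g. "AVAILABLE", "READ_ONLY", "UNAVAILABLE".
    String getAvailabilityStatus();

    // Manual override for Dennis' case: the admin knows the missing nodes
    // will never come back and force-enables the surviving partition.
    void forceAvailable();

    // Clean shutdown of this node: broadcast a "leaving on purpose" message
    // first, so the remaining members can tell a graceful leave from a crash.
    void cleanShutdown();
}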
