Considering that very few options can be changed safely at runtime, should we rather focus on a strategy where we start a new grid and populate it with the data from the old one before flipping the proxy to the new grid?
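
Something along these lines, over the Hot Rod client (a rough sketch only - the "old-grid"/"new-grid" host names and the "default" cache name are made up for illustration):

import java.util.Map;

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;
import org.infinispan.commons.util.CloseableIterator;

public class GridToGridCopy {
   public static void main(String[] args) {
      // Hot Rod client pointed at the old grid (host names are made up)
      RemoteCacheManager oldGrid = new RemoteCacheManager(
            new ConfigurationBuilder().addServer().host("old-grid").port(11222).build());
      // Hot Rod client pointed at the freshly started grid
      RemoteCacheManager newGrid = new RemoteCacheManager(
            new ConfigurationBuilder().addServer().host("new-grid").port(11222).build());

      RemoteCache<Object, Object> src = oldGrid.getCache("default");
      RemoteCache<Object, Object> dst = newGrid.getCache("default");

      // Stream the entries out of the old grid in batches of 1000 and
      // write them into the new one
      try (CloseableIterator<Map.Entry<Object, Object>> it = src.retrieveEntries(null, 1000)) {
         while (it.hasNext()) {
            Map.Entry<Object, Object> e = it.next();
            dst.put(e.getKey(), e.getValue());
         }
      }

      oldGrid.stop();
      newGrid.stop();
   }
}

Writes that hit the old grid while the copy runs are not picked up, so the clients would have to be quiesced (or the copy repeated) before flipping the proxy - essentially a poor man's version of the Rolling Upgrades procedure mentioned in the thread below.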
On Mon 2016-07-18 17:12, Tristan Tarrant wrote:
> On 14/07/16 12:17, Sebastian Laskawiec wrote:
> > Hey!
> >
> > I've been thinking about a potential use of the Kubernetes/OpenShift
> > (OpenShift = Kubernetes + additional features) Rolling Update
> > mechanism for updating the configuration of Hot Rod servers. You can
> > find more information about rolling updates here [1][2], but put
> > simply, Kubernetes replaces the nodes in the cluster one at a time.
> > Worth mentioning: Kubernetes ensures that the newly created replica
> > is fully operational before taking down the next one.
> >
> > There are two things that make me scratch my head...
> >
> > #1 - What type of configuration changes can we introduce using
> > rolling updates?
> >
> > I'm pretty sure introducing a new cache definition won't do any harm.
> > But what if we change a cache type from Distributed to Replicated? Do
> > you have any idea which configuration changes are safe and which are
> > not? Could we come up with such a list?
> Very few changes are safe, but obviously this would need to be verified
> on a per-attribute basis. All of the attributes which can be changed at
> runtime (timeouts, eviction size) are safe.
>
> > #2 - How to prevent loosing data during the rolling update process?
> I believe you want to write losing :)
> > In Kubernetes we have a mechanism called lifecycle hooks [3] (we can
> > invoke a script during container startup/shutdown). The problem with
> > the shutdown script is that it is time-constrained (if it doesn't
> > finish within a certain amount of time, Kubernetes will simply kill
> > the container). Fortunately this time is configurable.
> >
> > The idea for preventing data loss would be to invoke (enqueue and
> > wait for completion) a state transfer process triggered by the
> > shutdown hook (with the timeout set to its maximum value). If for
> > some reason this doesn't work (e.g. a user has so much data that
> > migrating it this way would take ages), there is a backup plan -
> > Infinispan Rolling Upgrades [4].
> The thing that concerns me here is the amount of churn involved: the
> safest bet for us is that the net topology doesn't change, i.e. you
> end up with the exact number of nodes you started with and they are
> replaced one by one in a way that the replacement assumes the identity
> of the replaced node (persistent uuid, owned segments and data in a
> persistent store). Other types could be supported but they would
> definitely carry a level of risk.
> Also we don't have any guarantees that a newer version will be able to
> cluster with an older one...
>
> Tristan
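
As for the shutdown-hook idea in #2, on the embedded side it could look roughly like this (a minimal sketch, assuming an infinispan.xml that defines a distributed "default" cache with numOwners >= 2, and that the Kubernetes preStop/termination sequence simply delivers SIGTERM to the JVM - the hook wiring itself is not shown):

import org.infinispan.manager.DefaultCacheManager;

public class GracefulLeave {
   public static void main(String[] args) throws Exception {
      // Embedded manager; infinispan.xml is assumed to define a
      // distributed "default" cache with numOwners >= 2
      DefaultCacheManager manager = new DefaultCacheManager("infinispan.xml");
      manager.getCache("default"); // join the cluster and start the cache

      // SIGTERM runs the JVM shutdown hooks. Stopping the manager makes
      // this node leave the cluster cleanly; with numOwners >= 2 every
      // segment it owned has a surviving copy, and the remaining members
      // rebalance after the leave.
      Runtime.getRuntime().addShutdownHook(new Thread(manager::stop));

      Thread.currentThread().join(); // block forever; exit comes via SIGTERM
   }
}

The pod's termination grace period would have to outlast the leave and the resulting rebalance, which is exactly the time constraint discussed above.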