I assume you have created your caches/tables with backups>=1.

Restart one node at a time. After each restart, wait until the node has
rejoined the cluster, then until rebalancing has begun, then until
rebalancing has finished, and only then restart the next node. Kubernetes
readiness probes aren't sophisticated enough for this: "node ready" is not
the same state as "cluster ready", and Kubernetes can't distinguish between
the two. The sequencing should be handled by an operator, either a human or
an automated Kubernetes one.
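
To make the ordering concrete, here is a minimal Python sketch of that
loop. It is only an illustration under assumptions: the four hooks
(restart_node, node_joined, rebalance_started, rebalance_finished) are
hypothetical callables you would have to implement yourself, for example
on top of kubectl and the JMX metrics. They are not Ignite or Kubernetes
APIs. rebalance_started should be "latched", meaning it reports whether
rebalancing has begun at any point since the restart, otherwise a fast
rebalance could be missed between polls.

```python
import time
from typing import Callable, Iterable, List


def wait_until(condition: Callable[[], bool], timeout_s: float,
               poll_s: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll `condition` until it returns True; raise TimeoutError otherwise."""
    deadline = time.monotonic() + timeout_s
    while not condition():
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met before timeout")
        sleep(poll_s)


def rolling_restart(nodes: Iterable[str],
                    restart_node: Callable[[str], None],
                    node_joined: Callable[[str], bool],
                    rebalance_started: Callable[[], bool],
                    rebalance_finished: Callable[[], bool],
                    timeout_s: float = 600.0,
                    poll_s: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> List[str]:
    """Restart nodes strictly one at a time.

    All four hooks are hypothetical and supplied by the caller.
    Returns the nodes that were restarted, in order.
    """
    restarted: List[str] = []
    for node in nodes:
        restart_node(node)
        # 1. The restarted node is back in the topology.
        wait_until(lambda: node_joined(node), timeout_s, poll_s, sleep)
        # 2. Rebalancing onto that node has begun (latched hook, see above).
        wait_until(rebalance_started, timeout_s, poll_s, sleep)
        # 3. Rebalancing has finished, so backups are whole again.
        wait_until(rebalance_finished, timeout_s, poll_s, sleep)
        restarted.append(node)
    return restarted
```

A TimeoutError stops the rollout before the next node is touched, which is
the behaviour you want: a stuck rebalance should halt the update rather
than let it continue and take down the only remaining copy of a partition.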

On Tue, Sep 3, 2024 at 1:13 PM Humphrey <[email protected]> wrote:

> Thanks, I meant a rolling update of the same version of Ignite (2.16), not
> an upgrade to a new version. We have Ignite embedded in a Spring Boot
> application, and when we change code we need to deploy a new version of
> the jar.
>
> Humphrey
>
> On 3 Sep 2024, at 19:24, Gianluca Bonetti <[email protected]>
> wrote:
>
> Hello
>
> If you want to upgrade to a new Apache Ignite version, a rolling upgrade is
> not supported by Apache Ignite:
>
> "Ignite cluster cannot have nodes that run on different Ignite versions.
> You need to stop the cluster and start it again on the new Ignite version."
> https://ignite.apache.org/docs/latest/installation/upgrades
>
> If you need rolling upgrades, you can move to GridGain, which brings
> rolling upgrades together with many other interesting features:
> "Rolling Upgrades is a feature of GridGain Enterprise and Ultimate Edition
> that allows nodes with different GridGain versions to coexist in a cluster
> while you roll out a new version. This prevents downtime when performing
> software upgrades."
> https://www.gridgain.com/docs/latest/installation-guide/rolling-upgrades
>
> Cheers
> Gianluca Bonetti
>
> On Tue, 3 Sept 2024 at 18:15, Humphrey Lopez <[email protected]> wrote:
>
>> Hello, we have several pods with Ignite caches running in Kubernetes. We
>> only use in-memory mode (no persistence) and want to perform a rolling
>> update without losing data. What metric should we monitor to know when
>> it's safe to replace the next pod?
>>
>> We have tried the Cluster.Rebalanced (1) metric from JMX in a readiness
>> probe, but we still end up losing data from the caches.
>>
>> 1)
>> https://ignite.apache.org/docs/latest/monitoring-metrics/new-metrics#cluster
>>
>> Should we use another mechanism or metric for determining the readiness
>> of the newly started pod?
>>
>>
>> Humphrey
>>
>
