As a follow-up to this email, we are starting to collect evidence that
replicated caches within our Ignite grid are failing to replicate values in
a small number of cases.

In the cases we have observed so far, with a cluster of 4 nodes participating
in a replicated cache, only one node reports having the correct value for a
key, and the other three report having no value for that key.
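
For reference, a per-node check of this kind can be sketched roughly as below. This is a minimal sketch, not our actual tooling: the class names (LocalPeekFunc, ReplicaCheck), the string key type and the byte[] value type are assumptions made for illustration. It broadcasts a closure to the data nodes of the cache and has each node peek at its own local copy of the key, so that no value is fetched from another node.

```csharp
using System;
using Apache.Ignite.Core;
using Apache.Ignite.Core.Cache;
using Apache.Ignite.Core.Compute;
using Apache.Ignite.Core.Resource;

// Hypothetical closure: runs on each node and reports whether that node
// holds a local copy of the key.
[Serializable]
public class LocalPeekFunc : IComputeFunc<string>
{
    // Injected by Ignite on the node where the closure executes.
    [InstanceResource] private IIgnite _ignite;

    public string CacheName { get; set; }
    public string Key { get; set; }

    public string Invoke()
    {
        // Key and value types are assumptions; substitute the real ones.
        var cache = _ignite.GetCache<string, byte[]>(CacheName);

        // Peek only at locally stored data; this does not read from other nodes.
        var found = cache.TryLocalPeek(Key, out var value, CachePeekMode.All);

        var nodeId = _ignite.GetCluster().GetLocalNode().Id;
        return $"{nodeId}: {(found ? $"{value.Length} bytes" : "no value")}";
    }
}

public static class ReplicaCheck
{
    // Prints one line per data node of the given cache.
    public static void Report(IIgnite ignite, string cacheName, string key)
    {
        var func = new LocalPeekFunc { CacheName = cacheName, Key = key };
        var compute = ignite.GetCluster().ForDataNodes(cacheName).GetCompute();

        foreach (var line in compute.Broadcast(func))
            Console.WriteLine(line);
    }
}
```

With a healthy replicated cache, every data node would be expected to report a value for the key; the cases described above are those where only one node does.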

The documentation strongly implies that CacheWriteSynchronizationMode has
no bearing on consistency for replicated caches. As noted below, we use
PrimarySync (the default) for these caches, which suggests a possible
failure mode in which the backup copies never receive the value after the
primary copy has been written.
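
For reference, explicitly requesting FullSync on a replicated cache would look roughly like the fragment below. The cache name is a placeholder rather than one of our real caches, and we have not yet verified whether this change affects the missing values.

```csharp
using Apache.Ignite.Core.Cache.Configuration;

// Hypothetical cache name, for illustration only.
var cfg = new CacheConfiguration("example-replicated-cache")
{
    CacheMode = CacheMode.Replicated,

    // Wait for every node holding a copy to acknowledge the write,
    // instead of relying on the PrimarySync default.
    WriteSynchronizationMode = CacheWriteSynchronizationMode.FullSync
};
```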

We are continuing to investigate and would be interested in any
suggestions you may have as to the likely cause.

Thanks,
Raymond.


On Wed, Jul 19, 2023 at 10:37 AM Raymond Wilson <raymond_wil...@trimble.com>
wrote:

> I have a query regarding the CacheWriteSynchronizationMode in
> CacheConfiguration.
>
> This enum is defined like this in the .Net client:
>
>   public enum CacheWriteSynchronizationMode
>   {
>     /// <summary>
>     /// Mode indicating that Ignite should wait for write or commit replies from all nodes.
>     /// This behavior guarantees that whenever any of the atomic or transactional writes
>     /// complete, all other participating nodes which cache the written data have been updated.
>     /// </summary>
>     FullSync,
>     /// <summary>
>     /// Flag indicating that Ignite will not wait for write or commit responses from participating nodes,
>     /// which means that remote nodes may get their state updated a bit after any of the cache write methods
>     /// complete, or after {@link Transaction#commit()} method completes.
>     /// </summary>
>     FullAsync,
>     /// <summary>
>     /// This flag only makes sense for {@link CacheMode#PARTITIONED} mode. When enabled, Ignite will wait
>     /// for write or commit to complete on primary node, but will not wait for backups to be updated.
>     /// </summary>
>     PrimarySync,
>   }
>
> We have some replicated caches (where cfg.CacheMode =
> CacheMode.Replicated), but we don't specify the WriteSynchronizationMode.
>
> I note in the comment for PrimarySync (the default) that this "only makes
> sense" for Partitioned caches. Since we don't set this mode for our
> replicated caches, they will be using the PrimarySync write
> synchronization mode.
>
> The core Ignite help does not distinguish these synchronization modes and
> strongly implies that all three synchronization modes have equivalent
> consistency guarantees, but the help comment implies that replicated caches
> should use either FullSync or FullAsync to ensure all replicated contexts
> receive the written value.
>
> As background, I am investigating an issue in our system that could be
> explained by replicated caches not having consistent values, and am writing
> some triage tooling to prove whether that is the case by comparing the
> stored values in each of the replicated cache nodes. However, I'm also
> doing some due diligence on our configuration and ran into this item.
>
> Thanks,
> Raymond.
>
>
> --
> <http://www.trimble.com/>
> Raymond Wilson
> Trimble Distinguished Engineer, Civil Construction Software (CCS)
> 11 Birmingham Drive | Christchurch, New Zealand
> raymond_wil...@trimble.com
>
>
> <https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch>
>

