Hello, Raymond.

Usually, "experimental" marks a feature that may change in the future. The label usually concerns the feature's public API.

> Does this imply risk if run against a production environment grid?

It depends.
As for read repair, CHECK_ONLY is a read-only mode and can’t harm your data.
The other modes, which fix data inconsistencies, have been used in our production environment, and there are no known issues.

> On Aug 22, 2023, at 03:12, Raymond Wilson <[email protected]> wrote:
>
> Thanks for the pointer to the read repair facility added in Ignite 2.14.
>
> Unfortunately the .WithReadRepair() extension does not seem to be present in
> the Ignite C# client.
>
> This means we either need to use the experimental Command.sh support, or
> improve our tooling to effectively do the same. I am curious why this is
> labelled as experimental? Does this imply risk if run against a production
> environment grid?
>
> Raymond.
>
>
> On Mon, Aug 21, 2023 at 5:50 PM Николай Ижиков <[email protected]
> <mailto:[email protected]>> wrote:
>> Hello.
>>
>> I don’t know the cause of your issue.
>> But, we have feature to overcome it [1]
>>
>> Consistency repair can be run from control.sh.
>>
>> ```
>> ./bin/control.sh --enable-experimental
>> ...
>> [EXPERIMENTAL]
>> Check/Repair cache consistency using Read Repair approach:
>> control.(sh|bat) --consistency repair cache-name partition
>>
>> Parameters:
>> cache-name - Cache to be checked/repaired.
>> partition - Cache's partition to be checked/repaired.
>>
>> [EXPERIMENTAL]
>> Cache consistency check/repair operations status:
>> control.(sh|bat) --consistency status
>>
>> [EXPERIMENTAL]
>> Finalize partitions update counters:
>> control.(sh|bat) --consistency finalize
>> ```
>>
>> It seems that docs for a cmd command not full.
>> It also accepts strategy argument so you can manage your repair actions more
>> accurate.
>> Try to run:
>>
>> ```
>> ❯ ./bin/control.sh --enable-experimental --consistency repair --cache default --strategy CHECK_ONLY --partitions 1,2,3,…your_partitions_list...
>> ```
>>
>> Available strategies with good description can be found in sources [2]
>>
>>
>> [1] https://ignite.apache.org/docs/latest/key-value-api/read-repair
>> [2] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/cache/ReadRepairStrategy.java
>>
>>
>>
>>> On Aug 21, 2023, at 07:46, Raymond Wilson <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>> [Replying onto correct thread]
>>>
>>> As a follow up to this email, we are starting to collect evidence that
>>> replicated caches within our Ignite grid are failing to replicate values in
>>> a small number of cases.
>>>
>>> In the cases we observe so far, with a cluster of 4 nodes participating in
>>> a replicated cache, only one node reports having the correct value for a
>>> key, and the other three report having no value for that key.
>>>
>>> The documentation is pretty opinionated about the
>>> CacheWriteSynchronizationMode not being impactful with respect to
>>> consistency for replicated caches. As noted below, we use PrimarySync (the
>>> default) for these caches, which would suggest a potential failure mode
>>> preventing the backup copies obtaining their copy once the primary copy has
>>> been written.
>>>
>>> We are continuing to investigate and would be interested in any suggestions
>>> you may have as to the likely cause.
>>>
>>> Thanks,
>>> Raymond.
>>>
>>> On Thu, Jul 27, 2023 at 12:38 PM Raymond Wilson <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>> Hi,
>>>>
>>>> I have a query regarding data safety of replicated caches in the case of
>>>> hard failure of the compute resource but where the storage resource is
>>>> available when the node returns.
>>>>
>>>> We are using Ignite 2.15 with the C# client.
>>>>
>>>> We have a number of these caches that have four nodes participating in the
>>>> replicated caches, all with the default PrimarySync write synchronization
>>>> mode.
>>>> All data storage configurations are configured with WalMode = WalMode.Fsync.
>>>>
>>>> We have logic performing writes against these caches which will continue
>>>> once the primary node for the replicated cache has written the data item.
>>>>
>>>> I am unsure of the guarantees made by Ignite at this point in the event of
>>>> failure. Specifically, hard/red-button failure of compute hardware
>>>> resources and/or abrupt (but recoverable) detachment of storage resources.
>>>>
>>>> Scenario one: Primary node returns "OK", then immediately fails (before
>>>> check point). When the primary node returns should I expect the replicated
>>>> value to be in the primary, and to appear in all other nodes too.
>>>>
>>>> Scenario two: Primary node returns "OK", then a secondary node immediately
>>>> fails (before achieving the write and so before any check point). When the
>>>> secondary node returns should I expect the replicated value to be in the
>>>> recovered secondary node?
>>>>
>>>> In relation to these scenarios, does setting the cache write
>>>> synchronization mode improve the safety of the write as all nodes must
>>>> acknowledge the write before it returns.
>>>>
>>>> If there is an improvement in write safety in this instance, does this
>>>> imply the Fsync WalMode write pathway has opportunities for data loss in
>>>> these failure situations?
>>>>
>>>> Thanks,
>>>> Raymond.
>>>>
>>>> --
>>>> <http://www.trimble.com/>
>>>> Raymond Wilson
>>>> Trimble Distinguished Engineer, Civil Construction Software (CCS)
>>>> 11 Birmingham Drive | Christchurch, New Zealand
>>>> [email protected] <mailto:[email protected]>
>>>>
>>>> <https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch>
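
P.S. If the control.sh route is scripted from tooling, the CHECK_ONLY invocation quoted above could be assembled roughly as follows. This is only a sketch: the helper name `build_repair_command` and its parameters are hypothetical, the flags are copied from the command quoted in this thread, and they have not been checked here against any particular Ignite version.

```python
import shlex


def build_repair_command(cache, partitions, strategy="CHECK_ONLY",
                         control_script="./bin/control.sh"):
    """Build the argv list for a read-repair run.

    Hypothetical helper: it only arranges the flags quoted in this thread
    and does not talk to Ignite itself.
    """
    if not partitions:
        raise ValueError("at least one partition is required")
    return [
        control_script,
        "--enable-experimental",      # the consistency commands are experimental
        "--consistency", "repair",
        "--cache", cache,
        "--strategy", strategy,       # CHECK_ONLY only reports, it does not fix
        "--partitions", ",".join(str(p) for p in partitions),
    ]


cmd = build_repair_command("default", [1, 2, 3])
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run` as-is. Keeping CHECK_ONLY as the default means a run never modifies data unless another strategy is passed explicitly.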
