Anton, I'm OK with your proposal, but IMO it should be provided as an IEP?
On Mon, Apr 29, 2019 at 4:05 PM Anton Vinogradov <a...@apache.org> wrote:

> Sergey,
>
> I'd like to continue the discussion, since it is closely linked to the
> problem I'm currently working on.
>
> 1) writeSynchronizationMode should not be a part of the cache
> configuration, agreed.
> It should be up to the user to decide how strong the "update guarantee"
> should be.
> So, I propose to have a special cache proxy, .withBlaBla() (at 3.x).
>
> 2) A primary failure on !FULL_SYNC is not the only problem that leads to
> an inconsistent state.
> Bugs and incorrect recovery also cause the same problem.
>
> Currently, we have a solution [1] to check that the cluster is
> consistent, but it has poor resolution (it will only tell you which
> partitions are broken).
> So, to find the broken entries you need a special API that will check
> all copies and let you know what went wrong.
>
> 3) Since we mostly agree that a write should affect some backups
> synchronously, how about having similar logic for reads?
>
> So, I propose a special proxy, .withQuorumRead(backupsCnt), which will
> check an explicit number of backups on each read and return the latest
> values.
> This proxy is already implemented [2] for all copies, but I'm going to
> extend it with an explicit backups number.
>
> Thoughts?
>
> 3.1) Backups can be checked in two ways:
> - request data from all backups, but wait only for an explicit number
>   (solves the slow-backup issue, but produces extra traffic)
> - request data from an explicit number of backups (less traffic, but can
>   be as slow as checking all copies)
> Which strategy is better? Should it be configurable?
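The trade-off between the two strategies in 3.1 can be sketched with plain-Java futures. This is a toy model, not Ignite code: backup reads are simulated as Callable tasks, and all names here (waitForQuorum, askExactly) are illustrative assumptions, not proposed API.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.stream.Collectors;

// Toy model of the two backup-check strategies from point 3.1.
public class QuorumReadSketch {

    // Strategy A: fan out to ALL backups, return once `quorum` replies
    // arrive. Tolerates a slow backup at the cost of extra traffic.
    static List<String> waitForQuorum(List<Callable<String>> backups,
                                      int quorum,
                                      ExecutorService pool) throws Exception {
        CompletionService<String> cs = new ExecutorCompletionService<>(pool);
        backups.forEach(cs::submit);
        List<String> replies = new ArrayList<>();
        for (int i = 0; i < quorum; i++)
            replies.add(cs.take().get()); // first `quorum` responses win
        return replies;
    }

    // Strategy B: ask EXACTLY `quorum` backups. Less traffic, but a
    // single slow node stalls the whole read.
    static List<String> askExactly(List<Callable<String>> backups,
                                   int quorum,
                                   ExecutorService pool) throws Exception {
        List<Future<String>> fs = backups.subList(0, quorum).stream()
            .map(pool::submit).collect(Collectors.toList());
        List<String> replies = new ArrayList<>();
        for (Future<String> f : fs)
            replies.add(f.get()); // must wait for every chosen backup
        return replies;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(3);
        List<Callable<String>> backups = List.of(
            () -> "v1",
            () -> { Thread.sleep(500); return "v1"; }, // slow backup
            () -> "v1");
        // Strategy A returns without waiting for the slow backup.
        System.out.println(waitForQuorum(backups, 2, pool)); // [v1, v1]
        pool.shutdown();
    }
}
```

With strategy A the slow backup never delays the read; with strategy B the same read would block for the full 500 ms if the slow node were among the chosen quorum, which is exactly the trade-off the question raises.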
>
> [1]
> https://apacheignite-tools.readme.io/docs/control-script#section-verification-of-partition-checksums
> [2] https://issues.apache.org/jira/browse/IGNITE-10663
>
> On Thu, Apr 25, 2019 at 7:04 PM Sergey Kozlov <skoz...@gridgain.com>
> wrote:
>
> > There's another point to improve:
> > If *syncPartitions=N* becomes configurable at run time, it will allow
> > managing the consistency/performance balance at runtime, e.g. switch
> > to full async for preloading and then go back to full sync for regular
> > operations.
> >
> > On Thu, Apr 25, 2019 at 6:48 PM Sergey Kozlov <skoz...@gridgain.com>
> > wrote:
> >
> > > Vyacheslav,
> > >
> > > You're right with the reference to the MongoDB doc. In general the
> > > idea is very similar. Many vendors use such an approach [1].
> > >
> > > [1]
> > > https://dev.mysql.com/doc/refman/8.0/en/replication-options-master.html#sysvar_rpl_semi_sync_master_wait_for_slave_count
> > >
> > > On Thu, Apr 25, 2019 at 6:40 PM Vyacheslav Daradur <daradu...@gmail.com>
> > > wrote:
> > >
> > >> Hi, Sergey,
> > >>
> > >> Makes sense to me in case of performance issues, but it may lead to
> > >> losing data.
> > >>
> > >> >> *by the new option *syncPartitions=N* (not the best name, just
> > >> >> for referring)
> > >>
> > >> Seems similar to "Write Concern" [1] in MongoDB. It is used in the
> > >> same way as you described.
> > >>
> > >> On the other hand, if you have such issues, they should be
> > >> investigated first: why do they cause performance drops (network
> > >> issues, etc.)?
> > >>
> > >> [1] https://docs.mongodb.com/manual/reference/write-concern/
> > >>
> > >> On Thu, Apr 25, 2019 at 6:24 PM Sergey Kozlov <skoz...@gridgain.com>
> > >> wrote:
> > >> >
> > >> > Ilya,
> > >> >
> > >> > See comments inline.
> > >> >
> > >> > On Thu, Apr 25, 2019 at 5:11 PM Ilya Kasnacheev <
> > >> ilya.kasnach...@gmail.com> wrote:
> > >> >
> > >> > > Hello!
> > >> > >
> > >> > > When you have 2 backups and N = 1, how will conflicts be
> > >> > > resolved?
> > >> > >
> > >> > > Imagine that you had N = 1, and the primary node failed
> > >> > > immediately after the operation. Now you have one backup that
> > >> > > was updated synchronously and one that was not. Will they stay
> > >> > > unsynced, or is there some mechanism of re-syncing?
> > >> >
> > >> > The same way as Ignite processes failures for PRIMARY_SYNC.
> > >> >
> > >> > > Why would one want to "update 1 primary and 1 backup
> > >> > > synchronously, update the rest of the backup partitions
> > >> > > asynchronously"? What's the use case?
> > >> >
> > >> > The case is to have more backups but not pay the performance
> > >> > penalty for them :)
> > >> > For a distributed system, one backup looks risky, but more backups
> > >> > directly impact performance.
> > >> > Another point is to separate strictly consistent apps, like
> > >> > banking apps, from other apps, like fraud detection, analytics,
> > >> > and reports.
> > >> > In that case you can configure the partition distribution with a
> > >> > custom affinity and have the following:
> > >> > - a first set of nodes for critical (from the consistency
> > >> >   standpoint) operations
> > >> > - a second set of nodes that has async backup partitions only, for
> > >> >   the other operations (reports, analytics)
> > >> >
> > >> > > Regards,
> > >> > > --
> > >> > > Ilya Kasnacheev
> > >> > >
> > >> > > Thu, Apr 25, 2019 at 16:55, Sergey Kozlov <skoz...@gridgain.com>:
> > >> > >
> > >> > > > Igniters,
> > >> > > >
> > >> > > > I'm working with a wide range of cache configurations and
> > >> > > > found (from my standpoint) an interesting point for
> > >> > > > discussion:
> > >> > > >
> > >> > > > Now we have the following *writeSynchronizationMode* options:
> > >> > > >
> > >> > > > 1. *FULL_ASYNC*
> > >> > > >    - primary partition updated asynchronously
> > >> > > >    - backup partitions updated asynchronously
> > >> > > > 2. *PRIMARY_SYNC*
> > >> > > >    - primary partition updated synchronously
> > >> > > >    - backup partitions updated asynchronously
> > >> > > > 3. *FULL_SYNC*
> > >> > > >    - primary partition updated synchronously
> > >> > > >    - backup partitions updated synchronously
> > >> > > >
> > >> > > > The approach above covers everything if you have 0 or 1
> > >> > > > backups.
> > >> > > > But for 2 or more backups we can't reach the following case
> > >> > > > (something between *PRIMARY_SYNC* and *FULL_SYNC*):
> > >> > > > - update 1 primary and 1 backup synchronously
> > >> > > > - update the rest of the backup partitions asynchronously
> > >> > > >
> > >> > > > The idea is to merge all current modes into a single one and
> > >> > > > replace *writeSynchronizationMode* with a new option
> > >> > > > *syncPartitions=N* (not the best name, just for referring)
> > >> > > > that covers the approach:
> > >> > > >
> > >> > > > - N = 0 means *FULL_ASYNC*
> > >> > > > - N = (backups+1) means *FULL_SYNC*
> > >> > > > - 0 < N < (backups+1) means either *PRIMARY_SYNC* (N=1) or
> > >> > > >   the new mode described above
> > >> > > >
> > >> > > > IMO it will allow making more flexible and consistent
> > >> > > > configurations.
> > >> > > >
> > >> > > > --
> > >> > > > Sergey Kozlov
> > >> > > > GridGain Systems
> > >> > > > www.gridgain.com
> > >>
> > >> --
> > >> Best Regards, Vyacheslav D.
>

-- 
Sergey Kozlov
GridGain Systems
www.gridgain.com
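The mapping from the proposed *syncPartitions=N* to today's modes can be written down as a small sketch. This is a toy encoding of the thread's proposal, not an Ignite API: syncPartitions is the thread's placeholder name, and PARTIAL_SYNC is an assumed label for the new in-between mode.

```java
// Toy encoding of the proposed syncPartitions=N option: given N and the
// backup count, derive which writeSynchronizationMode the write behaves
// like. PARTIAL_SYNC stands for the proposed "some backups sync" mode.
public class SyncPartitionsSketch {

    enum Mode { FULL_ASYNC, PRIMARY_SYNC, PARTIAL_SYNC, FULL_SYNC }

    static Mode effectiveMode(int syncPartitions, int backups) {
        if (syncPartitions < 0 || syncPartitions > backups + 1)
            throw new IllegalArgumentException("need 0 <= N <= backups + 1");
        if (syncPartitions == 0) return Mode.FULL_ASYNC;           // N = 0
        if (syncPartitions == backups + 1) return Mode.FULL_SYNC;  // all copies
        if (syncPartitions == 1) return Mode.PRIMARY_SYNC;         // primary only
        return Mode.PARTIAL_SYNC; // 1 < N < backups + 1: the new mode
    }

    public static void main(String[] args) {
        // With 2 backups, N = 2 is the "1 primary + 1 backup sync" case
        // that none of the current three modes can express.
        System.out.println(effectiveMode(2, 2)); // PARTIAL_SYNC
        System.out.println(effectiveMode(3, 2)); // FULL_SYNC
    }
}
```

Note that with 0 backups N = 1 maps to FULL_SYNC here, which matches the observation above that the current modes already cover the 0- and 1-backup cases.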