Hello, Anton. > Customer should be able to change strategy on the fly according to time> > periods or load.
I think we should allow to administrator to enable/disable Consistency check. This option shouldn't be related to application code because "Consistency check" is some kind of maintance procedure. What do you think? В Вт, 16/04/2019 в 12:47 +0300, Anton Vinogradov пишет: > Andrey, thanks for tips > > > > You can perform consistency check using idle verify utility. > > Could you please point to utility's page? > According to its name, it requires to stop the cluster to perform the check? > That's impossible at real production when you should have downtime less > that some minutes per year. > So, the only case I see is to use online check during periods of moderate > activity. > > > > Recovery tool is good idea > > This tool is a part of my IEP. > But recovery tool (process) > - will allow you to check entries in memory only (otherwise, you will warm > up the cluster incorrectly), and that's a problem when you have > persisted/in_memory rate > 10:1 > - will cause latency drop for some (eg. 90+ percentile) requests, which is > not acceptable for real production, when we have strict SLA. > - will not guarantee that each operation will use consistent data, > sometimes it's extremely essential > so, the process is a cool idea, but, sometime you may need more. > > Ivan, thanks for analysis > > > > why it comes as an on-demand enabled proxy but not as a mode enabled by > > some configuration property > It's a bad idea to have this feature permanently enabled, it slows down the > system by design. > Customer should be able to change strategy on the fly according to time > periods or load. > Also, we're going to use this proxy for odd requests or for every 5-th, > 10-th, 100-th request depends on the load/time/SLA/etc. > The goal is to perform as much as possible gets-with-consistency operations > without stopping the cluster and never find a problem :) > > > > As for me it will be great if we can improve consistency guarantees > > provided by default. > Once you checked backups you decreased throughput and increased latency. > This feature requred only for some financial, nuclear, health systems when > you should be additionally sure about consistency. > It's like a > - read from backups > - data modification outside the transaction > - using FULL_ASYNC instead of FULL_SYNC, > sometimes it's possible, sometimes not. > > > > 1. It sounds suspicious that reads can cause writes (unexpected > > deadlocks might be possible). > Code performs writes > - key per additional transaction in case original tx was OPTIMISTIC || > READ_COMMITTED, > - all keys per same tx in case original tx was PESSIMISTIC && > !READ_COMMITTED, since you already obtain the locks, > so, deadlock should be impossible. > > > > 2. I do not believe that it is possible to implement a (bugless?) > > feature which will fix other bugs. > It does not fix the bugs, it looks for inconsistency (no matter how it > happened) and reports using events (previous state and how it was fixed). > This allows continuing processing for all the entries, even inconsistent. > But, each such fix should be rechecked manually, for sure. > > On Tue, Apr 16, 2019 at 11:39 AM Павлухин Иван <vololo...@gmail.com> wrote: > > > Anton, > > > > Thank you for your effort for improving consistency guarantees > > provided by Ignite. > > > > The subject sounds really vital. Could you please elaborate why it > > comes as an on-demand enabled proxy but not as a mode enabled by some > > configuration property (or even as a default behavior)? How do you see > > the future development of such consistency checks? As for me it will > > be great if we can improve consistency guarantees provided by default. > > > > Also thinking loud a bit: > > 1. It sounds suspicious that reads can cause writes (unexpected > > deadlocks might be possible). > > 2. I do not believe that it is possible to implement a (bugless?) > > feature which will fix other bugs. > > 3. A storage (or database) product (Ignite in our case) consistency is > > not equal to a user application consistency. So, it might be that > > introduced checks are insufficient to make business applications > > happy. > > > > пн, 15 апр. 2019 г. в 19:27, Andrey Gura <ag...@apache.org>: > > > > > > Anton, > > > > > > I'm trying tell you that this proxy can produce false positive result, > > > incorrect result and just hide bugs. What will the next solution? > > > withNoBugs proxy? > > > > > > You can perform consistency check using idle verify utility. Recovery > > > tool is good idea but user should trigger this process, not some cache > > > proxy implementation. > > > > > > On Mon, Apr 15, 2019 at 5:34 PM Anton Vinogradov <a...@apache.org> wrote: > > > > > > > > Seems, we already fixed all bugs caused this feature, but there is no > > > > warranty we will not create new :) > > > > This proxy is just checker that consistency is ok. > > > > > > > > > > reaching bugless implementation > > > > > > > > Not sure it's possible. Once you have software it contains bugs. > > > > This proxy will tell you whether these bugs lead to inconsistency. > > > > > > > > On Mon, Apr 15, 2019 at 5:19 PM Andrey Gura <ag...@apache.org> wrote: > > > > > > > > > Method name is minor problem. I still believe that there is no need > > > > > for this proxy because there are no any guarantees about bugless > > > > > implementation this functionality. Better way is reaching bugless > > > > > implementation of current functionality. > > > > > > > > > > On Mon, Apr 15, 2019 at 4:51 PM Anton Vinogradov <a...@apache.org> > > > > wrote: > > > > > > > > > > > > Andrey, > > > > > > > > > > > > > > It means also that at least method name is bad. > > > > > > > > > > > > Agreed, already discussed with Aleksey Plekhanov. > > > > > > Decided that ".withConsistencyCheck()" is a proper name. > > > > > > > > > > > > > > What is the profit? > > > > > > > > > > > > This proxy allows to check (and fix) is there any consistency > > > > violation > > > > > > across the topology. > > > > > > The proxy will check all backups contain the same values as > > > > primary. > > > > > > So, when it's possible (you're ready to spend resources for this > > > > check) > > > > > you > > > > > > will be able to read-with-consistency-check. > > > > > > This will decrease the amount of "inconsistency caused > > > > > > war/strikes/devastation" situations, which is important for > > > > financial > > > > > > systems. > > > > > > > > > > > > On Mon, Apr 15, 2019 at 3:58 PM Andrey Gura <ag...@apache.org> > > > > wrote: > > > > > > > > > > > > > Anton, > > > > > > > > > > > > > > what does expression "withConsistency" mean? From user's > > > > standpoint it > > > > > > > means that all operations performed without this proxy are not > > > > > > > consistent. It means also that at least method name is bad. > > > > > > > > > > > > > > Are there any guarantees that withConsistency proxy will not > > > > contain > > > > > > > bugs that will lead to inconsistent write after inconsistency was > > > > > > > found? I think there are no such guarantees. Bugs still are > > > > possible. > > > > > > > So I always must use withConsistency proxy because I doesn't have > > > > > > > other choice - all ways are unreliable and withConsistency just > > > > sounds > > > > > > > better. > > > > > > > > > > > > > > Eventually we will have two different ways for working with cache > > > > > > > values with different bugs set. What is the profit? > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Fri, Apr 12, 2019 at 2:49 PM Anton Vinogradov <a...@apache.org> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > Folks, > > > > > > > > > > > > > > > > I've checked the tx benchmarks and found no performance drop. > > > > > > > > Also, see no issues at TC results. > > > > > > > > So, seems, code ready to be merged. > > > > > > > > > > > > > > > > Everyone interested, please share any objections about > > > > > > > > - public API > > > > > > > > - test coverage > > > > > > > > - implementation approach > > > > > > > > > > > > > > > > On Wed, Apr 3, 2019 at 5:46 PM Anton Vinogradov <a...@apache.org > > > > > wrote: > > > > > > > > > > > > > > > > > Nikolay, > > > > > > > > > > > > > > > > > > This is not a PoC, but the final solution (I hope so:) ) > > > > required > > > > > the > > > > > > > > > review. > > > > > > > > > LWW means Last Write Wins, detailed explanation can be found > > > > at > > > > > IEP-31. > > > > > > > > > > > > > > > > > > On Wed, Apr 3, 2019 at 5:24 PM Nikolay Izhikov < > > > > > > > > > > nizhi...@apache.org> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Hello, Anton. > > > > > > > > > > > > > > > > > > > > Thanks for the PoC. > > > > > > > > > > > > > > > > > > > > > finds correct values according to LWW strategy > > > > > > > > > > > > > > > > > > > > Can you, please, clarify what is LWW strategy? > > > > > > > > > > > > > > > > > > > > В Ср, 03/04/2019 в 17:19 +0300, Anton Vinogradov пишет: > > > > > > > > > > > Ilya, > > > > > > > > > > > > > > > > > > > > > > This is impossible due to a conflict between some > > > > isolation > > > > > levels > > > > > > > and > > > > > > > > > > > get-with-consistency expectations. > > > > > > > > > > > Basically, it's impossible to perform get-with-consistency > > > > > > > > > > after the > > > > > > > > > > other > > > > > > > > > > > get at !READ_COMMITTED transaction. > > > > > > > > > > > The problem here is that value should be cached according > > > > to the > > > > > > > > > > isolation > > > > > > > > > > > level, so get-with-consistency is restricted in this case. > > > > > > > > > > > Same problem we have at case get-with-consistency after > > > > put, so > > > > > we > > > > > > > have > > > > > > > > > > > restriction here too. > > > > > > > > > > > So, the order matter. :) > > > > > > > > > > > > > > > > > > > > > > See OperationRestrictionsCacheConsistencyTest [1] for > > > > details. > > > > > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > https://github.com/apache/ignite/blob/8b0b0c3e1bde93ff9c4eb5667d794dd64a8b06f0/modules/core/src/test/java/org/apache/ignite/internal/processors/cache/consistency/OperationRestrictionsCacheConsistencyTest.java > > > > > > > > > > > > > > > > > > > > > > On Wed, Apr 3, 2019 at 4:54 PM Ilya Kasnacheev < > > > > > > > > > > > > > > > > > > > > ilya.kasnach...@gmail.com> > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > Hello! > > > > > > > > > > > > > > > > > > > > > > > > Sounds useful especially for new feature development. > > > > > > > > > > > > > > > > > > > > > > > > Can you do a run of all tests with > > > > cache.forConsistency(), > > > > > see if > > > > > > > > > > there are > > > > > > > > > > > > cases that fail? > > > > > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > > > > > > -- > > > > > > > > > > > > Ilya Kasnacheev > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > ср, 3 апр. 2019 г. в 16:17, Anton Vinogradov < > > > > a...@apache.org>: > > > > > > > > > > > > > > > > > > > > > > > > > Igniters, > > > > > > > > > > > > > > > > > > > > > > > > > > Sometimes, at real deployment, we're faced with > > > > inconsistent > > > > > > > state > > > > > > > > > > across > > > > > > > > > > > > > the topology. > > > > > > > > > > > > > This means that somehow we have different values for > > > > the > > > > > same > > > > > > > key at > > > > > > > > > > > > > different nodes. > > > > > > > > > > > > > This is an extremely rare situation, but, when you > > > > have > > > > > > > thousands of > > > > > > > > > > > > > terabytes of data, this can be a real problem. > > > > > > > > > > > > > > > > > > > > > > > > > > Apache Ignite provides a consistency guarantee, each > > > > > > > > > > affinity > > > > > > > node > > > > > > > > > > should > > > > > > > > > > > > > contain the same value for the same key, at least > > > > > > > > > > eventually. > > > > > > > > > > > > > But this guarantee can be violated because of bugs, > > > > see > > > > > IEP-31 > > > > > > > [1] > > > > > > > > > > for > > > > > > > > > > > > > details. > > > > > > > > > > > > > > > > > > > > > > > > > > So, I created the issue [2] to handle such situations. > > > > > > > > > > > > > The main idea is to have a special > > > > cache.withConsistency() > > > > > proxy > > > > > > > > > > allows > > > > > > > > > > > > > checking a fix inconsistency on get operation. > > > > > > > > > > > > > > > > > > > > > > > > > > I've created PR [3] with following improvements (when > > > > > > > > > > > > > cache.withConsistency() proxy used): > > > > > > > > > > > > > > > > > > > > > > > > > > - PESSIMISTIC && !READ_COMMITTED transaction > > > > > > > > > > > > > -- checks values across the topology (under locks), > > > > > > > > > > > > > -- finds correct values according to LWW strategy, > > > > > > > > > > > > > -- records special event in case consistency > > > > violation found > > > > > > > > > > (contains > > > > > > > > > > > > > inconsistent map <Node, <K,V>> and last values <K,V>), > > > > > > > > > > > > > -- enlists writes with latest value for each > > > > inconsistent > > > > > key, > > > > > > > so > > > > > > > > > > it will > > > > > > > > > > > > > be written on tx.commit(). > > > > > > > > > > > > > > > > > > > > > > > > > > - OPTIMISTIC || READ_COMMITTED transactions > > > > > > > > > > > > > -- checks values across the topology (not under > > > > locks, so > > > > > > > > > > false-positive > > > > > > > > > > > > > case is possible), > > > > > > > > > > > > > -- starts PESSIMISTIC && SERIALIZABLE (at separate > > > > thread) > > > > > > > > > > transaction > > > > > > > > > > > > > > > > > > > > > > > > for > > > > > > > > > > > > > each possibly broken key and fixes it on a commit if > > > > > > > > > > necessary. > > > > > > > > > > > > > -- original transaction performs get-after-fix and > > > > can be > > > > > > > continued > > > > > > > > > > if > > > > > > > > > > > > > > > > > > > > > > > > the > > > > > > > > > > > > > fix does not conflict with isolation level. > > > > > > > > > > > > > > > > > > > > > > > > > > Future plans > > > > > > > > > > > > > - Consistency guard (special process periodically > > > > checks we > > > > > > > have no > > > > > > > > > > > > > inconsistency). > > > > > > > > > > > > > - MVCC support. > > > > > > > > > > > > > - Atomic caches support. > > > > > > > > > > > > > - Thin client support. > > > > > > > > > > > > > - SQL support. > > > > > > > > > > > > > - Read-with-consistency before the write operation. > > > > > > > > > > > > > - Single key read-with-consistency optimization, now > > > > the > > > > > > > collection > > > > > > > > > > > > > approach used each time. > > > > > > > > > > > > > - Do not perform read-with-consistency for the key in > > > > case > > > > > it > > > > > > > was > > > > > > > > > > > > > consistently read some time ago. > > > > > > > > > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/IGNITE/IEP-31+Consistency+check+and+fix > > > > > > > > > > > > > [2] > > > > https://issues.apache.org/jira/browse/IGNITE-10663 > > > > > > > > > > > > > [3] https://github.com/apache/ignite/pull/5656 > > > > > > > > > > > > > > > > > > > > > -- > > Best regards, > > Ivan Pavlukhin > >
signature.asc
Description: This is a digitally signed message part