>> I would enable this option only after we confirm it's stable.

We would have to add a system property in that case. It's a bad idea to have such options in the public API, because the feature can be unstable. It would be like allowMoreThan1024Node(bool), given the issues we have with that one.

2018-04-09 20:03 GMT+03:00 Yakov Zhdanov <yzhda...@apache.org>:

How about skipWalOnRebalancing or disableWalOnRebalancing? I like the first one better, and both are shorter.

--Yakov

2018-04-09 19:55 GMT+03:00 Denis Magda <dma...@apache.org>:

I would enable this option only after we confirm it's stable. Until that happens, it should be treated as an experimental feature that is turned on manually with a parameter like the one Ilya is suggesting.

Ilya, the name sounds good to me.

--
Denis

On Mon, Apr 9, 2018 at 9:16 AM, Anton Vinogradov <a...@apache.org> wrote:

Ilya,

WAL should be automatically disabled during initial rebalancing. It can't be disabled during regular rebalancing, since you are never ready to lose data that you protected with WAL. Is there any need for a special parameter in that case?

2018-04-09 16:41 GMT+03:00 Ilya Lantukh <ilant...@gridgain.com>:

Igniters,

I am currently at the finish line of the https://issues.apache.org/jira/browse/IGNITE-8017 ("Disable WAL during initial preloading") implementation, and I need this behavior to be configurable. In my intermediate implementation I have a parameter called "disableWalDuringRebalancing" in IgniteConfiguration. Do you think such a name is meaningful and self-explanatory? Do we need to ensure that it has the same value on every node? Should I make it configurable per cache rather than globally?

Please share your thoughts.
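As a rough sketch of what such a flag could look like, following Ignite's chained-setter configuration style — the class and property below are illustrative stand-ins, not the actual IgniteConfiguration API:

```java
/** Illustrative sketch only -- not the actual Ignite API. */
public class RebalanceConfigSketch {
    /** Hypothetical flag mirroring the proposed "disableWalDuringRebalancing". */
    private boolean disableWalDuringRebalancing;

    public boolean isDisableWalDuringRebalancing() {
        return disableWalDuringRebalancing;
    }

    /** Chained setter, following Ignite's configuration style. */
    public RebalanceConfigSketch setDisableWalDuringRebalancing(boolean disable) {
        this.disableWalDuringRebalancing = disable;
        return this;
    }

    public static void main(String[] args) {
        RebalanceConfigSketch cfg = new RebalanceConfigSketch()
            .setDisableWalDuringRebalancing(true);

        System.out.println(cfg.isDisableWalDuringRebalancing());
    }
}
```

Whether such a flag should live on IgniteConfiguration (global) or on CacheConfiguration (per cache) is exactly the open question in the message above.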
On Mon, Apr 9, 2018 at 4:32 PM, Ilya Lantukh <ilant...@gridgain.com> wrote:

Denis,

Those tickets are rather complex, so I don't know when I'll be able to start working on them.

On Fri, Mar 30, 2018 at 11:45 PM, Denis Magda <dma...@apache.org> wrote:

Ilya,

Just came across the IEP put together by you:
https://cwiki.apache.org/confluence/display/IGNITE/IEP-16%3A+Optimization+of+rebalancing

Excellent explanation, thanks for aggregating everything there.

Two tickets below don't have a fixed version assigned:
https://issues.apache.org/jira/browse/IGNITE-8020
https://issues.apache.org/jira/browse/IGNITE-7935

You do plan to work on them in the 2.6 time frame, right?

--
Denis

On Tue, Mar 27, 2018 at 9:29 AM, Denis Magda <dma...@apache.org> wrote:

Ilya, granted you all the required permissions. Please let me know if you still have trouble with the wiki.

--
Denis

On Tue, Mar 27, 2018 at 8:56 AM, Ilya Lantukh <ilant...@gridgain.com> wrote:

Unfortunately, I don't have permission to create a page for the IEP on the wiki. Denis, can you grant it? My username is ilantukh.

On Mon, Mar 26, 2018 at 8:04 PM, Anton Vinogradov <a...@apache.org> wrote:

> It is impossible to disable WAL only for certain partitions without completely overhauling the design of Ignite's storage mechanism. Right now we can afford only to change WAL mode per cache group.

Cache group rebalancing is really the rebalancing of one cache, and this cache ("cache group") can be presented as a set of virtual caches. So there is no issue for initial rebalancing. Let's disable WAL on initial rebalancing.

2018-03-26 16:46 GMT+03:00 Ilya Lantukh <ilant...@gridgain.com>:

Dmitry,

It is impossible to disable WAL only for certain partitions without completely overhauling the design of Ignite's storage mechanism. Right now we can afford only to change WAL mode per cache group.

The idea is to disable WAL when a node doesn't have any partition in OWNING state, which means it doesn't have any consistent data and won't be able to restore from WAL anyway. I don't see any potential use for WAL on such a node, but we can keep a configurable parameter indicating whether we can automatically disable WAL in such a case or not.

On Fri, Mar 23, 2018 at 10:40 PM, Dmitry Pavlov <dpavlov....@gmail.com> wrote:

Denis, as I understood, there is an idea to exclude only the rebalanced partition(s) data. All other data will go to the WAL.

Ilya, please correct me if I'm wrong.

On Fri, 23 Mar 2018 at 22:15, Denis Magda <dma...@apache.org> wrote:

Ilya,

That's a decent boost (5-20%) even with WAL enabled. Not sure that we should stake on the WAL "off" mode here, because if the whole cluster goes down, the data consistency is questionable. As an architect, I wouldn't disable WAL for the sake of rebalancing; it's too risky.

If you agree, then let's create the IEP. This way it will be easier to track this endeavor. BTW, are you ready to release in 2.5 any of the optimizations being discussed in a separate thread?

--
Denis

On Fri, Mar 23, 2018 at 6:37 AM, Ilya Lantukh <ilant...@gridgain.com> wrote:

Denis,

> - Don't you want to aggregate the tickets under an IEP?

Yes, I think so.

> - Does it mean we're going to update our B+Tree implementation? Any ideas how risky it is?

One of the tickets that I created (https://issues.apache.org/jira/browse/IGNITE-7935) involves B+Tree modification, but I am not planning to do it in the nearest future.
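(Editorial aside: the safety rule Ilya states earlier in the thread — WAL may be switched off only while the node has no partition in OWNING state, because without consistent data there is nothing WAL could restore — can be sketched as below. The enum and method are hypothetical simplifications, not Ignite internals such as GridDhtPartitionState.)

```java
import java.util.List;

/** Illustrative sketch of the "no OWNING partition => WAL is useless" rule. */
public class WalPolicySketch {
    /** Simplified partition states; the real set in Ignite is larger. */
    enum PartitionState { MOVING, OWNING, RENTING }

    /**
     * WAL can be disabled only while the node has no partition in OWNING
     * state: such a node holds no consistent data, so WAL-based crash
     * recovery could not help it anyway.
     */
    static boolean canDisableWal(List<PartitionState> localPartitions) {
        return localPartitions.stream().noneMatch(s -> s == PartitionState.OWNING);
    }

    public static void main(String[] args) {
        // Node still preloading everything: WAL may be disabled.
        System.out.println(canDisableWal(
            List.of(PartitionState.MOVING, PartitionState.MOVING)));

        // Node already owning a partition: WAL must stay on.
        System.out.println(canDisableWal(
            List.of(PartitionState.OWNING, PartitionState.MOVING)));
    }
}
```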
It shouldn't affect existing tree operations, only introduce new ones (putAll, invokeAll, removeAll).

> - Any chance you had a prototype that shows performance optimizations of the approach you are suggesting to take?

I have a prototype for the simplest improvements (https://issues.apache.org/jira/browse/IGNITE-8019 & https://issues.apache.org/jira/browse/IGNITE-8018) - together they increase throughput by 5-20%, depending on configuration and environment. Also, I've tested different WAL modes - switching from LOG_ONLY to NONE gives over a 100% boost - this is what I expect from https://issues.apache.org/jira/browse/IGNITE-8017.

On Thu, Mar 22, 2018 at 9:48 PM, Denis Magda <dma...@apache.org> wrote:

Ilya,

That's outstanding research and summary. Thanks for spending your time on this.

Not sure I have enough expertise to challenge your approach, but it sounds 100% reasonable to me. As side notes:

- Don't you want to aggregate the tickets under an IEP?
- Does it mean we're going to update our B+Tree implementation? Any ideas how risky it is?
- Any chance you had a prototype that shows performance optimizations of the approach you are suggesting to take?

--
Denis

On Thu, Mar 22, 2018 at 8:38 AM, Ilya Lantukh <ilant...@gridgain.com> wrote:

Igniters,

I've spent some time analyzing the performance of the rebalancing process. The initial goal was to understand what limits its throughput, because it is significantly slower than what the network and storage device can theoretically handle.

Turns out, our current implementation has a number of issues caused by a single fundamental problem.

During rebalance, data is sent in batches called GridDhtPartitionSupplyMessages. Batch size is configurable; the default value is 512KB, which could mean thousands of key-value pairs.
However, we don't take any advantage of this fact and process each entry independently:
- checkpointReadLock is acquired multiple times for every entry, leading to unnecessary contention - this is clearly a bug;
- for each entry we write (and fsync, if the configuration assumes it) a separate WAL record - so, if a batch contains N entries, we might end up doing N fsyncs;
- adding every entry into CacheDataStore also happens completely independently. This means we will traverse and modify each index tree N times, we will allocate space in the FreeList N times, and we will have to additionally store O(N*log(N)) page delta records in the WAL.

I've created a few tickets in JIRA with very different levels of scale and complexity.

Ways to reduce the impact of independent processing:
- https://issues.apache.org/jira/browse/IGNITE-8019 - the aforementioned bug causing contention on checkpointReadLock;
- https://issues.apache.org/jira/browse/IGNITE-8018 - inefficiency in the GridCacheMapEntry implementation;
- https://issues.apache.org/jira/browse/IGNITE-8017 - automatically disable WAL during preloading.

Ways to solve the problem on a more global level:
- https://issues.apache.org/jira/browse/IGNITE-7935 - a ticket to introduce batch modification;
- https://issues.apache.org/jira/browse/IGNITE-8020 - a complete redesign of the rebalancing process for persistent caches, based on file transfer.

Everyone is welcome to criticize the above ideas, suggest new ones or participate in the implementation.
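(Editorial aside: the per-entry costs listed above can be contrasted with per-batch processing in a toy model. The counters below are hypothetical stand-ins for checkpointReadLock acquisitions and WAL fsyncs, not Ignite code; they only show how batching turns N lock acquisitions and N fsyncs per supply message into one of each.)

```java
import java.util.List;

/** Toy model contrasting per-entry vs. per-batch processing of a supply message. */
public class BatchingSketch {
    static int lockAcquisitions = 0;
    static int fsyncs = 0;

    static void checkpointReadLock()   { lockAcquisitions++; }
    static void checkpointReadUnlock() { }
    static void walFsync()             { fsyncs++; }

    /** Current behavior: every entry pays the full cost. */
    static void applyPerEntry(List<String> batch) {
        for (String entry : batch) {
            checkpointReadLock();
            walFsync();            // one WAL record + fsync per entry
            checkpointReadUnlock();
        }
    }

    /** Batched behavior: one lock acquisition and one fsync per batch. */
    static void applyPerBatch(List<String> batch) {
        checkpointReadLock();
        for (String entry : batch) {
            // write WAL records here, but defer the fsync
        }
        walFsync();                // single fsync covering the whole batch
        checkpointReadUnlock();
    }

    public static void main(String[] args) {
        List<String> batch = List.of("k1", "k2", "k3", "k4");

        applyPerEntry(batch);
        System.out.println(lockAcquisitions + " " + fsyncs); // 4 4

        lockAcquisitions = 0;
        fsyncs = 0;

        applyPerBatch(batch);
        System.out.println(lockAcquisitions + " " + fsyncs); // 1 1
    }
}
```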
--
Best regards,
Ilya