Maxim,

Let's create a branch with 10 checks of Sync and 10 checks of Async.
Then, run it 20 times at TC.
This should be enough, I think.

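For reference, the run budget behind this proposal can be sketched with the figures quoted further down in the thread (about 7 minutes per test-class execution and a 210-minute suite timeout). This is only a rough sketch: the class and method names are made up for illustration, and it assumes the Sync and Async test classes both take about 7 minutes and that builds run one after another.

```java
// Rough run-budget arithmetic for repeated test-class executions on TC.
// RunBudget and its methods are hypothetical helpers, not part of Ignite.
class RunBudget {
    // Figures quoted in the thread.
    static final int MINUTES_PER_EXECUTION = 7;   // ~7 min per test-class execution
    static final int SUITE_TIMEOUT_MINUTES = 210; // CacheSuite8 total execution timeout

    // How many class executions fit into one suite build before the timeout.
    static int maxExecutionsPerBuild() {
        return SUITE_TIMEOUT_MINUTES / MINUTES_PER_EXECUTION;
    }

    // Total class executions for a given number of queued builds.
    static int totalExecutions(int executionsPerBuild, int builds) {
        return executionsPerBuild * builds;
    }

    // Total TC time in minutes, assuming builds run sequentially.
    static int totalMinutes(int executionsPerBuild, int builds) {
        return totalExecutions(executionsPerBuild, builds) * MINUTES_PER_EXECUTION;
    }

    static double toDays(int minutes) {
        return minutes / (60.0 * 24.0);
    }

    public static void main(String[] args) {
        int perBuild = 10 + 10; // 10 checks of Sync + 10 checks of Async
        int builds = 20;        // "run it 20 times at TC"

        System.out.println("Max executions per build: " + maxExecutionsPerBuild());
        System.out.println("Minutes per build: " + perBuild * MINUTES_PER_EXECUTION);
        System.out.println("Total executions: " + totalExecutions(perBuild, builds));
        System.out.println("Total minutes: " + totalMinutes(perBuild, builds));
        System.out.printf("Total days: %.2f%n", toDays(totalMinutes(perBuild, builds)));
    }
}
```

Under these assumptions one build holds 20 executions (about 140 minutes, within the 210-minute timeout), and 20 builds give 400 executions in total, 200 per test class, for roughly 2 days of TC time. The same arithmetic with 20 executions per build and 50 builds gives the 1000 executions and ~7000 minutes mentioned below.
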
Tue, 4 Sep 2018 at 13:09, Maxim Muzafarov <maxmu...@gmail.com>:

> Anton,
>
> I agree with you, 20 times is not enough. I've checked a single run of the
> test class: it takes ~7 min per execution.
> The CacheSuite8 total execution timeout is 210 min, so we can perform only
> 30 class executions in this suite. Our strategy here is to run it 20 times
> within a single suite run and put 50 such runs into the TC queue.
> Total: ~7000 min, or about 5 days.
>
> I'm not sure that we should perform exactly 1000 executions; hopefully, we
> will stop adding new tasks to the queue at some point.
>
> On Tue, 4 Sep 2018 at 12:59 Anton Vinogradov <a...@apache.org> wrote:
>
> > Maxim,
> > 20 is not 1k :)
> > Also, you forgot to check GridCacheRebalancingAsyncSelfTest
> >
> > I'm not sure we should have exactly 1k runs, but 20 is definitely not
> > enough.
> >
> > Roman,
> > I propose to use the IDEA "run until failure" feature and perform the
> > test locally (on your PC) while you're not using the PC.
> >
> > Tue, 4 Sep 2018 at 12:51, Maxim Muzafarov <maxmu...@gmail.com>:
> >
> > > Roman, Anton,
> > >
> > > I've already created an additional PR [2] and run it on TC [1].
> > > Please follow up with the results.
> > >
> > > [1]
> > > https://ci.ignite.apache.org/viewType.html?buildTypeId=IgniteTests24Java8_Cache8&tab=buildTypeStatusDiv&branch_IgniteTests24Java8=pull%2F4676%2Fhead
> > > [2] https://github.com/apache/ignite/pull/4676/files
> > >
> > > On Tue, 4 Sep 2018 at 12:46 Roman Shtykh <rsht...@yahoo.com.invalid>
> > > wrote:
> > >
> > > > Anton,
> > > > Thank you. I would like to recheck it. How can this (1_000 runs) be
> > > > done in TC?
> > > >
> > > > On Tuesday, September 4, 2018, 5:42:01 p.m. GMT+9, Anton Vinogradov
> > > > <a...@apache.org> wrote:
> > > >
> > > > Roman,
> > > >
> > > > I see you uncommented this line.
> > > > I do not remember the deadlock details, but I remember it was an
> > > > extremely rare case.
> > > > I found and "fixed" it some days before the merge, when I had a
> > > > 24x7 sanity check week :)
> > > >
> > > > So, I propose to have at least 1_000 runs of this test before
> > > > keeping this uncommented.
> > > >
> > > > Tue, 21 Aug 2018 at 11:08, Maxim Muzafarov <maxmu...@gmail.com>:
> > > >
> > > > > Roman,
> > > > >
> > > > > I worked recently on rebalance improvements and haven't found any
> > > > > problems with delayed cache rebalance.
> > > > > Agree with you: let's uncomment this and remove the scary comment.
> > > > > Will you create a ticket for it?
> > > > >
> > > > > In case of any problems we can easily detect a deadlock with the
> > > > > newly configured `FailureHandler`.
> > > > >
> > > > > On Tue, 21 Aug 2018 at 03:49 Roman Shtykh <rsht...@yahoo.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Maxim,
> > > > > >
> > > > > > I have some issues with a cluster with rebalance delay enabled,
> > > > > > but need to check more -- if I find it's related I'll share.
> > > > > > Just wanted to make sure it's not an issue anymore from someone
> > > > > > working on rebalancing. We should remove that comment then, it
> > > > > > looks scary :)
> > > > > >
> > > > > > --
> > > > > > Roman Shtykh
> > > > > >
> > > > > > On Tuesday, August 21, 2018, 12:49:00 a.m. GMT+9, Maxim
> > > > > > Muzafarov <maxmu...@gmail.com> wrote:
> > > > > >
> > > > > > Hello Roman,
> > > > > >
> > > > > > Did you face a real issue with delayed rebalance, or is it just
> > > > > > for your personal interest?
> > > > > > If yes, please share details and we will try to help you.
> > > > > >
> > > > > > As for this comment, I don't think it is still relevant. That
> > > > > > change was made in 2015. Much has changed within the rebalance
> > > > > > process since that time.
> > > > > > I've uncommented it and rechecked with that cache configuration
> > > > > > and haven't seen any failed tests or issues.
> > > > > >
> > > > > > Probably, that problem was that a cache in SYNC mode does not
> > > > > > start until it loads all data from other nodes. But currently
> > > > > > delayed rebalance works the same way as IgniteCache#rebalance(),
> > > > > > so you can `setRebalanceDelay` to `-1` and call it manually to
> > > > > > check.
> > > > > >
> > > > > > On Mon, 20 Aug 2018 at 11:19 Roman Shtykh
> > > > > > <rsht...@yahoo.com.invalid> wrote:
> > > > > >
> > > > > > Igniters,
> > > > > > I have found the "Known issue, possible deadlock in case of low
> > > > > > priority cache rebalancing delayed" comment in
> > > > > > GridCacheRebalancingSyncSelfTest#getConfiguration. Can you
> > > > > > please explain when using rebalance delay can be an issue and
> > > > > > why?
> > > > > >
> > > > > > -- Roman
> > > > > >
> > > > > > --
> > > > > > --
> > > > > > Maxim Muzafarov
> > > > > >
> > > > > --
> > > > > --
> > > > > Maxim Muzafarov
> > > > >
> > > --
> > > --
> > > Maxim Muzafarov
> > >
> --
> --
> Maxim Muzafarov
>
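
The manual check suggested in the thread (set `setRebalanceDelay` to `-1` and call `IgniteCache#rebalance()` by hand) could look roughly like the sketch below. It is only an illustration, not the configuration used in GridCacheRebalancingSyncSelfTest: the cache name `demoCache`, the class name, and the single local node are assumptions, and it needs `ignite-core` (a 2.x version contemporary with this thread) on the classpath.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CacheRebalanceMode;
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

// Hypothetical example class; the cache name and key/value types are illustrative.
public class ManualRebalanceCheck {
    public static void main(String[] args) {
        CacheConfiguration<Integer, String> cacheCfg = new CacheConfiguration<>("demoCache");

        // SYNC mode: the mode discussed in the thread.
        cacheCfg.setRebalanceMode(CacheRebalanceMode.SYNC);

        // -1 disables automatic rebalancing; it has to be triggered manually.
        cacheCfg.setRebalanceDelay(-1);

        IgniteConfiguration cfg = new IgniteConfiguration().setCacheConfiguration(cacheCfg);

        try (Ignite ignite = Ignition.start(cfg)) {
            IgniteCache<Integer, String> cache = ignite.cache("demoCache");

            // Trigger rebalancing by hand and wait for the returned future.
            cache.rebalance().get();

            System.out.println("Manual rebalance finished for " + cache.getName());
        }
    }
}
```

On a single node there is nothing to rebalance, so the call returns quickly; reproducing the scenario from the thread would require starting additional nodes with the same cache configuration before calling `rebalance()`.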