Igniters,

While working on ignite-1171 we discovered a couple more issues in discovery that might have threatened custom event processing under some circumstances (we have continuous processes based on this logic, for example).
Alexey Goncharuk has picked this up.

Another critical issue discovered today - https://issues.apache.org/jira/browse/IGNITE-1516 - performance drop in offheap query benchmark. Semyon will be fixing it.

https://issues.apache.org/jira/browse/IGNITE-973 - Sergi has come to the conclusion that a race is still present in the cache offheap swap logic. Currently this is assigned to Semyon, too.

We need to postpone the release till the very beginning of next week.

--Yakov

2015-09-18 12:01 GMT+03:00 Yakov Zhdanov <[email protected]>:

> Alex, I think that your approach with delaying custom messages will work.
> As for coordinator crash protection, we guarantee delivery of certain
> message types (including custom messages). This logic was implemented long
> ago and seems to work. So, the message just gets resent.
>
> Semyon, can you please take a look at Alex's changes?
>
> --Yakov
>
> 2015-09-18 3:24 GMT+03:00 Alexey Goncharuk <[email protected]>:
>
>> Yakov,
>>
>> The approach with collecting discovery data on NodeAddFinished message does
>> not work because this message gets relayed to clients before the message
>> passes the whole ring. If we make it pass the ring and relay it to
>> clients on the second round, we get the same race as I was fixing.
>>
>> I think the correct approach here is to delay custom event messages when
>> node join is in progress - basically do not allow custom messages between
>> NodeAddedMessage and NodeAddFinished message. I implemented a very simple
>> fix in ignite-1171, however I need you or someone else with good expertise in
>> discovery protocol to take a look at my changes because I am sure I missed
>> something - e.g. I am not sure how delayed messages should be handled in
>> case the coordinator node crashes.
>>
>> 2015-09-17 8:52 GMT-07:00 Yakov Zhdanov <[email protected]>:
>>
>> > Alex, I think it makes sense to continue investigating this. We can discuss
>> > whether we include or skip the fix once the fix is ready.
>> >
>> > As far as other tickets:
>> >
>> > https://issues.apache.org/jira/issues/?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%20ignite-1.4%20AND%20status%20!%3D%20closed%20ORDER%20BY%20assignee%20ASC%2C%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC
>> >
>> > IGNITE-1171 Getting affinity for topology version earlier than affinity is
>> > calculated - is on Alex Goncharuk.
>> > IGNITE-973 Failed to get value for key: 13791. at
>> > o.a.i.i.processors.query.h2.opt.GridH2AbstractKeyValueRow.getValue(GridH2AbstractKeyValueRow.java:223)
>> > - assigned to Sergi. There seems to be a problem in offheap indexing which
>> > can be reproduced from time to time. This is an old issue and I think it can
>> > be postponed if it does not fit.
>> >
>> > +1 IGFS issue
>> > and the rest of the vert.x issues
>> >
>> > I hope IGNITE-1171 will be fixed today so the picture becomes much clearer.
>> >
>> > --
>> > Yakov Zhdanov, Director R&D
>> > *GridGain Systems*
>> > www.gridgain.com
>> >
>> > 2015-09-17 0:59 GMT+03:00 Alexey Goncharuk <[email protected]>:
>> >
>> > > Yakov, Igniters,
>> > >
>> > > I have found at least one issue related to the ignite-1171 hang; it is caused
>> > > by a race between discovery custom message and collectDiscoveryData() call
>> > > (updated the ticket). I remember we wanted to call collectDiscoveryData()
>> > > during the NodeAddFinishedMessage processing, however it was not
>> > > implemented - do we think that this is a correct change and do we want it
>> > > to be fixed in 1.4? Discovery changes are quite sensitive and I would
>> > > prefer them to be tested thoroughly.
>> > >
>> > > 2015-09-16 9:09 GMT-07:00 Yakov Zhdanov <[email protected]>:
>> > >
>> > > > Guys,
>> > > >
>> > > > I want to update release status.
>> > > >
>> > > > Testing has revealed some cache issues which should be fixed with the
>> > > > release. Moreover, it turned out that these issues block vert.x release.
>> > > > So, if we fix them we can consider including vert.x into 1.4 release. Which
>> > > > is good I think.
>> > > >
>> > > > I think that Alex Goncharuk is the best person who can look into vert.x
>> > > > issues. Alex, please first of all pay attention to IGNITE-1171 - Getting
>> > > > affinity for topology version earlier than affinity is calculated - Test
>> > > > reproducing the issue has been added to ignite1.4. Alex please let us know
>> > > > if this can be fixed.
>> > > >
>> > > > These issues are on Semyon Boikov:
>> > > >
>> > > > IGNITE-973 Failed to get value for key: 13791. at
>> > > > o.a.i.i.processors.query.h2.opt.GridH2AbstractKeyValueRow.getValue(GridH2AbstractKeyValueRow.java:223)
>> > > > - We need more time to finish with this. Some race in swap is still there.
>> > > > IGNITE-1452 OptimizedMarshaller.unmarshal hangs in
>> > > > IgniteCacheQueryNodeRestartSelfTest2 - Need to check TC and merge.
>> > > >
>> > > > Rest of tickets are vert.x related. Here is the link -
>> > > >
>> > > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%20ignite-1.4%20AND%20status%20!%3D%20closed%20ORDER%20BY%20assignee%20ASC%2C%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC
>> > > >
>> > > > Andrey Gura, please provide as much information as you can for the rest of
>> > > > vert.x tickets.
>> > > >
>> > > > Thanks!
>> > > >
>> > > > --Yakov
>> > > >
>> > > > 2015-09-15 19:12 GMT+03:00 Yakov Zhdanov <[email protected]>:
>> > > >
>> > > > > Raul, how is your status with the streamer? I think there is no reason for
>> > > > > rush. We can put it to 1.5. Please let me know what you think.
>> > > > >
>> > > > > As far as release status here are the open tickets -
>> > > > >
>> > > > > https://issues.apache.org/jira/issues/?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%20ignite-1.4%20AND%20status%20!%3D%20closed%20ORDER%20BY%20assignee%20ASC%2C%20due%20ASC%2C%20priority%20DESC%2C%20created%20ASC
>> > > > >
>> > > > > https://issues.apache.org/jira/browse/IGNITE-1239 - Alex Goncharuk, can
>> > > > > you please let us know if this will be finished today?
>> > > > > https://issues.apache.org/jira/browse/IGNITE-1490 - Ilya Suntsov works on
>> > > > > reproducing this. I suspect we may have problems with near cache evictions.
>> > > > > Can Val or Alex proceed with this after Ilya finishes test run? Ilya,
>> > > > > please respond in ticket upon your results.
>> > > > >
>> > > > > Thanks!
>> > > > >
>> > > > > --Yakov
>> > > > >
>> > > > > 2015-09-15 11:15 GMT+03:00 Raul Kripalani <[email protected]>:
>> > > > >
>> > > > >> Hi guys,
>> > > > >>
>> > > > >> The MQTT streamer I'm working on will be ready this week. Hopefully as
>> > > > >> soon as today or tomorrow.
>> > > > >>
>> > > > >> It's not important for the 1.4 release, but it seems like it'll make the
>> > > > >> timeline to get potentially merged.
>> > > > >>
>> > > > >> Regards,
>> > > > >> Raúl.
>> > > > >> On 15 Sep 2015 00:05, "Yakov Zhdanov" <[email protected]> wrote:
>> > > > >>
>> > > > >> > Guys,
>> > > > >> >
>> > > > >> > Current status is the following:
>> > > > >> >
>> > > > >> > 1. Sam needs to merge his fixes after TC is finished.
>> > > > >> > 2. Some minor changes pending from Denis + release notes fix pointed by
>> > > > >> > Dmitry.
>> > > > >> > 3. Several suites are still red on TC
>> > > > >> >
>> > > > >> > I have moved plenty of tickets to ignite-1.5. Here is the link to currently
>> > > > >> > open tickets that I want everyone (esp. assignees) to look through and tell
>> > > > >> > me whether ticket can be moved or should be fixed -
>> > > > >> >
>> > > > >> > https://issues.apache.org/jira/issues/?jql=project%20%3D%20IGNITE%20AND%20fixVersion%20%3D%20ignite-1.4%20AND%20status%20!%3D%20closed%20ORDER%20BY%20due%20ASC%2C%20priority%20DESC
>> > > > >> >
>> > > > >> > Alex Goncharuk has 5 tickets.
>> > > > >> > Semyon Boikov has 5 tickets.
>> > > > >> > Valentin has 4
>> > > > >> > Sergi has 4
>> > > > >> > Vladimir has 3
>> > > > >> > Ivan V. has 3
>> > > > >> >
>> > > > >> > Guys, please look your tickets through and let us know your decision.
>> > > > >> >
>> > > > >> > --Yakov
>> > > > >> >
>> > > > >> > 2015-09-14 21:04 GMT+03:00 Dmitriy Setrakyan <[email protected]>:
>> > > > >> >
>> > > > >> > > Yakov,
>> > > > >> > >
>> > > > >> > > I know you were managing the 1.4 release. Can you please provide an update
>> > > > >> > > of what goes into the release at this point and what is the overall plan?
>> > > > >> > >
>> > > > >> > > Thanks,
>> > > > >> > > D.
