Yakov, I think I fixed the remaining issues in the branch. There was one issue with the pending queue: my original ordering for messages was not correct. The other thing was the NodeAddFinished message processing that I consulted you about over Skype. TC looks green(ish); I cleaned up the code, merged it to the ignite-1171 (non-debug) branch, and triggered TC one more time.
It would be great if you guys could trigger TC a couple more times and monitor its state, because we changed what is, I guess, the most sensitive part of Ignite, but it feels like we're pretty close to getting this issue fixed :)

2015-09-22 9:43 GMT-07:00 Yakov Zhdanov <[email protected]>:

> Alex, I spent some time debugging this today.
>
> I noticed that we do not verify that topology version of the custom message
> is identical to current ring version. After I added this condition test
> started passing. However, it hangs from time to time since custom message
> gets discarded before it gets processed (the new condition works here)
> which means that topology version has somehow been changed, but custom
> message has not been processed yet by that time.
>
> My changes are in ignite-1171-debug. Can you please take a further look?
>
> --Yakov
>
> 2015-09-22 5:50 GMT+03:00 Alexey Goncharuk <[email protected]>:
>
> > Folks,
> >
> > I was debugging issues with discovery today, my findings are below:
> >
> > - Issue with assertion "topology version has not been updated" was
> >   caused by sending discard message for custom messages. Now since we
> >   re-arrange custom messages, discardId gets repositioned and messages
> >   that should have been discarded were not discarded.
> > - Fixed the issue above by introducing separate pending queue for custom
> >   messages which gets discarded independently from other discovery
> >   messages.
> > - Did not get to the bottom of "joining nodes" assertion. From the debug
> >   I see that coordinator always fires custom messages at the right
> >   moment, when joiningNodes is empty, however despite the fixed (above)
> >   issue with custom messages discard, processed custom messages get
> >   re-sent which leads to this assertion
> >
> > I committed my pending debug code to ignite-1171-debug branch, if any of
> > you guys is up to debugging this issue while I'm asleep - great, if not -
> > I'll continue digging into it tomorrow.
> >
> > 2015-09-21 10:55 GMT-07:00 Yakov Zhdanov <[email protected]>:
> >
> > > Igniters,
> > >
> > > We are not ready to release today.
> > >
> > > Alexey Goncharuk is still working on ignite-1171. Alex please provide
> > > updates by the end of the day.
> > >
> > > https://issues.apache.org/jira/browse/IGNITE-1516 - performance offheap
> > > query benchmark is not fully recovered. Semyon will be fixing it. Sergi,
> > > can you please assist?
> > >
> > > https://issues.apache.org/jira/browse/IGNITE-973 - Semyon has fixed race
> > > in cache logic, but issue is still reproducible due to possible issues in
> > > indexing logic. Sergi, this is on you. Can you please take a look?
> > >
> > > --Yakov
