Guys, what about not invalidating cache contexts on stop? Let's clean up at a higher level.
--Yakov

2015-08-04 22:48 GMT+03:00 Denis Magda <[email protected]>:

> Alex, thanks for the review!
>
> Sure, this is just a local fix.
> Recently I've detected and fixed several issues in the TCP communication SPI
> that happened because of an invalidated cache context. In addition, Andrey
> Gura mentioned that he periodically reproduces hangs in cache get
> operations that most likely happen because of an invalidated cache context
> as well.
>
> It seems that it's time to fix the situation with invalidated cache contexts
> globally. I'll create a task in JIRA in several days, when I return from a
> short vacation, and put extensive details there. Then someone from the
> community or I will have a chance to get his/her hands dirty with this :)
>
> As for this deadlock, I'll merge the changes in any case because we need to
> have them in the code to avoid other RuntimeExceptions that may happen
> for any other reason. The threads that led to the deadlock were
> threads from the partition supply pool or some internal worker pool.
>
> Regards,
> Denis
>
> On Aug 4, 2015, at 22:09, Alexey Goncharuk <[email protected]>
> wrote:
>
> The change by itself looks right and can be merged, however I do not think
> this is a complete fix. What kind of running threads were using the invalidated
> cache context? These threads may raise plenty of other exceptions if
> an invalid context is used. I think the proper solution should block a guard
> (I am sure we already have a guard that we can reuse) and wait for all
> threads to release this guard before cleaning up the context.
>
> 2015-08-04 8:28 GMT-07:00 Denis Magda <[email protected]>:
>
> Hi Alex, Igniters,
>
> I've fixed a deadlock in GridDhtAtomicCache that was a cause of frequent
> hangs of the "Cache Restart" test suite.
>
> In short, the deadlock happened because a cache was already stopped but
> some running threads that perform cache-related operations kept using
> the invalidated GridCacheContext.
>
> All the details are described here:
> https://issues.apache.org/jira/browse/IGNITE-1189
>
> Alex, as one of the earlier implementers of this code, please review the
> changes.
>
> Regards,
> Denis
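
The guard approach Alexey describes (block the guard, wait for all threads to release it, then clean up the context) could look roughly like the sketch below. It is only an illustration under assumptions: `StopGuard`, `enter`, `leave`, and `stop` are hypothetical names, and the sketch is built on the JDK's `ReentrantReadWriteLock` rather than on whatever guard Ignite already has.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stop guard: cache operations enter/leave, stop waits for them. */
final class StopGuard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Set under the write lock, read under the read lock. */
    private boolean stopped;

    /**
     * Called by a worker before it touches the cache context.
     * Returns false if the cache is already stopped; in that case the
     * guard is NOT held and the caller must not use the context.
     */
    boolean enter() {
        lock.readLock().lock();
        if (stopped) {
            lock.readLock().unlock();
            return false;
        }
        return true;
    }

    /** Called by a worker after a successful enter(), typically in finally. */
    void leave() {
        lock.readLock().unlock();
    }

    /**
     * Blocks until every thread that entered has left, then runs the
     * cleanup exactly once. Later enter() calls return false.
     */
    void stop(Runnable cleanup) {
        lock.writeLock().lock();
        try {
            if (stopped)
                return;
            stopped = true;
            cleanup.run(); // e.g. invalidate the context here
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

A worker would wrap each operation as `if (guard.enter()) { try { ... } finally { guard.leave(); } }`. One caveat of this particular sketch: a thread that currently holds the guard must not call `stop` itself, because a read lock cannot be upgraded to the write lock and the call would block forever.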
