Re: Fixed deadlock in GridDhtAtomicCache (Alex G. your review is needed)

Denis Magda Mon, 10 Aug 2015 05:18:04 -0700

Andrey Gura,

Could you put the info on the errors you observed with cache read operations in the ticket below?


--
Denis

On 8/10/2015 3:13 PM, Denis Magda wrote:

What do you mean by cleanup on a higher level?
Do you mean setting all cache context references to null when required, letting the garbage collector deallocate the context's internals when the time comes?
In any case, I've created a ticket where we can put all the useful thoughts/ideas that should help an implementor.
https://issues.apache.org/jira/browse/IGNITE-1221

--
Denis

On 8/5/2015 5:48 PM, Yakov Zhdanov wrote:
Guys, what about not invalidating cache contexts on stop? Let's clean up at a
higher level.

--Yakov

2015-08-04 22:48 GMT+03:00 Denis Magda <[email protected]>:
Alex, thanks for the review!

Sure, this is just a local fix.
Recently I've detected and fixed several issues in the TCP communication SPI
that happened because of an invalidated cache context. In addition, Andrey
Gura mentioned that he periodically reproduces hangs in cache get
operations that most likely happen because of an invalidated cache context
as well.
It seems that it's time to fix the situation with invalidated cache contexts
globally. I'll create a task in JIRA with extensive details in several days,
when I return from a short vacation. Then someone from the community
or I will have a chance to get his/her hands dirty with this :)
As for this deadlock, I'll merge the changes in any case because we need to
have them in the code to avoid other RuntimeExceptions that may happen
for any other reason. The threads that led to the deadlock were
threads from the partition supply pool or some internal worker pool.

Regards,
Denis
On Aug 4, 2015, at 22:09, Alexey Goncharuk <[email protected]>
wrote:
The change by itself looks right and can be merged, however I do not think
this is a complete fix. What kind of running threads were using the invalidated
cache context? These threads may raise plenty of other exceptions if an
invalid context is used. I think the proper solution should block a guard
(I am sure we already have a guard that we can reuse) and wait for all
threads to release this guard before cleaning up the context.
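
A minimal sketch of the guard idea described above, not the existing guard mentioned in the message. The class name ContextGuard and its methods enter(), leave() and stop() are hypothetical and are not taken from the Ignite code base. Cache operations hold a read lock while they use the context; the stopping thread takes the write lock, which waits until every operation has released the guard and only then runs the cleanup.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical guard: operations enter under the read lock, stop() takes the write lock. */
final class ContextGuard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    /** Set once by stop(); read and written only while holding the lock. */
    private boolean stopped;

    /** Returns false if the context is already stopped; on true the caller must call leave(). */
    boolean enter() {
        lock.readLock().lock();
        if (stopped) {
            lock.readLock().unlock();
            return false;
        }
        return true;
    }

    /** Releases the guard taken by a successful enter(). */
    void leave() {
        lock.readLock().unlock();
    }

    /** Waits for all threads inside the guard to leave, then runs the cleanup at most once. */
    void stop(Runnable cleanup) {
        lock.writeLock().lock();
        try {
            if (stopped)
                return;
            stopped = true;
            cleanup.run();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

An operation would call enter(), bail out if it returns false, and otherwise do its work in a try block with leave() in the finally clause. A thread must not call stop() while it is itself inside the guard, since the read lock cannot be upgraded to the write lock.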

2015-08-04 8:28 GMT-07:00 Denis Magda <[email protected]>:

Hi Alex, Igniters,
I've fixed a deadlock in GridDhtAtomicCache that was a cause of the frequent
hanging of the "Cache Restart" test suite.

In short, the deadlock happened because a cache was already stopped but
some running threads that perform cache-related operations kept using the
invalidated GridCacheContext.
All the details are described here:
https://issues.apache.org/jira/browse/IGNITE-1189

Alex, as one of the earlier implementers of this code, please review the
changes.

Regards,
Denis
