[
https://issues.apache.org/jira/browse/IGNITE-1189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651879#comment-14651879
]
Denis Magda commented on IGNITE-1189:
-------------------------------------
While debugging I found that there is no issue in the tests or in the
{{IgniteEx.stop()}} implementation.
The tests start failing with the error in the title because a preceding test,
{{IgniteCacheAtomicNodeRestartTest.testRestartWithPutTenNodesTwoBackups}},
fails with a timeout exception.
According to the logs, {{testRestartWithPutTenNodesTwoBackups}} hangs due to a
deadlock:
1) Thread_1 acquired the first lock, which protects {{GridCacheConcurrentMap}},
and is now blocked trying to acquire the second lock ({{GridCacheMapEntry}})
when calling {{GridCacheMapEntry.obsolete}};
2) Thread_2 is trying to acquire the lock that protects {{GridCacheConcurrentMap}}.
It seems that Thread_2 had earlier acquired the lock on the {{GridCacheMapEntry}}
instance that Thread_1 is trying to acquire, and now wants synchronized access
to {{GridCacheConcurrentMap}}. However, the dump contains no useful information
to confirm this or to point to another cause.
I will analyze the code flow to find the deadlock.
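For illustration, the suspected lock-order inversion can be reproduced in isolation. The sketch below is not Ignite code: {{mapLock}} and {{entryLock}} are hypothetical stand-ins for the locks guarding {{GridCacheConcurrentMap}} and {{GridCacheMapEntry}}, modeled as {{ReentrantLock}} instances so that the stuck threads can be interrupted afterwards. It uses the standard {{ThreadMXBean.findDeadlockedThreads()}} call to report which thread owns the lock the other one waits for, the kind of ownership information that would confirm or rule out the hypothesis above.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.ReentrantLock;

class LockOrderDeadlockDemo {
    /** Starts a thread that takes 'first', waits until its peer holds its own first lock, then tries 'second'. */
    static Thread worker(String name, ReentrantLock first, ReentrantLock second, CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            try {
                first.lockInterruptibly();
                try {
                    bothHoldFirst.countDown();
                    bothHoldFirst.await();      // both threads now hold their first lock
                    second.lockInterruptibly(); // blocks: the peer holds 'second'
                    second.unlock();
                } finally {
                    first.unlock();
                }
            } catch (InterruptedException ignored) {
                // Interrupted by the detector below to break the deadlock.
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    /** Provokes the deadlock, prints who waits for whom, and returns the number of deadlocked threads found. */
    static int provokeAndDetect() {
        // Hypothetical stand-ins for the two locks named in the comment.
        ReentrantLock mapLock = new ReentrantLock();
        ReentrantLock entryLock = new ReentrantLock();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Thread_1 locks map -> entry, Thread_2 locks entry -> map (inverted order).
        Thread t1 = worker("Thread_1", mapLock, entryLock, bothHoldFirst);
        Thread t2 = worker("Thread_2", entryLock, mapLock, bothHoldFirst);

        try {
            ThreadMXBean mx = ManagementFactory.getThreadMXBean();
            long[] ids = null;
            // Poll for up to ~5 seconds until the JVM reports a lock cycle.
            for (int i = 0; i < 100 && ids == null; i++) {
                Thread.sleep(50);
                ids = mx.findDeadlockedThreads();
            }
            int found = ids == null ? 0 : ids.length;
            if (ids != null) {
                for (ThreadInfo info : mx.getThreadInfo(ids, true, true)) {
                    if (info != null)
                        System.out.println(info.getThreadName() + " waits for " + info.getLockName()
                            + " held by " + info.getLockOwnerName());
                }
            }
            // Break the cycle so both threads can finish.
            t1.interrupt();
            t2.interrupt();
            t1.join();
            t2.join();
            return found;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + provokeAndDetect());
    }
}
```

If both code paths took the two locks in the same order (always map first, then entry), the cycle could not form. Whether such a reordering is feasible in the actual cache code still has to be checked against the code flow.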
> Ignite instance with this name has already been started
> -------------------------------------------------------
>
> Key: IGNITE-1189
> URL: https://issues.apache.org/jira/browse/IGNITE-1189
> Project: Ignite
> Issue Type: Bug
> Components: general
> Reporter: Denis Magda
> Assignee: Denis Magda
>
> I see this issue from time to time on TeamCity in different tests.
> In general, the stack trace looks like this one:
> {noformat}
> org.apache.ignite.IgniteCheckedException: Ignite instance with this name has already been started: distributed.IgniteCacheAtomicNodeRestartTest0
> at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:920)
> at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:477)
> at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:683)
> at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:667)
> at org.apache.ignite.testframework.junits.GridAbstractTest.startGrid(GridAbstractTest.java:644)
> at org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.startGrids(GridCacheAbstractNodeRestartSelfTest.java:158)
> at org.apache.ignite.internal.processors.cache.distributed.GridCacheAbstractNodeRestartSelfTest.testRestart(GridCacheAbstractNodeRestartSelfTest.java:177)
> at org.apache.ignite.internal.processors.cache.distributed.near.GridCachePartitionedNodeRestartTest.testRestart(GridCachePartitionedNodeRestartTest.java:58)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:601)
> at junit.framework.TestCase.runTest(TestCase.java:176)
> at org.apache.ignite.testframework.junits.GridAbstractTest.runTestInternal(GridAbstractTest.java:1624)
> at org.apache.ignite.testframework.junits.GridAbstractTest.access$000(GridAbstractTest.java:70)
> at org.apache.ignite.testframework.junits.GridAbstractTest$6.run(GridAbstractTest.java:1562)
> {noformat}
> I tried to reproduce it locally but failed. According to TC, the issue is
> reproducible with {{IgniteCacheAtomicNodeRestartTest}}.
> My understanding is that the bug appears because of:
> 1) An issue in the test;
> 2) {{IgniteEx.stop()}} not removing a grid from its internal map when the
> grid's state is not equal to {{STARTED}} during the stop.