Hi,

I have configured my cache as an off-heap partitioned cache and am running
3 nodes on separate machines. I loaded some data into the cache through my
application's normal operations.

I then used "kill -9 <pid>" to kill node 3.

Node 2 prints the following warning on the console every 10 seconds:

11:03:03,320 WARNING
[org.apache.ignite.internal.processors.cache.GridCachePartitionExchangeManager]
(exchange-worker-#256%TESTNODE%) Failed to wait for partition map exchange
[topVer=AffinityTopologyVersion [topVer=3, minorTopVer=0],
node=8cc0ac24-24b9-4d69-8472-b6a567f4d907]. Dumping pending objects that
might be the cause:

Node 1 looks fine. However, the application no longer works, and a thread
dump shows it is waiting on a cache put:

java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000007ecbd4a38> (a org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache$AffinityReadyFuture)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:159)
        at org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:117)
        at org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.awaitTopologyVersion(GridAffinityAssignmentCache.java:523)
        at org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.cachedAffinity(GridAffinityAssignmentCache.java:434)
        at org.apache.ignite.internal.processors.affinity.GridAffinityAssignmentCache.nodes(GridAffinityAssignmentCache.java:387)
        at org.apache.ignite.internal.processors.cache.GridCacheAffinityManager.nodes(GridCacheAffinityManager.java:259)
        at org.apache.ignite.internal.processors.cache.GridCacheAffinityManager.primary(GridCacheAffinityManager.java:295)
        at org.apache.ignite.internal.processors.cache.GridCacheAffinityManager.primary(GridCacheAffinityManager.java:286)
        at org.apache.ignite.internal.processors.cache.GridCacheAffinityManager.primary(GridCacheAffinityManager.java:310)
        at org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.entryExx(GridDhtColocatedCache.java:176)
        at org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.entryEx(GridNearTxLocal.java:1251)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.enlistWriteEntry(IgniteTxLocalAdapter.java:2354)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.enlistWrite(IgniteTxLocalAdapter.java:1990)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.putAsync0(IgniteTxLocalAdapter.java:2902)
        at org.apache.ignite.internal.processors.cache.transactions.IgniteTxLocalAdapter.putAsync(IgniteTxLocalAdapter.java:1859)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2240)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter$22.op(GridCacheAdapter.java:2238)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.syncOp(GridCacheAdapter.java:4351)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2238)
        at org.apache.ignite.internal.processors.cache.GridCacheAdapter.put(GridCacheAdapter.java:2215)
        at org.apache.ignite.internal.processors.cache.IgniteCacheProxy.put(IgniteCacheProxy.java:1214)
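
As a stopgap on the application side, a put that parks indefinitely like the
one in the dump could be bounded with a timeout. The following is only a rough
sketch using plain java.util.concurrent, not Ignite API: the class name
BoundedCall, the method runWithTimeout, and the latch used as a stand-in for
the blocked cache.put(...) are all hypothetical. Whether Ignite leaves the
transaction in a clean state after the helper thread is interrupted is not
verified here, and this does not address the stuck partition map exchange
itself.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs a possibly blocking call with an upper time bound.
final class BoundedCall {
    private BoundedCall() {}

    /** Returns true if the task finished within timeoutMs, false otherwise. */
    static boolean runWithTimeout(Runnable task, long timeoutMs) {
        // Daemon thread so a call that never returns cannot keep the JVM alive.
        ExecutorService exec = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-call");
            t.setDaemon(true);
            return t;
        });
        Future<?> future = exec.submit(task);
        try {
            future.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            future.cancel(true); // interrupts the helper thread
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt flag
            future.cancel(true);
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("Task failed", e.getCause());
        } finally {
            exec.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Stand-in for a cache.put(...) that never returns.
        CountDownLatch never = new CountDownLatch(1);
        boolean completed = runWithTimeout(() -> {
            try {
                never.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, 200);
        System.out.println("completed=" + completed); // prints completed=false
    }
}
```

In the application, the lambda would wrap the real put, and a false result
could be logged or used to fall back to the database directly.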


Is there any specific configuration I need to provide so that the cluster
recovers on its own? Losing cache data is fine; the data is backed up in a
persistent store (for example, a database).

-Sam



--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Cluster-hung-after-a-node-killed-tp8965.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
