[jira] [Commented] (IGNITE-7618) validateCache synchronously waits for state change

2018-09-29 Thread Dmitriy Pavlov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16632901#comment-16632901
 ] 

Dmitriy Pavlov commented on IGNITE-7618:


Folks, let me remind that MTCGA was not handled
[MTCGA]: new failures in builds [1871897] needs to be handled
https://lists.apache.org/thread.html/16c8718047b6514f41c5987b8571a8fb64d20893ad5803dfe23d2ab7@%3Cdev.ignite.apache.org%3E

the test is still fail

> validateCache synchronously waits for state change
> --
>
> Key: IGNITE-7618
> URL: https://issues.apache.org/jira/browse/IGNITE-7618
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Alexey Goncharuk
>Assignee: Ivan Bessonov
>Priority: Major
> Fix For: 2.7
>
>
> Currently, we call publicApiActiveState(waitForTransition=true) in 
> GridDhtTopologyFutureAdapter#validateCache. This is incorrect, because the 
> cluster state is updated from discovery thread, and mapping may be happening 
> on a previous topology version. We must capture the cluster active state flag 
> to the exchange future (I believe we already have all the mechanics for this 
> in ExchangeActions class) and validate the state in the exchange future 
> itself, without touching ClusterStateProcessor.
> Below is an example of a deadlock happened because of synchronous wait:
> {code}
> "db-checkpoint-thread-#12318%database.IgniteDbBaselineTopologySelfTest0%" 
> #15498 prio=5 os_prio=0 tid=0x7fa173e80800 nid=0x3d08 waiting on 
> condition [0x7fa2069a8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007abfc16e8> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:2991)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2797)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2722)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> "snapshot-scheduler-restats-#12202%database.IgniteDbBaselineTopologySelfTest0%"
>  #15332 prio=5 os_prio=0 tid=0x7fa228143000 nid=0x3c65 waiting on 
> condition [0x7fa2be919000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:194)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:787)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:763)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:646)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.GridDistributedCacheAdapter.txLockAsync(GridDistributedCacheAdapter.java:109)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.getAllAsync(GridNearTxLocal.java:1723)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache$4.op(GridDhtColocatedCache.java:197)
>   at 
> 

[jira] [Commented] (IGNITE-7618) validateCache synchronously waits for state change

2018-09-17 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617261#comment-16617261
 ] 

ASF GitHub Bot commented on IGNITE-7618:


Github user SomeFire closed the pull request at:

https://github.com/apache/ignite/pull/4095


> validateCache synchronously waits for state change
> --
>
> Key: IGNITE-7618
> URL: https://issues.apache.org/jira/browse/IGNITE-7618
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Alexey Goncharuk
>Assignee: Ivan Bessonov
>Priority: Major
> Fix For: 2.7
>
>
> Currently, we call publicApiActiveState(waitForTransition=true) in 
> GridDhtTopologyFutureAdapter#validateCache. This is incorrect, because the 
> cluster state is updated from discovery thread, and mapping may be happening 
> on a previous topology version. We must capture the cluster active state flag 
> to the exchange future (I believe we already have all the mechanics for this 
> in ExchangeActions class) and validate the state in the exchange future 
> itself, without touching ClusterStateProcessor.
> Below is an example of a deadlock happened because of synchronous wait:
> {code}
> "db-checkpoint-thread-#12318%database.IgniteDbBaselineTopologySelfTest0%" 
> #15498 prio=5 os_prio=0 tid=0x7fa173e80800 nid=0x3d08 waiting on 
> condition [0x7fa2069a8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007abfc16e8> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:2991)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2797)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2722)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> "snapshot-scheduler-restats-#12202%database.IgniteDbBaselineTopologySelfTest0%"
>  #15332 prio=5 os_prio=0 tid=0x7fa228143000 nid=0x3c65 waiting on 
> condition [0x7fa2be919000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:194)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:787)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:763)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:646)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.GridDistributedCacheAdapter.txLockAsync(GridDistributedCacheAdapter.java:109)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.getAllAsync(GridNearTxLocal.java:1723)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache$4.op(GridDhtColocatedCache.java:197)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheAdapter$AsyncOp.op(GridCacheAdapter.java:5140)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheAdapter.asyncOp(GridCacheAdapter.java:4275)
>   at 
> 

[jira] [Commented] (IGNITE-7618) validateCache synchronously waits for state change

2018-09-14 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16615410#comment-16615410
 ] 

ASF GitHub Bot commented on IGNITE-7618:


Github user asfgit closed the pull request at:

https://github.com/apache/ignite/pull/4733


> validateCache synchronously waits for state change
> --
>
> Key: IGNITE-7618
> URL: https://issues.apache.org/jira/browse/IGNITE-7618
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Alexey Goncharuk
>Assignee: Ivan Bessonov
>Priority: Major
> Fix For: 2.8
>
>
> Currently, we call publicApiActiveState(waitForTransition=true) in 
> GridDhtTopologyFutureAdapter#validateCache. This is incorrect, because the 
> cluster state is updated from discovery thread, and mapping may be happening 
> on a previous topology version. We must capture the cluster active state flag 
> to the exchange future (I believe we already have all the mechanics for this 
> in ExchangeActions class) and validate the state in the exchange future 
> itself, without touching ClusterStateProcessor.
> Below is an example of a deadlock happened because of synchronous wait:
> {code}
> "db-checkpoint-thread-#12318%database.IgniteDbBaselineTopologySelfTest0%" 
> #15498 prio=5 os_prio=0 tid=0x7fa173e80800 nid=0x3d08 waiting on 
> condition [0x7fa2069a8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007abfc16e8> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:2991)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2797)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2722)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> "snapshot-scheduler-restats-#12202%database.IgniteDbBaselineTopologySelfTest0%"
>  #15332 prio=5 os_prio=0 tid=0x7fa228143000 nid=0x3c65 waiting on 
> condition [0x7fa2be919000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:194)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:787)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:763)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:646)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.GridDistributedCacheAdapter.txLockAsync(GridDistributedCacheAdapter.java:109)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.getAllAsync(GridNearTxLocal.java:1723)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache$4.op(GridDhtColocatedCache.java:197)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheAdapter$AsyncOp.op(GridCacheAdapter.java:5140)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheAdapter.asyncOp(GridCacheAdapter.java:4275)
>   at 
> 

[jira] [Commented] (IGNITE-7618) validateCache synchronously waits for state change

2018-09-14 Thread Dmitriy Govorukhin (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16615409#comment-16615409
 ] 

Dmitriy Govorukhin commented on IGNITE-7618:


[~ibessonov][~Jokser] Thanks for your work, changes are merged into master.

> validateCache synchronously waits for state change
> --
>
> Key: IGNITE-7618
> URL: https://issues.apache.org/jira/browse/IGNITE-7618
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Alexey Goncharuk
>Assignee: Ivan Bessonov
>Priority: Major
> Fix For: 2.8
>
>
> Currently, we call publicApiActiveState(waitForTransition=true) in 
> GridDhtTopologyFutureAdapter#validateCache. This is incorrect, because the 
> cluster state is updated from discovery thread, and mapping may be happening 
> on a previous topology version. We must capture the cluster active state flag 
> to the exchange future (I believe we already have all the mechanics for this 
> in ExchangeActions class) and validate the state in the exchange future 
> itself, without touching ClusterStateProcessor.
> Below is an example of a deadlock happened because of synchronous wait:
> {code}
> "db-checkpoint-thread-#12318%database.IgniteDbBaselineTopologySelfTest0%" 
> #15498 prio=5 os_prio=0 tid=0x7fa173e80800 nid=0x3d08 waiting on 
> condition [0x7fa2069a8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007abfc16e8> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:2991)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2797)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2722)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> "snapshot-scheduler-restats-#12202%database.IgniteDbBaselineTopologySelfTest0%"
>  #15332 prio=5 os_prio=0 tid=0x7fa228143000 nid=0x3c65 waiting on 
> condition [0x7fa2be919000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:194)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:787)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:763)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:646)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.GridDistributedCacheAdapter.txLockAsync(GridDistributedCacheAdapter.java:109)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.getAllAsync(GridNearTxLocal.java:1723)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache$4.op(GridDhtColocatedCache.java:197)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheAdapter$AsyncOp.op(GridCacheAdapter.java:5140)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheAdapter.asyncOp(GridCacheAdapter.java:4275)
>   at 
> 

[jira] [Commented] (IGNITE-7618) validateCache synchronously waits for state change

2018-09-14 Thread Pavel Kovalenko (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16615048#comment-16615048
 ] 

Pavel Kovalenko commented on IGNITE-7618:
-

[~ibessonov] Thank you for contribution. Now it looks good. 
[~DmitriyGovorukhin] could you please commit the change?

> validateCache synchronously waits for state change
> --
>
> Key: IGNITE-7618
> URL: https://issues.apache.org/jira/browse/IGNITE-7618
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Alexey Goncharuk
>Assignee: Ivan Bessonov
>Priority: Major
> Fix For: 2.8
>
>
> Currently, we call publicApiActiveState(waitForTransition=true) in 
> GridDhtTopologyFutureAdapter#validateCache. This is incorrect, because the 
> cluster state is updated from discovery thread, and mapping may be happening 
> on a previous topology version. We must capture the cluster active state flag 
> to the exchange future (I believe we already have all the mechanics for this 
> in ExchangeActions class) and validate the state in the exchange future 
> itself, without touching ClusterStateProcessor.
> Below is an example of a deadlock happened because of synchronous wait:
> {code}
> "db-checkpoint-thread-#12318%database.IgniteDbBaselineTopologySelfTest0%" 
> #15498 prio=5 os_prio=0 tid=0x7fa173e80800 nid=0x3d08 waiting on 
> condition [0x7fa2069a8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007abfc16e8> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:2991)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2797)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2722)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> "snapshot-scheduler-restats-#12202%database.IgniteDbBaselineTopologySelfTest0%"
>  #15332 prio=5 os_prio=0 tid=0x7fa228143000 nid=0x3c65 waiting on 
> condition [0x7fa2be919000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:194)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:787)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:763)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:646)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.GridDistributedCacheAdapter.txLockAsync(GridDistributedCacheAdapter.java:109)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.getAllAsync(GridNearTxLocal.java:1723)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache$4.op(GridDhtColocatedCache.java:197)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheAdapter$AsyncOp.op(GridCacheAdapter.java:5140)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheAdapter.asyncOp(GridCacheAdapter.java:4275)
>  

[jira] [Commented] (IGNITE-7618) validateCache synchronously waits for state change

2018-09-14 Thread Ivan Bessonov (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614927#comment-16614927
 ] 

Ivan Bessonov commented on IGNITE-7618:
---

[~Jokser] Thank you for the review. I made "clusterIsActive" field volatile, so 
I guess it might be closed.

> validateCache synchronously waits for state change
> --
>
> Key: IGNITE-7618
> URL: https://issues.apache.org/jira/browse/IGNITE-7618
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Alexey Goncharuk
>Assignee: Ivan Bessonov
>Priority: Major
> Fix For: 2.8
>
>
> Currently, we call publicApiActiveState(waitForTransition=true) in 
> GridDhtTopologyFutureAdapter#validateCache. This is incorrect, because the 
> cluster state is updated from discovery thread, and mapping may be happening 
> on a previous topology version. We must capture the cluster active state flag 
> to the exchange future (I believe we already have all the mechanics for this 
> in ExchangeActions class) and validate the state in the exchange future 
> itself, without touching ClusterStateProcessor.
> Below is an example of a deadlock happened because of synchronous wait:
> {code}
> "db-checkpoint-thread-#12318%database.IgniteDbBaselineTopologySelfTest0%" 
> #15498 prio=5 os_prio=0 tid=0x7fa173e80800 nid=0x3d08 waiting on 
> condition [0x7fa2069a8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007abfc16e8> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:2991)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2797)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2722)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> "snapshot-scheduler-restats-#12202%database.IgniteDbBaselineTopologySelfTest0%"
>  #15332 prio=5 os_prio=0 tid=0x7fa228143000 nid=0x3c65 waiting on 
> condition [0x7fa2be919000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:194)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:787)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:763)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:646)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.GridDistributedCacheAdapter.txLockAsync(GridDistributedCacheAdapter.java:109)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.getAllAsync(GridNearTxLocal.java:1723)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache$4.op(GridDhtColocatedCache.java:197)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheAdapter$AsyncOp.op(GridCacheAdapter.java:5140)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheAdapter.asyncOp(GridCacheAdapter.java:4275)
>   at 
> 

[jira] [Commented] (IGNITE-7618) validateCache synchronously waits for state change

2018-09-14 Thread Pavel Kovalenko (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614843#comment-16614843
 ] 

Pavel Kovalenko commented on IGNITE-7618:
-

[~ibessonov] I reviewed your change and think that "clusterIsActivate" should 
be either volatile or final to make sure that we will not have visibility 
problems. Overall look good.

> validateCache synchronously waits for state change
> --
>
> Key: IGNITE-7618
> URL: https://issues.apache.org/jira/browse/IGNITE-7618
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Alexey Goncharuk
>Assignee: Ivan Bessonov
>Priority: Major
> Fix For: 2.8
>
>
> Currently, we call publicApiActiveState(waitForTransition=true) in 
> GridDhtTopologyFutureAdapter#validateCache. This is incorrect, because the 
> cluster state is updated from discovery thread, and mapping may be happening 
> on a previous topology version. We must capture the cluster active state flag 
> to the exchange future (I believe we already have all the mechanics for this 
> in ExchangeActions class) and validate the state in the exchange future 
> itself, without touching ClusterStateProcessor.
> Below is an example of a deadlock happened because of synchronous wait:
> {code}
> "db-checkpoint-thread-#12318%database.IgniteDbBaselineTopologySelfTest0%" 
> #15498 prio=5 os_prio=0 tid=0x7fa173e80800 nid=0x3d08 waiting on 
> condition [0x7fa2069a8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007abfc16e8> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:2991)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2797)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2722)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> "snapshot-scheduler-restats-#12202%database.IgniteDbBaselineTopologySelfTest0%"
>  #15332 prio=5 os_prio=0 tid=0x7fa228143000 nid=0x3c65 waiting on 
> condition [0x7fa2be919000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:194)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:787)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:763)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:646)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.GridDistributedCacheAdapter.txLockAsync(GridDistributedCacheAdapter.java:109)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.getAllAsync(GridNearTxLocal.java:1723)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache$4.op(GridDhtColocatedCache.java:197)
>   at 
> org.apache.ignite.internal.processors.cache.GridCacheAdapter$AsyncOp.op(GridCacheAdapter.java:5140)
>   at 
> 

[jira] [Commented] (IGNITE-7618) validateCache synchronously waits for state change

2018-09-12 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611842#comment-16611842
 ] 

ASF GitHub Bot commented on IGNITE-7618:


GitHub user ibessonov opened a pull request:

https://github.com/apache/ignite/pull/4733

IGNITE-7618 validateCache synchronously waits for state change



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gridgain/apache-ignite ignite-7618

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/4733.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4733


commit e73fc53dae414bb7990a67814387fad7ac143fb3
Author: ibessonov 
Date:   2018-09-12T09:43:43Z

IGNITE-7618 Avoid "publicApiActiveState" call in 
GridDhtTopologyFutureAdapter#validateCache




> validateCache synchronously waits for state change
> --
>
> Key: IGNITE-7618
> URL: https://issues.apache.org/jira/browse/IGNITE-7618
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Alexey Goncharuk
>Assignee: Ivan Bessonov
>Priority: Major
> Fix For: 2.8
>
>
> Currently, we call publicApiActiveState(waitForTransition=true) in 
> GridDhtTopologyFutureAdapter#validateCache. This is incorrect, because the 
> cluster state is updated from discovery thread, and mapping may be happening 
> on a previous topology version. We must capture the cluster active state flag 
> to the exchange future (I believe we already have all the mechanics for this 
> in ExchangeActions class) and validate the state in the exchange future 
> itself, without touching ClusterStateProcessor.
> Below is an example of a deadlock happened because of synchronous wait:
> {code}
> "db-checkpoint-thread-#12318%database.IgniteDbBaselineTopologySelfTest0%" 
> #15498 prio=5 os_prio=0 tid=0x7fa173e80800 nid=0x3d08 waiting on 
> condition [0x7fa2069a8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007abfc16e8> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:2991)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2797)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2722)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> "snapshot-scheduler-restats-#12202%database.IgniteDbBaselineTopologySelfTest0%"
>  #15332 prio=5 os_prio=0 tid=0x7fa228143000 nid=0x3c65 waiting on 
> condition [0x7fa2be919000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:194)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:787)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:763)
>   at 
> 

[jira] [Commented] (IGNITE-7618) validateCache synchronously waits for state change

2018-06-22 Thread Alexey Goncharuk (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16520534#comment-16520534
 ] 

Alexey Goncharuk commented on IGNITE-7618:
--

[~SomeFire], in your example I do not see how this is invalid. Any cache 
operation happens on some topology version, and since cluster 
activation/deactivation changes the topology version as well, we can 
immediately tell whether the cache operation should happen or not.

If the activation process is not finished yet, then the active(true) method did 
not return, then it means that a cache operation happens concurrently with 
active(true) and it is fine to throw an exception (the cluster is not active 
_yet_).

> validateCache synchronously waits for state change
> --
>
> Key: IGNITE-7618
> URL: https://issues.apache.org/jira/browse/IGNITE-7618
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Alexey Goncharuk
>Assignee: Ryabov Dmitrii
>Priority: Major
> Fix For: 2.6
>
>
> Currently, we call publicApiActiveState(waitForTransition=true) in 
> GridDhtTopologyFutureAdapter#validateCache. This is incorrect, because the 
> cluster state is updated from discovery thread, and mapping may be happening 
> on a previous topology version. We must capture the cluster active state flag 
> to the exchange future (I believe we already have all the mechanics for this 
> in ExchangeActions class) and validate the state in the exchange future 
> itself, without touching ClusterStateProcessor.
> Below is an example of a deadlock happened because of synchronous wait:
> {code}
> "db-checkpoint-thread-#12318%database.IgniteDbBaselineTopologySelfTest0%" 
> #15498 prio=5 os_prio=0 tid=0x7fa173e80800 nid=0x3d08 waiting on 
> condition [0x7fa2069a8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007abfc16e8> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:2991)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2797)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2722)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> "snapshot-scheduler-restats-#12202%database.IgniteDbBaselineTopologySelfTest0%"
>  #15332 prio=5 os_prio=0 tid=0x7fa228143000 nid=0x3c65 waiting on 
> condition [0x7fa2be919000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:194)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:787)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:763)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:646)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.GridDistributedCacheAdapter.txLockAsync(GridDistributedCacheAdapter.java:109)
>   at 
> 

[jira] [Commented] (IGNITE-7618) validateCache synchronously waits for state change

2018-06-09 Thread Ryabov Dmitrii (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16506923#comment-16506923
 ] 

Ryabov Dmitrii commented on IGNITE-7618:


[~agoncharuk], I investigated issue a bit more. In current implementation 
{{ExchangeActions}} doesn't have cluster state, only state changing. 
 
But the main problem is that cache gets last finished future, which can be not 
actual, so the future must check real state in state processor. 
If we will save state to the future, then bad situation is possible: in 
sequence deactivation-activation-operation deactivation future can be finished, 
activation future not finished, but user already started operation. Like in 
{{IgniteStandByClusterTest#testSimple()}} test. 
In this situation valid operation will see future containing deactivated state, 
so operation will fail. 
 
I can't reproduce situation from description, so could you please share your 
reproducer, if you have one? Because without reproducer this ticket is about 
serious refactoring, which isn't acceptable in most cases by community. 

> validateCache synchronously waits for state change
> --
>
> Key: IGNITE-7618
> URL: https://issues.apache.org/jira/browse/IGNITE-7618
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Alexey Goncharuk
>Assignee: Ryabov Dmitrii
>Priority: Major
> Fix For: 2.6
>
>
> Currently, we call publicApiActiveState(waitForTransition=true) in 
> GridDhtTopologyFutureAdapter#validateCache. This is incorrect, because the 
> cluster state is updated from discovery thread, and mapping may be happening 
> on a previous topology version. We must capture the cluster active state flag 
> to the exchange future (I believe we already have all the mechanics for this 
> in ExchangeActions class) and validate the state in the exchange future 
> itself, without touching ClusterStateProcessor.
> Below is an example of a deadlock happened because of synchronous wait:
> {code}
> "db-checkpoint-thread-#12318%database.IgniteDbBaselineTopologySelfTest0%" 
> #15498 prio=5 os_prio=0 tid=0x7fa173e80800 nid=0x3d08 waiting on 
> condition [0x7fa2069a8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007abfc16e8> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:2991)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2797)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2722)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> "snapshot-scheduler-restats-#12202%database.IgniteDbBaselineTopologySelfTest0%"
>  #15332 prio=5 os_prio=0 tid=0x7fa228143000 nid=0x3c65 waiting on 
> condition [0x7fa2be919000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:194)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:787)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:763)
>   at 
> 

[jira] [Commented] (IGNITE-7618) validateCache synchronously waits for state change

2018-06-06 Thread Ryabov Dmitrii (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16503420#comment-16503420
 ] 

Ryabov Dmitrii commented on IGNITE-7618:


[~agoncharuk], can you write reproducer for this hanging? Also 
{{GridDhtTopologyFutureAdapter}} and it's inheritor 
{{ClientCacheDhtTopologyFuture}} doesn't  have {{ExchangeActions}}, but it 
validates cache too. So what we should do in this case? We can move 
{{ExchangeActions}} from {{GridDhtPartitionsExchangeFuture}} to the abstract 
class and set every time, so {{ClientCacheDhtTopologyFuture}} will have it too. 
Or we can use something else.

> validateCache synchronously waits for state change
> --
>
> Key: IGNITE-7618
> URL: https://issues.apache.org/jira/browse/IGNITE-7618
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Alexey Goncharuk
>Assignee: Ryabov Dmitrii
>Priority: Major
> Fix For: 2.6
>
>
> Currently, we call publicApiActiveState(waitForTransition=true) in 
> GridDhtTopologyFutureAdapter#validateCache. This is incorrect, because the 
> cluster state is updated from discovery thread, and mapping may be happening 
> on a previous topology version. We must capture the cluster active state flag 
> to the exchange future (I believe we already have all the mechanics for this 
> in ExchangeActions class) and validate the state in the exchange future 
> itself, without touching ClusterStateProcessor.
> Below is an example of a deadlock happened because of synchronous wait:
> {code}
> "db-checkpoint-thread-#12318%database.IgniteDbBaselineTopologySelfTest0%" 
> #15498 prio=5 os_prio=0 tid=0x7fa173e80800 nid=0x3d08 waiting on 
> condition [0x7fa2069a8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007abfc16e8> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:2991)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2797)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2722)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> "snapshot-scheduler-restats-#12202%database.IgniteDbBaselineTopologySelfTest0%"
>  #15332 prio=5 os_prio=0 tid=0x7fa228143000 nid=0x3c65 waiting on 
> condition [0x7fa2be919000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:194)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:787)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:763)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:646)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.GridDistributedCacheAdapter.txLockAsync(GridDistributedCacheAdapter.java:109)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.near.GridNearTxLocal.getAllAsync(GridNearTxLocal.java:1723)
>   at 
> 

[jira] [Commented] (IGNITE-7618) validateCache synchronously waits for state change

2018-05-30 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/IGNITE-7618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495331#comment-16495331
 ] 

ASF GitHub Bot commented on IGNITE-7618:


GitHub user SomeFire opened a pull request:

https://github.com/apache/ignite/pull/4095

IGNITE-7618 validateCache synchronously waits for state change



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/SomeFire/ignite ignite-7618

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/ignite/pull/4095.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4095


commit 7e7c1e5119af60bbaa9968ca8cadf1008049fa6f
Author: Dmitrii Ryabov 
Date:   2018-05-30T14:52:24Z

Check state by exchActions in GridDhtPartitionsExchangeFuture




> validateCache synchronously waits for state change
> --
>
> Key: IGNITE-7618
> URL: https://issues.apache.org/jira/browse/IGNITE-7618
> Project: Ignite
>  Issue Type: Bug
>  Components: cache
>Affects Versions: 2.4
>Reporter: Alexey Goncharuk
>Assignee: Ryabov Dmitrii
>Priority: Major
> Fix For: 2.6
>
>
> Currently, we call publicApiActiveState(waitForTransition=true) in 
> GridDhtTopologyFutureAdapter#validateCache. This is incorrect, because the 
> cluster state is updated from discovery thread, and mapping may be happening 
> on a previous topology version. We must capture the cluster active state flag 
> to the exchange future (I believe we already have all the mechanics for this 
> in ExchangeActions class) and validate the state in the exchange future 
> itself, without touching ClusterStateProcessor.
> Below is an example of a deadlock happened because of synchronous wait:
> {code}
> "db-checkpoint-thread-#12318%database.IgniteDbBaselineTopologySelfTest0%" 
> #15498 prio=5 os_prio=0 tid=0x7fa173e80800 nid=0x3d08 waiting on 
> condition [0x7fa2069a8000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   - parking to wait for  <0x0007abfc16e8> (a 
> java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>   at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>   at 
> java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock.lock(ReentrantReadWriteLock.java:943)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:2991)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2797)
>   at 
> org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2722)
>   at 
> org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
>   at java.lang.Thread.run(Thread.java:745)
> "snapshot-scheduler-restats-#12202%database.IgniteDbBaselineTopologySelfTest0%"
>  #15332 prio=5 os_prio=0 tid=0x7fa228143000 nid=0x3c65 waiting on 
> condition [0x7fa2be919000]
>java.lang.Thread.State: WAITING (parking)
>   at sun.misc.Unsafe.park(Native Method)
>   at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:177)
>   at 
> org.apache.ignite.internal.util.future.GridFutureAdapter.get(GridFutureAdapter.java:140)
>   at 
> org.apache.ignite.internal.processors.cluster.GridClusterStateProcessor.publicApiActiveState(GridClusterStateProcessor.java:194)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTopologyFutureAdapter.validateCache(GridDhtTopologyFutureAdapter.java:83)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.mapOnTopology(GridDhtColocatedLockFuture.java:787)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedLockFuture.map(GridDhtColocatedLockFuture.java:763)
>   at 
> org.apache.ignite.internal.processors.cache.distributed.dht.colocated.GridDhtColocatedCache.lockAllAsync(GridDhtColocatedCache.java:646)
>   at 
>