Hello, I'm using Ignite 2.0.0 and I have a question about a deadlock.
Our usage pattern is to create a new cache for each time unit and to destroy
it after a certain period of time.

Example:

We keep one cache per minute, covering a 3-minute window, as shown below:

[00:00_Cache] [00:01_Cache] [00:02_Cache] 

After one minute, we create a new cache [00:03_Cache] and destroy the oldest
cache [00:00_Cache]:

[00:00_Cache] is destroyed
[00:03_Cache] is created

The cache list is then:
[00:01_Cache] [00:02_Cache] [00:03_Cache] 
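
To make the rotation concrete, here is a minimal sketch of the naming and window logic only. It is not the actual application code: the class name CacheWindow and its methods are hypothetical, the window size is passed in as a parameter, and the Ignite calls are only indicated in comments.

```java
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that computes per-minute cache names for a sliding window. */
class CacheWindow {
    private static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("HH:mm");

    /** Cache name for a given minute, for example "00:03_Cache". */
    static String cacheName(LocalTime minute) {
        return FMT.format(minute) + "_Cache";
    }

    /** Names of the caches that should exist at 'now', oldest first. */
    static List<String> liveCaches(LocalTime now, int windowMinutes) {
        List<String> names = new ArrayList<>();
        for (int i = windowMinutes - 1; i >= 0; i--) {
            // In the real application each name would go to ignite.getOrCreateCache(...).
            names.add(cacheName(now.minusMinutes(i)));
        }
        return names;
    }

    /** Name of the cache that has just left the window at 'now'. */
    static String expiredCache(LocalTime now, int windowMinutes) {
        // In the real application this name would go to ignite.destroyCache(...).
        return cacheName(now.minusMinutes(windowMinutes));
    }
}
```

With a 3-minute window at 00:03, liveCaches gives the three caches listed above and expiredCache gives [00:00_Cache]. Note that LocalTime wraps around at midnight, so a real implementation would probably include the date in the cache name.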

We use this approach because destroying a cache removes the data of a given
time period faster than relying on cache expiry. From what we observed, the
data for a time window was removed quickly and without using a lot of CPU.
The cluster ran in this state for about 5 hours. I then took down the 5 client
nodes in the topology for a while and brought them back up.
About ten minutes later, a deadlock occurred with the following messages.

[19:48:51,290][WARN ][grid-timeout-worker-#15%null%][G] >>> Possible
starvation in striped pool.
    Thread name: sys-stripe-3-#4%null%
    Queue: [Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE,
topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false,
msg=GridDhtAtomicUpdateRequest [keys=[KeyCacheObjectImpl [part=179, val
    Deadlock: true
    Completed: 1054320
Thread [name="sys-stripe-3-#4%null%", id=21, state=BLOCKED, blockCnt=5364,
waitCnt=1261740]
    Lock
[object=o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCacheEntry@6c7a9d31,
ownerName=sys-stripe-6-#7%null%, ownerId=24]
        at
o.a.i.i.processors.cache.GridCacheMapEntry.markObsoleteIfEmpty(GridCacheMapEntry.java:2095)
        at
o.a.i.i.processors.cache.CacheOffheapEvictionManager.touch(CacheOffheapEvictionManager.java:44)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.unlockEntries(GridDhtAtomicCache.java:2896)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1853)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1630)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3016)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:127)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:282)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:277)
        at
o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:863)
        at
o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:386)
        at
o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:308)
        at
o.a.i.i.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:100)
        at
o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:253)
        at
o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1257)
        at
o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:885)
        at
o.a.i.i.managers.communication.GridIoManager.access$2100(GridIoManager.java:114)
        at
o.a.i.i.managers.communication.GridIoManager$7.run(GridIoManager.java:802)
        at o.a.i.i.util.StripedExecutor$Stripe.run(StripedExecutor.java:483)
        at java.lang.Thread.run(Thread.java:745)

[19:48:51,423][WARN ][grid-timeout-worker-#15%null%][G] >>> Possible
starvation in striped pool.
    Thread name: sys-stripe-5-#6%null%
    Queue: [Message closure [msg=GridIoMessage [plc=2, topic=TOPIC_CACHE,
topicOrd=8, ordered=false, timeout=0, skipOnTimeout=false,
msg=GridDhtAtomicUpdateRequest [keys=[KeyCacheObjectImpl [part=541, val
    Deadlock: true
    Completed: 932925
Thread [name="sys-stripe-5-#6%null%", id=23, state=BLOCKED, blockCnt=5629,
waitCnt=1137576]
    Lock
[object=o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCacheEntry@449f1914,
ownerName=sys-stripe-6-#7%null%, ownerId=24]
        at sun.misc.Unsafe.monitorEnter(Native Method)
        at o.a.i.i.util.GridUnsafe.monitorEnter(GridUnsafe.java:1193)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.lockEntries(GridDhtAtomicCache.java:2815)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal0(GridDhtAtomicCache.java:1741)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateAllAsyncInternal(GridDhtAtomicCache.java:1630)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.processNearAtomicUpdateRequest(GridDhtAtomicCache.java:3016)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.access$400(GridDhtAtomicCache.java:127)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:282)
        at
o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache$6.apply(GridDhtAtomicCache.java:277)
        at
o.a.i.i.processors.cache.GridCacheIoManager.processMessage(GridCacheIoManager.java:863)
        at
o.a.i.i.processors.cache.GridCacheIoManager.onMessage0(GridCacheIoManager.java:386)
        at
o.a.i.i.processors.cache.GridCacheIoManager.handleMessage(GridCacheIoManager.java:308)
        at
o.a.i.i.processors.cache.GridCacheIoManager.access$000(GridCacheIoManager.java:100)
        at
o.a.i.i.processors.cache.GridCacheIoManager$1.onMessage(GridCacheIoManager.java:253)
        at
o.a.i.i.managers.communication.GridIoManager.invokeListener(GridIoManager.java:1257)
        at
o.a.i.i.managers.communication.GridIoManager.processRegularMessage0(GridIoManager.java:885)
        at
o.a.i.i.managers.communication.GridIoManager.access$2100(GridIoManager.java:114)
        at
o.a.i.i.managers.communication.GridIoManager$7.run(GridIoManager.java:802)
        at o.a.i.i.util.StripedExecutor$Stripe.run(StripedExecutor.java:483)
        at java.lang.Thread.run(Thread.java:745)

JMC screenshots of the deadlock:
<http://apache-ignite-users.70518.x6.nabble.com/file/t1415/ignite-deadlock-1.ignite-deadlock-1>
 
<http://apache-ignite-users.70518.x6.nabble.com/file/t1415/ignite-deadlock-2.ignite-deadlock-2>
 
<http://apache-ignite-users.70518.x6.nabble.com/file/t1415/ignite-deadlock-3.png>
 

As the pictures above show, sys-stripe-5 and sys-stripe-6 are the threads
that own the contended locks. Our Ignite cache configuration is shown below.

return ignite.getOrCreateCache(new CacheConfiguration<String, RollupMetric>()
            .setName(cacheName)
            .setCacheMode(CacheMode.PARTITIONED)
            .setAtomicityMode(CacheAtomicityMode.ATOMIC)
            .setRebalanceMode(CacheRebalanceMode.ASYNC)
            .setMemoryPolicyName(MEMORY_POLICY_NAME)
            .setBackups(1)
            .setStatisticsEnabled(true)
            .setManagementEnabled(true)
            .setCopyOnRead(false)
            .setQueryParallelism(20)
            .setLongQueryWarningTimeout(10000) // 10s
            .setEagerTtl(false)
            .setExpiryPolicyFactory(CreatedExpiryPolicy.factoryOf(new Duration(TimeUnit.DAYS, 365)))
            .setMaxConcurrentAsyncOperations(CacheConfiguration.DFLT_MAX_CONCURRENT_ASYNC_OPS * 10)
            .setAffinity(new CoupangAffinityFunction())
            .setIndexedTypes(String.class, RollupMetric.class));

The expiry policy is set to one year because entries are removed by
destroying the whole cache, as described above, rather than by expiry.

Ignite Memory Configuration
<property name="memoryConfiguration">
  <bean class="org.apache.ignite.configuration.MemoryConfiguration">
    <property name="memoryPolicies">
      <list>
        <bean class="org.apache.ignite.configuration.MemoryPolicyConfiguration">
          <property name="name" value="RollupMemory"/>
          <property name="pageEvictionMode" value="RANDOM_LRU"/>
          <property name="metricsEnabled" value="true"/>
          <property name="initialSize" value="21474836480"/>
          <property name="maxSize" value="21474836480"/>
        </bean>
      </list>
    </property>
    <property name="pageSize" value="4096"/>
    <property name="concurrencyLevel" value="8"/>
  </bean>
</property>

What caused the deadlock? Is there an option or usage pattern that avoids
it?

I suspect it is related to the topology change caused by the client nodes
leaving and rejoining. If so, how should we handle it?

Please let me know if you have any additional questions.




--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/
