[ 
https://issues.apache.org/jira/browse/IGNITE-8295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16442297#comment-16442297
 ] 

Andrew Mashenkov commented on IGNITE-8295:
------------------------------------------

After wrapping partStoreLock in checkpointLock, I got the following stack trace.
It seems we should also truncate the partition file under checkpointLock.

java.lang.AssertionError: FullPageId [pageId=0001005700000003, effectivePageId=0000005700000003, grpId=2141373874]
 at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:730)
 at org.apache.ignite.internal.processors.cache.persistence.pagemem.PageMemoryImpl.acquirePage(PageMemoryImpl.java:624)
 at org.apache.ignite.internal.processors.cache.persistence.DataStructure.acquirePage(DataStructure.java:142)
 at org.apache.ignite.internal.processors.cache.persistence.freelist.PagesList.saveMetadata(PagesList.java:301)
 at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.saveStoreMetadata(GridCacheOffheapManager.java:186)
 at org.apache.ignite.internal.processors.cache.persistence.GridCacheOffheapManager.onCheckpointBegin(GridCacheOffheapManager.java:164)
 at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.markCheckpointBegin(GridCacheDatabaseSharedManager.java:3155)
 at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.doCheckpoint(GridCacheDatabaseSharedManager.java:2909)
 at org.apache.ignite.internal.processors.cache.persistence.GridCacheDatabaseSharedManager$Checkpointer.body(GridCacheDatabaseSharedManager.java:2808)
 at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:110)
 at java.lang.Thread.run(Thread.java:748)

> Possible deadlock on partition eviction.
> ----------------------------------------
>
>                 Key: IGNITE-8295
>                 URL: https://issues.apache.org/jira/browse/IGNITE-8295
>             Project: Ignite
>          Issue Type: Bug
>          Components: persistence
>            Reporter: Andrew Mashenkov
>            Assignee: Andrew Mashenkov
>            Priority: Major
>             Fix For: 2.6
>
>         Attachments: deadlock.stack
>
>
> GridCacheOffheapManager.recreateCacheDataStore() calls 
> updatePartitionCounter() under partStoreLock, which may try to acquire 
> checkpointReadLock.
> recreateCacheDataStore() can be called with checkpointReadLock held (from 
> GridDhtPartitionsExchangeFuture.updatePartitionFullMap) 
> or without it (the GridDhtPartitionEvictor thread calls 
> evictPartitionAsync),
> so a checkpoint that starts in between can cause a deadlock.
> It seems we should acquire checkpointReadLock before partStoreLock.
>   
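
The lock ordering proposed in the description can be sketched roughly as below. This is an illustration only, not Ignite code: the class LockOrderSketch, its fields, and its methods are hypothetical stand-ins, with a ReentrantReadWriteLock playing the role of the checkpoint lock and a ReentrantLock playing the role of partStoreLock. The point is that every path takes the checkpoint lock first and the store lock second, so the inner read-lock acquisition is always re-entrant and never waits behind a checkpoint that already holds the store lock.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical sketch; names only mirror the ones used in the issue text.
class LockOrderSketch {
    // Stand-in for the checkpoint lock (readers: updates, writer: checkpointer).
    private final ReentrantReadWriteLock checkpointLock = new ReentrantReadWriteLock();
    // Stand-in for the partition store lock.
    private final ReentrantLock partStoreLock = new ReentrantLock();
    private long partitionCounter;

    // Takes the checkpoint read lock BEFORE the store lock on every call path.
    void recreateStore(long newCounter) {
        checkpointLock.readLock().lock();
        try {
            partStoreLock.lock();
            try {
                updatePartitionCounter(newCounter);
            } finally {
                partStoreLock.unlock();
            }
        } finally {
            checkpointLock.readLock().unlock();
        }
    }

    // Re-acquires the read lock; this is re-entrant because the caller holds it.
    private void updatePartitionCounter(long newCounter) {
        checkpointLock.readLock().lock();
        try {
            partitionCounter = newCounter;
        } finally {
            checkpointLock.readLock().unlock();
        }
    }

    // Checkpointer side: write lock first, then the store lock (same order).
    long checkpoint() {
        checkpointLock.writeLock().lock();
        try {
            partStoreLock.lock();
            try {
                return partitionCounter;
            } finally {
                partStoreLock.unlock();
            }
        } finally {
            checkpointLock.writeLock().unlock();
        }
    }

    public static void main(String[] args) {
        LockOrderSketch s = new LockOrderSketch();
        s.recreateStore(42L);
        System.out.println(s.checkpoint());
    }
}
```

If recreateStore() took partStoreLock first, an eviction thread could hold the store lock while waiting for the read lock, and the checkpointer could hold the write lock while waiting for the store lock, which is the cycle the issue describes.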



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
