[ https://issues.apache.org/jira/browse/GEODE-6526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Anilkumar Gingade reassigned GEODE-6526:
----------------------------------------

    Assignee: Anilkumar Gingade

> deadlock between tombstone gc and region destroy threads
> --------------------------------------------------------
>
>                 Key: GEODE-6526
>                 URL: https://issues.apache.org/jira/browse/GEODE-6526
>             Project: Geode
>          Issue Type: Bug
>          Components: regions
>            Reporter: Anilkumar Gingade
>            Assignee: Anilkumar Gingade
>            Priority: Major
>
> There is a potential for the tombstoneGC thread to deadlock with the
> region.destroy thread under the following condition.
> On Member1:
> Thread1 - Destroys key1 (say region version 3)
> Thread2 - Destroys key2 (say region version 4)
> Thread3 - Starts tombstone GC (this records the tombstone GC version for
> member1 as 4)
> All three of the above messages are sent/replicated to member2 concurrently.
> On Member2:
> -- The destroy of key2 finishes first; and
> -- The destroy of key1, which is still in progress and holds a region-entry
> lock, calls tombstone removal, which tries to take the region-size lock (held
> by the tombstone GC thread).
> -- Concurrently, the tombstoneGC message gets processed; this records the GC
> version as 4 for member2 and collects key1's region entry for removal. During
> removal, it takes the region-size lock and then tries to take the
> region-entry lock (and waits for it).
> The above actions from the destroy and tombstone GC threads result in a
> deadlock.
> The solution is to not remove the tombstone during region.destroy; it will
> be removed as part of the next tombstoneGC processing.
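The lock-order inversion described above can be sketched in a few lines of Java. This is only an illustration, not Geode code: the class `TombstoneLockOrderSketch`, the `entryLock` and `sizeLock` fields, and the helper methods are hypothetical stand-ins for the region-entry and region-size locks. It also assumes `ReentrantLock.tryLock` with a timeout in place of `synchronized`, so the run reports failure instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch, not Geode code. The two locks stand in for the
// region-entry lock and the region-size lock named in the description.
class TombstoneLockOrderSketch {
    static final long TIMEOUT_MS = 300;

    // Returns true only if both the destroy and the GC thread finished their work.
    static boolean run(boolean destroyRemovesTombstone) {
        ReentrantLock entryLock = new ReentrantLock();
        ReentrantLock sizeLock = new ReentrantLock();
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        boolean[] done = new boolean[2];

        Thread destroy = new Thread(() -> {
            entryLock.lock(); // destroy path takes the entry lock first
            try {
                meet(bothHoldFirstLock);
                // Old behavior: remove the tombstone here, which needs sizeLock.
                // Proposed behavior: skip removal and leave it to the next GC.
                done[0] = !destroyRemovesTombstone || acquireBriefly(sizeLock);
            } finally {
                entryLock.unlock();
            }
        });

        Thread gc = new Thread(() -> {
            sizeLock.lock(); // GC path takes the size lock first
            try {
                meet(bothHoldFirstLock);
                done[1] = acquireBriefly(entryLock); // then needs the entry lock
            } finally {
                sizeLock.unlock();
            }
        });

        destroy.start();
        gc.start();
        try {
            destroy.join();
            gc.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return done[0] && done[1];
    }

    // Forces the interleaving from the report: each thread holds its first lock
    // before either one asks for its second.
    static void meet(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Tries to take and release the lock; gives up after TIMEOUT_MS.
    static boolean acquireBriefly(ReentrantLock lock) {
        try {
            if (lock.tryLock(TIMEOUT_MS, TimeUnit.MILLISECONDS)) {
                lock.unlock();
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("removal inside destroy completes: " + run(true));
        System.out.println("removal deferred to GC completes: " + run(false));
    }
}
```

With removal inside destroy, each thread holds the lock the other needs, so at least one of them times out. With removal deferred, the destroy thread never asks for the second lock and the cycle cannot form.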
>
> {noformat}
> Found one Java-level deadlock:
> =============================
> "Pooled Message Processor 118":
>   waiting to lock monitor 0x00007f02bdcfd0a8 (object 0x00007f0a569075d8, a java.lang.Object),
>   which is held by "Pooled Message Processor 24"
> "Pooled Message Processor 24":
>   waiting to lock monitor 0x00007f02bd8f99d8 (object 0x00007f099bb69270, a org.apache.geode.internal.cache.entries.VersionedThinDiskLRURegionEntryHeapObjectKey),
>   which is held by "P2P message reader for 169.84.85.56(psin9p197_cache2:58016)<v3>:1026 shared ordered uid=9 port=32780"
> "P2P message reader for 169.84.85.56(psin9p197_cache2:58016)<v3>:1026 shared ordered uid=9 port=32780":
>   waiting to lock monitor 0x00007f02bd8f9928 (object 0x00007f11f8f27d18, a java.lang.String),
>   which is held by "Pooled Message Processor 24"
> Java stack information for the threads listed above:
> ===================================================
> "Pooled Message Processor 118":
>     at org.apache.geode.internal.cache.TombstoneService.gcTombstones(TombstoneService.java:209)
>     - waiting to lock <0x00007f0a569075d8> (a java.lang.Object)
>     at org.apache.geode.internal.cache.LocalRegion.expireTombstones(LocalRegion.java:3293)
>     at org.apache.geode.internal.cache.DistributedTombstoneOperation$TombstoneMessage.operateOnRegion(DistributedTombstoneOperation.java:169)
>     at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1191)
>     at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1091)
>     at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:378)
>     at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:444)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1121)
>     at org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
>     at org.apache.geode.distributed.internal.ClusterDistributionManager$4$1.run(ClusterDistributionManager.java:791)
>     at java.lang.Thread.run(Thread.java:748)
> "Pooled Message Processor 24":
>     at org.apache.geode.internal.cache.AbstractRegionMap.removeTombstone(AbstractRegionMap.java:3321)
>     - waiting to lock <0x00007f099bb69270> (a org.apache.geode.internal.cache.entries.VersionedThinDiskLRURegionEntryHeapObjectKey)
>     - locked <0x00007f11f8f27d18> (a java.lang.String)
>     at org.apache.geode.internal.cache.TombstoneService.gcTombstones(TombstoneService.java:259)
>     - locked <0x00007f0a569075d8> (a java.lang.Object)
>     at org.apache.geode.internal.cache.LocalRegion.expireTombstones(LocalRegion.java:3293)
>     at org.apache.geode.internal.cache.DistributedTombstoneOperation$TombstoneMessage.operateOnRegion(DistributedTombstoneOperation.java:169)
>     at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1191)
>     at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1091)
>     at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:378)
>     at org.apache.geode.distributed.internal.DistributionMessage$1.run(DistributionMessage.java:444)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1121)
>     at org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
>     at org.apache.geode.distributed.internal.ClusterDistributionManager$4$1.run(ClusterDistributionManager.java:791)
>     at java.lang.Thread.run(Thread.java:748)
> "P2P message reader for 169.84.85.56(psin9p197_cache2:58016)<v3>:1026 shared ordered uid=9 port=32780":
>     at org.apache.geode.internal.cache.AbstractRegionMap.removeTombstone(AbstractRegionMap.java:3320)
>     - waiting to lock <0x00007f11f8f27d18> (a java.lang.String)
>     at org.apache.geode.internal.cache.entries.AbstractRegionEntry.makeTombstone(AbstractRegionEntry.java:273)
>     at org.apache.geode.internal.cache.entries.AbstractRegionEntry.destroy(AbstractRegionEntry.java:904)
>     at org.apache.geode.internal.cache.map.RegionMapDestroy.destroyEntry(RegionMapDestroy.java:723)
>     at org.apache.geode.internal.cache.map.RegionMapDestroy.destroyExistingEntry(RegionMapDestroy.java:387)
>     at org.apache.geode.internal.cache.map.RegionMapDestroy.handleExistingRegionEntry(RegionMapDestroy.java:238)
>     - locked <0x00007f099bb69270> (a org.apache.geode.internal.cache.entries.VersionedThinDiskLRURegionEntryHeapObjectKey)
>     at org.apache.geode.internal.cache.map.RegionMapDestroy.destroy(RegionMapDestroy.java:149)
>     at org.apache.geode.internal.cache.AbstractRegionMap.destroy(AbstractRegionMap.java:1093)
>     at org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6504)
>     at org.apache.geode.internal.cache.LocalRegion.mapDestroy(LocalRegion.java:6478)
>     at org.apache.geode.internal.cache.LocalRegionDataView.destroyExistingEntry(LocalRegionDataView.java:56)
>     at org.apache.geode.internal.cache.LocalRegion.basicDestroy(LocalRegion.java:6430)
>     at org.apache.geode.internal.cache.DistributedRegion.basicDestroy(DistributedRegion.java:1599)
>     at org.apache.geode.internal.cache.DestroyOperation$DestroyMessage.operateOnRegion(DestroyOperation.java:87)
>     at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.basicProcess(DistributedCacheOperation.java:1191)
>     at org.apache.geode.internal.cache.DistributedCacheOperation$CacheOperationMessage.process(DistributedCacheOperation.java:1091)
>     at org.apache.geode.distributed.internal.DistributionMessage.scheduleAction(DistributionMessage.java:378)
>     at org.apache.geode.distributed.internal.DistributionMessage.schedule(DistributionMessage.java:436)
>     at org.apache.geode.distributed.internal.ClusterDistributionManager.scheduleIncomingMessage(ClusterDistributionManager.java:3250)
>     at org.apache.geode.distributed.internal.ClusterDistributionManager.handleIncomingDMsg(ClusterDistributionManager.java:2912)
>     at org.apache.geode.distributed.internal.ClusterDistributionManager.access$1500(ClusterDistributionManager.java:109)
>     at org.apache.geode.distributed.internal.ClusterDistributionManager$DMListener.messageReceived(ClusterDistributionManager.java:4038)
>     at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.dispatchMessage(GMSMembershipManager.java:1120)
>     at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager.handleOrDeferMessage(GMSMembershipManager.java:1039)
>     at org.apache.geode.distributed.internal.membership.gms.mgr.GMSMembershipManager$MyDCReceiver.messageReceived(GMSMembershipManager.java:402)
>     at org.apache.geode.distributed.internal.direct.DirectChannel.receive(DirectChannel.java:731)
>     at org.apache.geode.internal.tcp.TCPConduit.messageReceived(TCPConduit.java:868)
>     at org.apache.geode.internal.tcp.Connection.dispatchMessage(Connection.java:3965)
>     at org.apache.geode.internal.tcp.Connection.runOioReader(Connection.java:2112)
>     at org.apache.geode.internal.tcp.Connection.run(Connection.java:1690)
>     at java.lang.Thread.run(Thread.java:748)
> {noformat}
>
>
--
This message was sent by Atlassian JIRA (v7.6.3#76005)