[ 
https://issues.apache.org/jira/browse/GEODE-8686?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated GEODE-8686:
----------------------------------
    Labels: pull-request-available  (was: )

> Tombstone removal optimization during GII could cause deadlock
> --------------------------------------------------------------
>
>                 Key: GEODE-8686
>                 URL: https://issues.apache.org/jira/browse/GEODE-8686
>             Project: Geode
>          Issue Type: Improvement
>    Affects Versions: 1.10.0, 1.11.0, 1.12.0, 1.13.0, 1.14.0
>            Reporter: Donal Evans
>            Assignee: Donal Evans
>            Priority: Major
>              Labels: pull-request-available
>
> Similar to the issue described in GEODE-6526, if the condition in the below 
> if statement in {{AbstractRegionMap.initialImagePut()}} evaluates to true, a 
> call to {{AbstractRegionMap.removeTombstone()}} will be triggered that could 
> lead to deadlock between the calling thread and a Tombstone GC thread calling 
> {{TombstoneService.gcTombstones()}}. 
> {code:java}
> if (owner.getServerProxy() == null && 
> owner.getVersionVector().isTombstoneTooOld( entryVersion.getMemberID(), 
> entryVersion.getRegionVersion())) { 
>   // the received tombstone has already been reaped, so don't retain it 
>   if (owner.getIndexManager() != null) { 
>     owner.getIndexManager().updateIndexes(oldRe, IndexManager.REMOVE_ENTRY, 
> IndexProtocol.REMOVE_DUE_TO_GII_TOMBSTONE_CLEANUP); 
>   } 
>   removeTombstone(oldRe, entryVersion, false, false); 
>   return false; 
> } else { 
>   owner.scheduleTombstone(oldRe, entryVersion); 
>   lruEntryDestroy(oldRe); 
> }
> {code}
> The proposed change is to remove this if statement and allow the old 
> tombstone to be collected later by calling {{scheduleTombstone()}} in all 
> cases. The call to {{AbstractRegionMap.removeTombstone()}} in 
> {{AbstractRegionMap.initialImagePut()}} is intended to be an optimization to 
> allow immediate removal of tombstones that we know have already been 
> collected on other members, but since the conditions to trigger it are rare 
> (the old entry must be a tombstone, the new entry received during GII must be 
> a tombstone with a newer version, and we must have already collected a 
> tombstone with a newer version than the new entry) and the overhead of 
> scheduling a tombstone to be collected is comparatively low, the performance 
> impact of removing this optimization in favour of simply scheduling the 
> tombstone to be collected in all cases should be insignificant.
> The solution to the deadlock observed in GEODE-6526 was also to remove the 
> call to {{AbstractRegionMap.removeTombstone()}} and allow the tombstone to be 
> collected later and did not result in any unwanted behaviour, so the proposed 
> fix should be similarly low-impact.
> Also of note is that with this proposed change, there will be no calls to 
> {{AbstractRegionMap.removeTombstone()}} outside of the {{TombstoneService}} 
> class, which should ensure that other deadlocks involving this method are not 
> possible.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to