Ryan McMahon created GEODE-5653:
-----------------------------------

             Summary: Race between put of parallel WAN sender event and 
distributed clear causes off-heap orphan
                 Key: GEODE-5653
                 URL: https://issues.apache.org/jira/browse/GEODE-5653
             Project: Geode
          Issue Type: Bug
          Components: offheap, regions, wan
            Reporter: Ryan McMahon


A race resulting in an off-heap orphan can occur with parallel WAN when 
off-heap is enabled on the cache and the user's data region.  It is the result 
of the put occurring in two distinct steps without proper synchronization.  The 
initial value of the region entry is {{Token.REMOVED_PHASE1}} (step 1), which is 
later replaced with the actual {{GatewaySenderEvent}} (step 2).  If a distributed 
clear is processed between these two steps, it can result in an orphan.  More 
details on the race are described below.

The race is between:
*Thread 1*. A put of a {{GatewaySenderEvent}} containing an off-heap value to 
one of the WAN "shadow" region's {{BucketRegionQueue}}
and
*Thread 2*. A distributed clear on the {{BucketRegionQueue}} containing that 
value

The race occurs as follows:
*Thread 1 (Put)*: The put creates a new region entry whose value is 
{{Token.REMOVED_PHASE1}} and adds it to the {{CustomEntryConcurrentHashMap}} 
owned by the {{AbstractRegionMap}}.
*Thread 2 (Clear)*: Iterates through the bucket region's segments and clears the 
entries.  At this time the value is still {{Token.REMOVED_PHASE1}}.
*Thread 1 (Put)*: The {{Token.REMOVED_PHASE1}} is replaced with the actual 
{{GatewaySenderEvent}} in the region entry.  However, the entry was already 
removed by the clear above.  Because the entry is no longer in the 
{{CustomEntryConcurrentHashMap}}, the off-heap memory referenced by its new value 
is orphaned.
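
The interleaving above can be replayed deterministically in a small, single-threaded sketch. This is not Geode code: {{OrphanRaceSketch}}, {{OffHeapValue}}, {{Entry}}, the {{liveAllocations}} counter and the {{"event-1"}} key are hypothetical stand-ins, a plain {{ConcurrentHashMap}} stands in for the {{CustomEntryConcurrentHashMap}}, and a counter stands in for off-heap memory. It only illustrates why a value written into an entry after the clear can no longer be reached and released through the map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the race; none of these types are Geode classes.
class OrphanRaceSketch {
    // Stand-in for Token.REMOVED_PHASE1 (placeholder value of a new entry).
    static final Object REMOVED_PHASE1 = new Object();

    // Stand-in for an off-heap value; the counter models live off-heap allocations.
    static final class OffHeapValue {
        static final AtomicInteger liveAllocations = new AtomicInteger();
        private boolean released;

        OffHeapValue() {
            liveAllocations.incrementAndGet();
        }

        void release() {
            if (!released) {
                released = true;
                liveAllocations.decrementAndGet();
            }
        }
    }

    // Stand-in for a region entry; it starts out holding the placeholder token.
    static final class Entry {
        volatile Object value = REMOVED_PHASE1;
    }

    // Stand-in for the map owned by the region map.
    static final ConcurrentHashMap<String, Entry> map = new ConcurrentHashMap<>();

    // Models the clear: releases any off-heap value it can see, then drops all entries.
    static void clear() {
        for (Entry e : map.values()) {
            Object v = e.value;
            if (v instanceof OffHeapValue) {
                ((OffHeapValue) v).release();
            }
        }
        map.clear();
    }

    // Replays the three steps in the problematic order; returns the orphan count.
    static int runInterleaving() {
        map.clear();
        OffHeapValue.liveAllocations.set(0);

        // Thread 1 (Put), step 1: entry is added with the placeholder token.
        Entry entry = new Entry();
        map.put("event-1", entry);

        // Thread 2 (Clear): sees only the token, so there is nothing to release.
        clear();

        // Thread 1 (Put), step 2: the real value is stored in an entry
        // that is no longer reachable through the map.
        entry.value = new OffHeapValue();

        return OffHeapValue.liveAllocations.get();
    }

    public static void main(String[] args) {
        int orphans = runInterleaving();
        System.out.println("entry still in map: " + map.containsKey("event-1"));
        System.out.println("orphaned off-heap values: " + orphans);
    }
}
```

In this model the clear has nothing to release when it runs, and the value stored in step 2 is never visible to any later clear or destroy that goes through the map, which matches the orphan described above.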




