Ryan McMahon created GEODE-5653:
-----------------------------------

Summary: Race between put of parallel WAN sender event and distributed clear causes off-heap orphan
Key: GEODE-5653
URL: https://issues.apache.org/jira/browse/GEODE-5653
Project: Geode
Issue Type: Bug
Components: offheap, regions, wan
Reporter: Ryan McMahon

A race that results in an off-heap orphan can occur with parallel WAN when off-heap is enabled on the cache and on the user's data region. It occurs because the put happens in two distinct steps without proper synchronization: the region entry is first created with the value {{Token.REMOVED_PHASE1}} (step 1), and that token is later replaced with the actual {{GatewaySenderEvent}} (step 2). If a distributed clear is processed between these two steps, the value can be orphaned. The race is described in more detail below.

The race is between:

*Thread 1*. A put of a {{GatewaySenderEvent}} containing an off-heap value to one of the WAN "shadow" region's {{BucketRegionQueue}}s, and
*Thread 2*. A distributed clear on the {{BucketRegionQueue}} containing that value.

The race occurs as follows:

*Thread 1 (Put)*: The put creates a new region entry whose value is {{Token.REMOVED_PHASE1}} and puts it into the {{CustomEntryConcurrentHashMap}} owned by the {{AbstractRegionMap}}.
*Thread 2 (Clear)*: Iterates through the bucket region's segments and clears the entries. At this time the entry's value is still {{Token.REMOVED_PHASE1}}.
*Thread 1 (Put)*: Replaces the {{Token.REMOVED_PHASE1}} in the region entry with the actual {{GatewaySenderEvent}}. However, the entry was already removed by the clear above. When the entry is removed from the region, its off-heap value is orphaned because the entry is no longer in the {{CustomEntryConcurrentHashMap}}.
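
The interleaving above can be reproduced deterministically in a small, single-threaded sketch. This is not Geode code: {{OrphanRaceSketch}}, {{OffHeapValue}}, {{Entry}}, and the {{REMOVED_PHASE1}} placeholder object are hypothetical stand-ins, and a plain {{ConcurrentHashMap}} is assumed in place of the {{CustomEntryConcurrentHashMap}}. It only illustrates why a value installed after the clear is never released.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the race; none of these types are Geode classes.
class OrphanRaceSketch {
    // Stand-in for the placeholder token stored in step 1 of the put.
    static final Object REMOVED_PHASE1 = new Object();

    // Stand-in for an off-heap allocation; LIVE counts unreleased allocations.
    static final class OffHeapValue {
        static final AtomicInteger LIVE = new AtomicInteger();
        private boolean released;
        OffHeapValue() { LIVE.incrementAndGet(); }
        void release() {
            if (!released) { released = true; LIVE.decrementAndGet(); }
        }
    }

    // Stand-in for a region entry; it starts out holding the placeholder token.
    static final class Entry {
        volatile Object value = REMOVED_PHASE1;
    }

    static final ConcurrentHashMap<String, Entry> MAP = new ConcurrentHashMap<>();

    // Models the clear: release any off-heap value it can see, then drop all entries.
    static void clear() {
        for (Entry e : MAP.values()) {
            Object v = e.value;
            if (v instanceof OffHeapValue) {
                ((OffHeapValue) v).release();
            }
        }
        MAP.clear();
    }

    // Replays the three steps in order and returns the number of orphaned values.
    static int simulateRace() {
        MAP.clear();
        OffHeapValue.LIVE.set(0);

        Entry entry = new Entry();
        MAP.put("key", entry);            // Thread 1, step 1: entry holds the token
        clear();                          // Thread 2: sees only the token, removes the entry
        entry.value = new OffHeapValue(); // Thread 1, step 2: value set on a detached entry

        // The value is still allocated but no longer reachable through the map.
        return MAP.containsKey("key") ? 0 : OffHeapValue.LIVE.get();
    }

    public static void main(String[] args) {
        System.out.println("orphaned off-heap values: " + simulateRace());
    }
}
```

In this model the clear has nothing to release because it only ever observes the placeholder token, so the later allocation has no owner. Making steps 1 and 2 atomic with respect to the clear would close the window, although the sketch does not show how Geode itself should do that.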