Dan Smith created GEODE-7569:
--------------------------------

             Summary: Hang during StateFlush due to flipping 
containsRegionContentChange on PartitionMessageWithDirectReply
                 Key: GEODE-7569
                 URL: https://issues.apache.org/jira/browse/GEODE-7569
             Project: Geode
          Issue Type: Bug
          Components: membership
            Reporter: Dan Smith


The recent changes for GEODE-7435 (commit 
e3a31e190031f094ac3bd1517722d6bead710418) have caused a distributed deadlock 
when making a copy of a bucket.

These changes flipped the value of containsRegionContentChange for 
PartitionMessageWithDirectReply.

That flag controls which messages participate in a state flush operation. Now, 
many more messages are part of a state flush, including messages that trigger 
bucket creation. This causes the following distributed deadlock:

1. Member A is waiting for a StateFlush to finish.
2. Member B is stuck in StateStabilizationMessage, waiting for messages to be 
processed.
3. Member B is in the middle of processing some messages, which is what is 
holding up the StateStabilizationMessage.
4. Some of those messages are PartitionMessageWithDirectReply messages that end 
up triggering createBucketAtomically. That method blocks, waiting for bucket 
creation in Member A to finish.
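
The four steps above can be read as a wait-for graph. The sketch below is 
illustrative only and is not Geode code: the class StateFlushWaitGraph, its 
methods, and the node labels are hypothetical. The last edge (bucket creation 
on Member A cannot finish until A's state flush completes) is an assumption 
inferred from "when making a copy of a bucket"; it is not stated explicitly in 
the list.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the reported hang; none of these names exist in Geode.
final class StateFlushWaitGraph {
    // Simplified wait-for graph: each waiter is blocked on exactly one other node.
    private final Map<String, String> waitsFor = new LinkedHashMap<>();

    void addWait(String waiter, String blocker) {
        waitsFor.put(waiter, blocker);
    }

    // Follows wait-for edges from 'start'. Returns the nodes on the cycle if the
    // walk comes back to 'start', otherwise an empty list.
    List<String> findCycle(String start) {
        List<String> path = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        String current = start;
        while (current != null && seen.add(current)) {
            path.add(current);
            current = waitsFor.get(current);
        }
        return start.equals(current) ? path : new ArrayList<>();
    }

    // Builds the graph described by steps 1-4 of the report.
    static StateFlushWaitGraph fromReport() {
        StateFlushWaitGraph g = new StateFlushWaitGraph();
        // Step 1: A's state flush waits for B to answer the stabilization message.
        g.addWait("A: StateFlush", "B: StateStabilizationMessage");
        // Steps 2-3: B's stabilization waits for B's in-flight messages.
        g.addWait("B: StateStabilizationMessage",
                  "B: in-flight PartitionMessageWithDirectReply");
        // Step 4: those messages call createBucketAtomically, which waits on A.
        g.addWait("B: in-flight PartitionMessageWithDirectReply",
                  "A: bucket creation");
        // Assumption (inferred, not stated): A's bucket creation waits on its flush.
        g.addWait("A: bucket creation", "A: StateFlush");
        return g;
    }
}
```

Calling StateFlushWaitGraph.fromReport().findCycle("A: StateFlush") walks all 
four nodes and arrives back at the start, which is the circular wait that 
keeps either member from making progress.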



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
