albertogpz opened a new pull request, #7830:
URL: https://github.com/apache/geode/pull/7830

   A distributed deadlock can occur when stopping the gateway sender.
   It is caused by a race condition in which the stop gateway sender
   command, which holds the lifecycle lock, blocks indefinitely while
   trying to get the size of the queue from remote peers
   (ParallelGatewaySenderQueue.size() call), while at the same time
   a call to store an event in the queue tries to acquire
   that same lifecycle lock.
   
   Under heavy load, these two calls can deadlock and
   make the system unresponsive to all traffic requests (get, put, ...).
   
   To avoid this, the code that stores the event in the gateway
   sender queue (AbstractGatewaySender.distribute() call) no longer
   tries to acquire the lifecycle lock without a timeout;
   it now tries to acquire it with a timeout. If the
   attempt returns false, it checks whether the gateway sender is running. If
   it is not running, the event is dropped and there is no need to acquire the lock.
   Otherwise, the lifecycle lock acquisition is retried until it succeeds or
   the gateway sender is stopped.
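
The retry logic described above can be sketched roughly as follows. This is a minimal, standalone illustration, not the actual Geode code: the class `GatewaySenderSketch`, its fields, the `stop(Runnable)` signature, the 100 ms timeout, and the use of a `ReentrantReadWriteLock` (read lock for storing events, write lock for stopping) are all assumptions made for the example.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a gateway sender; names and structure are assumptions.
final class GatewaySenderSketch {
  private final ReentrantReadWriteLock lifeCycleLock = new ReentrantReadWriteLock();
  private final AtomicInteger stored = new AtomicInteger();
  private volatile boolean running = true;

  /** Returns true if the event was stored, false if it was dropped. */
  boolean distribute(Object event) throws InterruptedException {
    // Timed acquire in a loop instead of an unbounded lock() call.
    while (!lifeCycleLock.readLock().tryLock(100, TimeUnit.MILLISECONDS)) {
      if (!running) {
        // Sender is stopping or stopped: drop the event, no lock needed.
        return false;
      }
      // Still running: retry the acquire.
    }
    try {
      if (!running) {
        return false; // stopped between the last check and the acquire
      }
      stored.incrementAndGet(); // stands in for putting the event in the queue
      return true;
    } finally {
      lifeCycleLock.readLock().unlock();
    }
  }

  /** Stops the sender; the callback models slow work done while holding the lock. */
  void stop(Runnable whileHoldingLock) {
    lifeCycleLock.writeLock().lock();
    try {
      running = false;
      whileHoldingLock.run(); // e.g. a remote size() call that may block
    } finally {
      lifeCycleLock.writeLock().unlock();
    }
  }

  int storedCount() {
    return stored.get();
  }
}
```

In this sketch, a thread calling `distribute()` while `stop()` is stuck holding the write lock gives up after one timeout and drops the event, rather than waiting on the lock forever.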
   
   <!-- Thank you for submitting a contribution to Apache Geode. -->
   
   <!-- In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken: 
   -->
   
   ### For all changes:
   - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
   
   - [ ] Has your PR been rebased against the latest commit within the target branch (typically `develop`)?
   
   - [ ] Is your initial contribution a single, squashed commit?
   
   - [ ] Does `gradlew build` run cleanly?
   
   - [ ] Have you written or updated unit tests to verify your changes?
   
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   
   <!-- Note:
   Please ensure that once the PR is submitted, you check Concourse for build issues and
   submit an update to your PR as soon as possible. If you need help, please send an
   email to d...@geode.apache.org.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@geode.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
