HoustonPutman opened a new pull request, #3268:
URL: https://github.com/apache/solr/pull/3268
This is a small fix and I really don't think it warrants a JIRA and
especially a changelog entry. I can make a JIRA if people thinks it deserves
one.
The `DistributedApiAsyncTracker` mentioned that there could be a race
condition between completing an asynchronous request and checking its status.
This is causing very infrequent test failures, such as:
`ReindexCollectionTest.testAbort`.
The solution is to just check the ZK paths in reverse order from how they
are updated.
So when completing or canceling tasks, they are updated in the following
order:
1. `trackedAsyncTasks.put(asyncId, ...)` or
`trackedAsyncTasks.remove(asyncId)`
1. `inFlightAsyncTasks.deleteInFlightTask(asyncId)`
Therefore in `getAsyncTaskRequestStatus(asyncId)`, we need to check
`inFlightAsyncTasks` before `trackedAsyncTasks`. This means we can get a
false-positive "Submitted" or "Running" result (race condition described
below). But that will just lead to the client checking again at a later time,
and the next time they call, `inFlightAsyncTasks` will have been updated and we
will get the actual response from `trackedAsyncTasks`.
Before this PR, the race condition would give us a false-negative "Operation
failed. Please resubmit" result. (race condition described below). This would
tell the client to try again, when in fact the task could have been successful.
This false-negative is much worse than the false-positive described above.
Race condition before this PR: (false-negative)
1. `getAsyncTaskRequestStatus()` -- `trackedAsyncTasks` is checked -- no
response is found
1. `setTaskCompleted()` -- `trackedAsyncTasks` id is updated -- response is
put into ZK
1. `setTaskCompleted()` -- `inFlightAsyncTasks` id is deleted -- asyncID is
deleted from ZK
1. `getAsyncTaskRequestStatus()` -- `inFlightAsyncTasks ` is checked --
asyncId is not found
* Return a failure - Assume node died because `inFlightAsyncTasks `
ephemeral node is gone
Race condition after this PR: (false-positive)
1. `setTaskCompleted()` -- `trackedAsyncTasks` id is updated -- response is
put into ZK
1. `getAsyncTaskRequestStatus()` -- `inFlightAsyncTasks ` is checked --
asyncId is found
* Return that the task is in progress
1. `setTaskCompleted()` -- `inFlightAsyncTasks` id is deleted -- asyncID is
deleted from ZK
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]