markrmiller opened a new pull request, #4440: URL: https://github.com/apache/solr/pull/4440
https://issues.apache.org/jira/browse/SOLR-18244 Fix several concurrency bugs in ParallelHttpShardHandler Catch RejectedExecutionException from commExecutor and route through recordShardSubmitError so saturation/shutdown does not propagate synchronously and abandon already-submitted shard requests; this preserves shards.tolerant semantics and returns a 503 instead of 500. Synchronize submitFutures registration and removal on the same monitor HttpShardHandler uses for responseFutureMap so cancelAll observes a consistent view of both maps. Skip work in the runAsync runnable when the outer future has already been cancelled, avoiding a wasted lbClient.requestAsync call. In HttpShardHandler.takeCompletedIncludingErrors, replace the blocking responses.take() with a 50ms poll so subclasses that gate responsesPending() on an async tracker (e.g. submitFutures) cannot cause a lost-wakeup hang when the tracker drains between the pending check and the queue wait. Wrap the inner whenComplete body in a try/catch so a thrown transformResponse (or other failure inside the lambda) still enqueues a response and unblocks take() instead of stranding the consumer. Expose cancellationLock() and isCanceled() on HttpShardHandler so subclasses can synchronize on the same monitor that already guards the canceled flag, responseFutureMap, and the responses queue. Synchronize ParallelHttpShardHandler.responsesPending() on the cancellation monitor. Without the lock, take()'s loop performs three unsynchronized isEmpty() reads (responseFutureMap, responses, submitFutures) and can transiently observe all three as empty even though super.makeShardRequest's put and outer.whenComplete's submitFutures.remove are causally ordered. The intermediate state is consistent with each individual read but never with the cross-map invariant, so take() exits with null and the pending response (which arrives moments later via the inner whenComplete) is silently dropped by SearchHandler. Acquiring the lock serializes the three reads with every state-mutating operation. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
