gnodet commented on PR #14222: URL: https://github.com/apache/camel/pull/14222#issuecomment-2124614440
> > > > There's another busy-wait loop issue in `org.apache.camel.support.Task`. This is usually not a problem when performing non trivial tasks and re-attempting again in case of a failure, such as trying to connect to a remote server every 5 seconds. However, this package is also used to wait for trivial conditions, such as in the `ServicePool` where [a foreground task is used to wait for a service to be started every 5 milliseconds](https://github.com/apache/camel/blob/740c3b1a0235356346e8ee9ca38d0596513ca425/core/camel-support/src/main/java/org/apache/camel/support/cache/ServicePool.java#L139). I think this is an abuse of the framework. For such cases, I think an alternative strategy should be provided, using an implementation of `java.util.concurrent.Condition` that can be waited for. > > > > > > > > > Upon further investigation, the potential problem comes from this commit: [07b9c68](https://github.com/apache/camel/commit/07b9c68ea2a9a70cf3a3f42c1045fb9fd0acbe6a) I'm not sure how this is supposed to work: services usually inherit from `ServiceSupport` class where starting the service is a blocking operation. When the service is created, we call `addService` with `forceStart` set to true. My understanding is that returning from this call, the service will either be started, or the start will be deferred, but I don't really see how it can be in the `STARTING` state. @orpiske ? > > > > > > The key problem I was avoiding on [07b9c68ea2a9a70cf3a3f42c1045fb9fd0acbe6a](https://github.com/apache/camel/commit/07b9c68ea2a9a70cf3a3f42c1045fb9fd0acbe6a), was the `if (producer instanceof StatefulService) {` which caused a [type check miss](https://redhatperf.github.io/post/type-check-scalability-issue/) under high load. > > The STARTING state check had been [there before](https://github.com/apache/camel/commit/e653b815839be4fb5b1146a9fd8312a15d997318). Looking at the history of this change, it seems that the STARTING state check is related to [CAMEL-14048](https://issues.apache.org/jira/browse/CAMEL-14048) [and a problem with Netty](https://github.com/apache/camel/commit/0f9f8790cccf3e45af4fe9c46ec030c5e63d2083). > > Thx for the pointers. I've had another look, I think the fix [0f9f879](https://github.com/apache/camel/commit/0f9f8790cccf3e45af4fe9c46ec030c5e63d2083) is wrong and the problem was actually fixed correctly by [4c8c071](https://github.com/apache/camel/commit/4c8c071fba493dbf41f09103712036805133a5a8). This commit makes sure that the double-check locking only see a fully started service. > > I'll add a commit to remove that part of the code. Pushed a commit. However I think the `ServicePool` still has some synchronization issues. We use a `BlockingQueue`, but we're holding a lock to access it, so that's quite unexpected... That's for another PR though, maybe with the `ResequencerEngine` concurrency to avoid locking on the object itself. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@camel.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org