Not Fuseki-related per se, but I've experienced something similar when the HTTP client is running out of connections.
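
To illustrate what I mean, here is a toy sketch in plain Java. It is not Jena or HttpClient code: the class and method names are made up, and a Semaphore stands in for a bounded connection pool. It only shows that if connections are never handed back, every request after the first `poolSize` ones is stuck waiting. The sketch uses a timeout so it terminates; a client without one would just hang.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Toy model of a bounded HTTP connection pool (hypothetical, for illustration only). */
class PoolExhaustionDemo {

    /**
     * Pushes `requests` simulated requests through a pool of `poolSize` connections.
     * When `release` is false every request leaks its connection, as happens when
     * a response is never closed. Returns how many requests obtained a connection.
     */
    static int run(int poolSize, int requests, boolean release) throws InterruptedException {
        Semaphore pool = new Semaphore(poolSize);
        int served = 0;
        for (int i = 0; i < requests; i++) {
            // Without a timeout, a caller would block here indefinitely once the pool is empty.
            if (!pool.tryAcquire(100, TimeUnit.MILLISECONDS)) {
                continue; // gave up waiting for a free connection
            }
            served++;
            if (release) {
                pool.release(); // hand the connection back to the pool
            }
        }
        return served;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("connections released: " + run(2, 5, true) + " of 5 served");
        System.out.println("connections leaked:   " + run(2, 5, false) + " of 5 served");
    }
}
```

If something like this is in play, the server side can look healthy while the client never sees a response, so it may be worth checking how connections are closed on the client.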

On Tue, Feb 16, 2021 at 11:50 AM Dave Reynolds <dave.e.reyno...@gmail.com> wrote:
>
> We have a mysterious problem with fuseki in production that we've not
> seen before. Posting in case anyone has seen something similar and has
> any advice but I realise there's not really much here to go on.
>
> Environment:
> Fuseki 3.17 (was 3.16, tried upgrade just in case) using TDB1
> OpenJDK java 8
> Docker container (running in k8s pod)
> AWS EBS file system
> O(2k) small updates per day (uses RDFConnection to send update)
> Variable read request rate but issue hits at low request levels
>
> Symptoms are that fuseki receives an update request but never completes it:
>
> INFO 550175 POST http://localhost:3030/ds
> INFO 550175 Update
> INFO 550175 204 No Content (20 ms)
> INFO 550176 POST http://localhost:3030/ds
> INFO 550176 Update
> -->
> INFO 550178 Query = ASK { ?s ?p ?o }
> INFO 550178 GET http://localhost:3030/ds?query=ASK+%7B+%3Fs+%3Fp+%3Fo+%7D
> INFO 550179 GET http://localhost:3030/ds?query=ASK+%7B+%3Fs+%3Fp+%3Fo+%7D
> INFO 550179 Query = ASK { ?s ?p ?o }
>
> So no 204 return from request 550176.
>
> From that point on fuseki continues to log incoming read queries but
> does not answer any of them and the update request never terminates.
> Acts as if there's some form of deadlock.
>
> Update requests are serialised, there's never more than one in flight at
> a time.
>
> It's not the update itself that's the issue. It's small and if the
> container is restarted with the same data and the same update sequence
> is reapplied it all works fine.
>
> The jvm stats all look completely fine in the prometheus records.
>
> The various parts of this set up have been in various production
> settings without problems in the past. In particular, we've run the
> exact same pattern of mixed updates and queries in fuseki in a k8s
> environment for two years without ever having a lockup. But on a new
> deployment it's happening every few days.
>
> There are differences between the new and old deployments but the ones
> we've identified seem very unlikely to be the cause. We've not used
> RDFConnection in the client before but can't see how that could affect
> this. We don't often run with TDB on EBS but we do have a dozen
> instances of that around which haven't had problems. We have generally
> shifted to AWS Corretto as the jvm but we have plenty of OpenJDK
> instances around without problems. The docker image is slightly unusual
> in using the s6 overlay init system rather than running fuseki as the
> root process but again can't see how this might cause these symptoms and
> other uses of that, with fuseki, have been fine.
>
> We'll find a workaround eventually, possibly involving shifting to TDB2,
> but posting in case anyone has had an experience similar enough to this
> to give us some hints.
>
> Dave
>