The problem is probably that the OPTIONALs are not nested, i.e. the
queries should have this shape:

OPTIONAL {
  ...
  OPTIONAL {
    ...
    OPTIONAL {
      ...
    }
  }
}

In other words, you will have to fix your queries.
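
As a rough illustration only (a sketch, not a tested rewrite of your
query): in the quoted query, ?___2 is bound only inside the
def-stp:samplingPoint OPTIONAL, but a later, separate OPTIONAL uses it
as a subject. Whenever ?___2 is unbound, that later pattern is free to
match every spatialrelations:northing triple in the store. Nesting it
would look something like the following, where the single triple
pattern that binds ?item is my own stand-in for the three-way UNION:

```sparql
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX spatialrelations: <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/>
PREFIX def-stp: <http://environment.data.gov.uk/def/bwq-stp/>

SELECT ?item ?___2 ?___16 WHERE {
  # Stand-in (assumption) for the UNION block that binds ?item.
  ?item def-stp:riskLevel def-stp:increased .

  OPTIONAL {
    ?item def-stp:samplingPoint ?___3 .
    ?___3 geo:long ?___2
    # Nested: only tried for solutions where ?___2 was bound just above,
    # instead of running as a sibling OPTIONAL with ?___2 possibly unbound.
    OPTIONAL { ?___2 spatialrelations:northing ?___16 }
  }
}
```

The same would presumably apply to ?___4 (bound in the
def-ef:samplingPoint OPTIONAL, used later with geo:lat). Note that
nesting can change the results in the cases where the flat form was
matching unrelated triples.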

On Mon, 2024-09-09 at 14:10 +0000, Hugo Mills wrote:
> One addendum here:
>  
> I can confirm that almost all of the problematic queries are of a
> similar structure to the one I quoted here. They only differ in the
> trailing OPTIONAL clauses. None of the non-problematic queries have
> this structure.
>  
> Hugo.
>
> Hugo Mills
> 
> Development Team Leader
> 
> agrimetrics.co.uk
> 
> Reading Enterprise Centre, Whiteknights Road, Reading, UK, RG6 6BU
>
> From: Hugo Mills <[email protected]>
> Sent: 09 September 2024 11:12
> To: [email protected]
> Subject: RE: Fuseki: Query timeouts not timing out
>  
> 
> 
> (Apologies if the quoting is broken here, I'm having to use Outlook
> and do old-style quoting manually).
> 
> On 08 September 2024 11:39, Andy Seaborne wrote:
> > On 06/09/2024 16:23, Simon Bin wrote:
> > > As far as I know, there have been some fixes to query time-outs in
> > > the latest Jena versions.
> > 
> > Yes - and the area has been rewritten.
> > 
> > For this, and for the previous email, we need more information,
> > otherwise all we can do is speculate.
> > 
> > + workload - query only or query + frequent updates?
> 
> Mostly read queries. There are updates, but infrequent.
> 
> > + example queries
> 
> This is one of the ones that's been taking a long time:
> 
> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
> PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
> PREFIX spatialrelations: <http://data.ordnancesurvey.co.uk/ontology/spatialrelations/>
> PREFIX def-ef: <http://location.data.gov.uk/def/ef/SamplingPoint/>
> PREFIX def-bwq: <http://environment.data.gov.uk/def/bathing-water-quality/>
> PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
> PREFIX dgu: <http://reference.data.gov.uk/def/reference/>
> PREFIX dct: <http://purl.org/dc/terms/>
> PREFIX def-som: <http://environment.data.gov.uk/def/bwq-som/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX dcterms: <http://purl.org/dc/terms/>
> PREFIX def-stp: <http://environment.data.gov.uk/def/bwq-stp/>
> PREFIX def-bw: <http://environment.data.gov.uk/def/bathing-water/>
> 
> SELECT DISTINCT ?item WHERE
> {
> {
> BIND(xsd:dateTime(replace(str(now()), "\\..*$", "")) AS ?doi)
> {
> ?som def-bw:bathingWater ?bwt .
> ?som def-som:startOfSuspension ?start
> FILTER NOT EXISTS {?som dct:isReplacedBy ?som2 }
> }
> OPTIONAL { ?som def-som:endOfSuspension ?end }
> BIND(?som AS ?item)
> FILTER ( ( ( bound(?end) && ( ?start <= ?doi ) ) && ( ?doi <= ?end )
> ) || ( ( ! bound(?end) ) && ( ?start <= ?doi ) ) )
> }
> UNION
> {
> BIND(xsd:dateTime(replace(str(now()), "\\..*$", "")) AS ?doi)
> ?stp def-stp:riskLevel def-stp:increased .
> ?stp def-stp:bathingWater ?bwt .
> ?stp def-stp:publishedAt ?padt .
> ?stp def-stp:expiresAt ?end .
> ?stp def-stp:predictedAt ?start
> FILTER ( ( ?start <= ?doi ) && ( ?doi <= ?end ) )
> FILTER NOT EXISTS {
> ?stp2 def-stp:bathingWater ?bwt .
> ?stp2 def-stp:publishedAt ?padt2 .
> ?stp2 def-stp:expiresAt ?end2 .
> ?stp2 def-stp:predictedAt ?start2
> FILTER ( ( ( ?padt2 > ?padt ) && ( ?start2 <= ?doi ) ) && ( ?doi <=
> ?end2 ) )
> }
> BIND(?stp AS ?item)
> }
> UNION
> {
> BIND(xsd:dateTime(replace(str(now()), "\\..*$", "")) AS ?doi)
> {
> ?pi rdf:type def-som:PollutionIncident .
> ?pi def-bw:bathingWater ?bwt .
> ?pi def-som:startOfIncident ?start
> FILTER NOT EXISTS {?pi dct:isReplacedBy ?other }
> }
> OPTIONAL { ?pi def-som:endOfIncident ?end }
> FILTER ( ( ?start <= ?doi ) && ( ( ! bound(?end) ) || ( ?doi <= ?end
> ) ) )
> BIND(?pi AS ?item)
> }
> OPTIONAL {
> ?item def-stp:riskLevel ?___1 .
> ?___1 def-stp:riskNotation ?___0
> }
> OPTIONAL {
> ?item def-stp:samplingPoint ?___3 .
> ?___3 geo:long ?___2
> }
> OPTIONAL {
> ?item def-ef:samplingPoint ?___5 .
> ?___5 spatialrelations:northing ?___4
> }
> OPTIONAL {
> ?item dgu:uriSet ?___7 .
> ?___7 rdf:type ?___6
> }
> OPTIONAL {
> ?item def-bwq:sampleYear ?___9 .
> ?___9 skos:prefLabel ?___8
> }
> OPTIONAL {
> ?item def-stp:predictedAt ?___10
> }
> OPTIONAL {
> ?item def-bw:bathingWater ?___12 .
> ?___12 rdf:type ?___11
> }
> OPTIONAL {
> ?item def-som:incidentType ?___14 .
> ?___14 rdf:type ?___13
> }
> OPTIONAL {
> ?item def-som:notation ?___15
> }
> OPTIONAL {
> ?___2 spatialrelations:northing ?___16
> }
> OPTIONAL {
> ?item def-som:recordDateTime ?___17
> }
> OPTIONAL {
> ?item def-som:startOfIncident ?___18
> }
> OPTIONAL {
> ?item dct:description ?___19
> }
> OPTIONAL {
> ?___4 geo:lat ?___20
> }
> OPTIONAL {
> ?item def-som:nirsRef ?___21
> }
> }
> ORDER BY ?___0 DESC(?___2) DESC(?___4) ?___6 ?___8 ?___10
> DESC(?___11)
> DESC(?___13) ?___15 ?___16 ?___17 ?___18 DESC(?___19)
> DESC(?___20) DESC(?___21) ?item
> OFFSET 0
> LIMIT 25
> 
> I suspect that this query has been machine-generated by some code
> within the app (as I think I've mentioned, we didn't write this;
> we're just having to maintain it now). The original formatting is not
> what a human would have written. Most of the queries that the app is
> sending to Fuseki look similar. I haven't worked through what the
> differences on the long-running ones are yet, but will be doing that
> in detail today.
> 
> > + details about the data (e.g. how many triples, any inference being
> > applied, how is the data stored?)
> 
> Rough order of magnitude: 20MT. No inference, stored as TDB (see
> below).
> 
> > + Is Fuseki being run standalone or as a WAR file?
> 
> Standalone.
> 
> > + Fuseki configuration (config.ttl).
> 
> @prefix : <#> .
> @prefix fuseki: <http://jena.apache.org/fuseki#> .
> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> @prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
> @prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
> 
> [] rdf:type fuseki:Server ;
> ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue
> "20000,40000" ] ;
> ja:context [ ja:cxtName "arq:logExec" ; ja:cxtValue "all" ] ;
> ja:context [ ja:cxtName "arq:optReorderBGP" ; ja:cxtValue "all" ] ;
> 
> fuseki:services (
> <#service_ds>
> ) .
> 
> # TDB
> [] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .
> tdb:DatasetTDB rdfs:subClassOf ja:RDFDataset .
> tdb:GraphTDB rdfs:subClassOf ja:Model .
> 
> <#service_ds> rdf:type fuseki:Service ;
> rdfs:label "TDB Service (RW)" ;
> fuseki:name "ds" ;
> fuseki:serviceQuery "query" ;
> fuseki:serviceQuery "sparql" ;
> fuseki:serviceUpdate "update" ;
> fuseki:serviceUpload "upload" ;
> fuseki:serviceReadWriteGraphStore "data" ;
> # A separate read-only graph store endpoint:
> fuseki:serviceReadGraphStore "get" ;
> fuseki:dataset <#ds> ;
> .
> 
> <#ds> rdf:type tdb:DatasetTDB ;
> tdb:location "/fuseki/databases/DS-DB" ;
> tdb:unionDefaultGraph true ;
> .
> 
> > + JVM version, and heap setting. Did you adjust the settings after
> > https://lists.apache.org/thread/5d98v6j49zdy042fxwwtnkhcfvl69206 ?
> 
> OpenJDK 19, and, after that thread, we reduced the JVM memory
> settings to: -Xmx18g -Xms18g. (Yes, I realise that this is still
> huge, but it's now at least a lot smaller than the container).
> 
> > Jena 3.4.0 is very old. There are security fixes for Jena and for
> > dependencies that have happened over the last 7 years.
> 
> As I think I mentioned elsewhere (and in the thread linked above),
> last time we tried upgrading Fuseki in this deployment, we had
> problems with memory usage. But I guess we're going to have to try
> again.
> 
> > Andy
> 
> > Cheers,
> > 
> > On Fri, 2024-09-06 at 15:20 +0000, Hugo Mills wrote:
> > > 
> > > 
> > > 
> > > As described in an earlier email to this list, we’ve got a problem
> > > with our installation of Fuseki 3.4.0(*) where it goes into a state
> > > of very high CPU usage, and load average climbs to high levels,
> > > leading to the database becoming unresponsive. We’ve managed to
> > > extract some stats on queries when this happens, and while we
> > > haven’t been able to draw any conclusions on why it happens, we
> > > have spotted one odd thing.
> > > 
> > > We’ve got a server-wide timeout setting of “20000,40000”. When the
> > > server is behaving, almost all queries are answered within 100ms.
> > > When it’s not, we see queries running for much, much longer – in
> > > some cases, up to almost 6 minutes. This would seem to be something
> > > of a mismatch with the timeout settings. I would expect to see a
> > > bunch of queries being killed at shortly after 20s, and a bunch
> > > being killed at shortly after 40s, and nothing beyond that point.
> > > Why is Fuseki ignoring the timeout settings we’ve given it? Is
> > > there a parameter that can be passed to Fuseki (an HTTP header?)
> > > which overrides the default timeouts? Have we simply misunderstood
> > > what the timeout setting in the server config file does?
> > > 
> > > Thanks,
> > > Hugo.
> > > 
> > > (*) We’re running 3.4.0 because when we tried a later version, the
> > > application failed even more often than it does right now. I don’t
> > > know if the failure mode was the same as this one – that predates
> > > this current investigation by a year or more, and I wasn’t involved
> > > in that.
> > 
