osma commented on issue #3044:
URL: https://github.com/apache/jena/issues/3044#issuecomment-2720484943

   I've been investigating a similar problem - probably the same one - where 
Fuseki 5.3.0 seems to ignore the timeout on certain inefficient queries. In the 
worst case that I saw, a single query was allowed to run for hours, even with 
the timeout set to 1 second (1000 ms). If you run a public or semi-public 
SPARQL endpoint, such queries are almost bound to happen eventually, and they 
lead to a DoS-style situation, especially if long-running queries start piling 
up, because each of them fully utilizes a single CPU core.
   
   I only noticed this issue now, since it hadn't been created yet when I 
started my investigation. I'm adding some details and information about my case 
in the hope that it's helpful, even though the main issue seems already known 
and a potential fix is available in PR #3047. (I will test the fix as well and 
provide comments.)
   
   I'm using a TDB2 dataset `tdb2-yso` where the file 
[yso-skos.ttl](https://github.com/NatLibFi/Finto-data/blob/master/vocabularies/yso/yso-skos.ttl)
 has been loaded into the graph `http://www.yso.fi/onto/yso/`
   
   Here is a relatively simple (but stupid, I know) query that triggers the 
problem:
   
   ```
   PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
   SELECT *
   WHERE {
     GRAPH <http://www.yso.fi/onto/yso/> {
       ?uri ?p ?o .
       OPTIONAL {
         ?x skos:member ?o .
         FILTER NOT EXISTS {
           ?x skos:member ?other .
           FILTER NOT EXISTS {
             ?other skos:broader ?uri 
           }                       
         }                     
       }   
     }  VALUES (?uri) {
       (<http://www.yso.fi/onto/yso/p24009>) 
     } 
   }
   ```
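
To reproduce the timing independently of the server log, the request can be sent and timed from a small client. The following is a minimal sketch, not part of the original report: it assumes the endpoint `http://localhost:3030/ds/` seen in the logs and uses only the Python standard library. The function names and the `client_timeout_s` parameter are hypothetical.

```python
import time
import urllib.error
import urllib.request


def build_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol 'direct POST' request for a query string."""
    return urllib.request.Request(
        endpoint,
        data=query.encode("utf-8"),
        headers={
            "Content-Type": "application/sparql-query",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def timed_query(endpoint: str, query: str, client_timeout_s: float = 60.0):
    """Send the query and return (HTTP status, elapsed wall-clock seconds).

    client_timeout_s is only a client-side safety net (an assumption of this
    sketch); it is unrelated to the server-side arq:queryTimeout setting.
    If it expires, urllib raises an exception instead of returning.
    """
    request = build_request(endpoint, query)
    start = time.monotonic()
    try:
        with urllib.request.urlopen(request, timeout=client_timeout_s) as response:
            response.read()  # drain the body so the timing covers the full result
            status = response.status
    except urllib.error.HTTPError as err:
        # A cancelled query comes back as an HTTP error status (503 in the logs).
        status = err.code
    return status, time.monotonic() - start
```

Calling `timed_query("http://localhost:3030/ds/", query_text)` with the query above should return a 503 status together with the elapsed time, which can then be compared with the configured timeout.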
   
   Here is my Fuseki configuration. It sets a timeout of 2 seconds at the 
server level (just to demonstrate that it doesn't help either) and 1 second for 
the dataset:
   
   ```
   @prefix :      <http://base/#> .
   @prefix tdb2:  <http://jena.apache.org/2016/tdb#> .
   @prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
   @prefix ja:    <http://jena.hpl.hp.com/2005/11/Assembler#> .
   @prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
   @prefix fuseki: <http://jena.apache.org/fuseki#> .
   
   [] rdf:type fuseki:Server ;
      ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "2000" ] .
   
   :tdb2  a                   fuseki:Service ;
       rdfs:label                          "TDB2" ;
       fuseki:dataset                      :dataset ;
       fuseki:name                         "ds" ;
       fuseki:serviceQuery                 "query" , "sparql" ;
       fuseki:serviceReadGraphStore        "get" ;
       fuseki:serviceReadWriteGraphStore   "data" ;
       fuseki:serviceUpdate                "update" ;
       fuseki:serviceUpload                "upload" .
   
   :dataset
       a tdb2:DatasetTDB2 ;
       ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "1000" ] ;
       tdb2:location  "tdb2-yso" .
   ```
   
   Here is what happens with Fuseki 5.3.0 (and the current SNAPSHOT from 
`main`) when I run the query:
   
   ```
   10:53:06 INFO  Server          :: Apache Jena Fuseki 5.3.0
   10:53:06 INFO  Config          :: Fuseki Base = /home/oisuomin/proj/test-fuseki5-timeout/run
   10:53:06 INFO  Config          :: Database: /ds
   10:53:06 INFO  Config          :: UI Base = fuseki-server.jar
   10:53:06 INFO  Server          :: Configuration file: fuseki-config.ttl
   10:53:06 INFO  Server          :: Path = /ds
   10:53:06 INFO  Server          ::   Memory: 4.0 GiB
   10:53:06 INFO  Server          ::   Java:   21.0.6
   10:53:06 INFO  Server          ::   OS:     Linux 6.11.0-19-generic amd64
   10:53:06 INFO  Server          ::   PID:    480591
   10:53:06 INFO  Shiro           :: Shiro configuration: file:/home/oisuomin/proj/test-fuseki5-timeout/run/shiro.ini
   10:53:06 INFO  Server          :: Start Fuseki (http=3030)
   10:53:11 INFO  Fuseki          :: [3] POST http://localhost:3030/ds/
   10:53:11 INFO  Fuseki          :: [3] Query = PREFIX skos: <http://www.w3.org/2004/02/skos/core#> SELECT * WHERE {   GRAPH <http://www.yso.fi/onto/yso/> {     ?uri ?p ?o .     OPTIONAL {       ?x skos:member ?o .       FILTER NOT EXISTS {         ?x skos:member ?other .         FILTER NOT EXISTS {           ?other skos:broader ?uri          }                              }                          }      }  VALUES (?uri) {     (<http://www.yso.fi/onto/yso/p24009>)    }  }
   10:53:22 INFO  Fuseki          :: [3] 503 Service Unavailable (10.769 s)
   ```
   
   So the query runs for almost 11 seconds (10.769 s) before being cancelled. 
Neither the 1 second dataset timeout nor the 2 second server timeout cuts it 
off when it should.
   
   I tested earlier Jena Fuseki releases as well. 5.2.0, 5.1.0 and 5.0.0 all 
had the same problem. The last release where the timeout setting still worked 
was 4.10.0:
   
   ```
   10:52:00 INFO  Server          :: Apache Jena Fuseki 4.10.0
   10:52:00 INFO  Config          :: FUSEKI_HOME=/home/oisuomin/sw/apache-jena-fuseki-4.10.0
   10:52:00 INFO  Config          :: FUSEKI_BASE=/home/oisuomin/proj/test-fuseki5-timeout/run
   10:52:00 INFO  Config          :: Shiro file: file:///home/oisuomin/proj/test-fuseki5-timeout/run/shiro.ini
   10:52:01 INFO  Server          :: Path = /ds
   10:52:01 INFO  Server          ::   Memory: 4.0 GiB
   10:52:01 INFO  Server          ::   Java:   21.0.6
   10:52:01 INFO  Server          ::   OS:     Linux 6.11.0-19-generic amd64
   10:52:01 INFO  Server          ::   PID:    480212
   10:52:01 INFO  Server          :: Started 2025/03/13 10:52:01 EET on port 3030
   10:52:03 INFO  Fuseki          :: [1] POST http://localhost:3030/ds/
   10:52:03 INFO  Fuseki          :: [1] Query = PREFIX skos: <http://www.w3.org/2004/02/skos/core#> SELECT * WHERE {   GRAPH <http://www.yso.fi/onto/yso/> {     ?uri ?p ?o .     OPTIONAL {       ?x skos:member ?o .       FILTER NOT EXISTS {         ?x skos:member ?other .         FILTER NOT EXISTS {           ?other skos:broader ?uri          }                              }                          }      }  VALUES (?uri) {     (<http://www.yso.fi/onto/yso/p24009>)    }  }
   10:52:04 INFO  Fuseki          :: [1] 503 Service Unavailable (1.020 s)
   ```
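
When comparing several releases, the elapsed time can be pulled out of the response line that Fuseki logs and checked against the configured timeout. This is a minimal sketch that assumes the log format shown in the two excerpts above. The function names and the 500 ms default tolerance are hypothetical choices, not anything defined by Jena.

```python
import re

# Matches the response line Fuseki logs, e.g. "[3] 503 Service Unavailable (10.769 s)".
RESPONSE_LINE = re.compile(
    r"\[(?P<request_id>\d+)\]\s+(?P<status>\d{3})\s+(?P<reason>.*?)\s+"
    r"\((?P<elapsed>\d+(?:\.\d+)?)\s+s\)\s*$"
)


def parse_response_line(line: str):
    """Return (request id, HTTP status, elapsed seconds), or None if no match."""
    match = RESPONSE_LINE.search(line)
    if match is None:
        return None
    return int(match["request_id"]), int(match["status"]), float(match["elapsed"])


def exceeds_timeout(line: str, timeout_ms: int, tolerance_ms: int = 500) -> bool:
    """True if the logged elapsed time is above timeout_ms plus a tolerance.

    The tolerance is an assumption of this sketch: it allows for the small
    overhead between the timeout firing and the response being logged.
    """
    parsed = parse_response_line(line)
    if parsed is None:
        raise ValueError("not a Fuseki response log line: " + line)
    _, _, elapsed_s = parsed
    return elapsed_s * 1000 > timeout_ms + tolerance_ms
```

With `timeout_ms=1000`, `exceeds_timeout` returns `True` for the 5.3.0 response line (10.769 s) and `False` for the 4.10.0 one (1.020 s).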
   
   I'll provide comments on the PR separately in a moment (it seems to work).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
