If the dataset is read-only, then it is always empty. The fuseki:serviceReadWriteGraphStore is the only way to get data into the database. The other two services are read-only.

Are you sure it is the database growing and not Java loading classes? On a tight memory footprint, and because classes are loaded on-demand, there are other sources of RAM usage.

Also, the heap will grow until it hits the maximum heap size. Java does not run a full garbage collection until it needs to, so sending SHACL requests or read-only queries, for example, will grow the heap, and not all of that space is reclaimed until a full GC is done (Laura - this relates to heap size < real RAM size and never swap).

    Andy

On 27/07/2021 18:15, Marco Fiocco wrote:
On Tue, 27 Jul 2021 at 18:04, Andy Seaborne <[email protected]> wrote:



On 27/07/2021 14:19, Marco Fiocco wrote:
Hello,

I'm running an in-memory Fuseki 3.16 server and I see that the allocated
memory keeps growing linearly indefinitely even if idle.

That is strange because if there are no requests, it does no work.

Initially I reserved 1GB of memory and I've noticed that the process
gets OOM killed every 2 hours.

What pattern of usage is it getting?


Actually it's used as read-only. But the memory grows even if there is no
request.


Now I've allocated 2GB because I've read somewhere that 2GB is the
minimum for Java heaps. Is that true?

It's not that simple - you have an in-memory dataset so the space needed
is proportional to the amount of data.


At the moment the initial memory (with the dataset loaded with the REST
API) is around 600-700MB.
From that it grows by itself...


I'm waiting to see if it will get killed again.
Is this a bug, or is there a better way to configure it?

If you don't need the union graph, "rdf:type ja:MemoryDataset" is a
better in-memory choice. It has a smaller footprint and (I guess in
your setup you delete data as well as add it?) manages DELETE and PUT
better for GSP.  TDB in-memory is primarily a testing configuration.
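
A minimal sketch of what the dataset description might look like with that change, assuming the union default graph is not needed and reusing the <#dataset> name from the configuration quoted further down. The service description would stay as it is; the query timeout context setting is left out of this sketch and would need to be carried over separately.

```
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix :        <#> .

## General-purpose in-memory dataset (no union default graph).
<#dataset> rdf:type ja:MemoryDataset .
```

With this in place, the tdb: prefix, tdb:location and tdb:unionDefaultGraph lines are no longer used by the dataset description.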


Would the memory be lower if instead of in-memory we use on disk TDB or
TDB2?

Thanks



My Fuseki config is:

@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix :        <#> .

[] rdf:type fuseki:Server .

<#service> rdf:type fuseki:Service ;
      rdfs:label          "Dataset with SHACL validation" ;
      # See the endpoint url in build.gradle
      fuseki:name         "ds" ;
      # SPARQL Graph store protocol (read and write)
      fuseki:serviceReadWriteGraphStore "data" ;
      # SPARQL query service
      fuseki:endpoint  [ fuseki:operation fuseki:query ; fuseki:name "sparql" ] ;
      # SHACL query service
      fuseki:endpoint  [ fuseki:operation fuseki:shacl ; fuseki:name "shacl" ] ;
      fuseki:dataset      <#dataset> .

## In memory TDB with union graph.
<#dataset> rdf:type   tdb:DatasetTDB ;
    tdb:location "--mem--" ;
    # Query timeout on this dataset (1s, 1000 milliseconds)
    ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "1000" ] ;
    # Make the default graph be the union of all named graphs.
    tdb:unionDefaultGraph true .

Thanks
Marco


