Re: In-memory Fuseki keeps growing memory indefinitely even if idle

Andy Seaborne Tue, 27 Jul 2021 12:34:50 -0700

If the dataset is read-only, then it is always empty. The fuseki:serviceReadWriteGraphStore is the only way to get data into the database. The other two services are read-only.

Are you sure it is the database growing and not Java loading classes? On a tight memory footprint, and because classes are loaded on demand, there are other sources of RAM usage.

Also - the heap will grow until it hits the maximum heap size. Java does not run a full garbage collection until it needs to, so sending SHACL requests or read-only queries, for example, will grow the heap, and not all the space is reclaimed until a full GC is done. (Laura - this relates to heap size < real RAM size and never swap.)


    Andy

On 27/07/2021 18:15, Marco Fiocco wrote:

On Tue, 27 Jul 2021 at 18:04, Andy Seaborne <[email protected]> wrote:



On 27/07/2021 14:19, Marco Fiocco wrote:

Hello,

I'm running an in-memory Fuseki 3.16 server and I see that the allocated
memory keeps growing linearly indefinitely even if idle.

That is strange because if there are no requests, it does no work.

Initially I reserved 1GB of memory and I've noticed that the process
gets OOM killed every 2 hours.

What pattern of usage is it getting?


Actually it's used as read-only. But the memory grows even if there is no
request.

Now I've allocated 2GB because I've read somewhere that 2GB is the
minimum for Java heaps. Is that true?

It's not that simple - you have an in-memory dataset so the space needed
is proportional to the amount of data.

At the moment the initial memory (with the dataset loaded with the REST
API) is around 600-700MB.
From that it grows by itself...

I'm waiting to see if it will get OOM killed again.
Is this a bug, or is there a better way to configure it?


If you don't need the union graph, "rdf:type ja:MemoryDataset" is a
better in-memory choice. It has a smaller footprint and (I guess in
your setup you delete data as well as add it?) manages DELETE and PUT
better for GSP.  TDB in-memory is primarily a testing configuration.
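
For reference, a minimal sketch of what the dataset definition might look like with that suggestion applied. It assumes the prefixes and the <#service> block stay exactly as in the config quoted further down, and that the union default graph is not needed. Whether the ja:context query timeout carries over unchanged is not confirmed in this thread, so it is left out here.

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix ja:  <http://jena.hpl.hp.com/2005/11/Assembler#> .

## General-purpose in-memory dataset, replacing the in-memory TDB definition.
## Assumption: no union default graph is required.
<#dataset> rdf:type ja:MemoryDataset .
```

The service would keep pointing at <#dataset>, so only the dataset description changes.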

Would the memory be lower if instead of in-memory we use on disk TDB or
TDB2?

Thanks


My Fuseki config is:

@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix :        <#> .

[] rdf:type fuseki:Server .

<#service> rdf:type fuseki:Service ;
      rdfs:label          "Dataset with SHACL validation" ;
      # See the endpoint url in build.gradle
      fuseki:name         "ds" ;
      # SPARQL Graph store protocol (read and write)
      fuseki:serviceReadWriteGraphStore "data" ;
      # SPARQL query service
      fuseki:endpoint  [ fuseki:operation fuseki:query ; fuseki:name "sparql" ] ;
      # SHACL query service
      fuseki:endpoint  [ fuseki:operation fuseki:shacl ; fuseki:name "shacl" ] ;
      fuseki:dataset      <#dataset> .

## In memory TDB with union graph.
<#dataset> rdf:type   tdb:DatasetTDB ;
    tdb:location "--mem--" ;
    # Query timeout on this dataset (1s, 1000 milliseconds)
    ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "1000" ] ;
    # Make the default graph be the union of all named graphs.
    tdb:unionDefaultGraph true .

Thanks
Marco
