Re: In-memory Fuseki keeps growing memory indefinitely even if idle

Marco Fiocco Tue, 27 Jul 2021 14:17:22 -0700

Ok, let me clarify the steps.
I start Fuseki as a Docker container with the config you saw earlier.
Then I load the dataset with curl. After that I intend to use Fuseki as a
"read-only" query service.
At the moment, there are absolutely no queries being run, but the
allocated memory still keeps growing.


I've noticed 2 interesting things though.
Fuseki starts with the Java options "-Xmx2048m -Xms2048m" and I have now
reserved 3GB of RAM for the service.
Now:
- If I wait long enough, the memory keeps growing, but when it reaches about
2.4GB it is suddenly deallocated back to the size it had just after the
dataset was loaded (about 700MB).
- If I access the Fuseki /ds endpoint to download the data, THAT also
deallocates the memory back to 700MB.

Is this normal Java behaviour?


On Tue, 27 Jul 2021 at 20:34, Andy Seaborne <[email protected]> wrote:

> If the dataset is read-only, then it is always empty. The
> fuseki:serviceReadWriteGraphStore is the only way to get data into the
> database. The other two services are read-only.
>
> Are you sure it is the database growing and not Java loading classes? On
> a tight memory footprint, and because classes are loaded on-demand,
> there are other sources of RAM usage.
>
> Also - The heap will grow until it hits the heap size. Java does not
> run a full garbage collection until it needs to, so sending SHACL
> requests or read-only queries, for example, will grow the heap, and not
> all the space is reclaimed until a full GC is done (Laura - this relates
> to heap size < real RAM size and never swap).
>
>      Andy
>
> On 27/07/2021 18:15, Marco Fiocco wrote:
> > On Tue, 27 Jul 2021 at 18:04, Andy Seaborne <[email protected]> wrote:
> >
> >>
> >>
> >> On 27/07/2021 14:19, Marco Fiocco wrote:
> >>> Hello,
> >>>
> >>> I'm running an in-memory Fuseki 3.16 server and I see that the allocated
> >> memory keeps growing linearly indefinitely even if idle.
> >>
> >> That is strange because if there are no requests, it does no work.
> >>
> >>> Initially I reserved 1GB of memory and I've noticed that the process
> >> gets OOM killed every 2 hours.
> >>
> >> What pattern of usage is it getting?
> >>
> >
> > Actually it's used as read-only. But the memory grows even if there is no
> > request.
> >
> >
> >>    > Now I've allocated 2GB because I've read somewhere that 2GB is the
> >> minimum for Java heaps. Is that true?
> >>
> >> It's not that simple - you have an in-memory dataset so the space needed
> >> is proportional to the amount of data.
> >>
> >>
> > At the moment the initial memory (with the dataset loaded with the REST
> > API) is around 600-700MB.
> > From there it grows by itself...
> >
> >
> >>> I'm waiting to see if it will get killed again.
> >>> Is this a bug, or is there a better way to configure it?
> >>
> >> If you don't need the union graph, "rdf:type ja:MemoryDataset" is a
> >> better in-memory choice. It has a smaller footprint and (I guess in
> >> your setup you delete data as well as add it?) manages DELETE and PUT
> >> better for GSP.  In-memory TDB is primarily a testing configuration.
> >>
> >>
> > Would the memory usage be lower if we used on-disk TDB or TDB2 instead
> > of in-memory?
> >
> > Thanks
> >
> >>
> >>>
> >>> My Fuseki config is:
> >>>
> >>> @prefix fuseki:  <http://jena.apache.org/fuseki#> .
> >>> @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> >>> @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
> >>> @prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
> >>> @prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
> >>> @prefix :        <#> .
> >>>
> >>> [] rdf:type fuseki:Server .
> >>>
> >>> <#service> rdf:type fuseki:Service ;
> >>>       rdfs:label          "Dataset with SHACL validation" ;
> >>>       fuseki:name         "ds" ;    # See the endpoint url in build.gradle
> >>>       fuseki:serviceReadWriteGraphStore "data" ;    # SPARQL Graph store protocol (read and write)
> >>>       fuseki:endpoint  [ fuseki:operation fuseki:query ; fuseki:name "sparql" ] ;    # SPARQL query service
> >>>       fuseki:endpoint  [ fuseki:operation fuseki:shacl ; fuseki:name "shacl" ] ;     # SHACL query service
> >>>       fuseki:dataset      <#dataset> .
> >>>
> >>> ## In memory TDB with union graph.
> >>> <#dataset> rdf:type   tdb:DatasetTDB ;
> >>>     tdb:location "--mem--" ;
> >>>     # Query timeout on this dataset (1s, 1000 milliseconds)
> >>>     ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "1000" ] ;
> >>>     # Make the default graph be the union of all named graphs.
> >>>     tdb:unionDefaultGraph true .
> >>>
> >>> Thanks
> >>> Marco
> >>>
> >>
> >
>
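
For reference, Andy's suggestion quoted above (using "rdf:type ja:MemoryDataset"
instead of in-memory TDB) might look roughly like the sketch below. It is an
untested variant of my original config, not a confirmed fix for the memory
growth. It assumes the union default graph is not needed, so the
tdb:unionDefaultGraph setting and the tdb: prefix are dropped. The query
timeout from the original is also left out here, since I have not checked how
it carries over to this dataset type.

```turtle
@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix :        <#> .

[] rdf:type fuseki:Server .

# Service definition unchanged from the original config.
<#service> rdf:type fuseki:Service ;
    rdfs:label   "Dataset with SHACL validation" ;
    fuseki:name  "ds" ;
    # Kept read-write so the dataset can still be loaded with curl.
    fuseki:serviceReadWriteGraphStore "data" ;
    fuseki:endpoint [ fuseki:operation fuseki:query ; fuseki:name "sparql" ] ;
    fuseki:endpoint [ fuseki:operation fuseki:shacl ; fuseki:name "shacl" ] ;
    fuseki:dataset  <#dataset> .

# General-purpose in-memory dataset instead of TDB with location "--mem--".
# Assumption: no union default graph is required.
<#dataset> rdf:type ja:MemoryDataset .
```

Only the <#dataset> block differs from the original; the service name and
endpoint names stay the same, so the existing curl load and query URLs should
not need to change.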
