The new container-related JDK features might be relevant here. https://developer.oracle.com/java/technologies/javase/8u191-relnotes.html
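To check what the JVM actually picks up inside the container, it can help to print the heap ceiling it computed at startup. The following is only a minimal sketch: the class name HeapReport and the helper toMiB are made up for illustration, and it uses nothing beyond the standard java.lang.Runtime calls.

```java
// Hypothetical helper, not part of Fuseki or Jena: prints what this JVM
// believes its memory and CPU limits are.
class HeapReport {
    // Convert a byte count to whole mebibytes (integer division).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): the most memory the JVM will attempt to use for the heap.
        System.out.println("max heap (MiB):       " + toMiB(rt.maxMemory()));
        // totalMemory(): heap currently committed by the JVM.
        System.out.println("committed heap (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory(): unused part of the committed heap.
        System.out.println("free heap (MiB):      " + toMiB(rt.freeMemory()));
        System.out.println("processors:           " + rt.availableProcessors());
    }
}
```

Running it in the same image with the same JVM options as Fuseki (for example with and without -XX:MaxRAMPercentage=75) shows whether the heap ceiling follows the container memory limit. Note that this reports the heap only; the process size seen by the orchestrator will be larger.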
We use this in docker-compose:

    -XX:+UseContainerSupport -XX:MaxRAMPercentage=75

On Thu, 29 Jul 2021 at 11.46, Marco Fiocco <marco.fio...@gmail.com> wrote:

> I'm building a docker image with openjdk:14-alpine and cannot enable
> ShenandoahGC, even with the experimental feature flag.
> It seems that OpenJDK must be compiled with some feature in order to
> support that.
>
> However I've tuned the Java config with "-Xmx512m -Xms512m" and max
> reserved memory in my docker orchestrator to 1024MB, and it works fine.
> The memory keeps growing as usual but it gets flushed by the GC around 750MB
> so it's never OOM killed at least.
>
> Thanks
> Marco
>
> On Wed, 28 Jul 2021 at 11:01, Andy Seaborne <a...@apache.org> wrote:
>
> > On 27/07/2021 22:17, Marco Fiocco wrote:
> > > Ok let me clarify the steps.
> > > I start Fuseki as a docker container with the config you saw earlier.
> > > Then I load the dataset with curl. After that I intend to use Fuseki as
> > > "read only" query only.
> > > At the moment, there is absolutely no query being done, but still the
> > > memory allocated keeps growing.
> >
> > Does the Fuseki log show finished requests?
> >
> > > I've noticed 2 interesting things though:
> > > Fuseki starts with Java options "-Xmx2048m -Xms2048m" and I have now
> > > reserved 3GB of RAM for the service.
> > > Now
> > > - If I wait long enough, the memory keeps growing but when it reaches
> > > about 2.4GB it is suddenly deallocated back to starting size when the
> > > dataset was loaded (about 700MB)
> > > - if I access the Fuseki /ds endpoint to download the data, THAT also
> > > deallocates the memory back to 700MB.
> > >
> > > Is this normal Java behaviour?
> >
> > Yes.
> >
> > As queries happen, the heap will grow. There is some reclamation in
> > incremental GCs, which are very quick and happening much of the time,
> > but these GC cycles do not collect all unused space. A full GC can do that.
> >
> > The JDK runtime lets the heap become full with space that didn't get
> > reclaimed by an incremental GC, then it triggers a full GC. At that
> > point, the in-use space drops right back to the in-use space which
> > is the data of the in-memory dataset.
> >
> > Very roughly, Java takes about 0.5G over the heap size so 2.4GB is what
> > to expect before the runtime triggers a full GC and the full GC will
> > reduce the heap in use. The process size will not shrink.
> >
> > TDB1 itself isn't perfect at releasing unused data if deletion occurs
> > but your description doesn't include any deletes.
> >
> > > the process gets OOM killed every 2 hours
> >
> > Is that an exception or some OS control killing the process because it
> > exceeds some configuration limit? (ulimit, VM or container provision,
> > ....) Is that what you mean by "now reserved 3GB of RAM for the
> > service." because if it was 2G, that limit will be hit.
> >
> > The OS process will grow to more than 2G, the heap isn't the only use of
> > space in Java.
> >
> > Do you get an OutOfMemoryError (OOME) from Java, or does the OS (etc)
> > say it is too big?
> >
> > An OOME happens when the full GC does not release memory to fulfil a
> > request for space from Java.
> >
> > Capturing state at that point (Jerven's -XX:+HeapDumpOnOutOfMemoryError
> > suggestion) would help identify why. But is there a request in
> > progress at the time (see the Fuseki log)?
> >
> > Andy
> >
> > > On Tue, 27 Jul 2021 at 20:34, Andy Seaborne <a...@apache.org> wrote:
> > >
> > >> If the dataset is read-only, then it is always empty. The
> > >> fuseki:serviceReadWriteGraphStore is the only way to get data into the
> > >> database. The other two services are read-only.
> > >>
> > >> Are you sure it is the database growing and not Java loading classes? On
> > >> a tight memory footprint, and because classes are loaded on-demand,
> > >> there are other sources of RAM usage.
> > >>
> > >> Also - The heap will grow until it hits the heap size. Java does not
> > >> call a full garbage collection until it needs to so sending SHACL
> > >> requests, or read-only queries, for example, will grow the heap and not
> > >> all the space is reclaimed until a full GC is done (Laura - this relates
> > >> to heap size < real RAM size and never swap).
> > >>
> > >> Andy
> > >>
> > >> On 27/07/2021 18:15, Marco Fiocco wrote:
> > >>> On Tue, 27 Jul 2021 at 18:04, Andy Seaborne <a...@apache.org> wrote:
> > >>>
> > >>>> On 27/07/2021 14:19, Marco Fiocco wrote:
> > >>>>> Hello,
> > >>>>>
> > >>>>> I'm running an in-memory Fuseki 3.16 server and I see that the allocated
> > >>>>> memory keeps growing linearly indefinitely even if idle.
> > >>>>
> > >>>> That is strange because if there are no requests, it does no work.
> > >>>>
> > >>>>> Initially I reserved 1GB of memory and I've noticed that the process
> > >>>>> gets OOM killed every 2 hours.
> > >>>>
> > >>>> What pattern of usage is it getting?
> > >>>>
> > >>>
> > >>> Actually it's used as read-only. But the memory grows even if there is no
> > >>> request.
> > >>>
> > >>>>> Now I've allocated 2GB because I've read somewhere that 2GB is the
> > >>>>> minimum for Java heaps. Is that true?
> > >>>>
> > >>>> It's not that simple - you have an in-memory dataset so the space needed
> > >>>> is proportional to the amount of data.
> > >>>>
> > >>>
> > >>> At the moment the initial memory (with the dataset loaded with the REST
> > >>> API) is around 600-700MB.
> > >>> From that it grows by itself...
> > >>>
> > >>>>> I'm waiting to see if it will get again.
> > >>>>> Is this a bug or there is a better way to config it?
> > >>>>
> > >>>> If you don't need the union graph, "rdf:type ja:MemoryDataset" is a
> > >>>> better in-memory choice. It has a smaller foot print and (I guess in
> > >>>> your setup you delete data as well as add it?)
> > >>>> manages DELETE and PUT better for GSP. TDB in-memory is primarily a
> > >>>> testing configuration.
> > >>>>
> > >>>
> > >>> Would the memory be lower if instead of in-memory we use on disk TDB or
> > >>> TDB2?
> > >>>
> > >>> Thanks
> > >>>
> > >>>>> My Fuseki config is:
> > >>>>>
> > >>>>> @prefix fuseki: <http://jena.apache.org/fuseki#> .
> > >>>>> @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> > >>>>> @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
> > >>>>> @prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
> > >>>>> @prefix ja: <http://jena.hpl.hp.com/2005/11/Assembler#> .
> > >>>>> @prefix : <#> .
> > >>>>>
> > >>>>> [] rdf:type fuseki:Server .
> > >>>>>
> > >>>>> <#service> rdf:type fuseki:Service ;
> > >>>>>     rdfs:label "Dataset with SHACL validation" ;
> > >>>>>     # See the endpoint url in build.gradle
> > >>>>>     fuseki:name "ds" ;
> > >>>>>     # SPARQL Graph store protocol (read and write)
> > >>>>>     fuseki:serviceReadWriteGraphStore "data" ;
> > >>>>>     # SPARQL query service
> > >>>>>     fuseki:endpoint [ fuseki:operation fuseki:query ; fuseki:name "sparql" ] ;
> > >>>>>     # SHACL query service
> > >>>>>     fuseki:endpoint [ fuseki:operation fuseki:shacl ; fuseki:name "shacl" ] ;
> > >>>>>     fuseki:dataset <#dataset> .
> > >>>>>
> > >>>>> ## In memory TDB with union graph.
> > >>>>> <#dataset> rdf:type tdb:DatasetTDB ;
> > >>>>>     tdb:location "--mem--" ;
> > >>>>>     # Query timeout on this dataset (1s, 1000 milliseconds)
> > >>>>>     ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "1000" ] ;
> > >>>>>     # Make the default graph be the union of all named graphs.
> > >>>>>     tdb:unionDefaultGraph true .
> > >>>>>
> > >>>>> Thanks
> > >>>>> Marco