The new container-related JDK features might be relevant here.
https://developer.oracle.com/java/technologies/javase/8u191-relnotes.html

We use this in docker-compose:

-XX:+UseContainerSupport -XX:MaxRAMPercentage=75
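
As a rough illustration of what that percentage means for the numbers in this thread, here is a small sketch. It assumes the JVM caps the heap at roughly MaxRAMPercentage of the container memory limit; the real JVM applies further ergonomics, so treat the results as approximate. The function names are made up for this example and are not part of any JDK or Fuseki API.

```python
def max_heap_mb(container_limit_mb: float, max_ram_percentage: float) -> float:
    """Approximate heap cap for a given container limit and MaxRAMPercentage.

    Assumption: heap cap ~= limit * percentage / 100 (ignores JVM rounding).
    """
    if not 0 < max_ram_percentage <= 100:
        raise ValueError("max_ram_percentage must be in (0, 100]")
    return container_limit_mb * max_ram_percentage / 100.0


def non_heap_headroom_mb(container_limit_mb: float, max_ram_percentage: float) -> float:
    """Memory left under the container limit for everything that is not heap
    (metaspace, thread stacks, direct buffers, the JVM itself)."""
    return container_limit_mb - max_heap_mb(container_limit_mb, max_ram_percentage)


if __name__ == "__main__":
    # Container limits mentioned in the thread: 1024MB and 3GB.
    for limit in (1024, 3072):
        heap = max_heap_mb(limit, 75)
        spare = non_heap_headroom_mb(limit, 75)
        print(f"limit={limit}MB heap~{heap:.0f}MB non-heap headroom~{spare:.0f}MB")
```

Whatever is left after the heap has to cover all non-heap usage, so with a small container limit it may be worth checking that headroom against Andy's rough estimate of about 0.5G over the heap size.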

On Thu, 29 Jul 2021 at 11.46, Marco Fiocco <marco.fio...@gmail.com> wrote:

> I'm building a docker image with openjdk:14-alpine and cannot enable
> ShenandoahGC, even with the experimental feature flag.
> It seems that OpenJDK must be compiled with some feature in order to
> support that.
>
> However I've set the Java options to "-Xmx512m -Xms512m" and the max
> reserved memory in my docker orchestrator to 1024MB, and it works fine.
> The memory keeps growing as usual but it gets flushed by the GC around 750MB
> so at least it's never OOM killed.
>
> Thanks
> Marco
>
>
>
> On Wed, 28 Jul 2021 at 11:01, Andy Seaborne <a...@apache.org> wrote:
>
> >
> >
> > On 27/07/2021 22:17, Marco Fiocco wrote:
> > > Ok let me clarify the steps.
> > > I start Fuseki as a docker container with the config you saw earlier.
> > > Then I load the dataset with curl. After that I intend to use Fuseki
> > > for "read only" queries only.
> > > At the moment, there is absolutely no query being run, but the
> > > allocated memory still keeps growing.
> >
> > Does the Fuseki log show finished requests?
> >
> > >
> > > I've noticed 2 interesting things though:
> > > Fuseki starts with Java options "-Xmx2048m -Xms2048m" and I have now
> > > reserved 3GB of RAM for the service.
> > > Now
> > > - If I wait long enough, the memory keeps growing, but when it reaches
> > > about 2.4GB it is suddenly deallocated back to the size it had when the
> > > dataset was loaded (about 700MB)
> > > - If I access the Fuseki /ds endpoint to download the data, THAT also
> > > deallocates the memory back to 700MB.
> > >
> > > Is this normal Java behaviour?
> >
> > Yes.
> >
> > As queries happen, the heap will grow. There is some reclamation in
> > incremental GCs, which are very quick and happening much of the time,
> > but these GC cycles do not collect all unused space. A full GC can do
> > that.
> >
> > The JDK runtime lets the heap fill up with space that didn't get
> > reclaimed by an incremental GC, then it triggers a full GC. At that
> > point, the in-use space drops right back to the live data, which here
> > is the data of the in-memory dataset.
> >
> > Very roughly, Java takes about 0.5G over the heap size so 2.4GB is what
> > to expect before the runtime triggers a full GC, and the full GC will
> > reduce the heap in use. The process size will not shrink.
> >
> > TDB1 itself isn't perfect at releasing unused data if deletion occurs
> > but your description doesn't include any deletes.
> >
> > > the process gets OOM killed every 2 hours
> >
> > Is that an exception, or some OS control killing the process because it
> > exceeds some configuration limit (ulimit, VM or container provision,
> > ....)? Is that what you mean by "now reserved 3GB of RAM for the
> > service"? Because if it was 2G, that limit will be hit.
> >
> > The OS process will grow to more than 2G, the heap isn't the only use of
> > space in Java.
> >
> > Do you get an OutOfMemoryError (OOME) from Java, or does the OS (etc.)
> > say it is too big?
> >
> > An OOME happens when the full GC does not release memory to fulfil a
> > request for space from Java.
> >
> > Capturing state at that point (Jerven's -XX:+HeapDumpOnOutOfMemoryError
> >   suggestion) would help identify why. But is there a request in
> > progress at the time (see the Fuseki log)?
> >
> >      Andy
> >
> > > On Tue, 27 Jul 2021 at 20:34, Andy Seaborne <a...@apache.org> wrote:
> > >
> > >> If the dataset is read-only, then it is always empty. The
> > >> fuseki:serviceReadWriteGraphStore is the only way to get data into the
> > >> database. The other two services are read-only.
> > >>
> > >> Are you sure it is the database growing and not Java loading classes?
> > >> On a tight memory footprint, and because classes are loaded on-demand,
> > >> there are other sources of RAM usage.
> > >>
> > >> Also - The heap will grow until it hits the heap size. Java does not
> > >> call a full garbage collection until it needs to, so sending SHACL
> > >> requests, or read-only queries, for example, will grow the heap and
> > >> not all the space is reclaimed until a full GC is done (Laura - this
> > >> relates to heap size < real RAM size and never swap).
> > >>
> > >>       Andy
> > >>
> > >> On 27/07/2021 18:15, Marco Fiocco wrote:
> > >>> On Tue, 27 Jul 2021 at 18:04, Andy Seaborne <a...@apache.org> wrote:
> > >>>
> > >>>>
> > >>>>
> > >>>> On 27/07/2021 14:19, Marco Fiocco wrote:
> > >>>>> Hello,
> > >>>>>
> > >>>>> I'm running an in-memory Fuseki 3.16 server and I see that the
> > >>>>> allocated memory keeps growing linearly indefinitely even if idle.
> > >>>>
> > >>>> That is strange because if there are no requests, it does no work.
> > >>>>
> > >>>>> Initially I reserved 1GB of memory and I've noticed that the
> > >>>>> process gets OOM killed every 2 hours.
> > >>>>
> > >>>> What pattern of usage is it getting?
> > >>>>
> > >>>
> > >>> Actually it's used as read-only. But the memory grows even if there
> > >>> are no requests.
> > >>>
> > >>>
> > >>>>> Now I've allocated 2GB because I've read somewhere that 2GB is
> > >>>>> the minimum for Java heaps. Is that true?
> > >>>>
> > >>>> It's not that simple - you have an in-memory dataset so the space
> > >>>> needed is proportional to the amount of data.
> > >>>>
> > >>>>
> > >>> At the moment the initial memory (with the dataset loaded with
> > >>> the REST API) is around 600-700MB.
> > >>> From there it grows by itself...
> > >>>
> > >>>
> > >>>>> I'm waiting to see if it will get killed again.
> > >>>>> Is this a bug or is there a better way to configure it?
> > >>>>
> > >>>> If you don't need the union graph, "rdf:type ja:MemoryDataset" is a
> > >>>> better in-memory choice. It has a smaller footprint and (I guess in
> > >>>> your setup you delete data as well as add it?) manages DELETE and PUT
> > >>>> better for GSP. TDB in-memory is primarily a testing configuration.
> > >>>>
> > >>>>
> > >>> Would the memory be lower if instead of in-memory we use on-disk TDB
> > >>> or TDB2?
> > >>>
> > >>> Thanks
> > >>>
> > >>>>
> > >>>>>
> > >>>>> My Fuseki config is:
> > >>>>>
> > >>>>> @prefix fuseki:  <http://jena.apache.org/fuseki#> .
> > >>>>> @prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
> > >>>>> @prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
> > >>>>> @prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
> > >>>>> @prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
> > >>>>> @prefix :        <#> .
> > >>>>>
> > >>>>> [] rdf:type fuseki:Server .
> > >>>>>
> > >>>>> <#service> rdf:type fuseki:Service ;
> > >>>>>        rdfs:label          "Dataset with SHACL validation" ;
> > >>>>>        fuseki:name         "ds" ;    # See the endpoint url in build.gradle
> > >>>>>        fuseki:serviceReadWriteGraphStore "data" ;    # SPARQL Graph store protocol (read and write)
> > >>>>>        fuseki:endpoint  [ fuseki:operation fuseki:query ; fuseki:name "sparql" ] ;    # SPARQL query service
> > >>>>>        fuseki:endpoint  [ fuseki:operation fuseki:shacl ; fuseki:name "shacl" ] ;    # SHACL query service
> > >>>>>        fuseki:dataset      <#dataset> .
> > >>>>>
> > >>>>> ## In memory TDB with union graph.
> > >>>>> <#dataset> rdf:type   tdb:DatasetTDB ;
> > >>>>>      tdb:location "--mem--" ;
> > >>>>>      # Query timeout on this dataset (1s, 1000 milliseconds)
> > >>>>>      ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "1000" ] ;
> > >>>>>      # Make the default graph be the union of all named graphs.
> > >>>>>      tdb:unionDefaultGraph true .
> > >>>>>
> > >>>>> Thanks
> > >>>>> Marco
> > >>>>>
> > >>>>
> > >>>
> > >>
> > >
> >
>
