Hi Christophe,

This behaviour is to be expected. It is a sign that there is a lot of unused memory and the file system cache has used it. All unused memory (Linux, Mac, Windows) is used by the OS for the file system cache automatically.

> Is there a way to monitor that better than just record memory use?

VisualVM shows heap usage over time and also lets you force a garbage collection and inspect the contents of the heap.


There is no need to set the heap as high as 20G - in fact, it will slow the server down!


The figures top(1) shows are the total virtual memory (VIRT) and resident memory (RES) for the whole OS process.

For Fuseki (TDB2) this is not the heap size (which is why the process size VIRT is showing larger than the heap).
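
To make the difference between process size and heap size concrete, here is a minimal sketch (not part of Fuseki) that prints both for the JVM it runs in. It assumes Linux, where /proc/self/status exposes VmSize (what top reports as VIRT) and VmRSS (RES); the class name ProcessVsHeap and its methods are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: compares whole-process memory with the Java heap.
class ProcessVsHeap {
    // Split "Key:<whitespace>value" lines, the layout used by /proc/<pid>/status on Linux.
    static Map<String, String> parseStatus(List<String> lines) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (String line : lines) {
            int colon = line.indexOf(':');
            if (colon > 0) {
                fields.put(line.substring(0, colon), line.substring(colon + 1).trim());
            }
        }
        return fields;
    }

    static long toMB(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) throws IOException {
        // Assumption: Linux. On other systems the file is absent and "n/a" is printed.
        Path status = Path.of("/proc/self/status");
        Map<String, String> fields = Files.isReadable(status)
                ? parseStatus(Files.readAllLines(status))
                : Map.of();

        Runtime rt = Runtime.getRuntime();
        System.out.println("Process VIRT (VmSize): " + fields.getOrDefault("VmSize", "n/a"));
        System.out.println("Process RES  (VmRSS):  " + fields.getOrDefault("VmRSS", "n/a"));
        System.out.println("Heap max (-Xmx):       " + toMB(rt.maxMemory()) + " MB");
        System.out.println("Heap committed:        " + toMB(rt.totalMemory()) + " MB");
    }
}
```

This only reports on its own JVM, so it illustrates which numbers are which rather than measuring a running Fuseki server.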

TDB2 uses memory mapped files. These files are in the OS file system cache and become part of ("mapped") the process's virtual memory. The OS manages which areas of the file are really in-memory and which aren't.
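
As a rough illustration of why mapped files add to the process size without adding to the heap, the sketch below maps a temporary file with the standard java.nio API and touches it. It is not TDB2 code; the class name MappedFileDemo, the 256 MB size, and the 4096-byte page size are assumptions chosen for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo: a mapped file lives in the process's virtual memory, not on the Java heap.
class MappedFileDemo {
    // Map a temporary file of the given size and touch one byte per (assumed) 4096-byte page,
    // so the OS has to bring those pages into the file system cache.
    static MappedByteBuffer mapTempFile(int sizeBytes) {
        try {
            Path file = Files.createTempFile("mapped-demo", ".dat");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // Mapping READ_WRITE past the end of the file grows the file to fit.
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, sizeBytes);
                for (int pos = 0; pos < sizeBytes; pos += 4096) {
                    buf.put(pos, (byte) 1);
                }
                return buf; // the mapping remains valid after the channel is closed
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static long heapUsedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = heapUsedBytes();
        MappedByteBuffer buf = mapTempFile(256 * 1024 * 1024);
        long after = heapUsedBytes();
        System.out.println("Mapped bytes:       " + buf.capacity());
        System.out.println("Heap growth (bytes): " + (after - before));
    }
}
```

Running this while watching top should show VIRT grow by roughly the mapped size, while the reported heap growth stays small by comparison.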

The OS will grow the file system cache to use all available memory for resident segments of files. It will automatically shrink the resident space if there is demand from other processes. But the files are still in the virtual memory space of the process.

So the virtual space becomes the entire space for the index files but not all of the files are in real memory. top(1) includes the size of files touched.

If you want to see the heap, use VisualVM.


The trouble with a big heap is that Java will grow the heap while doing lightweight garbage collections, but not do a full GC until it gets close to the max heap size. Only a full GC frees up all unused memory; the lightweight GCs trade complete reclamation for a low performance impact, so they do not reclaim everything. You will see a slowly rising saw-tooth in VisualVM, then a big drop as a full GC cuts in.
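
For a programmatic view of the same information, the standard java.lang.management beans expose heap usage and per-collector counts. The following is a minimal sketch with a made-up class name (GcReport); it reports on the JVM it runs in, so to watch Fuseki itself you would still attach a tool such as VisualVM to the server process.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: prints heap usage and how often each collector has run.
class GcReport {
    static MemoryUsage heapUsage() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    // One line per collector. Collector names depend on the GC in use,
    // so they are printed as reported rather than interpreted.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms total");
        }
        return lines;
    }

    public static void main(String[] args) {
        MemoryUsage heap = heapUsage();
        long mb = 1024 * 1024;
        System.out.println("Heap used:      " + heap.getUsed() / mb + " MB");
        System.out.println("Heap committed: " + heap.getCommitted() / mb + " MB");
        // getMax() returns -1 when no maximum is defined.
        System.out.println("Heap max:       " + heap.getMax() / mb + " MB");
        describeCollectors().forEach(System.out::println);
    }
}
```

Sampled periodically, the "used" figure gives the saw-tooth described above, and the collection counts show how often each collector has run.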

A smaller heap stops the JVM delaying the full GC and reduces the throughput impact of the full GC.

But a growing heap is squeezing out the resident parts of the virtual memory from the index files. The OS does not know that some of the heap space is (probably) unused.

Less file system cache for the memory mapped files means more I/O to manage virtual vs resident memory, which means Fuseki is slower.

    Andy

On 20/10/2022 12:29, christophe heligon wrote:
Hi everyone,

I am using Apache Jena Fuseki 4.6.1 to host 2 datasets (rdf + rdf-star).

I have noticed an unexpected behaviour on my server: my RAM fills up little by little over time (about 100 MB every few minutes, which can add up to several GB) when no queries are received by the server. When I submit a query using the Fuseki GUI, the RAM is freed (at least partially) from what seemed to have accumulated in it between queries.

The Java process is using that memory (see attached file). Note that I use the standard configuration provided by the server except for the max memory, which I set to 20 GB. The Java process largely exceeds this limit, as it may reach over 50 GB.

Is it a known issue of Fuseki? Java?
Is there a way to monitor that better than just record memory use?

Any fix published or hint on how to solve that?

Best regards,
Christophe

--

Christophe Héligon
Institut de Génétique & Développement de Rennes
Equipe Ingénierie Inverse de la Division Cellulaire (CeDRE)
christophe.heli...@univ-rennes1.fr
https://igdr.univ-rennes1.fr
UMR 6290 CNRS - UR1, ERL Inserm U1305
Campus Santé de Villejean, 2 avenue du Professeur Léon Bernard,
CS 34317 / 35043 Rennes Cedex, France
