Hi Christophe,
This behaviour is to be expected. It is a sign there is a lot of unused
memory and the file system cache has used it. All unused memory (Linux,
Mac, Windows) is used by the OS for the file system cache automatically.
> Is there a way to monitor that better than just record memory use?
VisualVM shows the heap usage over time, lets you look at the contents
of the heap, and allows you to force a garbage collection.
There is no need to set the heap as high as 20G - in fact, it will slow
the server down!
The figure top(1) shows is the total virtual memory (VIRT) and resident
memory (RES) for the whole OS process.
For Fuseki (TDB2) this is not the heap size (which is why the process
size VIRT is showing larger than the heap).
TDB2 uses memory mapped files. These files are in the OS file system
cache and become part of ("mapped") the process's virtual memory. The OS
manages which areas of the file are really in-memory and which aren't.
The OS will grow the file system cache to use all available memory for
resident segments of files. It will automatically shrink the resident
space if there is demand from other processes. But the files are still
in the virtual memory space of the process.
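
To illustrate the mechanism (this is not TDB2's code, just a minimal
sketch using the standard java.nio API), the following maps a temporary
file and touches one byte per page. The class name MappedFileDemo, the
method mapAndTouch, and the 4096-byte page size are assumptions made for
the example. The mapped region counts towards the virtual size of the
process, and the touched pages become resident, but none of it is Java
heap.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class MappedFileDemo {
    // Assumed page size; the real value depends on the OS and hardware.
    private static final int PAGE_SIZE = 4096;

    /**
     * Maps a temporary file of the given size into the process's virtual
     * memory, writes one byte per page, and returns the number of pages
     * touched. The mapped bytes live outside the Java heap.
     */
    static int mapAndTouch(int sizeBytes) throws IOException {
        Path file = Files.createTempFile("mapped-demo", ".bin");
        file.toFile().deleteOnExit();
        try (FileChannel channel = FileChannel.open(
                file, StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            // A READ_WRITE mapping extends the file to the requested size.
            MappedByteBuffer buffer =
                    channel.map(FileChannel.MapMode.READ_WRITE, 0, sizeBytes);
            int touched = 0;
            for (int pos = 0; pos < sizeBytes; pos += PAGE_SIZE) {
                buffer.put(pos, (byte) 1); // the OS pages this part in
                touched++;
            }
            return touched;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("pages touched: " + mapAndTouch(16 * 1024 * 1024));
    }
}
```

Watching such a process in top(1) while the heap stays small shows the
same pattern as Fuseki: VIRT and RES grow with the mapped file, not with
the heap.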
So the virtual space becomes the entire space for the index files but
not all of the files are in real memory. top(1) includes the size of
files touched.
If you want to see the heap, use VisualVM.
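
If you only want the numbers, the JVM exposes the same heap figures
through the standard java.lang.management API. The sketch below reports
the heap of the JVM it runs in, so it does not inspect a running Fuseki
by itself; for that you would read the same MemoryMXBean over JMX, which
is what VisualVM does. The class name HeapReport and the method
heapSummary are made up for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

class HeapReport {
    private static final long MIB = 1024L * 1024L;

    /** Returns a one-line summary of the heap of the JVM this code runs in. */
    static String heapSummary() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() returns -1 when no maximum is defined.
        String max = heap.getMax() < 0
                ? "undefined"
                : (heap.getMax() / MIB) + " MiB";
        return String.format("heap used=%d MiB, committed=%d MiB, max=%s",
                heap.getUsed() / MIB, heap.getCommitted() / MIB, max);
    }

    public static void main(String[] args) {
        System.out.println(heapSummary());
    }
}
```

Comparing "used" and "committed" here with RES and VIRT from top(1)
makes the difference between heap and process size visible.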
The trouble with a big heap is that Java will grow the heap while doing
lightweight garbage collections, but not do a full GC until it gets
close to the max heap size. Only a full GC frees up all unused memory;
the lightweight GCs trade complete reclamation for a low performance
impact, so they do not reclaim everything. You will see a slowly rising
saw-tooth in VisualVM, then a big drop as a full GC cuts in.
A smaller heap stops the JVM delaying the full GC and makes the
throughput impact of the full GC smaller.
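
One way to see how often collections run, without VisualVM, is the
standard GarbageCollectorMXBean. This is a rough sketch: the class name
GcReport and the method gcSummary are made up, the collector names it
prints depend on which GC the JVM was started with, and like any
in-process check it reports on the JVM it runs in.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcReport {
    /** Returns one line per collector with its run count and total time. */
    static String gcSummary() {
        StringBuilder summary = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start (-1 if unavailable).
            summary.append(String.format("%s: collections=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return summary.toString();
    }

    public static void main(String[] args) {
        System.out.print(gcSummary());
    }
}
```

With the common generational collectors there is typically one bean for
the lightweight (young) collections and one for the full (old)
collections, so the two counts can be compared over time.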
But a growing heap is squeezing out the resident parts of the virtual
memory from the index files. The OS does not know that some of the heap
space is (probably) unused.
There is then less file system cache for the memory mapped files, which
means more I/O to manage virtual vs resident, which means Fuseki is
slower.
Andy
On 20/10/2022 12:29, christophe heligon wrote:
Hi everyone,
I am using Apache Jena Fuseki 4.6.1 to host 2 datasets (rdf + rdf-star).
I have noticed an unexpected behaviour on my server as my RAM gets
filled little by little over time (100 MB every few minutes, which can
be several GB at the end) when no queries are received by the server.
When I submit a query using the Fuseki GUI the RAM gets freed (at least
partially) from what seemed to have accumulated into it in between
queries.
The Java process is using that memory (see attached file). Note that I
use the standard configuration provided by the server except for the max
memory that I set to 20 GB. The Java process is largely exceeding this
limit as it may reach over 50 GB.
Is it a known issue of Fuseki? Java?
Is there a way to monitor that better than just record memory use?
Any fix published or hint on how to solve that?
Best regards,
Christophe
--
Christophe Héligon
Institut de Génétique & Développement de Rennes
Equipe Ingénierie Inverse de la Division Cellulaire (CeDRE)
christophe.heli...@univ-rennes1.fr
<mailto:christophe.heli...@univ-rennes1.fr>
https://igdr.univ-rennes1.fr
UMR 6290 CNRS - UR1, ERL Inserm U1305
Campus Santé de Villejean, 2 avenue du Professeur Léon Bernard,
CS 34317 / 35043 Rennes Cedex, France