Can you give us your actual Fuseki config (i.e. assembler file)? Or are you 
repeatedly creating new datasets via the admin API?
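For reference, a minimal assembler configuration for an in-memory dataset looks roughly like the sketch below. The service name, endpoint names, and the exact dataset type are assumptions and vary between Fuseki versions, so treat it as a starting point rather than a drop-in file:

```turtle
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix ja:     <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# Hypothetical service called "ds"; adjust the name and endpoints to your setup.
<#service> rdf:type fuseki:Service ;
    fuseki:name          "ds" ;
    fuseki:serviceQuery  "query" ;
    fuseki:serviceUpdate "update" ;
    fuseki:dataset       <#dataset> .

# A purely in-memory dataset; the exact class differs across versions
# (e.g. ja:MemoryDataset in newer releases), so check the docs for yours.
<#dataset> rdf:type ja:MemoryDataset .
```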

---
A. Soroka
The University of Virginia Library

> On Jan 6, 2017, at 10:43 AM, Janda, Radim <radim.ja...@reporters.cz> wrote:
> 
> Hello,
> we use in-memory datasets.
> The JVM is big enough, but as we process thousands of small data sets,
> memory is allocated continuously.
> At the moment we restart Fuseki every hour to avoid out-of-memory errors.
> However, performance also degrades over time (before the restart), which is
> why we are looking for a way to clean up memory.
> 
> Radim
> 
> On Fri, Jan 6, 2017 at 4:12 PM, Andy Seaborne <a...@apache.org> wrote:
> 
>> Are you using persistent or in-memory datasets for your working storage?
>> 
>> If you really mean memory (RAM), are you sure the JVM is big enough?
>> 
>> Fuseki tries to avoid holding on to cached transactions, but if the server
>> is under heavy read load (Rob's point) then they can build up (solution:
>> reduce the read load for a short while). TDB does also try to switch to
>> emergency measures after a while, but by then the RAM usage may already
>> have grown too much.
>> 
>>    Andy
>> 
>> 
>> On 06/01/17 14:07, Rob Vesse wrote:
>> 
>>> Deleting data does not reclaim all the memory; exactly what is and isn't
>>> reclaimed depends somewhat on your exact usage pattern.
>>> 
>>> The B+Trees, which are the primary data structure of TDB (the default
>>> database used in Fuseki), do not reclaim space. They are also potentially
>>> subject to fragmentation, so the memory used tends to grow over time. The
>>> node table portion of the database, the mapping from RDF terms to internal
>>> database identifiers, is a sequential data structure that will only ever
>>> grow over time. It is also worth noting that many of the data structures
>>> are backed by memory-mapped files, which are off-heap and subject to the
>>> vagaries of how your OS handles them.
>>> 
>>> Additionally, if you place Fuseki under continuous load, TDB may be
>>> blocked from writing the in-memory journal back to disk, which can cause
>>> the journal to grow unbounded over time and prevent memory from being
>>> reclaimed. Adding occasional pauses between operations can help alleviate
>>> this.
>>> 
>>> As Lorenz notes, for this kind of use case you may not need Fuseki at all
>>> and could simply drive TDB programmatically instead.
>>> 
>>> As a general point, creating a fresh database rather than reusing an
>>> existing one will use memory much more efficiently. However, if you're
>>> running on Windows then there is a known OS-specific JVM bug that can
>>> cause memory-mapped files to not be properly deleted until after the
>>> process exits.
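A rough sketch of the fresh-dataset-per-batch approach, driven through Fuseki's admin protocol (`/$/datasets`). The server URL and dataset name below are assumptions; only the request-building helpers are implemented, and the actual HTTP calls are sketched in comments:

```python
"""Sketch: create and drop a fresh in-memory dataset per batch via the
Fuseki admin API. The server URL and dataset names are hypothetical."""

FUSEKI_BASE = "http://localhost:3030"  # assumed server location


def create_dataset_request(name: str) -> tuple[str, dict]:
    """Build the (URL, form-params) pair for creating an in-memory dataset."""
    return (FUSEKI_BASE + "/$/datasets", {"dbName": name, "dbType": "mem"})


def delete_dataset_url(name: str) -> str:
    """Build the URL for removing a dataset once a batch is finished."""
    return FUSEKI_BASE + "/$/datasets/" + name


if __name__ == "__main__":
    # The actual calls (requires `requests` and a running Fuseki), roughly:
    # import requests
    # url, params = create_dataset_request("batch42")
    # requests.post(url, data=params)       # create a fresh dataset
    # ... load, transform, extract output ...
    # requests.delete(delete_dataset_url("batch42"))  # drop it again
    print(create_dataset_request("batch42"))
```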
>>> 
>>> Rob
>>> 
>>> On 06/01/2017 12:23, "Janda, Radim" <radim.ja...@reporters.cz> wrote:
>>> 
>>>    Hello Lorenz,
>>>    Yes, I meant deleting data from Fuseki using the DELETE command.
>>>    We have version 2.4 installed.
>>>    We use two types of queries:
>>>    1. Insert new triples based on the existing RDF model (SPARQL INSERT)
>>>    2. Find results in the data (SPARQL SELECT)
>>> 
>>>    Thanks
>>> 
>>>    Radim
>>> 
>>>    On Fri, Jan 6, 2017 at 1:04 PM, Lorenz B. <
>>>    buehm...@informatik.uni-leipzig.de> wrote:
>>> 
>>>> Hello Radim,
>>>> 
>>>> just to avoid confusion, with "Delete whole Fuseki" you mean the data
>>>> loaded into Fuseki, right?
>>>> 
>>>> Which Fuseki version do you use?
>>>> 
>>>> What kind of transformation do you do? I'm asking because I'm wondering
>>>> if it's necessary to use Fuseki.
>>>> 
>>>> 
>>>> 
>>>> Cheers,
>>>> Lorenz
>>>> 
>>>>> Hello,
>>>>> We use Jena Fuseki to process a lot of small data sets.
>>>>> 
>>>>> It works in the following way:
>>>>> 1. Delete all the data in Fuseki (using a DELETE command)
>>>>> 2. Load data into Fuseki (using INSERT)
>>>>> 3. Transform the data and create output (SPARQL called from Python)
>>>>> 4. Repeat steps 1-3: delete the data and transform the next data set
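The cycle above can be sketched in Python. The endpoint URLs are assumptions; only the SPARQL update strings are built here, with the HTTP calls left as comments:

```python
"""Sketch of the delete/load/transform cycle. Endpoint URLs are
hypothetical; the update strings themselves are standard SPARQL."""

UPDATE_ENDPOINT = "http://localhost:3030/ds/update"  # assumed
QUERY_ENDPOINT = "http://localhost:3030/ds/query"    # assumed


def clear_update() -> str:
    """Step 1: remove every triple from every graph in the dataset."""
    return "DROP ALL"


def insert_update(ntriples: str) -> str:
    """Step 2: wrap N-Triples-style data in an INSERT DATA update."""
    return "INSERT DATA {\n" + ntriples + "\n}"


if __name__ == "__main__":
    # With `requests` and a running Fuseki, each step would look like:
    # import requests
    # requests.post(UPDATE_ENDPOINT, data={"update": clear_update()})
    # requests.post(UPDATE_ENDPOINT, data={"update": insert_update(data)})
    # requests.post(QUERY_ENDPOINT, data={"query": transform_query})  # step 3
    print(clear_update())
```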
>>>>> 
>>>>> We have found that memory is not released after a delete in Fuseki.
>>>>> That means we run out of memory after some number of data sets have
>>>>> been transformed.
>>>>> At the moment we restart the Fuseki server after a certain number of
>>>>> data sets, but we are looking for a better solution.
>>>>> 
>>>>> Can you please help us with memory releasing?
>>>>> 
>>>>> Many thanks
>>>>> 
>>>>> Radim
>>>>> 
>>>> --
>>>> Lorenz Bühmann
>>>> AKSW group, University of Leipzig
>>>> Group: http://aksw.org - semantic web research center
>>>> 
>>>> 
>>> 
