On 18/03/16 09:16, Dominique Vandensteen wrote:
Hi,
I'm having problems handling "big" graphs (50M to 100M triples at the current
stage) in my Fuseki servers using SPARQL.
The two actions I need to do are "DROP GRAPH <...>" and "MOVE <...> TO <...>".
Running these actions on these graphs I get OutOfMemory errors. Some
investigation pointed me to http://markmail.org/message/hjisrglx4eicrxyt
and
http://mail-archives.apache.org/mod_mbox/jena-users/201504.mbox/%3ccaj+mtwad1vfcnjaro37xkiwgyj7mrnillzvmsx1_nrj+rrf...@mail.gmail.com%3E
Using this config:
<#yourdatasetname> rdf:type tdb:DatasetTDB ;
    ja:context [ ja:cxtName "tdb:transactionJournalWriteBlockMode" ;
                 ja:cxtValue "mapped" ] ;
    ja:context [ ja:cxtName "arq:spillToDiskThreshold" ;
                 ja:cxtValue 10000 ] .
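For reference, this fragment normally sits in a complete assembler file; a minimal sketch with the usual prefix declarations might look like the following (the tdb:location value "DB" is an assumption to make the example complete):

```turtle
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix tdb: <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:  <http://jena.hpl.hp.com/2005/11/Assembler#> .

<#yourdatasetname> rdf:type tdb:DatasetTDB ;
    tdb:location "DB" ;   # path to the TDB database directory (assumed)
    ja:context [ ja:cxtName "tdb:transactionJournalWriteBlockMode" ;
                 ja:cxtValue "mapped" ] ;
    ja:context [ ja:cxtName "arq:spillToDiskThreshold" ;
                 ja:cxtValue 10000 ] .
```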
This solves my problem but brings up another one: my temp folder fills
up with JenaTempByteBuffer-...UUID...tmp files until my disk is full. These
files remain locked, so I cannot delete them.
The files seem to be created
by org.apache.jena.tdb.base.file.BufferAllocatorMapped but are for some
reason never released.
Is there any way to work around this issue?
I'm using
- Fuseki 2.3.1
- JVM 1.8.0_25 64-bit
- Windows 10
mapped + Windows => the files don't go away until the JVM exits [1], and even
then it does not seem to be reliable according to some reports.
I thought BufferAllocatorDirect was supposed to get around this, but it
allocates in direct memory (i.e. malloc).
What is needed is a spill-to-plain-file implementation of BufferAllocator,
which we don't seem to have.
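To illustrate the idea (a sketch only, in Python rather than Jena's Java, with all names invented): such an allocator buffers in memory up to a threshold, then spills to an ordinary temp file that is deleted when closed, instead of a memory-mapped file that Windows keeps locked.

```python
import os
import tempfile

class SpillBuffer:
    """Sketch of a spill-to-plain-file buffer (not Jena code):
    keep data in memory up to `threshold` bytes, then spill to a
    regular temp file that is removed when the buffer is closed."""

    def __init__(self, threshold=10_000):
        self.threshold = threshold
        self.data = bytearray()
        self.tmp = None  # set once we spill to disk

    def write(self, chunk):
        if self.tmp is None and len(self.data) + len(chunk) > self.threshold:
            # Spill: delete=True removes the file on close, avoiding
            # leaked JenaTempByteBuffer-style tmp files.
            self.tmp = tempfile.NamedTemporaryFile(prefix="spill-", delete=True)
            self.tmp.write(bytes(self.data))
            self.data = None
        if self.tmp is not None:
            self.tmp.write(chunk)
        else:
            self.data.extend(chunk)

    def close(self):
        if self.tmp is not None:
            self.tmp.close()  # temp file is deleted here

buf = SpillBuffer(threshold=16)
buf.write(b"x" * 32)              # exceeds the threshold -> spills to disk
path = buf.tmp.name
print(os.path.exists(path))       # True while the buffer is open
buf.close()
print(os.path.exists(path))       # False: the spill file is cleaned up
```

Because the spill file is a plain file rather than a mapping, closing it releases the handle immediately, so deletion works on Windows too.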
Andy
[1]
http://bugs.java.com/view_bug.do?bug_id=4724038
and others.
Dominique Vandensteen
Head of development
+ 32 474 870856
[email protected]
skype: domi.vds