Hi Rick,
On 12/12/13 11:03, Rick Moynihan wrote:
Hi all,
I have a script which dumps 2 modestly sized n-triples files into fuseki
via curl and a HTTP PUT.
e.g. the script does the following 2 actions:
curl -X PUT --data-binary @data/file-1.nt -H 'Content-Type: text/plain' 'http://localhost:3030/linkeddev-test/data?graph=http://foo-bar.org/graph1'
Unrelated, aesthetically better:
Content-Type: application/n-triples
(Fuseki/RIOT ignores text/plain and uses the file extension - text/plain
is wrong so often that it's unreliable).
curl -X PUT --data-binary @data/file-2.nt -H 'Content-Type: text/plain' 'http://localhost:3030/linkeddev-test/data?graph=http://foo-bar.org/graph2'
And it does them one after the other, never in parallel?
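To rule out overlapping uploads, something like the following would run them strictly one after the other. It is only a sketch: the endpoint and graph names are copied from your commands, while put_graph and the DRY_RUN switch are names I made up for illustration.

```shell
#!/bin/sh
# Sketch: PUT one N-Triples file into one named graph.
# FUSEKI_DATA defaults to the endpoint from the commands quoted above.
# put_graph and DRY_RUN are hypothetical names, not part of Fuseki or curl.
FUSEKI_DATA=${FUSEKI_DATA:-http://localhost:3030/linkeddev-test/data}

put_graph() {
    file=$1
    graph=$2
    if [ -n "$DRY_RUN" ]; then
        # Print the command instead of running it.
        echo curl -X PUT --data-binary "@$file" -H "Content-Type: application/n-triples" "$FUSEKI_DATA?graph=$graph"
        return 0
    fi
    # --fail makes curl exit non-zero on an HTTP error status.
    curl --fail -X PUT --data-binary "@$file" \
         -H 'Content-Type: application/n-triples' \
         "$FUSEKI_DATA?graph=$graph"
}
```

Chaining the calls with && means the second upload only starts once the first has returned successfully:
put_graph data/file-1.nt http://foo-bar.org/graph1 && put_graph data/file-2.nt http://foo-bar.org/graph2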
File 1 is 162mb
File 2 is 223mb
so about 1.6 and 2.2 million triples?
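My figures are a guess at roughly 100 bytes per triple. N-Triples puts one triple per line, so counting the lines that are neither blank nor comments gives the real number. A small sketch (count_triples is just a name I picked):

```shell
#!/bin/sh
# Sketch: count triples in an N-Triples file.
# Lines that are empty, whitespace-only, or start with '#' (comments) are skipped;
# every remaining line is taken to be one triple.
count_triples() {
    grep -c -v -E '^[[:space:]]*(#|$)' "$1"
}
```

For example: count_triples data/file-1.nt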
Sometimes this imports fine, other times the import takes minutes, Fuseki
consumes 380% CPU and I have to kill it after a few minutes.
When it's fine, how long does it take?
At least once the import finished after a few minutes, but Fuseki continued
to consume 380% CPU for about 20 minutes afterwards (despite there being no
load on it at all). After a short while longer it crashed with an
OutOfMemoryException.
I'm using TDB for storage.
I'm a little concerned with the non-deterministic nature of the issue, but
it seems to occur frequently... in fact it seems to have problems more often
than not.
Any help or suggestions much appreciated.
R.
It might be GC pressure: the JVM is GC'ing very hard but not making
significant progress. This can show as very high CPU, nothing happening,
and then an OOME. How much heap have you given the java process?
The other thing to look at is memory mapped files. TDB uses mmapped files
which are not part of the Java heap. Don't give Fuseki all of RAM
for the heap - leave as much for the OS to use for file system cache as
possible (but Fuseki still needs a decent heap to manage transactions).
I assume it's a 64 bit machine but which OS? (Even amongst Linuxes,
handling of mmap varies for reasons I don't understand.)
(Linux specific question:)
What does top show for the process in terms of real and virtual memory?
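If it is easier to paste numbers than a screenshot of top, ps reports the same two figures. A sketch for Linux with the usual procps ps (mem_of is just an illustrative name):

```shell
#!/bin/sh
# Sketch: print PID, resident memory (RSS) and virtual memory (VSZ) for a process.
# On Linux both sizes are reported in KiB; they correspond to RES and VIRT in top.
# With TDB, expect VSZ to be large because mmapped files count towards it.
mem_of() {
    ps -o pid=,rss=,vsz= -p "$1"
}
```

For example: mem_of PID, with the Fuseki process id in place of PID. A one-shot snapshot from top itself is: top -b -n 1 -p PID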
Andy