On 25/02/13 20:07, Joshua Greben wrote:
Hello All,

I am new to this list and to Jena and was wondering if anyone could
offer advice for loading a large triplestore.

I am trying to load 670M N-Triples into a store using tdbloader on a
single machine with 64-bit hardware and 8GB of memory. However, I am
running into a massive slowdown. When the load starts the tdbloader
is processing around 30K tps but by the time it has loaded 130M
triples it can essentially no longer load any more and slows down to
2300 tps. At that point I have to kill the process because it will
basically never finish.

Is 8GB of memory enough or is there a more efficient way to load this
data? I am trying to load the data into a single DB location. Should
I be splitting up the triples and loading them into different DBs?

Advice from anyone who has experience successfully loading a large
triplestore is much appreciated.

Only 8G is pushing it somewhat for 670M triples. It will finish, but it will take a very long time. Faster loads have been reported on larger machines (e.g. Freebase in 8 hours on an IBM Power7 with 48G RAM).
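
To put "a very long time" into rough numbers, here is a back-of-the-envelope sketch in Python using only the figures quoted above (670M triples in total, 130M already loaded, 2,300 tps now versus 30K tps at the start). It assumes the rate stays constant from here on, which is probably optimistic given that it has been falling, so treat the result as a lower bound rather than a prediction. The function name is just for illustration.

```python
def remaining_hours(total_triples, loaded_triples, triples_per_second):
    """Hours needed to load the rest, assuming a constant load rate."""
    remaining = total_triples - loaded_triples
    return remaining / triples_per_second / 3600

# Figures taken from the message above.
slow = remaining_hours(670_000_000, 130_000_000, 2_300)
fast = remaining_hours(670_000_000, 130_000_000, 30_000)

print(f"at 2,300 tps:  {slow:.0f} hours (~{slow / 24:.1f} days)")
print(f"at 30,000 tps: {fast:.1f} hours")
```

At the current rate the remaining 540M triples need about 65 hours (roughly 2.7 days); at the initial rate they would need about 5 hours.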

tdbloader2 (Linux only) may get you there a bit quicker but really you need a bigger machine.

Once built, you can copy the dataset as files to other machines.
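
A minimal sketch of that copy step in Python, assuming the load has finished and no process still has the database directory open. The function name and the paths in the usage comment are hypothetical; any ordinary file copy (cp, rsync, scp) does the same job.

```python
import shutil
from pathlib import Path

def copy_dataset(src_dir, dest_dir):
    """Copy a finished database directory, files and all, to dest_dir.

    Assumes nothing is writing to src_dir while the copy runs.
    Raises FileExistsError if dest_dir already exists, so an existing
    dataset is never silently overwritten.
    """
    src = Path(src_dir)
    if not src.is_dir():
        raise NotADirectoryError(f"not a directory: {src}")
    shutil.copytree(src, dest_dir)
    return Path(dest_dir)

# Hypothetical usage: build on the big machine, then copy to a mounted
# location that the smaller machine can read.
# copy_dataset("/data/tdb/DB", "/mnt/other-machine/tdb/DB")
```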

        Andy


Thanks!

- Josh



Joshua Greben
Library Systems Programmer & Analyst
Stanford University Libraries
(650) 714-1937
[email protected]



