Hello,

I have a huge number of triples in N3 format, and some of them can have very large literals (it is the complete Genbank in N3, so some literals are whole sequences). The total is above 6 billion triples, and the uncompressed data is around 700 gigabytes. I just can't find a way to load it completely into Virtuoso.
I tried the script provided for loading the Bio2RDF data, located here: http://docs.openlinksw.com/virtuoso/rdfperformancetuning.html#rdfperfloading. Even though it visibly works, the server always eventually dies with a segmentation fault, and since that script gives no output saying which triple it was trying to load when the crash happened, I can't restart it from the point it had reached. We therefore wrote our own version of the script in Perl, using the TTLP_MT procedure; with our version we know where the load was before a crash, so we can start it again from that point (a much simplified sketch of what it does is at the end of this mail). The important point to notice is that the server also keeps crashing after I restart the load, and the time between two crashes gets shorter each time. Eventually it becomes impossible to continue the load.

I have tried to do some tuning in virtuoso.ini myself, but I have only managed to get a minor speed boost. Should I remove all indexes for the loading, or add more? The server used for the loading has 32 cores, 128 GB of RAM and a 1.5 TB RAID array.

I can provide you with the Genbank dump if you want to play with it. If so, just tell me and I will give you the link and the corrections that need to be made before loading it (there were some errors in the RDFizer that generated the Genbank dump, but they are easily fixed with a regexp before the load).

So, do you have an optimized virtuoso.ini you can suggest to help? (My rough reading of the buffer sizing guideline for this machine is also at the end of this mail.)

Thanks,

Marc-Alexandre Nolin

P.S.: We have other large N3 dumps (even more than twice the size of this one) that we have not managed to load either, but I think that if we can solve the problem with this dump, the same solution will also work for the other dumps.
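For the record, here is a much simplified sketch of what our Perl loader does. The real script is more involved; the DSN, credentials, paths, graph URI and chunk naming below are only placeholders, and it assumes the dump has already been split into self-contained chunk files that the server is allowed to read (see DirsAllowed below).

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

my $graph      = 'http://bio2rdf.org/genbank';      # placeholder graph URI
my $checkpoint = 'genbank.checkpoint';               # records the last chunk committed
my @chunks     = sort glob('chunks/genbank-*.n3');   # pre-split, self-contained chunk files

# Placeholder ODBC DSN and credentials for the Virtuoso server.
my $dbh = DBI->connect('dbi:ODBC:VOS', 'dba', 'dba', { RaiseError => 1 });

# Resume after the last chunk that was recorded before a crash.
my $done = '';
if (open my $cp, '<', $checkpoint) { chomp($done = <$cp> // ''); close $cp; }

for my $file (@chunks) {
    next if $done ne '' && $file le $done;   # already loaded in a previous run

    # TTLP_MT parses the N3 text and loads it into the target graph;
    # flag 255 turns on the permissive parsing options.
    $dbh->do('DB.DBA.TTLP_MT(file_to_string_output(?), ?, ?, 255)',
             undef, $file, '', $graph);

    # Record progress so that after a crash we restart from the next chunk.
    open my $cp, '>', $checkpoint or die "cannot write checkpoint: $!";
    print {$cp} "$file\n";
    close $cp;
    print "loaded $file\n";
}

$dbh->disconnect;

Splitting the dump into chunk files beforehand keeps the size of each TTLP_MT call bounded and makes the checkpointing trivial, which is what lets us resume after a crash.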
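And about virtuoso.ini: from the usual buffer sizing guideline for Virtuoso (roughly two thirds of the RAM given to 8 KB database buffers), if I read it correctly, I would expect something in this range for a 128 GB machine. The numbers below are only my rough reading of that guideline, so please correct them:

[Parameters]
; about 2/3 of the 128 GB of RAM, in 8 KB buffers (~80 GB)
NumberOfBuffers = 10000000
; about 3/4 of NumberOfBuffers
MaxDirtyBuffers = 7500000
; example path, so the server (file_to_string_output) can read the chunk files
DirsAllowed     = ., /data/genbank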
