Experiment is all good. I accept what I'm getting myself into ;) I'm
trying to step up from tdbloader due to my needs.
Just a quick FYI: I'm currently testing tdbloader3 on a very small
sample just to make sure it is working as intended. After that I'll
apply it to my dataset of close to 500 million triples. I don't have an
exact count yet; hopefully I can get that from tdbstats at some point.
But as you know, I have a few roadblocks. I'm hoping to resolve them
soon with your help.
-Sarven
On 12-03-02 03:41 PM, Paolo Castagna wrote:
For others who want to help test tdbloader3...
We could use the Freebase data dump as a test dataset. It's ~600 million triples.
You can use this: https://github.com/castagna/freebase2rdf to convert the
Freebase dump into RDF.
Here is how I run tdbloader3, in this case giving the JVM 5 GB of RAM:
java -cp target/jena-tdbloader3-0.1-incubating-SNAPSHOT-jar-with-dependencies.jar \
  -server -d64 -Xmx5120M cmd.tdbloader3 --no-stats --compression \
  --spill-size-auto --loc target/freebase \
  freebase-datadump-rdf.nt.gz
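Before a long run on an input like the .nt.gz file above, it can help to know roughly how many triples it holds. The following is only a minimal sketch and is not part of tdbloader3 or TDB: the function name count_triples is made up for illustration, and it assumes the file is gzipped N-Triples with one triple per line, so it simply counts lines that are neither blank nor comments.

```python
import gzip


def count_triples(path):
    """Estimate the number of triples in a gzipped N-Triples file.

    Assumption: one triple per line; blank lines and lines starting
    with '#' (comments) are skipped. No parsing or validation is done.
    """
    count = 0
    # Stream the file line by line so memory use stays flat on large dumps.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as f:
        for line in f:
            stripped = line.strip()
            if stripped and not stripped.startswith("#"):
                count += 1
    return count
```

Calling count_triples("freebase-datadump-rdf.nt.gz") returns the estimate as an integer. Malformed lines are counted too, so the result is an upper bound on what a loader would actually accept.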
Last but not least, remember that tdbloader3 is still an experiment (and there
are good reasons why it is in the SVN 'Scratch' area). The need to load
ever-growing RDF datasets is, however, real, and rightly so.
Paolo
Paolo Castagna wrote:
Sarven Capadisli wrote:
I've documented some of my experiences here:
https://issues.apache.org/jira/browse/JENA-117#comment-13221016
Thanks, Sarven.
Re: https://issues.apache.org/jira/browse/JENA-117#comment-13221074
Paolo