Dear all,

>>> It also seems that some of your post answers the question from my previous
>>> email. That sounds as if it is pretty hard to create HDT exports (not much
>>> surprise there). Maybe it would be nice to at least reuse the work: could we
>>> re-publish your HDT dumps after you created them?
>> 
>> yes, sure, here they are:
>> http://wikidataldf.com/download/
> 
> I should add, yes, it is pretty hard to create the HDT file since the
> process requires an awful lot of RAM, and I don't know if in the
> future I will be able to produce them.

Maybe some nuance: creating HDT exports is not *that* hard.

First, on a technical level, it's simply:
    rdf2hdt -f turtle triples.ttl triples.hdt
so that's not really difficult ;-)
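
(In case it helps, here is a rough sketch of the full round trip; the file names are just placeholders, and hdtSearch is the companion query tool that ships with rdf2hdt in hdt-cpp, whose exact options may vary between versions:)

    # dumps usually come gzipped, and rdf2hdt reads a plain file
    zcat dump.ttl.gz > dump.ttl
    # convert the Turtle dump to HDT, as above
    rdf2hdt -f turtle dump.ttl dump.hdt
    # optionally check the result: hdtSearch opens a prompt where
    # a pattern such as "? ? ?" lists matching triples
    hdtSearch dump.hdt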

Second, concerning machine resources:
for datasets with millions of triples, you can easily do it on any machine.
It doesn't take that much RAM, and certainly not that much disk space.
When you have hundreds of millions of triples, as is the case with Wikidata/DBpedia/…,
having a significant amount of RAM does indeed help a lot.
The people working on HDT will surely improve that requirement in the future.

We should really see HDT generation as a one-time server effort
that significantly reduces future server effort.

Best,

Ruben

PS If anybody has trouble generating an HDT file,
feel free to send me a link to your dump and I'll do it for you.
