Hello list,

a very kind person from this list generated the .hdt.index file for me, using
the one-year-old Wikidata HDT file available on the rdfhdt website. So I was
finally able to set up a working local endpoint using HDT+Fuseki. Setup was
easy and Fuseki's launch time was also quick (a few seconds); the only change
I made was to replace -Xmx1024m with -Xmx4g in the Fuseki startup script (btw
I'm not very proficient in Java, so I hope that is the correct way to raise
the heap size). I've run some queries too. Simple select or traversal queries
seem fast to me (I haven't measured them, but the response is almost
immediate), while other queries such as
"select distinct ?class where { [] a ?class }" take several seconds to a few
minutes to complete. That suggests the HDT indexes don't cover every query
pattern equally well: a query like this has to scan all the rdf:type triples
and deduplicate the objects, so it can't be answered with a single index
lookup. But otherwise, for simple queries it works perfectly! At least I'm
able to query the dataset!
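
In case anyone wants to reproduce this from a script, one way to run such
queries against the local endpoint is a minimal sketch like the one below,
which just uses the standard SPARQL protocol via Python's requests library.
The port 3030 and the dataset name "wikidata" are only my assumptions about a
default Fuseki setup, so adjust them to whatever your configuration uses:

# Minimal sketch: query a local Fuseki endpoint over the SPARQL 1.1 protocol.
# Assumptions: Fuseki listens on the default port 3030 and the dataset is
# published as "wikidata" -- change both to match your own setup.
import requests

ENDPOINT = "http://localhost:3030/wikidata/sparql"  # hypothetical dataset name

QUERY = """
SELECT DISTINCT ?class
WHERE { [] a ?class }
LIMIT 100
"""

resp = requests.post(
    ENDPOINT,
    data={"query": QUERY},
    headers={"Accept": "application/sparql-results+json"},
)
resp.raise_for_status()

# Print each class IRI returned in the JSON result bindings.
for binding in resp.json()["results"]["bindings"]:
    print(binding["class"]["value"])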
In conclusion, I think this is more or less positive feedback for using HDT
on a "commodity computer": it can be very useful for people like me who want
to use the dataset locally but can't set up a full-blown server. If others
want to try as well, they can offer more (hopefully positive) feedback.
For all of this, I wholeheartedly plead with the Wikidata devs to please
consider scheduling an HDT dump (.hdt + .hdt.index) alongside the other
regular dumps that are created weekly.

Thank you!!

