> It's a machine with 378 GiB of RAM and 64 threads running Scientific
> Linux 7.2, that we use mainly for benchmarks.
>
> Building the index was really all about memory because the CPUs have
> actually a lower per-thread performance (2.30 GHz vs 3.5 GHz) compared
> to those of my regular workstation, which was unable to build it.
If your regular workstation was using more CPU, I guess that was because of swapping. Thanks for the statistics: they mean a "commodity" CPU could handle this fine, and the bottleneck is RAM. I wonder how expensive a machine like yours is to buy... it sounds like it's in the $30K-$50K range?

> You're right. The limited query language of hdtSearch is closer to
> grep than to SPARQL.
>
> Thank you for pointing out Fuseki, I'll have a look at it.

I think a SPARQL command-line tool could exist, but AFAICT it doesn't exist (yet?). Anyway, I have already successfully set up Fuseki with an HDT backend, although my HDT files are all small. Feel free to drop me an email if you need any help setting up Fuseki.

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
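For reference, a minimal sketch of what a Fuseki-over-HDT setup like the one mentioned above can look like, assuming the hdt-jena assembler from the rdfhdt/hdt-java project is on Fuseki's classpath (the `hdt:` vocabulary URI, the service name `dataset`, and the file path are illustrative and may differ between versions):

```turtle
@prefix fuseki: <http://jena.apache.org/fuseki#> .
@prefix rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:   <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ja:     <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix hdt:    <http://www.rdfhdt.org/fuseki#> .

# Server with a single read-only SPARQL service.
[] rdf:type fuseki:Server ;
   fuseki:services ( <#service1> ) .

# Tell the assembler that an HDTGraph is a kind of Jena graph.
hdt:HDTGraph rdfs:subClassOf ja:Graph .

<#service1> rdf:type fuseki:Service ;
    fuseki:name         "dataset" ;   # endpoint: /dataset/sparql
    fuseki:serviceQuery "sparql" ;
    fuseki:dataset      <#dataset> .

<#dataset> rdf:type ja:RDFDataset ;
    ja:defaultGraph <#graph> .

# The default graph is backed directly by the (read-only) HDT file.
<#graph> rdf:type hdt:HDTGraph ;
    hdt:fileName "/path/to/file.hdt" .
```

With a config along these lines, queries go to http://localhost:3030/dataset/sparql and the HDT file is never loaded into a triple store; it is queried in place.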