> @Laura: I suspect Wouter wants to know if he "ignores" the previous errors 
> and proposes a rather incomplete dump (just for you) or waits for Stas' 
> feedback.


OK. I wonder, though, whether it would be possible to set up a regular HDT dump 
alongside the existing regular dumps. Looking at the dumps page, 
https://dumps.wikimedia.org/wikidatawiki/entities/, it looks like a new dump is 
generated roughly once a week. So if an HDT dump could be added to the 
schedule, it would show up with the next dump and with every future dump after 
that. Right now even the Turtle dump contains the bad triples, so adding an HDT 
file now would not introduce any new inconsistencies. The problem will fix 
itself in future dumps once the Turtle is fixed (because the HDT is generated 
from the .ttl file anyway).
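For what it's worth, the conversion step itself is a single call to rdf2hdt 
from hdt-cpp; something along these lines (the file names are placeholders and 
I'm quoting the -f flag from memory, so whoever sets up the job would need to 
adapt it):

    # hypothetical file names; -f selects the input serialization
    rdf2hdt -f turtle wikidata-dump.ttl wikidata-dump.hdt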


> Btw why don't you use the oldest version in HDT website?


1. I have downloaded it and I'm trying to use it, but the HDT tools (e.g. the 
query tool) require building an index before I can use the HDT file. I've tried 
to create the index, but I ran out of memory again (even though the index is 
smaller than the .hdt file itself). So any Wikidata dump should contain both 
the .hdt file and the .hdt.index file, unless there is another way to generate 
the index on commodity hardware (see the sketch after point 2).

2. Because it's a year old :)
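
To make point 1 concrete, this is roughly what I'm doing, here shown with 
hdtSearch from hdt-cpp as an example (file names are placeholders); the index 
is built as a side effect of opening the file for the first query, and that is 
the step that runs out of memory for me:

    # opening the file with hdtSearch builds wikidata-dump.hdt.index
    # on the first run -- this is where I run out of memory
    hdtSearch wikidata-dump.hdt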
