Re: [Wikidata] Wikidata HDT dump

2018-10-02 Thread Laura Morales
> You shouldn't have to keep anything in RAM to HDT-ize something as you could
> make the dictionary by sorting on disk and also do the joins to look up
> everything against the dictionary by sorting.

Yes, but somebody has to write the code for it :) My understanding is that they keep everything in RAM.
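A minimal sketch of the disk-based approach quoted above, in Python (my choice of language; this is not the HDT reference implementation, and the chunk size, file layout, and function names are illustrative assumptions): sort the term stream in bounded-memory chunks, spill the sorted runs to disk, merge them, and assign dense IDs to the deduplicated result.

    import heapq, os, tempfile

    CHUNK = 1_000_000  # max terms held in memory at once (illustrative assumption)

    def _flush(sorted_terms, tmpdir):
        # Write one sorted run to a temp file and return its path.
        fd, path = tempfile.mkstemp(dir=tmpdir, text=True)
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.writelines(t + "\n" for t in sorted_terms)
        return path

    def external_sorted_terms(term_iter, tmpdir):
        # Sort an arbitrarily large stream of terms with bounded RAM:
        # sort CHUNK-sized pieces in memory, spill them to disk, then
        # stream a k-way merge over the sorted runs.
        runs, buf = [], []
        for term in term_iter:
            buf.append(term)
            if len(buf) >= CHUNK:
                runs.append(_flush(sorted(buf), tmpdir))
                buf = []
        if buf:
            runs.append(_flush(sorted(buf), tmpdir))
        files = [open(p, encoding="utf-8") for p in runs]
        try:
            for line in heapq.merge(*files):
                yield line.rstrip("\n")
        finally:
            for f in files:
                f.close()

    def build_dictionary(term_iter, tmpdir, out_path):
        # Deduplicate the globally sorted stream and assign dense integer IDs.
        with open(out_path, "w", encoding="utf-8") as out:
            prev, next_id = None, 0
            for term in external_sorted_terms(term_iter, tmpdir):
                if term != prev:
                    out.write(f"{next_id}\t{term}\n")
                    next_id += 1
                    prev = term

    if __name__ == "__main__":
        # Toy usage; in practice term_iter would stream terms parsed from the dump.
        build_dictionary(iter(["wd:Q42", "wdt:P31", "wd:Q5", "wd:Q42"]),
                         tempfile.gettempdir(), "dict.tsv")

Encoding the triples afterwards is the same trick: sort the triples file on each term position in turn and merge-join it against the sorted dictionary, so neither the dictionary nor the triples ever have to be resident in RAM.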

Re: [Wikidata] Wikidata HDT dump

2018-10-02 Thread Laura Morales
> 100 GB "with an optimized code" could be enough to produce an HDT like that.

The current software definitely cannot handle Wikidata with 100 GB. It was tried before and it failed. I'm glad to see that new code will be released to handle large files. After skimming that paper, it looks like they …
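To make the RAM point concrete, a back-of-envelope sketch; every number below is an illustrative assumption on my part, not a measured Wikidata figure:

    # Rough estimate of why an all-in-RAM dictionary is the bottleneck.
    # All figures are illustrative assumptions, not measured Wikidata numbers.
    unique_terms   = 1_500_000_000  # assumed distinct subjects/predicates/objects
    bytes_per_term = 60             # assumed average serialized IRI/literal size
    overhead       = 2.0            # assumed in-memory data-structure overhead
    print(f"~{unique_terms * bytes_per_term * overhead / 2**30:.0f} GiB "
          "for the dictionary alone")  # prints ~168 GiB under these assumptions

Under assumptions of that order, the dictionary alone overshoots 100 GB unless the build is streamed from disk, which would be consistent with the failures seen so far.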