Re: [Wikitech-l] [Offline-l] The Whole Wikipedia in English with pictures in one 40GB big file

2014-03-08 Thread Emmanuel Engelhart
On 07/03/2014 19:25, Asaf Bartov wrote:
btw, are these new improved tools documented anywhere? http://kiwix.org/wiki/Development does not seem to point in the right direction.

The usage is pretty straightforward (for IT people) and IMO everything necessary is explained in the READMEs: *

Re: [Wikitech-l] [Offline-l] The Whole Wikipedia in English with pictures in one 40GB big file

2014-03-08 Thread Jay Ashworth
- Original Message -
From: Emmanuel Engelhart kel...@kiwix.org
PS: We really want to make a post @blog.wikimedia.org (so in English). If someone is volunteer to write this, I would really appreciate his help.

If you write such a blog post in what English you have handy, I'd be happy

Re: [Wikitech-l] [Offline-l] The Whole Wikipedia in English with pictures in one 40GB big file

2014-03-02 Thread Emmanuel Engelhart
On 02/03/2014 01:33, Samuel Klein wrote:
Brilliant. Congrats to everyone who is working on this! What is needed to scrape categories?

0 - For all dumped pages (so at least NS_MAIN and NS_CATEGORY pages), download the list of categories they belong to (with the MW API).
1 - For each dumped
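
Step 0 above could be sketched roughly as follows. This is only an illustration, not the actual Kiwix tooling: the endpoint URL and the helper names `build_categories_query` and `parse_categories` are assumptions made up for this example. It relies on the standard MediaWiki API query `action=query&prop=categories`, and it does not follow continuation (`clcontinue`), which a real scraper would need for pages with many categories.

```python
from urllib.parse import urlencode

# Assumption: the English Wikipedia endpoint; any MediaWiki api.php URL works.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_categories_query(titles, api_url=API_URL):
    """Build the API URL asking for the categories of the given pages."""
    params = {
        "action": "query",
        "prop": "categories",
        "titles": "|".join(titles),  # several titles per request
        "cllimit": "max",            # as many categories as allowed
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def parse_categories(response):
    """Map each page title to the list of category titles it belongs to."""
    result = {}
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        # Pages without categories have no "categories" key at all.
        result[page["title"]] = [c["title"] for c in page.get("categories", [])]
    return result


# Hand-written sample shaped like an API response (not real data).
sample_response = {
    "query": {
        "pages": {
            "1": {
                "pageid": 1,
                "ns": 0,
                "title": "Paris",
                "categories": [{"ns": 14, "title": "Category:Capitals in Europe"}],
            },
            "2": {"pageid": 2, "ns": 14, "title": "Category:France"},
        }
    }
}

url = build_categories_query(["Paris", "Category:France"])
categories = parse_categories(sample_response)
```

Fetching `url` with any HTTP client and passing the decoded JSON to `parse_categories` would give the page-to-categories mapping for one batch of dumped pages.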