Re: [Server-devel] Running complete Wikipedia offline
On Wed, 12 Dec 2012, Sameer Verma wrote:

> I've been debating the possibility of running a *complete* copy of
> Wikipedia (txt and images) offline on the XS. At this point, the
> targets are English (https://en.wikipedia.org) and Hindi
> (https://hi.wikipedia.org).

Old thread, I know. Have you looked at Kiwix?

http://www.kiwix.org/index.php/Main_Page

It is 'screened' content, with effort to pick a good article version.

Cheers,
Andy!

___
Server-devel mailing list
Server-devel@lists.laptop.org
http://lists.laptop.org/listinfo/server-devel
Re: [Server-devel] Running complete Wikipedia offline
On Sun, Dec 16, 2012 at 4:36 AM, Daniel Drake d...@laptop.org wrote:

> On Wed, Dec 12, 2012 at 9:28 PM, Sameer Verma sve...@sfsu.edu wrote:
>
>> I've been debating the possibility of running a *complete* copy of
>> Wikipedia (txt and images) offline on the XS. At this point, the
>> targets are English (https://en.wikipedia.org) and Hindi
>> (https://hi.wikipedia.org).
>>
>> The demand on the local server wouldn't be huge, given the relatively
>> small footprint at the school. Storage is cheap. This would be an
>> offline copy for one-way consumption, so I'm not looking for ways to
>> do local edits and push these back upstream. I'd imagine the
>> Wikipedia dumps can be rsync'd once every x months over sneakernet.
>>
>> Dump data is here: https://meta.wikimedia.org/wiki/Data_dumps
>
> When I was in Nepal we cloned Wiktionary onto the school server, and I
> imagine the process is similar for Wikipedia. The way we did it was:
>
> Install MediaWiki and configure it the same way that the real version
> is configured: http://noc.wikimedia.org/conf/
>
> Install the same plugins that are running on the real version:
> http://en.wikipedia.org/wiki/Special:Version
>
> Then import the db: http://dumps.wikimedia.org/backup-index.html
>
> Then make a few local tweaks (e.g. disable registration/editing)
>
> Daniel

Thx, Daniel. Will work on things and get back.

cheers,
Sameer
--
Sameer Verma, Ph.D.
Professor, Information Systems
San Francisco State University
http://verma.sfsu.edu/
http://commons.sfsu.edu/
http://olpcsf.org/
http://olpcjamaica.org.jm/
Re: [Server-devel] Running complete Wikipedia offline
On Wed, Dec 12, 2012 at 9:28 PM, Sameer Verma sve...@sfsu.edu wrote:

> I've been debating the possibility of running a *complete* copy of
> Wikipedia (txt and images) offline on the XS. At this point, the
> targets are English (https://en.wikipedia.org) and Hindi
> (https://hi.wikipedia.org).
>
> The demand on the local server wouldn't be huge, given the relatively
> small footprint at the school. Storage is cheap. This would be an
> offline copy for one-way consumption, so I'm not looking for ways to
> do local edits and push these back upstream. I'd imagine the
> Wikipedia dumps can be rsync'd once every x months over sneakernet.
>
> Dump data is here: https://meta.wikimedia.org/wiki/Data_dumps

When I was in Nepal we cloned Wiktionary onto the school server, and I
imagine the process is similar for Wikipedia. The way we did it was:

Install MediaWiki and configure it the same way that the real version
is configured: http://noc.wikimedia.org/conf/

Install the same plugins that are running on the real version:
http://en.wikipedia.org/wiki/Special:Version

Then import the db: http://dumps.wikimedia.org/backup-index.html

Then make a few local tweaks (e.g. disable registration/editing)

Daniel
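Before the "import the db" step, it can help to confirm that a downloaded dump is readable and to see how many pages it holds. The following is only a rough sketch of such a check, not the import itself: it streams through a pages-articles style XML file and yields page titles without loading the whole file. The function names, the sample data, and the namespace URI in the sample are assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_page_titles(source):
    """Yield the <title> text of each <page>, parsing the input as a stream."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) == "page":
            for child in elem:
                if local_name(child.tag) == "title":
                    yield child.text
                    break
            # Discard the finished page's children so the article text
            # does not pile up in memory while streaming.
            elem.clear()


# Tiny stand-in for a real dump (hypothetical content; the namespace URI
# is an assumption and does not matter to the parser above).
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.8/">
  <page><title>Kathmandu</title><revision><text>...</text></revision></page>
  <page><title>Nepal</title><revision><text>...</text></revision></page>
</mediawiki>"""

if __name__ == "__main__":
    titles = list(iter_page_titles(io.BytesIO(SAMPLE)))
    print(len(titles), titles)
```

For a real dump, `iter_page_titles` can be given a file path or an open binary file instead of the in-memory sample.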
Re: [Server-devel] Running complete Wikipedia offline
On Wed, Dec 12, 2012 at 1:37 PM, Martin Langhoff martin.langh...@gmail.com wrote:

> On Wed, Dec 12, 2012 at 4:28 PM, Sameer Verma sve...@sfsu.edu wrote:
>
>> I've been debating the possibility of running a *complete* copy of
>> Wikipedia (txt and images) offline on the XS. At this point, the
>> targets are English (https://en.wikipedia.org) and Hindi
>> (https://hi.wikipedia.org).
>
> It would be trivial. Get the HTML-formatted dumps, serve them
> statically.

Got the XML dump for en-wiki.

> My only comment is... let us know about the on-disk space usage once
> it's unpacked (du -sh /path/to/wikipedia )

sverma@elverma-xps13:~$ du -sh /home/sverma/Downloads/enwiki-20121201-pages-articles.xml
40G     /home/sverma/Downloads/enwiki-20121201-pages-articles.xml
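For anyone who wants the same number from a script (say, to log it after each sneakernet refresh), here is a minimal sketch of a du-like measurement. It is an approximation: it sums apparent file sizes, whereas du reports allocated disk blocks, so the two can differ somewhat. The function names are made up for this example.

```python
import os
import tempfile


def disk_usage_bytes(path):
    """Total apparent size in bytes of a file, or of all files under a directory."""
    if os.path.isfile(path):
        return os.path.getsize(path)
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):  # do not count symlink targets
                total += os.path.getsize(full)
    return total


def human(nbytes):
    """Format a byte count with binary units (1K = 1024 bytes)."""
    size = float(nbytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


if __name__ == "__main__":
    # Demo on a throwaway directory; point it at the unpacked dump instead.
    with tempfile.TemporaryDirectory() as demo:
        with open(os.path.join(demo, "a.bin"), "wb") as f:
            f.write(b"\0" * 2048)
        print(human(disk_usage_bytes(demo)))
```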
Re: [Server-devel] Running complete Wikipedia offline
On Wed, Dec 12, 2012 at 4:28 PM, Sameer Verma sve...@sfsu.edu wrote:

> I've been debating the possibility of running a *complete* copy of
> Wikipedia (txt and images) offline on the XS. At this point, the
> targets are English (https://en.wikipedia.org) and Hindi
> (https://hi.wikipedia.org).

It would be trivial. Get the HTML-formatted dumps, serve them
statically.

My only comment is... let us know about the on-disk space usage once
it's unpacked (du -sh /path/to/wikipedia )

m
--
 martin.langh...@gmail.com
 mar...@laptop.org -- Software Architect - OLPC
 - ask interesting questions
 - don't get distracted with shiny stuff - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff
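To try out the "serve them statically" idea before wiring it into a real web server, something like the following could work. It is a minimal sketch using only the Python standard library, assuming the HTML dump has already been unpacked into a directory. The helper names are invented here, and a school deployment would more likely hand the directory to a full web server.

```python
import functools
import http.server
import os
import tempfile
import threading
import urllib.request


def make_server(directory, host="127.0.0.1", port=0):
    """Build an HTTP server that serves files from `directory` (port 0 = any free port)."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer((host, port), handler)


def serve_in_background(server):
    """Run the server's request loop in a daemon thread and return the thread."""
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return thread


if __name__ == "__main__":
    # Demo with a throwaway directory standing in for the unpacked HTML dump.
    with tempfile.TemporaryDirectory() as root:
        with open(os.path.join(root, "index.html"), "w") as f:
            f.write("<h1>Offline Wikipedia</h1>")
        server = make_server(root)
        serve_in_background(server)
        port = server.server_address[1]
        with urllib.request.urlopen(f"http://127.0.0.1:{port}/index.html") as resp:
            print(resp.status, resp.read().decode())
        server.shutdown()
        server.server_close()
```

To serve a real unpacked dump on the LAN, the same helpers could be pointed at that directory with a fixed host and port; those values depend on the local setup.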