Re: [Server-devel] Running complete Wikipedia offline

2013-01-23 Thread Andy Rabagliati
On Wed, 12 Dec 2012, Sameer Verma wrote:

> I've been debating the possibility of running a *complete* copy of
> Wikipedia (txt and images) offline on the XS. At this point, the
> targets are English (https://en.wikipedia.org) and Hindi
> (https://hi.wikipedia.org).

Old thread, I know.

Have you looked at Kiwix? http://www.kiwix.org/index.php/Main_Page

It is 'screened' content, with an effort made to pick a good version of
each article.

Cheers, Andy!
___
Server-devel mailing list
Server-devel@lists.laptop.org
http://lists.laptop.org/listinfo/server-devel


Re: [Server-devel] Running complete Wikipedia offline

2012-12-18 Thread Sameer Verma
On Sun, Dec 16, 2012 at 4:36 AM, Daniel Drake d...@laptop.org wrote:
> On Wed, Dec 12, 2012 at 9:28 PM, Sameer Verma sve...@sfsu.edu wrote:
>> I've been debating the possibility of running a *complete* copy of
>> Wikipedia (txt and images) offline on the XS. At this point, the
>> targets are English (https://en.wikipedia.org) and Hindi
>> (https://hi.wikipedia.org).

>> The demand on the local server wouldn't be huge, given the relatively
>> small footprint at the school. Storage is cheap. This would be an
>> offline copy for one-way consumption, so I'm not looking for ways to
>> do local edits, and push these back upstream. I'd imagine the
>> Wikipedia dumps can be rsync'd once every x months over sneakernet.
>> Dump data is here: https://meta.wikimedia.org/wiki/Data_dumps

> When I was in Nepal we cloned Wiktionary onto the school server, and I
> imagine the process is similar for wikipedia. The way we did it was:

> Install mediawiki and configure it the same way that the real
> version is configured:
> http://noc.wikimedia.org/conf/

> Install the same plugins that are running on the real version:
> http://en.wikipedia.org/wiki/Special:Version

> Then import the db
> http://dumps.wikimedia.org/backup-index.html

> Then make a few local tweaks (e.g. disable registration/editing)

> Daniel



Thx, Daniel. Will work on things and get back.

cheers,
Sameer
-- 
Sameer Verma, Ph.D.
Professor, Information Systems
San Francisco State University
http://verma.sfsu.edu/
http://commons.sfsu.edu/
http://olpcsf.org/
http://olpcjamaica.org.jm/


Re: [Server-devel] Running complete Wikipedia offline

2012-12-16 Thread Daniel Drake
On Wed, Dec 12, 2012 at 9:28 PM, Sameer Verma sve...@sfsu.edu wrote:
> I've been debating the possibility of running a *complete* copy of
> Wikipedia (txt and images) offline on the XS. At this point, the
> targets are English (https://en.wikipedia.org) and Hindi
> (https://hi.wikipedia.org).

> The demand on the local server wouldn't be huge, given the relatively
> small footprint at the school. Storage is cheap. This would be an
> offline copy for one-way consumption, so I'm not looking for ways to
> do local edits, and push these back upstream. I'd imagine the
> Wikipedia dumps can be rsync'd once every x months over sneakernet.
> Dump data is here: https://meta.wikimedia.org/wiki/Data_dumps

When I was in Nepal we cloned Wiktionary onto the school server, and I
imagine the process is similar for Wikipedia. The way we did it was:

1. Install MediaWiki and configure it the same way that the real
   version is configured:
   http://noc.wikimedia.org/conf/

2. Install the same plugins (extensions) that are running on the real
   version:
   http://en.wikipedia.org/wiki/Special:Version

3. Import the database dump:
   http://dumps.wikimedia.org/backup-index.html

4. Make a few local tweaks (e.g. disable registration/editing).

Daniel


Re: [Server-devel] Running complete Wikipedia offline

2012-12-15 Thread Sameer Verma
On Wed, Dec 12, 2012 at 1:37 PM, Martin Langhoff martin.langh...@gmail.com wrote:
> On Wed, Dec 12, 2012 at 4:28 PM, Sameer Verma sve...@sfsu.edu wrote:
>> I've been debating the possibility of running a *complete* copy of
>> Wikipedia (txt and images) offline on the XS. At this point, the
>> targets are English (https://en.wikipedia.org) and Hindi
>> (https://hi.wikipedia.org).

> It would be trivial. Get the HTML-formatted dumps, serve them statically.


Got the XML dump for en-wiki

> My only comment is... let us know about the on-disk space usage once
> it's unpacked (du -sh /path/to/wikipedia )

sverma@elverma-xps13:~$ du -sh /home/sverma/Downloads/enwiki-20121201-pages-articles.xml
40G /home/sverma/Downloads/enwiki-20121201-pages-articles.xml
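
A file of this size cannot comfortably be loaded into memory in one go, so
any inspection of the dump has to stream it. The following is a minimal
Python sketch, not something used in this thread, that counts the <page>
elements in a pages-articles XML dump while keeping memory use roughly flat.
The function names count_pages and local_name are hypothetical, and the
sketch assumes only that each article sits in a <page> element under the
root, ignoring whichever XML namespace the dump declares.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts on element tags."""
    return tag.rsplit("}", 1)[-1]


def count_pages(source):
    """Count <page> elements in a MediaWiki XML dump.

    `source` may be a filename or a binary file object. The dump is
    parsed incrementally, and the root is emptied after every page so
    that already-seen pages are not kept in memory.
    """
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # first event is the start of the root
    pages = 0
    for event, elem in context:
        if event == "end" and local_name(elem.tag) == "page":
            pages += 1
            root.clear()  # drop the finished page from the tree
    return pages


# Example call (path taken from the du output above):
# count_pages("/home/sverma/Downloads/enwiki-20121201-pages-articles.xml")
```

A pass like this only reads the file, so it can be run on the dump before
committing to a full database import.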


Re: [Server-devel] Running complete Wikipedia offline

2012-12-12 Thread Martin Langhoff
On Wed, Dec 12, 2012 at 4:28 PM, Sameer Verma sve...@sfsu.edu wrote:
> I've been debating the possibility of running a *complete* copy of
> Wikipedia (txt and images) offline on the XS. At this point, the
> targets are English (https://en.wikipedia.org) and Hindi
> (https://hi.wikipedia.org).

It would be trivial. Get the HTML-formatted dumps, serve them statically.
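
As a rough illustration of "serve them statically", here is a minimal
Python sketch using only the standard library. It is an assumption about
one way to do it, not what the XS ships with; a production school server
would more likely point its existing web server at the directory. The
function name make_static_server and the port number are hypothetical.

```python
import functools
import http.server


def make_static_server(root_dir, port=8080):
    """Build an HTTP server that serves the files under root_dir.

    SimpleHTTPRequestHandler only answers GET and HEAD requests, so the
    content is effectively read-only for clients.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=root_dir
    )
    return http.server.ThreadingHTTPServer(("", port), handler)


# Example use (the path is a placeholder for the unpacked HTML dump):
# with make_static_server("/path/to/wikipedia") as httpd:
#     httpd.serve_forever()
```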

My only comment is... let us know about the on-disk space usage once
it's unpacked (du -sh /path/to/wikipedia )
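
Where du is not at hand, or the number is wanted inside a script, the same
measurement can be approximated in Python. This is a sketch under stated
assumptions: it sums apparent file sizes, which is closer to
"du --apparent-size" than to plain "du" (plain du counts allocated disk
blocks), so the two figures can differ. The function names
disk_usage_bytes and human_readable are hypothetical.

```python
import os


def disk_usage_bytes(root):
    """Sum the apparent sizes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):  # do not follow symlinks
                total += os.path.getsize(path)
    return total


def human_readable(num_bytes):
    """Format a byte count with binary units, similar to du -h."""
    size = float(num_bytes)
    for unit in ("B", "K", "M", "G", "T"):
        if size < 1024 or unit == "T":
            return f"{size:.1f}{unit}"
        size /= 1024


# Example use (the path is the placeholder from the message above):
# print(human_readable(disk_usage_bytes("/path/to/wikipedia")))
```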


m
--
 martin.langh...@gmail.com
 mar...@laptop.org -- Software Architect - OLPC
 - ask interesting questions
 - don't get distracted with shiny stuff  - working code first
 - http://wiki.laptop.org/go/User:Martinlanghoff