On Tue, Dec 2, 2008 at 9:43 PM, Pratul Kalia <[EMAIL PROTECTED]> wrote:
> There are a couple of issues I noted with the Google Code page...
> These are language issues like improper capitalization and some
> grammar. These will be noticed ASAP by any critic. So I did :-)
>
>  - The description under the title of the page should say "... English
> Wikipedia..."
>
>  - I corrected the description for you. Go through it and change it
> wherever you want to.
>
> This project uses Python (to build a converter that turns MediaWiki
> format text into the corresponding HTML page), Django (as a web
> server to fix CSS and other stuff), and a PostgreSQL database (to
> locate each article and its span).
>
> The basic working module was taken from this article:
> http://users.softlab.ece.ntua.gr/~ttsiod/buildWikipediaOffline.html
> The difference is the Python parser for XML->HTML conversion: in
> place of the PHP kind of setup that Wikipedia uses, this application
> uses Python. The idea that remains intact is using the XML dump
> provided by the MediaWiki site, breaking it into small files with
> bzip2recover, and then creating a database of each article and its
> location in the list of files. For now, I am using the dump from
> 24th July 2008. For me, this dump, the database, and the rest of the
> configuration take around 4.6G (3.9G of XML files, a ~700MB PostgreSQL
> dump, and some CSS and JS files taken from the MediaWiki site itself).
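
For anyone following along, the indexing step described above (split the
dump with bzip2recover, then record which file each article lives in) can
be sketched roughly as below. This is only a minimal illustration, not the
project's actual code. It uses sqlite3 in place of PostgreSQL so that it
runs standalone, and the names index_chunks, find_chunk and article_index
are made up for the example. It assumes each chunk is a standalone .bz2
file as written by bzip2recover, and it ignores titles that happen to be
split across two chunks.

```python
import bz2
import re
import sqlite3

# Matches the <title> element of each <page> in a MediaWiki XML dump.
TITLE_RE = re.compile(r"<title>(.*?)</title>")


def index_chunks(chunk_paths, conn):
    """Record, for every <title> found, the chunk file it appears in."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS article_index (title TEXT, chunk TEXT)"
    )
    for path in chunk_paths:
        # A chunk can start or end in the middle of a multi-byte
        # character, so undecodable bytes are replaced, not fatal.
        with bz2.open(path, "rt", encoding="utf-8", errors="replace") as f:
            text = f.read()
        rows = [(title, path) for title in TITLE_RE.findall(text)]
        conn.executemany("INSERT INTO article_index VALUES (?, ?)", rows)
    conn.commit()


def find_chunk(conn, title):
    """Return the chunk file holding the article, or None if unknown."""
    row = conn.execute(
        "SELECT chunk FROM article_index WHERE title = ?", (title,)
    ).fetchone()
    return row[0] if row else None
```

Looking up an article is then a single query on the title, after which
only the one small chunk has to be decompressed and converted to HTML.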

Thanks for the corrections. :P Though "English Wikipedia" would not be
required, as we are aiming for more languages.

Kind Regards
Nandeep

_______________________________________________
ilugd mailinglist -- ilugd@lists.linux-delhi.org
http://frodo.hserus.net/mailman/listinfo/ilugd
Archives at: http://news.gmane.org/gmane.user-groups.linux.delhi 
http://www.mail-archive.com/ilugd@lists.linux-delhi.org/
