On Tue, Dec 2, 2008 at 9:43 PM, Pratul Kalia <[EMAIL PROTECTED]> wrote:
> There are a couple of issues I noted with the Google Code page...
> These are language issues like improper capitalization and some
> grammar. These will be noticed ASAP by any critic. So I did :-)
>
> - The description under the title of the page should say "... English
>   Wikipedia..."
>
> - I corrected the description for you. Go through it and change where
>   you want to.
>
> This project is using: Python (for creating a converter which converts
> MediaWiki format text to the corresponding HTML page), Django (as a web
> server to fix CSS and other stuff), and PostgreSQL (as a database to
> locate the article and its span).
>
> The basic working module was taken from this article:
> http://users.softlab.ece.ntua.gr/~ttsiod/buildWikipediaOffline.html
> The difference is the Python parser for XML->HTML conversion. So in
> place of the PHP kind of setup which Wikipedia uses, this application
> uses Python. The idea which remains intact is that of using the XML dump
> provided by the MediaWiki site, breaking it into small files using
> bzip2recover, and then creating a database of each article and its
> location in the list of files. For now, I am using the dump from 24th
> July 2008. For me, this dump, the database and the rest of the
> configuration take around 4.6 GB (3.9 GB of XML files, a ~700 MB
> PostgreSQL dump and some CSS and JS files taken from the MediaWiki site
> itself).
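
The indexing step in the description quoted above (split the dump with bzip2recover, then record which chunk file holds each article) could look roughly like the sketch below. It is only an illustration under stated assumptions, not the project's code: the names index_chunks and find_chunk and the table layout are hypothetical, sqlite3 from the standard library stands in for PostgreSQL so the example runs on its own, and titles are located with a simple regex because each chunk is only a fragment of the XML and cannot be parsed as a whole document.

```python
import bz2
import html
import re
import sqlite3
from pathlib import Path

# Assumption: each article title sits on one line as <title>...</title>.
TITLE_RE = re.compile(r"<title>(.*?)</title>")


def index_chunks(chunk_dir, db_path=":memory:"):
    """Record which chunk file each article title appears in.

    Hypothetical helper: sqlite3 is used here in place of PostgreSQL.
    A title that is split across two chunks would be missed.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS articles (title TEXT, chunk TEXT)"
    )
    for chunk in sorted(Path(chunk_dir).glob("*.bz2")):
        # Each chunk is an independent bzip2 file; bytes cut at a chunk
        # boundary are replaced rather than raising a decode error.
        with bz2.open(chunk, "rt", encoding="utf-8", errors="replace") as fh:
            text = fh.read()
        rows = [
            (html.unescape(title), chunk.name)
            for title in TITLE_RE.findall(text)
        ]
        conn.executemany("INSERT INTO articles VALUES (?, ?)", rows)
    conn.commit()
    return conn


def find_chunk(conn, title):
    """Return the name of the chunk file holding `title`, or None."""
    row = conn.execute(
        "SELECT chunk FROM articles WHERE title = ? LIMIT 1", (title,)
    ).fetchone()
    return row[0] if row else None
```

With an index like this, serving an article would mean looking up its chunk, decompressing only that small file, and handing the article text to the MediaWiki-to-HTML converter.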

Thanks for the corrections. :P
Though "English Wikipedia" would not be required, as we are aiming for
more languages.

Kind Regards
Nandeep
_______________________________________________
ilugd mailinglist -- ilugd@lists.linux-delhi.org
http://frodo.hserus.net/mailman/listinfo/ilugd
Archives at: http://news.gmane.org/gmane.user-groups.linux.delhi http://www.mail-archive.com/ilugd@lists.linux-delhi.org/