On Wed, Jul 22, 2009 at 2:37 PM, Tei <oscar.vi...@gmail.com> wrote:

> On Wed, Jul 22, 2009 at 5:48 PM, Chengbin Zheng<chengbinzh...@gmail.com>
> wrote:
> ...
> >
> > Yes, the "TomeRaider" version is exactly the version I want for static
> > HTML.
> >
> > Just curious, is pages-articles.xml.bz2
> > <http://download.wikimedia.org/enwiki/20090713/enwiki-20090713-pages-articles.xml.bz2>
> > like a "TomeRaider" version? If not, what's the difference?
> >
> > And another curiosity, at
> > http://en.wikipedia.org/wiki/Wikipedia:TomeRaider_database, it says the
> > English Wikipedia database is only 3.3GB. Did they use compression? That
> > seems awfully small. Even if they did, that's an incredible compression
> > ratio, similar to 7-zip. I don't know how you can do that in an eBook
> > format. NTFS compression only brings the size down by 50%.
>
> At one point, Brion compressed it to 242 MB.
>
> http://www.mail-archive.com/wikitech-l@lists.wikimedia.org/msg00358.html
>
> You may also read this:
>  http://en.wikipedia.org/wiki/Solid_compression
>
>
> --
> --
> ℱin del ℳensaje.
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>


I have no doubt that it can be compressed to 3.3GB. I'm just curious how
that's possible in an eBook format. Does the 3.3GB include the skin, the
proper Wikipedia formatting, etc.?
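
To make the trade-off concrete, here is a rough Python sketch comparing
per-article compression with solid compression (everything in one stream).
It assumes that an eBook reader needs to open a single article without
decompressing the whole database, which is my assumption and not something
stated in this thread. The article texts are made up, and zlib stands in for
whatever compressor TomeRaider actually uses.

```python
import zlib

def per_article_size(articles):
    """Total bytes when every article is compressed on its own.

    This keeps random access cheap (decompress one article at a time),
    but redundancy shared between articles is not exploited.
    """
    return sum(len(zlib.compress(a.encode("utf-8"), 9)) for a in articles)

def solid_size(articles):
    """Total bytes when all articles are compressed as one stream.

    This is the "solid" approach: better ratio, but reaching one article
    means decompressing everything stored before it.
    """
    blob = "\n".join(articles).encode("utf-8")
    return len(zlib.compress(blob, 9))

# Hypothetical stand-in data: many short, similar wiki-style stubs.
articles = [
    f"'''Article {i}''' is a stub about topic {i}. [[Category:Stubs]] {{{{stub}}}}"
    for i in range(1000)
]

separate = per_article_size(articles)
solid = solid_size(articles)
print(f"separately: {separate} bytes, solid: {solid} bytes")
```

On repetitive text like this the solid stream should come out much smaller,
which is why a good ratio combined with quick access to single articles
looks surprising to me.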

I'm assuming that the pages-articles.xml.bz2 XML dump includes something
other than the raw articles? What else is in it?
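
One way to check this without unpacking the whole file would be to stream
the dump and tally page titles by their prefix. The sketch below is only
that, a sketch: it assumes the dump is a sequence of <page> elements that
each contain a <title>, and that non-article pages can be recognised by a
"Prefix:" in the title. The function names and the tiny sample dump are
hypothetical, made up for illustration.

```python
import bz2
import io
import xml.etree.ElementTree as ET
from collections import Counter

def local_name(tag):
    """Strip an XML namespace such as '{http://...}page' down to 'page'."""
    return tag.rsplit("}", 1)[-1]

def count_title_prefixes(stream):
    """Stream through a dump and count page titles by their 'Prefix:' part.

    Caveat: an article title that itself contains a colon is counted
    under whatever comes before the colon, so treat the result as a
    rough overview only.
    """
    counts = Counter()
    for _event, elem in ET.iterparse(stream, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text or ""
            prefix = title.split(":", 1)[0] if ":" in title else "(no prefix)"
            counts[prefix] += 1
        elif name == "page":
            # Drop the page's text once it is processed to limit memory use.
            elem.clear()
    return counts

# Tiny made-up dump, compressed in memory so the sketch runs on its own.
sample = b"""<mediawiki>
  <page><title>Solid compression</title><revision><text>...</text></revision></page>
  <page><title>7-Zip</title><revision><text>...</text></revision></page>
  <page><title>Template:Stub</title><revision><text>...</text></revision></page>
  <page><title>Category:Stubs</title><revision><text>...</text></revision></page>
</mediawiki>"""

with bz2.open(io.BytesIO(bz2.compress(sample)), "rb") as stream:
    counts = count_title_prefixes(stream)

for prefix, n in counts.most_common():
    print(prefix, n)
```

For the real file, the in-memory sample would be replaced by
bz2.open("enwiki-20090713-pages-articles.xml.bz2", "rb"), and the run would
take a long time on a dump of that size.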