On Wed, Sep 2, 2009 at 8:45 AM, Manuel Schneider < manuel.schnei...@wikimedia.ch> wrote:
> Hi Chengbin,
>
> ZIM is an upcoming standard for using HTML content offline. It is
> derived from the Zeno file format used on the German Wikipedia DVDs
> since 2006 (ZIM = Zeno IMproved).
>
> There are currently several reader applications for it, for instance
> the zimreader made by the openZIM project, or Kiwix.
> There are also some ports, such as Kiwix on Windows and zimreader on
> Openmoko / ARM.
>
> The zimreader by openZIM works like a small web server: it serves the
> contents of the ZIM file locally.
>
> Once the HTML dump on static.wikimedia.org is fixed and ZIM file
> creation has been integrated, you will be able to download fresh ZIM
> files of all Wikimedia projects directly from download.wikimedia.org.
>
> Currently the Kiwix team has created some ZIM files, and we are trying
> to build a ZIM file directory:
> http://openzim.org/ZIM_File_Archive
>
> ZIM stores the article text portion of the wiki's HTML output in a
> compressed cluster. It can also hold all other MIME types, such as
> images, CSS files, etc.
> http://openzim.org/ZIM_File_Format
>
> It is an open standard and has so far been developed and implemented
> in C++ by the openZIM team (sponsored by Wikimedia CH). There is a
> library (zimlib) which can be integrated into other reader or dumping
> applications to make them ZIM-aware.
>
> Using the open documentation, ZIM can be implemented in any other
> language as well.
> The idea of ZIM is to make the data files freely interchangeable with
> any reader application. It is also flexible enough to store works
> other than data from Wikipedia/MediaWiki. It also tries to keep the
> reader application as simple and stupid as possible: only
> decompression and HTML rendering need to be done, and an HTML renderer
> should be available on nearly all devices.
>
> Greets,
>
> Manuel
>
> On Wednesday, 2 September 2009, Chengbin Zheng wrote:
> > On Wed, Sep 2, 2009 at 8:13 AM, Manuel Schneider
> > <manuel.schnei...@wikimedia.ch> wrote:
> > > Hi Chengbin, hi list,
> > >
> > > static.wikimedia.org is currently not being updated, and while the
> > > dumps processing has been assigned to and completely rewritten by
> > > Tomasz Finc (developer at WMF), no assignment has been made
> > > concerning HTML dumps.
> > >
> > > We had a Wikipedia Offline meeting at Wikimania last week and
> > > discussed several issues. One issue is the fact that WMF wants to
> > > see the ZIM file format being used for offline dumps and has
> > > suggested including it in the regular dumping process.
> > > So one question was: when will that happen, and what is the status
> > > of WMF ZIM dumping?
> > > As ZIM uses HTML extracts, Tomasz clarified that once
> > > static.wikimedia.org has been rebuilt to be stable and
> > > sustainable, integrating ZIM would be trivial. But he also
> > > informed us that this task has not yet been assigned.
> > >
> > > As Brion Vibber and Erik Möller were at the meeting as well, we
> > > hope that this assignment will be made soon and that this task
> > > will get higher priority.
> > >
> > > That said, I would also advise you to use not the pure HTML dumps
> > > but the ZIM files for your Archos, because that's what they are
> > > meant for.
> > > A ZIM file containing all German Wikipedia articles (>900,000) is
> > > 1.4 GB; an additional full-text search index takes another 1 GB.
> > >
> > > Greets,
> > >
> > > Manuel
> > >
> > > On Wednesday, 2 September 2009, Chengbin Zheng wrote:
> > > > I bring this old issue up because I want to know whether
> > > > progress (or plans) are being made to update the static HTML
> > > > version of Wikipedia. B&H Photo just leaked the next generation
> > > > of Archos portable media players. Unbelievably, the rumors of a
> > > > 500GB version are true!
> > > > This is already tempting (especially the price at $420). Just
> > > > waiting for specs on September 15, the Archos event. I really
> > > > hope it will support NTFS so I can use the compression feature.
> > > >
> > > > It would be really cool and convenient to have an offline copy
> > > > of Wikipedia anywhere I go without the need for Wi-Fi. What am I
> > > > gonna do with 500GB?
> > > >
> > > > BTW, does anyone know the uncompressed size of the current
> > > > static HTML English Wikipedia version? Thanks.
> > > > _______________________________________________
> > > > Wikitech-l mailing list
> > > > Wikitech-l@lists.wikimedia.org
> > > > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> > >
> > > --
> > > Regards
> > > Manuel Schneider
> > >
> > > Wikimedia CH - Verein zur Förderung Freien Wissens
> > > Wikimedia CH - Association for the advancement of free knowledge
> > > www.wikimedia.ch
> >
> > I'm not familiar with the file extension .zim. What is that? Some
> > sort of compressed HTML format like .chm? Where can I get a .zim
> > file? I need to check whether this format is compatible with my
> > Archos's Opera browser.

Well, as I said, Archos devices are not computers.
They're merely portable video players with an internet browser. That's
why I seek the static HTML version of Wikipedia. Will there be easy
extraction of ZIM to HTML? Extracting a dump is too difficult.
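
To make the question concrete, here is a rough sketch of what "easy extraction" could look like. It is only an illustration under loud assumptions: the archive below is a toy in-memory dictionary, not the real ZIM layout, and `build_toy_archive` and `extract_to_directory` are hypothetical names that are not part of zimlib or any reader. It borrows only the idea from the quoted message (entries stored compressed, each with a MIME type) and uses Python's zlib as a stand-in for whatever compression ZIM actually uses.

```python
import zlib
from pathlib import Path


def build_toy_archive(pages):
    """Build a toy archive (hypothetical; NOT the real ZIM format).

    `pages` maps a relative path to (MIME type, text). The result maps the
    same path to (MIME type, zlib-compressed bytes).
    """
    return {
        path: (mime, zlib.compress(text.encode("utf-8")))
        for path, (mime, text) in pages.items()
    }


def extract_to_directory(archive, out_dir):
    """Decompress every entry and write it as a plain file under out_dir.

    Returns the list of relative paths written, in sorted order.
    """
    out_dir = Path(out_dir)
    written = []
    for path, (_mime, blob) in sorted(archive.items()):
        target = out_dir / path
        # Create intermediate directories such as "A/" before writing.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(zlib.decompress(blob))
        written.append(path)
    return written
```

With something along these lines, the output directory would hold ordinary HTML and CSS files that a browser-only device could open directly, which is all I am really asking for.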