On 15 December 2010 20:24, Manuel Schneider
<manuel.schnei...@wikimedia.ch> wrote:
> Hi Andrew,
>
> maybe you'd like to check out ZIM: this is a standardized file format
> for compressed HTML dumps, focused on Wikimedia content at the moment.
>
> There is some C++ code around to read and write ZIM files, and there
> are several projects using it, e.g. the WP1.0 project, the Israeli and
> Kenyan Wikipedia Offline initiatives, and more. The Wikimedia
> Foundation is also in the process of adopting the format so that it
> can provide ZIM files from Wikimedia wikis in the future.

This is very interesting and I'll be watching it. Where do the HTML
dumps come from? I'm pretty sure I've only seen "static" dumps for
Wikipedia, not for Wiktionary for example. I am also looking at
adapting the parser for offline use, to generate HTML from the
wikitext in the dump files.
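For anyone following along, the basic idea that makes tools like
seek-bzip2 (mentioned below) possible can be sketched in plain Python.
This is my own illustration, not seek-bzip2's code: it uses the simpler
"multi-stream" variant, where each chunk is compressed as its own bzip2
stream and we keep a byte-offset index, so one chunk can be decompressed
later without touching the rest of the archive. (seek-bzip2 itself goes
further and indexes the internal blocks of an ordinary single-stream
file.)

```python
# Sketch (illustration only, not seek-bzip2 itself): random access into
# a multi-stream bzip2 archive via an offset index.
import bz2

chunks = ["page one", "page two", "page three"]

# Build the archive and record where each independent stream starts.
archive = b""
offsets = []
for text in chunks:
    offsets.append(len(archive))
    archive += bz2.compress(text.encode("utf-8"))

def read_chunk(data, index, i):
    """Decompress only chunk i, starting at its recorded offset.

    BZ2Decompressor stops at the end of the first stream it sees, so
    the bytes of all later chunks are never decompressed.
    """
    d = bz2.BZ2Decompressor()
    return d.decompress(data[index[i]:]).decode("utf-8")

print(read_chunk(archive, offsets, 2))  # -> page three
```

This is essentially how the Wikimedia "multistream" dump files work:
the dump is a concatenation of bzip2 streams plus a separate index of
offsets, and you only ever decompress the stream you need.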

Andrew Dunbar (hippietrail)

> http://openzim.org/
>
> /Manuel
>
> Am 15.12.2010 16:21, schrieb Andrew Dunbar:
>> I've long been interested in offline tools that make use of WikiMedia
>> information, particularly the English Wiktionary.
>>
>> I've recently come across a tool which can provide random access to
>> a bzip2 archive without decompressing it, and I would like to use it
>> in my tools. However, I can't get it to compile and/or function with
>> any free Windows compiler I have access to. It works fine on the
>> *nix boxes I have tried, but my personal machine is a Windows XP
>> netbook.
>>
>> The tool is "seek-bzip2" by James Taylor and is available here:
>> http://bitbucket.org/james_taylor/seek-bzip2
>>
>> * The free Borland compiler won't compile it due to missing (Unix?)
>> header files.
>> * lcc compiles it, but it always fails with the error "unexpected EOF".
>> * mingw compiles it if the -m64 option is removed from the Makefile,
>> but then it shows the same behaviour as the lcc build.
>>
>> My C experience is now quite stale and my 64-bit programming
>> experience is negligible.
>>
>> (I'm also interested in hearing from other people working on offline
>> tools for dump files, wikitext parsing, or Wiktionary)
>>
>> Andrew Dunbar (hippietrail)
>>
>> _______________________________________________
>> Wikitech-l mailing list
>> Wikitech-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>
>
> --
> Regards
> Manuel Schneider
>
> Wikimedia CH - Verein zur Förderung Freien Wissens
> Wikimedia CH - Association for the advancement of free knowledge
> www.wikimedia.ch
>
