On Wed, Sep 16, 2015 at 12:51 AM, Federico Leva (Nemo)
wrote:
> Have you looked into what mwoffliner does?
> https://sourceforge.net/p/kiwix/other/ci/master/tree/mwoffliner/mwoffliner.js
+1 for mwoffliner. It should be *very* close to what you are looking for,
and avoids the need to parse wikit
As another suggestion, XOWA (http://gnosygnu.github.io/xowa/) can generate
a list of thumbs. It takes about 60 hours to parse the English Wikipedia
dump and generate a table of 4.78 million rows with the following columns:
* file name
* file extension
* repo (commons or local)
* file width
* file
Have you looked into what mwoffliner does?
https://sourceforge.net/p/kiwix/other/ci/master/tree/mwoffliner/mwoffliner.js
Maybe you can even just extract the images from the ZIM files.
Nemo
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
htt
On Mon, Sep 14, 2015 at 4:49 PM, Platonides wrote:
> You know it will fail for all kinds of images included through templates
> (particularly infoboxes), right?
Indeed, it is not possible to find out what thumbnails are used by a page
without actually parsing it. Your best bet is to wait until P
On 15/09/15 01:34, wp mirror wrote:
Idea. I am thinking of piping the *pages-articles.xml.bz2 dump file
through an AWK script to write all unique [[File:*]] tags into a file. This
can be done quickly. The question then is: given a file with all the media
tags, how can I generate all the thumbs?
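
For illustration only, here is a minimal sketch of the extraction step
described above, written in Python rather than AWK. The function names
(normalize, extract_file_names, main) and the regular expression are
assumptions made for this example and are not part of WP-MIRROR. The title
normalization is only an approximation of what MediaWiki does. As noted
elsewhere in this thread, a text scan like this misses images pulled in
through templates such as infoboxes, and it only collects names; it does
not generate any thumbs.

```python
import bz2
import html
import re
import sys

# Matches the target of [[File:Name.ext|...]] and the legacy [[Image:...]] alias.
# Assumption: only these two English namespace names are of interest.
FILE_LINK = re.compile(
    r"\[\[\s*(?:File|Image)\s*:\s*([^\]|#<>{}\[\n]+)", re.IGNORECASE
)


def normalize(name):
    """Approximate MediaWiki title normalization (not the full rules)."""
    # The XML dump escapes characters such as & as entities.
    name = html.unescape(name)
    # Underscores and spaces are equivalent; collapse runs and trim.
    name = " ".join(name.replace("_", " ").split())
    # The first letter of a title is uppercased on Wikipedia.
    return name[:1].upper() + name[1:]


def extract_file_names(lines):
    """Return the set of unique file names linked in an iterable of text lines."""
    seen = set()
    for line in lines:
        for match in FILE_LINK.finditer(line):
            name = normalize(match.group(1))
            if name:
                seen.add(name)
    return seen


def main(dump_path, out_path):
    # Stream the compressed dump line by line instead of unpacking it first.
    with bz2.open(dump_path, "rt", encoding="utf-8", errors="replace") as dump:
        names = extract_file_names(dump)
    with open(out_path, "w", encoding="utf-8") as out:
        for name in sorted(names):
            out.write(name + "\n")


if __name__ == "__main__":
    # Hypothetical usage: python extract_files.py DUMP.xml.bz2 OUTPUT.txt
    if len(sys.argv) == 3:
        main(sys.argv[1], sys.argv[2])
```

The set of names for a full dump is held in memory, which should be
acceptable for a list of a few million short strings but is an assumption
worth checking on the target machine.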
Dear Brian,
On 9/13/15, Brian Wolff wrote:
> On 9/12/15, wp mirror wrote:
>> 0) Context
>>
>> I am currently developing new features for WP-MIRROR (see <
>> https://www.mediawiki.org/wiki/Wp-mirror>).
>>
>> 1) Objective
>>
>> I would like WP-MIRROR to generate all image thumbs during the mirror
On 9/12/15, wp mirror wrote:
> 0) Context
>
> I am currently developing new features for WP-MIRROR (see <
> https://www.mediawiki.org/wiki/Wp-mirror>).
>
> 1) Objective
>
> I would like WP-MIRROR to generate all image thumbs during the mirror build
> process. This is so that mediawiki can render p
0) Context
I am currently developing new features for WP-MIRROR (see <
https://www.mediawiki.org/wiki/Wp-mirror>).
1) Objective
I would like WP-MIRROR to generate all image thumbs during the mirror build
process. This is so that mediawiki can render pages quickly using
precomputed thumbs.
2) Du