I happen to work on a tool (initially for Liam Wyatt) that might do some of
what you want on Wikidata. Given a Wikidata Query (separate topic ;-) or a
simple list of Wikidata items, it can record changes made to these items
over time. It records the JSON for each Wikidata item, at most one
revision per day.
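
To illustrate the "one revision per day" idea, here is a minimal sketch
(not the tool's actual code). It assumes each revision is a dict with a
"revid" and an ISO 8601 UTC "timestamp", as the MediaWiki API returns them,
and that the latest revision of each day is the one worth keeping; the
function name is made up for the example.

```python
def one_revision_per_day(revisions):
    """Keep only the latest revision of each UTC day.

    revisions: iterable of dicts with "revid" (int) and "timestamp"
    (ISO 8601 string such as "2015-10-06T16:54:00Z"). This input shape
    is an assumption for the sketch.
    """
    latest = {}
    for rev in revisions:
        day = rev["timestamp"][:10]  # "YYYY-MM-DD" prefix of the timestamp
        kept = latest.get(day)
        # Timestamps in the same ISO 8601 format sort lexicographically.
        if kept is None or rev["timestamp"] > kept["timestamp"]:
            latest[day] = rev
    # Return the kept revisions in chronological order.
    return [latest[day] for day in sorted(latest)]
```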

A front-end (to be written) can then extract things like number of
sitelinks (Wikipedia articles) for these items over time; Wikidata labels
in different languages; number/type of statements added; etc. Ideally, this
can be exported as a table, to make pretty stats in R (or the like).
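
A rough sketch of what such an extraction could look like, assuming the
recorded snapshots are kept as a dict mapping a date string to the standard
Wikidata entity JSON (with its "labels", "sitelinks" and "claims" keys).
The function names, the snapshot layout and the list of non-Wikipedia site
ids are assumptions for illustration only; the exclusion list is not
exhaustive.

```python
import csv
import io

# Site ids that end in "wiki" but are not Wikipedias (assumed, not exhaustive).
NON_WIKIPEDIA_SITES = {
    "commonswiki", "specieswiki", "metawiki", "wikidatawiki", "mediawikiwiki",
}


def summarize_snapshot(entity):
    """Return basic counts for one recorded Wikidata entity JSON document."""
    sitelinks = entity.get("sitelinks", {})
    wikipedia_links = [
        site for site in sitelinks
        if site.endswith("wiki") and site not in NON_WIKIPEDIA_SITES
    ]
    claims = entity.get("claims", {})
    return {
        "sitelinks": len(sitelinks),
        "wikipedia_sitelinks": len(wikipedia_links),
        "labels": len(entity.get("labels", {})),
        # "claims" maps each property id to a list of statements.
        "statements": sum(len(stmts) for stmts in claims.values()),
    }


def snapshots_to_csv(snapshots):
    """Turn {"YYYY-MM-DD": entity_json} into CSV text, one row per day."""
    columns = ["sitelinks", "wikipedia_sitelinks", "labels", "statements"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["date"] + columns)
    for day in sorted(snapshots):
        summary = summarize_snapshot(snapshots[day])
        writer.writerow([day] + [summary[c] for c in columns])
    return out.getvalue()
```

The resulting CSV text can be written to a file and loaded in R with
read.csv().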

As I said, it's work in progress, but if you have an (initial) list of
items, I can start "recording".

On Tue, Oct 6, 2015 at 4:54 PM Andrew Gray <andrew.g...@dunelm.org.uk>
wrote:

> On 6 October 2015 at 14:12, Amir E. Aharoni
> <amir.ahar...@mail.huji.ac.il> wrote:
> > Thanks for this email.
> >
> > This raises a wider question: What is the comfortable way to compare the
> > coverage of a topic in different languages?
> >
> > For example, I'd love to see a report that says:
> >
> > Number of articles about UNESCO cultural heritage:
> > English Wikipedia: 1000
> > French Wikipedia: 1200
> > Hebrew Wikipedia: 742
> > etc.
> >
> > And also to track this over time, so if somebody would work hard on
> > creating articles about UNESCO cultural heritage in Hebrew, I'd see a
> > trend graph.
>
> There are two general approaches to this:
>
> a) On Wikidata
> b) On the individual wikis
>
> Approach (a) would rely on having a defined set of things in Wikidata
> that we can identify. For example, "is a World Heritage Site" would be
> easy enough, since we have a property explicitly dealing with WHS
> identifiers (and we have 100% coverage in Wikidata). "Is of interest
> to UNESCO" is a trickier one - but if you can construct a suitable
> Wikidata query...
>
> As Federico notes, for WHS records, we can generate a report like
> https://tools.wmflabs.org/mix-n-match/?mode=sitestats&catalog=93
> (57.4% coverage on hewiki!). No graphs but if you were interested then
> you could probably set one up without much work.
>
> b) is more useful for fuzzy groups like "of relevance to UNESCO",
> since this is more or less perfect for a category system. However, it
> would require examining the category tree for each WP you're
> interested in to figure out exactly which categories are relevant, and
> then running a script to count those daily.
>
> A.
> --
> - Andrew Gray
>   andrew.g...@dunelm.org.uk
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>