I happen to work on a tool (initially for Liam Wyatt) that might do some of what you want on Wikidata. Given a Wikidata Query (separate topic ;-) or a simple list of Wikidata items, it can record changes made to these items over time. It records the JSON for each Wikidata item, at most one revision per day.
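To make the "at most one revision per day" idea concrete, here is a rough sketch in Python. It is not the actual tool: the `SnapshotStore` class, its method names, and the choice to keep the first revision seen on a given day (rather than the last) are all assumptions made for illustration, and a real recorder would presumably write to a database rather than a dict.

```python
import json
from datetime import datetime, timezone


class SnapshotStore:
    """Hypothetical in-memory store: one JSON snapshot per item per day."""

    def __init__(self):
        # Maps (item_id, calendar day) -> (revision_id, serialized entity JSON).
        self._snapshots = {}

    def record(self, item_id, revision_id, entity, when=None):
        """Store the entity JSON unless this item already has a snapshot today.

        Returns True if the snapshot was stored, False if it was skipped.
        Assumption: the first revision seen on a given day wins.
        """
        if when is None:
            when = datetime.now(timezone.utc)
        key = (item_id, when.date())
        if key in self._snapshots:
            return False
        self._snapshots[key] = (revision_id, json.dumps(entity, sort_keys=True))
        return True

    def history(self, item_id):
        """Return (day, revision_id, entity) tuples for one item, oldest first."""
        rows = [
            (day, rev, json.loads(blob))
            for (qid, day), (rev, blob) in self._snapshots.items()
            if qid == item_id
        ]
        return sorted(rows, key=lambda row: row[0])
```

Calling `record()` twice for the same item on the same day keeps only the first call's JSON; `history()` then gives one row per recorded day, which is the shape a front-end would need to plot changes over time.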
A front-end (to be written) can then extract things like number of sitelinks (Wikipedia articles) for these items over time; Wikidata labels in different languages; number/type of statements added; etc. Ideally, this can be exported as a table, to make pretty stats in R (or the like).

As I said, it's work in progress, but if you have an (initial) list of items, I can start "recording".

On Tue, Oct 6, 2015 at 4:54 PM Andrew Gray <andrew.g...@dunelm.org.uk> wrote:

> On 6 October 2015 at 14:12, Amir E. Aharoni
> <amir.ahar...@mail.huji.ac.il> wrote:
> > Thanks for this email.
> >
> > This raises a wider question: What is the comfortable way to compare the
> > coverage of a topic in different languages?
> >
> > For example, I'd love to see a report that says:
> >
> > Number of articles about UNESCO cultural heritage:
> > English Wikipedia: 1000
> > French Wikipedia: 1200
> > Hebrew Wikipedia: 742
> > etc.
> >
> > And also to track this over time, so if somebody would work hard on creating
> > articles about UNESCO cultural heritage in Hebrew, I'd see a trend graph.
>
> There's two general approaches to this:
>
> a) On Wikidata
> b) On the individual wikis
>
> Approach (a) would rely on having a defined set of things in Wikidata
> that we can identify. For example, "is a World Heritage Site" would be
> easy enough, since we have a property explicitly dealing with WHS
> identifiers (and we have 100% coverage in Wikidata). "Is of interest
> to UNESCO" is a trickier one - but if you can construct a suitable
> Wikidata query...
>
> As Federico notes, for WHS records, we can generate a report like
> https://tools.wmflabs.org/mix-n-match/?mode=sitestats&catalog=93
> (57.4% coverage on hewiki!). No graphs but if you were interested then
> you could probably set one up without much work.
>
> b) is more useful for fuzzy groups like "of relevance to UNESCO",
> since this is more or less perfect for a category system. However, it
> would require examining the category tree for each WP you're
> interested in to figure out exactly which categories are relevant, and
> then running a script to count those daily.
>
> A.
> --
> - Andrew Gray
> andrew.g...@dunelm.org.uk
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l