Which is why we have Wikidata?

On Tue, Jan 24, 2017 at 8:03 AM, Stuart A. Yeates <syea...@gmail.com> wrote:
> "closure of the [[Category:Australia]]" is not going to work. In en.wiki
> subcategories are not subsets in any mathematical sense and the category
> tree has many, many loops and no roots.
>
> cheers
> stuart
>
> --
> ...let us be heard from red core to black sky
>
> On Tue, Jan 24, 2017 at 2:12 PM, Kerry Raymond <kerry.raym...@gmail.com>
> wrote:
>
>> As previously came up in discussion about chapters, it would be very
>> useful to have national data about Wikipedia activities, which can be
>> determined (generally) from IP addresses. Now I understand the privacy
>> argument in relation to logged-in users (not saying I agree with it though
>> in relation to aggregate data). However, can we find a proxy that does not
>> have the privacy considerations.
>>
>> My hypothesis is that national content is predominantly written by users
>> resident in that nation. And that therefore activity on national content
>> can be used as a proxy for national user editing activity.
>>
>> In the case of Australia, we could describe Australian national content
>> in either of two ways: articles within the closure of the
>> [[Category:Australia]] and/or those tagged as {{WikiProject Australia}}.
>> There are arguments for/against either (neither is perfect, in my
>> experience the category closure will tend to have false positives and the
>> project will tend to have false negatives).
>>
>> I would like to know what correlation exists between national editor
>> activity (as determined from IP addresses mapped to location) and national
>> content edits and if/how it changes over time for various nations. This is
>> research that only WMF can do because WMF has the IP addresses and the rest
>> of us can’t have them for privacy reasons.
>>
>> If we could establish that a strong-enough correlation existed between
>> them, we could use national content activity (for which there is no privacy
>> consideration) as a proxy for national editing activity. And we might even
>> be able to come up with a multiplier for each nation to provide comparable
>> data for national editing activity.
>>
>> Now, it may be that we need to restrict the edits themselves in some way
>> to maximise the correlations between national content and same-nation
>> editor activity.
>>
>> My second hypothesis is “semantic” edits (e.g. edits that add large
>> amounts of content or citation) to national content will be more highly
>> correlated with same-nation editors than “syntactic” edits (e.g. fix
>> spelling, punctuation or Manual of Style issues) will be. I suspect most
>> bots and other automated/semi-automated edits are doing syntactic edits.
>>
>> Now, some of you will probably be aware of
>> [https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2017-01-17/Recent_research
>> Female Wikipedians aren't more likely to edit women biographies]. So it
>> may well be that my patriotic-editing hypothesis is also untrue. But it
>> would be nice to know one way or the other.
>>
>> Kerry
>>
>> _______________________________________________
>> Wiki-research-l mailing list
>> Wiki-research-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
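
A minimal sketch of the loop problem raised in the quoted reply, assuming the subcategory links have already been fetched into a plain dict. The `subcats` mapping, the function name `category_closure` and the toy graph are all hypothetical and are not real en.wiki data. A visited set keeps the walk finite when the graph has loops, and a depth cap is one crude way to limit drift into unrelated topics; neither makes subcategories behave as true subsets.

```python
from collections import deque


def category_closure(subcats, root, max_depth=None):
    """Return {category: depth} for every category reachable from root.

    subcats maps a category name to a list of its direct subcategories.
    Categories already seen are never re-queued, so loops terminate.
    """
    depth = {root: 0}
    queue = deque([root])
    while queue:
        cat = queue.popleft()
        if max_depth is not None and depth[cat] >= max_depth:
            continue  # do not expand categories at the depth cap
        for child in subcats.get(cat, []):
            if child not in depth:  # seen before: skip, this breaks loops
                depth[child] = depth[cat] + 1
                queue.append(child)
    return depth


# Toy graph, invented for illustration only (not real en.wiki data).
toy = {
    "Australia": ["Australian people", "Geography of Australia"],
    "Australian people": ["Australian emigrants"],
    "Australian emigrants": ["Australia"],   # a loop back to the start
    "Geography of Australia": ["Oceania"],   # drifts outside the topic
    "Oceania": ["New Zealand"],
}

full = category_closure(toy, "Australia")
capped = category_closure(toy, "Australia", max_depth=1)
```

On the toy graph the uncapped walk reaches "New Zealand" through "Oceania", which is the kind of false positive described in the quoted message; with `max_depth=1` it stops at the two direct subcategories.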