Which is why we have Wikidata?

On Tue, Jan 24, 2017 at 8:03 AM, Stuart A. Yeates <syea...@gmail.com> wrote:

> "closure of the [[Category:Australia]]" is not going to work. In en.wiki
> subcategories are not subsets in any mathematical sense and the category
> tree has many, many loops and no roots.
>
> cheers
> stuart
>
> --
> ...let us be heard from red core to black sky
>
> On Tue, Jan 24, 2017 at 2:12 PM, Kerry Raymond <kerry.raym...@gmail.com>
> wrote:
>
>> As previously came up in discussion about chapters, it would be very
>> useful to have national data about Wikipedia activities, which can be
>> determined (generally) from IP addresses. Now I understand the privacy
>> argument in relation to logged-in users (not saying I agree with it though
>> in relation to aggregate data). However, can we find a proxy that does not
>> raise the same privacy considerations?
>>
>>
>>
>> My hypothesis is that national content is predominantly written by users
>> resident in that nation. And that therefore activity on national content
>> can be used as a proxy for national user editing activity.
>>
>>
>>
>> In the case of Australia, we could describe Australian national content
>> in either of two ways: articles within the closure of the
>> [[Category:Australia]] and/or those tagged as {{WikiProject Australia}}.
>> There are arguments for and against each (neither is perfect; in my
>> experience the category closure will tend to have false positives and the
>> project tag will tend to have false negatives).
>>
>>
>>
>> I would like to know what correlation exists between national editor
>> activity (as determined from IP addresses mapped to location) and national
>> content edits and if/how it changes over time for various nations. This is
>> research that only WMF can do because WMF has the IP addresses and the rest
>> of us can’t have them for privacy reasons.
>>
>>
>>
>> If we could establish that a strong-enough correlation existed between
>> them, we could use national content activity (for which there is no privacy
>> consideration) as a proxy for national editing activity. And we might even
>> be able to come up with a multiplier for each nation to provide comparable
>> data for national editing activity.
>>
>>
>>
>> Now, it may be that we need to restrict the edits themselves in some way
>> to maximise the correlations between national content and same-nation
>> editor activity.
>>
>>
>>
>> My second hypothesis is that “semantic” edits (e.g. edits that add large
>> amounts of content or citations) to national content will be more highly
>> correlated with same-nation editors than “syntactic” edits (e.g. fix
>> spelling, punctuation or Manual of Style issues) will be. I suspect most
>> bots and other automated/semi-automated edits are doing syntactic edits.
>>
>>
>>
>> Now, some of you will probably be aware of
>> [https://en.wikipedia.org/wiki/Wikipedia:Wikipedia_Signpost/2017-01-17/Recent_research
>> Female Wikipedians aren't more likely to edit women biographies]. So it may well
>> be that my patriotic-editing hypothesis is also untrue. But it would be
>> nice to know one way or the other.
>>
>>
>>
>> Kerry
>>
>>
>>
>> _______________________________________________
>> Wiki-research-l mailing list
>> Wiki-research-l@lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>>
>>
>
