Hi Brian,

2015-03-30 0:25 GMT+02:00 Brian <reflect...@gmail.com>:
> Although the initial goal of the Netflix Prize was to design a
> collaborative filtering algorithm, it became notorious when the data was
> used to de-anonymize Netflix users. Researchers proved that given just a
> user's movie ratings on one site, you can plug those ratings into another
> site, such as the IMDB. You can then take that information, and with some
> Google searches and optionally a bit of cash (for websites that sell user
> information, including, in some cases, their SSN) figure out who they are.
> You could even drive up to their house and take a selfie with them, or
> follow them to work and meet their boss and tell them about their views on
> the topics they were editing.

Somewhat tangentially, and to bring this topic back to a more
scientific setting, I would like to point out that there has already
been research on this topic.

I highly recommend reading the following paper:

Lieberman, Michael D., and Jimmy Lin. "You Are Where You Edit:
Locating Wikipedia Contributors through Edit Histories." ICWSM. 2009.
(PDF 
<http://www.pensivepuffin.com/dwmcphd/syllabi/infx598_wi12/papers/wikipedia/lieberman-lin.YouAreWhereYouEdit.ICWSM09.pdf>)

For those of you who don't want to read the whole paper, you can find
a recap of the most relevant findings in this presentation by Maurizio
Napolitano:
<http://www.slideshare.net/napo/social-geography-wikipedia-a-quick-overwiew>

The main idea is to associate spatial coordinates with Wikipedia
articles whenever possible; these articles are called "geopages". Then
you extract from the article histories the users who have edited a
geopage. If you plot the geopages edited by a given contributor, you
can see that they tend to cluster, so you can define an "edit area".
The study finds that 30-35% of contributors concentrate their edits in
an edit area smaller than 1 deg^2 (~12,362 km^2, approximately the
area of Connecticut or Northern Ireland[1] (thanks, Wikipedia!)).
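
To make the idea a bit more concrete, here is a minimal Python sketch
of how an "edit area" could be computed. It rests on assumptions of
mine, not on the paper: I take the edit area to be a plain
latitude/longitude bounding box (the paper's exact definition may
differ), I ignore wrap-around at the antimeridian, and the function
names (edit_areas, share_below) and the sample data are made up for
illustration.

```python
def edit_areas(edits):
    """Map each contributor to the bounding-box area (in deg^2) of their geopage edits.

    `edits` is an iterable of (user, lat, lon) tuples, one per edit to a geopage.
    Assumption: the edit area is a simple lat/lon bounding box.
    """
    boxes = {}  # user -> [min_lat, max_lat, min_lon, max_lon]
    for user, lat, lon in edits:
        box = boxes.setdefault(user, [lat, lat, lon, lon])
        box[0] = min(box[0], lat)
        box[1] = max(box[1], lat)
        box[2] = min(box[2], lon)
        box[3] = max(box[3], lon)
    return {u: (b[1] - b[0]) * (b[3] - b[2]) for u, b in boxes.items()}


def share_below(areas, threshold=1.0):
    """Fraction of contributors whose edit area is smaller than `threshold` deg^2."""
    if not areas:
        return 0.0
    return sum(a < threshold for a in areas.values()) / len(areas)


# Made-up sample data, for illustration only.
sample = [
    ("alice", 45.0, 11.0),
    ("alice", 45.5, 11.5),
    ("bob", 0.0, 0.0),
    ("bob", 10.0, 20.0),
]
areas = edit_areas(sample)
print(areas)
print(share_below(areas))
```

Note that in this sketch a contributor with a single geopage edit gets
an area of 0 and so counts as "below the threshold"; a real analysis
would probably require a minimum number of geopage edits per user.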

For another free/libre project with a geographic focus, like
OpenStreetMap, this is even more marked; check out, for example, the
tool «“Your OSM Heat Map” (aka Where did you contribute?)»[2] by
Pascal Neis.

This, of course, is not a straightforward de-anonymization, but these
methods work in principle for every contributor, even if you obfuscate
their IP or username (provided that you can still assign all the edits
from a given user to a unique and unambiguous identifier).

C
[1] https://en.wikipedia.org/wiki/Square_degree
[2a] http://yosmhm.neis-one.org/
[2b] http://neis-one.org/2011/08/yosmhm/

_______________________________________________
Wikimedia-l mailing list, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, 
<mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>
