Re: [Wiki-research-l] Polling the watcher's of a page. Possible?

2014-01-01 Thread WereSpielChequers
Max,

I wouldn't know if the Foundation was even aware of the incident; they
weren't the source of the data. But it was rather high profile in the
community.

I expect there have been other cases of data being extracted for
researchers, but watchlist data is, for some people, a sensitive issue,
hence my alternative suggestion. If you want to go forward with this, I'd
suggest either finding a better way to look at how groups of editors focus
on the same articles, or doing something with anonymised watchlist data.
You might get that from the WMF, or indeed by posting your credentials as a
researcher and inviting contributors to email you their watchlists for
research that you will anonymise.

I think it would be interesting to see some research on how closely an
editor's watchlist reflects their editing, and how large a watchlist gets
before it becomes so big that the editor no longer stays on top of it. But
you'd also need to ask a few questions, such as: under what circumstances
do you take a page off your watchlist?

On 1 January 2014 05:44, Klein,Max kle...@oclc.org wrote:


  Jonathan,

 So is that it, then? Is the Foundation feeling too burned to ever give out
 the data again? Has there been any other precedent since then of releasing
 data to academics?



  Kerry,

 Thanks for the link to the paper. I just saw this in the latest
 newsletter.


  Brian,
 The idea of sending a script to follow other editors and then survey them
 would be a good way to train a learning algorithm. I hadn't thought of
 that; mostly I expected to just pore over some old edits. Thanks for the
 idea.


  Maximilian Klein
 Wikipedian in Residence, OCLC
 +17074787023



Re: [Wiki-research-l] Polling the watcher's of a page. Possible?

2013-12-31 Thread WereSpielChequers
How many watchlisters a page has is a sensitive issue; we've already had
one incident where a researcher acquired a list of unwatched pages for a
vandalism experiment.

However, anyone who watches a page will also have that page's talk page on
their watchlist, so while you can't directly contact everyone who has that
page on their watchlist, you could conceivably attract the attention of
some of them with a message on its talk page. But if you were doing this on
more than one or two pages, you would need your note to be very relevant to
the watchlisters of each page.

Regards

Jonathan



[Wiki-research-l] Polling the watcher's of a page. Possible?

2013-12-30 Thread Klein,Max
Hello Research,

Is it possible to query for the watchers of a page? It does not seem to be in 
the API, nor is the watchlist table, with its wl_user column, in the database 
replicas (where I thought MediaWiki stores it). I imagine this is for privacy 
reasons, correct? If so, how would one gain access?

I have been talking with an econophysicist who thinks that we could apply a 
contagion algorithm to see which edits are contagious. (I met this 
econophysicist at the Berkeley Data Science Faire, at which Wikimedia 
Analytics presented, so it was worth it in the end.)


Maximilian Klein
Wikipedian in Residence, OCLC
+17074787023
___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Polling the watcher's of a page. Possible?

2013-12-30 Thread Kerry Raymond
No, you can't, for reasons of privacy. See:

https://en.wikipedia.org/wiki/Help:Watching_pages#Privacy

But I concur with your theory that edits are contagious. I often find that
when I get the notification that a watched page has changed, I go and look
at the page. While I am there, I often spot a "little thing that needs
doing", which sometimes is just a simple single edit and other times
initiates a marathon of editing activity for the next couple of days :-)

 

If you want to test this theory, I think using the set of editors of the
page might be a pretty good approximation of the watchlist. A lot of people
have the "add the pages and files I edit to my watchlist" option set in
their preferences (I know I do).
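
A minimal sketch of how the editor set of a page could be collected as a
stand-in for its watchers. It assumes the standard MediaWiki API revisions
query (action=query, prop=revisions, rvprop=user|timestamp); the function
names are hypothetical, continuation of long histories is not handled, and
the sample response is made up for illustration.

```python
def revisions_query_params(title, limit=500):
    """Build parameters for a MediaWiki API query listing a page's revisions."""
    return {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "user|timestamp",
        "rvlimit": limit,
        "format": "json",
    }


def editors_from_response(response):
    """Collect the distinct user names found in a revisions query response."""
    editors = set()
    for page in response.get("query", {}).get("pages", {}).values():
        for revision in page.get("revisions", []):
            # Revisions whose user name is hidden carry no "user" key.
            if "user" in revision:
                editors.add(revision["user"])
    return editors


# Made-up response in the shape the API returns with format=json.
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "123": {
                "title": "Example",
                "revisions": [
                    {"user": "Alice", "timestamp": "2013-12-30T09:00:00Z"},
                    {"user": "Bob", "timestamp": "2013-12-30T09:20:00Z"},
                    {"user": "Alice", "timestamp": "2013-12-31T10:00:00Z"},
                ],
            }
        }
    }
}

print(sorted(editors_from_response(SAMPLE_RESPONSE)))
```

The parameters would be sent to a wiki's api.php endpoint with any HTTP
client. The result is only a proxy: it misses watchers who never edited the
page and includes editors who do not watch it.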

 

For the purpose of declaring one edit as being contagious (that is, it
causes another edit), what criteria would you use? I would assume you need
some time bounds here. I think "kick-off" edits need to be identified.
These would be edits that occurred sufficiently long after the previous edit
that contagion could not be a factor. Then, after the kick-off edit, you
would be looking for one or more "reaction" edits that occurred fairly
quickly after one another, suggesting a contagion based on watchlists. So it
seems there are two time parameters: the kick-off threshold and the reaction
threshold. I don't think these are necessarily the same value (i.e. is there
some grey zone in between where the edits can be categorised as neither
kick-off nor reaction?).
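
The two-threshold idea above could be sketched roughly as follows. This is
only one possible reading: the function name, the label strings and the
threshold values are assumptions for illustration, and treating the first
edit in the data as a kick-off is an arbitrary choice.

```python
from datetime import datetime, timedelta


def classify_edits(timestamps, kickoff_gap, reaction_gap):
    """Label each edit to one page as 'kick-off', 'reaction' or 'neither'.

    timestamps must be sorted in ascending order.
    """
    labels = []
    previous = None
    in_burst = False  # True while inside a chain started by a kick-off edit
    for ts in timestamps:
        gap = None if previous is None else ts - previous
        if gap is None or gap >= kickoff_gap:
            # Assumption: the first edit in the data counts as a kick-off.
            label = "kick-off"
            in_burst = True
        elif in_burst and gap <= reaction_gap:
            label = "reaction"
        else:
            # Grey zone between the thresholds, or a quick follow-up to an
            # edit that was itself not part of a kick-off chain.
            label = "neither"
            in_burst = False
        labels.append(label)
        previous = ts
    return labels


# Placeholder thresholds, to be calibrated against real data.
KICKOFF_GAP = timedelta(days=1)
REACTION_GAP = timedelta(hours=1)
```

Here the reaction gap is measured from the immediately preceding edit, so a
chain of quick edits stays a chain; measuring from the kick-off edit instead
would be an equally defensible variant.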

 

In terms of setting these threshold(s), you might need some real-life data
to train on. So maybe you could start by asking if some editors would send
you a copy of their watchlist, and you could write a script that compares it
with their edit history over the same time frame (plus a bit to cater for
burstiness). From that you could come up with a set of edits that look like
contagious ones, and you could ask the editors to say "yes / no / don't
remember" to try to see 1) whether contagion appears to be happening and
2) what the time thresholds need to be. Then test it on a bigger set of data
using edit history as a proxy for watchlists.
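
A rough sketch of the comparison script described above, assuming a
volunteer has supplied their watchlist and that edit histories are already
available locally. All names and data structures here are hypothetical; the
output is just a list of candidate edits to put in front of the editor for a
"yes / no / don't remember" answer.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def candidate_contagious_edits(watchlist, own_edits, others_edits, window):
    """Find the volunteer's edits that closely follow someone else's edit.

    watchlist: set of page titles the volunteer watches.
    own_edits: list of (title, timestamp) pairs for the volunteer's edits.
    others_edits: dict mapping title to a sorted list of timestamps of
        edits made by other users.
    window: maximum delay for an edit to count as a candidate.
    Returns (title, triggering_timestamp, own_timestamp) tuples.
    """
    candidates = []
    for title, ts in own_edits:
        if title not in watchlist:
            continue
        earlier = others_edits.get(title, [])
        # Index of the first edit by others at or after ts; everything
        # before that index happened strictly earlier.
        i = bisect_left(earlier, ts)
        if i and ts - earlier[i - 1] <= window:
            candidates.append((title, earlier[i - 1], ts))
    return candidates


# Made-up example data.
example_watchlist = {"A", "B"}
example_others = {"A": [datetime(2013, 12, 1, 10, 0)]}
example_own = [("A", datetime(2013, 12, 1, 10, 30))]
print(candidate_contagious_edits(
    example_watchlist, example_own, example_others, timedelta(hours=1)))
```

The window plays the role of the reaction threshold; the editors' answers
could then be used to tune it before moving to edit history as a proxy.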

 

Kerry

 

 

 

 



Re: [Wiki-research-l] Polling the watcher's of a page. Possible?

2013-12-30 Thread Brian Keegan
Check out Michael Kummer's paper that looks at a similar topic (contagion
in pageviews among linked articles) from an econometrics perspective:
"Spillovers in Networks of User Generated Content – Evidence from 23
Natural Experiments on Wikipedia"

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2356199







-- 
Brian C. Keegan, Ph.D.
Post-Doctoral Research Fellow, Lazer Lab
College of Social Sciences and Humanities, Northeastern University
Fellow, Institute for Quantitative Social Sciences, Harvard University
Affiliate, Berkman Center for Internet & Society, Harvard Law School

b.kee...@neu.edu
www.brianckeegan.com
M: 617.803.6971
O: 617.373.7200
Skype: bckeegan