Re: [Wiki-research-l] Tracking authorship of wiki content
Sorry, I meant to say: if there is interest in the code for the MediaWiki extension, let me know, and _we_ will clean it up and put it on GitHub (you won't have to clean it up :-).

Luca

On Sat, Aug 22, 2015 at 7:25 AM, Luca de Alfaro l...@dealfaro.com wrote:
> Thank you Federico. Done. BTW, we also had code for a MediaWiki extension that computed this in real time. [...]
Re: [Wiki-research-l] Tracking authorship of wiki content
Thank you Federico. Done.

BTW, we also had code for a MediaWiki extension that computed this in real time. That code has not yet been cleaned up, but it is available from here:
https://sites.google.com/a/ucsc.edu/luca/the-wikipedia-authorship-project

If there is interest, I don't think it would be hard to clean it up and post it to GitHub. The extension uses the edit hook to attribute the content of every new revision of a wiki page, using the same earliest-plausible-attribution algorithm we used in the paper.

Luca

On Sat, Aug 22, 2015 at 12:20 AM, Federico Leva (Nemo) nemow...@gmail.com wrote:
> Luca de Alfaro, 22/08/2015 01:51:
>> So I got inspired, and I cleaned up some code that Michael Shavlovsky and I had written for this: https://github.com/lucadealfaro/authorship-tracking
>
> Great! It's always good when code behind a paper is published; it's never too late. If you can, please add a link from wikipapers: http://wikipapers.referata.com/wiki/Form:Tool
>
> Nemo
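To make the extension's flow concrete, here is a minimal sketch in Python (the extension itself is MediaWiki PHP; this is only the equivalent logic). The processor calls are the ones from the usage example later in this thread; the hook name, the store object, and the whitespace tokenization are hypothetical.

    # Illustrative sketch only: on each page save, feed the new revision's
    # words to the attribution processor and record the per-word
    # earliest-revision attribution. 'store' is any dict-like persistence
    # layer; the real extension would serialize the processor to the DB.
    import authorship_attribution

    def on_revision_saved(store, page_id, rev_id, wikitext):
        a = store.get(page_id)
        if a is None:
            a = authorship_attribution.AuthorshipAttribution.new_attribution_processor(N=4)
        a.add_revision(wikitext.split(), revision_info=rev_id)
        store[page_id] = a
        return a.get_attribution()  # per-word earliest plausible attribution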
[Wiki-research-l] Tracking authorship of wiki content
Dear All,

I was yesterday at OpenSym (many thanks to Dirk for organizing this!), and I was chatting with some people about attribution of content to its authors in a wiki. So I got inspired, and I cleaned up some code that Michael Shavlovsky and I had written for this:
https://github.com/lucadealfaro/authorship-tracking

The way to use it is super simple (see below). The attribution object can also be serialized and de-serialized to/from JSON (see the documentation on GitHub).

The idea behind the code is to attribute content to the *earliest revision* where the content was inserted, not the latest, as diff tools usually do. So if some piece of text is inserted, then deleted, then re-inserted (in a revert or a normal edit), we still attribute it to the earliest revision. This is somewhat similar to what we tried to do in WikiTrust, but it's better done, and far more efficient. The algorithm details can be found in http://www2013.wwwconference.org/proceedings/p343.pdf

I hope this might be of interest!

Luca

    import authorship_attribution

    a = authorship_attribution.AuthorshipAttribution.new_attribution_processor(N=4)
    a.add_revision("I like to eat pasta".split(), revision_info="rev0")
    a.add_revision("I like to eat pasta with tomato sauce".split(), revision_info="rev1")
    a.add_revision("I like to eat rice with tomato sauce".split(), revision_info="rev3")
    print a.get_attribution()
    # ['rev0', 'rev0', 'rev0', 'rev0', 'rev3', 'rev1', 'rev1', 'rev1']
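The library provides its own JSON (de)serialization for the attribution object (see the GitHub documentation; its method names are not shown here). As a minimal illustration, the per-word output itself round-trips through the standard json module:

    # Sketch only: persisting the get_attribution() output with the
    # standard library, independently of the library's own serializers.
    import json

    attribution = a.get_attribution()   # e.g. ['rev0', 'rev0', ..., 'rev1']
    blob = json.dumps(attribution)      # a string one could store per page
    assert json.loads(blob) == attribution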
Re: [Wiki-research-l] Tracking authorship of wiki content
Dear Aaron,

sorry, sorry, thanks for helping clear up some misconceptions, and let me see if I can do more.

The WikiWho API is very nice work, and was presented at WWW 2014. The work of Michael and myself dates from one year before, WWW 2013 (see http://www2013.wwwconference.org/proceedings/p343.pdf). This is why in our work we don't give credit to WikiWho. In fact, it is they who most politely cite us.

Now, you say, why don't I give them more credit now? Because I haven't really done anything new. I am not claiming anything new; I have just taken code that was written in 2013, and made it better available on GitHub, with a moderate clean-up of its API. We tried to make that code available to the community in 2013, by putting it into Gerrit (we were told it was the proper place), but it didn't really work out. Again, I am not pushing a new result out. I am simply making available code that dates from some time back, and that I realized yesterday might be useful to others.

There are many, many ways to attribute content. Even if you go for the theory of earliest possible attribution, which is what we do in the paper and code, it would certainly be better done using language models of average text, to better distinguish casual from intentional repetition.

I put the code on GitHub because I was inspired by our conversation yesterday. If you like, I'd be happy to give you access to the repo (write access, I mean) so you can both do the code reviews we had been mentioning, and improve the README.md file with more considerations and references. Let me know. Again, what I wanted to do is make better available code written 2-3 years ago, not really make any new claims.

Luca

On Fri, Aug 21, 2015 at 5:49 PM, Aaron Halfaker aaron.halfa...@gmail.com wrote:
> Hey Luca! Welcome back to the content persistence tracking club! I feel like I should clear up some misconceptions.
>
> 1st, yours is not the first Python library that is useful for determining the authorship of content in versioned text, and I don't think you have given fair treatment to the work we have been doing since you last worked on WikiTrust. For example, it's hard to tell from your description whether you are doing anything different than the wikiwho API [2] with tracking content historically. Further, the work I have been doing with diff-based content persistence (e.g. [1]) is not so simple as to not notice removals and re-additions under most circumstances. In my opinion, this is much better for measuring the productivity of a contribution (adding content that looks like content that was removed long ago is still productive, isn't it?), but maybe less useful for attributing a first-contributor status to a particular sub-statement. Regardless, it seems that a qualitative analysis is necessary to determine whether these differences matter and whether one strategy is better than the other. AFAICT, the only software that has received this kind of analysis is wikiwho (discussed in [3]).
>
> Regardless, it's great to have you working in this space again, and I welcome you to help us develop an overview of content persistence measurement strategies that is complete and allows others to critically decide which strategy matches their needs. See https://meta.wikimedia.org/wiki/Research:Content_persistence for such an overview. I encourage you to use this description of persistence measures to differentiate your strategy from the work we have been doing over the last 5 years.
>
> Edit boldly!
>
> 1. https://pythonhosted.org/mediawiki-utilities/lib/persistence.html#mw-lib-persistence
> 2. http://people.aifb.kit.edu/ffl/wikiwho/
> 3. http://people.aifb.kit.edu/ffl/wikiwho/fp715-floeck.pdf
>
> -Aaron
>
> On Aug 21, 2015 4:52 PM, Luca de Alfaro l...@dealfaro.com wrote:
>> Dear All, I was yesterday at OpenSym (many thanks to Dirk for organizing this!), and I was chatting with some people about attribution of content to its authors in a wiki. [...]
Re: [Wiki-research-l] Pageviews, mobile versus desktop
So Wikipedia gets 20M pageviews per day total? I somehow expected more. Or am I misreading the graph?

Luca

On Mon, Dec 15, 2014 at 9:57 AM, Oliver Keyes oke...@wikimedia.org wrote:
> Yep; same timeframe.
>
> On 15 December 2014 at 12:50, Federico Leva (Nemo) nemow...@gmail.com wrote:
>> Oliver Keyes, 13/12/2014 21:15:
>>> http://ironholds.org/misc/pageviews_year_and_week.png - fascinating! It reveals a lot of seasonality in the desktop views - again, not replicated on mobile (at least, not so strongly)
>>
>> Does this graph also go from 2013-02-01 to 2014-12-01?
>>
>> Nemo
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
Re: [Wiki-research-l] Fwd: FW: What works for increasing editor engagement?
Dear James,

very well argued; thanks for the insightful post.

Saving drafts, on the other hand, could help avoid many conflicts on less-trafficked pages. Right now, on a page that is edited infrequently, this happens:

- User A starts an edit.
- User A saves, so as not to lose work, though not quite done yet. Resumes the edit.
- User B (typically an editor) sees the edit by A, and sets to work polishing it. Saves.
- User A saves -- conflict.

The first edit by A woke up B, and led to the conflict. If we allowed saving drafts, the following would be more likely:

- User A starts an edit.
- User A saves a draft, and continues the edit.
- User A saves the edit.
- User B (typically an editor) sees the edit by A, and sets to work polishing it. Saves.

The conflict would occur only if A had second thoughts about the edit and continued work after saving it, which might happen, but less frequently. Of course, saving drafts is also cumbersome to implement at scale (how long would they persist? there would be clean-up needed, etc.; maybe they could persist for one week, then be mailed back to the author and deleted?).

Luca

On Fri, Sep 26, 2014 at 11:22 AM, James Forrester jforres...@wikimedia.org wrote:
> [Re-sending as it bounced first time.]
>
> On 25 September 2014 22:45, Pine W wiki.p...@gmail.com wrote:
>> FWIW there were sessions at Wikimania about concurrent editing. I think there is community support for the concept. If it helps us retain good-faith new editors, then that is another good reason to press forward on this subject. Perhaps James Forrester can provide an update on the outlook for concurrent editing capability.
>
> Hey. [This is a bit off-topic for wiki-research-l, but I've been asked to answer.]
>
> First things first: There aren't any plans right now to try to roll this out any time soon. Collaborative real-time editing is an interesting task in terms of engineering, but an exceptional challenge in terms of product. I think that it's reasonable to talk about it as a possible solution to issues, but the number of problems it raises is so great that people should be careful not to talk of it as some magic pixie dust. :-)
>
> For a couple of brief examples: If the objective is to prevent all edit conflicts by making parallel edits impossible, this means either:
> * everyone has to use the collaborative editor;
> * people who can't use the collaborative editor (e.g. old computer, slow network, no JavaScript, etc.) can't edit at all;
> * people who don't like the collaborative editor are unable to edit ever again; and
> * bots can't edit at all (because they can't react to prompts from other users)
> … or:
> * you have to choose to use the collaborative editor for each edit (how do newbies know, or is it opt-out somehow?);
> * as soon as someone wants to edit an article collaboratively, everyone else's edits die and they're told so (or they all have to wait for the collaborative edit session to end and then manually resolve the edit conflict);
> * for people who can't or don't want to use the collaborative editor, and all bots, the article is essentially locked from their editing until the collaborative edit is finished.
>
> Neither of these are great options. If instead we're happy to keep having edit conflicts, we can allow parallel edits, but then the benefit for newbies (and, frankly, the rest of us) goes away the second your collaborative edit conflicts with a non-collaborative edit. Whoops.
>
> Say that we've decided on a course of action for the above, maybe by biting the bullet and denying people with older computers *etc.* the ability to edit (which I think would be sad and a dereliction of our values); what do you do when there are too many parallel editors of an article? When you're editing in a real-time collaborative editor, you see the edits of each of the participants, alongside their cursors/selections and comments in the chat system, if there is one (which there normally is). When there are two or three of these, it's relatively easy to see what's happening. But what if there are 1,000 people trying to edit the article at once (e.g. the article of a very famous individual just after they've died unexpectedly; think Michael Jackson or Robin Williams)? Showing 1,000 cursors at once isn't just unhelpful – the level of traffic would probably kill most users' browsers. Consequently, there needs to be a limit somehow on the number of participants; maybe call it 10. So, what happens when you click edit on an article where 10 people are already editing?
> * Do you just get told "tough"?
> * Does the least-recently active editor get kicked out so you can join?
> * Does this mean that all I need is 11 bots requesting to edit an article to DoS it?
>
> If you're a special user (e.g. a sysop), can you get into a collaborative edit even if it's at the limit?
> * If yes, doesn't this go against our values to place some editors above others?
> * If yes, do we just let [...]
Re: [Wiki-research-l] FW: What works for increasing editor engagement?
Re. the edit conflicts happening when a new user is editing: Can't one add some AJAX to the editor that notifies that one still has the editing window open? Maybe editors could wait to modify work in progress, if they had that indication, and if the content does not seem vandalism?

Luca

On Thu, Sep 25, 2014 at 12:17 PM, James Salsman jsals...@gmail.com wrote:
> Aaron, would you please post the script you used to create https://commons.wikimedia.org/wiki/File:Desirable_newcomer_survival_over_time.png ? I would be happy to modify it to also collect the number of extant non-redirect articles each desirable user created.
>
> Aaron wrote:
>> ... You'll find the hand-coded set of users here http://datasets.wikimedia.org/public-datasets/enwiki/rise-and-decline ...
>> Categories:
>> 1. Vandals - Purposefully malicious, out to cause harm
>> 2. Bad-faith - Trying to be funny, not here to help or harm
>> 3. Good-faith - Trying to be productive, but failing
>> 4. Golden - Successfully contributing productively
Re: [Wiki-research-l] FW: What works for increasing editor engagement?
Better merging would be welcome. But also less aggressive editing/policing. When I edit OpenStreetMap I have a better overall experience: the edits may or may not go live immediately, but I don't have the impression that there is someone aggressively vetting/refining my edits while I am still making them. I feel welcome there.

To make Wikipedia more welcoming, we could do a few things. We could allow users to save drafts. In this way, people could work for a while at their own pace, and then publish the changes. Currently, saving is the only way to avoid risking losing changes, but it has the very undesired effect of inviting editors/vetters to the page before one is really done.

We could also allow a time window (even 30 minutes) before edits went live after one is done editing (using the above AJAX mechanism to track when the editor is open); experienced editors would not need to swoop in quite so fast on the work of new users, and the whole editing atmosphere would be more relaxed and welcoming.

The fact is that the Wikipedia editor, with its lack of ability to save drafts, poor merging, and swooping editors, feels incredibly outdated and unwelcoming - downright aggressive - to anyone used to WordPress / Google Docs / Blogger / ...

Luca

On Thu, Sep 25, 2014 at 12:35 PM, James Salsman jsals...@gmail.com wrote:
> Luca wrote:
>> Re. the edit conflicts happening when a new user is editing: Can't one add some AJAX to the editor that notifies that one still has the editing window open? Maybe editors could wait to modify work in progress, if they had that indication, and if the content does not seem vandalism?
>
> Instead of asking editors to wait, we could improve the merge algorithm to avoid conflicts: https://en.wikipedia.org/wiki/Merge_(revision_control)
Re: [Wiki-research-l] FW: What works for increasing editor engagement?
Flagged revisions is different, though, as it requires trusted editors to flag things as approved. I am simply advocating the ability to save drafts visible only to oneself before publishing a change. WordPress, Blogger, etc. have it. And so newcomers could edit to their heart's content, without triggering the interest of editors and the consequent conflicts, then save their changes.

Luca

On Thu, Sep 25, 2014 at 5:15 PM, Scott Hale computermacgy...@gmail.com wrote:
> On Fri, Sep 26, 2014 at 5:14 AM, Luca de Alfaro l...@dealfaro.com wrote:
>> Better merging would be welcome. But also less aggressive editing/policing. [...]
>
> The technology exists to do this---[[:en:Wikipedia:Flagged_revisions]]. The challenge is that many existing users don't want flagged revisions on by default. And that is the fundamental flaw with this whole email thread. The question needing to be answered isn't what increases new user retention. The real question is what increases new user retention and is acceptable to the most active/helpful existing users. The second question is much harder than the first.
Re: [Wiki-research-l] FW: What works for increasing editor engagement?
You are right about conflicts with fast-updated pages. Not sure it would be worse than the current situation, though. For many low-traffic articles, drafts visible only to the user would not cause many conflicts -- basically, this would be true for all pages with fewer than a couple of edits per day, and there are many such pages.

I think a more annoying issue would be how to clean up these drafts; a policy would be required (one week?), cron jobs, etc., otherwise these drafts could grow uncontrollably in size due to abandoned edits. But this should be solvable, if with some pain. I tend to think that with a bit of UI tweaking, Wikipedia could be made more friendly.

On Thu, Sep 25, 2014 at 5:58 PM, Scott Hale computermacgy...@gmail.com wrote:
> Yes, drafts visible only to the user are different. I was thinking of flagged revisions in reference to your idea that edits would first go live only after a set period of time. This is basically flagged revisions with a trivial extension: that the flagged revision always be the latest revision that is at least X minutes old.
>
>> We could also allow a time window (even 30 minutes) before edits went live after one is done editing (using the above AJAX mechanism to track when the editor is open); experienced editors would not need to swoop in quite so fast on the work of new users, and the whole editing atmosphere would be more relaxed and welcoming.
>
> I think the challenge with drafts visible only to the user is that they are very likely to have a conflict and have to merge changes if they wait too long between starting the draft and later committing it.
>
> On Fri, Sep 26, 2014 at 9:20 AM, Luca de Alfaro l...@dealfaro.com wrote:
>> Flagged revisions is different, though, as it requires trusted editors to flag things as approved. I am simply advocating the ability to save drafts visible only to oneself before publishing a change. [...]
>
> --
> Scott Hale
> Oxford Internet Institute
> University of Oxford
> http://www.scotthale.net/
> scott.h...@oii.ox.ac.uk
Re: [Wiki-research-l] FW: What works for increasing editor engagement?
This last message of yours, Jonathan, is very insightful and true.

I wonder how it would be possible to set up some kind of controlled study on how different edit capabilities lead to different engagement. One could always set up controlled mirrors of Wikipedia for a small set of pages on a coherent topic, and perhaps measure the difference in engagement? Do you think there is a way to do this?

There are also pages that are very different. The rapidly evolving page on a current event requires rapid communication of edits. Instead, a novice who edits a page on a topic with little traffic is best left alone (no tweaking that causes edit conflicts) until he or she is done.

Luca

On Thu, Sep 25, 2014 at 10:13 PM, WereSpielChequers werespielchequ...@gmail.com wrote:
> We have had endless discussions about this in the new page patrol community. Basically there is a divide between those who think it important to communicate with people as quickly as possible, so they have a chance to fix things before they log off, and people such as myself who think that this drives people away. So before we try to make people more aware that they are dealing with a newbie, it would help if we had some neutral, independent research that indicated which position is more grounded in reality.
>
> Simply making it clearer to patrollers that they are dealing with newbies is solving a non-problem; we know the difference between newbies and regulars, we just disagree as to the best way to handle newbies. Investing in software to tell patrollers when they are dealing with newbies is unlikely to help; in fact I would be willing to bet that one of the criticisms will be from patrollers saying that it isn't doing that job as well as they can, because it doesn't spot which editors are obviously experienced even if their latest account is not yet autoconfirmed.
>
> There is also the issue that some patrollers may not realise how many edit conflicts they cause by templating and categorising articles. After all, it isn't going to be the templater or categoriser who loses the edit conflict; that is almost guaranteed to be the newbie. Of course this could be resolved by changing the software so that adding a category or template is not treated as conflicting with changing the text.
>
> Regards,
> Jonathan Cardy
>
> On 25 Sep 2014, at 23:23, Luca de Alfaro l...@dealfaro.com wrote:
>> Re. the edit conflicts happening when a new user is editing: Can't one add some AJAX to the editor that notifies that one still has the editing window open? [...]
Re: [Wiki-research-l] long in tooth: what outdated looks like
This is an EXCELLENT email, Steven. +1 to it!

Luca

On Thu, May 3, 2012 at 11:17 AM, Steven Walling swall...@wikimedia.org wrote:
> On Thu, May 3, 2012 at 2:41 AM, Richard Jensen rjen...@uic.edu wrote:
>> JSTOR reports there were about 300 articles on Shakespeare a year in scholarly journals in 1997 to 2006; none of them are cited, nor any since then, and only one before then. This is typical as well of political and military history. Wiki editors are not using scholarly journals. I assume that is because they are unaware of them.
>
> Not at all. Wikipedians are *very much* aware that these journals exist. They do not have access to them, because they are unaffiliated scholars. Dozens of editors want access to this content,[1] but can't have it because JSTOR locks it down. They just now started letting people access content that is in the public domain!
>
> If, as an academic, you see a problem where peer-reviewed content is not cited in Wikipedia, I would strongly encourage you to join the movement lobbying for openness in scholarly work. Otherwise, you're complaining about a problem that Wikipedians do not have the power to fix, because academics tacitly support a system in which knowledge is kept in the hands of the few who can pay for it.
>
> --
> Steven Walling
> https://wikimediafoundation.org/
>
> 1. https://en.wikipedia.org/wiki/Wikipedia:Requests_for_JSTOR_access
Re: [Wiki-research-l] Request for feedback on new data dump formats
Not quite... if I am reading Brion's proposal correctly, this would list all the pages that changed in a specific interval. If the interval is large, like a month, this could be a very large amount of data, if the full history of each page is provided. What I was suggesting is to include only the changes (the revisions) that occur in a specific time span.

Luca

On Thu, Mar 31, 2011 at 5:33 PM, Yuvi Panda yuvipa...@gmail.com wrote:
> Would incremental dumps, as described by Brion a long time ago (http://leuksman.com/log/2007/10/14/incremental-dumps/), be what you're looking for?
>
> On Fri, Apr 1, 2011 at 5:01 AM, Aaron Halfaker aaron.halfa...@gmail.com wrote:
>> If periodic update dumps are being considered, information that describes changes to old data (page deletes, user renames, etc.) would be very useful to have along with new revisions. -Aaron
>>
>> On Mar 31, 2011 6:27 PM, Luca de Alfaro l...@dealfaro.org wrote:
>>> I think I would be very interested in 3, or even in having, every month, a dump of that month's revisions. As I have built tools for the XML dumps, no change in format is good for me (and for WikiTrust). I would find incremental dumps (with occasional, yearly, full dumps) much easier to manage than full dumps. Luca
>>>
>>> On Thu, Mar 31, 2011 at 2:27 PM, Yuvi Panda yuvipa...@gmail.com wrote:
>>>> Hi, I'm a student planning on doing GSoC this year on MediaWiki. Specifically, I'd like to work on data dumps. I'm writing this to gauge what would be useful to the research community. Several ideas thrown about include:
>>>> 1. JSON dumps
>>>> 2. SQLite dumps
>>>> 3. Daily dumps of revisions in the last 24 hours
>>>> 4. Dumps optimized for very fast import into various external storage, and smaller size (diffs)
>>>> 5. JSON/CSV for Special:Import and Special:Export
>>>> Would any of these be useful? Or is there anything else that I'm missing that you would consider much more useful? Feedback would be invaluable :)
>>>> Thanks :)
>>>> --
>>>> Yuvi Panda T
>>>> http://yuvi.in/blog
Re: [Wiki-research-l] Most reverted pages in the en-wikipedia (enwiki-20100130 dump)
Thanks, this is great fun! As an Italian, let me quote:

    (0.42525520906166969, (7151, 3041, 59, 514, 63, 2519, 955), 'Penis')
    (0.42516069788797062, (1089, 463, 29, 27, 16, 470, 84), 'Inner core')
    (0.42490272373540855, (1285, 546, 11, 64, 27, 515, 122), 'Stuff')
    (0.42477231329690346, (2745, 1166, 28, 110, 46, 1054, 341), 'Gun')
    (0.42474916387959866, (2990, 1270, 37, 149, 23, 1190, 321), 'Monkey')
    (0.42443438914027148, (1105, 469, 20, 21, 2, 427, 166), 'Incas')
    (0.42433090024330899, (2055, 872, 39, 45, 15, 825, 259), 'Italian Renaissance')
    (0.42375950742484608, (2761, 1170, 34, 94, 24, 978, 461), 'Watermelon')
    (0.42362613587191694, (2311, 979, 22, 121, 19, 937, 233), 'Puppy')
    (0.4235686492495831, (1799, 762, 20, 83, 34, 669, 231), 'Crap')

It is absolutely great to see that Italian Renaissance (with Incas) is one of the few cultural topics that makes it as high in the list as the usual excrement-sex-infantile type of things!!

Luca

On Fri, Aug 13, 2010 at 1:12 PM, Dmitry Chichkov dchich...@gmail.com wrote:
> If anybody is interested, I've made a list of 'most reverted pages' in the English Wikipedia, based on an analysis of the enwiki-20100130 dump. Here is the list:
> http://wpcvn.com/enwiki-20100130.most.reverted.tar.bz
> http://wpcvn.com/enwiki-20100130.most.reverted.txt
>
> This list was calculated using the following sampling criteria:
> * All pages from the enwiki-20100130 dump;
> ** Filtered to pages with more than 1000 revisions;
> ** Filtered to pages with revert ratios above 0.3;
> * Sorted in descending order of revert ratio.
>
> A page revision is considered to be a revert if there is a previous revision with a matching MD5 checksum. BTW, if anybody needs it, the Python code that identifies reverts, revert wars, self-reverts, etc. is available (LGPL).
>
> --
> Regards, Dmitry
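Dmitry's revert criterion is simple enough to restate in code. A minimal sketch of the stated rule (not his LGPL implementation): a revision counts as a revert when some earlier revision of the same page has an identical MD5 checksum.

    import hashlib

    def find_reverts(revision_texts):
        """Given a page's revision texts in chronological order, return the
        indices of revisions whose text exactly matches an earlier revision."""
        seen = set()
        reverts = []
        for i, text in enumerate(revision_texts):
            digest = hashlib.md5(text.encode("utf-8")).hexdigest()
            if digest in seen:
                reverts.append(i)
            seen.add(digest)
        return reverts

The revert ratio used for the ranking would then presumably be len(find_reverts(texts)) / float(len(texts)), computed only for pages with more than 1000 revisions.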
Re: [Wiki-research-l] Help to solve three doubts on Wikipedia research data
I guess that Wiki(pedia|media) could very well gather statistics on (revision_id, clicked_link) pairs without compromising the anonymity of the visitors. It would be very useful to have indications of which hyperlinks are most useful. For example, I am always curious whether the large editorial effort to curate categories is worth it.

And also, if one had data on (revision_id, search terms used in next search), one could infer which links are actually missing. The problem is that many people use search engines, rather than Wikipedia's own search, to navigate Wikipedia... but perhaps the information could still be reconstructed somehow from session information.

But as far as I know, there is no plan nor current infrastructure to have such anonymously logged data. I don't work there, however, so other better-informed people might comment.

Luca

On Sun, Apr 11, 2010 at 3:19 PM, Ziko van Dijk zvand...@googlemail.com wrote:
> Hello, Gregory (? if I remember well) mentioned in August 2009 this: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1446862 - all examined sites spy on their visitors, but Wikimedia and Wikipedia.
> Kind regards, Ziko
>
> 2010/4/11 Gregory Maxwell gmaxw...@gmail.com:
>> On Sun, Apr 11, 2010 at 12:06 PM, Fuster, Mayo mayo.fus...@eui.eu wrote:
>>> * Does the site learn from the navigation and searches? That is, if a Wikipedia visitor who reads a "Network" entry then goes to the "Manuel Castells" entry, will the system understand there is a connection between them? Will it put them together next time when presenting search results?
>>
>> No. Although that is an interesting area of research. Unfortunately, due to privacy concerns, the data that would be required to invent such a system (search strings and search click-through traces) is not available to the public. (And in fact, the traces aren't really collected, currently, as far as I know.)
Re: [Wiki-research-l] Help to solve three doubts on Wikipedia research data
The first thing I proposed is innocuous (gathering stats on (revision_id, clicked_link) pairs), and in fact can be done easily with a minimum of instrumentation.

The second is very different from the AOL search data. The AOL search data was problematic because it associated data on a per-user basis, so you could use some queries to figure out who the user was, and then see the other queries of that user. I am suggesting here to instead gather anonymous statistics on (was on page A, did a search, landed on page B), keeping track only of the (A, B) pairs, without user information.

But the problem is that gathering such anonymous logs takes effort, is difficult to do securely, is difficult to protect from tampering that would add back information that should not be there, and it is difficult to then present the information to Wikipedia editors in a way that helps them meaningfully improve pages. So perhaps the first statistic is the only useful one.

I would also be curious to know, once a user enters, what % of subsequent visits are due to the visitor clicking on links, vs. doing a search.

Luca

On Sun, Apr 11, 2010 at 6:28 PM, Anthony wikim...@inbox.org wrote:
> On Sun, Apr 11, 2010 at 6:27 PM, Luca de Alfaro l...@dealfaro.org wrote:
>> I guess that Wiki(pedia|media) could very well gather statistics on (revision_id, clicked_link) pairs without compromising the anonymity of the visitors. [...]
>
> Seems to me like that (especially the latter) would need to be done extremely carefully to avoid compromising the anonymity of the visitors. Although it's not quite as bad, it seems reminiscent of the AOL search data scandal (http://en.wikipedia.org/wiki/AOL_search_data_scandal), especially with regard to "search terms used in next search".
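To make the privacy point concrete, here is a sketch of the kind of aggregation being proposed: only anonymous pair counts are kept, with no user, session, or timestamp attached. All names here are hypothetical; this is not existing Wikimedia instrumentation.

    from collections import Counter

    link_clicks = Counter()   # (revision_id, clicked_link) -> count
    search_pairs = Counter()  # (page_A, page_B) -> count

    def record_link_click(revision_id, clicked_link):
        link_clicks[(revision_id, clicked_link)] += 1

    def record_search_landing(page_a, page_b):
        # Unlike the AOL data, nothing here ties two events to the same
        # user: only the aggregate (A, B) pair count is stored.
        search_pairs[(page_a, page_b)] += 1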
Re: [Wiki-research-l] Study on Interfaces to Improving Wikipedia Quality
Dear All,

if you go to http://wiki-trust.cse.ucsc.edu/index.php/Main_Page and click on *Random Page*, you can explore that Wikipedia demo more. Our trust is a function both of the text age (how many times it has been revised) and of the reputation of the revisors.

Let me also point out that our code base has evolved much since then. In particular, the latest release of WikiTrust adds:

- Author tracking. Hover over a word in the "check text" tab, and the author is displayed in a pop-up window.
- Origin tracking. Click on a (non-link) word, and you are sent to the diff where the word was introduced.
- Vote button. If you are logged in, and you agree with the information displayed in the "check text" tab, you can vote for its correctness, and the revision text will gain trust as a consequence.

We do not have a whole-Wikipedia demo of that so far, but you can find various examples linked from http://trust.cse.ucsc.edu/WikiTrust . Some of those examples are mirrors of existing wikis (from dumps), but visit our very own Cookiwiki, which we set up to experiment:

- www.cookiwiki.org

Register, edit a page, browse around, etc. Yes, there is not a lot of content, but you can experiment with the interface. Come on, everybody knows at least one recipe, and we welcome even one for hard-boiled eggs!

Finally, do you have your own wiki? Then just download our code (http://trust.cse.ucsc.edu/WikiTrust) and give it a try. If you follow the link for a tarball (http://code.google.com/p/wikitrust/downloads/list), you can download a tarball which contains a statically-linked executable for Linux (and the source code for any OS), so installation is easy.

Luca

On Wed, Nov 19, 2008 at 12:12 PM, [EMAIL PROTECTED] wrote:
> Dear All,
>
> My name is Avanidhar Chandrasekaran (http://en.wikipedia.org/wiki/User_talk:Avanidhar). I work with GroupLens Research at the University of Minnesota, Twin Cities. As part of my research, I am involved in analyzing the usefulness and necessity of author reputation in Wikipedia. In light of this, I have simulated an interface to color words in an article based on their age.
>
> Being experienced contributors to Wikipedia, I invite you to participate in this study, which involves the following.
>
> 1. Please visit the following instances of Wikipedia and evaluate the interface components which have been incorporated into each of them. Each of these uses its own algorithm to color text.
> a) The WikiTrust project: http://wiki-trust.cse.ucsc.edu/index.php/Main_Page
> b) The Wiki-reputation project at GroupLens Research: http://wiki-reputation.cs.umn.edu/index.php/Main_Page
>
> 2) Once you have evaluated the two interfaces, kindly complete this survey on Wikipedia quality: http://www.surveymonkey.com/s.aspx?sm=hagN5S1JZHxH6pF9SmXkkA_3d_3d
>
> We hope to get your valuable feedback on these interfaces and how Wikipedia article quality can be improved.
>
> Thanks for your time,
> Avanidhar Chandrasekaran, GroupLens Research, University of Minnesota
[Wiki-research-l] WikiTrust v2 released: reputation and trust for your wiki in real-time!
As some of you might remember, we have been working on author reputation and text trust systems for wikis; some of you may have seen our demo at Wikimania 2007, or the on-line demo: http://wiki-trust.cse.ucsc.edu/

Since then, we have been busy at work building a system that can be deployed on any wiki, and display the text trust information. And we finally made it! We are pleased to announce the release of WikiTrust version 2! With it, you can compute author reputation and text trust for your wikis in real time, as edits to the wiki are made, and you can display text trust via a new "trust" tab. The tool can be installed as a MediaWiki extension, and is released open-source, under the BSD license; the project page is http://trust.cse.ucsc.edu/WikiTrust

WikiTrust can be deployed both on new and on existing wikis. WikiTrust stores author reputation and text trust in additional database tables. If deployed on an existing wiki, WikiTrust first computes the reputation and trust information for the current wiki content, and then processes new edits as they are made. The computation is scalable, parallel, and fault-tolerant, in the sense that WikiTrust adaptively fills in missing trust or reputation information. On my MacBook, running under Ubuntu in VMware, WikiTrust can analyze some 10-20 revisions/second of a wiki; so with a little patience, unless your wiki is truly huge, you can just deploy it and wait a bit.

Go to http://trust.cse.ucsc.edu/WikiTrust for more information and for the code! Feedback, comments, etc. are much appreciated!

Luca de Alfaro (with Ian Pye and Bo Adler)
[Wiki-research-l] Three techreps: assigning trust to Wikipedia content, and reputation, contributions of authors
Dear All,

we have three new techreps available:

- "Robust Content-Driven Reputation" (http://www.soe.ucsc.edu/%7Eluca/papers/08/ucsc-soe-08-09.html) shows that the content-driven reputation we proposed in a WWW 2007 paper can be made robust to Sybil (sock-puppet) and other coordinated attacks. In WWW 2007, we proposed content-driven reputation for Wikipedia authors, where authors gain reputation if their contributions are preserved, and lose reputation if their contributions are quickly undone. The original algorithms were very prone to attacks; we show here that they can be made resistant. (For a toy rendering of the basic idea, see the sketch after this list.)

- "Assigning Trust to Wikipedia Content" (http://www.soe.ucsc.edu/%7Eluca/papers/08/ucsc-soe-08-07.html) proposes computing the trust of Wikipedia text on the basis of the reputation of the author, and the reputation of the people who revised the text. We display text trust by coloring the text background. Many of you have seen the on-line demo for the English Wikipedia, at http://trust.cse.ucsc.edu/ . This is an improved version of a November 2007 techrep on the same topic. In this improved techrep, we show how the trust system can be made resistant to attacks.

- "Measuring Author Contributions to the Wikipedia" (http://www.soe.ucsc.edu/%7Eluca/papers/08/ucsc-soe-08-08.html) defines and compares various ways of measuring the contribution of individual authors to the Wikipedia. We have our own favorite; read more to find out :-)

In these months, we have been busy working on WikiTrust (http://trust.cse.ucsc.edu/), an open-source tool for assigning reputation to wiki authors and trust to wiki content. We already have a batch (or off-line) system, which can compute reputation and trust based on wiki dumps, such as the Wikipedia dumps made available by the Wikimedia Foundation. We are developing an on-line system, which can assign reputation and trust in real time, as edits are made. One of our chief concerns in developing the on-line system was to ensure that it is robust to attack, and we believe we have made progress in this direction, as reported in the above techreps. We are now proceeding with the implementation; my guess is that we will have a prototype in a month or so.

By the way, the batch part of WikiTrust can easily be adapted to carry out various analysis tasks. Basically, it walks over all revisions of every page of a wiki, and it contains an efficient text analysis engine that tells you precisely how the text changed between versions. So it is easy to use WikiTrust as a platform for writing wiki analysis algorithms: you don't have to worry about the boring tasks of reading and parsing markup language, and computing text diffs in a reasonable way; you can concentrate on the details of the specific analysis you want to do. It is all open source, and we welcome developers and people interested in it.

All the best,
Luca (with Ian, Bo, and the other wikitrusters)
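As a reading aid, here is a toy sketch of the content-driven reputation idea summarized in the first techrep above. The update rule and constants are invented for illustration only; the techreps' actual, attack-resistant algorithms are considerably more involved.

    def update_reputation(rep, author, fraction_preserved, gain=1.0, loss=2.0):
        """Toy update: 'fraction_preserved' is the share of the author's
        contribution that survives subsequent revisions. Preserved work earns
        reputation; quickly undone work costs it. Constants are arbitrary."""
        delta = gain * fraction_preserved - loss * (1.0 - fraction_preserved)
        rep[author] = max(0.0, rep.get(author, 0.0) + delta)
        return rep[author]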
[Wiki-research-l] The WikiTrust code has been released!
Dear All,

we have just released in open-source format the code of WikiTrust, the tool we use for the Wikipedia trust coloring. The project homepage is http://trust.cse.ucsc.edu/ , and from there, you can follow a link to a live demo. The code itself is available from http://trust.cse.ucsc.edu/WikiTrust .

The code is suitable for the trust-coloring of a static dump of a wiki; the code for the coloring of edits in real time, as they happen, is under development.

The code is extensible, and it provides a platform over which it is (relatively) easy to write wiki analysis tools... for instance, we wrote small analysis procedures that measure the inter-edit time distribution, and the amount of text contributed by authors in various reputation ranges. As the text analysis engine and the dump traversal engine are already built, it is relatively easy to add other analysis modules.

We hope this will be of interest!

All the best,
Luca de Alfaro (message sent on behalf also of Bo Adler and Ian Pye)
Re: [Wiki-research-l] Demo: coloring the text of the Wikipedia according to its trust
Dear Andre,

let me say that the algorithms need tuning, so we are not sure we are doing the best, but here is the idea:

- When a user of reputation 10 (for example) edits the page, the text that is added only gets trust 6 or so. It is not immediately considered high-trust, because others have not yet had a chance to vet it.
- When a user of reputation 10 edits the page, the trust of the text already on the page rises a bit (over several edits, it would approach 10). This models the fact that the user, by leaving the text there, gave an implicit vote of assent.

The combination of the two effects explains what you are seeing. The goal is that even high-reputation authors can only lend part of their reputation to the text they create; community vetting is still needed to achieve high trust.

Now, as I say, we must still tune the various coefficients in the algorithms via a learning approach, and there is a bit more in the algorithm than I describe above, but that's the rough idea. Another thing I am pondering is how much a trust change should spill over paragraph or bullet-point breaks. I could easily change what I do, but I will first set up the optimization/learning - I want to have some quantitative measure of how well the trust algorithm behaves.

Thanks for your careful analysis of the results!

Luca

On 7/30/07, Andre Engels [EMAIL PROTECTED] wrote:
> 2007/7/29, Luca de Alfaro [EMAIL PROTECTED]:
>> We first analyze the whole English Wikipedia, computing the reputation of each author at every point in time, so that we can answer questions like "what was the reputation of the author with id 453 at 5:32 pm on March 14, 2006?". The reputation is computed according to the idea of content-driven reputation. For new portions of text, the trust is equal to (a scaling function of) the reputation of the text's author. Portions of text that were already present in the previous revision can gain trust when the page is revised by higher-reputation authors, especially if those authors perform an edit in proximity of the portion of text. Portions of text can also lose trust, if low-reputation authors edit in their proximity. All the algorithms are still very preliminary, and I must still apply a rigorous learning approach to optimize the computation. Please see the demo page for more details.
>
> One thing I find peculiar is that adding text somewhere can lower the trust of the surrounding text while at the same time raising that of far-away text. Why is that? See for example http://enwiki-trust.cse.ucsc.edu/index.php?title=Collation&diff=prev&oldid=102784135 - trust:6 text is added between trust:8 text, causing the surrounding text to go down to trust:6 or even trust:5, but at the same time improving text elsewhere in the page from trust:8 to trust:9. Why would the author count as low-reputation for the direct environment, but high-reputation farther away?
>
> --
> Andre Engels, [EMAIL PROTECTED]
> ICQ: 6260644 -- Skype: a_engels
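For concreteness, here is a toy rendering of the two effects described above, with made-up constants. The actual algorithm also lowers the trust of text near low-reputation edits, weights changes by proximity, and has coefficients that are still being tuned; none of that is modeled here.

    NEW_TEXT_FRACTION = 0.6  # a reputation-10 author yields trust ~6 for new text
    DRIFT = 0.2              # how fast surviving text approaches the editor's reputation

    def apply_edit(word_trust, new_word_indices, editor_reputation):
        """word_trust: per-word trust values for the revision being saved.
        New words start below their author's reputation; surviving words
        drift toward it (the implicit vote of assent)."""
        for i in range(len(word_trust)):
            if i in new_word_indices:
                word_trust[i] = NEW_TEXT_FRACTION * editor_reputation
            elif editor_reputation > word_trust[i]:
                word_trust[i] += DRIFT * (editor_reputation - word_trust[i])
        return word_trust

Repeated edits by a reputation-10 user leave existing words' trust approaching 10, while freshly inserted words start at 6, matching the behavior described in the message above.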
[Wiki-research-l] Demo: coloring the text of the Wikipedia according to its trust
Dear All:

I would like to tell you about a demo we set up, where we color the text of Wikipedia articles according to a computed value of trust. The demo is available at http://trust.cse.ucsc.edu/

The trust value of each word of each revision is computed according to the reputation of the original author of the text, as well as the reputation of all authors who subsequently revised the text. We have uploaded a few hundred pages; for each page, we display the most recent 50 revisions (we analyzed them all, but we just uploaded the most recent 50 to the server).

Of course, there are many other uses of text trust (for example, one could have the option of viewing a recent high-trust version of each page upon request), but I believe that this coloring gives an intuitive idea of how it could work.

I will talk about this at Wikimania, for those of you who will be there. I am looking forward to Wikimania!

Details: We first analyze the whole English Wikipedia, computing the reputation of each author at every point in time, so that we can answer questions like "what was the reputation of the author with id 453 at 5:32 pm on March 14, 2006?". The reputation is computed according to the idea of content-driven reputation (http://www.soe.ucsc.edu/%7Eluca/papers/07/wikiwww2007.html). For new portions of text, the trust is equal to (a scaling function of) the reputation of the text's author. Portions of text that were already present in the previous revision can gain trust when the page is revised by higher-reputation authors, especially if those authors perform an edit in proximity of the portion of text. Portions of text can also lose trust, if low-reputation authors edit in their proximity. All the algorithms are still very preliminary, and I must still apply a rigorous learning approach to optimize the computation. Please see the demo page for more details.

All the best,
Luca de Alfaro
http://www.soe.ucsc.edu/~luca