I'd be happy to offer my admin/oversighter experience and knowledge to help you develop the labeling and such, Aaron! I just commented on Andreas's proposal on the Community Wishlist, but to summarize here: I see a lot of potential pitfalls in trying to handle/generalize this with machine learning, but I also see a lot of potential value, and I think it's something we should be investigating.
-Fluffernutter

On Sun, Nov 15, 2015 at 11:32 AM, Aaron Halfaker <ahalfa...@wikimedia.org> wrote:

> > The League of Legends team collaborated with outside scientists to
> > analyse their dataset. I would love to see the Wikimedia Foundation
> > engage in a similar research project.
>
> Oh! We are! :) When we have time. :\ One of the projects that I'd like to
> see done, but I've struggled to find the time for, is a common talk page
> parser[1] that could produce a dataset of talk page interactions. I'd like
> this dataset to be easy to join to editor outcome measures. E.g. there
> might be "aggressive" talk that we don't know is problematic until we see
> the kind of effect that it has on other conversation participants.
>
> Anyway, I want some powerful utilities and datasets out there to help
> academics look into this problem more easily. For revscoring, I'd like to
> be able to take a set of talk page diffs, have them classified in Wiki
> labels[2] as "aggressive" and then build a model for ORES[3] to be used
> however people see fit. You could then use ORES to do offline analysis of
> discussions for research. You could use ORES to interrupt a user
> before saving a change. I'm sure there are other clever ideas that people
> have for what to do with such a model that I'm happy to enable via the
> service. The hard part is getting a good dataset labeled.
>
> If someone wants to invest some time and energy into this, I'm happy to
> work with you. We'll need more than programming help. We'll need a lot of
> help to figure out what dimensions we'll label talk page postings by and to
> do the actual labeling.
>
> 1. https://github.com/Ironholds/talk-parser
> 2. https://meta.wikimedia.org/wiki/Wiki_labels
> 3. https://meta.wikimedia.org/wiki/ORES
>
> On Sun, Nov 15, 2015 at 6:56 AM, Andreas Kolbe <jayen...@gmail.com> wrote:
>
> > On Sat, Nov 14, 2015 at 9:13 PM, Benjamin Lees <emufarm...@gmail.com>
> > wrote:
> >
> > > This article highlights the happier side of things, but it appears
> > > that Lin's approach also involved completely removing bad actors:
> > > "Some players have also asked why we've taken such an aggressive
> > > stance when we've been focused on reform; well, the key here is that
> > > for most players, reform approaches are quite effective. But, for a
> > > number of players, reform attempts have been very unsuccessful which
> > > forces us to remove some of these players from League entirely."[0]
> >
> > Thanks for the added context, Benjamin. Of course, banning bad actors
> > that they consider unreformable is something Wikipedia admins have
> > always done as well.
> >
> > The League of Legends team began by building a dataset of interactions
> > that the community considered unacceptable, and then applied
> > machine-learning to that dataset.
> >
> > It occurs to me that the English Wikipedia has ready access to such a
> > dataset: it's the totality of revision-deleted and oversighted talk page
> > posts. The League of Legends team collaborated with outside scientists to
> > analyse their dataset. I would love to see the Wikimedia Foundation
> > engage in a similar research project.
> >
> > I've added this point to the community wishlist survey:
> >
> > https://meta.wikimedia.org/wiki/2015_Community_Wishlist_Survey#Machine-learning_tool_to_reduce_toxic_talk_page_interactions
> >
> > > P.S. As Rupert noted, over 90% of LoL players are male (how much over
> > > 90%?).[1] It would be interesting to know whether this percentage has
> > > changed along with the improvements described in the article.
> >
> > Indeed.
-- 
Karen Brown
user:Fluffernutter

*Unless otherwise specified, any email sent from this address is in my volunteer capacity and does not represent the views or wishes of the Wikimedia Foundation*
_______________________________________________
Wikimedia-l mailing list, guidelines at: https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines
Wikimedia-l@lists.wikimedia.org
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, <mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>