[Wiki-research-l] Fwd: Reasons you use the XML dumps or want to, but can't?

2015-02-20 Thread Federico Leva (Nemo)

FYI


 Forwarded message 
Subject: [Xmldatadumps-l] Your comments needed (long term dumps rewrite?)
Date:   Thu, 19 Feb 2015 12:30:01 +0200
From:   Ariel Glenn WMF 
To:     xmldatadump...@lists.wikimedia.org



The MediaWiki Core team has opened a discussion about getting more
involved in and maybe redoing the dumps infrastructure.  A good starting
point is to understand how folks use the dumps already or want to use
them but can't, and some questions about that are listed here:
https://www.mediawiki.org/wiki/Wikimedia_MediaWiki_Core_Team/Backlog/Improve_dumps 


I've added some notes but please go weigh in.  Don't be shy about what
you do/what you need, this is the time to get it all on the table.

Ariel


___
Xmldatadumps-l mailing list
xmldatadump...@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] a cautious note on gender stats Re:Fwd:[Gendergap] Wikipedia readers

2015-02-20 Thread Kerry Raymond
Claudia

Our behaviour on Wikipedia is public (for better or worse). But a tool that
analyses it can of course be limited to allow users to see only the analysis
of their own behaviour and show them where they sit on a graph relative to
unidentified others. However, based on past discussions of privacy and
analysis tools, I suspect others will argue that if the data is public, why
shouldn't the analysis also be public?

But, Claudia, I am not sure of the end point of this conversation which
seems to be wandering all over the place. Are we trying to come up with one
or more research questions in relation to the gender gap? If so, that needs
some constraining in terms of the time and resources available. What can be
done in 10 years with $10M and the full cooperation of WMF is very different
to what can be done over the weekend with no budget using existing public
data. Is the goal to put in a PEG grant?

Kerry






Re: [Wiki-research-l] a cautious note on gender stats Re: Fwd:[Gendergap] Wikipedia readers

2015-02-20 Thread koltzenburg
Hi Kerry, 

I think that such a tool, if ever, should be used only if everyone who agrees 
with implementing it has had their own behaviour analysed publicly... 
btw, 
one reason why the "thank you" function is not used widely on Wikipedia 
might be that its logs are made public, even if some information is 
removed from the entries. I consider that being screened does not usually 
have the effect of trust enhancement, so this would be an interesting issue 
to look into for the measures you suggest. 
my position is that with any kind of surveillance, alleged benefits never 
balance the losses, for individual and social freedom, for a culture of mutual 
trust, for sharing freely what would otherwise risk being self-censored, not 
least for civil society's antimilitarist activism, etc. ...

my cautious note on gender stats (that seem to talk about facts re the enWP 
community) is in part motivated by similar thoughts as yours, Kerry, 
pinpointing behaviour and drawing conclusions;
because:
talking about any numbers in a short line of no more than 10 words will never 
allow for any transparency about the assumptions underlying the measuring 
and counting exercise, but it is precisely these that *create* the data in the 
first place, 
and I guess that the concept-creating exercise that I read in your mail 
therefore would have to be made public, too, in as easy words as you do here, 
and not in any discourse that is inaccessible for too many of those (like 
myself) who would be affected by an implementation

I guess that while goodwill is nice (to read about), research in my 
understanding should start from reflections about one's own perspective and 
not from any claims about "what is out there" -- but rather: "what do I see to 
be the case out there" and also: why do I perceive this to be my perception -- 
yes, it is no less complicated than this, and I am not the first one to argue 
in this vein

anyway, here again, Lorde's insight that the master's tools will never 
dismantle the master's house might serve as a cautious note about any claim 
published and quoted in/from mainstream research

best,
Claudia

-- Original Message ---
From:"Kerry Raymond" 
To:"'Research into Wikimedia content and communities'" 
Sent:Fri, 20 Feb 2015 11:18:15 +1000
Subject:Re: [Wiki-research-l] a cautious note on gender stats Re: Fwd:
[Gendergap] Wikipedia readers

> I agree if a person enjoys bullying, they are 
> unlikely to self-correct. But an "interaction 
> sentiment tool" makes it easier for the community 
> to spot these people, and look more closely into 
> what they are doing. Then try to get them to 
> change, and  until such time as 
> they ban them.
> 
> My comment about self-correcting behaviour is 
> about people who don't intend to be a bully but 
> behave abrasively without realising it. We have a 
> lot of battle-weary editors out there who have 
> just seen one too many vandalism, one too many 
> blatant self-promotional article, etc and they 
> become inclined to just shoot down "yet another" 
> with increasing reluctance to check out the merits 
> of the specific case, or to be terse and unhelpful 
> in a Talk message etc. We've probably all had 
> those moments of finding some new user's 
> contribution that needs so much work to improve 
> and thought "I'm just too busy, I don't have time 
> to educate yet another one who probably won't 
> stick around anyway, I'll just delete it and move 
> on". I believe that most of our community does not 
> intend to be a "bully" but may not be aware that 
> is how they might seem to others at times. Letting 
> people be aware that their interaction style is 
> exhibiting higher than average "negative 
> sentiment" *is* likely to change the behaviour of 
> that group.
> 
> Obviously if we were to put such a tool out there, 
> I'd suggest adding some general advice about what 
> you might do if your score is "pretty negative",
> e.g.
> 
> * think about the choice of words you use, don't 
> use words like ..., instead use ...
> 
> * are you terse or just point to a policy without 
> being specific about your concerns
> 
> * could you have suggested a solution rather than 
> just pointing out a problem?
> 
> * is it time for a wiki-break to recharge your batteries?
> 
> The sentiment score is likely to be generated from 
> assessment of a number of elements of the observed 
> interactions, so, for an individual looking at 
> their score, it might be possible to make specific 
> suggestions based on specific component scores,
>  e.g. pointing out specific "abrasive" words being 
> used regularly and suggesting alternatives.
> 
> Here's a suggestion for something a lot simpler 
> than the "interaction sentiment tool". Just 
> produce some word clouds for:
> 
> * a user's edit summaries
> 
> * a user's edits on article Talk pages
> 
> * a user's edits on other people's User 
> Talk pages
> 
> * a user's edits on their own User Talk pag