Re: [Wiki-research-l] [Analytics] Wikipedia Detox: Scaling up our understanding of harassment on Wikipedia

2017-06-26 Thread Pine W
On Sat, Jun 24, 2017 at 2:49 AM, Kerry Raymond 
wrote:

> No right to be offended? To say to someone "you don't have the right to be
> offended" seems pretty offensive in itself. It seems to imply that their
> cultural norms are somehow inferior or unacceptable.
>

I'm not sure that I worded my comment as clearly as I would like. I would like
to reduce the intensity and frequency of toxic behavior, but there's some
difficulty in defining what is toxic or unacceptable. If person X says
something that person Y finds offensive, that in and of itself doesn't mean
that person X was being intentionally malicious. Cultural norms and
personal sensitivities vary widely, and there is a danger that attempts to
reduce conflict will be done in such a way that freedom of expression is
suppressed. As an example, there are statements in British English that I
am told are highly offensive, but to me as an American seem mild when I
hear them through an American cultural lens. Having an AI, or humans,
attempt to police the degree to which a statement is offensive seems like a
minefield. Perhaps a better way to approach the situation is to try to
look at intent, which I think is similar to your next point:


>
> With the global reach of Wikipedia, there are obviously many points of
> view on what is or isn't offensive in what circumstances. Offence may not
> be intended at first, but, if after a person is told their behaviour is
> offensive and they persist with that behaviour, I think it is reasonable to
> assume that they intend to offend. Which is why the data showing there is a
> group of experienced users involved in numerous personal attacks demands
> some human investigation of their behaviour.
>

I think that looking at intent, rather than solely at the content of what
was said, sounds like a good idea. However, I'm not sure that I'd always
agree that if person X is told that statement A is offensive to person Y
that person X should necessarily stop, because what person X is saying may
seem reasonable to person X (for example "It's OK to eat meat") but
highly offensive to person Y. I think maybe a more nuanced approach would
be to look at what person X's intent is in saying "It's OK to eat meat": is
the person expressing or arguing for their views in good faith, or are they
acting in bad faith and intentionally trying to provoke person Y?
Fortunately, in my experience, the cases where people are being malicious
are usually clearer, such that admins and others are not usually called on
to evaluate whether a statement was OK. "Calling names" in any language
seems to not go over very well, and I think that most of us who have a tool
to create blocks would be willing to use that tool if a conversation
degenerated to that point. Unfortunately, like you, my perception in the
past was that there were some experienced users on English Wikipedia (and
perhaps other languages as well) where needlessly provocative behavior was
tolerated; I would like to think that the standards for civility are being
raised.

I'm aware of WMF's research into the frequency of personal attacks; I
wonder whether there are charts of how the frequency is changing over time.


> Similarly for a person offended, if there is a genuinely innocent
> interpretation to something they found offensive and that is explained to
> them (perhaps by third parties), I think they need to be accepting that no
> offence was intended on that occasion. Obviously we need a bit of give and
> take. But I think there have to be limits on the repeated behaviour (either
> in giving the offence or taking the offence).
>

In general, I agree.

There are some actions for which I could support "one strike and you're
out"; I once kicked someone out of an IRC channel for uncivil behavior with
little (perhaps no) warning because the situation seemed so clear to me,
and no one complained about my decision. I think that in many cases
it's clear whether someone is making a personal attack, but some cases are
not so clear, and I want to be careful about the degree to which WMF
encourages administrators to rely on an AI to make decisions. Even if an AI
is trained extensively with native language speakers, there can be
significant differences in how a statement is interpreted.

Pine


>
> Kerry
>
> ___
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>


Re: [Wiki-research-l] link rot

2017-06-26 Thread Kerry Raymond
It's worth commenting that link rot occurs in a variety of ways.

The obvious way is that the URL is broken and error 404 is returned to the
browser.
Or, rather than sending a 404 to the browser, the site redirects you to a page
that says "Page not found" without an error 404.
Or you are redirected to a search page that does not find what you want.
Or you are redirected to a general search page from which you may or may
not find the page you were after at a new URL.
Or the URL has been replaced by a specialised search which will give you what
you want but not in a way that you can use for citing or archiving. A lot of
sites seem increasingly to be hiding content by returning it as search
results that you cannot archive.
Or the URL works but the content on the page is not what was expected
(different topic), which occurs with sites that number (and then re-number)
their web pages or when cybersquatters buy an expired domain name.
Or the URL works and continues to be about the topic expected, but does not say
anything to back up the claim in the Wikipedia article because the content has
changed since.
Or the URL works and the content NEVER said what the Wikipedia article claims
(contributor error or deliberate misleading).

And there may be more variations on the theme that I have forgotten about.

Obviously these variations have to be detected in different ways. And for 
archive sites, it is often impossible to recognise in an automated way that a 
lot of these have occurred. It can be really tedious to wade through dozens of 
archived snapshots of a webpage finding "Page not found" pages in your search 
for the "most recent really-what-I-wanted content". This is a problem for the 
Internet Archive Bot.
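
As a rough illustration of why these variations need different detection
strategies, here is a minimal sketch that sorts an already-fetched response
into a few of the categories listed above. It is not how the Internet Archive
Bot works. The FetchResult type, the phrase list and the search-path hints are
all hypothetical and would need tuning per site, and anything the sketch
cannot decide is handed to a human.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class FetchResult:
    """What a link checker saw; every field is supplied by the caller."""
    requested_url: str
    final_url: str  # URL after following redirects
    status: int     # HTTP status of the final response
    body: str       # text of the final page


# Hypothetical heuristics, not taken from any real bot.
NOT_FOUND_PHRASES = ("page not found", "no longer available")
SEARCH_HINTS = ("/search", "?q=", "?s=", "query=")


def classify(result: FetchResult) -> str:
    """Return a coarse link-rot category for one fetched URL."""
    if result.status in (404, 410):
        return "hard-404"
    if result.status >= 400:
        return "http-error"
    text = result.body.lower()
    if any(phrase in text for phrase in NOT_FOUND_PHRASES):
        # Status looks fine but the page itself says the content is gone.
        return "soft-404"
    redirected = result.final_url != result.requested_url
    if redirected and any(h in result.final_url.lower() for h in SEARCH_HINTS):
        return "redirected-to-search"
    if redirected and (urlsplit(result.final_url).netloc
                       != urlsplit(result.requested_url).netloc):
        return "redirected-to-other-site"
    # Changed or never-matching content cannot be detected from the response.
    return "needs-human-check"
```

Note that the last three variations in the list (wrong topic, changed content,
content that never supported the claim) all end up in "needs-human-check",
which is the point made above about automated recognition.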

So you often need a human to say "hey, it's broken" at which point the Internet 
Archive Bot may try to fix it. Because the bot writers know that the bot can be 
fooled by finding an "archived page" that actually doesn't replace the deadlink 
with useful content, they put those very long messages on the Talk pages to try 
to ask people to check the rescued citation. I don't know about other people, 
but when the Internet Archive Bot was released, it deluged my watchlist and I 
simply stopped checking its work (I could never have kept up). Now its volume 
has reduced but I'm now trained to ignore it. I think it does a better job at 
archiving external links than rescuing (but given the variations above, this is 
not to be wondered at).

At the end of the day, most deadlinks need a human in the loop for recovery. 
And it's a huge task and a tedious one. But I do dabble in it from time to time 
for claims that seem particularly "bold" or on articles that I care a little 
bit more about. So let me talk about the process.

One of the problems is that for URLs that I did not add myself, I can see the
deadlink citation and I may have located what I think is a replacement page
(whether on the original website or from an archive or whatever), say with a
similar-ish title appearing to talk about the topic of the Wikipedia article,
but I cannot tell from the article how much of the content preceding the
citation (or in the case of bullet lists, tables, etc, following the citation)
is intended to be supported by the citation. So I don't really know if some
particular claim is supposed to be supported by the nearest citation or whether
it may be supported by another citation that has drifted a long way away. I've
emailed at some length previously about this problem of being unable to relate
chunks of text in articles to citations and the citation rot that occurs as the
article grows and the citations drift into the wrong text (or just get deleted
because a subsequent editor can't see where they fit into the narrative or
can't be bothered to see). So, not quite knowing what information was supposed
to be supported by this citation, it is genuinely hard to say if the new URL I
have found is or isn't an adequate replacement. Am I doing more harm by
replacing it when I may not be totally confident, or should I leave it for
someone else to decide (assuming someone else will even try)? I often try to
fix a deadlink citation but back away because I just don't know if I am doing
the right thing or not.

To try to get around the "citation rot" issues, if I am highly motivated that 
day, I use WikiBlame to try to locate the version of the article in the History 
where the citation was added. This gives me the best chance to know what 
information it was intended to support. So then I go and look in Internet 
Archive and find the URL has been archived, but the first archived version is 
AFTER the date of the version of the Wikipedia article that added the citation. 
Is this a problem? Generally I take the risk and go for it if the info seems to 
be consistent. At the end of 
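
The date comparison described in this paragraph (archived snapshot versus the
article revision that added the citation) can be sketched as follows. This is
a minimal sketch under stated assumptions: the snapshot list is supplied by
the caller as 14-digit YYYYMMDDhhmmss strings, the form used in Wayback
Machine snapshot URLs, and the preference for the latest snapshot on or before
the citation date is an illustrative choice, not an established rule. The
function name pick_snapshot is hypothetical.

```python
from datetime import datetime
from typing import List, Optional, Tuple

WAYBACK_FORMAT = "%Y%m%d%H%M%S"  # 14-digit snapshot timestamp


def parse_snapshot(ts: str) -> datetime:
    """Convert a 14-digit snapshot timestamp into a datetime."""
    return datetime.strptime(ts, WAYBACK_FORMAT)


def pick_snapshot(snapshots: List[str],
                  citation_added: datetime) -> Tuple[Optional[str], bool]:
    """Return (timestamp, predates_citation).

    Prefers the latest snapshot taken on or before the citation was added.
    If every snapshot is later, returns the earliest one and flags it so a
    human knows the archived content may differ from what was cited.
    """
    ordered = sorted(snapshots, key=parse_snapshot)
    before = [s for s in ordered if parse_snapshot(s) <= citation_added]
    if before:
        return before[-1], True
    if ordered:
        return ordered[0], False
    return None, False
```

A result whose second element is False corresponds to the "first archived
version is AFTER" case above, where the replacement is a judgement call.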

Re: [Wiki-research-l] Looking for examples and suggestions for research project on my work at UNESCO

2017-06-26 Thread Ed Summers
Hi John,

I suspect you've seen this already, but you can use MediaWiki's external link 
search to search for links to UNESCO. For example here's how to find links to 
resources hosted at en.unesco.org/mediabank in the commons:


https://commons.wikimedia.org/w/index.php?target=http%3A%2F%2Fen.unesco.org%2Fmediabank&title=Special%3ALinkSearch&uselang=en

You can do searches like that on all the language Wikipedias. Maybe it could 
yield some interesting data for analysis? If you're interested I have some code 
for collecting the links automatically:

https://github.com/edsu/wikilinks
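
Separately from the wikilinks code, here is a minimal sketch of generating the
same kind of Special:LinkSearch URL for several wikis at once. It only builds
URLs in the form shown above and does not fetch anything. The function names
and the list of hosts are hypothetical choices for illustration.

```python
from urllib.parse import urlencode

# Hypothetical selection of wikis; extend with other language codes as needed.
HOSTS = [
    "commons.wikimedia.org",
    "en.wikipedia.org",
    "fr.wikipedia.org",
    "es.wikipedia.org",
]


def linksearch_url(wiki_host: str, target: str, uselang: str = "en") -> str:
    """Build a Special:LinkSearch URL for one wiki and one link target."""
    params = {
        "target": target,
        "title": "Special:LinkSearch",
        "uselang": uselang,
    }
    return f"https://{wiki_host}/w/index.php?{urlencode(params)}"


def linksearch_urls(target: str, hosts=HOSTS) -> dict:
    """Map each wiki host to its LinkSearch URL for the given target."""
    return {host: linksearch_url(host, target) for host in hosts}
```

Calling linksearch_url("commons.wikimedia.org",
"http://en.unesco.org/mediabank") reproduces the Commons URL given above. For
bulk collection, the MediaWiki API's list=exturlusage query returns the same
kind of data as JSON and is probably a better fit than scraping the special
page.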

Also, a few years ago Liam Wyatt prepared a report for the National Museum of 
Australia about their relationship with Wikimedia. Perhaps there are some 
ideas/approaches in there that could be useful for your project?


https://upload.wikimedia.org/wikipedia/commons/e/e0/Wikipedia_-_National_Museum_of_Australia_Situation_Report%2C_February_2012.pdf

Good luck!

//Ed


Re: [Wiki-research-l] link rot

2017-06-26 Thread fn



On 06/26/2017 04:43 PM, Mark J. Nelson wrote:
> James Salsman writes:
>
> > Is anyone studying the rate at which external links become unavailable
> > on Wikipedia projects?
>
> There've been a few studies over the years, but none of the ones I know
> of are recent. One from 2011 that may nonetheless be interesting is:
>
> P. Tzekou, S. Stamou, N. Kirtsis, N. Zotos. Quality assessment of
> Wikipedia external links. In Proceedings of Web Information Systems and
> Technologies (WEBIST) 2011.
> http://www.dblab.upatras.gr/download/nlp/NLP-Group-Pubs/11-WEBIST_Wikipedia_External_Links.pdf
>
> -Mark



There is a Japanese study from the same year:

Characteristics of external links and dead links in Japanese Wikipedia
https://dx.doi.org/10.2964/JSIK.21_06


This was found on the Scholia page for the link rot topic:

https://tools.wmflabs.org/scholia/topic/Q1193907


/Finn



Re: [Wiki-research-l] Looking for examples and suggestions for research project on my work at UNESCO

2017-06-26 Thread Gerard Meijssen
Hoi,
To what extent does UNESCO hold openly licensed texts in languages other than
English?
Thanks,
 GerardM

On 25 June 2017 at 15:27, john cummings  wrote:

> Dear all
>
> I've been working as Wikimedian in Residence at UNESCO for the past two
> years working on a number of activities including:
>
> * Sharing UNESCO media content on Wikimedia projects
> * Sharing UNESCO open license text on English language Wikipedia
> * Promoting Wiki Loves competitions through UNESCO social media
> * Encouraging other UN agencies to adopt open licenses.
>
> More information is here:
> https://en.wikipedia.org/wiki/Wikipedia:WikiProject_United_Nations
>
> I'm working with a researcher at UNESCO to understand the impact of what
> I've been doing and would like some suggestions on where to start with a
> research project. The researcher has a background in statistics and is
> familiar with R but is not very knowledgeable about Wikimedia projects. I'm
> not familiar with much of the research done on Wikimedia projects other
> than metrics tools like BaGLAMa, GLAMorgan etc that I use for reporting. I
> guess what I'm looking for is a general overview and case studies on
> research projects done on Wikimedia projects and any specific examples done
> with the kind of work I'm doing.
>
> Many thanks
>
> John


Re: [Wiki-research-l] Looking for examples and suggestions for research project on my work at UNESCO

2017-06-26 Thread Jonathan Morgan
hi John,

When you say "research project", do you mean specifically "measure the
impact of a program or event", or do you mean something more general?

- J

On Sun, Jun 25, 2017 at 6:27 AM, john cummings 
wrote:

> Dear all
>
> I've been working as Wikimedian in Residence at UNESCO for the past two
> years working on a number of activities including:
>
> * Sharing UNESCO media content on Wikimedia projects
> * Sharing UNESCO open license text on English language Wikipedia
> * Promoting Wiki Loves competitions through UNESCO social media
> * Encouraging other UN agencies to adopt open licenses.
>
> More information is here:
> https://en.wikipedia.org/wiki/Wikipedia:WikiProject_United_Nations
>
> I'm working with a researcher at UNESCO to understand the impact of what
> I've been doing and would like some suggestions on where to start with a
> research project. The researcher has a background in statistics and is
> familiar with R but is not very knowledgeable about Wikimedia projects. I'm
> not familiar with much of the research done on Wikimedia projects other
> than metrics tools like BaGLAMa, GLAMorgan etc that I use for reporting. I
> guess what I'm looking for is a general overview and case studies on
> research projects done on Wikimedia projects and any specific examples done
> with the kind of work I'm doing.
>
> Many thanks
>
> John



-- 
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) 


Re: [Wiki-research-l] link rot

2017-06-26 Thread Mark J. Nelson
James Salsman  writes:

> Is anyone studying the rate at which external links become unavailable
> on Wikipedia projects?

There've been a few studies over the years, but none of the ones I know
of are recent. One from 2011 that may nonetheless be interesting is:

P. Tzekou, S. Stamou, N. Kirtsis, N. Zotos. Quality assessment of
Wikipedia external links. In Proceedings of Web Information Systems and
Technologies (WEBIST) 2011.
http://www.dblab.upatras.gr/download/nlp/NLP-Group-Pubs/11-WEBIST_Wikipedia_External_Links.pdf

-Mark

-- 
Mark J. Nelson
The MetaMakers Institute
Falmouth University
http://www.kmjn.org



Re: [Wiki-research-l] link rot

2017-06-26 Thread Leila Zia
Hi James,

On Mon, Jun 26, 2017 at 8:04 AM, James Salsman  wrote:
>
> Is anyone studying the rate at which external links become unavailable
> on Wikipedia projects?
>
> I just did a quick tally and less than 40% of the external links cited
> in the introductions of L1-vital enwiki health and social science
> articles I sampled were good, and that's only counting those which
> didn't already have a {{dead link}} tag.
>
> I thought that the bots were doing a better job of replacing dead
> links with archive copies than they apparently are.

Two items to share:

* In the FY17-18 Annual Plan, Program 11 [1]: Objective 1, Outcome 1 is
closely related to your question/observation. I expect more research
in this space as a result.

* InternetArchiveBot [2] is one bot that I know operates in this
space. If you are interested in it, it would be good to have a
discussion with the team behind that bot to learn how the bot
currently operates and what it needs to be improved.

Best,
Leila

[1] 
https://meta.wikimedia.org/wiki/Wikimedia_Foundation_Annual_Plan/2017-2018/Draft/Programs/Technology#Program_11._Improving_citations_across_Wikimedia_projects
[2] https://en.wikipedia.org/wiki/User:InternetArchiveBot

