Re: [Wiki-research-l] Help us understand ORES and make good tradeoffs

2018-12-13 Thread Ofer Arazy
Hi Bowen,

I've used ORES in my research on the factors driving article quality (where
ORES scores are used as a proxy for article quality).
If you are seeking input from the research community, I'm happy to
participate in your survey.

Ofer



On Thu, Dec 13, 2018 at 5:08 PM Bowen Yu  wrote:

> Hello,
>
> ORES has been serving the Wikipedia community for a while, for purposes
> such as counter-vandalism. Having seen the wide usage and effectiveness of
> ORES in the community, we'd like to continue working on ORES development.
> We plan to improve and redesign ORES algorithms by incorporating feedback
> from all the stakeholders involved in the entire ORES ecosystem, such as
> ORES application developers, ORES application operators, etc. We want to
> understand their concerns and values, and come up with effective
> algorithmic designs that can balance trade-offs and mitigate potential
> conflicts of interest (such as edit quality control vs. newcomer
> protection) to further improve ORES performance.
>
> We will work with Aaron Halfaker and his team to improve the ORES quality
> control models and identify their limitations. Here is the project
> proposal on Meta-Wiki:
> <https://meta.wikimedia.org/wiki/Research:Applying_Value-Sensitive_Algorithm_Design_to_ORES>.
> If you are interested or have any thoughts, please feel free to reach out
> to me. Thanks!
___
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Readers of Wikipedia

2018-12-13 Thread Kerry Raymond
I think the decision on the scope probably depends on whether people who speak
that language also speak other languages. For example, many people in the
Netherlands and Norway speak English very well. There may be less need to
provide a topic in their own language if that topic is well covered in
English Wikipedia, so perhaps the focus can be more on local content. But if
the speakers of that language are less likely to speak a "larger" language,
then providing a wide variety of non-local topics may be more important
than providing information on local topics.

I don't know if any Wikipedias consciously make a decision to focus (or not) on 
local content, but even if they do, I presume they are hostage to the interests 
of their contributors (unless they actively remove the material). That is, you 
get the topics that the contributors are willing and able to write, no matter 
what the intention might be.

Australians are often surprised to find that content about the Australian
Outback appears in German Wikipedia and not in English Wikipedia, but if you
travel in the Outback, the reason is obvious: the Outback is full of German
tourists in campervans, and this is reflected in their Wikipedia contributions.

Kerry

-Original Message-
From: Wiki-research-l [mailto:wiki-research-l-boun...@lists.wikimedia.org] On 
Behalf Of Ziko van Dijk
Sent: Thursday, 13 December 2018 8:02 PM
To: Research into Wikimedia content and communities 

Subject: [Wiki-research-l] Readers of Wikipedia

Hello,

I just watched the showcase of December 2018, thank you for the interesting
contribution! It would be great if further research could have a look at
questions such as language choice.
With regard to gaining more insight into what readers want, I have struggled
in the past with two questions:

Regionally important content: Should a Wikipedia language version concentrate
on regional topics, or try to cover a large variety of topics?
Heinz Kloss in the 1970s introduced the idea of "eigenbezogene Inhalte",
content that is closely related to a language and its region, like local
history, culture and typical crafts such as fishing on the Faroe Islands or
farming in the Alps. What do the readers in Hungary want? That hu.WP
concentrates on Hungarian topics, while they consult English Wikipedia for
specialized technical topics or other countries?

Large or small articles: Some printed encyclopedias had relatively few, but
large articles. Others segmented the content into many small articles.
(Think of Encyclopedia Britannica: Macropedia and Micropedia.) What do
Wikipedia readers want? Do they prefer to read about a larger topic in one
long, well-structured article? Or several short ones, linking to each other?

I could imagine that a reader who is interested in information for work or
school prefers long articles that provide an in-depth approach in order to
become familiar with the overall topic (that is, what one would expect
traditionally). And that "news" readers want to look up something quickly, in a
short, simplifying article.

Kind regards
Ziko


Re: [Wiki-research-l] Readers of Wikipedia

2018-12-13 Thread Leila Zia
Hi all.

On Thu, Dec 13, 2018 at 2:02 AM Ziko van Dijk  wrote:
>
> I just watched the showcase of December 2018, thank you for the interesting
> contribution!

For those interested who haven't watched it, Ziko is referring to:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#December_2018

Thanks, Ziko! More below.

> It would be great if further research could have a look at
> questions such as language choice.

Agreed. This has been a request by a few other community members as
well. One interesting question to address here is: can we characterize
language switching? More specifically: are there specific conditions
under which switching happens? This will allow us to answer questions
like: Are there specific topics that are covered in language x and not
y that trigger switching? Is switching a function of availability of
content, or can we still see switching even when the content exists in
the 2+ languages the user is comfortable reading content in? ...

Diego started looking into this, and you can follow his future work at
https://meta.wikimedia.org/wiki/Research:Characterizing_Wikipedia_Reader_Behaviour/Demographics_and_Wikipedia_use_cases
We will do more work in this space in the coming 6 months.

> With regard to gaining more insight into what readers want, I have
> struggled in the past with two questions:

> Regionally important content: Should a Wikipedia language version
> concentrate on regional topics, or try to cover a large variety of topics?

This is a good question. As you stated, it is related to understanding
reader needs, and some of the research on language switching behavior
can help us understand it better. Another aspect to keep an eye on is
Denny's recent proposal for Abstract Wikipedia [1]. If that direction is
picked up, we may have more reason to emphasize regionally important
content creation first.

> Large or small articles: Some printed encyclopedias had relatively few, but
> large articles. Others segmented the content into many small articles.
> (Think of Encyclopedia Britannica: Macropedia and Micropedia.) What do
> Wikipedia readers want? Do they prefer to read about a larger topic in one
> long, well-structured article? Or several short ones, linking to each other?

This is an interesting one, too. There are at least two ways to
approach this question. One is to study how Wikipedia readers learn
(what it means to learn needs to be defined) and then do a series of
user studies across languages and regions to find patterns and provide
recommendations for how to organize content with readers in mind. The
other approach, which I would love to see in action, is to break down
the article into many pieces and allow the reader to pick and choose
among them to create a learning experience for topic x, and then learn
from the way readers learn. This would build on Collection [2], Gather
[3] or other similar initiatives. Search data can also be valuable
here. (Just to be clear: this is not something we're looking into
right now, but it's a fascinating area, and if someone has the bandwidth
and resources to look into it, it can help us learn a lot.)

> I could imagine that a reader who is interested in information for work or
> school prefers long articles that provide an in-depth approach in order to
> become familiar with the overall topic (that is, what one would expect
> traditionally).

We don't know if this assumption is correct. In fact, article length is
a feature in the study, and it is not picked up as a feature that
defines this user group. What we know is that across the 14 languages
in the study, this group of readers has longer dwell times on articles,
uses the desktop platform, and comes to Wikipedia in the afternoon [4].

That said, we can't say for sure based on the recent study that this
group of readers doesn't prefer longer articles: if a longer article on
the topic of their interest doesn't exist on Wikipedia, they may have to
work with the shorter article. It would be great to have some user
studies to understand this group and their needs better.

> And that "news" readers want to look up something quickly,
> in a short, simplifying article.

This one we don't know. :) What we know is that across languages, this
was not observed as a consistent pattern (check Table 2 in the most
recent paper [5]). For the enwiki-specific audience, check Table 2a in
the first paper [6]: while 38% of the users motivated by media are coming
to look up a fact, the other 62% are there for an overview or in-depth
reading.

On Thu, Dec 13, 2018 at 5:58 AM Bob Kosovsky  wrote:
>
> "Large or small articles."  I've noticed this point of contention at the
> outset of my Wikipedia editing.  There are some editors (and presumably
> readers) who want Wikipedia to look and function like a traditional
> encyclopedia, with thorough articles reflecting well-written and thoughtful
> essays that one used to find in encyclopedias.  Those who know anything
> about web

Re: [Wiki-research-l] Help us understand ORES and make good tradeoffs

2018-12-13 Thread Pine W
Hi Bowen, after reading your project proposal I have a few questions and
concerns.

You mention a perceived tension between protecting newcomers and protecting
the quality of content. I am wondering whether that is a false dichotomy.
In my experience, test edits and blatant vandalism usually look different
from mistakes from good faith editors.

There is a feature that allows users to adjust ORES-supported edit scoring
in our watchlists and Recent Changes:
https://www.mediawiki.org/wiki/Edit_Review_Improvements/New_filters_for_edit_review.
Have you tested this feature? How would your research be useful for that
feature's future development?

I think that ORES is supposed to aid human judgment, not to substitute for
human judgment. How certain are you that "ORES applications will play a
role in drawing a line between acceptable freestyle edits and editing
policies in standard."? There may well be some human patrollers who adjust
their definitions for vandalism based on ORES recommendations, but I think
that you would want to know to what extent ORES has that effect.

I would also like to mention that Wikipedia policies and guidelines, like
offline human laws and customs, may change over time, may have varying
interpretations, and may have varying degrees of adherence among the
populace.

Thanks for your interest in studying ORES. I am glad that you are
collaborating with Aaron.



On Thu, Dec 13, 2018, 7:08 AM Bowen Yu  wrote:

> Hello,
>
> ORES has been serving the Wikipedia community for a while, for purposes
> such as counter-vandalism. Having seen the wide usage and effectiveness of
> ORES in the community, we'd like to continue working on ORES development.
> We plan to improve and redesign ORES algorithms by incorporating feedback
> from all the stakeholders involved in the entire ORES ecosystem, such as
> ORES application developers, ORES application operators, etc. We want to
> understand their concerns and values, and come up with effective
> algorithmic designs that can balance trade-offs and mitigate potential
> conflicts of interest (such as edit quality control vs. newcomer
> protection) to further improve ORES performance.
>
> We will work with Aaron Halfaker and his team to improve the ORES quality
> control models and identify their limitations. Here is the project
> proposal on Meta-Wiki:
> <https://meta.wikimedia.org/wiki/Research:Applying_Value-Sensitive_Algorithm_Design_to_ORES>.
> If you are interested or have any thoughts, please feel free to reach out
> to me. Thanks!
-- 

Pine
( https://meta.wikimedia.org/wiki/User:Pine )


[Wiki-research-l] Help us understand ORES and make good tradeoffs

2018-12-13 Thread Bowen Yu
Hello,

ORES has been serving the Wikipedia community for a while, for purposes
such as counter-vandalism. Having seen the wide usage and effectiveness of
ORES in the community, we'd like to continue working on ORES development.
We plan to improve and redesign ORES algorithms by incorporating feedback
from all the stakeholders involved in the entire ORES ecosystem, such as
ORES application developers, ORES application operators, etc. We want to
understand their concerns and values, and come up with effective
algorithmic designs that can balance trade-offs and mitigate potential
conflicts of interest (such as edit quality control vs. newcomer
protection) to further improve ORES performance.

We will work with Aaron Halfaker and his team to improve the ORES quality
control models and identify their limitations. Here is the project
proposal on Meta-Wiki:
<https://meta.wikimedia.org/wiki/Research:Applying_Value-Sensitive_Algorithm_Design_to_ORES>.
If you are interested or have any thoughts, please feel free to reach out
to me. Thanks!


Re: [Wiki-research-l] Readers of Wikipedia

2018-12-13 Thread Federico Leva (Nemo)

Ziko van Dijk, 13/12/18 12:02:

Regionally important content: Should a Wikipedia language version
concentrate on regional topics, or try to cover a large variety of topics?


This question is automatically solved if, instead of focusing on
Wikipedia, you work on Wikisource. Wikisource will only contain texts
published in that language, such as local fiction and official acts of
local entities. An example is Ladino/Ladin (as in lld, not
lad/Judaeo-Spanish):

https://it.wikisource.org/wiki/Categoria:Testi_in_ladino

Federico



Re: [Wiki-research-l] Readers of Wikipedia

2018-12-13 Thread Bob Kosovsky
I didn't see the showcase but I'm intrigued by Ziko's comments.  The
general response I would make to his question "What do Wikipedia readers
want?" is that different readers want different things, sometimes in
conflict with one another.  To elaborate on two of Ziko's points:

"Regionally important content."   This reminded me of a Signpost editorial
some years ago discussing a then-recent Arbcom debate concerning how the
city Jerusalem is described in the opening section of several different
language Wikipedias.  As you can imagine, not only was there strong
variance but it seemed that some of the versions were making unstated
points that, if not political, were trying to convey stability of
definition without alluding to any controversies.  Admittedly Jerusalem is
an extreme example, but I would think there are any number of geographic or
even topical subjects for which certain language communities would want to
convey certain meanings and issues of which other language communities
might be unaware.

"Large or small articles."  I've noticed this point of contention at the
outset of my Wikipedia editing.  There are some editors (and presumably
readers) who want Wikipedia to look and function like a traditional
encyclopedia, with thorough articles reflecting well-written and thoughtful
essays that one used to find in encyclopedias.  Those who know anything
about web design know that a long essay goes against the design ethos of
the web, where some advise against webpages that require excessive scrolling.

The bottom line is that I don't think one can or should make a definitive
rule regarding these issues because different communities will want
different attributes and styles.  To be sure, editors/readers should be
aware that such options exist and that Wikipedia style varies considerably
from article to article (and community to community).

Bob
(user:kosboot)


On Thu, Dec 13, 2018 at 5:02 AM Ziko van Dijk  wrote:

> Hello,
>
> I just watched the showcase of December 2018, thank you for the interesting
> contribution! It would be great if further research could have a look at
> questions such as language choice.
> With regard to gaining more insight into what readers want, I have
> struggled in the past with two questions:
>
> Regionally important content: Should a Wikipedia language version
> concentrate on regional topics, or try to cover a large variety of topics?
> Heinz Kloss in the 1970s introduced the idea of "eigenbezogene Inhalte",
> content that is closely related to a language and its region, like local
> history, culture and typical crafts such as fishing on the Faroe Islands or
> farming in the Alps. What do the readers in Hungary want? That hu.WP
> concentrates on Hungarian topics, while they consult English Wikipedia for
> specialized technical topics or other countries?
>
> Large or small articles: Some printed encyclopedias had relatively few, but
> large articles. Others segmented the content into many small articles.
> (Think of Encyclopedia Britannica: Macropedia and Micropedia.) What do
> Wikipedia readers want? Do they prefer to read about a larger topic in one
> long, well-structured article? Or several short ones, linking to each
> other?
>
> I could imagine that a reader who is interested in information for work or
> school prefers long articles that provide an in-depth approach in order to
> become familiar with the overall topic (that is, what one would expect
> traditionally). And that "news" readers want to look up something quickly,
> in a short, simplifying article.
>
> Kind regards
> Ziko


[Wiki-research-l] Readers of Wikipedia

2018-12-13 Thread Ziko van Dijk
Hello,

I just watched the showcase of December 2018, thank you for the interesting
contribution! It would be great if further research could have a look at
questions such as language choice.
With regard to gaining more insight into what readers want, I have struggled
in the past with two questions:

Regionally important content: Should a Wikipedia language version
concentrate on regional topics, or try to cover a large variety of topics?
Heinz Kloss in the 1970s introduced the idea of "eigenbezogene Inhalte",
content that is closely related to a language and its region, like local
history, culture and typical crafts such as fishing on the Faroe Islands or
farming in the Alps. What do the readers in Hungary want? That hu.WP
concentrates on Hungarian topics, while they consult English Wikipedia for
specialized technical topics or other countries?

Large or small articles: Some printed encyclopedias had relatively few, but
large articles. Others segmented the content into many small articles.
(Think of Encyclopedia Britannica: Macropedia and Micropedia.) What do
Wikipedia readers want? Do they prefer to read about a larger topic in one
long, well-structured article? Or several short ones, linking to each other?

I could imagine that a reader who is interested in information for work or
school prefers long articles that provide an in-depth approach in order to
become familiar with the overall topic (that is, what one would expect
traditionally). And that "news" readers want to look up something quickly,
in a short, simplifying article.

Kind regards
Ziko