[Wikimedia-l] Re: Wikimedia Research Showcase April 19 at 16:30 UTC

2023-04-19 Thread Emily Lescak
Hi all,

A friendly reminder that this event will be starting in about three hours.

Best,
Emily


On Thu, Apr 13, 2023 at 2:55 PM Emily Lescak wrote:

> Hi all,
>
> The next Research Showcase, with the theme of Images on Wikipedia, will be
> live-streamed Wednesday, April 19, at 16:30 UTC. Find your local time here
> <https://zonestamp.toolforge.org/1681921857>.
>
> YouTube stream: https://www.youtube.com/watch?v=vW0waU-QArU
>
> You can join the conversation on IRC at #wikimedia-research or on the
> YouTube chat.
>
> This month's presentations:
> A large scale study of reader interactions with images on Wikipedia
> By *Daniele Rama, University of Turin*
> Wikipedia is the largest source of free
> encyclopedic knowledge and one of the most visited sites on the Web. To
> increase reader understanding of the article, Wikipedia editors add images
> within the text of the article’s body. However, despite their widespread
> usage on web platforms and the huge volume of visual content on Wikipedia,
> little is known about the importance of images in the context of free
> knowledge environments. To bridge this gap, we collect data about English
> Wikipedia reader interactions with images during one month and perform the
> first large-scale analysis of how interactions with images happen on
> Wikipedia. First, we quantify the overall engagement with images, finding
> that one in 29 pageviews results in a click on at least one image, one
> order of magnitude higher than interactions with other types of article
> content. Second, we study what factors associate with image engagement and
> observe that clicks on images occur more often in shorter articles and
> articles about visual arts or transports and biographies of less well-known
> people. Third, we look at interactions with Wikipedia article previews and
> find that images help support reader information need when navigating
> through the site, especially for more popular pages. The findings in this
> study deepen our understanding of the role of images for free knowledge and
> provide a guide for Wikipedia editors and web user communities to enrich
> the world’s largest source of encyclopedic knowledge.
>
>    - Paper: https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-021-00312-8
>
>
> Visual gender biases in Wikipedia: A systematic evaluation across the ten
> most spoken languages
> By *Pablo Beytia, Catholic University of Chile*
> The existing research suggests a significant gender gap in Wikipedia
> biographical articles, with a minimal representation of women and gender
> asymmetries in the textual content. However, the visual aspects of this gap
> (e.g., image volume and quality) have received little attention. This study
> examined asymmetries between women's and men's biographies, exploring
> written and visual content across the ten most widely spoken languages. The
> cross-lingual analysis reveals that (1) the most salient male biases appear
> when editors select which personalities should have a Wikipedia page, (2)
> the trends in written and visual content are dissimilar, (3) male
> biographies tend to have more images across languages, and (4) female
> biographies have better visual quality on average. The open database of
> this study provides eight indicators of gender asymmetries in ten
> occupational domains and ten languages. That information allows for a
> granular view of gender biases, as well as exploring more macroscopic
> phenomena, such as the similarity between Wikipedia versions according to
> their gender bias structures.
>
>    - Papers:
>
> Beytía, P., Agarwal, P., Redi, M., & Singh, V. K. (2022). Visual Gender
> Biases in Wikipedia: A Systematic Evaluation across the Ten Most Spoken
> Languages. Proceedings of the International AAAI Conference on Web and
> Social Media, 16(1), 43-54. https://doi.org/10.1609/icwsm.v16i1.19271
> https://ojs.aaai.org/index.php/ICWSM/article/view/19271
>
> Beytía, P. & Wagner, C. (2022). Visibility layers: a framework for
> systematizing the gender gap in Wikipedia content. Internet Policy Review,
> 11(1). https://doi.org/10.14763/2022.1.1621
> https://policyreview.info/articles/analysis/visibility-layers-framework-systematising-gender-gap-wikipedia-content
>
> You can watch our past Research Showcases here:
> https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
>
> Hope you can join us!
>
> Warm regards,
> Emily
>
> --
> Emily Lescak (she / her)
> Senior Research Community Officer
> The Wikimedia Foundation
>
___
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/T5FZNSUHLZD5YHH4GOBSIOBQYK7NHOZE/
To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org

[Wikimedia-l] Wikimedia Research Showcase April 19 at 16:30 UTC

2023-04-13 Thread Emily Lescak
Hi all,

The next Research Showcase, with the theme of Images on Wikipedia, will be
live-streamed Wednesday, April 19, at 16:30 UTC. Find your local time here
<https://zonestamp.toolforge.org/1681921857>.

YouTube stream: https://www.youtube.com/watch?v=vW0waU-QArU

You can join the conversation on IRC at #wikimedia-research or on the
YouTube chat.

This month's presentations:
A large scale study of reader interactions with images on Wikipedia
By *Daniele Rama, University of Turin*
Wikipedia is the largest source of free
encyclopedic knowledge and one of the most visited sites on the Web. To
increase reader understanding of the article, Wikipedia editors add images
within the text of the article’s body. However, despite their widespread
usage on web platforms and the huge volume of visual content on Wikipedia,
little is known about the importance of images in the context of free
knowledge environments. To bridge this gap, we collect data about English
Wikipedia reader interactions with images during one month and perform the
first large-scale analysis of how interactions with images happen on
Wikipedia. First, we quantify the overall engagement with images, finding
that one in 29 pageviews results in a click on at least one image, one
order of magnitude higher than interactions with other types of article
content. Second, we study what factors associate with image engagement and
observe that clicks on images occur more often in shorter articles and
articles about visual arts or transports and biographies of less well-known
people. Third, we look at interactions with Wikipedia article previews and
find that images help support reader information need when navigating
through the site, especially for more popular pages. The findings in this
study deepen our understanding of the role of images for free knowledge and
provide a guide for Wikipedia editors and web user communities to enrich
the world’s largest source of encyclopedic knowledge.

   - Paper: https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-021-00312-8


Visual gender biases in Wikipedia: A systematic evaluation across the ten
most spoken languages
By *Pablo Beytia, Catholic University of Chile*
The existing research suggests a significant gender gap in Wikipedia
biographical articles, with a minimal representation of women and gender
asymmetries in the textual content. However, the visual aspects of this gap
(e.g., image volume and quality) have received little attention. This study
examined asymmetries between women's and men's biographies, exploring
written and visual content across the ten most widely spoken languages. The
cross-lingual analysis reveals that (1) the most salient male biases appear
when editors select which personalities should have a Wikipedia page, (2)
the trends in written and visual content are dissimilar, (3) male
biographies tend to have more images across languages, and (4) female
biographies have better visual quality on average. The open database of
this study provides eight indicators of gender asymmetries in ten
occupational domains and ten languages. That information allows for a
granular view of gender biases, as well as exploring more macroscopic
phenomena, such as the similarity between Wikipedia versions according to
their gender bias structures.

   - Papers:

Beytía, P., Agarwal, P., Redi, M., & Singh, V. K. (2022). Visual Gender
Biases in Wikipedia: A Systematic Evaluation across the Ten Most Spoken
Languages. Proceedings of the International AAAI Conference on Web and
Social Media, 16(1), 43-54. https://doi.org/10.1609/icwsm.v16i1.19271
https://ojs.aaai.org/index.php/ICWSM/article/view/19271

Beytía, P. & Wagner, C. (2022). Visibility layers: a framework for
systematizing the gender gap in Wikipedia content. Internet Policy Review,
11(1). https://doi.org/10.14763/2022.1.1621
https://policyreview.info/articles/analysis/visibility-layers-framework-systematising-gender-gap-wikipedia-content

You can watch our past Research Showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

Hope you can join us!

Warm regards,
Emily

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
___
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/QYAFMVZU2XN5PSRKZN5CI6T4C5A66VP6/

[Wikimedia-l] [Wikimedia Research Showcase] March 15

2023-03-10 Thread Emily Lescak
Hi all,

The next Research Showcase, focused on Gender and Equity on Wikipedia, will
be live-streamed Wednesday, March 15, at 9:30 AM PDT / 16:30 UTC. Find your
local time here <https://zonestamp.toolforge.org/1678897840>.

YouTube stream: https://www.youtube.com/watch?v=lw4MzJgDIzo

You can join the conversation on IRC at #wikimedia-research. You can also
watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

This month's presentations:
Men are elected, women are married: events gender bias on Wikipedia
By *Jiao Sun, University of Southern California*
Human activities can be
seen as sequences of events, which are crucial to understanding societies.
Disproportional event distribution for different demographic groups can
manifest and amplify social stereotypes, and potentially jeopardize the
ability of members in some groups to pursue certain goals. In this paper,
we present the first event-centric study of gender biases in a Wikipedia
corpus. To facilitate the study, we curate a corpus of career and personal
life descriptions with demographic information consisting of 7,854
fragments from 10,412 celebrities. Then we detect events with a
state-of-the-art event detection model, calibrate the results using
strategically generated templates, and extract events that have asymmetric
associations with genders. Our study discovers that the Wikipedia pages
tend to intermingle personal life events with professional events for
females but not for males, which calls for the awareness of the Wikipedia
community to formalize guidelines and train the editors to mind the
implicit biases that contributors carry. Our work also lays the foundation
for future works on quantifying and discovering event biases at the corpus
level.

   - Paper: Sun, J. & Peng, N. (2021). Men Are Elected, Women Are Married:
   Events Gender Bias on Wikipedia. Proceedings of the 59th Annual Meeting of
   the Association for Computational Linguistics and the 11th International
   Conference on Natural Language Processing, 350-360.
   <https://aclanthology.org/2021.acl-short.45.pdf>


Twitter reacts to absence of women on Wikipedia: a mixed-methods analysis
of #VisibleWikiWomen campaign
By *Sneh Gupta, Guru Gobind Singh Indraprastha University*
Digital gender divide (DGD) is visible in access, participation,
representation, and biases against women embedded in Wikipedia, the largest
digital reservoir of co-created content. This article examined the content
of #VisibleWikiWomen, a global digital advocacy campaign aimed at
encouraging inclusion of women voices in the global technology conversation
and improving digital sustainability of feminist data on Wikipedia. In a
mixed-methods study, Sentiment Analysis followed by a Feminist Critical
Discourse Analysis of the campaign tweets reveals how digital gender divide
manifested in the public response. An overwhelming majority of tweets
expressed positive sentiment towards the objective of the campaign. An
inductive reading of the coded tweets (n = 1067) generated five themes:
Feminist Activism, Invisibility & Marginalization of Women, Technology for
Women Empowerment, Gendered Knowledge Inequity, and Power Dynamics in the
Digital Sphere. Twitter discourse presented many agitated digital users
calling out the epistemic injustice on Wikipedia that goes beyond the
invisibility of women. Their tweets reveal that they want an equal social
platform inclusive of women of color and varied identities currently absent
in the Wikipedia universe. Extracting ideas, values, and themes from new
media campaigns holds unparalleled potential in the diffusion of
interventions and messages on a larger scale.

   - Paper: Gupta, S., & Trehan, K. (2022). Twitter reacts to absence of
   women on Wikipedia: a mixed-methods analysis of #VisibleWikiWomen campaign.
   Media Asia, 49(2), 130-154.
   <https://www.researchgate.net/publication/356909618_Twitter_reacts_to_absence_of_women_on_Wikipedia_a_mixed-methods_analysis_of_VisibleWikiWomen_campaign>

Warm regards,

Emily

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
___
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/TKS5P5PAYRA3U4P5T2CVDVI5X3WSUSY5/

[Wikimedia-l] Re: [Wikimedia Research Showcase] February 15 at 9:30AM PT, 17:30 UTC

2023-02-15 Thread Emily Lescak
A reminder that this is starting in about an hour! We hope you can join us!

Best,
Emily

On Wed, Feb 8, 2023 at 2:27 PM Emily Lescak wrote:

> Hello everyone,
>
> The next Research Showcase will be livestreamed next Wednesday, February
> 15 at 9:30AM PT / 17:30 UTC. The theme is The Free Knowledge Ecosystem.
>
> YouTube stream: https://www.youtube.com/watch?v=8VJmR-3lTac
>
> We welcome you to join the conversation on IRC at #wikimedia-research. You
> can also watch our past research showcases:
> https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
>
> This month's presentations:
>
> The evolution of humanitarian mapping in OpenStreetMap (OSM) and how it
> affects map completeness and inequalities in OSM
> By *Benjamin Herfort, Heidelberg Institute for Geoinformation Technology*
> Mapping efforts of
> communities in OpenStreetMap (OSM) over the previous decade have created a
> unique global geographic database, which is accessible to all with no
> licensing costs. The collaborative maps of OSM have been used to support
> humanitarian efforts around the world as well as to fill important data
> gaps for implementing major development frameworks such as the Sustainable
> Development Goals (SDGs). Besides the well-examined Global North - Global
> South bias in OSM, the OSM data as of 2023 shows a much more spatially
> diverse spread pattern than previously considered, which was shaped by
> regional, socio-economic and demographic factors across several scales.
> Humanitarian mapping efforts of the previous decade have already made OSM
> more inclusive, contributing to diversify and expand the spatial footprint
> of the areas mapped. However, methods to quantify and account for the
> remaining biases in OSM’s coverage are needed so that researchers and
> practitioners will be able to draw the right conclusions, e.g., about
> progress towards the SDGs in cities.
>
>
> Dataset reuse: Toward translating principles to practice
> By *Laura Koesten, University of Vienna*
> The web provides access to millions of
> datasets. These data can have additional impact when used beyond the
> context for which they were originally created. But using a dataset beyond
> the context in which it originated remains challenging. Simply making data
> available does not mean it will be or can be easily used by others. At the
> same time, we have little empirical insight into what makes a dataset
> reusable and which of the existing guidelines and frameworks have an
> impact. In this talk, I will discuss our research on what makes data
> reusable in practice. This is informed by a synthesis of literature on the
> topic, our studies on how people evaluate and make sense of data, and a
> case study on datasets on GitHub. In the case study, we describe a corpus
> of more than 1.4 million data files from over 65,000 repositories. Building
> on reuse features from the literature, we use GitHub’s engagement metrics
> as proxies for dataset reuse and devise an initial model, using deep neural
> networks, to predict a dataset’s reusability. This demonstrates the
> practical gap between principles and actionable insights that might allow
> data publishers and tool designers to implement functionalities that
> facilitate reuse.
>
> We hope you can join us!
>
> Warm regards,
> Emily
>
>
> --
> Emily Lescak (she / her)
> Senior Research Community Officer
> The Wikimedia Foundation
>
___
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/SP4FQLZCMFONGUT6FZSNIBPTGERSOM2Z/

[Wikimedia-l] [Wikimedia Research Showcase] February 15 at 9:30AM PT, 17:30 UTC

2023-02-08 Thread Emily Lescak
Hello everyone,

The next Research Showcase will be livestreamed next Wednesday, February 15
at 9:30AM PT / 17:30 UTC. The theme is The Free Knowledge Ecosystem.

YouTube stream: https://www.youtube.com/watch?v=8VJmR-3lTac

We welcome you to join the conversation on IRC at #wikimedia-research. You
can also watch our past research showcases:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

This month's presentations:

The evolution of humanitarian mapping in OpenStreetMap (OSM) and how it
affects map completeness and inequalities in OSM
By *Benjamin Herfort, Heidelberg Institute for Geoinformation Technology*
Mapping efforts of
communities in OpenStreetMap (OSM) over the previous decade have created a
unique global geographic database, which is accessible to all with no
licensing costs. The collaborative maps of OSM have been used to support
humanitarian efforts around the world as well as to fill important data
gaps for implementing major development frameworks such as the Sustainable
Development Goals (SDGs). Besides the well-examined Global North - Global
South bias in OSM, the OSM data as of 2023 shows a much more spatially
diverse spread pattern than previously considered, which was shaped by
regional, socio-economic and demographic factors across several scales.
Humanitarian mapping efforts of the previous decade have already made OSM
more inclusive, contributing to diversify and expand the spatial footprint
of the areas mapped. However, methods to quantify and account for the
remaining biases in OSM’s coverage are needed so that researchers and
practitioners will be able to draw the right conclusions, e.g., about
progress towards the SDGs in cities.


Dataset reuse: Toward translating principles to practice
By *Laura Koesten, University of Vienna*
The web provides access to millions of datasets. These
data can have additional impact when used beyond the context for which they
were originally created. But using a dataset beyond the context in which it
originated remains challenging. Simply making data available does not mean
it will be or can be easily used by others. At the same time, we have
little empirical insight into what makes a dataset reusable and which of
the existing guidelines and frameworks have an impact. In this talk, I will
discuss our research on what makes data reusable in practice. This is
informed by a synthesis of literature on the topic, our studies on how
people evaluate and make sense of data, and a case study on datasets on
GitHub. In the case study, we describe a corpus of more than 1.4 million
data files from over 65,000 repositories. Building on reuse features from
the literature, we use GitHub’s engagement metrics as proxies for dataset
reuse and devise an initial model, using deep neural networks, to predict a
dataset’s reusability. This demonstrates the practical gap between
principles and actionable insights that might allow data publishers and
tool designers to implement functionalities that facilitate reuse.

We hope you can join us!

Warm regards,
Emily


-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
___
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/LR4Y4E5XH5HP7AVSUOJC5CWLQ37YBRTV/

[Wikimedia-l] Help us review the second round of Research Fund proposals

2023-01-18 Thread Emily Lescak
Hello,

In September, we announced [1] our second call for proposals to the
Wikimedia Research Fund [2]. Our submission deadline was December 16. We
are now in the exciting phase of reviewing submissions and making
recommendations for which proposals to advance to Stage II [3] and we
welcome your input.

We are using a two-phase review process consisting of a technical review
conducted by researchers and an open community process on Meta-Wiki [4]
before advancing to the next stages. On Meta-Wiki, you can read the
proposals under consideration and leave comments using our feedback form
linked from every proposal page.

We will review feedback provided by January 27th (23:59 AoE). If you have
any questions, please contact us at research_f...@wikimedia.org.

Thank you for your time.

Emily, on behalf of the Research Fund Organizing Committee [5]

[1]
<https://diff.wikimedia.org/2021/11/03/launch-of-the-wikimedia-research-fund/>
https://lists.wikimedia.org/hyperkitty/list/wiki-researc...@lists.wikimedia.org/thread/AIEGBXGQVGMQSCCQ6KOICGBTFHVQ4HSC/

[2]
<https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_%26_Technology_Fund#Wikimedia_Research_Fund>
https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_%26_Technology_Fund/Wikimedia_Research_Fund


[3]
<https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_%26_Technology_Fund>
https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_%26_Technology_Fund/Wikimedia_Research_Fund#How_we_fund


[4]
<https://meta.wikimedia.org/wiki/Category:Proposed_Wikimedia_Research_Fund_applications_in_FY_2021-22>
https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_%26_Technology_Fund/Wikimedia_Research_Fund#Review_submissions


[5]
<https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_%26_Technology_Fund#Organizing_Committee>
https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_%26_Technology_Fund/Wikimedia_Research_Fund#Organizing_Committee



-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
___
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/53R3HQMQMMLSEBL5RFWEVPK6C322BLRG/

[Wikimedia-l] [Wikimedia Research Showcase] January 18

2023-01-12 Thread Emily Lescak
Hello everyone,

The next Research Showcase, focused on Editor Retention, will be
live-streamed Wednesday, January 18. Find your local time here
<https://zonestamp.toolforge.org/1674063059>.

YouTube stream: https://www.youtube.com/watch?v=gS8ELcVZ8Q4

You can join the conversation on IRC at #wikimedia-research. You can also
watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

This month's presentations:
Vital Signs: Measuring Wikipedia Communities’ Health
By *Cristian Consonni, Eurecat - Centre Tecnològic de Catalunya, Barcelona*
Community health in
Wikipedia is a complex topic that has been at the center of discussion for
Wikipedia and the scientific community for years. Researchers observed that
the number of active editors for the largest Wikipedias started declining
after an initial phase of exponential growth. Some media outlets picked
this fact as a death announcement for the project, but the news of
Wikipedia's death turned out to be greatly exaggerated. However, it remains
true that researchers and community activists need to understand how to
measure community health and describe it more accurately. In this
presentation, we would like to go beyond the traditional metrics used to
describe the status of the community. We propose the creation of 6 sets of
language-independent indicators that we call "Vital Signs." We borrow the
analogy from the medical field, as these indicators represent a first step
in defining the health status of a community; they can constitute a
valuable reference point to foresee and prevent future risks. We present
our analysis for several Wikipedia language editions, showing that
communities renew their productive force even with stagnating absolute
numbers; we observe a general need for renewal in positions related to
particular functions or administratorship. We created a dashboard to
visualize all the indicators we have computed and hope that the communities
will find it helpful for improving their health.

   - Paper: Community Vital Signs: Measuring Wikipedia Communities’
   Sustainable Growth and Renewal
   <https://meta.wikimedia.org/wiki/File:Community_Vital_Signs_Research_Paper_-_Miquel_Laniado_Consonni.pdf>


Learning to Predict the Departure Dynamics of Wikidata Editors
By *Guangyuan Piao, Maynooth University*
Wikidata as one of the largest open collaborative
knowledge bases has drawn much attention from researchers and practitioners
since its launch in 2012. As it is collaboratively developed and maintained
by a community of a great number of volunteer editors, understanding and
predicting the departure dynamics of those editors are crucial but have not
been studied extensively in previous works. In this paper, we investigate
the synergistic effect of two different types of features: statistical and
pattern-based ones with DeepFM as our classification model which has not
been explored in a similar context and problem for predicting whether a
Wikidata editor will stay or leave the platform. Our experimental results
show that using the two sets of features with DeepFM provides the best
performance regarding AUROC (0.9561) and F1 score (0.8843), and achieves
substantial improvement compared to using either of the sets of features
and over a wide range of baselines.

   - Paper: Learning to Predict the Departure Dynamics of Wikidata Editors
   <https://parklize.github.io/publications/ISWC2021.pdf>



-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
___
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/L2PJZ3PAXWQ2MPUU4KBFHVQBEG3ZKVYL/

[Wikimedia-l] [Wikimedia Research Showcase] December 14

2022-12-09 Thread Emily Lescak
Hi all,

The next Research Showcase will be live-streamed next Wednesday, December
14. Find your local time here <https://zonestamp.toolforge.org/1671039024>.

The title of the Showcase is 'A year in review from the WMF Research team:
Tying our work to the research community.'

The Wikimedia Research community is key to tackling the many strategic
challenges of the Wikimedia movement. As we are ending the year, the
Research team will reflect on why working with the community is important
to us. We will share the initiatives, tools, and resources developed
throughout 2022 to bring the community together, facilitate researchers’
contributions to the Wikimedia projects, and encourage a diversity of
research questions.

YouTube stream: https://www.youtube.com/watch?v=a0ss9ckUlvQ

You can join the conversation on IRC at #wikimedia-research. You can also
watch our past Showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

Warm regards,

Emily

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
___
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/G6URAIYRGZODDGQWHFFUSL27TUK2J3RG/

[Wikimedia-l] [Wikimedia Research Showcase] November 16

2022-11-09 Thread Emily Lescak
Hello everyone,

The next Research Showcase will be live-streamed Wednesday, November 16, at
9:30 AM PST/17:30 UTC. Find your local time here
<https://zonestamp.toolforge.org/1668619830>.

YouTube stream: https://www.youtube.com/watch?v=sFanZoHjUnY

Members of the Research team will collect questions on IRC at
#wikimedia-research and YouTube.

This month's theme is 'Libraries and Wikipedia Knowledge.'

In the first talk, Laurie Bridges (Oregon State University) and Michael
David Miller (McGill University) will co-present on Wikipedia and Academic
Libraries.

Abstract: In 2021 an open-access edited book, Wikipedia and Academic
Libraries: A Global Project <https://doi.org/10.3998/mpub.11778416>, was
published, featuring 20 chapters from over 50 authors. In this
presentation, Laurie Bridges, one of the co-editors, will discuss the
process for creating and publishing an OA-edited book. Michael David
Miller, one of the chapter authors, will discuss his chapter about
contributions to local Québécois LGBTQ+ content in Francophone Wikipedia.


The second talk will be on Ethical Considerations of Including Gender
Information in Open Knowledge Platforms, presented by Nerissa Lindsey (San
Diego State University).

Abstract: In recent years, galleries, libraries, archives, and museums
(GLAMs) have sought to leverage open knowledge platforms such as Wikidata
to highlight or provide more visibility for traditionally marginalized
groups and their work, collections, or contributions. Efforts like Art +
Feminism, local edit-a-thons, and, more recently, GLAM institution-led
projects have promoted open knowledge initiatives to a broader audience of
participants. One such open knowledge project, the Program for Cooperative
Cataloging (PCC) Wikidata Pilot, has brought together over seventy GLAM
organizations to contribute linked open data for individuals associated
with their institutions, collections, or archives. However, these projects
have brought up ethical concerns around including potentially sensitive
personal demographic information, such as gender identity, sexual
orientation, race, and ethnicity, in entries in an open knowledge base
about living persons. GLAM institutions are thus in a position of balancing
open access with ethical cataloging, which should include adhering to the
personal preferences of the individuals whose data is being shared. People
working in libraries and archives have been increasingly focusing their
energies on issues of diversity, equity, and inclusion in their descriptive
practices, including remediating legacy data and addressing biased
language. Moving this work into a more public sphere and scaling up in
volume creates potential risks to the individuals being described. While
adding demographic information on living people to open knowledge bases has
the potential to enhance, highlight, and celebrate diversity, it could also
potentially be used to the detriment of the subjects through surveillance
and targeting activities. In our research we investigated the changing role
of metadata and open knowledge in addressing, or not addressing, issues of
under- and misrepresentation, especially as they pertain to gender identity
as described in the sex or gender property in Wikidata. We reported our
findings from a survey investigating how organizations participating in
open knowledge projects are addressing ethical concerns around including
personal demographic information as part of their projects, including what,
if any, policies they have implemented and what implications these
activities may have for the living people being described.

You can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

We hope you can join us!

Warm regards,

Emily, on behalf of the WMF Research team

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
___
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/BBEZBGNZFVMGVTA4C2KQT4D3URXYJFNL/
To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org

[Wikimedia-l] [Wikimedia Research Showcase] October 19

2022-10-17 Thread Emily Lescak
Hello everyone,

The next Research Showcase will be live-streamed Wednesday, October 19, at
9:30 AM PDT/16:30 UTC. Find your local time here
<https://zonestamp.toolforge.org/1666197004>.

YouTube stream: https://www.youtube.com/watch?v=ML-ULyARpU4

Members of the Research team will collect questions on IRC at
#wikimedia-research and YouTube.

This month's presentation is a panel discussion celebrating Wikidata's 10th
birthday!

October 2022 marks the tenth anniversary of the launch of Wikidata (
www.wikidata.org). In ten years, this project has become the largest
community-driven free knowledge graph in the world, enabling a common
knowledge base for Wikimedia projects. The language-independent nature of
Wikidata has greatly improved the maintenance and consistency of knowledge
across Wikipedia language editions, fostering knowledge equity in
Wikimedia. In addition, since Wikidata is a collaborative project that can
be read and edited by humans and machines alike, it is also widely used in
third-party applications delivering knowledge as a service for all. The
Wikimedia Research community has devoted significant effort and resources
to studying the foundations, capabilities and applications of Wikidata,
from the complex requirements of representing real-world knowledge in a
multilingual environment to the need to assess the quality of data and
sources in Wikidata. To learn more about the state of the art of Wikidata
and research challenges in the era of AI/ML, we will celebrate this tenth
anniversary with a panel that will bring together established
researchers/practitioners in this field.

The panel will be moderated by Denny Vrandečić (WMF) with panelists Lydia
Pintscher (WMDE), Elena Simperl (King's College London), Katherine Thornton
(Yale), and Markus Krötzsch (Technical University of Dresden).

You can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

We hope you can join us!

Warm regards,

Emily, on behalf of the WMF Research team

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/YXVNLVDZ7CML424BHQBE42JWQ76QLZUV/

[Wikimedia-l] Wikimedia Research Showcase July 20

2022-07-14 Thread Emily Lescak
Hi all,

The next Research Showcase, featuring the recipients of this year's
Wikimedia Foundation Research Awards of the Year, will be live-streamed
Wednesday, July 20, at 9:30 AM PDT/16:30 UTC. Find your local time here
<https://zonestamp.toolforge.org/1658334607>.

YouTube stream: https://www.youtube.com/watch?v=KMvXOQU5fX4

You are welcome to ask questions via YouTube chat or on IRC at
#wikimedia-research.

This month's presentations:

Wikipedia-based Image Text Dataset for Multimodal Multilingual Machine
Learning

By *Krishna Srinivasan (Google)*

The milestone improvements brought
about by deep representation learning and pre-training techniques have led
to large performance gains across downstream NLP, IR and Vision tasks.
Multimodal modeling techniques aim to leverage large high-quality
visio-linguistic datasets for learning complementary information across
image and text modalities. In this talk, I introduce the Wikipedia-based
Image Text (WIT) Dataset to better facilitate multimodal, multilingual
learning. WIT is composed of a curated set of 37.5 million entity-rich
image-text examples with 11.5 million unique images across 108 Wikipedia
languages.

WIT has several unique advantages. It is the largest multimodal dataset by
number of image-text examples, by a factor of 3 (at the time of writing).
It is massively multilingual (the first of its kind), covering 100+
languages. It also represents a more diverse set of concepts and real-world
entities than previous datasets cover.

WIT Dataset is available for download and use via a Creative Commons
license here: https://github.com/google-research-datasets/wit

I conclude the talk with future directions to expand and extend the WIT
dataset. Link to paper: https://arxiv.org/pdf/2103.01913.pdf

Assessing the Quality of Sources in Wikidata Across Languages

By *Gabriel Amaral (King's College London)*

Wikidata is one of the most important
sources of structured data on the web, built by a worldwide community of
volunteers. As a secondary source, its contents must be backed by credible
references; this is particularly important as Wikidata explicitly
encourages editors to add claims for which there is no broad consensus, as
long as they are corroborated by references. Nevertheless, despite this
essential link between content and references, Wikidata’s ability to
systematically assess and assure the quality of its references remains
limited. To this end, we carry out a mixed-methods study to determine the
relevance, ease of access, and authoritativeness of Wikidata references, at
scale and in different languages, using online crowdsourcing, descriptive
statistics, and machine learning. The findings help us ascertain the
quality of references in Wikidata, and identify common challenges in
defining and capturing the quality of user-generated multilingual
structured data on the web. Link to paper:
https://dl.acm.org/doi/abs/10.1145/3484828

You can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase


Emily, on behalf of the Research team

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/2UVESG4FRYOP5QENHFPA556H2UC5E5VG/

[Wikimedia-l] Invitation to Wikimedia Research Office Hours July 5, 2022

2022-06-27 Thread Emily Lescak
Hi all,


Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours Tuesday, 2022-07-05. Find your local time here
<https://zonestamp.toolforge.org/1657036800>.

To participate, join the video-call via this link [2]. There is no set
agenda - feel free to add your item to the list of topics in the etherpad
[3]. You are welcome to add questions / items to the etherpad in advance,
or when you arrive at the session. Even if you are unable to attend the
session, you can leave a question that we can address asynchronously. If
you do not have a specific agenda item, you are welcome to hang out and
enjoy the conversation. More detailed information (e.g., about how to
attend) can be found here [4].

Through these office hours, we aim to make ourselves available to answer
research-related questions that you as Wikimedia volunteer editors,
organizers, affiliates, staff, and researchers face in your projects and
initiatives. Here are some example cases we hope to be able to support you
with:

   - You have a specific research-related question that you suspect you
     should be able to answer with the publicly available data, but you
     don’t know how to find an answer for it, or you just need some more
     help with it. For example: how can I compute the ratio of anonymous to
     registered editors in my wiki?
   - You run into repetitive or very manual work as part of your Wikimedia
     contributions and you wish to find out if there are ways to use
     machines to improve your workflows. These types of questions can
     sometimes be harder to answer during an office hour. However,
     discussing them can help us understand your challenges better, and we
     may find ways to work with each other to support you in addressing
     them in the future.
   - You want to learn what the Research team at the Wikimedia Foundation
     does and how we can potentially support you. Specifically for
     affiliates: if you are interested in building relationships with the
     academic institutions in your country, we would love to talk with you
     and learn more. We have a series of programs that aim to expand the
     network of Wikimedia researchers globally, and we would love to
     collaborate more closely with those of you interested in this space.
   - You want to talk with us about one of our existing programs [5].


Hope to see many of you,

Emily, on behalf of the WMF Research Team

[1] https://research.wikimedia.org

[2] https://meet.jit.si/WMF-Research-Office-Hours

[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours

[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours

[5] https://research.wikimedia.org/projects.html

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/HN45PNQICAMLUR3XDWOSKSPS7RIPR5G3/

[Wikimedia-l] Wikimedia Research Showcase June 15

2022-06-08 Thread Emily Lescak
Hi all,

The next Research Showcase, *Wikipedia's Languages*, will be live-streamed
Wednesday, June 15, at 4:00 AM PDT/11:00 AM UTC. View your local time here
<https://zonestamp.toolforge.org/1655290800>.

YouTube stream: https://www.youtube.com/watch?v=AZQM1dtn3g0

You are welcome to ask questions via YouTube chat or on IRC at
#wikimedia-research.

This month's presentations:

Quantifying knowledge synchronisation in the 21st century

By *Jisung Yoon (Pohang University of Science and Technology)*

Humans acquire and accumulate
knowledge through language usage and eagerly exchange their knowledge for
advancement. Although geographical barriers had previously limited
communication, the emergence of information technology has opened new
avenues for knowledge exchange. However, it is unclear which communication
pathway is dominant in the 21st century. Here, we explore the dominant path
of knowledge diffusion in the 21st century using Wikipedia, the largest
communal dataset. We evaluate the similarity of shared knowledge between
population groups, distinguished based on their language usage. When
population groups are more engaged with each other, their knowledge
structure is more similar, where engagement is indicated by socio-economic
connections, such as cultural, linguistic, and historical features.
Moreover, geographical proximity is no longer a critical requirement for
knowledge dissemination. Furthermore, we integrate our data into a
mechanistic model to better understand the underlying mechanism and suggest
that the knowledge "Silk Road" of the 21st century is based online.


The Language Geography of Wikipedia

By *Martin Dittus*

Every language is a
system of being, doing, knowing, and imagining. With over 7,000 active
languages in the world, how many languages are fully represented online? To
answer this question, digital non-profit Whose Knowledge? initiated the
first ever report on the State of the Internet's Languages. As part of this
report, Martin Dittus and Mark Graham have investigated the languages of
Wikipedia. Wikipedia began with a single English-language edition more than
two decades ago, and now offers more than 300 language editions, which
places it at the forefront of digital language support. However, this does
not mean that speakers of these languages get access to the same content:
Wikipedia’s language editions vary widely in scale. We further find that
this inequality is also reflected in Wikipedia’s geographic coverage: not
all places are captured in every language. Wikipedia's coverage often
follows the global distribution of speakers of the respective language. Yet
even when we account for the distribution of language populations, certain
language communities are much more strongly represented on Wikipedia than
others. As a consequence, we find that for many countries in Africa,
Central and South America, and South Asia, most of the content about those
countries is in a foreign language, often a European-colonial language. In
other words, in many of these places, people may need to be able to speak a
second (possibly foreign) language in order to access Wikipedia information
about their own places. Why do we see these differences? And what can be
done to improve things?

You can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase


Emily, on behalf of the Research team

--
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/MX52TEGU5MFHKT2AYJ524V7FCRRI3JCJ/

[Wikimedia-l] Invitation to Wikimedia Research Office Hours June 7, 2022

2022-06-01 Thread Emily Lescak
Hi all,


Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours Tuesday, 2022-06-07. Find your local time here
<https://zonestamp.toolforge.org/1654642800>.

To participate, join the video-call via this link [2]. There is no set
agenda - feel free to add your item to the list of topics in the etherpad
[3]. You are welcome to add questions / items to the etherpad in advance,
or when you arrive at the session. Even if you are unable to attend the
session, you can leave a question that we can address asynchronously. If
you do not have a specific agenda item, you are welcome to hang out and
enjoy the conversation. More detailed information (e.g., about how to
attend) can be found here [4].

Through these office hours, we aim to make ourselves available to answer
research-related questions that you as Wikimedia volunteer editors,
organizers, affiliates, staff, and researchers face in your projects and
initiatives. Here are some example cases we hope to be able to support you
with:

   - You have a specific research-related question that you suspect you
     should be able to answer with the publicly available data, but you
     don’t know how to find an answer for it, or you just need some more
     help with it. For example: how can I compute the ratio of anonymous to
     registered editors in my wiki?
   - You run into repetitive or very manual work as part of your Wikimedia
     contributions and you wish to find out if there are ways to use
     machines to improve your workflows. These types of questions can
     sometimes be harder to answer during an office hour. However,
     discussing them can help us understand your challenges better, and we
     may find ways to work with each other to support you in addressing
     them in the future.
   - You want to learn what the Research team at the Wikimedia Foundation
     does and how we can potentially support you. Specifically for
     affiliates: if you are interested in building relationships with the
     academic institutions in your country, we would love to talk with you
     and learn more. We have a series of programs that aim to expand the
     network of Wikimedia researchers globally, and we would love to
     collaborate more closely with those of you interested in this space.
   - You want to talk with us about one of our existing programs [5].


Hope to see many of you,

Emily, on behalf of the WMF Research Team

[1] https://research.wikimedia.org

[2] https://meet.jit.si/WMF-Research-Office-Hours

[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours

[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours

[5] https://research.wikimedia.org/projects.html

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/74APMWMEM4B6VF6E6MSV5T5L4VSFAHHF/

[Wikimedia-l] Wikimedia Research Showcase May 18

2022-05-11 Thread Emily Lescak
Hello everyone,

The next Research Showcase, *Gaps and Biases in Wikipedia*, will be
live-streamed Wednesday, May 18, at 9:30 AM PDT/16:30 UTC. View your local
time here <https://zonestamp.toolforge.org/1652891400>.

YouTube stream: https://www.youtube.com/watch?v=Q8FlunZ0mH4

You are welcome to ask questions via YouTube chat or on IRC at
#wikimedia-research.

This month's presentations:

Ms. Categorized: Gender, notability, and inequality on Wikipedia

By Francesca Tripodi (University of North Carolina at Chapel Hill)

For the last five decades, sociologists have argued that gender is one of
the most pervasive and insidious forms of inequality. Research demonstrates
how these inequalities persist on Wikipedia - arguably the largest
encyclopedic reference in existence. Roughly eighty percent of Wikipedia's
editors are men and pages about women and women's interests are
underrepresented. English language Wikipedia contains more than 1.5 million
biographies about notable writers, inventors, and academics, but fewer than
nineteen percent of these biographies are about women. To try to improve
these statistics, activists host “edit-a-thons” to increase the visibility
of notable women. While this strategy helps create several biographies that
did not previously exist, it fails to address a more inconspicuous form of
gender exclusion. Drawing on ethnographic observations, interviews, and
quantitative analysis of web-scraped metadata, this talk demonstrates that
women’s biographies are more frequently considered non-notable and
nominated for deletion compared to men’s biographies. This disproportionate
rate is another dimension of gender inequality on Wikipedia previously
unexplored by social scientists and provides broader insights into how
women’s achievements are (under)valued in society.

Controlled Analyses of Social Biases in Wikipedia Bios

By Yulia Tsvetkov (University of Washington)

Social biases on Wikipedia could greatly influence public opinion.
Wikipedia is also a popular source of training data for NLP models, and
subtle biases in Wikipedia narratives are liable to be amplified in
downstream NLP models. In this talk I'll present two approaches to
unveiling social biases in how people are described on Wikipedia, across
demographic attributes and across languages. First, I'll present a
methodology that isolates dimensions of interest (e.g., gender) from other
attributes (e.g., occupation). This methodology allows us to quantify
systemic differences in coverage of different genders and races, while
controlling for confounding factors. Next, I'll show an NLP case study that
uses this methodology in combination with people-centric sentiment analysis
to identify disparities in Wikipedia bios of members of the LGBTQIA+
community across three languages: English, Russian, and Spanish. Our
results surface cultural differences in narratives and signs of social
biases. Practically, these methods can be used to automatically identify
Wikipedia articles for further manual analysis—articles that might contain
content gaps or an imbalanced representation of particular social groups.


You can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase


Emily, on behalf of the Research team

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/5IRR5N34IEZKIR3TQPZEMREKZHN4ZXKR/

[Wikimedia-l] Invitation to Wikimedia Research Office Hours May 3, 2022

2022-04-26 Thread Emily Lescak
Hi all,


Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours Tuesday, 2022-05-03. Find your local time here
<https://zonestamp.toolforge.org/1651593600>.

To participate, join the video-call via this link [2]. There is no set
agenda - feel free to add your item to the list of topics in the etherpad
[3]. You are welcome to add questions / items to the etherpad in advance,
or when you arrive at the session. Even if you are unable to attend the
session, you can leave a question that we can address asynchronously. If
you do not have a specific agenda item, you are welcome to hang out and
enjoy the conversation. More detailed information (e.g., about how to
attend) can be found here [4].

Through these office hours, we aim to make ourselves available to answer
research-related questions that you as Wikimedia volunteer editors,
organizers, affiliates, staff, and researchers face in your projects and
initiatives. Here are some example cases we hope to be able to support you
with:

   - You have a specific research-related question that you suspect you
     should be able to answer with the publicly available data, but you
     don’t know how to find an answer for it, or you just need some more
     help with it. For example: how can I compute the ratio of anonymous to
     registered editors in my wiki?
   - You run into repetitive or very manual work as part of your Wikimedia
     contributions and you wish to find out if there are ways to use
     machines to improve your workflows. These types of questions can
     sometimes be harder to answer during an office hour. However,
     discussing them can help us understand your challenges better, and we
     may find ways to work with each other to support you in addressing
     them in the future.
   - You want to learn what the Research team at the Wikimedia Foundation
     does and how we can potentially support you. Specifically for
     affiliates: if you are interested in building relationships with the
     academic institutions in your country, we would love to talk with you
     and learn more. We have a series of programs that aim to expand the
     network of Wikimedia researchers globally, and we would love to
     collaborate more closely with those of you interested in this space.
   - You want to talk with us about one of our existing programs [5].


Hope to see many of you,

Emily, on behalf of the WMF Research Team

[1] https://research.wikimedia.org

[2] https://meet.jit.si/WMF-Research-Office-Hours

[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours

[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours

[5] https://research.wikimedia.org/projects.html



-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/4Z36WIDMBEAV7X4X3OO32BXY4RZX4DRW/

[Wikimedia-l] Invitation to Wikimedia Research Office Hours April 5, 2022

2022-03-30 Thread Emily Lescak
Hi all,


Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours this Tuesday, 2022-04-05. Find your local time here
<https://zonestamp.toolforge.org/1649199600>.

To participate, join the video-call via this link [2]. There is no set
agenda - feel free to add your item to the list of topics in the etherpad
[3]. You are welcome to add questions / items to the etherpad in advance,
or when you arrive at the session. Even if you are unable to attend the
session, you can leave a question that we can address asynchronously. If
you do not have a specific agenda item, you are welcome to hang out and
enjoy the conversation. More detailed information (e.g., about how to
attend) can be found here [4].

Through these office hours, we aim to make ourselves available to answer
research-related questions that you as Wikimedia volunteer editors,
organizers, affiliates, staff, and researchers face in your projects and
initiatives. Here are some example cases we hope to be able to support you
with:

   - You have a specific research-related question that you suspect you
     should be able to answer with the publicly available data, but you
     don’t know how to find an answer for it, or you just need some more
     help with it. For example: how can I compute the ratio of anonymous to
     registered editors in my wiki?
   - You run into repetitive or very manual work as part of your Wikimedia
     contributions and you wish to find out if there are ways to use
     machines to improve your workflows. These types of questions can
     sometimes be harder to answer during an office hour. However,
     discussing them can help us understand your challenges better, and we
     may find ways to work with each other to support you in addressing
     them in the future.
   - You want to learn what the Research team at the Wikimedia Foundation
     does and how we can potentially support you. Specifically for
     affiliates: if you are interested in building relationships with the
     academic institutions in your country, we would love to talk with you
     and learn more. We have a series of programs that aim to expand the
     network of Wikimedia researchers globally, and we would love to
     collaborate more closely with those of you interested in this space.
   - You want to talk with us about one of our existing programs [5].


To improve the impact and accessibility of our sessions, we invite you to
share your feedback in a brief optional survey [6]. We estimate that it
will take about 5-10 minutes to complete. We welcome your input even if you
have not attended Office Hours. If you prefer to not respond via Google
form, you can provide your feedback via email. We will accept responses
until April 15, 2022.


Hope to see many of you,
Emily on behalf of the WMF Research Team

[1] https://research.wikimedia.org

[2] https://meet.jit.si/WMF-Research-Office-Hours

[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours

[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours

[5] https://research.wikimedia.org/projects.html

[6] https://forms.gle/Y5zJ7gunk4RvqvJX8

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/AWS4SV3TQUS4CZMOB6YH3ML5AIZ6WOEZ/

[Wikimedia-l] [Wikimedia Research Showcase] March 16

2022-03-10 Thread Emily Lescak
Hi all,

The next Research Showcase will be live-streamed Wednesday, March 16 at
6:30AM PT / 13:30 UTC. Find your local time here:
https://zonestamp.toolforge.org/1647437436.

The theme is: Patterns and dynamics of article quality.

YouTube stream: https://www.youtube.com/watch?v=o5e6S7ac4q4

You can join the conversation on IRC at #wikimedia-research. You can also
watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase.

The Showcase will feature the following talks:

Quality monitoring in Wikipedia - A computational perspective

By *Animesh Mukherjee <https://cse.iitkgp.ac.in/~animeshm/> (Indian Institute
of Technology, Kharagpur)*

In this talk, I shall summarize the highlights of our five years of research
on Wikipedia. In particular, I shall dive deep into two of our recent works:
the first attempts to understand the early indications of which editors
will soon go "missing" (aka missing editors) [1], while the second
investigates how the quality of a Wikipedia article transitions over time
and whether computational models can be built to understand the
characteristics of future transitions [2]. In each case, I will present a
suite of key results and the main insights that we obtained from them.

[1] When expertise gone missing: Uncovering the loss of prolific
contributors in Wikipedia
<https://link.springer.com/chapter/10.1007/978-3-030-91669-5_23>, ICADL
2021 (pdf <https://arxiv.org/pdf/2109.09979>)

[2] Quality Change: norm or exception? Measurement, Analysis and Detection
of Quality Change in Wikipedia <https://arxiv.org/abs/2111.01496>, CSCW 2022
(pdf <https://arxiv.org/pdf/2111.01496>)


Automatically Labeling Low Quality Content on Wikipedia by Leveraging
Editing Behaviors

By *Sumit Asthana <http://sumitasthana.xyz/> (University of Michigan, Ann
Arbor)*

Wikipedia articles aim to be definitive sources of
encyclopedic content. Yet, only 0.6% of Wikipedia articles have high
quality according to its quality scale due to an insufficient number of
Wikipedia editors and an enormous number of articles. Supervised Machine
Learning (ML) quality improvement approaches that can automatically
identify and fix content issues rely on manual labels of individual
Wikipedia sentence quality. However, current labeling approaches are
tedious and produce noisy labels. In this talk, I will discuss an automated
labeling approach that identifies the semantic category (e.g., adding
citations, clarifications) of historic Wikipedia edits and uses the
modified sentences prior to the edit as examples that require that semantic
improvement. Highest-rated article sentences are examples that no longer
need semantic improvements. I will discuss the performance of models
trained with this labeling approach over models trained with existing
labeling approaches, and also the implications of such a large scale
semi-supervised labeling approach in capturing the editing practices of
Wikipedia editors and helping them improve articles faster.
Related paper: Automatically Labeling Low Quality Content on Wikipedia By
Leveraging Patterns in Editing Behaviors
<https://dl.acm.org/doi/10.1145/3479503>, CSCW 2021 (pdf
<https://arxiv.org/pdf/2108.02252>)
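
The labeling idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the tuple layout of `edits`, the function name, and the `no_improvement_needed` label are all hypothetical, chosen only to show how pre-edit sentences and highest-rated sentences might be turned into training examples.

```python
def build_training_examples(edits, top_rated_sentences):
    """Turn historic edits into (sentence, label) pairs.

    edits: iterable of (sentence_before, sentence_after, category) tuples,
           where category names the kind of improvement the edit made
           (hypothetical layout, assumed for illustration).
    top_rated_sentences: sentences taken from highest-rated articles.
    """
    examples = []
    for before, _after, category in edits:
        # The sentence as it stood before the edit is treated as an
        # example that needed this category of improvement.
        examples.append((before, category))
    for sentence in top_rated_sentences:
        # Sentences from highest-rated articles are treated as examples
        # that no longer need improvement (label name is hypothetical).
        examples.append((sentence, "no_improvement_needed"))
    return examples


edits = [
    ("The city is very old.", "The city was founded in 1204 [1].", "citation"),
    ("It was big.", "The building was 40 m tall.", "clarification"),
]
examples = build_training_examples(edits, ["The river is 120 km long [2]."])
```

A classifier trained on such pairs would then predict which kind of improvement a new sentence needs; the choice of model is outside what the abstract specifies.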

--
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
___
Wikimedia-l mailing list -- wikimedia-l@lists.wikimedia.org, guidelines at: 
https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and 
https://meta.wikimedia.org/wiki/Wikimedia-l
Public archives at 
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/7CL45CQQEPUV7J6OV3Q665ABP3CX4YOO/
To unsubscribe send an email to wikimedia-l-le...@lists.wikimedia.org

[Wikimedia-l] Invitation to Wikimedia Research Office Hours March 1, 2022

2022-02-24 Thread Emily Lescak
Hi all,


Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours this Tuesday, 2022-03-01, at 12:00-13:00 UTC (4:00 PT / 7:00
ET / 13:00 CET). Find your local time here
<https://zonestamp.toolforge.org/1646136000>.

To participate, join the video-call via this link [2]. There is no set
agenda - feel free to add your item to the list of topics in the etherpad
[3]. You are welcome to add questions / items to the etherpad in advance,
or when you arrive at the session. Even if you are unable to attend the
session, you can leave a question that we can address asynchronously. If
you do not have a specific agenda item, you are welcome to hang out and
enjoy the conversation. More detailed information (e.g. about how to
attend) can be found here [4].

Through these office hours, we aim to make ourselves more available to
answer research related questions that you as Wikimedia volunteer editors,
organizers, affiliates, staff, and researchers face in your projects and
initiatives. Here are some example cases we hope to be able to support you
with:

   - You have a specific research related question that you suspect you
     should be able to answer with the publicly available data and you don’t
     know how to find an answer for it, or you just need some more help with it.
     For example, how can I compute the ratio of anonymous to registered editors
     in my wiki?
   - You run into repetitive or very manual work as part of your Wikimedia
     contributions and you wish to find out if there are ways to use machines to
     improve your workflows. These types of conversations can sometimes be
     harder to find an answer for during an office hour. However, discussing
     them can help us understand your challenges better and we may find ways to
     work with each other to support you in addressing it in the future.
   - You want to learn what the Research team at the Wikimedia Foundation
     does and how we can potentially support you. Specifically for affiliates:
     if you are interested in building relationships with the academic
     institutions in your country, we would love to talk with you and learn
     more. We have a series of programs that aim to expand the network of
     Wikimedia researchers globally and we would love to collaborate with those
     of you interested more closely in this space.
   - You want to talk with us about one of our existing programs [5].
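
As an illustration of the example question in the first item above, here is a minimal sketch of computing the ratio of anonymous to registered editors from a sample of edit records. The record shape (a dict with a "user" name and an optional boolean "anon" flag) and the function name are assumptions made for this sketch; how the records are obtained from a given wiki is left open.

```python
def anon_to_registered_ratio(edits):
    """Return distinct anonymous editors divided by distinct registered editors.

    edits: iterable of dicts with a "user" key and an optional boolean
           "anon" key (hypothetical record shape, assumed for illustration).
    """
    anon, registered = set(), set()
    for edit in edits:
        # Count each editor once, however many edits they made.
        (anon if edit.get("anon") else registered).add(edit["user"])
    if not registered:
        raise ValueError("no registered editors in the sample")
    return len(anon) / len(registered)


sample = [
    {"user": "203.0.113.7", "anon": True},
    {"user": "Alice"},
    {"user": "Alice"},
    {"user": "Bob"},
    {"user": "Carol"},
    {"user": "198.51.100.2", "anon": True},
    {"user": "203.0.113.7", "anon": True},
]
ratio = anon_to_registered_ratio(sample)  # 2 anonymous / 3 registered
```

Counting distinct editors rather than edits is a design choice here; a ratio of edit counts would answer a different question.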


Hope to see many of you,
Emily on behalf of the WMF Research Team

[1] https://research.wikimedia.org

[2] https://meet.jit.si/WMF-Research-Office-Hours

[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours

[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours

[5] https://research.wikimedia.org/projects.html

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/KYIITZB3BRQ45JR2Q7THJ63NLNI4JSIO/

[Wikimedia-l] [Wikimedia Research Showcase] February 16 at 9:30 PT, 17:30 UTC

2022-02-10 Thread Emily Lescak
Hi all,

The next Research Showcase will be live-streamed next Wednesday, February
16, at 9:30 PT/17:30 UTC. The theme is: Collective Attention in Wikipedia.

YouTube stream: https://www.youtube.com/watch?v=bg2aE2m08Qo

As usual, you can join the conversation on IRC at #wikimedia-research. You
can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

The Showcase will feature the following talks:
Modeling Collective Anticipation and Response on Wikipedia
By *Renaud Lambiotte <https://www.maths.ox.ac.uk/people/renaud.lambiotte>
(University of Oxford)*
The dynamics of popularity in online media are driven by a
combination of endogenous spreading mechanisms and response to exogenous
shocks including news and events. However, little is known about the
dependence of temporal patterns of popularity on event-related information,
e.g. which types of events trigger long-lasting activity. Here we propose a
simple model that describes the dynamics around peaks of popularity by
incorporating key features, i.e., the anticipatory growth and the decay of
collective attention together with circadian rhythms. The proposed model
allows us to develop a new method for predicting the future page view
activity and for clustering time series. To validate our methodology, we
collect a corpus of page view data from Wikipedia associated with a range of
planned events, that is, events which we know in advance will have a fixed
date in the future, such as elections and sports events. Our methodology is
superior to existing models in both prediction and clustering tasks.
Furthermore, restricting to Wikipedia pages associated with association
football, we observe that the specific realization of the event, in our
case which team wins a match or the type of the match, has a significant
effect on the response dynamics after the event. Our work demonstrates the
importance of appropriately modeling all phases of collective attention, as
well as the connection between temporal patterns of attention and
characteristic underlying information of the events they represent.


Sudden Attention Shifts on Wikipedia During the COVID-19 Crisis
By *Kristina Gligorić <https://kristinagligoric.github.io/> (EPFL)*
We study how the
COVID-19 pandemic, alongside the severe mobility restrictions that ensued,
has impacted information access on Wikipedia, the world’s largest online
encyclopedia. A longitudinal analysis that combines pageview statistics for
12 Wikipedia language editions with mobility reports published by Apple and
Google reveals massive shifts in the volume and nature of information
seeking patterns during the pandemic. Interestingly, while we observe a
transient increase in Wikipedia’s pageview volume following mobility
restrictions, the nature of information sought was impacted more
permanently. These changes are most pronounced for language editions
associated with countries where the most severe mobility restrictions were
implemented. We also find that articles belonging to different topics
behaved differently; e.g., attention towards entertainment-related topics
is lingering and even increasing, while the interest in health- and
biology-related topics was either small or transient. Our results highlight
the utility of Wikipedia for studying how the pandemic is affecting
people’s needs, interests, and concerns.
-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/CJ2HG2VYMLNYXHNR74DOFAGL4FPUSVK6/

[Wikimedia-l] Invitation to Wikimedia Research February Office Hours

2022-01-28 Thread Emily Lescak
Hi all,


Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours this Wednesday, 2022-02-02 at 00:00-1:00 UTC (16:00 PT 02-01 /
19:00 ET 02-01 / 1:00 CET 02-02). Find your local date and time here
<https://zonestamp.toolforge.org/1643760056>.

To participate, join the video-call via this link [2]. There is no set
agenda - feel free to add your item to the list of topics in the etherpad
[3]. If you do not have a specific agenda item, you are welcome to hang out
and enjoy the conversation. More detailed information (e.g., about how to
attend) can be found here [4].

Through these office hours, we aim to make ourselves more available to
answer research related questions that you as Wikimedia volunteer editors,
organizers, affiliates, staff, and researchers face in your projects and
initiatives. Here are some example cases we hope to be able to support you
with:

   - You have a specific research related question that you suspect you
     should be able to answer with the publicly available data and you don’t
     know how to find an answer for it, or you just need some more help with it.
     For example, how can I compute the ratio of anonymous to registered editors
     in my wiki?
   - You run into repetitive or very manual work as part of your Wikimedia
     contributions and you wish to find out if there are ways to use machines to
     improve your workflows. These types of conversations can sometimes be
     harder to find an answer for during an office hour. However, discussing
     them can help us understand your challenges better and we may find ways to
     work with each other to support you in addressing it in the future.
   - You want to learn what the Research team at the Wikimedia Foundation
     does and how we can potentially support you. Specifically for affiliates:
     if you are interested in building relationships with the academic
     institutions in your country, we would love to talk with you and learn
     more. We have a series of programs that aim to expand the network of
     Wikimedia researchers globally and we would love to collaborate with those
     of you interested more closely in this space.
   - You want to talk with us about one of our existing programs [5].


Hope to see many of you,
Emily on behalf of the WMF Research Team

[1] https://research.wikimedia.org

[2] https://meet.jit.si/WMF-Research-Office-Hours

[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours

[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours

[5] https://research.wikimedia.org/projects.html



-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/R3UTAWBHZ74SPCOPVR57U6MEQCXWP64R/

[Wikimedia-l] Help us review the first round of Research Fund proposals

2022-01-19 Thread Emily Lescak
Hello,

In November, we announced [1] our first call for proposals to the Wikimedia
Research Fund [2]. Our submission deadline was January 3rd. We are now in
the exciting phase of reviewing submissions and making recommendations for
which proposals to advance to Stage II [3] and we welcome your input.

We are using a two-phase review process consisting of a technical review
conducted by researchers and an open community process on Meta [4] before
advancing to the next stages. On Meta, you can read the 33 proposals under
consideration from more than 20 countries and leave comments using our
feedback form linked from every proposal page.

The deadline for feedback is February 7th (23:59 AoE). If you have any
questions, please contact us at research_f...@wikimedia.org.

Thank you for your time.

Emily, on behalf of the Research Fund Organizing Committee [5]

[1]
https://diff.wikimedia.org/2021/11/03/launch-of-the-wikimedia-research-fund/

[2]
https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_%26_Technology_Fund#Wikimedia_Research_Fund


[3]
https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_%26_Technology_Fund


[4]
https://meta.wikimedia.org/wiki/Category:Proposed_Wikimedia_Research_Fund_applications_in_FY_2021-22


[5]
https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_%26_Technology_Fund#Organizing_Committee



-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/CYD47DQUGSHDZ33HFF7VNXTVNCRNGS5N/

[Wikimedia-l] [Wikimedia Research Showcase] January 19 at 9:30 AM PST, 17:30 UTC

2022-01-14 Thread Emily Lescak
Hi all,

The next Research Showcase will be live-streamed next Wednesday, January
19, at 9:30 AM PST/17:30 UTC. The theme is: Beyond English Wikipedia.

YouTube stream: https://www.youtube.com/watch?v=PRaCa-v8nfQ

As usual, you can join the conversation on IRC at #wikimedia-research. You
can also watch our past research showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

The Showcase will feature the following talks:
Comparing Language Communities - Characterizing Collaboration in the
English, French and Spanish Language Editions of Wikipedia
By *Taryn Bipat <https://tarynbipat.me/> (Microsoft, formerly University of
Washington)*
Is Wikipedia a standardized platform with a common model of collaboration or
is it a set of 312 active language editions with distinct collaborative
models? In the last 20 years, researchers have extensively analyzed the
complexities of group work that enable the creation of quality articles in
the English Wikipedia, but most of our intellectual assumptions about
collaborative practices on Wikipedia remain solely based on an Anglocentric
perspective. This research extends the current Anglocentric body of
literature in human-computer interaction (HCI) and computer-supported
cooperative work (CSCW) through three studies that mutually help build an
understanding of collaboration models in the English (EN), French (FR), and
Spanish (ES) editions of Wikipedia. In the first study, I replicated a
model by Viégas et al. (2007) based on editors' behaviors in the English
Wikipedia. This model was used as a lens to examine collaborative activity
in EN, FR, and ES. In the second study, I leveraged a collaboration model
by Kriplean et al. (2007) that suggested editors used “power plays” – how
groups of editors claim control over article content through the discourse
of Wikipedia policy – in their talk page debates to justify their edits
made on articles. In the third study, I interviewed editors from each
language edition to build a typology of collaborative behavior and further
understand editors' perceptions of power and authority on Wikipedia.

Understanding Wikipedia Practices Through Hindi, Urdu, and English Takes on
an Evolving Regional Conflict
By *Jacob Thebault-Spieker <https://jacob.thebault-spieker.com/>
(Information School, University of Wisconsin – Madison)*
Wikipedia is the product of thousands of editors
working collaboratively to provide free and up-to-date encyclopedic
information to the project’s users. This article asks to what degree
Wikipedia articles in three languages — Hindi, Urdu, and English — achieve
Wikipedia’s mission of making neutrally-presented, reliable information on
a polarizing, controversial topic available to people around the globe. We
chose the topic of the recent revocation of Article 370 of the Constitution
of India, which, along with other recent events in and concerning the
region of Jammu and Kashmir, has drawn attention to related articles on
Wikipedia. This work focuses on the English Wikipedia, being the preeminent
language edition of the project, as well as the Hindi and Urdu editions.
Hindi and Urdu are the two standardized varieties of Hindustani, a lingua
franca of Jammu and Kashmir. We analyzed page view and revision data for
three Wikipedia articles to gauge popularity of the pages in our corpus,
and responsiveness of editors to breaking news events and problematic
edits. Additionally, we interviewed editors from all three language
editions to learn about differences in editing processes and motivations,
and we compared the text of the articles across languages as they appeared
shortly after the revocation of Article 370. Across languages, we saw
discrepancies in article tone, organization, and the information presented,
as well as differences in how editors collaborate and communicate with one
another. Nevertheless, in Hindi and Urdu, as well as English, editors
predominantly try to adhere to the principle of neutral point of view
(NPOV), and for the most part, the editors quash attempts by other editors
to push political agendas.

Best regards,
Emily

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/2IIFRHDTJSNCTRGYE26C5YTOT5L6HSX2/

[Wikimedia-l] Invitation to Wikimedia Research Office Hours January 11, 2022

2022-01-10 Thread Emily Lescak
Hi all,


Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours this Tuesday, 2022-01-11, at 12:00-13:00 UTC (4:00 PT / 7:00
ET / 13:00 CET). View your local time here
<https://zonestamp.toolforge.org/1641902452>. Please note the time change!
We are experimenting with our Office hours schedules to make our sessions
more globally welcoming.

To participate, join the video-call via this link [2]. There is no set
agenda - feel free to add your item to the list of topics in the etherpad
[3]. You are welcome to add questions / items to the etherpad in advance,
or when you arrive at the session. Even if you are unable to attend the
session, you can leave a question that we can address asynchronously. If
you do not have a specific agenda item, you are welcome to hang out and
enjoy the conversation. More detailed information (e.g. about how to
attend) can be found here [4].

Through these office hours, we aim to make ourselves more available to
answer research related questions that you as Wikimedia volunteer editors,
organizers, affiliates, staff, and researchers face in your projects and
initiatives. Here are some example cases we hope to be able to support you
with:

   - You have a specific research related question that you suspect you
     should be able to answer with the publicly available data and you don’t
     know how to find an answer for it, or you just need some more help with it.
     For example, how can I compute the ratio of anonymous to registered editors
     in my wiki?
   - You run into repetitive or very manual work as part of your Wikimedia
     contributions and you wish to find out if there are ways to use machines to
     improve your workflows. These types of conversations can sometimes be
     harder to find an answer for during an office hour. However, discussing
     them can help us understand your challenges better and we may find ways to
     work with each other to support you in addressing it in the future.
   - You want to learn what the Research team at the Wikimedia Foundation
     does and how we can potentially support you. Specifically for affiliates:
     if you are interested in building relationships with the academic
     institutions in your country, we would love to talk with you and learn
     more. We have a series of programs that aim to expand the network of
     Wikimedia researchers globally and we would love to collaborate with those
     of you interested more closely in this space.
   - You want to talk with us about one of our existing programs [5].


Hope to see many of you,
Emily on behalf of the WMF Research Team

[1] https://research.wikimedia.org

[2] https://meet.jit.si/WMF-Research-Office-Hours

[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours

[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours

[5] https://research.wikimedia.org/projects.html



-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/Z3IWP4QD4BBMGUTZFYV6CVT74JKPV3LO/

[Wikimedia-l] Invitation to Wikimedia Research December Office Hours

2021-12-05 Thread Emily Lescak
Hi all,


Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours this Wednesday, 2021-12-08 at 00:00-1:00 UTC (16:00 PT 12-07 /
19:00 ET 12-07 / 1:00 CET 12-08). Find your local date and time here
<https://zonestamp.toolforge.org/1638921637>. Please note the time change!
We are experimenting with our Office hours schedules to make our sessions
more globally welcoming.

To participate, join the video-call via this link [2]. There is no set
agenda - feel free to add your item to the list of topics in the etherpad
[3]. You are welcome to add questions / items to the etherpad in advance,
or when you arrive at the session. Even if you are unable to attend, you
can leave a question that we can address asynchronously. If you do not have
a specific agenda item, you are welcome to hang out and enjoy the
conversation. More detailed information (e.g. about how to attend) can be
found here [4].

Through these office hours, we aim to make ourselves more available to
answer research related questions that you as Wikimedia volunteer editors,
organizers, affiliates, staff, and researchers face in your projects and
initiatives. Here are some example cases we hope to be able to support you
with:

   - You have a specific research related question that you suspect you
     should be able to answer with the publicly available data and you don’t
     know how to find an answer for it, or you just need some more help with it.
     For example, how can I compute the ratio of anonymous to registered editors
     in my wiki?
   - You run into repetitive or very manual work as part of your Wikimedia
     contributions and you wish to find out if there are ways to use machines to
     improve your workflows. These types of conversations can sometimes be
     harder to find an answer for during an office hour. However, discussing
     them can help us understand your challenges better and we may find ways to
     work with each other to support you in addressing it in the future.
   - You want to learn what the Research team at the Wikimedia Foundation
     does and how we can potentially support you. Specifically for affiliates:
     if you are interested in building relationships with the academic
     institutions in your country, we would love to talk with you and learn
     more. We have a series of programs that aim to expand the network of
     Wikimedia researchers globally and we would love to collaborate with those
     of you interested more closely in this space.
   - You want to talk with us about one of our existing programs [5].


This is also a good opportunity to learn more about the Research Fund [6]!

Hope to see many of you,
Emily on behalf of the WMF Research Team

[1] https://research.wikimedia.org

[2] https://meet.jit.si/WMF-Research-Office-Hours

[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours

[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours

[5] https://research.wikimedia.org/projects.html

[6]
https://meta.wikimedia.org/wiki/Grants:Programs/Wikimedia_Research_%26_Technology_Fund

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/TJPEVGUNNE3GXBWOWKYKZDIM5D5UM5YF/

[Wikimedia-l] Invitation to Wikimedia Research Office hours November 2, 2021

2021-10-28 Thread Emily Lescak
Hi all,


Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours this Tuesday, 2021-11-02, at 12:00-13:00 UTC (5am PT/8am
ET/1pm CET). Please note the time change! We are experimenting with our
Office hours schedules to make our sessions more globally welcoming.

To participate, join the video-call via this link [2]. There is no set
agenda - feel free to add your item to the list of topics in the etherpad
[3]. You are welcome to add questions / items to the etherpad in advance,
or when you arrive at the session. Even if you are unable to attend the
session, you can leave a question that we can address asynchronously. If
you do not have a specific agenda item, you are welcome to hang out and
enjoy the conversation. More detailed information (e.g. about how to
attend) can be found here [4].

Through these office hours, we aim to make ourselves more available to
answer research related questions that you as Wikimedia volunteer editors,
organizers, affiliates, staff, and researchers face in your projects and
initiatives. Here are some example cases we hope to be able to support you
with:

   - You have a specific research related question that you suspect you
     should be able to answer with the publicly available data and you don’t
     know how to find an answer for it, or you just need some more help with it.
     For example, how can I compute the ratio of anonymous to registered editors
     in my wiki?
   - You run into repetitive or very manual work as part of your Wikimedia
     contributions and you wish to find out if there are ways to use machines to
     improve your workflows. These types of conversations can sometimes be
     harder to find an answer for during an office hour. However, discussing
     them can help us understand your challenges better and we may find ways to
     work with each other to support you in addressing it in the future.
   - You want to learn what the Research team at the Wikimedia Foundation
     does and how we can potentially support you. Specifically for affiliates:
     if you are interested in building relationships with the academic
     institutions in your country, we would love to talk with you and learn
     more. We have a series of programs that aim to expand the network of
     Wikimedia researchers globally and we would love to collaborate with those
     of you interested more closely in this space.
   - You want to talk with us about one of our existing programs [5].


Hope to see many of you,
Emily on behalf of the WMF Research Team

[1] https://research.wikimedia.org

[2] https://meet.jit.si/WMF-Research-Office-Hours

[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours

[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours

[5] https://research.wikimedia.org/projects.html


-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Public archives at
https://lists.wikimedia.org/hyperkitty/list/wikimedia-l@lists.wikimedia.org/message/2KM53BLGRMSHGPFI23A3EO2UBVYTU3H5/

[Wikimedia-l] Announcing the Wikipedia Image / Caption Matching Competition

2021-10-08 Thread Emily Lescak
Hi all,

To help bridge Wikipedia’s visual knowledge gaps, the Research team
<https://research.wikimedia.org/> at the Wikimedia Foundation has launched
the “Wikipedia Image/Caption Matching Competition
<https://www.kaggle.com/c/wikipedia-image-caption>”.

Read on for more information or check out our blog post
<https://diff.wikimedia.org/2021/09/13/the-wikipedia-image-caption-matching-challenge-and-a-huge-release-of-image-data-for-research/>
!

Images are essential for knowledge sharing, learning, and understanding.
However, the majority of images on Wikipedia articles lack written context
(e.g., captions, alt-text), often making them inaccessible. As part of our
initiatives <https://research.wikimedia.org/knowledge-gaps.html> to address
Wikipedia’s knowledge gaps, the Research <https://research.wikimedia.org/>
team at the Wikimedia Foundation is hosting the “Wikipedia Image/Caption
Matching Competition <https://www.kaggle.com/c/wikipedia-image-caption>.”
We invite the communities of volunteers, developers, data scientists, and
machine learning enthusiasts to develop systems that can automatically
associate images with their corresponding captions and article titles.

In this competition (hosted on Kaggle <https://www.kaggle.com/>),
participants are provided with content from Wikipedia articles in 100+
language editions and are asked to build systems that automatically
retrieve the text (an image caption, or an article title) closest to a
query image. The data is a combination of Google AI’s recently released WIT
dataset <https://github.com/google-research-datasets/wit> and a new dataset
of 6 million images from Wikimedia Commons that we have released
<https://analytics.wikimedia.org/published/datasets/one-off/caption_competition/>
for this competition. Kaggle is hosting all data needed to get started with
the task, example notebooks, a forum for participants to share and
collaborate, and submitted models in open-sourced formats.
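
To make the retrieval task concrete, here is a minimal sketch of picking the candidate text closest to a query image, assuming both have already been mapped to vectors in a shared embedding space. The embedding values, the candidate captions, and the function names are made up for illustration; the competition does not prescribe this approach, and producing good embeddings is the actual challenge.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closest_text(image_vec, candidates):
    """Return the candidate text whose embedding is most similar to image_vec.

    candidates: dict mapping text -> embedding (hypothetical toy data).
    """
    return max(candidates, key=lambda text: cosine(image_vec, candidates[text]))


# Toy embeddings, invented for this sketch.
candidates = {
    "A red fox in snow": [0.9, 0.1, 0.0],
    "Eiffel Tower at night": [0.0, 0.2, 0.9],
}
best = closest_text([0.8, 0.2, 0.1], candidates)
```

With 100+ language editions and millions of images, a real system would likely replace the linear scan with an approximate nearest-neighbor index.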

We encourage everyone to download our data and participate in the
competition. This challenge is an opportunity for people around the world
to grow their technical skills while increasing the accessibility of
Wikipedia.

This competition is possible thanks to collaborations with Google Research
<https://research.google/>, EPFL <https://www.epfl.ch/en/>, Naver Labs
Europe <https://europe.naverlabs.com/> and Hugging Face
<https://huggingface.co/>, who assisted with data preparation and
competition design. Check out our blog post
<https://diff.wikimedia.org/2021/09/13/the-wikipedia-image-caption-matching-challenge-and-a-huge-release-of-image-data-for-research/>
for more information! The point of contact for this project is Miriam Redi.
You're welcome to reach out with questions or comments at
mir...@wikimedia.org.

Cheers,

Emily Lescak, on behalf of the Research team

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation

[Wikimedia-l] Invitation to Wikimedia Research Office hours October 5, 2021

2021-10-01 Thread Emily Lescak
Hi all,

Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours next Tuesday, 2021-10-05, at 16:00-17:00 UTC (9am PT/6pm
CEST). To participate, join the video-call via this link [2]. There is no
set agenda - feel free to add your item to the list of topics in the
etherpad [3] (you can do this after you join the meeting, too); otherwise
you are welcome to just hang out. More detailed information (e.g.
about how to attend) can be found here [4].

Through these office hours, we aim to make ourselves more available to
answer some of the research-related questions that you as Wikimedia
volunteer editors, organizers, affiliates, staff, and researchers face in
your projects and initiatives. Some example cases we hope to be able to
support you in:

- You have a specific research-related question that you suspect you
  should be able to answer with the publicly available data and you don’t
  know how to find an answer for it, or you just need some more help with
  it. For example, how can I compute the ratio of anonymous to registered
  editors in my wiki?
- You run into repetitive or very manual work as part of your Wikimedia
  contributions and you wish to find out if there are ways to use machines
  to improve your workflows. These types of conversations can sometimes be
  harder to find an answer for during an office hour; however, discussing
  them can help us understand your challenges better and we may find ways
  to work with each other to support you in addressing them in the future.
- You want to learn what the Research team at the Wikimedia Foundation
  does and how we can potentially support you. Specifically for
  affiliates: if you are interested in building relationships with the
  academic institutions in your country, we would love to talk with you
  and learn more. We have a series of programs that aim to expand the
  network of Wikimedia researchers globally and we would love to
  collaborate with those of you interested more closely in this space.
- You want to talk with us about one of our existing programs [5].

Hope to see many of you,
Emily on behalf of the WMF Research Team

[1] https://research.wikimedia.org
[2] https://meet.jit.si/WMF-Research-Office-Hours
[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours
[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours
[5] https://research.wikimedia.org/projects.html

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation