[Wiki-research-l] Announcing the Wikipedia Image / Caption Matching Competition

Emily Lescak Fri, 08 Oct 2021 07:18:10 -0700

Hi all,

To help bridge Wikipedia’s visual knowledge gaps, the Research team
<https://research.wikimedia.org/> at the Wikimedia Foundation has launched
the “Wikipedia Image/Caption Matching Competition
<https://www.kaggle.com/c/wikipedia-image-caption>”.


Read on for more information or check out our blog post
<https://diff.wikimedia.org/2021/09/13/the-wikipedia-image-caption-matching-challenge-and-a-huge-release-of-image-data-for-research/>
!

Images are essential for knowledge sharing, learning, and understanding.
However, the majority of images on Wikipedia articles lack written context
(e.g., captions, alt-text), often making them inaccessible. As part of our
initiatives <https://research.wikimedia.org/knowledge-gaps.html> to address
Wikipedia’s knowledge gaps, the Research <https://research.wikimedia.org/>
team at the Wikimedia Foundation is hosting the “Wikipedia Image/Caption
Matching Competition <https://www.kaggle.com/c/wikipedia-image-caption>.”
We invite the communities of volunteers, developers, data scientists, and
machine learning enthusiasts to develop systems that can automatically
associate images with their corresponding captions and article titles.

In this competition (hosted on Kaggle <https://www.kaggle.com/>),
participants are provided with content from Wikipedia articles in 100+
language editions and are asked to build systems that automatically
retrieve the text (an image caption, or an article title) closest to a
query image. The data is a combination of Google AI’s recently released WIT
dataset <https://github.com/google-research-datasets/wit> and a new dataset
of 6 million images from Wikimedia Commons that we have released
<https://analytics.wikimedia.org/published/datasets/one-off/caption_competition/>
for this competition. Kaggle is hosting all data needed to get started with
the task, example notebooks, a forum for participants to share and
collaborate, and submitted models in open-sourced formats.
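
For readers new to this kind of retrieval task, here is a minimal sketch of
the core idea: rank candidate texts by their similarity to a query image.
It assumes the image and the candidate texts have already been embedded
into a shared vector space by some model. The vectors, the example
captions, and the function names below are all made up for illustration;
they are not taken from the competition data or from any starter notebook.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_texts(image_vec, text_vecs):
    """Return candidate texts ordered from most to least similar to the image."""
    scores = {text: cosine(image_vec, vec) for text, vec in text_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy embeddings (hypothetical): a real system would get these from a model.
image_vec = [0.9, 0.1, 0.0]
text_vecs = {
    "A red bridge over a bay": [0.8, 0.2, 0.1],
    "Portrait of a composer": [0.0, 0.1, 0.9],
    "Map of a river basin": [0.1, 0.9, 0.2],
}

ranking = rank_texts(image_vec, text_vecs)
print(ranking[0])  # the caption whose vector is closest to the image vector
```

Cosine similarity is only one possible choice of score here; how the
embeddings are produced is the hard part and is left open in this sketch.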

We encourage everyone to download our data and participate in the
competition. This challenge is an opportunity for people around the world
to grow their technical skills while increasing the accessibility of
Wikipedia.

This competition is possible thanks to collaborations with Google Research
<https://research.google/>, EPFL <https://www.epfl.ch/en/>, Naver Labs
Europe <https://europe.naverlabs.com/>, and Hugging Face
<https://huggingface.co/>, who assisted with data preparation and
competition design. Check out our blog post
<https://diff.wikimedia.org/2021/09/13/the-wikipedia-image-caption-matching-challenge-and-a-huge-release-of-image-data-for-research/>
for more information! The point of contact for this project is Miriam Redi.
You're welcome to reach out with questions or comments at
mir...@wikimedia.org.

Cheers,

Emily Lescak, on behalf of the Research team

-- 
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
_______________________________________________
Wiki-research-l mailing list -- wiki-research-l@lists.wikimedia.org
To unsubscribe send an email to wiki-research-l-le...@lists.wikimedia.org
