I applaud this idea. Preferably we would pick a language family with a large community of practice: 'minority' in the sense of coverage and support by modern tools and scaffolding, not in the sense of limited use.
We used to have a roughly weighted list of major world languages by (spoken, written; primary, secondary) and how well covered they were by wp (articles, contributors). Is there something like that still? //S 🌍🌏🌎

On Wed., Aug. 5, 2020, 3:19 p.m. C. Scott Ananian, <canan...@wikimedia.org> wrote:

> Sorry I'm coming to this discussion a bit late, but I'd like to underline a slightly different aspect of the concern that Phoebe raised:
>
> > It concerns me that, at least in the high-level project proposals I've seen (I haven't been tracking this closely, and haven't read the academic papers) I have not yet seen discussions of ethical data, or how we might think about identifying bias, or even how to recruit contributors and the impact on existing contributors.
>
> Using the terminology of Ibram X. Kendi (and others), I'd put this as: "it's not enough to not be racist, you must actively be *anti-racist*."
>
> Abstract Wikipedia is a "color blind" project. Indeed, it is often described as advancing WMF goals by improving the amount of content available for minority languages.
>
> However, it is built on a huge edifice of ML and AI technology which advantages majority languages and the already-powerful.
>
> As Phoebe mentioned, the subtle biases of ML translation toward majority views (selecting the "proper" gender pronoun for someone described as a "doctor" or "professor", say) are well known, and certainly deserve to be foregrounded from the start, as Danny has pledged to do in his response to Phoebe.
>
> But the infrastructure of this project is built this way from the ground up. Language models for European languages are orders of magnitude better than language models for minority languages (if the latter exist at all).
> The same is true for ontologies and every other constructed abstraction, down to choices of what topics are significant enough to include in an abstract article---but that ground has been ably covered by Kaldari and others. So let me concentrate solely on language models in the remainder (with some parenthetical asides, for which I hope you'll forgive me).
>
> I would like to challenge Abstract Wikipedia not only to be "not racist" or "color blind", but to be actively *antiracist*. That is, instead of passively accepting the status quo with respect to language models (etc.), to commit to actively supporting a language model in *at least one* minority language, treating it as a first-class citizen or (better) the *main* output of the project. That means not just looking for "a good enough language model that happens not to be a European language" but *actively developing the language model* so that the Abstract Wikipedia project *from inception* has a positive effect on *at least one* community speaking an underrepresented language with a small Wikipedia. (Again, WLOG this could apply to general AI/ML support for many, many minority groups, but I'm sticking with "at least one" and "language model" in order to make this as concrete and actionable as possible.) This of course also means committing to hire a speaker of that non-European language as part of the core team (not just an "and translations" afterthought), committing to foregrounding that language in demonstrations, and doing outreach and community building to the language group in question. (All the mockups I've seen have been in German and English, and have been pitched to an English-speaking audience.)
>
> I don't think it is wise in 2020 to pretend that "colorblind" business as usual will advance the goals of our organization.
> We need to actively work to ensure this project has effects that *work against* the significant pre-existing biases toward highly-educated speakers of European languages. It is not enough to say that "someday" this "may" have an effect on minority language groups if "somebody" ever gets around to doing it. We must make those investments proactively and with clear intention in order to effect the change we wish to see in the world.
> --
> C. Scott Ananian
> _______________________________________________
> Wikimedia-l mailing list, guidelines at:
> https://meta.wikimedia.org/wiki/Mailing_lists/Guidelines and
> https://meta.wikimedia.org/wiki/Wikimedia-l
> New messages to: Wikimedia-l@lists.wikimedia.org
> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l,
> <mailto:wikimedia-l-requ...@lists.wikimedia.org?subject=unsubscribe>