(I know this thread was from many months ago. I've been reading recently about 
related issues, and remembered this conversation)

Mackenzie,

That's a great question. Accusations of systemic bias in the deletion of 
biographies are among the most common criticisms of Wikipedia, and it would be 
great to have data on this issue. 

I would suggest scraping the New Pages feed 
(https://en.wikipedia.org/wiki/Special:NewPages) to get the full text and 
metadata of all new articles before any of them get deleted. If you do this for 
a week, you'll get a sample of thousands of articles that you can track 
longitudinally to see what kinds of articles survive for, say, 30 days. If an 
article survives 30 days it's highly likely to stay for good.

To get the biographies that are created via the Articles for Creation process, 
you would need to scrape the New Pages feed for Draft space in addition to 
Article space. 
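
In case it helps, here is a rough Python sketch of the collection step and the 
30-day check. It is only an illustration under some assumptions: it uses the 
MediaWiki Action API (list=recentchanges with rctype=new) rather than scraping 
the HTML of Special:NewPages, it records metadata only (fetching the article 
text would be a further step), and the function names and User-Agent string 
are made up for the example. Namespace 0 is Article space and 118 is Draft 
space on English Wikipedia.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"
ARTICLE_NS, DRAFT_NS = 0, 118  # namespace IDs on English Wikipedia


def new_pages_params(namespaces=(ARTICLE_NS, DRAFT_NS), limit=500, cont=None):
    """Build the query for one batch of page-creation entries."""
    params = {
        "action": "query",
        "format": "json",
        "list": "recentchanges",
        "rctype": "new",
        "rcnamespace": "|".join(str(ns) for ns in namespaces),
        "rcprop": "title|ids|timestamp|user",
        "rclimit": str(limit),
    }
    if cont:
        params.update(cont)  # continuation values from the previous batch
    return params


def parse_new_pages(response):
    """Return ({pageid: record}, continuation-or-None) for one response."""
    changes = response.get("query", {}).get("recentchanges", [])
    pages = {
        rc["pageid"]: {"title": rc["title"], "ns": rc["ns"],
                       "created": rc["timestamp"]}
        for rc in changes
    }
    return pages, response.get("continue")


def existence_params(page_ids):
    """Query for a batch of page IDs (the API caps how many fit in one call)."""
    return {"action": "query", "format": "json",
            "pageids": "|".join(str(i) for i in page_ids)}


def missing_ids(response):
    """Page IDs that the response flags as missing, i.e. no longer existing."""
    pages = response.get("query", {}).get("pages", {})
    # The default JSON format keys pages by ID; formatversion=2 gives a list.
    entries = pages.values() if isinstance(pages, dict) else pages
    return {p["pageid"] for p in entries if p.get("missing", False) is not False}


def fetch(params):
    """Perform one API request (hypothetical User-Agent; use your own)."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    req = urllib.request.Request(
        url, headers={"User-Agent": "deletion-study-sketch/0.1 (you@example.org)"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def collect_new_pages(max_batches=10):
    """Follow continuation tokens and accumulate new-page records."""
    seen, cont = {}, None
    for _ in range(max_batches):
        pages, cont = parse_new_pages(fetch(new_pages_params(cont=cont)))
        seen.update(pages)
        if not cont:
            break
    return seen
```

The idea would be to run collect_new_pages() regularly for a week and save the 
results, then 30 days later pass the saved page IDs through existence_params() 
and missing_ids(). I keyed everything on page ID rather than title because a 
page keeps its ID when it is moved (for example from Draft space to Article 
space), whereas its title changes.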
 
When analyzing this data set, it would be good to categorize why each bio was 
deleted, and maybe filter out the cases in which an article was deleted for 
being a copyright violation, attack page, undecipherable nonsense, etc. 
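
A first pass at this categorization could be done from the comments in the 
deletion log. The sketch below is illustrative only: it assumes the deleting 
admin's comment mentions a speedy deletion criterion code (G12 for copyright 
infringement, G10 for attack pages, G1 for patent nonsense), a PROD, or an 
Articles for deletion discussion. Log comments are free text, so real data 
would need manual checking, and the function and category names here are my 
own invention.

```python
import re

# Speedy deletion codes that apply to articles: general (G) and article (A).
CSD_CODE = re.compile(r"\b([AG]\d{1,2})\b")

# Criteria to filter out: nonsense, attack pages, copyright violations.
EXCLUDED = {"csd:G1", "csd:G10", "csd:G12"}


def classify_deletion(comment):
    """Map a deletion-log comment to a coarse category label."""
    if "Articles for deletion/" in comment:
        return "afd"
    if "PROD" in comment:
        return "prod"
    match = CSD_CODE.search(comment)
    if match:
        return "csd:" + match.group(1)
    return "other"


def keep_for_analysis(comment):
    """False for deletions that say nothing about the subject's notability."""
    return classify_deletion(comment) not in EXCLUDED
```

For example, a comment such as "[[WP:CSD#G12|G12]]: Unambiguous copyright 
infringement" would be labelled csd:G12 and dropped, while an A7 deletion or an 
Articles for deletion outcome would be kept for the bias analysis.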

I'd also like to echo WereSpielChequers's second point below. Apparent biases 
in deletion may reflect biases in article creation. You might be best off 
working with a small sample set but analyzing the set really closely (i.e. have 
humans read every article) to avoid the "maybe group X had more deleted 
articles because it had more crap articles to start with" objection.  
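
If you go the small-sample route, it helps to draw the sample in a way that 
others can reproduce. A minimal sketch, assuming you already have a collection 
of page IDs (the function name and the seed value are arbitrary choices of 
mine):

```python
import random


def draw_review_sample(page_ids, k, seed=2020):
    """Pick up to k page IDs for human review, reproducibly for a given seed."""
    rng = random.Random(seed)      # private generator, so results are repeatable
    pool = sorted(set(page_ids))   # fixed order, so the seed fully decides the draw
    return rng.sample(pool, min(k, len(pool)))
```

Publishing the seed and the list of candidate IDs alongside the results would 
let someone else confirm that the articles read by humans were not 
hand-picked.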


Best regards,
Su-Laine
Wikipedia volunteer

On 10 July 2020, 8:05 AM, "Wiki-research-l on behalf of WereSpielChequers" 
<wiki-research-l-boun...@lists.wikimedia.org on behalf of 
werespielchequ...@gmail.com> wrote:

    Hi Mackenzie,

    You may be correct in either or both of your hypotheses, but you might also
    want to check out two other related ones.

    1 Some academic institutions may have an element of misogyny in their HR
    policies, leading to such situations as an academic becoming notable for
    their work to the point where they merit a Wikipedia article, before they
    become a full professor.

    2 In Wikipedia's drive to address the gender skew in our content, we may
    have some editors creating articles on women who don't yet meet our
    notability criteria. Such articles are of course highly likely to be
    deleted.

    There is another way to approach this: check primary and secondary sources
    to see how Wikipedia compares against them. For example, we have articles
    on every female Fellow of the Royal Society, and we achieved that almost a
    decade ago. I don't know if we yet have articles on all the blokes. I
    expect we have articles on every Nobel Prize winner by now, but there will
    be less well-known awards and lists of people in STEM.

    One problem in looking at deletion discussions is that they don't always
    say what the person is known for, and so you can have confusion between
    multiple people of the same name. I was once asked to restore a deleted
    article so that someone could look at what was there and see if they could
    make a clearer case re the notability of that eminent diplomat. After
    looking at the deleted article, I told them not to start from the deleted
    bit, and if it was the same person, to emphasise their subsequent career as
    a diplomat, rather than their adolescent career as a "pro skateboarder".
    So in order to find the deleted articles on female scientists, you either
    need a list of female scientists whose articles were deleted, or to check
    a lot of other articles to find which are about scientists.

    Hope that's useful

    WSC

    On Fri, 10 Jul 2020 at 00:17, Stuart A. Yeates <syea...@gmail.com> wrote:

    > I recently completed a project writing en.wiki articles for all female
    > and indigenous professors in my country, .nz.
    >
    > I now write pronounless biographies, because there were a significant
    > number whose gender wasn't apparent from their public persona. My
    > guess is that women and LGBTIA+ minorities are incentivised to remove
    > markers of their gender from their online presence to keep a lower
    > profile to avoid the trolls and bigots.
    >
    > There were also a number who clearly appeared to be a certain
    > ethnicity based on their staff photo, but where there were no reliable
    > sources as to that ethnicity.
    >
    > I also had one person ask for their article to be deleted. [If this
    > is of interest I can send details to you directly, but I will not post
    > their details to a public forum and ask that you refrain from this also.]
    >
    > I look forward to reading your experimental design taking these
    > factors into account.
    >
    > cheers
    > stuart
    > --
    > ...let us be heard from red core to black sky
    >
    > On Fri, 10 Jul 2020 at 06:43, Mackenzie Lemieux
    > <mackenzie.lemi...@gmail.com> wrote:
    > >
    > > Dear Wiki Community,
    > >
    > > My name is Mackenzie Lemieux and I am a neuroscience researcher at the
    > > Salk Institute for Biological Studies and I am interested in exploring
    > > biases on Wikipedia.
    > >
    > > My research hypothesis is that gender or ethnicity mediate the rate of
    > > flagging and deletion of pages for women in STEM.  I hope to
    > > retrospectively analyze Wikipedia's deletion history, harvest the
    > > biographical articles about scientists that have been created over the
    > > past n years and then confirm the gender and ethnicity of a large sample.
    > >
    > > It appears that we can identify deleted pages with Wikipedia's deletion
    > > log <https://en.wikipedia.org/wiki/Wikipedia:Deletion_log>, but to
    > > actually see the page that was deleted we need to be members of one of
    > > these Wikipedia user groups:  Administrators
    > > <https://en.wikipedia.org/wiki/Wikipedia:Administrators>, Oversighters
    > > <https://en.wikipedia.org/wiki/Wikipedia:Oversight>, Researchers
    > > <https://en.wikipedia.org/wiki/Wikipedia:Researchers>, Checkusers
    > > <https://en.wikipedia.org/wiki/Wikipedia:CheckUser>.
    > >
    > > Does anyone have advice on how to obtain researcher status or is there
    > > anyone willing to collaborate who has access to the data we need?
    > >
    > > Warmly,
    > > Mackenzie Lemieux
    > >
    > >
    > > --
    > > Mackenzie Lemieux
    > > mackenzie.lemi...@gmail.com
    > > cell: 416-806-0041
    > > 220 Gilmour Avenue
    > > Toronto, Ontario
    > > M6P 3B4
    > > _______________________________________________
    > > Wiki-research-l mailing list
    > > Wiki-research-l@lists.wikimedia.org
    > > https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
