[Wiki-research-l] Re: How to access deleted Wikipedia articles

2021-11-05 Thread L.Gelauff
Hi Doris,

as you can see here, the article was actually deleted multiple times. But
it was also restored and recreated.

It was definitely deleted, but:
* Deletion is not always permanent. Articles can be restored (e.g. after
revisiting a decision).
* Deletion does not necessarily prohibit recreation. The reasons for
deletion may not apply to future versions (for example, notability for
living persons may change).
* Deletion does not necessarily mean it was a 'bad article': sometimes an
article was created twice and needs to be merged, or a redirect is being
deleted in order to make room for the 'real article'.
* As WSC mentioned, content can also be removed through other means than a
'deletion' (e.g. replacing the article with a redirect).

How to approach this depends a bit on your desired definition of
'deleted'.
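
One way to act on the points above is to look at the newest delete or
restore entry in the deletion log instead of reading the comments. This is
only a minimal sketch under some assumptions: it queries the logevents
module with letype=delete (which also returns restore entries), the helper
names fetch_delete_log and last_delete_action are made up for this example,
the User-Agent string is a placeholder, and result continuation is not
handled. Recreation of a page is not recorded in the deletion log, so the
result tells you how the log ends, not whether a page exists at that title
today.

```python
import json
import urllib.parse
import urllib.request

API = "https://en.wikipedia.org/w/api.php"


def fetch_delete_log(title, api=API):
    """Fetch deletion-log entries (deletes and restores) for one title."""
    params = {
        "action": "query",
        "list": "logevents",
        "letype": "delete",          # whole deletion log, not only delete/delete
        "letitle": title,
        "leprop": "type|timestamp|comment",
        "lelimit": "max",
        "format": "json",
    }
    url = api + "?" + urllib.parse.urlencode(params)
    # Placeholder User-Agent; replace with your own contact details.
    request = urllib.request.Request(
        url, headers={"User-Agent": "deletion-log-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["query"]["logevents"]


def last_delete_action(events):
    """Return 'delete' or 'restore' for the newest such entry, else None."""
    relevant = [e for e in events if e.get("action") in ("delete", "restore")]
    if not relevant:
        return None
    # ISO 8601 timestamps in the same format sort correctly as strings.
    newest = max(relevant, key=lambda e: e["timestamp"])
    return newest["action"]
```

Calling last_delete_action(fetch_delete_log("Zayn Malik")) would then
summarise the log for that title; whether a page currently exists at the
title still needs a separate check.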

Best,
Lodewijk

On Fri, Nov 5, 2021 at 11:11 AM D Z  wrote:

> Hi Adam,
>
> It seems that the issue is the apostrophe after "L", in the wikidata
> > query it is "´" and the wikipedia link above uses "'".
> >
>
> I see, a lot of the issues were caused by my string mishandling.
>
> One approach you might consider is to download the entire log
> > history, then process it locally to filter by page ID.
> >
>
> I am still unclear on how to know for sure that an article was
> deleted.  It seems like the only way to tell is through the comments. For
> example, this call:
>
> https://en.wikipedia.org/w/api.php?action=query&list=logevents&leaction=delete/delete&letitle=Zayn%20Malik
> shows the comment "[[Wikipedia:Articles for deletion/Louis Tomlinson]]"
> which I have noticed to exist for other articles that were successfully
> deleted, but the article "Zayn Malik" exists. The  most recent event has
> the comment
> "[[WP:CSD#G6|G6]]: Deleted to make way for move" which would imply the
> other deletions weren't successful but the article still exists.
>
> Thanks,
>
> Doris
>
> On Thu, Nov 4, 2021 at 3:20 AM Adam Wight  wrote:
>
> > On 11/4/21 8:09 AM, D Z wrote:
> >
> > > Hi Adam,
> > >
> > > > Thanks for your reply. The qitem API returns missing for this article,
> > > > but the article exists:
> > >
> > >
> >
> https://www.wikidata.org/w/api.php?action=wbgetentities&format=json&sites=eswiki&titles=Playas%20de%20L%C2%B4Atalaya%20y%20Focar%C3%B3n&normalize=1
> > >
> > > > The Wikipedia page link is here.
> >
> > It seems that the issue is the apostrophe after "L", in the wikidata
> > query it is "´" and the wikipedia link above uses "'".  Maybe something
> > in your query script is normalizing the fancy apostrophe to a simple
> > one?  I would check for proper UTF-8 handling.
> >
> > > Would you know if there is a way to input article revision ID or pageid
> > > instead of source title for the logevents API? The strings seem to be
> > > problematic at times.
> >
> > This was prescient :-).  But I don't see any record of the article being
> > deleted, so perhaps the API is correct in this case?
> >
> >
> >
> https://pt.wikipedia.org/wiki/Special:Log?type=&user=&page=Rodrigo+Flores+Álvarez&wpdate=&tagfilter=
> >
> > Unfortunately, the API help page doesn't mention filtering the log by
> > page ID.  One approach you might consider is to download the entire log
> > history, then process it locally to filter by page ID.
> >
> > Help page:
> >
> https://www.mediawiki.org/w/api.php?action=help&modules=query%2Blogevents
> >
> > Regards,
> > Adam W.
> > [[mw:User:Adamw]]
> >
> > > > For example, the article 'Rodrigo Flores Álvarez' of 'pt' Wikipedia
> > > > gives me trouble (I got this article from the cxtranslation list).
> > > > This page seems to be missing, and perhaps I am not using the
> > > > logevents API correctly, but it returns empty.
> > >
> > > {'batchcomplete': '', 'query': {'logevents': []}}
> > >
> > > --
> > > > import requests
> > > >
> > > > # Query the pt.wikipedia deletion log for a single title
> > > > endpoint = 'pt.wikipedia.org/w/api.php'
> > > > query_url = "https://{0}".format(endpoint)
> > > > params = {
> > > >     'action': 'query',
> > > >     'list': 'logevents',
> > > >     'format': 'json',
> > > >     'leaction': 'delete/delete',
> > > >     'letitle': 'Rodrigo Flores Álvarez',
> > > > }
> > > > json_response = requests.get(url=query_url, params=params).json()
> > >
> > > Thanks again and cheers,
> > >
> > > Doris Zhou
> > >
> > > > On Wed, Oct 27, 2021 at 9:51 AM Adam Wight wrote:
> > >
> > >> The "logevents" API should return the same data as Special:Log. For
> > >> example,
> > >>
> > >>
> > >>
> >
> https://en.wikipedia.org/w/api.php?action=query&list=logevents&letitle=Catego

Re: [Wiki-research-l] Feedback about Wikipedia-related project.

2020-09-21 Thread L.Gelauff
Hi dlab,

Are you looking at articles from scratch, or articles that already exist?
If they already exist, you could perhaps derive the 'type' from the
category the article is placed in, or predict it from the lead paragraph.
If you're aiming at articles that get created in non-English (I don't know
if you're limiting yourself by language), you could also consider using the
Wikidata item associated with the article (which will almost always have
some 'instance of' defined).
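
To make the Wikidata suggestion a bit more concrete, here is a minimal
sketch of reading the 'instance of' (P31) values for an article through the
wbgetentities module. The function names fetch_entities and instance_of_ids
are made up for this example, the User-Agent string is a placeholder, and
the parsing assumes the usual JSON layout of wbgetentities claims; treat it
as an illustration rather than a finished tool.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"


def fetch_entities(site, title, api=WIKIDATA_API):
    """Look up the Wikidata item for one article, e.g. ('enwiki', 'Douglas Adams')."""
    params = {
        "action": "wbgetentities",
        "sites": site,
        "titles": title,
        "props": "claims",
        "normalize": 1,
        "format": "json",
    }
    url = api + "?" + urllib.parse.urlencode(params)
    # Placeholder User-Agent; replace with your own contact details.
    request = urllib.request.Request(
        url, headers={"User-Agent": "instance-of-sketch/0.1 (example)"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def instance_of_ids(response):
    """Collect the item IDs used as 'instance of' (P31) values."""
    ids = []
    for entity in response.get("entities", {}).values():
        if "missing" in entity:
            continue  # no Wikidata item found for this title
        for claim in entity.get("claims", {}).get("P31", []):
            snak = claim.get("mainsnak", {})
            # Skip 'novalue' / 'somevalue' snaks, which carry no datavalue.
            if snak.get("snaktype") == "value":
                ids.append(snak["datavalue"]["value"]["id"])
    return ids
```

The returned IDs (for example the item for 'human') would still have to be
mapped onto your own list of entity types.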

As a use case, I could see this tool being helpful in the context of
translating articles, and in that case you already have an article in
another language at your disposal, as well as a category tree in that
language, and an introduction paragraph. In that case, you probably don't
have to ask the editors what type of topic it is.

That would seem also the most likely place to implement it (but also with
the least added value maybe?). I think this kind of tool would be most
likely to be used if you can somehow fit it into a toolbox or existing
workflow.

I would also suggest considering the opposite use case: detecting links
that are very 'out of place', i.e. links that are likely referring to a homonym.
An example would be a biology article linking to a mathematician, where you
would have expected a link to a biologist. Or in a historical article about
someone/something in the 14th century, a link to a person in the 17th
century. This could still be a valid link, but it may be helpful to detect
these rare events - it might trigger a disambiguation. Just thinking out
loud.

Best,

Lodewijk

On Sat, Sep 19, 2020 at 1:44 PM Su-Laine Brodsky  wrote:

> Hi,
>
> A good place to get feedback from the English Wikipedia community would be
> the Village Pump Idea Lab:
> https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(idea_lab) .
>
> It’s not clear to me whether the tool would be suggesting inline links for
> the text that’s already in the article, or “See also” links. A description
> of what problem the tool would solve would be really helpful.  It would
> also be helpful to see “before and after” mockups showing a specific stub
> article as it exists today and what the article would look like after the
> tool’s suggestions have been applied.
>
> Cheers,
> Su-Laine
> Wikipedia contributor
>
>
> > On Sep 18, 2020, at 3:30 AM, Garcia Duran Alberto wrote:
> >
> > Hi all
> >
> > We are researchers from the dlab at EPFL working with Bob West.
> >
> > We have plans to build a graph-based ML algorithm, which will further
> > facilitate development of a tool to assist Wikipedia editors by providing
> > recommendations on two novel use-cases. One consists of suggesting
> > hyperlinks (Wikipedia articles) to be inserted within a section of an
> > article. Note that this is different from "classical link prediction".
> >
> > We feel the tool could be of great value, as it can work with newly
> > created sections that do not have any content yet. What's more, the
> > editor can type *any* section name (either non-existent in that article
> > or even in the whole Wiki project) and the tool would have the power to
> > suggest hyperlinks that are likely to be of interest for that section in
> > the article. We think that stub articles especially can benefit from this
> > tool.
> >
> > However, we have one assumption. In addition to the section name, the
> > editor must provide the "entity type" (Place, People, Date,
> > Organization...) of the Wikipedia articles she would like to insert in
> > the section. The reason is that within a section you can find links to
> > articles of diverse types.
> >
> > The reason we are reaching out to you is twofold:
> > (1) To check whether such a tool would be of interest and likely to be
> > used by the editors.
> > (2) How limiting is the assumption that the editor needs to specify the
> > entity type of the Wikipedia articles for which she needs
> > recommendations from the tool?
> >
> > On one hand, some of us think this is not a problem as the number of
> > entity types is relatively small (between 10 and 20) and they can be
> > easily and visually presented to the editor with a dropdown list. On the
> > other hand, others think this requirement is limiting.
> >
> > We would like to know your opinion to decide whether we should move
> > forward with this project.
> >
> > Thanks!
> > dlab
> >
> > ___
> > Wiki-research-l mailing list
> > Wiki-research-l@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wiki-research-l


Re: [Wiki-research-l] Editor surveys on race/ethnicity/religion

2020-09-21 Thread L.Gelauff
Ah, you're assuming some automated country detection, rather than
self-identification. I see.

Lodewijk

On Mon, Sep 21, 2020 at 12:59 PM Stuart A. Yeates  wrote:

> Everyone from China and Saudi Arabia (two countries which
> systematically block wikipedia) are likely to be taking technical
> measures to disguise their country.
>
> That's a lot of people, but I'm not sure how many editors that is.
>
> cheers
> stuart
>
> --
> ...let us be heard from red core to black sky
>
> On Tue, 22 Sep 2020 at 07:01, L.Gelauff  wrote:
> >
> > Just thinking out loud.. are we looking for actual race/ethnicity/etc
> > data, or is it rather that we're looking for whether someone belongs to
> > an underrepresented group in their specific situation? If it is the
> > latter, there may be ways to phrase the question without asking for
> > actual demographics.
> >
> > Stuart; do you have any indication for how large a portion that group
> > is? I am aware of public pages being potentially disguised as such, but
> > wasn't familiar with stories about this happening in a survey context
> > (although it does not sound implausible).
> >
> > Best,
> >
> > Lodewijk
> >
> > On Mon, Sep 21, 2020 at 11:39 AM Stuart A. Yeates wrote:
> >
> > > Another point not touched on by other commenters is that even if ideal
> > > race / ethnicity question(s) were developed for every country in the
> > > world, users from some countries commonly disguise their country due
> > > to censorship in that country, so there would be a whole class of
> > > systematic errors where we asked users the wrong country's
> > > question(s).
> > >
> > > cheers
> > > stuart
> > > --
> > > ...let us be heard from red core to black sky
> > >
> > > On Tue, 22 Sep 2020 at 05:00, Isaac Johnson wrote:
> > > >
> > > > Adding another point from Rebecca Maung who helps run the annual
> > > > Community Insights surveys
> > > > <https://meta.wikimedia.org/wiki/Community_Insights> but isn't
> > > > currently on this listserv so couldn't respond directly:
> > > >
> > > > This year's Community Insights survey (reporting scheduled for early
> > > > 2021) is the first that will ask Wikimedia contributors about race and
> > > > ethnicity-- but only in certain geographies. Due to all the excellent
> > > > points made in this thread, we have never asked a race or ethnicity
> > > > question, but this year we decided to start asking locally relevant
> > > > questions where we could. This year only editors in the US and Britain
> > > > will see a question about race or ethnicity, tailored to their local
> > > > contexts. In the coming years, we will expand the countries and
> > > > geographies that see a question like this, prioritizing places where
> > > > there is a larger editor presence and local laws and norms allow such
> > > > questions. We have not yet discussed asking about religion in the
> > > > Community Insights survey.
> > > >
> > > > On Mon, Sep 21, 2020 at 9:20 AM Isaac Johnson wrote:
> > > >
> > > > > As pointed out by others, the highly contextualized nature of
> > > > > religion, race, and ethnicity between countries makes it very
> > > > > difficult to impossible to craft questions that are not overly
> > > > > reductive but still somewhat universal. Despite this challenge,
> > > > > understanding diversity in a way that captures these aspects is
> > > > > obviously quite important as they often figure very strongly into
> > > > > power and representation within history, media, etc.
> > > > >
> > > > > In general, if you're looking for large-scale surveys of editors, the
> > > > > Meta category (Category:Editor surveys
> > > > > <https://meta.wikimedia.org/wiki/Category:Editor_surveys>) is actually
> > > > > quite complete (same for readers
> > > > > <https://meta.wikimedia.org/wiki/Category:Reader_surveys>). In
> > > > > particular, I wrote what little I could find about these topics into
> > > > > this section of our recently published knowledge gaps taxonomy:
> > > > > h

Re: [Wiki-research-l] Editor surveys on race/ethnicity/religion

2020-09-21 Thread L.Gelauff
Just thinking out loud.. are we looking for actual race/ethnicity/etc data,
or is it rather that we're looking for whether someone belongs to an
underrepresented group in their specific situation? If it is the latter, there
may be ways to phrase the question without asking for actual demographics.

Stuart; do you have any indication for how large a portion that group is? I
am aware of public pages being potentially disguised as such, but wasn't
familiar with stories about this happening in a survey context (although it
does not sound implausible).

Best,

Lodewijk

On Mon, Sep 21, 2020 at 11:39 AM Stuart A. Yeates  wrote:

> Another point not touched on by other commenters is that even if ideal
> race / ethnicity question(s) were developed for every country in the
> world, users from some countries commonly disguise their country due
> to censorship in that country, so there would be a whole class of
> systematic errors where we asked users the wrong country's
> question(s).
>
> cheers
> stuart
> --
> ...let us be heard from red core to black sky
>
> On Tue, 22 Sep 2020 at 05:00, Isaac Johnson  wrote:
> >
> > Adding another point from Rebecca Maung who helps run the annual
> > Community Insights surveys but isn't currently on this listserv so
> > couldn't respond directly:
> >
> > This year's Community Insights survey (reporting scheduled for early
> > 2021) is the first that will ask Wikimedia contributors about race and
> > ethnicity-- but only in certain geographies. Due to all the excellent
> > points made in this thread, we have never asked a race or ethnicity
> > question, but this year we decided to start asking locally relevant
> > questions where we could. This year only editors in the US and Britain
> > will see a question about race or ethnicity, tailored to their local
> > contexts. In the coming years, we will expand the countries and
> > geographies that see a question like this, prioritizing places where
> > there is a larger editor presence and local laws and norms allow such
> > questions. We have not yet discussed asking about religion in the
> > Community Insights survey.
> >
> > On Mon, Sep 21, 2020 at 9:20 AM Isaac Johnson wrote:
> >
> > > As pointed out by others, the highly contextualized nature of religion,
> > > race, and ethnicity between countries makes it very difficult to
> > > impossible to craft questions that are not overly reductive but still
> > > somewhat universal. Despite this challenge, understanding diversity in
> > > a way that captures these aspects is obviously quite important as they
> > > often figure very strongly into power and representation within
> > > history, media, etc.
> > >
> > > In general, if you're looking for large-scale surveys of editors, the
> > > Meta category (Category:Editor surveys) is actually quite complete
> > > (same for readers). In particular, I wrote what little I could find
> > > about these topics into this section of our recently published
> > > knowledge gaps taxonomy:
> > > https://arxiv.org/pdf/2008.12314.pdf#subsubsection.3.1.7
> > >
> > > The April 2011 editor survey took the approach of just asking people
> > > how they felt they were different from others in the community -- this
> > > specific question is not one that I would advocate today (asking people
> > > to identify all the ways in which they may be "outsiders" is not
> > > particularly welcoming) but this is also probably the style of approach
> > > (asking people how well they feel represented within Wikipedia content
> > > or editor community) that you'd have to take to get information on
> > > ethnicity / race / religion without writing country-specific questions:
> > > https://upload.wikimedia.org/wikipedia/commons/7/76/Editor_Survey_Report_-_April_2011.pdf#page=65
> > >
> > > On Mon, Sep 21, 2020 at 6:12 AM Stuart A. Yeates wrote:
> > >
> > >> The ethnicity / race question is an incredibly hard question to
> > >> compose in an internationalised way.
> > >>
> > >> Pretty much every country in the world uses different terms and there
> > >> are some very confusing cases where the same term is used in different
> > >> countries to mean very different things (e.g. "Asian" in UK English vs
> > >> New Zealand English). This is derived from varying legal definitions
> > >> (for example blood quantum vs one-drop laws); the history of
> > >> colonisation and waves of immigration to the country; along with
> > >> cultural differences.
> > >>
> > >> cheers
> > >> stuart
> > >> --
> > >> ...let us be heard from red core to black sky
> > >>
> > >> On Mon, 21 Sep 2020 at 21:55, Federico Leva (Nemo) <nemow...@gmail.com> wrote:
> > >> >
> > >> > Su-Laine Brodsky, 21/09/20 08:19:
> > >> > > I’m wondering if any large-scale surveys have been done that ask
> 

Re: [Wiki-research-l] Data on arbitration, mediation, voting

2018-12-06 Thread L.Gelauff
Hi Ofer,

Could you explain a bit more of the background: what kind of questions
are you trying to answer? I have been looking into voting on Wikipedia
myself, and getting clean data is a challenge indeed.

Are you only interested in English or also in other communities? With
'article', do you refer to the lemma around which a dispute was settled (in
arbitration, it's often not a particular lemma) or rather to the section of
the rules that the ruling would refer to (quite common in Dutch, not sure
if it is in other languages)?

As for polls, outside the 2010 dataset on admin elections in English
Wikipedia, I have been unable to find any readily available data myself.
Most likely, you'd have to collect it from various pages and interpret the
data. How straightforward that is depends on the type of polls you're
interested in. (If I overlooked something, I would be happy to be
corrected!)

Best,

Lodewijk Gelauff

On Wed, Dec 5, 2018 at 11:45 PM Ofer Arazy  wrote:

> Hi everyone,
>
> As part of my research on governance mechanisms in Wikipedia, I'm looking
> for data regarding mediation, arbitration, and polls.
> Are records of mediation and arbitration committees (dates, the article,
> decisions) and on voting readily available?
> How could I gain access to this data?
> I'm particularly interested in data regarding the Gdansk article (
> https://en.wikipedia.org/wiki/Gda%C5%84sk), but would be happy to retrieve
> data for other articles as well.
>
> Thanks in advance,
> Ofer Arazy


Re: [Wiki-research-l] Definition of the death of a wiki

2018-11-05 Thread L.Gelauff
Please also note the exclusion criteria that Zhu et al used (not counting
people that edit in 10 communities or more at the same time). I'm not
exactly sure how that works out in practice, but this should take care of
some obvious edits you want to avoid in Wikimedia at least: (interwiki)
bots, stewards, global admins etc. You may also want to consider checking
for vandalism (are the edits later reverted) and not include those in your
measure.
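
To make this concrete: the generalised measure quoted further down (the
first period of X days in which N or fewer people make an edit), combined
with an exclusion list, could be sketched roughly as below. This is only an
illustration under my own assumptions: edits are given as (date, editor)
pairs, windows are checked day by day from the first edit, and the function
name first_inactive_window and its parameters are made up for this example.
How the excluded accounts (bots, stewards, reverted vandals) are identified
is left open.

```python
from datetime import date, timedelta


def first_inactive_window(edits, window_days=30, max_editors=1,
                          excluded=frozenset(), observed_until=None):
    """Return the start date of the first `window_days`-day window with at
    most `max_editors` distinct editors, or None if there is no such window.

    edits: iterable of (date, editor_name) pairs.
    excluded: editor names to ignore (e.g. bots or global accounts).
    observed_until: last day covered by the data; defaults to the last edit.
    """
    kept = sorted((day, editor) for day, editor in edits
                  if editor not in excluded)
    if not kept:
        return None
    end = observed_until or kept[-1][0]
    day = kept[0][0]
    window = timedelta(days=window_days)
    # Only consider windows that lie completely inside the observed period.
    while day + window <= end + timedelta(days=1):
        editors = {editor for edit_day, editor in kept
                   if day <= edit_day < day + window}
        if len(editors) <= max_editors:
            return day
        day += timedelta(days=1)
    return None
```

With max_editors=1 and window_days=30 this corresponds to "fewer than two
people edit in a 30-day period"; other values of X and N only change the
two parameters.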

Best,
Lodewijk


On Mon, Nov 5, 2018 at 11:53 AM ABEL SERRANO JUSTE  wrote:

> Thank you for your answers.
> It's worth noting that we are using the Wikia dataset too and that we are
> trying to identify when we consider a wiki "dead" or "inactive".
> We are looking for a rule that should be as independent as possible
> of the topic, stage or size of the wiki. Indeed, we could say as well that
> we are interested in when the wiki community dies and isolate from any
> external relation (like, for instance, if it's a wiki around a TV serie
> that nowadays it's been nowadays, we don't care about that external fact,
> but whether the wiki is still active or not)
>
> On Mon, Nov 5, 2018, 8:37 PM Jeremy Foote  wrote:
>
> > Hi Abel,
> >
> > We have been working on a paper that looks at wiki survival on Wikia. We
> > ended up using a similar measure to Zhu et al. We are more interested in
> > when the community around a wiki "dies", and so we measure death as a
> > 30-day period in which fewer than two people edit the wiki. I think that
> > the generalization of this idea is to look for the first period of X days
> > in which N or fewer people make an edit, where X and N might change
> > depending on the question you are looking at (or might be informed by
> > the data).
> >
> > Best,
> > Jeremy
> >
> > On Mon, Nov 5, 2018 at 11:35 AM Ziko van Dijk wrote:
> >
> > > Hello,
> > >
> > > Interesting mail - at the moment I am busy with thinking about
> > > chronological aspects of a wiki. One idea is that a wiki can have
> > > a finality, that means, that the founders had only a limited goal in
> > > mind. If accomplished, the wiki is no longer needed.
> > > There is a paper about open source software that you might already
> > > know? (Schweik 2014)
> > >
> > > Kind regards
> > > Ziko
> > >
> > >
> > > On Mon, 5 Nov 2018 at 14:44, ABEL SERRANO JUSTE <abese...@ucm.es> wrote:
> > >
> > > > Hello fellow researchers!
> > > >
> > > > We are conducting research about "mortality in wikis" and we are
> > > > looking for a good definition to determine when a wiki is considered
> > > > "dead", "inactive" or "abandoned".
> > > >
> > > > So far, I've only found this definition from Haiyi Zhu, Robert E.
> > > > Kraut and Aniket Kittur in their paper "The impact of membership
> > > > overlap on the survival of online communities":
> > > > We define a community to be dormant (the inverse of active) in a
> > > > given month if the community did not have any activity (including
> > > > discussion pages and community pages) in the given month and the
> > > > preceding two months.
> > > >
> > > > Any other references you could point me to? Any better ideas?
> > > >
> > > > Thank you in advance!
> > > > --
> > > > Saludos,
> > > > Abel.
> --
> Saludos,
> Abel.