Hi Andrew, all,

In my eyes, a large incentive for the maintainers of external databases - and I 
am one, for the ZBW German National Library for Economics - is the data they can 
gain: not only in terms of property values and attached Wikipedia pages, but 
also in terms of identifiers and links to other vocabularies.

This can go as far as Wikidata replacing a custom database for identifier 
mappings. We took a step in that direction by moving a mapping of GND and RePEc 
(P2428) identifiers to Wikidata. Still, that was only 3100 of the 460,000 GND IDs 
and 50,000 RePEc IDs in our EconBiz portal alone, so it's still very sparse - 
but an improvement (for details, see https://hackmd.io/p/S1YmXWC0e). Adding, e.g., 
the rest of the "most important economists" from RePEc as well as GND is very 
tempting, as it would extend the mapping with relatively little effort.
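
To make the numbers above a bit more concrete, here is a rough Python sketch. It only builds a SPARQL query string for items that carry both identifiers and works out the coverage percentages; it does not contact any server. The function names are made up for illustration, and I am assuming P227 for the GND ID alongside the P2428 mentioned above.

```python
def build_mapping_query(prop_a: str = "P227", prop_b: str = "P2428") -> str:
    """Build a SPARQL query listing items that carry both identifier properties.

    Assumption: the query is meant for the Wikidata Query Service, where the
    wdt: prefix is predefined. P227 (GND ID) is my assumption; P2428 is the
    RePEc property named in the text.
    """
    return (
        "SELECT ?item ?idA ?idB WHERE {\n"
        f"  ?item wdt:{prop_a} ?idA ;\n"
        f"        wdt:{prop_b} ?idB .\n"
        "}"
    )


def coverage_percent(mapped: int, total: int) -> float:
    """Share of identifiers that already have a mapping, in percent."""
    return 100.0 * mapped / total


if __name__ == "__main__":
    print(build_mapping_query())
    # Figures taken from the paragraph above: 3100 mapped pairs,
    # 460,000 GND IDs and 50,000 RePEc IDs in EconBiz.
    print(f"GND coverage:   {coverage_percent(3100, 460_000):.2f}%")
    print(f"RePEc coverage: {coverage_percent(3100, 50_000):.2f}%")
```

With the figures above this comes to well under 1% of the GND IDs and roughly 6% of the RePEc IDs, which is what I mean by "very sparse".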

For vocabularies limited in size, such as the STW Thesaurus for Economics, a 
complete mapping can be achieved (if relations beyond equivalence are available 
- see 
https://www.wikidata.org/wiki/Wikidata:Property_proposal/mapping_relation_type).
The incentive for that is even higher, because it saves the owner of the 
vocabulary the entire cost of maintaining possibly multiple mappings to 
third-party vocabularies.

So I think that embracing and extending Wikidata's role as a *universal 
linking hub* benefits everybody, and will improve total coverage considerably, 
because it offers incentives to communities not involved in Wikidata before.

Cheers, Joachim

PS. Thanks for the hint to P2429 - looks very useful!

> -----Original Message-----
> From: Wikidata [mailto:wikidata-boun...@lists.wikimedia.org] On Behalf Of
> Andrew Gray
> Sent: Thursday, 7 September 2017 21:26
> To: Discussion list for the Wikidata project.
> Subject: Re: [Wikidata] Which external identifiers are worth covering?
> 
> Hi Marco,
> 
> I guess this depends what you mean by "exhaustive". Exhaustive in that every
> Wikidata item has ID X, or exhaustive in that we have every instance of ID X
> in Wikidata?
> 
> The first is probably not going to happen, as the vast majority of external
> identifiers have a defined scope for what they identify. Some are pretty
> broad - VIAF is essentially "everyone who exists in a library catalogue as an
> author or subject" - but still have a limit.
> We're never really going to reach a situation where there is a single
> identifier type that covers everyone, unless we're linking across to another
> Wikidata-type comprehensive knowledgebase, and even then we'd need to ensure
> we're in a position where they already cover everything in Wikidata.
> 
> The second can (and has) been done - the largest one I know of offhand for
> people is the Oxford DNB (60k items) but for non-people we have complete
> coverage of eg Swedish district codes, P1841 (160k items).
> It's a bit of a slog to get these completed and then maintained, since the
> last 5-10% tend to be more challenging complicated cases, but one or two
> determined people can make it happen. And of course it's not appropriate for
> many identifiers, as they may issue IDs for things that we don't intend to
> have in Wikidata, so we will never completely cover them.
> 
> I should quickly plug the "expected completeness" property which is really
> useful for identifiers - P2429 - as this can quickly show whether something
> is a) completely on Wikidata; b) not complete yet but eventually might be; or
> c) probably never will be. Not very widely rolled out yet, though...
> 
> Andrew.
> 
> 
> On 7 September 2017 at 19:51, Marco Fossati <foss...@spaziodati.eu> wrote:
> > Hi everyone,
> >
> > As a data quality addict, I've been investigating the coverage of
> > external identifiers linked to Wikidata items about people.
> >
> > Given the numbers on SQID [1] and some SPARQL queries [2, 3], it seems
> > that even the second most used ID (VIAF) only covers *25%* of people items
> > circa.
> > Then, there is a long tail of IDs that are barely used at all.
> >
> > So here is my question:
> > *which external identifiers deserve an effort to achieve exhaustive
> > coverage?*
> >
> > Looking forward to your valuable feedback.
> > Cheers,
> >
> > Marco
> >
> > [1] https://tools.wmflabs.org/sqid/#/browse?type=properties "Select
> > datatype" set to "ExternalId", "Used for class" set to "human Q5"
> > [2] total people: http://tinyurl.com/ybvcm5uw
> > [3] people with a VIAF link: http://tinyurl.com/ya6dnpr7
> >
> > _______________________________________________
> > Wikidata mailing list
> > Wikidata@lists.wikimedia.org
> > https://lists.wikimedia.org/mailman/listinfo/wikidata
> 
> 
> 
> --
> - Andrew Gray
>   and...@generalist.org.uk
> 