Hi all,

I don't know whether this is the right mailing list, but here is my report on 
the talks I had with two Dutch Wikidata people. Would you agree with the 
proposal we worked out?




A common project for Wikidata and DBpedia

Gerard Kuys, February 12, 2014

At the Dutch Chapter meeting at the 1st DBpedia Community Meeting in Amsterdam,
we decided to embark on a trajectory of closer collaboration with the Wikidata
project. The idea was to compare datasets with one another and to think of ways
in which that comparison could improve data quality on both sides. The
monuments dataset was suggested as one of the candidate datasets.

In order to determine further steps to be taken in this collaboration effort, 
Gerard Meijssen of Wikidata contacted the GLAM ‘liaison officer’ of the Dutch 
Wikimedia Foundation, Sebastiaan ter Burg. With Sebastiaan and with Hay Kranen, 
Wikipedian in residence at the Royal Library of the Netherlands, I had a 
fruitful conversation on Tuesday, March 11.

On the path towards better data quality, both Wikidata and DBpedia encounter
obstacles that need to be cleared away. At Wikidata, Hay has argued for
including the Dutch (and German?) PPN identifier for books, authors and
keywords in the list of external identifiers that is kept within Wikidata.
However, this proposal has so far been rejected, on the grounds that it was
insufficiently clear to the Wikidata project members what exactly this NTA
field (as it is known in Wikidata circles) would contribute. At DBpedia, on the
other hand, we face the problem of registering data about people’s gender,
which cannot be extracted from Wikipedia articles due to editors’ policies and
has to be obtained by linking to external datasets. The major issue to be
solved, however, is how to overcome the boundaries between content compartments
that arise because institutions’ collections are donated or otherwise brought
into Wikipedia (either the encyclopedia or Commons) separately. What we need is
a way of constructing relations between content across domains. Doing so would
probably also facilitate the feedback loop that donating institutions are
eagerly waiting for.

When trying to settle on the domains best suited for comparison between
Wikidata and DBpedia, we identified two: writers’ data and monuments’ data.
As a first step, we would want to make comparable dumps of data from either
source, and work out an approach for finding all kinds of omissions and errors,
and mending them. More ambitiously, once this work has been done, we would want
to take the bolder step of linking the two domains to one another: how could we
find the relations (and translate them into RDF(S) properties) expressing the
semantic relation between a person (mostly writers) and any building he or she
has had a (documented) connection with?
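
As a rough illustration of that first step, a comparison of two dumps could
look like the Python sketch below. It is only a sketch under assumptions we
have not agreed on yet: that both dumps can be reduced to records keyed by a
shared identifier, and that the fields to compare are known in advance. The
function name, the field names and the sample records are made up for the
example and do not come from either project.

```python
def compare_dumps(wikidata, dbpedia, fields):
    """Compare two dumps keyed by a shared identifier (an assumption).

    Returns the identifiers missing on either side and the field values
    that differ for records present in both dumps.
    """
    wd_ids, dbp_ids = set(wikidata), set(dbpedia)
    report = {
        "missing_in_wikidata": sorted(dbp_ids - wd_ids),
        "missing_in_dbpedia": sorted(wd_ids - dbp_ids),
        "conflicts": [],
    }
    # Only records present in both dumps can be compared field by field.
    for key in sorted(wd_ids & dbp_ids):
        for field in fields:
            wd_value = wikidata[key].get(field)
            dbp_value = dbpedia[key].get(field)
            if wd_value != dbp_value:
                report["conflicts"].append((key, field, wd_value, dbp_value))
    return report


# Made-up placeholder records, not real monument data.
wikidata_dump = {
    "m-001": {"label": "Example Mill", "municipality": "Exampletown"},
    "m-002": {"label": "Example Church", "municipality": "Sampleville"},
}
dbpedia_dump = {
    "m-002": {"label": "Example Church", "municipality": "Samplevile"},
    "m-003": {"label": "Example Bridge", "municipality": "Exampletown"},
}

report = compare_dumps(wikidata_dump, dbpedia_dump, ["label", "municipality"])
for name, entries in report.items():
    print(name, entries)
```

The omissions can be fed back to the side that lacks the record, while the
conflicts are the cases where somebody probably has to look at both values and
decide which one is right.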

Being the GLAM liaison within the Wikimedia Foundation, Sebastiaan is keen to
foster this endeavour wherever possible. He will offer the kinds of support
Wikimedia can provide: meeting rooms and reimbursement of travel expenses.
Wikimedia could also provide due publicity, which might be helpful if we want
to attract volunteers who could help monitor data quality wherever there is no
automated way to do so.

The main work still to be done remains with the Wikidata and DBpedia
communities. We agreed that we had better limit the number of meetings between
(working groups within) both communities. Nonetheless, a kick-off meeting would
be both nice and useful to have. We have two types of meetings in mind:

1. An initial meeting to set up a working approach and to identify, divide and
   assign the work to be done

2. One or several follow-up meetings to discuss progress and tackle problems
   that have arisen and cannot possibly be solved by way of Skype conferences.
   This kind of meeting could be held within the framework of ordinary
   Wikimedia meetings, like the Wiki Saturdays.

As soon as both the Dutch Wikidata and Dutch DBpedia communities have approved
this approach (which is what we hope for), a date for the initial meeting can
be fixed.

Dbpedia-discussion mailing list
