Hi DBpedians!

As you have surely noticed, Google has abandoned Freebase, and it will
be merged into Wikidata [1]. I searched the list but did not find a
discussion about it, so here is my point of view:

When Wikidata was started, I hoped it would quickly become a major
contributor of quality data to the LOD cloud. But although the project
has a potentially massive crowd and is backed by Wikimedia, it does not
really care about the Linked Data paradigm as established in the
Semantic Web. RDF is more of an afterthought than a central concept. It
was a bit disappointing to see that Wikidata's impact on the LOD
community is lacking because of this.

Now Freebase will be integrated into Wikidata as a curated knowledge
base, hardened by Google engineering and no stranger to RDF and Linked
Data. How the integration will be realized is not yet clear, it seems.
One consequence, hopefully, is that the LOD cloud grows by a significant
amount of quality data. But I wonder what the consequences for the
DBpedia project will be. If Wikimedia gets its own knowledge graph,
possibly curated by its crowd, where is the place for DBpedia? Can
DBpedia stay relevant with all the problems of an open source project,
all the difficulties of mapping heterogeneous data in many different
languages, the resulting struggle with data quality and consistency, and
so on?

So I propose being proactive about it:

I see a large problem for DBpedia in the restrictions of the RDF data
model. Triples limit our ability to make statements about statements. I
cannot easily address a fact in DBpedia and annotate it. This means:

    - I cannot denote the provenance of a statement. In particular, I
      cannot denote the source data it comes from. Resource-level
      provenance is not sufficient if further datasets are to be
      integrated into DBpedia in the future.
    - I cannot denote a timespan that limits the validity of a
      statement. Consider the fact that Barack Obama is the president
      of the USA. This fact was not valid at some point in the past and
      won't be valid at some point in the future. Today I might link to
      the DBpedia page of Barack Obama for this fact. But if a DBpedia
      version is published after the next president of the USA has been
      elected, this fact might be missing from DBpedia and my link
      becomes moot.
    - This is a problem of persistence. Being able to download old
      dumps of DBpedia is not a sufficient model of persistence. The
      community struggles to increase data quality, but as soon as a
      new version is published, it drops some of the progress made in
      favour of whatever facts are found in the Wikipedia dumps at the
      time of extraction. The old facts should persist, not only in
      some dump files, but as linkable data.
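
To make the three points above concrete, here is a minimal sketch in
plain Python of a fact that carries its own provenance and validity
timespan. This is not an existing DBpedia structure: the class, the
field names, the ex:presidentOf property and the source URL are all
assumptions made up for illustration.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class AnnotatedStatement:
    """A triple plus annotations (hypothetical model, not DBpedia's)."""
    subject: str
    predicate: str
    obj: str
    source: str                        # statement-level provenance
    valid_from: Optional[date] = None  # None means "no known start"
    valid_to: Optional[date] = None    # None means "still valid / open-ended"

    def valid_at(self, day: date) -> bool:
        """True if the statement holds on the given day."""
        starts_ok = self.valid_from is None or self.valid_from <= day
        ends_ok = self.valid_to is None or day <= self.valid_to
        return starts_ok and ends_ok


def close_statement(stmt: AnnotatedStatement, end: date) -> AnnotatedStatement:
    """Return a copy whose validity ends at `end`; nothing is deleted."""
    return replace(stmt, valid_to=end)


def facts_at(statements: List[AnnotatedStatement], day: date) -> List[AnnotatedStatement]:
    """Select the statements that hold on the given day."""
    return [s for s in statements if s.valid_at(day)]


# Illustrative data: the property and the source URL are assumptions.
obama = AnnotatedStatement(
    subject="dbr:Barack_Obama",
    predicate="ex:presidentOf",
    obj="dbr:United_States",
    source="https://en.wikipedia.org/wiki/Barack_Obama",
    valid_from=date(2009, 1, 20),
)

print(facts_at([obama], date(2012, 1, 1)))  # the statement is selected
print(facts_at([obama], date(2000, 1, 1)))  # nothing is selected
```

With such a model, a new extraction would call something like
close_statement on an outdated fact instead of dropping it, so a link
to the fact keeps resolving.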

Being able to address these problems would also mean being able to fully
import Wikidata, including provenance statements and validity timespans,
and combine it with the DBpedia ontology (which already is an important
focus of development and rightfully so). It also means a persistent
DBpedia that does not start over in the next version.

So how can it be realized? With reification, of course! But most of us
resent the problems reification brings with it, such as the
complications in querying. The reification model itself is also unclear.
There are different proposals: blank nodes, the reification vocabulary,
graph names, creating unique subproperties for each triple, etc. I won't
propose one of these models here; that will surely be subject to
discussion. But DBpedia can propose a model, and the LOD community will
adapt, given DBpedia's standing and impact. I think it is time to raise
the standard of handling provenance and persistence in the LOD cloud,
and DBpedia should make the start. Especially in the face of Freebase
and Wikidata merging, I believe it is imperative for DBpedia to move
forward.
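
Just to illustrate the trade-off, not to recommend a model, here is a
small sketch of two of the options: the standard RDF reification
vocabulary and graph names. The rdf: terms and prov:wasDerivedFrom are
real vocabulary terms. The statement IRI, the graph IRI and the
ex:presidentOf property are hypothetical, and the helper functions are
mine, not part of any library.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PROV = "http://www.w3.org/ns/prov#"


def reify(statement_iri, s, p, o, source):
    """Describe one triple with the RDF reification vocabulary.

    Four triples identify the statement; a fifth attaches provenance.
    """
    return [
        (statement_iri, RDF + "type", RDF + "Statement"),
        (statement_iri, RDF + "subject", s),
        (statement_iri, RDF + "predicate", p),
        (statement_iri, RDF + "object", o),
        (statement_iri, PROV + "wasDerivedFrom", source),
    ]


def as_quad(graph_iri, s, p, o):
    """Alternative: keep the triple intact and put it in a named graph.

    Annotations would then be attached to the graph IRI.
    """
    return (s, p, o, graph_iri)


def to_ntriples(triples):
    """Serialize triples whose terms are all IRIs as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical identifiers, for illustration only.
triples = reify(
    "http://example.org/statement/1",
    "http://dbpedia.org/resource/Barack_Obama",
    "http://example.org/presidentOf",
    "http://dbpedia.org/resource/United_States",
    "https://en.wikipedia.org/wiki/Barack_Obama",
)
print(to_ntriples(triples))
```

The reified form turns one fact into five triples and every query has
to join over them, which is the querying complication mentioned above.
The quad form keeps the fact as it is, but needs a store and a
serialization that support graph names.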

regards,
Martin

[1] https://plus.google.com/109936836907132434202/posts/bu3z2wVqcQc

_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion