When I created a KB out of what I thought should be the high-quality
information in the canonicalized 2016-10 dataset of DBpedia, I noticed
systematic errors in the types of nodes.  For example, Tree,
http://wikidata.dbpedia.org/resource/Q10884, is an instance of both
http://dbpedia.org/ontology/Agent and
http://dbpedia.org/ontology/WrittenWork, and has many other incorrect
types as well.  Vegetable, http://wikidata.dbpedia.org/resource/Q11004, has
similar problems.

I traced these errors back to the following triples, shown as file:triple:
instance_types_wkd_uris_eo.ttl:<http://wikidata.dbpedia.org/resource/Q10884> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Bird> .
instance_types_wkd_uris_it.ttl:<http://wikidata.dbpedia.org/resource/Q10884> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Person> .
instance_types_wkd_uris_ru.ttl:<http://wikidata.dbpedia.org/resource/Q10884> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Book> .
instance_types_wkd_uris_eo.ttl:<http://wikidata.dbpedia.org/resource/Q11004> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Bird> .
instance_types_wkd_uris_it.ttl:<http://wikidata.dbpedia.org/resource/Q11004> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Person> .
instance_types_wkd_uris_ru.ttl:<http://wikidata.dbpedia.org/resource/Q11004> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Book> .
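
For anyone who wants to repeat this kind of lookup, here is a minimal
sketch in Python.  It assumes the dump files have been decompressed and
contain one IRI-only triple per line, as in the excerpts above.  The
function names types_of and types_by_file are hypothetical helpers made
up for this illustration, not part of any DBpedia tooling.

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Matches a line whose subject, predicate and object are all IRIs.
# Lines with literals, blank nodes or comments are simply skipped.
TRIPLE_RE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def types_of(lines, subject):
    """Return the rdf:type objects asserted for `subject` in `lines`."""
    found = []
    for line in lines:
        m = TRIPLE_RE.match(line)
        if m and m.group(1) == subject and m.group(2) == RDF_TYPE:
            found.append(m.group(3))
    return found


def types_by_file(paths, subject):
    """Map each file path to the types it asserts for `subject`.

    Files that assert no type for `subject` are left out of the result.
    """
    result = {}
    for path in paths:
        with open(path, encoding="utf-8") as fh:
            types = types_of(fh, subject)
        if types:
            result[path] = types
    return result
```

Calling types_by_file on the per-language instance_types_wkd_uris_*.ttl
files with http://wikidata.dbpedia.org/resource/Q10884 as the subject
should show which language edition contributes which type.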

I then looked at instance_types_it.ttl and noticed that it contains a lot
of incorrect instances of http://dbpedia.org/ontology/Person.  Judging from
the first few lines of the file that assert this type, a large majority of
them are incorrect.  It thus appears that something has gone very wrong in
the extraction of information for Italian DBpedia.  Similarly, it appears
that something has gone very wrong in the extraction of information for
Esperanto DBpedia.  I can't make sense of the analogous file in Russian
DBpedia, but it appears to have far too many instances of Book, indicating
that there is something very wrong there as well.
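
One rough way to check whether a single type dominates a file is to
tally the rdf:type objects and look at their shares.  The sketch below
is only an illustration under the same assumption of one IRI-only
triple per line.  The names type_counts and type_shares are
hypothetical, and what share counts as "far too many" is a judgment
call that the code does not make.

```python
from collections import Counter

RDF_TYPE_TERM = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"


def type_counts(lines):
    """Count how often each class appears as an rdf:type object."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        # An IRI-only statement splits into subject, predicate, object, "."
        if len(parts) == 4 and parts[1] == RDF_TYPE_TERM and parts[3] == ".":
            counts[parts[2].strip("<>")] += 1
    return counts


def type_shares(lines):
    """Return each class's fraction of all rdf:type statements."""
    counts = type_counts(lines)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.items()}
```

Passing an open file such as instance_types_it.ttl to type_shares gives
a per-class fraction that can be compared across language editions.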

The large number of errors that I have uncovered means that I can't count on
information from these parts of DBpedia.   That's regrettable, as I would
like to include as much information as possible.  But what is really
problematic is that now I don't see how I can count on any DBpedia
information.

What is the way forward here?   Are there some parts of DBpedia that are
known not to have these sorts of systematic problems?


Peter F. Patel-Schneider
