When I created a KB out of what I thought should be the high-quality information in the canonicalized 2016-10 dataset of DBpedia, I noticed that there are some systematic errors in the types of nodes. For example, Tree, http://wikidata.dbpedia.org/resource/Q10884, is an instance of both http://dbpedia.org/ontology/Agent and http://dbpedia.org/ontology/WrittenWork, as well as a lot of other incorrect types. Vegetable, http://wikidata.dbpedia.org/resource/Q11004, has similar problems.
I traced these errors back to the following files:

instance_types_wkd_uris_eo.ttl:<http://wikidata.dbpedia.org/resource/Q10884> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Bird> .
instance_types_wkd_uris_it.ttl:<http://wikidata.dbpedia.org/resource/Q10884> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Person> .
instance_types_wkd_uris_ru.ttl:<http://wikidata.dbpedia.org/resource/Q10884> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Book> .
instance_types_wkd_uris_eo.ttl:<http://wikidata.dbpedia.org/resource/Q11004> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Bird> .
instance_types_wkd_uris_it.ttl:<http://wikidata.dbpedia.org/resource/Q11004> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Person> .
instance_types_wkd_uris_ru.ttl:<http://wikidata.dbpedia.org/resource/Q11004> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Book> .

I then looked at instance_types_it.ttl and noticed that there were a lot of incorrect instances of http://dbpedia.org/ontology/Person. Judging from the first few lines in the file that assert this type, a large majority of them appear to be incorrect. It thus appears that something has gone very wrong in the extraction of information for Italian DBpedia. Similarly, it appears that something has gone very wrong in the extraction of information for Esperanto DBpedia. I can't make sense of the analogous file in Russian DBpedia, but it appears to have far too many instances of Book, indicating that there is something very wrong there as well.

The large number of errors that I have uncovered means that I can't count on information from these parts of DBpedia. That's regrettable, as I would like to include as much information as possible. But what is really problematic is that now I don't see how I can count on any DBpedia information.
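For anyone who wants to look at their own copy of the dumps, here is a minimal Python sketch of the kind of check described above: it tallies how often each type is asserted in one instance_types file and collects, per node, which file asserts which type. It assumes the files contain one simple `<s> <p> <o> .` triple per line, as in the excerpts above; the function names are my own and are not part of any DBpedia tooling.

```python
import re
from collections import Counter, defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Matches one simple triple of the form <s> <p> <o> . (IRIs only, no literals).
TRIPLE = re.compile(r"^<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.\s*$")


def read_types(lines):
    """Yield (subject, type) for every rdf:type triple; skip all other lines."""
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match and match.group(2) == RDF_TYPE:
            yield match.group(1), match.group(3)


def type_counts(lines):
    """Count how many times each type is asserted in one file."""
    return Counter(rdf_type for _, rdf_type in read_types(lines))


def types_per_subject(files):
    """Map each subject to the set of (file name, type) pairs asserting it.

    `files` is a hypothetical mapping from file name to an iterable of lines.
    """
    result = defaultdict(set)
    for name, lines in files.items():
        for subject, rdf_type in read_types(lines):
            result[subject].add((name, rdf_type))
    return result
```

Passing an open file handle for instance_types_it.ttl to `type_counts` and looking at `.most_common(10)` would show whether Person dominates the file; `types_per_subject` over the three wkd_uris files would list the nodes that pick up unrelated types from different language editions. It only counts assertions and cannot say which of them are wrong.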
What is the way forward here? Are there some parts of DBpedia that are known not to have these sorts of systematic problems?

Peter F. Patel-Schneider