Hi Johann,

If you look into the source code of the pages, for instance the
Russian one for Obama, you will find that the list of Obama's books is
defined with the "книга" template (the translation is "book") - search for
"{{книга" in the source code. And - guess what - there is a mapping
for the infobox with the same name, to the Book class of the DBpedia
ontology:
http://mappings.dbpedia.org/index.php/Mapping_ru:%D0%9A%D0%BD%D0%B8%D0%B3%D0%B0

The extraction framework just doesn't know which one of the infoboxes
on the page is "the main one" - there actually seems to be no standard
way to specify this in the Wiki page code. So a heuristic is used
- roughly, finding anything starting with "{{" and trying to treat it
as a source of the DBpedia type - and this is exactly how you get the
strange types.
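
To make the effect concrete, here is a rough Python sketch of such a
heuristic. It is not the extraction framework's actual code: the function
name, the regular expression, the sample wikitext, and the "person infobox"
mapping are all made up for illustration. Only the "книга" to Book mapping
comes from the mappings wiki.

```python
import re

# Template name -> ontology class. Only the "книга" entry reflects a real
# mapping; "person infobox" is a hypothetical placeholder.
TEMPLATE_TO_CLASS = {
    "person infobox": "Person",
    "книга": "Book",
}

# Matches "{{" followed by a template name, up to the first "|", brace or newline.
TEMPLATE_START = re.compile(r"\{\{\s*([^|{}\n]+)")


def naive_types(wikitext, mappings):
    """Return the class of every mapped template found anywhere on the page.

    There is no notion of a "main" infobox here: each "{{name" that has a
    mapping contributes a type, which is how a page about a person can end
    up typed as a Book.
    """
    types = []
    for match in TEMPLATE_START.finditer(wikitext):
        name = match.group(1).strip().lower()
        cls = mappings.get(name)
        if cls is not None and cls not in types:
            types.append(cls)
    return types


# Made-up page source: one infobox about the person, one book citation.
SAMPLE = """{{Person infobox
| name = Barack Obama
}}
Some article text.
* {{книга |автор = Барак Обама |заглавие = Example title}}
"""

print(naive_types(SAMPLE, TEMPLATE_TO_CLASS))
```

On this sample the page gets both the Person and the Book type, because
the book template in the bibliography is treated the same way as the
infobox at the top.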

Cheers,
Volha




> Quoting Johann Petrak <johann.pet...@gmail.com>:
>
>> The file
>> http://downloads.dbpedia.org/2015-04/core-i18n/de/instance-types-en-uris_de.ttl.bz2
>> contains triples which I cannot seem to find in any online version and
>> which appear to be wrong, for example:
>>
>> <http://de.dbpedia.org/resource/Barack_Obama> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Quote>
>>
>> Which says that Barack_Obama is a quote, which he isn't :)
>>
>> There is also the triple:
>> <http://de.dbpedia.org/resource/Angela_Merkel> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Quote>
>>
>> There are similar odd entries in the file
>> instance-types-en-uris_de.ttl.bz2, for example
>>
>> <http://dbpedia.org/resource/George_W._Bush> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/WrittenWork>
>> <http://dbpedia.org/resource/Angela_Merkel> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Quote>
>>
>> In the Russian file instance-types-en-uris_ru.ttl.bz2 I find the following
>> triple:
>> <http://dbpedia.org/resource/Barack_Obama> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Book>
>> which is also wrong.
>>
>> Where do all these incorrect entries come from? Wikipedia does not seem to
>> contain any of these.
>> Since Barack_Obama is a rather prominent entry (so one would assume that
>> errors in Wikipedia or DBpedia should get caught quickly), I am worried
>> about the information that may be in there for less prominent entries.
>>
>> Many thanks,
>>
>>  Johann




------------------------------------------------------------------------------
_______________________________________________
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
