We are the team behind the Airpedia project [1], which aims to extend DBpedia's class and property coverage of Wikipedia using machine learning techniques.

We read about the "mapping sprint", so we would like to bring to your attention the resources we are producing for DBpedia. We think they can help the community speed up the mapping process.

Wrong mappings

The basic idea of our approach is to use DBpedia resources as training data. For this reason, we have to be sure that the mappings are correct. We therefore implemented a cross-language validation to discover wrong mappings. We found some obvious errors that we think should be corrected before the release of DBpedia 3.9. See the attachment for the list of these mappings.
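This message does not say how the cross-language validation works. One possible reading, sketched below purely as an illustration, is a vote across languages: if the same entity is described by templates in several languages, a template whose mapped class disagrees with the class most other languages assign is flagged for review. The function name, the data layout, the entity identifiers and the non-Czech template names are all hypothetical, and the sketch ignores the class hierarchy (a subclass counts as a disagreement).

```python
from collections import Counter, defaultdict

def flag_suspicious_mappings(mappings, entity_templates, min_votes=2):
    """Flag template mappings that disagree with other languages.

    mappings: {(lang, template): dbpedia_class}
    entity_templates: {entity_id: [(lang, template), ...]}
    Returns {(lang, template): (mapped_class, majority_class)}.
    This is an assumed voting scheme, not the Airpedia implementation.
    """
    votes = defaultdict(Counter)
    for templates in entity_templates.values():
        for key in templates:
            if key not in mappings:
                continue
            # Collect the classes that templates in OTHER languages assign
            # to the same entity.
            for other in templates:
                if other[0] != key[0] and other in mappings:
                    votes[key][mappings[other]] += 1

    flagged = {}
    for key, counter in votes.items():
        majority_class, count = counter.most_common(1)[0]
        if count >= min_votes and majority_class != mappings[key]:
            flagged[key] = (mappings[key], majority_class)
    return flagged

# Toy data: only the Czech mapping comes from the attached list;
# the other template names and the entity ids are made up.
mappings = {
    ("cs", "Kosmonaut"): "Airline",
    ("en", "en_astronaut_template"): "Astronaut",
    ("it", "it_astronaut_template"): "Astronaut",
    ("de", "de_astronaut_template"): "Astronaut",
}
entity_templates = {
    "entity_1": [("cs", "Kosmonaut"), ("en", "en_astronaut_template"),
                 ("it", "it_astronaut_template"), ("de", "de_astronaut_template")],
    "entity_2": [("cs", "Kosmonaut"), ("en", "en_astronaut_template"),
                 ("it", "it_astronaut_template")],
}

flagged = flag_suspicious_mappings(mappings, entity_templates)
```

On this toy input only the Czech template is flagged, with Astronaut as the class suggested by the other languages. A flagged mapping would still need a human check, since the majority can be wrong too.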

Automatic class mappings

In a paper accepted at the I-KNOW conference [2], we present a resource obtained by automatically mapping Wikipedia templates in 25 languages. Our approach replicates the human mappings with high reliability and produces an additional set of mappings not included in the original DBpedia. The resource can be downloaded from the resource section [3] of the Airpedia website and consists of CSV files with two columns: Wikipedia infobox name and DBpedia class.
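As a rough illustration of how such two-column files could be consumed, here is a minimal sketch using Python's standard csv module. It assumes comma-separated, UTF-8 files without a header row, which should be checked against the actual downloads; the function name and the sample rows (borrowed from the corrections in the attachment) are illustrative only, not real file contents.

```python
import csv
import io

def load_mappings(handle):
    """Read two-column rows (source name, DBpedia target) into a dict.

    Assumes comma-separated rows with no header; rows that do not have
    exactly two fields are skipped.
    """
    result = {}
    for row in csv.reader(handle):
        if len(row) != 2:
            continue
        name, target = (field.strip() for field in row)
        result[name] = target
    return result

# Illustrative sample standing in for one downloaded file.
sample = io.StringIO(
    "Infobox_Cyklistický_závod,CyclingCompetition\n"
    "Kosmonaut,Astronaut\n"
)
class_mappings = load_mappings(sample)
```

For a real file, the same function can be called on a handle opened with open(path, encoding="utf-8", newline=""), where path is wherever the download was saved.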

Automatic property mappings

In a second paper, submitted to the ISWC conference [4], we focus on automatically mapping infobox attributes to properties in the DBpedia ontology, either to extend the coverage of the existing localised versions or to build versions from scratch for languages not yet covered. We report results comparable to those obtained by a human annotator in terms of precision, while our approach leads to a significant improvement in recall and speed. Specifically, we mapped 45,978 Wikipedia infobox attributes to DBpedia properties in 14 different languages for which mappings were not yet available. Again, the resource can be downloaded from the resource section [3] of the Airpedia website and consists of CSV files with two columns: Wikipedia infobox attribute name and DBpedia property.

Enhanced coverage of DBpedia over classes in 31 languages

Following the work already presented at the ESWC conference [5], we extend DBpedia's coverage to pages that have no infobox. The resource contains 10M computed entity types. It is available in RDF format and can be downloaded from the resource section [3] of our website.

Integration in Italian DBpedia

The Italian DBpedia team is the first adopter of our dataset. Next week a new version of the SPARQL endpoint containing our statements will be released. Stay tuned!



[1] http://www.airpedia.org

[2] http://i-know.tugraz.at/

[3] http://www.airpedia.org/download/

[4] http://iswc2013.semanticweb.org/

[5] http://2013.eswc-conferences.org/

cs:Kosmonaut > Airline
(should be Astronaut)

cs:Infobox_Kosmická_loď > Spacecraft
(should be SpaceMission)

cs:Infobox_Cyklistický_závod > CyclingLeague
(should be CyclingCompetition)

es:Ficha_de_taxón > Tax
(should be Species)

tr:Türkiye_il_bilgi_kutusu > Unknown
(should be AdministrativeRegion)

ru:Книга > Book
(should be deleted, as it refers to citations)

sl:Infopolje_Glasbeni_ustvarjalec > Musical
(should be MusicalArtist)

hu:Tenisztorna_infobox > TennisLeague
(should be TennisTournament)

hu:Iskola_infobox > School
(should be EducationalInstitution, as it also contains universities)
