Hi again,

On 05.07.2017 17:58, Sebastian Hellmann wrote:
On 05.07.2017 17:43, John Flynn wrote:

I have long been curious about the DBpedia ontology structure so I just took a look at the ontology represented in (https://dl.dropboxusercontent.com/u/375401/dbo_no_mappings.nt) as referenced in the email below.

I normally start the evaluation of an ontology by looking at the top-down class relationships. So, I did a search for the classes that were listed as a direct subclass of owl#Thing to get a general idea of the organization of the DBpedia class structure.

To say the least, I was sorely disappointed. Here are a few of the DBpedia classes that are direct subclasses of owl#Thing: Food, Media, Work, Blazon, Altitude, Language, Currency, Statistic, Diploma, Award, Agent, PublicService, Disease, GrossDomesticProdutPerCapita, ElectionDiagram, Demographics, Relationship, Medicine, List, BioMolecule. I gave up after this small sample. It is obvious that the DBpedia community needs to worry a lot more about the structure of the ontology itself rather than focusing on selecting a new editor. A working group needs to be established to go back to the drawing board and look at the DBpedia ontology from the top down. It certainly doesn't make much sense as it is currently structured.


Yes, and we are doing exactly that in a parallel process. Ideally, both will be finished at the same time, i.e. the cleanup as well as the new editor with automatic validation of edits via SHACL/RDFUnit in a pre-commit hook, plus editorial guidelines and process.

To give you a concrete idea how this will work:

The editorial guideline would state: "Thou shalt not add more top level classes", and below I added an example of an RDFUnit test case [1] [2] that finds all the classes you mentioned. Each commit will trigger this test on the continuous integration server, and since a violation is logged as an error, the commit will automatically fail. The test needs to be run on Jena with Pellet or another reasoning-enabled in-memory SPARQL engine.
We are discussing these test cases at the moment in parallel.


rutt:no-new-top-level-classes
    a       rut:ManualTestCase ;
    dcterms:description "Retrieves all top-level classes, i.e. direct subclasses of owl:Thing, and throws an error if these are not in the selected list of top classes" ;
    rut:appliesTo rut:Schema ;
    rut:generated rut:ManuallyGenerated ;
    rut:references <http://dbpedia.org/ontology/> ;
    rut:source <http://dbpedia.org/ontology/> ;
    rut:testCaseLogLevel rlog:ERROR ;
    rut:sparqlWhere """ {
        ?class rdfs:subClassOf owl:Thing .
        FILTER NOT EXISTS {
            ?class rdfs:subClassOf ?otherClass .
            FILTER (?otherClass != owl:Thing) .
        }
        FILTER (?class != <top-level-class1> && ?class != <top-level-class2> && .....)
    } """ ;

   # snip ....
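
To make the logic of the check easier to follow outside of RDFUnit, here is a minimal sketch of the same idea in plain Python. It is not how RDFUnit implements it; the function names, the sample axioms and the approved list are all hypothetical and only serve as illustration. It assumes the subclass axioms are already available as (subclass, superclass) pairs, possibly including links inferred by a reasoner.

```python
from typing import Dict, Iterable, List, Set, Tuple

OWL_THING = "owl:Thing"


def unapproved_top_level_classes(
    subclass_axioms: Iterable[Tuple[str, str]],
    approved: Set[str],
) -> List[str]:
    """Return classes whose only superclass is owl:Thing and that are not approved."""
    parents: Dict[str, Set[str]] = {}
    for child, parent in subclass_axioms:
        parents.setdefault(child, set()).add(parent)

    offenders = []
    for cls, supers in parents.items():
        if cls == OWL_THING:
            continue
        # Ignore owl:Thing itself and the reflexive link a reasoner may add.
        other_supers = supers - {OWL_THING, cls}
        if OWL_THING in supers and not other_supers and cls not in approved:
            offenders.append(cls)
    return sorted(offenders)


def exit_code(offenders: List[str]) -> int:
    """Non-zero means the commit should be rejected (the rlog:ERROR case)."""
    return 1 if offenders else 0


# Hypothetical sample data, not the real DBpedia ontology.
AXIOMS = [
    ("dbo:Agent", "owl:Thing"),
    ("dbo:Work", "owl:Thing"),
    ("dbo:Altitude", "owl:Thing"),
    ("dbo:Person", "dbo:Agent"),
    ("dbo:Person", "owl:Thing"),   # as a reasoner would infer
    ("dbo:Person", "dbo:Person"),  # reflexive link, as a reasoner may infer
]
# Hypothetical editorial decision on which top-level classes are allowed.
APPROVED = {"dbo:Agent", "dbo:Work"}

OFFENDERS = unapproved_top_level_classes(AXIOMS, APPROVED)
```

With this sample data only dbo:Altitude is reported, because dbo:Person has dbo:Agent as an additional superclass and the other two classes are on the approved list. A hook script could pass exit_code(OFFENDERS) to sys.exit so that the commit is rejected whenever the list is non-empty.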


[1] http://rdfunit.aksw.org/
[2] Dimitris Kontokostas, Patrick Westphal, Sören Auer, Sebastian Hellmann, Jens Lehmann, Roland Cornelissen, and Amrapali J. Zaveri. Test-driven evaluation of linked data quality. In Proceedings of the 23rd International Conference on World Wide Web. http://svn.aksw.org/papers/2014/WWW_Databugger/public.pdf


--
All the best,
Sebastian Hellmann

Director of Knowledge Integration and Linked Data Technologies (KILT) Competence Center
at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org, http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org
_______________________________________________
DBpedia-discussion mailing list
DBpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion
