Hi again,
On 05.07.2017 17:58, Sebastian Hellmann wrote:
On 05.07.2017 17:43, John Flynn wrote:
I have long been curious about the DBpedia ontology structure so I
just took a look at the ontology represented in
(https://dl.dropboxusercontent.com/u/375401/dbo_no_mappings.nt) as
referenced in the email below.
I normally start the evaluation of an ontology by looking at the
top-down class relationships. So, I did a search for the classes that
were listed as a direct subclass of owl#Thing to get a general idea
of the organization of the DBpedia class structure.
To say the least, I was sorely disappointed. Here are a few of the
DBpedia classes that are direct subclasses of owl#Thing: Food, Media,
Work, Blazon, Altitude, Language, Currency, Statistic, Diploma,
Award, Agent, PublicService, Disease, GrossDomesticProdutPerCapita,
ElectionDiagram, Demographics, Relationship, Medicine, List,
BioMolecule. I gave up after this small sample. It is obvious that
the DBpedia community needs to worry a lot more about the structure
of the ontology itself rather than focusing on selecting a new
editor. A working group needs to be established to go back to the
drawing board and look at the DBpedia ontology from the top down. It
certainly doesn't make much sense as it is currently structured.
Yes, and we are doing exactly that in a parallel process. Ideally,
everything will be finished at the same time, i.e. the cleanup, the new
editor, automatic validation of edits via SHACL/RDFUnit in a pre-commit
hook, and the editorial guidelines and process.
To give you a concrete idea of how this will work:
The editorial guideline would state: "Thou shalt not add more top-level
classes", and below I have added an example of an RDFUnit test case [1] [2]
that finds all the classes you mentioned.
Each commit will run this test on the continuous integration server, and
since the test is logged at ERROR level, a commit that fails it will not
pass automatically. The test needs to be run on Jena with Pellet or
another reasoning-enabled in-memory SPARQL engine.
We are discussing these test cases in parallel at the moment.
rutt:no-new-top-level-classes
    a rut:ManualTestCase ;
    dcterms:description """Retrieves all top-level classes, i.e. direct
        subclasses of owl:Thing, and throws an error if any of them is
        not in the selected list of top classes.""" ;
    rut:appliesTo rut:Schema ;
    rut:generated rut:ManuallyGenerated ;
    rut:references <http://dbpedia.org/ontology/> ;
    rut:source <http://dbpedia.org/ontology/> ;
    rut:testCaseLogLevel rlog:ERROR ;
    rut:sparqlWhere """ {
        ?class rdfs:subClassOf owl:Thing .
        FILTER NOT EXISTS {
            ?class rdfs:subClassOf ?otherClass .
            FILTER (?otherClass != owl:Thing) .
        }
        FILTER (?class != <top-level-class1> &&
                ?class != <top-level-class2> && .....)
    } """ ;
    # snip ....
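
To make the logic of the test case above easier to follow, here is a minimal sketch of the same check in plain Python. It is not RDFUnit and does no reasoning; it only mirrors the query over a list of (class, superclass) pairs. The function name, the toy triples and the allowed list are all invented for illustration.

```python
from typing import Iterable, Set, Tuple

OWL_THING = "owl:Thing"


def unexpected_top_level_classes(
    subclass_of: Iterable[Tuple[str, str]],
    allowed: Set[str],
) -> Set[str]:
    """Return classes whose only asserted superclass is owl:Thing
    and that are not in the allowed list of top-level classes."""
    pairs = set(subclass_of)
    # ?class rdfs:subClassOf owl:Thing
    direct_children = {c for c, parent in pairs if parent == OWL_THING}
    # FILTER NOT EXISTS { ?class rdfs:subClassOf ?otherClass . FILTER (?otherClass != owl:Thing) }
    has_other_parent = {c for c, parent in pairs if parent != OWL_THING}
    # FILTER (?class != <top-level-class1> && ...)
    return direct_children - has_other_parent - allowed


# Toy data, invented for illustration only (not taken from the real ontology dump).
triples = [
    ("dbo:Agent", "owl:Thing"),
    ("dbo:Work", "owl:Thing"),
    ("dbo:Altitude", "owl:Thing"),
    ("dbo:Person", "dbo:Agent"),
    ("dbo:Person", "owl:Thing"),
]
allowed_top = {"dbo:Agent", "dbo:Work"}  # hypothetical "selected list of top classes"

violations = unexpected_top_level_classes(triples, allowed_top)
print(sorted(violations))  # ['dbo:Altitude']
```

In a pre-commit hook or CI job, a non-empty result would be treated as a failed check, which is roughly what the ERROR log level on the test case is meant to achieve.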
[1] http://rdfunit.aksw.org/
[2] Test-driven evaluation of linked data quality
<http://svn.aksw.org/papers/2014/WWW_Databugger/public.pdf>. Dimitris
Kontokostas, Patrick Westphal, Sören Auer, Sebastian Hellmann, Jens
Lehmann, Roland Cornelissen, and Amrapali J. Zaveri in Proceedings of
the 23rd International Conference on World Wide Web.
--
All the best,
Sebastian Hellmann
Director of Knowledge Integration and Linked Data Technologies (KILT)
Competence Center
at the Institute for Applied Informatics (InfAI) at Leipzig University
Executive Director of the DBpedia Association
Projects: http://dbpedia.org, http://nlp2rdf.org,
http://linguistics.okfn.org, https://www.w3.org/community/ld4lt
Homepage: http://aksw.org/SebastianHellmann
Research Group: http://aksw.org