On 21/02/13 16:46, [email protected] wrote:
Hi,
I imported the NCI Thesaurus into a TDB store. I am using the Jena API to get
references to existing classes. This is my basic setup:
Dataset ds = TDBFactory.createDataset("C:\\Playground\\Ontology\\TDBStore_Instances");
OntModel modelOnt = ModelFactory.createOntologyModel(OntModelSpec.RDFS_MEM, ds.getDefaultModel());
For some classes the getOntClass() method works fine, e.g.:
modelOnt.getOntClass(NS_NCI_HASH + "Neoplasm");
modelOnt.getOntClass(NS_NCI_HASH + "Volume");
But for other classes, getOntClass() returns null, e.g.:
modelOnt.getOntClass(NS_NCI_HASH + "Carcinoma");
modelOnt.getOntClass(NS_NCI_HASH + "Malignant_Prostate_Neoplasm");
I also tried to get an OntClass reference for these in other ways, e.g. via
modelOnt.listStatements(...) and stmt.getSubject().as(OntClass.class).
But this throws the exception:
Cannot convert node
http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#Malignant_Prostate_Neoplasm
to OntClass: it does not have rdf:type owl:Class or equivalent
The problem is that
http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#Malignant_Prostate_Neoplasm
is an owl:Class. I verified this by just printing
stmt.getSubject().listProperties() to the console:
[http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#Malignant_Prostate_Neoplasm,
http://www.w3.org/1999/02/22-rdf-syntax-ns#type,
http://www.w3.org/2002/07/owl#Class]
I also used a SPARQL query on the same data set:
String query = PREFIXES
+ " SELECT * "
+ " WHERE { "
+ " nci:Malignant_Prostate_Neoplasm ?p ?o . "
+ " }";
Which prints this:
p | o
rdf:type | owl:Class
nci:Preferred_Name | "Malignant Prostate Neoplasm"^^xsd:string
...
So based on what I can see (and know), Carcinoma and
Malignant_Prostate_Neoplasm are both owl:Class, but getOntClass() does not seem
to agree.
Does anybody know why?
No :)
If your subject resource really does have an rdf:type owl:Class
assertion then that's enough to allow the as(OntClass.class) to go through.
If I download that ontology I don't see URIs like that; they are all of
the form #Cxxxxxx, so it's hard to check. Presumably you have done some
sort of transformation on the data.
If you really are using identical URIs in the getOntClass and the
listStatements calls then I can't see how that could happen, short of
something drastic like a corrupt TDB database, but you would know about that.
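
One way to rule out a URI mismatch is to compare the string passed to getOntClass with the subject URI printed from listStatements, character by character. The following is only a minimal sketch using plain Java strings; the class name UriDiff and the trailing-space example are hypothetical and not taken from the original data.

```java
public final class UriDiff {
    private UriDiff() {}

    /**
     * Returns the index of the first character at which the two strings
     * differ, or -1 if they are identical. If one string is a prefix of
     * the other, the length of the shorter string is returned.
     */
    public static int firstDifference(String a, String b) {
        int n = Math.min(a.length(), b.length());
        for (int i = 0; i < n; i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return i;
            }
        }
        return a.length() == b.length() ? -1 : n;
    }

    public static void main(String[] args) {
        // URI as built in code (namespace constant + local name).
        String fromCode = "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#Carcinoma";
        // Hypothetical URI as read back from the store, with a stray trailing space.
        String fromStore = "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#Carcinoma ";

        int idx = firstDifference(fromCode, fromStore);
        System.out.println(idx < 0 ? "identical" : "first difference at index " + idx);
    }
}
```

In practice the second string would come from stmt.getSubject().getURI() rather than a literal.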
You do have the workaround of calling setStrictMode(false) on the model.
Perhaps if you do that and then examine the OntClass you get back, some
explanation might be revealed.
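
The workaround could be tried along these lines. This is a sketch under assumptions, not a tested reproduction of the problem. It uses a small in-memory model instead of the TDB store, the org.apache.jena.* package names of Jena 3 and later (Jena 2.x releases used com.hp.hpl.jena.*), and a hypothetical class name, StrictModeSketch. The namespace value is the one shown in the exception message above.

```java
import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.vocabulary.OWL;
import org.apache.jena.vocabulary.RDF;

public final class StrictModeSketch {
    // Namespace as it appears in the exception message in the question.
    public static final String NS_NCI_HASH =
            "http://ncicb.nci.nih.gov/xml/owl/EVS/Thesaurus.owl#";

    private StrictModeSketch() {}

    /** Builds a stand-in model: same OntModelSpec as the question, one owl:Class. */
    public static OntModel buildDemoModel() {
        OntModel m = ModelFactory.createOntologyModel(OntModelSpec.RDFS_MEM);
        m.createResource(NS_NCI_HASH + "Carcinoma").addProperty(RDF.type, OWL.Class);
        return m;
    }

    /** Looks up a class with strict checking switched off, then restores the old setting. */
    public static OntClass lookupNonStrict(OntModel m, String uri) {
        boolean previous = m.strictMode();
        m.setStrictMode(false);
        try {
            return m.getOntClass(uri);
        } finally {
            m.setStrictMode(previous);
        }
    }

    public static void main(String[] args) {
        OntModel m = buildDemoModel();
        String uri = NS_NCI_HASH + "Carcinoma";

        // Strict lookup, as in the question; this may come back null.
        System.out.println("strict:     " + m.getOntClass(uri));

        // Non-strict lookup, then inspect what the model says about the resource.
        OntClass c = lookupNonStrict(m, uri);
        System.out.println("non-strict: " + c);
        if (c != null) {
            c.listRDFTypes(false)
             .forEachRemaining(t -> System.out.println("  rdf:type " + t));
        }
    }
}
```

Against the real data, modelOnt from the question would take the place of buildDemoModel(), and the listed rdf:type values are the first thing to compare with the SPARQL output.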
Dave