Re: Specifying An EntityDefinition when Building a Jena TDB index

Carlos S. Zamudio Tue, 26 Nov 2013 12:27:02 -0800

Below is a somewhat more complete example of my problem. When I run this, the TDB model is created in the specified directory, but the index directory contains only a couple of segments files, not a complete index. (If I run the example supplied with the jena-text release, I do get an index created.) So I'm guessing that the way I am creating an index from an existing TDB source is not persisting the index for some reason.
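The "only segments files" symptom can be checked programmatically. The following is a minimal sketch using only the Java standard library. It rests on the assumption that a Lucene index holding no documents contains nothing beyond `segments*` bookkeeping files (and possibly a `write.lock`). The class and method names are hypothetical and are not part of Jena or Lucene.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class IndexDirCheck {

    // Names of files that are neither "segments*" bookkeeping files nor the lock file.
    static List<String> nonSegmentsFiles(Path indexDir) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(indexDir)) {
            for (Path p : stream) {
                String name = p.getFileName().toString();
                if (!name.startsWith("segments") && !name.equals("write.lock")) {
                    names.add(name);
                }
            }
        }
        return names;
    }

    // Assumption: a directory with only segments files holds no indexed documents.
    static boolean looksEmpty(Path indexDir) throws IOException {
        return nonSegmentsFiles(indexDir).isEmpty();
    }

    public static void main(String[] args) throws IOException {
        // Default path taken from the example in this message.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "E:/tdbindex/AAA");
        if (!Files.isDirectory(indexDir)) {
            System.out.println("Not a directory: " + indexDir);
            return;
        }
        System.out.println("Looks empty: " + looksEmpty(indexDir));
    }
}
```

Running it against the index directory before and after loading data would show whether any documents were actually written.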


   String modelUrl = "file:///E:/skos/AAA.xml";
   String modelDirectory = "E:/tdb/AAA";
   File indexPath = new File("E:/tdbindex/AAA");
   Directory directory = FSDirectory.open(indexPath);
   //
   Dataset modelDataset = null;
   Dataset indexedDataset = null;
   try {
      modelDataset = TDBFactory.createDataset(modelDirectory);
      //
      Model modelBase = modelDataset.getDefaultModel();
      modelBase.read(modelUrl);
      //
      Model defaultModel = modelDataset.getDefaultModel();
      StmtIterator si = defaultModel.listStatements();
      System.out.println("Number of model statements: " + si.toList().size());
      //
      EntityDefinition entDef =
         new EntityDefinition(PREF_LABEL_PROPERTY, "prefLabel", RDFS.label.asNode());
      //
      indexedDataset = TextDatasetFactory.createLucene(modelDataset, directory, entDef);

      defaultModel = indexedDataset.getDefaultModel();
      si = defaultModel.listStatements();
      System.out.println("Number of model statements: " + si.toList().size());
   }
   catch (Exception e) {
      throw e;
   }
   finally {
      if (modelDataset != null) { modelDataset.close(); }
      try { if (indexedDataset != null) { indexedDataset.close(); } }
      catch (Exception e) {}
   }

Thanks for any suggestions.



On 11/26/2013 3:23 AM, Andy Seaborne wrote:

Carlos,

Do you have a complete, minimal example? Your description looks OK, but the details matter. What is the code to set up the index?


    Andy


On 26/11/13 00:47, Carlos S. Zamudio wrote:

Hi,

I'm having a bit of trouble deciphering the specification of the
EntityDefinition when constructing a Jena TDB index using the jena-text
module in 2.11.0. (I've been successfully using the previous LARQ module
for indexing RDF data sets).

I am attempting to index a data set that represents a SKOS vocabulary.
Below is an example entry in the model:

    <http://purl.obolibrary.org/obo/ID_62354>
             skos:broader      <http://purl.obolibrary.org/id/ID_35317> ;
             skos:prefLabel    "The preferred label for the entity" ;
             skos:hiddenLabel  "The hidden label for the entity" ;
             skos:altLabel     "An alternative label for the entity" ;
             rdf:type          skos:Concept

The skos:prefLabel, skos:hiddenLabel and skos:altLabel properties are
sub-properties of rdfs:label.

I would like to index the prefLabel, hiddenLabel and altLabels for all
of the entries.

The EntityDefinition constructor is defined in the documentation as follows:

    public EntityDefinition(String entityField,
                     String primaryField,
                     com.hp.hpl.jena.rdf.model.Resource primaryPredicate)

From what I can gather the entityField is the field name in the index.

The primary field should be the skos:prefLabel property for example. And
the primaryPredicate should be specified as the RDFS.label.asNode()
resource.

It seems I can also add additional fields by calling the .set() method.
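One possible reading of these arguments, offered as an assumption rather than a confirmed answer: the entityField names the index field that stores the subject URI, and each remaining field name is paired with one predicate whose literal values get indexed. The sketch below is a toy model of that mapping in plain Java. It does not use or reproduce the jena-text API, and every class, method, and field name in it (including the "uri" field name) is hypothetical. It emits one entry per matching triple; whether jena-text groups entries differently is not stated in this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EntityMappingSketch {

    static final String SKOS = "http://www.w3.org/2004/02/skos/core#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";

    // Toy stand-in for an entity definition: index field name -> predicate URI.
    static Map<String, String> fieldToPredicate() {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("prefLabel", SKOS + "prefLabel");
        m.put("hiddenLabel", SKOS + "hiddenLabel");
        m.put("altLabel", SKOS + "altLabel");
        return m;
    }

    // The example entry from this message as {subject, predicate, object} arrays.
    static List<String[]> exampleTriples() {
        String s = "http://purl.obolibrary.org/obo/ID_62354";
        return Arrays.asList(
            new String[] {s, SKOS + "broader", "http://purl.obolibrary.org/id/ID_35317"},
            new String[] {s, SKOS + "prefLabel", "The preferred label for the entity"},
            new String[] {s, SKOS + "hiddenLabel", "The hidden label for the entity"},
            new String[] {s, SKOS + "altLabel", "An alternative label for the entity"},
            new String[] {s, RDF_TYPE, SKOS + "Concept"});
    }

    // Emit one {entityField=subject, field=object} entry per triple whose predicate is mapped.
    static List<Map<String, String>> index(List<String[]> triples, String entityField,
                                           Map<String, String> fields) {
        List<Map<String, String>> docs = new ArrayList<>();
        for (String[] t : triples) {
            for (Map.Entry<String, String> e : fields.entrySet()) {
                if (e.getValue().equals(t[1])) {
                    Map<String, String> doc = new LinkedHashMap<>();
                    doc.put(entityField, t[0]);
                    doc.put(e.getKey(), t[2]);
                    docs.add(doc);
                }
            }
        }
        return docs;
    }

    public static void main(String[] args) {
        for (Map<String, String> doc : index(exampleTriples(), "uri", fieldToPredicate())) {
            System.out.println(doc);
        }
    }
}
```

Under this reading, the skos:broader and rdf:type triples produce no entries because their predicates are not mapped to any field.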

I can't seem to generate an index file when I use:

TextDatasetFactory.createLucene(dataset, directory, entityDefinition)

I've verified that my dataset is valid, and that the directory is also
valid.

Do I have the right idea for specifying an EntityDefinition?

Any hints would be appreciated.
