Hi Luigi,

Regarding extracting the text from XML files, you might want to check whether the TikaEngine can be used for that.
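Independent of the TikaEngine, and purely as an illustration: if only a few known elements such as titles and abstracts carry text, that text could also be pulled out before calling the Enhancer. The following is a minimal sketch using only the Python standard library. The XML layout, the element names `title` and `abstract`, and the helper `extract_text` are assumptions made for this example, not part of Stanbol.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; the real element names depend on your XML schema.
SAMPLE_XML = """<record>
  <title>Linked Data in Practice</title>
  <abstract>We describe how <i>RDF</i> datasets are interlinked.</abstract>
  <year>2013</year>
</record>"""


def extract_text(xml_string, tags=("title", "abstract")):
    """Return {tag: plain text} for the first occurrence of each requested tag."""
    root = ET.fromstring(xml_string)
    result = {}
    for tag in tags:
        element = root.find(f".//{tag}")
        if element is not None:
            # itertext() also collects text nested in inline markup such as <i>;
            # split/join normalizes runs of whitespace to single spaces.
            result[tag] = " ".join("".join(element.itertext()).split())
    return result


texts = extract_text(SAMPLE_XML)
```

Each extracted string could then be sent to the Enhancer as plain text, so the NER engine only sees the fields that actually contain prose.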
Note also that you can pass pre-existing annotations to the Stanbol Enhancer. You might want to have a look at this example [1].

Sorry, I have not understood the part with "<entity_1> <owl:sameAs> <entity_2>".

best
Rupert

[1] http://stanbol.apache.org/docs/trunk/components/enhancer/enhancerrest#example-4-parse-existing-free-text-annotations

On Tue, Jun 11, 2013 at 4:27 PM, Luigi Selmi <luigise...@gmail.com> wrote:
> Hi Rupert,
>
> thanks for your answers. I am working with XML files and want to transform
> them into RDF, then use a NER engine to extract entities from those
> properties that have text, like titles and abstracts (which we map to related
> DC properties). I was also searching for a way to use another engine to
> interlink entities extracted from the transformation and from the NER engine
> above, using a SPARQL endpoint that will add triples like <entity_1>
> <owl:sameAs> <entity_2> to the content item metadata. I was thinking of
> creating enhancements every time a new resource is created by the
> transformation and an owl:sameAs relationship is found by an interlinking
> process. That will provide information about the document from which those
> triples have been extracted and the engines that made them, like what happens
> when a plain text file is sent to an NLP engine.
>
> Best
>
> Luigi
>
>
> 2013/6/11 Rupert Westenthaler <rupert.westentha...@gmail.com>
>
>> Hi Luigi,
>>
>> I am not sure if I understand your question, but let me try to answer.
>>
>> Please note [1] when reading through this mail.
>>
>> fise:EntityAnnotation use the fise:entity-reference property to link
>> to the URI of the suggested Entity.
>> If your question was about what
>> happens if there are several fise:EntityAnnotations referring to the
>> same Entity (same value for the fise:entity-reference), then the answer
>> is: it depends on the EnhancementEngine (and the situation).
>>
>> A fise:EntityAnnotation may have multiple dc:relation properties
>> pointing to several fise:TextAnnotations. In this case this means that
>> an Entity is suggested for several mentions within the analyzed
>> content.
>>
>> However, EnhancementEngines may also decide to create multiple
>> fise:EntityAnnotation instances all pointing to the same entity. This
>> is typically the case for disambiguation engines (e.g. the
>> disambiguation-mlt engine), as those will want to note different
>> fise:confidence values for the different mentions linked with the same
>> Entity.
>>
>> The Stanbol Enhancer does not add any relations between entities. If
>> you see relations "<entity1> <owl:sameAs> <entity2>", then it means
>> that (1) dereferencing of linked Entities is enabled and (2) those
>> triples were already present in the knowledge base where the Entity
>> comes from.
>>
>> Hope this answers your question
>> best
>> Rupert
>>
>> [1]
>> http://stanbol.apache.org/docs/trunk/components/enhancer/enhancementstructure.png
>>
>>
>> On Mon, Jun 10, 2013 at 7:36 PM, Luigi Selmi <luigise...@gmail.com> wrote:
>> > Hi all,
>> >
>> > in the documentation it is not clear which properties are attached to an
>> > enhancement created by a linking engine after two URIs have been found to
>> > represent the same entity. Which are the FISE properties used to state that
>> >
>> > <entity1> <owl:sameAs> <entity2>
>> >
>> > where <entity1> and <entity2> are referenced by <enhancement1> and
>> > <enhancement2>?
>> >
>> > Following the way in which enhancements are created in Stanbol, a linking
>> > engine should create a new enhancement, say <enhancement3>, with its
>> > confidence value, that should state in some value the fact above, but I
>> > couldn't find any clear statement about this in the documentation. Does anyone
>> > know how a linking engine works for this? Thanks in advance.
>> >
>> >
>> > Luigi
>>
>>
>>
>> --
>> | Rupert Westenthaler rupert.westentha...@gmail.com
>> | Bodenlehenstraße 11 ++43-699-11108907
>> | A-5500 Bischofshofen

--
| Rupert Westenthaler rupert.westentha...@gmail.com
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen
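
The two patterns described in the quoted reply of 2013/6/11 (one fise:EntityAnnotation related to several mentions, versus several fise:EntityAnnotations with different confidence values referring to the same entity) can be illustrated with a small sketch. It is not Stanbol code: the triples are hand-written sample data, the URIs under `urn:` and `http://example.org/` are hypothetical, the helper `mentions_by_entity` is invented for this example, and the namespace strings are assumptions about how `fise:` and `dc:` are bound.

```python
from collections import defaultdict

# Assumed namespace bindings for the fise: and dc: prefixes.
FISE = "http://fise.iks-project.eu/ontology/"
DC = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical enhancement results as (subject, predicate, object) triples.
TRIPLES = [
    # Pattern 1: a single EntityAnnotation related to two TextAnnotations.
    ("urn:ea1", RDF_TYPE, FISE + "EntityAnnotation"),
    ("urn:ea1", FISE + "entity-reference", "http://example.org/Paris"),
    ("urn:ea1", FISE + "confidence", 0.9),
    ("urn:ea1", DC + "relation", "urn:ta1"),
    ("urn:ea1", DC + "relation", "urn:ta2"),
    # Pattern 2: two EntityAnnotations for the same entity, one per mention,
    # each with its own confidence (as a disambiguation engine might do).
    ("urn:ea2", RDF_TYPE, FISE + "EntityAnnotation"),
    ("urn:ea2", FISE + "entity-reference", "http://example.org/Rome"),
    ("urn:ea2", FISE + "confidence", 0.8),
    ("urn:ea2", DC + "relation", "urn:ta3"),
    ("urn:ea3", RDF_TYPE, FISE + "EntityAnnotation"),
    ("urn:ea3", FISE + "entity-reference", "http://example.org/Rome"),
    ("urn:ea3", FISE + "confidence", 0.4),
    ("urn:ea3", DC + "relation", "urn:ta4"),
]


def mentions_by_entity(triples):
    """Map each referenced entity URI to a list of (mention, confidence) pairs."""
    annotations = {s for s, p, o in triples
                   if p == RDF_TYPE and o == FISE + "EntityAnnotation"}
    entity, confidence = {}, {}
    relations = defaultdict(list)
    for s, p, o in triples:
        if s not in annotations:
            continue
        if p == FISE + "entity-reference":
            entity[s] = o
        elif p == FISE + "confidence":
            confidence[s] = o
        elif p == DC + "relation":
            relations[s].append(o)
    result = defaultdict(list)
    for s in annotations:
        if s in entity:
            for mention in relations[s]:
                result[entity[s]].append((mention, confidence.get(s)))
    # Sort by mention URI so the output order is deterministic.
    return {e: sorted(pairs, key=lambda pair: pair[0]) for e, pairs in result.items()}


grouped = mentions_by_entity(TRIPLES)
```

In both patterns the result is grouped per entity, and no triple links one entity to another, which matches the statement that the Enhancer itself adds no relations such as owl:sameAs between entities.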