Hi Rupert,

we are using an XSLT transformation that we run inside an enhancement engine. After that we want to look for links between the entities extracted from the content item (by the XSLT transformation and by a NER engine) and the entities stored in the Entityhub. We want to do that by wrapping SILK in a bundle connected to the Stanbol SPARQL endpoint.

When the bundle (an enhancement engine) receives the content item, it checks for duplicates using the comparisons defined in the SILK configuration file. The result can be triples like <entity_1> <owl:sameAs> <entity_2>, where <entity_1> comes from the content item and <entity_2> from the Entityhub. These triples will be added to the content item metadata, but there is no clear way to create an enhancement that connects these triples to the content item. Maybe one way could be to reify them.

Do you have any advice or suggestions?
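
To make the reification idea concrete, here is a minimal sketch in plain Python (no RDF library) that emits the owl:sameAs link together with a reified rdf:Statement describing it. Everything Stanbol-specific is an assumption on my side: the FISE namespace URI, the reuse of fise:extracted-from, fise:confidence and dc:creator on a reified statement, the urn: identifiers, the function names and the engine name "SilkLinkingEngine" are only illustrative and not a documented Stanbol pattern.

```python
# Sketch only: reify an owl:sameAs link so it can carry provenance.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
XSD = "http://www.w3.org/2001/XMLSchema#"
DCT = "http://purl.org/dc/terms/"
FISE = "http://fise.iks-project.eu/ontology/"  # assumed FISE namespace


def iri(value):
    """Format a URI as an N-Triples IRI term."""
    return "<" + value + ">"


def literal(value, datatype=None):
    """Format a value as an N-Triples literal, optionally typed."""
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    suffix = "^^" + iri(datatype) if datatype else ""
    return '"' + text + '"' + suffix


def reify_same_as(enhancement, content_item, entity_1, entity_2,
                  confidence, engine):
    """Return the sameAs triple plus a reified statement about it.

    The reified statement doubles as a (hypothetical) enhancement node
    that points back to the content item and the engine that found it.
    """
    node = iri(enhancement)
    return [
        # The asserted link itself.
        (iri(entity_1), iri(OWL + "sameAs"), iri(entity_2)),
        # Standard RDF reification of that link.
        (node, iri(RDF + "type"), iri(RDF + "Statement")),
        (node, iri(RDF + "subject"), iri(entity_1)),
        (node, iri(RDF + "predicate"), iri(OWL + "sameAs")),
        (node, iri(RDF + "object"), iri(entity_2)),
        # Provenance, modelled on how other enhancements are described
        # (assumption, not a documented Stanbol convention).
        (node, iri(FISE + "extracted-from"), iri(content_item)),
        (node, iri(FISE + "confidence"), literal(confidence, XSD + "double")),
        (node, iri(DCT + "creator"), literal(engine)),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) terms as N-Triples lines."""
    return "\n".join(s + " " + p + " " + o + " ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(reify_same_as(
        "urn:enhancement:3", "urn:content-item:1",
        "urn:entity:1", "urn:entity:2", 0.9, "SilkLinkingEngine")))
```

With this shape a SPARQL query could start from the content item, follow fise:extracted-from backwards to the reified statement, and read the two linked entities from rdf:subject and rdf:object.
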
Best
Luigi

2013/6/12 Rupert Westenthaler <rupert.westentha...@gmail.com>

> Hi Luigi,
>
> Regarding extracting the text from XML files you might want to check
> if the TikaEngine can be used for that.
>
> Note also that you can parse pre-existing annotations to the Stanbol
> Enhancer. You might want to have a look at this example [1].
>
> Sorry I have not understood the part with "<entity_1> <owl:sameAs>
> <entity_2>"
>
> best
> Rupert
>
> [1] http://stanbol.apache.org/docs/trunk/components/enhancer/enhancerrest#example-4-parse-existing-free-text-annotations
>
> On Tue, Jun 11, 2013 at 4:27 PM, Luigi Selmi <luigise...@gmail.com> wrote:
> > Hi Rupert,
> >
> > thanks for your answers. I am working with XML files and want to
> > transform them into RDF, then use a NER engine to extract entities
> > from those properties that have text, like titles and abstracts (that
> > we map to related DC properties). I was also searching for a way to
> > use another engine to interlink entities extracted from the
> > transformation and from the NER engine above, using a SPARQL endpoint,
> > that will add triples like <entity_1> <owl:sameAs> <entity_2> to the
> > content item metadata. I was thinking of creating enhancements every
> > time a new resource is created by the transformation and an owl:sameAs
> > relationship is found by an interlinking process. That will provide
> > information about the document from which those triples have been
> > extracted and the engines that produced them, like what happens when a
> > plain text file is sent to an NLP engine.
> >
> > Best
> >
> > Luigi
> >
> > 2013/6/11 Rupert Westenthaler <rupert.westentha...@gmail.com>
> >
> >> Hi Luigi,
> >>
> >> I am not sure if I understand your question, but let me try to answer.
> >>
> >> Please note [1] when reading through this mail.
> >>
> >> fise:EntityAnnotations use the fise:entity-reference property to link
> >> to the URI of the suggested Entity. If your question was about what
> >> happens if there are several fise:EntityAnnotations referring to the
> >> same Entity (same value for the fise:entity-reference), then the
> >> answer is: it depends on the EnhancementEngine (and the situation).
> >>
> >> A fise:EntityAnnotation may have multiple dc:relation properties
> >> pointing to several fise:TextAnnotations. In this case this means that
> >> an Entity is suggested for several mentions within the analyzed
> >> content.
> >>
> >> However, EnhancementEngines may also decide to create multiple
> >> fise:EntityAnnotation instances all pointing to the same entity. This
> >> is typically the case for disambiguation Engines (e.g. the
> >> disambiguation-mlt engine) as those will want to note different
> >> fise:confidence values for the different mentions linked with the same
> >> Entity.
> >>
> >> The Stanbol Enhancer does not add any relations between entities. If
> >> you see relations "<entity1> <owl:sameAs> <entity2>" then it means
> >> that (1) dereferencing of linked Entities is enabled and (2) those
> >> triples were already present in the knowledge base the Entity comes
> >> from.
> >>
> >> Hope this answers your question
> >> best
> >> Rupert
> >>
> >> [1] http://stanbol.apache.org/docs/trunk/components/enhancer/enhancementstructure.png
> >>
> >> On Mon, Jun 10, 2013 at 7:36 PM, Luigi Selmi <luigise...@gmail.com> wrote:
> >> > Hi all,
> >> >
> >> > it is not clear from the documentation which properties are attached
> >> > to an enhancement created by a linking engine after two URIs have
> >> > been found to represent the same entity. Which are the FISE
> >> > properties used to state that
> >> >
> >> > <entity1> <owl:sameAs> <entity2>
> >> >
> >> > where <entity1> and <entity2> are referenced by <enhancement1> and
> >> > <enhancement2>?
> >> >
> >> > Following the way in which enhancements are created in Stanbol, a
> >> > linking engine should create a new enhancement, say <enhancement3>,
> >> > with its confidence value, that should state the fact above in some
> >> > value, but I couldn't find any clear statement about this in the
> >> > documentation. Does anyone know how a linking engine handles this?
> >> > Thanks in advance.
> >> >
> >> > Luigi
> >>
> >> --
> >> | Rupert Westenthaler rupert.westentha...@gmail.com
> >> | Bodenlehenstraße 11 ++43-699-11108907
> >> | A-5500 Bischofshofen