Hi all,
As explained at length in the mail quoted below, the Fusepool project [1]
has defined an Annotation Model based on Open Annotation [3][4] and
NIF 2.0 [5][6]. This model is called FAM (Fusepool Annotation Model
[16]).
This mail presents the FISE to FAM transformation engine [17] provided
by Fusepool. The engine supports the transformation of FISE
enhancements to Open Annotation based annotations as defined by the
Fusepool Annotation Model [16].
### Supported FISE enhancement types
The engine supports the following FISE enhancement types:
1. fise:TextAnnotation used to annotate the language of the Content.
Those are transformed to fam:ContentLanguage [8] annotations.
2. fise:TextAnnotation used to annotate named entities. Those are
transformed to fam:EntityMention [9].
3. fise:EntityAnnotation used to suggest entities for mentions in the
text. Those are transformed to fam:EntitySuggestion [12].
4. fise:EntityAnnotation without a relation to a fise:TextAnnotation.
Those are transformed to fam:EntityAnnotation [10].
5. fise:TopicAnnotation. Those are transformed to fam:TopicAnnotation [13].
Sentiment annotations are the only type not (yet) supported by the
engine, as the FAM currently does not define a fitting annotation body.
NOTES:
* transformed fise:Enhancements are deleted from the enhancement metadata
* unsupported fise:Enhancements are kept unchanged
* information about dereferenced entities is also left untouched
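The mapping and the notes above can be summarised in a few lines of Python. This is only an illustrative sketch, not the engine's code (the engine itself is an OSGi bundle for Stanbol): the dictionary keys `type`, `annotates` and `relation` as well as both function names are assumptions made up for this example.

```python
# Illustrative sketch only: the dict keys and function names are hypothetical
# and do not mirror the actual Fise2FamEngine implementation.

def fam_type(enh):
    """Return the FAM annotation type for a FISE enhancement, or None if unsupported."""
    kind = enh.get("type")
    if kind == "fise:TextAnnotation":
        # Text annotations describe either the content language or a named entity.
        if enh.get("annotates") == "language":
            return "fam:ContentLanguage"
        if enh.get("annotates") == "named-entity":
            return "fam:EntityMention"
        return None  # anything else (e.g. sentiment) is not supported
    if kind == "fise:EntityAnnotation":
        # With a relation to a fise:TextAnnotation it is a suggestion for that mention.
        return "fam:EntitySuggestion" if enh.get("relation") else "fam:EntityAnnotation"
    if kind == "fise:TopicAnnotation":
        return "fam:TopicAnnotation"
    return None


def transform(enhancements):
    """Split enhancements into (FAM annotation types, FISE enhancements kept unchanged)."""
    fam, kept = [], []
    for enh in enhancements:
        target = fam_type(enh)
        if target is None:
            kept.append(enh)    # unsupported enhancements stay in the metadata
        else:
            fam.append(target)  # transformed enhancements are replaced by FAM annotations
    return fam, kept
```

Calling `transform` on a mixed list returns the FAM types for the supported enhancements and hands back the unsupported ones untouched, which mirrors the first two notes above.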
### Installation
To use the engine one needs to install
<dependency>
    <groupId>eu.fusepool.p3.stanbol-engines-fise2fam</groupId>
    <artifactId>stanbol-engines-fise2fam</artifactId>
    <version>1.0.0</version>
</dependency>
into the OSGi environment running Stanbol (e.g. by adding it to the
bundlelist of a custom launcher, by copying the jar to the fileinstall
folder, or by manually adding it via the bundle tab of the Felix
Webconsole).
The engine is available on Maven Central [18].
The engine works with the 0.12.0 release, the 0.12.1-SNAPSHOT as well
as the 1.0.0-SNAPSHOT versions of Stanbol.
After installing the bundle a default configuration of the Engine with
the name `fise2fam` will be available.
### License
The engine is provided under the Apache Software License 2.0.
### Configuration
The Fise2FamEngine provides two configuration parameters:
* Selector type (enhancer.engine.fise2fam.selectortype.name): This
configures the type of selectors that are written. NIF 2.0 and
Open Annotation are supported, as well as a compatibility mode in
which the properties of both are written (default: NIF).
    * NIF: activates NIF 2.0 type selectors
    * OA: activates the Open Annotation Text Position Selector and
    Text Quote Selector
    * BOTH: compatibility mode that writes both NIF and OA
    selector information
* Write Metadata (enhancer.engine.fise2fam.metadata.name): This switch
enables or disables the serialization of the metadata. If it is
disabled, no oa:Annotation and oa:SpecificResource will be serialized.
Deactivating this option means that the resulting RDF no longer
conforms to the Open Annotation standard. However, it also reduces
the triple count by more than 50% (default: enabled).
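To make the three selector types more concrete, the following Python sketch lists which selector properties one might expect for a text span in each mode. It is an assumption based on the NIF 2.0 and Open Annotation vocabularies (`nif:beginIndex`, `nif:endIndex`, `nif:anchorOf`, `oa:start`, `oa:end`, `oa:exact`), not a dump of what the engine actually writes; the function name and the dictionary layout are hypothetical.

```python
# Hypothetical helper that illustrates the NIF / OA / BOTH selector types.
# The property sets are assumptions taken from the NIF 2.0 and Open Annotation
# vocabularies, not from the engine's source code.

def selector_properties(text, start, end, selector_type="NIF"):
    """Return the selector properties for text[start:end] in the given mode."""
    mode = selector_type.strip().upper()
    if mode not in ("NIF", "OA", "BOTH"):
        raise ValueError("selector type must be NIF, OA or BOTH")
    props = {}
    if mode in ("NIF", "BOTH"):
        props.update({
            "nif:beginIndex": start,
            "nif:endIndex": end,
            "nif:anchorOf": text[start:end],
        })
    if mode in ("OA", "BOTH"):
        props.update({
            "oa:start": start,            # Text Position Selector
            "oa:end": end,                # Text Position Selector
            "oa:exact": text[start:end],  # Text Quote Selector
        })
    return props
```

In this sketch the BOTH mode simply carries the union of the two property sets, which is the idea behind the compatibility mode described above.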
The engine allows for multiple service instances with different
configurations. Just make sure that they use different names. The
default instance uses the name `fise2fam`.
### Usage
To use the Fise2FamEngine just add the name of the engine (by default
`fise2fam`) to your chain. As the Fise2FamEngine is a post-processing
engine, it expects to be executed last.
When used in a WeightedChain this will happen automatically based on
metadata provided by the engine. Users that configure a ListChain
need to make sure that the `fise2fam` engine is added at the end of
the list.
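As a rough illustration, a chain that ends with the engine can be called through the Stanbol Enhancer RESTful endpoint (`/enhancer/chain/<chain-name>`). The sketch below uses only the Python standard library; the base URL `http://localhost:8080`, the chain name `fam-chain` and both function names are assumptions for this example and need to be adapted to your setup.

```python
import urllib.request


def with_fise2fam_last(engines, name="fise2fam"):
    """Return the engine list for a ListChain with the fise2fam engine moved to the end."""
    return [e for e in engines if e != name] + [name]


def build_enhance_request(text, chain="fam-chain", base_url="http://localhost:8080"):
    """Build (but do not send) a POST request for /enhancer/chain/<chain>.

    The chain name and base URL are placeholders, not defaults shipped with Stanbol.
    """
    return urllib.request.Request(
        url=f"{base_url}/enhancer/chain/{chain}",
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=UTF-8",
            "Accept": "text/turtle",  # ask for the enhancement results as Turtle
        },
        method="POST",
    )
```

With a running Stanbol instance and such a chain configured, the request can be sent with `urllib.request.urlopen(build_enhance_request("Paris is the capital of France."))`; the response body then contains the enhancement results as RDF.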
Happy testing
Rupert Westenthaler
[1] http://p3.fusepool.eu/
[3] http://www.openannotation.org/spec/core/
[4]
https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/openannotation.md
[5] http://persistence.uni-leipzig.org/nlp2rdf/
[6]
https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/nif.md
[7]
https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#annotation-core
[8]
https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#language-annotation
[9]
https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#entity-mention-annotation
[10]
https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#entity-annotation
[11]
https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#linked-entity-annotation
[12]
https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#entity-linking-choice-annotation
[13]
https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#topic-classification
[14]
https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#transformation-of-fise-to-the-fusepool-annotation-model
[15] http://svn.apache.org/repos/asf/stanbol/trunk/enhancement-engines/nlp2rdf/
[16]
https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md
[17] https://github.com/fusepoolP3/p3-stanbol-engine-fam
[18]
http://search.maven.org/#artifactdetails|eu.fusepool.p3.stanbol-engines-fise2fam|stanbol-engines-fise2fam|1.0.0|bundle
On Fri, Jul 25, 2014 at 10:11 AM, Rupert Westenthaler
<[email protected]> wrote:
> Hi all,
>
> The task to define a new Enhancement Structure for Apache Stanbol is
> long outstanding (see STANBOL-351 [2]). Over the past years several
> discussions were started, but none of them even reached the point of
> providing a good model.
>
> Recently, thanks to the support of the Research Project Fusepool
> P3 [1], I was able to spend time on this task, and with this mail I
> would like to present the current state of this effort to the
> community.
>
> - - -
>
> The Fusepool Annotation Model
> -------------------------------------------
>
> The Fusepool Annotation Model (FAM) is based on Open Annotation [3][4]
> and uses NIF 2.0 [5][6] for Selectors and lower level NLP annotations.
> Summaries about Open Annotation and NIF are available at [4] and [6].
>
> The FAM is made up of two main parts:
>
> 1. The "Annotation Core" [7]: This defines the core annotation model
> and is based on Open Annotation and NIF.
> 2. Several "Annotation Bodies" for different annotation types. Those
> bodies include annotations for
> * Content Language [8]: Annotation used to annotate the language
> of the content
> * Entity Mentions [9]: Annotation for describing Named Entities
> detected in the parsed text
> * Entity Annotation [10]: Used to link Entities with the analysed Content
> * Linked Entity [11]: Combines an Entity Mention and an Entity
> Annotation. Used to link a mention of the Entity with a single Entity
> of a Vocabulary (e.g. after disambiguation)
> * Entity Linking Choice and Entity Suggestion [12]: Used to
> suggest several possible Entities for an Entity Mention.
> * Topic Classification and Topic Annotation [13]: Used to classify
> content with several Topics of a classification scheme.
>
> With those predefined Annotation Bodies one can describe everything
> that is currently supported by FISE. So the new Model has 100% coverage
> of the enhancement structure currently used by Apache Stanbol.
>
>
> Migration options from FISE to FAM
> ------------------------------------------------
>
> An easy migration from FISE to the FAM model was an important
> requirement. To avoid having to adapt all existing Stanbol
> Engines to the new model, the FAM was defined in such a way that
> transformation rules from FISE to FAM [14] can be defined.
>
> Having such rules makes it possible to implement a
> "Fise2FamTransformationEngine" that, if added to the end of an
> Enhancement Chain, will allow users to receive Enhancement Results
> based on the FAM model.
>
> I will implement such an Engine in the 2nd half of August. This engine
> will be compatible both with the 0.12.* and 1.0.0 versions of Apache
> Stanbol.
>
> As part of this effort I will also update the Nlp2RdfEngine [15] to
> support NIF 2.0. As FAM uses NIF selectors, having such an engine is
> much more relevant, because NLP annotations serialized using NIF 2.0
> will now be automatically merged with the Selectors used by high level
> FAM annotations.
>
> Next Steps:
> ----------------
>
> As part of my work on Fusepool I will implement the
> Fise2FamTransformationEngine and update the Nlp2RdfEngine before the
> end of September. Both engines will be Open Source and Apache
> Licensed, meaning that by the end of September all current Stanbol
> users will be able to play around with the new Annotation Model.
>
> IMHO it would really make sense to deprecate the current FISE Model
> and migrate to a Model based on Open Annotation and NIF. I am
> confident that FAM is a good starting point in that direction.
>
> From FISE to a Stanbol Annotation Model
> --------------------------------------------------------
>
> A possible path to migrate to a new Model could look as follows:
>
> * The Stanbol Community has a look at the FAM and tests it against
> current use cases as soon as the Fise2FamTransformationEngine is
> available.
> * Based on the results of that process we can refine the FAM model and
> make it the preferred Enhancement Model. As part of that we should also
> change its namespace to use "http://stanbol.apache.org/ontology/"
> * For Stanbol 0.12.* and 1.0.0 we will support the new model by
> providing a transformation engine. For Stanbol 2.0.0 we would change
> all engines to natively support the new model.
>
> WDYT
> Rupert Westenthaler
>
> [1] http://p3.fusepool.eu/
> [2] https://issues.apache.org/jira/browse/STANBOL-351
> [3] http://www.openannotation.org/spec/core/
> [4]
> https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/openannotation.md
> [5] http://persistence.uni-leipzig.org/nlp2rdf/
> [6]
> https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/nif.md
> [7]
> https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#annotation-core
> [8]
> https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#language-annotation
> [9]
> https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#entity-mention-annotation
> [10]
> https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#entity-annotation
> [11]
> https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#linked-entity-annotation
> [12]
> https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#entity-linking-choice-annotation
> [13]
> https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#topic-classification
> [14]
> https://github.com/fusepoolP3/overall-architecture/blob/master/wp3/fp-anno-model/fp-anno-model.md#transformation-of-fise-to-the-fusepool-annotation-model
> [15]
> http://svn.apache.org/repos/asf/stanbol/trunk/enhancement-engines/nlp2rdf/
>
--
| Rupert Westenthaler [email protected]
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO
..........................................................................
| http://redlink.co/