Author: rwesten
Date: Tue Oct 28 11:18:49 2014
New Revision: 1634847
URL: http://svn.apache.org/r1634847
Log:
some update to the documentation of STANBOL-1397
Modified:
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/nif20.mdtext
Modified:
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/nif20.mdtext
URL:
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/nif20.mdtext?rev=1634847&r1=1634846&r2=1634847&view=diff
==============================================================================
---
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/nif20.mdtext
(original)
+++
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/nif20.mdtext
Tue Oct 28 11:18:49 2014
@@ -4,20 +4,20 @@ Typically low level NLP results are not
## Processed Information (Input)
-Apache Stanbol manages NLP results by the [Analysed Text](../nlp/analyzedtext)
content part. This ContentPart provides a Java API for accessing those results.
This engine reads such information and transformes it according to the [NIF
2.0](http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html)
core ontology.
+Apache Stanbol manages NLP results in the [Analysed Text](../nlp/analyzedtext)
content part. This ContentPart provides a Java API for accessing those results.
This engine reads such information and transforms it according to the [NIF
2.0](http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html)
core ontology. The transformed information is added as RDF to the Enhancement
Metadata and is included in the RDF response of the enhancement request.
If a ContentItem does not contain this content part it will not be processed
by this engine.
## Created RDF
-The engine serializes the following information:
+The engine serializes NLP annotations as defined by the [NIF 2.0 core
ontology](http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core/nif-core.html).
More specifically, the engine can serialize the following information:
-* Segment URIs by using the [RFC 5147](http://tools.ietf.org/html/rfc5147) URI
scheme
-* Selector information like `nif:beginIndex`, `nif:endIndex` as well as
`nif:before`, `nif:anchorOf` and `nif:after`. For spans longer as 100 chars the
`nif:head` property is used instead of `nif:anchorOf`.
-* Context information: This includes `nif:referenceContext` links for all
Strings as well as additional metadata for the context.
-* String hierarchies: `nif:sub-/nif:superWord`, `nif:sentence`
-* String navigation: `nif:next-/nif:previousSentnece`,
`nif:next-/nif:previousWord`
-* String annotations: `nif:oliaCategory`, `nif:oliaConfidence` and `nif:posTag`
+* Segment URIs use the [RFC 5147](http://tools.ietf.org/html/rfc5147) URI
scheme. It can be configured whether the `nif:RFC5147String` type is added only
to the `nif:Context` instance or to all serialized `nif:String` instances.
+* Selector information like `nif:beginIndex`, `nif:endIndex` as well as
`nif:before`, `nif:anchorOf` and `nif:after`. For spans longer than 100 chars the
`nif:head` property is used instead of `nif:anchorOf`. There is an option to
prevent those features from being serialized. This will greatly decrease the
triple count; however, clients will need to parse the start/end positions from
the segment URI.
+* All serialized `nif:String` instances refer to the `nif:Context` via the
`nif:referenceContext` property. The context refers to the URI of the
ContentItem by using the `nif:sourceUrl` property. Including the content as a
String literal is NOT supported by this engine.
+* String hierarchies: This includes the `nif:subWord`, `nif:superWord` and
`nif:sentence` properties. If not required, serialization of those can be
deactivated.
+* String navigation: This includes the `nif:nextSentence`,
`nif:previousSentence`, `nif:nextWord` and `nif:previousWord` properties. The
transitive versions of those properties are NOT supported. Users who need the
transitive versions can obtain them from a reasoner. String navigation
properties can be deactivated. This will greatly decrease the triple count.
+* String annotations: This currently includes `nif:oliaCategory`,
`nif:oliaConfidence` and `nif:posTag`. `nif:oliaLink` is not supported as the
Stanbol NLP API does not provide the required information. Support for
word-level sentiment annotations is also not yet implemented.
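
When the selector properties are not serialized, a client has to recover the start/end positions from the segment URI itself. The following is a minimal sketch of that step, assuming segment URIs end in an RFC 5147 character-range fragment of the form `#char=begin,end`. The function name `parse_segment_uri` and the example URI are hypothetical and not part of Stanbol.

```python
import re

# RFC 5147 character range, e.g. "char=0,5". Optional integrity-check
# parameters after ";" (such as ";length=9") are accepted and ignored.
_CHAR_RANGE = re.compile(r"^char=(\d+),(\d+)(?:;.*)?$")


def parse_segment_uri(uri):
    """Split a segment URI into (context_uri, begin, end).

    Assumption: the URI has the form '<context-uri>#char=<begin>,<end>'.
    """
    base, _, fragment = uri.partition("#")
    match = _CHAR_RANGE.match(fragment)
    if match is None:
        raise ValueError(f"not an RFC 5147 character range: {uri!r}")
    begin, end = int(match.group(1)), int(match.group(2))
    if end < begin:
        raise ValueError(f"end precedes begin in {uri!r}")
    return base, begin, end


# Hypothetical segment URI used only for illustration.
context, begin, end = parse_segment_uri("http://example.org/doc#char=0,5")
```

The returned offsets correspond to what `nif:beginIndex` and `nif:endIndex` would otherwise carry, so the covered text can be sliced from the content once it has been fetched separately.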
### Configuration