Author: rwesten
Date: Fri May 23 07:57:18 2014
New Revision: 1597023

URL: http://svn.apache.org/r1597023
Log:
STANBOL-488, STANBOL-336, STANBOL-1223, STANBOL-1165: improvements, corrections 
and clarifications

Modified:
    stanbol/site/trunk/content/docs/trunk/components/enhancer/chains/weightedchain.mdtext
    stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entityhubdereference.mdtext
    stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/list.mdtext
    stanbol/site/trunk/content/docs/trunk/components/enhancer/enhancementproperties.mdtext
    stanbol/site/trunk/content/docs/trunk/components/entityhub/managedsite.mdtext
    stanbol/site/trunk/content/docs/trunk/utils/marmotta-kiwi-repository-service.mdtext

Modified: 
stanbol/site/trunk/content/docs/trunk/components/enhancer/chains/weightedchain.mdtext
URL: 
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/components/enhancer/chains/weightedchain.mdtext?rev=1597023&r1=1597022&r2=1597023&view=diff
==============================================================================
--- 
stanbol/site/trunk/content/docs/trunk/components/enhancer/chains/weightedchain.mdtext
 (original)
+++ 
stanbol/site/trunk/content/docs/trunk/components/enhancer/chains/weightedchain.mdtext
 Fri May 23 07:57:18 2014
@@ -16,9 +16,36 @@ The syntax to define an Engine as option
     <name>;optional
     <name>;optional=true
 
+The following figure shows the configuration dialog of a WeightedChain 
configured with two required engines and one optional engine.
 
 ![Configuration dialog for the 
WeightedChain](enhancer-weightedchain-config.png "Screenshot of the 
configuration dialog for a WeightedChain with two required and one optional 
engine")
 
+## Enhancement Properties Support
+
+__since `0.12.1`__
+
+Starting with `0.12.1`, the Weighted Chain supports the configuration of 
[EnhancementProperties](../enhancementproperties):
+
+* __chain and engine__ scoped properties are defined as parameters to the 
engines with the syntax `{engine-name}; {property-name-1}={value-1},{value-2}; 
{property-name-2}={value-1};` 
+
+* __chain__ scoped properties can be configured with the OSGi property key 
`stanbol.enhancer.chain.chainproperties` using the syntax 
`{property-name-1}={value-1},{value-2}`. NOTE that `;` is NOT supported as a 
separator between multiple properties, because OSGi configurations already 
define a way to pass multiple values.
+
+All EnhancementProperties configured with a [Chain](chains) are written as RDF 
to the [ExecutionPlan](chains/executionplan). _Chain_ scoped properties are 
added directly to the `ep:ExecutionPlan` instance, while _chain and engine_ 
scoped properties are added to the `ep:ExecutionNode` of the corresponding engine.
+
+The following figure and listing provide an example.
+
+![WeightedChain including some Enhancement 
Properties](enhancer-weightedchain-enhprop-config.png)
+
+The figure shows that for the `dbpedia-fst` engine the maximum number of 
suggestions is set to `10` and the minimum confidence value is set to `0.8`. 
For the `dbpedia-dereference` engine the dereferenced languages are set to 
English, German and Spanish. Finally, a _chain_ scoped property is used to set 
the maximum number of suggestions for the whole chain to `5`. However, this has 
no effect on the `dbpedia-fst` engine, as its custom configuration overrides 
this chain-wide property.
+
+The following listing shows the exact same configuration in the `.cfg` format.
+
+    stanbol.enhancer.chain.name="dbpedia-linking"
+    stanbol.enhancer.chain.weighted.chain=["tika;optional","opennlp-sentence","opennlp-token","opennlp-pos","opennlp-chunker",
+        "dbpedia-fst;\ enhancer.max-suggestions\=10;\ enhancer.min-confidence\=0.8",
+        "dbpedia-dereference;\ enhancer.engines.dereference.languages\=en,de,es"]
+    stanbol.enhancer.chain.chainproperties=["enhancer.max-suggestions\=5"]
+
 ## Calculation of the ExecutionPlan
 
 It is important to note that the ordering of the list has no influence on the 
ExecutionPlan because the order of execution of the configured 
[EnhancementEngines](../engines) is calculated only by using the value of the 
"org.apache.stanbol.enhancer.engine.order" property provided by the 
EnhancementEngine:

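The per-engine syntax described in the weightedchain.mdtext diff above can be illustrated with a small Python sketch. It is not Stanbol's own parser. The function name `parse_chain_entry` is hypothetical, the input is assumed to be an unescaped entry (without the `\ ` and `\=` escapes used in `.cfg` files), and treating every bare flag like `optional` as `true` is an assumption generalized from the `<name>;optional` form.

```python
def parse_chain_entry(entry):
    """Split '{engine-name}; {prop}={v1},{v2}; ...' into (name, properties).

    Illustration only: assumes an unescaped entry and is not the parser
    used by Stanbol.
    """
    parts = [p.strip() for p in entry.split(";")]
    name = parts[0]
    props = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing ';'
        key, sep, value = part.partition("=")
        if sep:
            # multiple values of one property are separated by ','
            props[key.strip()] = [v.strip() for v in value.split(",")]
        else:
            # assumption: a bare flag such as 'optional' means 'optional=true'
            props[key.strip()] = ["true"]
    return name, props


name, props = parse_chain_entry(
    "dbpedia-dereference; enhancer.engines.dereference.languages=en,de,es")
print(name, props)
```

Each property maps to a list, so single-valued and multi-valued properties can be handled the same way.
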
Modified: 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entityhubdereference.mdtext
URL: 
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entityhubdereference.mdtext?rev=1597023&r1=1597022&r2=1597023&view=diff
==============================================================================
--- 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entityhubdereference.mdtext
 (original)
+++ 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/entityhubdereference.mdtext
 Fri May 23 07:57:18 2014
@@ -38,7 +38,9 @@ The Shared Thread Pool is a singelton Co
 
 ![Shared Thread Pool 
Configuration](entityhub-dereference-engine-shared-threadpool-config.png)
 
-### Field Mapping Support
+### Advanced Dereference Configurations
+
+#### Entityhub Field Mapping Support
 
 The _enhancer.engines.dereference.fields_ configuration does support the 
Entityhub Field Mapping language.
 
@@ -47,12 +49,31 @@ FieldMappings do use the following synta
     :::text
     [!]FieldPattern [| Filter] [> Mapping]
 
-* an optional Exclusion indicated by '!' as the first character of the mapping 
used to exclude fields that are matched by the pattern.
+* an optional Exclusion indicated by '!' as the first character of the mapping 
used to exclude fields that are matched by the `FieldPattern` part (e.g. 
`!foaf:*` will exclude all properties of the FOAF namespace). Exclusions are 
only useful if a wildcard is used (e.g. `foaf:*` together with `!foaf:mbox`).
 * the required _FieldPattern_ supports the definition of prefixes such as 
`http://xmlns.com/foaf/0.1/*` or `foaf:*`
 * the optional _Filter_ part allows filtering by specific languages (e.g. 
`@=null;en;de;` will only dereference English and German literals as well as 
literals with no language tag), typed literals (e.g. `d=xsd:dateTime;xsd:date`) 
or URI values (e.g. `d=entityhub:ref`). Filters will also try to convert values 
to the parsed data type (e.g. `d=xsd:double` would convert `xsd:float` values 
to `xsd:double`. Also string literals that can be parsed as double would be 
converted).
 * an optional _Mapping_ can be used to copy values to another field (e.g. 
`foaf:name > schema:name` would copy all FOAF names to the schema.org name 
field)
 
-__NOTE__: Field Mappings configured for the EntityhubDerefereceEngine are 
overridden by Field Mappings parsed as [Enhancement 
Properties](../enhancementproperties).
+__NOTE__ that Field Mappings configured for the EntityhubDereferenceEngine are 
overridden by Field Mappings passed as [Enhancement 
Properties](../enhancementproperties).
+
+### LDPath support
+
+The [LD Path Language](http://marmotta.apache.org/ldpath/language.html) 
is an alternative to most of the features supported by the Entityhub Field 
Mapping language. Especially _Filters_ and _Mappings_ SHOULD BE expressed using 
LD Path. 
+
+The only advantage of the Field Mapping language is that it supports the use 
of wildcards and exclusions. So in cases where one wants to dereference all 
properties of a specific namespace, this can only be specified by using 
the Field Mapping language.
+
+The following example shows a configuration that dereferences all schema.org 
properties and also uses LD Path to align some non-schema.org properties:
+
+    :::text
+    enhancer.engines.dereference.fields="schema:*"
+    enhancer.engines.dereference.ldpath=["@prefix schema <http://schema.org/>;",
+        "@prefix dct <http://purl.org/dc/terms/>;",
+        "schema:name = (rdfs:label | dct:title | dc:title | foaf:name | skos:prefLabel);",
+        "schema:alternateName = skos:altLabel;",
+        "schema:image = foaf:depiction;",
+        "schema:homepage = foaf:homepage;"]
+        
+_NOTE_: when used in an OSGi `*.cfg` file, one needs to escape spaces and 
`=` with `\` and remove all line breaks.
 
 ## Supported Enhancement Properties 
 
@@ -61,3 +82,12 @@ The following Enhancement Properties are
 * __Dereference Languages__ _(enhancer.engines.dereference.languages)_: A set 
of languages that are dereferenced. Even if _'Dereference only Content Language 
Literals'_ is active explicitly configured languages will still get 
dereferenced. * __Dereferenced Fields__ 
_(enhancer.engines.dereference.fields)_: The dereferenced fields - in RDF 
terminology 'properties' - to be dereferenced. QNames (e.g. `rdf:label`) can be 
used for the configuration. This Engine supports the use of FieldMappings for 
the configuration. Dereferenced Fields parsed as EnhancementProperty will 
override values configured for the Engine.
 * __Dereference LD Path__ _(enhancer.engines.dereference.ldpath)_: The [LD 
Path Language](http://marmotta.apache.org/ldpath/language.html) allows to 
define powerful selectors for dereferenced Entities. An LD Path program parsed 
as EnhancementProperty will be executed in addition to those configured for the 
engine.
 
+As an example, the following query parameters instruct all Entityhub 
Dereference engines used in the enhancement chain to dereference only English 
and German literals.
+
+    :::bash
+    curl -X POST -H "Accept: text/turtle" \
+        -H "Content-type: text/plain" \
+        --data "The Eifeltower is located in Paris." \
+        "http://localhost:8080/enhancer?enhancer.engines.dereference.languages=en&enhancer.engines.dereference.languages=de"
+
+
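
The request URL in the curl example above repeats the same query parameter once per language. As a minimal sketch, assuming a client written in Python, the standard library can build such a URL. The helper name `enhancer_url` is hypothetical and only illustrates the parameter layout.

```python
from urllib.parse import urlencode


def enhancer_url(base, languages):
    """Build an enhancer URL that repeats the languages parameter per value."""
    params = {"enhancer.engines.dereference.languages": languages}
    # doseq=True emits one 'key=value' pair for every item of the list
    return base + "?" + urlencode(params, doseq=True)


print(enhancer_url("http://localhost:8080/enhancer", ["en", "de"]))
```

Building the query string programmatically also avoids the shell treating an unquoted `&` as a command separator.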

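The note on OSGi `*.cfg` files in the entityhubdereference.mdtext diff above (escape spaces and `=` with `\`, no line breaks) can be sketched as a small Python helper. The function name `to_cfg_array` is hypothetical. The sketch assumes that values contain no backslashes or double quotes of their own and only mirrors the array layout used in the listings of this commit.

```python
def to_cfg_array(key, values):
    """Render a multi-valued property as a single .cfg line.

    Assumption: values contain no backslashes or double quotes,
    so only spaces and '=' need to be escaped.
    """
    escaped = []
    for value in values:
        escaped.append('"' + value.replace(" ", "\\ ").replace("=", "\\=") + '"')
    # all array elements go on one line, separated by ','
    return key + "=[" + ",".join(escaped) + "]"


print(to_cfg_array("enhancer.engines.dereference.ldpath",
                   ["schema:image = foaf:depiction;",
                    "schema:homepage = foaf:homepage;"]))
```
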
Modified: 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/list.mdtext
URL: 
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/list.mdtext?rev=1597023&r1=1597022&r2=1597023&view=diff
==============================================================================
--- 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/list.mdtext 
(original)
+++ 
stanbol/site/trunk/content/docs/trunk/components/enhancer/engines/list.mdtext 
Fri May 23 07:57:18 2014
@@ -183,11 +183,8 @@ Enhancement Engines in this category can
        * create Entity suggestions (fise:EntityAnnotations) for the processed 
fise:TextAnnotations
        * accesses a remote service
 
-* _Solr More-like-This Disambiguation Engine:_ __under development_ (see 
[STANBOL-723](https://issues.apache.org/jira/browse/STANBOL-723))
+* __Solr More-like-This Disambiguation Engine:__ (see 
[STANBOL-723](https://issues.apache.org/jira/browse/STANBOL-723))
        * disambiguates Entities managed by the Stanbol Entityhub by using Solr 
MLT queries
-       * only available via the 
[disambiguation-engine](http://svn.apache.org/repos/asf/stanbol/branches/disambiguation-engine/)
 branch
-       * adjusts the fise:confidence of existing fise:EntityAnnotations
-
 
 
 ## Postprocessing / Other

Modified: 
stanbol/site/trunk/content/docs/trunk/components/enhancer/enhancementproperties.mdtext
URL: 
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/components/enhancer/enhancementproperties.mdtext?rev=1597023&r1=1597022&r2=1597023&view=diff
==============================================================================
--- 
stanbol/site/trunk/content/docs/trunk/components/enhancer/enhancementproperties.mdtext
 (original)
+++ 
stanbol/site/trunk/content/docs/trunk/components/enhancer/enhancementproperties.mdtext
 Fri May 23 07:57:18 2014
@@ -136,6 +136,7 @@ Starting with `0.12.1` Enhancement Prope
 
 The following shows the curl request generating the equivalent of the example 
used in the above section:
 
+    :::bash
     curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" \
         --data "The Eifeltower is located in Paris." 
         http://localhost:8080/enhancer?enhancer.max-suggestions=5&\

Modified: 
stanbol/site/trunk/content/docs/trunk/components/entityhub/managedsite.mdtext
URL: 
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/components/entityhub/managedsite.mdtext?rev=1597023&r1=1597022&r2=1597023&view=diff
==============================================================================
--- 
stanbol/site/trunk/content/docs/trunk/components/entityhub/managedsite.mdtext 
(original)
+++ 
stanbol/site/trunk/content/docs/trunk/components/entityhub/managedsite.mdtext 
Fri May 23 07:57:18 2014
@@ -91,7 +91,7 @@ where 'sparqlQuery.txt' refers to a file
 
 With [STANBOL-1169](https://issues.apache.org/jira/browse/STANBOL-1169) (since 
version `0.12.1`) a Sesame Repository registered as OSGI service can be used as 
Entityhub Yard.
 
-The following figure shows a Apache Marmotta Kiwi Repository registered as 
OSGI service. 
+The following figure shows an [Apache Marmotta Kiwi 
Repository](/docs/trunk/utils/marmotta-kiwi-repository-service) registered as 
an OSGi service. 
 
 ![Marmotta Kiwi Repository Service](marmotta-kiwi-repository-service.png)
 

Modified: 
stanbol/site/trunk/content/docs/trunk/utils/marmotta-kiwi-repository-service.mdtext
URL: 
http://svn.apache.org/viewvc/stanbol/site/trunk/content/docs/trunk/utils/marmotta-kiwi-repository-service.mdtext?rev=1597023&r1=1597022&r2=1597023&view=diff
==============================================================================
--- 
stanbol/site/trunk/content/docs/trunk/utils/marmotta-kiwi-repository-service.mdtext
 (original)
+++ 
stanbol/site/trunk/content/docs/trunk/utils/marmotta-kiwi-repository-service.mdtext
 Fri May 23 07:57:18 2014
@@ -16,14 +16,18 @@ configuration. The following figure show
 * `org.openrdf.repository.Repository.id`: The id of the Repository. Intended 
to be used by
 other components to track a specific repository instance.
 * `marmotta.kiwi.dialect`: The KiWi Database dialect. Currently Marmotta 
supports the
-H2Dialect, PostgreSQLDialect and MySQLDialect. Note that the selected dialect 
will select
+`H2Dialect`, `PostgreSQLDialect` and `MySQLDialect`. Note that the selected 
dialect will select
 different database driver. If those are not available the activation will 
throw an
 exception. PostgreSQL driver are embedded. H2 drivers are included in the 
default
-Bundlelist used by Stanbol.
+[Marmotta Kiwi 
Bundlelist](http://svn.apache.org/repos/asf/stanbol/branches/release-0.12/launchers/bundlelists/marmotta/kiwi/src/main/bundles/list.xml)
 used by Stanbol. For MySQL the corresponding dependency needs to be uncommented in
+the [Marmotta Kiwi 
Bundlelist](http://svn.apache.org/repos/asf/stanbol/branches/release-0.12/launchers/bundlelists/marmotta/kiwi/src/main/bundles/list.xml).
 * `marmotta.kiwi.dburl`: This property can be used to directly configure the 
DB URL. If
-present this is preferred over the configuration of the `marmotta.kiwi.host`, 
-`marmotta.kiwi.port`, `marmotta.kiwi.database` and `marmotta.kiwi.options` 
parameters.
-* `marmotta.kiwi.user` and `marmotta.kiwi.password` for the database
+present this is preferred over the configuration of the `host`, `port`, 
`database` and `options` parameters.
+* `marmotta.kiwi.host`: The host of the database (a file path in case of H2)
+* `marmotta.kiwi.port`: The port of the database (ignored in case of H2)
+* `marmotta.kiwi.user`: The database user
+* `marmotta.kiwi.password`: The password for the configured user
+* `marmotta.kiwi.options`: Additional database options
 * `marmotta.kiwi.cluster`: defines the name of the cluster. Different KiWi 
Repositories
 might use clusters with different names. If not present or empty clustering 
will be
 deactivated.
@@ -53,6 +57,8 @@ registered as OSGI service with the para
 The marked `org.openrdf.repository.Repository.id` property is of special 
interest as it
 can be used to track for a Sesame Repository with a specific name. As an 
Example the
 Repository with the name `dummy` can be tracked with the Filter
-`(&(objectClass=org.openrdf.repository.Repository)(org.openrdf.repository.Repository.id=dummy))`
+
+    :::text
+    
(&(objectClass=org.openrdf.repository.Repository)(org.openrdf.repository.Repository.id=dummy))
 
 
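
The filter shown in the marmotta-kiwi-repository-service.mdtext diff above follows a fixed pattern, so it can be generated for any repository id. The following Python sketch is an illustration only. The function name `repository_filter` is hypothetical, and the escaping of `\`, `*`, `(` and `)` reflects the usual LDAP-style filter rules rather than anything stated in this commit.

```python
def repository_filter(repo_id):
    """Build an OSGi filter that tracks a Sesame Repository by its id."""
    # assumption: LDAP-style escaping of the characters \ * ( ) in the value
    escaped = "".join("\\" + c if c in "\\*()" else c for c in repo_id)
    return ("(&(objectClass=org.openrdf.repository.Repository)"
            "(org.openrdf.repository.Repository.id=" + escaped + "))")


print(repository_filter("dummy"))
```
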

