Author: rwesten
Date: Wed Jan 25 07:24:22 2012
New Revision: 1235653
URL: http://svn.apache.org/viewvc?rev=1235653&view=rev
Log:
Added GraphChain documentation,
Added:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/chains/enhancer-graphchain-config.png
(with props)
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/chains/graphchain.mdtext
Added:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/chains/enhancer-graphchain-config.png
URL:
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/chains/enhancer-graphchain-config.png?rev=1235653&view=auto
==============================================================================
Binary file - no diff available.
Propchange:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/chains/enhancer-graphchain-config.png
------------------------------------------------------------------------------
svn:mime-type = application/octet-stream
Added:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/chains/graphchain.mdtext
URL:
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/chains/graphchain.mdtext?rev=1235653&view=auto
==============================================================================
---
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/chains/graphchain.mdtext
(added)
+++
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/chains/graphchain.mdtext
Wed Jan 25 07:24:22 2012
@@ -0,0 +1,66 @@
+Title: GraphChain
+
+
+
+### Configuration
+
+The GraphChain supports two variants for configuring the ExecutionPlan: GraphResource and ChainList.
+
+#### GraphResource
+
+A GraphResource is an RDF file available via the DataFileProvider. The easiest
way is to copy the RDF file defining the ExecutionPlan to the "/sling/datafile"
directory within the Stanbol home directory. The configuration of the
GraphChain then only needs to refer to that file, for example:
+
+ stanbol.enhancer.chain.graph.graphresource=myExecutionPlan.rdf
+
+The RDF encoding is guessed from the file extension. If the extension is
not recognized, the format can also be passed as an additional parameter:
+
+ stanbol.enhancer.chain.graph.graphresource=myExecutionPlan.something;format=application/rdf+xml
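+
+To make the structure of such a value explicit, here is a minimal Python sketch that splits it into the file name and the optional format. It is only an illustration and not the parser Stanbol uses; the function name `parse_graph_resource` is hypothetical.
+
```python
# Illustration only: a hypothetical helper, not part of Stanbol.
def parse_graph_resource(value):
    """Split 'file[;format=mime/type]' into (file_name, format or None)."""
    parts = [p.strip() for p in value.split(";")]
    file_name = parts[0]
    params = {}
    for part in parts[1:]:
        if not part:
            continue  # tolerate a trailing ';'
        key, _, val = part.partition("=")
        params[key.strip()] = val.strip()
    # None means: no explicit format, so it has to be guessed from the extension.
    return file_name, params.get("format")


print(parse_graph_resource("myExecutionPlan.rdf"))
print(parse_graph_resource(
    "myExecutionPlan.something;format=application/rdf+xml"))
```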
+
+The GraphChain tracks that file and activates itself as soon as the file
becomes available. Removing the file, waiting a few seconds and then providing
the new version should also work. Just replacing the file will not work,
because the DataFileProvider does not support updates. In such cases
it might be necessary to deactivate and reactivate the GraphChain.
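+
+The remove, wait, provide sequence could be scripted roughly as follows. This is a sketch under assumptions: the helper name `replace_datafile`, the default wait of five seconds and the directory passed in are all hypothetical, and the text above only says that this procedure "should" work.
+
```python
import shutil
import time
from pathlib import Path


def replace_datafile(new_version, datafiles_dir, wait_seconds=5.0):
    """Remove the old file, wait, then copy the new version into place.

    Hypothetical helper; the wait time is an assumed value, not a
    documented Stanbol setting.
    """
    target = Path(datafiles_dir) / Path(new_version).name
    if target.exists():
        target.unlink()               # 1. remove the old version
    time.sleep(wait_seconds)          # 2. give the tracker time to notice
    shutil.copyfile(new_version, target)  # 3. provide the new version
    return target
```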
+
+#### ChainList
+
+This variant allows the ExecutionPlan to be configured directly as the value
of the "stanbol.enhancer.chain.graph.chainlist" property. Both arrays and
Collections are supported.
+
+The syntax is defined as follows:
+
+ {engine-name};[optional];[dependsOn={engine-name1},{engine-name2}]
+
+The following example shows how this syntax can be used to define an
ExecutionPlan.
+
+ metaxa;optional
+ langId;dependsOn=metaxa
+ ner;dependsOn=langId
+ zemanta;optional
+ dbpedia-linking;dependsOn=ner
+ geonames;optional;dependsOn=ner
+ refactor;dependsOn=geonames,dbpedia-linking,zemanta
+
+Note that the internal order of the list does not influence the resulting
ExecutionPlan. Only the "dependsOn" properties are used to determine the
execution order of the Engines and whether Engines can be executed in parallel.
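+
+A minimal Python sketch of the syntax, showing that reordering the list yields the same result. It is not the parser Stanbol uses; the names `parse_chain_entry` and `parse_chain_list` are hypothetical, and rejecting unknown parameters is a choice made for this sketch.
+
```python
# Illustration only: hypothetical helpers, not part of Stanbol.
def parse_chain_entry(entry):
    """Parse '{engine-name};[optional];[dependsOn=a,b]'."""
    parts = [p.strip() for p in entry.split(";") if p.strip()]
    name, optional, depends_on = parts[0], False, frozenset()
    for part in parts[1:]:
        if part == "optional":
            optional = True
        elif part.startswith("dependsOn="):
            names = part[len("dependsOn="):].split(",")
            depends_on = frozenset(n.strip() for n in names if n.strip())
        else:
            # Assumption of this sketch: anything else is treated as an error.
            raise ValueError(f"unknown parameter {part!r} in {entry!r}")
    return name, optional, depends_on


def parse_chain_list(entries):
    """Map engine name -> (optional, dependsOn); list order is irrelevant."""
    return {name: (optional, deps)
            for name, optional, deps in map(parse_chain_entry, entries)}


CHAIN = [
    "metaxa;optional",
    "langId;dependsOn=metaxa",
    "ner;dependsOn=langId",
    "zemanta;optional",
    "dbpedia-linking;dependsOn=ner",
    "geonames;optional;dependsOn=ner",
    "refactor;dependsOn=geonames,dbpedia-linking,zemanta",
]

plan = parse_chain_list(CHAIN)
# Reversing the list gives the same plan: only dependsOn matters.
assert plan == parse_chain_list(reversed(CHAIN))
```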
+
+Within an OSGi configuration file
(org.apache.stanbol.enhancer.chain.graph.impl.GraphChain-myGraphChain.config)
this would look like:
+
+ stanbol.enhancer.chain.graph.chainlist=["metaxa;optional","langId;dependsOn\=metaxa","ner;dependsOn\=langId","zemanta;optional","dbpedia-linking;dependsOn\=ner","geonames;optional;dependsOn\=ner","refactor;dependsOn\=geonames,dbpedia-linking,zemanta"]
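+
+Such a line is tedious to write by hand. The following Python sketch builds it from a plain list; it only handles the "=" escaping visible in the line above and makes no attempt to cover other characters that may need escaping. The function name `to_config_line` is hypothetical.
+
```python
# Illustration only: a hypothetical helper, not part of Stanbol or Felix.
def to_config_line(key, entries):
    """Render entries as a quoted array value, escaping '=' as '\\='."""
    quoted = ['"' + entry.replace("=", "\\=") + '"' for entry in entries]
    return key + "=[" + ",".join(quoted) + "]"


line = to_config_line(
    "stanbol.enhancer.chain.graph.chainlist",
    [
        "metaxa;optional",
        "langId;dependsOn=metaxa",
        "ner;dependsOn=langId",
        "zemanta;optional",
        "dbpedia-linking;dependsOn=ner",
        "geonames;optional;dependsOn=ner",
        "refactor;dependsOn=geonames,dbpedia-linking,zemanta",
    ],
)
print(line)
```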
+
+The following screenshot of the Apache Felix Web Console shows the dialog for
the same configuration:
+
+
+
+### Execution
+
+In contrast to other Chain implementations, the ExecutionPlan does not need to
be calculated; it is provided directly by the user. This gives the greatest
possible freedom in defining how the execution should take place.
+
+#### Optional Engines
+
+The execution of optional engines is not mandatory. If they are not active or
their execution fails, the enhancement process continues. For users it is
important to note that even Engines that depend on an optional Engine that was
not executed will be called.
+
+Given the above example, this means that even if the 'metaxa' engine cannot be
executed, the 'langId' engine will be called by the EnhancementJobManager.
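+
+This behaviour can be simulated with a small Python sketch. It is not Stanbol code: `run_chain` is a hypothetical name, and the assumption that an unavailable required (non-optional) engine aborts the process is implied by the text rather than stated.
+
```python
# engine name -> (optional, dependsOn); first three engines of the example
PLAN = {
    "metaxa": (True, set()),
    "langId": (False, {"metaxa"}),
    "ner": (False, {"langId"}),
}


def run_chain(plan, unavailable):
    """Return the engines that get called, in order (hypothetical helper)."""
    done, called = set(), []
    pending = dict(plan)
    while pending:
        ready = sorted(n for n, (_, deps) in pending.items() if deps <= done)
        if not ready:
            raise ValueError("cyclic or unresolved dependsOn")
        for name in ready:
            optional, _ = pending.pop(name)
            if name in unavailable:
                if not optional:
                    # Assumption: a missing required engine aborts the process.
                    raise RuntimeError(f"required engine {name!r} failed")
                # A skipped optional engine still counts as completed,
                # so its dependents are not blocked.
            else:
                called.append(name)
            done.add(name)
    return called


print(run_chain(PLAN, unavailable={"metaxa"}))  # ['langId', 'ner']
```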
+
+#### Parallel Execution
+
+Engines are executed as soon as all Engines they depend on have completed. This
also applies when optional engines were skipped (because they are not active)
or failed. This means that in most cases several EnhancementEngines can be
executed in parallel.
+
+Given the above example, both the 'zemanta' and the 'metaxa' engines are
executed as soon as the enhancement process starts.
+When 'metaxa' has finished, the 'langId' engine is called. After 'langId'
finishes its work, the EnhancementJobManager calls the 'ner' engine. After that,
both the 'dbpedia-linking' and the 'geonames' engines are called. At this time
three engines might run simultaneously, assuming that 'zemanta' has not finished
yet. Before the 'refactor' engine can be executed, it needs to wait for all of
these engines to complete.
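+
+The walkthrough above can be reproduced by computing, for every engine, the earliest step at which all of its dependencies are done. This Python sketch is a simplification: it ignores how long each engine runs, so 'zemanta' appears only in step 1 although it may still be running later. The names `earliest_steps` and `group_by_step` are hypothetical.
+
```python
DEPENDS_ON = {
    "metaxa": [],
    "langId": ["metaxa"],
    "ner": ["langId"],
    "zemanta": [],
    "dbpedia-linking": ["ner"],
    "geonames": ["ner"],
    "refactor": ["geonames", "dbpedia-linking", "zemanta"],
}


def earliest_steps(depends_on):
    """Step of an engine = 1 + the latest step among its dependencies."""
    steps = {}

    def step_of(name, path=()):
        if name in path:
            raise ValueError(f"dependency cycle at {name!r}")
        if name not in steps:
            deps = depends_on[name]
            steps[name] = 1 + max(
                (step_of(d, path + (name,)) for d in deps), default=0)
        return steps[name]

    for name in depends_on:
        step_of(name)
    return steps


def group_by_step(steps):
    """Group engine names by step; engines in one group may run in parallel."""
    groups = {}
    for name, step in steps.items():
        groups.setdefault(step, []).append(name)
    return {step: sorted(groups[step]) for step in sorted(groups)}


for step, engines in group_by_step(earliest_steps(DEPENDS_ON)).items():
    print(step, engines)
```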
+
+Note that parallel execution only takes place if both the
EnhancementJobManager in use and the engines involved support asynchronous
enhancement.
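+
+As a rough picture of what an asynchronous job manager does, the following Python sketch submits each engine to a thread pool as soon as all engines it depends on have finished. It is not Stanbol code: `execute` and `run_engine` are hypothetical names, and failure handling and optional engines are left out.
+
```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

DEPENDS_ON = {
    "metaxa": [],
    "langId": ["metaxa"],
    "ner": ["langId"],
    "zemanta": [],
    "dbpedia-linking": ["ner"],
    "geonames": ["ner"],
    "refactor": ["geonames", "dbpedia-linking", "zemanta"],
}


def execute(depends_on, run_engine, max_workers=4):
    """Run engines in dependency order; return the completion order."""
    finished, started, order = set(), set(), []
    running = {}  # future -> engine name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while len(finished) < len(depends_on):
            # Submit every engine whose dependencies have all completed.
            for name, deps in depends_on.items():
                if name not in started and set(deps) <= finished:
                    started.add(name)
                    running[pool.submit(run_engine, name)] = name
            if not running:
                raise ValueError("cyclic or unresolved dependsOn")
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                future.result()  # re-raise an engine error, if any
                name = running.pop(future)
                finished.add(name)
                order.append(name)
    return order


print(execute(DEPENDS_ON, run_engine=lambda name: None))
```
+
+With `max_workers=1` the same plan still completes, but the engines run one after another, which corresponds to a setup without asynchronous support.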