Author: rwesten
Date: Wed Feb 15 10:25:24 2012
New Revision: 1244426
URL: http://svn.apache.org/viewvc?rev=1244426&view=rev
Log:
Added enhanceroverview figure to the enhancer documentation. Added a section -
Using the Stanbol Enhancer - to the beginning of this document
Modified:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/index.mdtext
Modified:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/index.mdtext
URL:
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/index.mdtext?rev=1244426&r1=1244425&r2=1244426&view=diff
==============================================================================
---
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/index.mdtext
(original)
+++
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/index.mdtext
Wed Feb 15 10:25:24 2012
@@ -1,13 +1,62 @@
-Title: Enhancer
+Title: Stanbol Enhancer
-This stateless interface allows the caller to submit content to the Apache
Stanbol [enhancer engines](engines/) and get the resulting enhancements
formatted as RDF at once without storing anything on the server-side.
+The Apache Stanbol Enhancer provides both a RESTful and a Java API that allow
callers to extract features from passed content. In more detail, the passed
content is processed by [enhancement engines](engines) as defined by the called
[Enhancement Chain](chains/enhancementchain.html).
+
+## Using the Stanbol Enhancer
+
+The following figure provides an overview of the RESTful as well as the
Java API provided by the Stanbol Enhancer.
+
+
+
+### RESTful service:
The content to analyze should be sent in a POST request with the mimetype
specified in the Content-type header. The response will hold the RDF
enhancement serialized in the format specified in the Accept header:
curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" \
- --data "John Smith was born in London." http://localhost:8080/engines
+ --data "John Smith was born in London." http://localhost:8080/enhancer
+
+See the documentation provided by the Stanbol Web UI for details (e.g.
"http://localhost:8080/enhancer", assuming that Apache Stanbol runs on
localhost:8080).
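
The curl call above can also be issued from plain Java. The following is a minimal sketch using only the JDK, not an official Stanbol client: the class and method names (`EnhancerRestSketch`, `prepare`, `enhance`) are hypothetical, and it assumes that the service answers a successful request with HTTP 200 and a UTF-8 encoded body.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Apache Stanbol.
class EnhancerRestSketch {

    // Builds the POST request with the same headers as the curl example.
    // No network connection is opened here.
    static HttpURLConnection prepare(String endpoint, String contentType, String accept)
            throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(endpoint).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true); // the request carries a body
        conn.setRequestProperty("Content-Type", contentType);
        conn.setRequestProperty("Accept", accept);
        return conn;
    }

    // Sends the text to the endpoint and returns the serialized RDF response.
    static String enhance(String endpoint, String text) throws IOException {
        HttpURLConnection conn = prepare(endpoint, "text/plain", "text/turtle");
        try (OutputStream out = conn.getOutputStream()) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            // Assumption: anything other than 200 is treated as a failure.
            throw new IOException("Unexpected HTTP status " + status);
        }
        try (InputStream in = conn.getInputStream()) {
            // Assumption: the response body is UTF-8 encoded.
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }
}
```

With Apache Stanbol running on localhost:8080, `EnhancerRestSketch.enhance("http://localhost:8080/enhancer", "John Smith was born in London.")` would correspond to the curl request shown above.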
+
+### Java API:
+
+The usage of the Java API requires the following OSGi services:
+
+    @Reference
+    EnhancementJobManager jobManager;
+    @Reference
+    EnhancementChainManager chainManager;
+
+Provided these services are available, the following code snippet shows how to
enhance a content item:
+
+    InputStream content; //the content (assuming an HTML document)
+    String chainName; //the name of the chain or null to use the default
+    ContentItem contentItem = new InMemoryContentItem(
+        IOUtils.toByteArray(content), "text/html; charset=UTF-8");
+    //get the EnhancementChain
+    Chain enhancementChain;
+    if(chainName == null){
+        enhancementChain = chainManager.getDefault();
+    } else {
+        enhancementChain = chainManager.getChain(chainName);
+    }
+    try { //enhance the content
+        jobManager.enhanceContent(contentItem, enhancementChain);
+    } catch (EnhancementException e) {
+        //TODO: handle or log the failure instead of silently ignoring it
+    }
+
+    //get the enhancement results
+    MGraph enhancements = contentItem.getMetadata();
+
+However, the ContentItem may, depending on the executed [Enhancement
Engines](engines), also provide additional information. The following snippet
shows how to retrieve the plain text version of the parsed HTML content.
+
+    Entry<UriRef,Blob> textContentPart =
+        ContentItemHelper.getBlob(contentItem,
+            Collections.singleton("text/plain"));
+    Blob textBlob = textContentPart.getValue();
+    //use the charset parameter of the Blob; fall back to UTF-8 if absent
+    String charset = textBlob.getParameter().get("charset");
+    String plainText = IOUtils.toString(
+        textBlob.getStream(),
+        charset == null ? "UTF-8" : charset);
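
The charset fallback used above can be tried out without a running Stanbol instance. The sketch below reproduces only that step with plain JDK classes: it reads the `charset` parameter from a media type string such as `text/html; charset=UTF-8` and decodes bytes with it, defaulting to UTF-8. The class `CharsetFallbackSketch` and its methods are hypothetical and do not reflect how Stanbol parses media types internally; quoted parameter values are not handled.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of Apache Stanbol.
class CharsetFallbackSketch {

    // Extracts the parameters of a media type such as "text/html; charset=UTF-8".
    // Parameter names are lower-cased; quoted values are not handled.
    static Map<String, String> parameters(String mediaType) {
        Map<String, String> params = new HashMap<>();
        String[] parts = mediaType.split(";");
        for (int i = 1; i < parts.length; i++) { // parts[0] is the type itself
            int eq = parts[i].indexOf('=');
            if (eq > 0) {
                String name = parts[i].substring(0, eq).trim().toLowerCase(Locale.ROOT);
                params.put(name, parts[i].substring(eq + 1).trim());
            }
        }
        return params;
    }

    // Decodes the bytes with the declared charset, or UTF-8 if none is declared.
    static String decode(byte[] data, String mediaType) {
        String charset = parameters(mediaType).get("charset");
        Charset cs = charset == null ? StandardCharsets.UTF_8 : Charset.forName(charset);
        return new String(data, cs);
    }
}
```

Defaulting to UTF-8 mirrors the `charset == null ? "UTF-8" : charset` expression in the snippet above; a caller that knows a different default for its content could pass it in explicitly instead.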
-The list of mimetypes accepted as inputs depends on the deployed engines. By
default only text/plain content will be analyzed.
## List of Available Enhancement Engines