Author: rwesten
Date: Fri Feb 17 11:09:18 2012
New Revision: 1245391
URL: http://svn.apache.org/viewvc?rev=1245391&view=rev
Log:
minor corrections
Modified:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.mdtext
Modified:
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.mdtext
URL:
http://svn.apache.org/viewvc/incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.mdtext?rev=1245391&r1=1245390&r2=1245391&view=diff
==============================================================================
---
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.mdtext
(original)
+++
incubator/stanbol/site/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.mdtext
Fri Feb 17 11:09:18 2012
@@ -18,7 +18,7 @@ the <code>Content-type</code> header. Th
:::bash
curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" \
- --data "John Smith was born in London." ${it.serviceUrl}
+ --data "John Smith was born in London." http://localhost:8080/enhancer
The list of mimetypes accepted as inputs depends on the deployed engines. By
default most Enhancement Engines can only process plain text content. However,
Enhancement Engines like [Metaxa](engines/metaxaengine.html) can be used to
create 'text/plain' versions of parsed content. This also makes it possible to
enhance content with MIME types such as HTML, PDF and MS Office documents (see
the Metaxa documentation for details).
@@ -36,19 +36,18 @@ Stanbol enhancer is able to serialize th
* __uri={content-item-uri}:__ By default the URI of the content item being
enhanced is a local, non-dereferenceable URI automatically built out of a hash
digest of the binary content. Sometimes it might be helpful to provide the URI
of the [ContentItem](contentitem.html) to be used in the enhancements RDF graph.
* __executionmetadata=true/false:__ Enables the inclusion of [execution
metadata](executionmetadata.html) in the enhancement metadata of the response.
Such data also include the [execution plan](chains/executionplan.html) used to
enhance the parsed content. This information is typically only useful to
clients that want to know how the parsed content was processed by the enhancer.
NOTE that the execution metadata can also be requested by using the multi-part
content item API described below.
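As a rough illustration of how these query parameters are attached to a request, here is a minimal Java sketch. It assumes Java 11+ (`java.net.http`) and an Enhancer reachable at `http://localhost:8080/enhancer`; the class `EnhancerRequestSketch` and the helper `buildRequest` are hypothetical names and not part of Stanbol. The sketch only builds the request. Sending it requires a running server.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class EnhancerRequestSketch {

    /**
     * Builds a plain text POST request that sets the "uri" and
     * "executionmetadata" query parameters (hypothetical helper).
     */
    public static HttpRequest buildRequest(String enhancerUrl,
                                           String contentItemUri,
                                           boolean executionMetadata,
                                           String text) {
        // Percent-encode the content item URI so it is safe inside a query string
        String query = "uri=" + URLEncoder.encode(contentItemUri, StandardCharsets.UTF_8)
                + "&executionmetadata=" + executionMetadata;
        URI target = URI.create(enhancerUrl + "?" + query);
        return HttpRequest.newBuilder(target)
                .header("Accept", "text/turtle")        // requested RDF serialization
                .header("Content-Type", "text/plain")   // type of the content being sent
                .POST(HttpRequest.BodyPublishers.ofString(text, StandardCharsets.UTF_8))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest(
                "http://localhost:8080/enhancer",       // assumed local endpoint
                "urn:fise-example-content-item",
                true,
                "John Smith was born in London.");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

The request could then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` once an Enhancer instance is available.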
-The following example shows how to send an enhancement request with a
-custom content item URI that will include the execution metadata in the
-response.
-
+The following example shows how to send an enhancement request with a custom
content item URI that will include the execution metadata in the response.
+In addition, this request is directed to the [Enhancement Chain](chains) with
the name "dbpedia-keyword".
+
:::bash
curl -X POST -H "Accept: text/turtle" -H "Content-type: text/plain" \
--data "John Smith was born in London." \
-
"${it.serviceUrl}?uri=urn:fise-example-content-item&executionmetadata=true"
+
"http://localhost:8080/enhancer/chain/dbpedia-keyword?uri=urn:fise-example-content-item&executionmetadata=true"
## Multi-part ContentItem support
-The multi-part ContentItem extensions to the RESTful API (introduced by
[STANBOL-481](https://issues.apache.org/jira/browse/STANBOL-481)) are
considered an advanced usage of the Stanbol Enhancer.
+The multi-part <code>ContentItem</code> extensions to the RESTful API
(introduced by
[STANBOL-481](https://issues.apache.org/jira/browse/STANBOL-481)) are
considered an advanced usage of the Stanbol Enhancer.
Users will want to use these extensions if they need to:
@@ -81,7 +80,7 @@ Requests that use an <code>Accept: {mime
### Parsing multiple ContentParts
-Requests to the Stanbol Enhancer with the <code>Content-Type:
multipart/form-data</code> are considered to contain a ContentItem serialized
as MultiPart MIME. The exact specification of the [MultiPart MIME format for
ContentItems](contentitem.html#multipart_mime_serialization) is provided by the
documentation of the ContentItem.
+Requests to the Stanbol Enhancer with the <code>Content-Type:
multipart/form-data</code> are considered to contain a <code>ContentItem</code>
serialized as MultiPart MIME. The exact specification of the [MultiPart MIME
format for ContentItems](contentitem.html#multipart_mime_serialization) is
provided by the documentation of the <code>ContentItem</code>.
The combination of <code>multipart/form-data</code> encoded requests with
QueryParameters as described above allows for the usage of [MultiPart MIME
format for ContentItems](contentitem.html#multipart_mime_serialization) for
both request and response.
@@ -130,9 +129,9 @@ This will result in an Response with the
--contentItem--
-See also the formal specification of the [MultiPart MIME format for
ContentItems](contentitem.html#multipart_mime_serialization) for ContentItems.
+See also the formal specification of the [MultiPart MIME format for
ContentItems](contentitem.html#multipart_mime_serialization) for
<code>ContentItem</code>s.
-#### Example 2: Directly return the plain text version of parsed content
+### Example 2: Directly return the plain text version of parsed content
By using '<code>omitMetadata=true</code>' together with "Accept:
{requested-content-type}", the multi-part content API allows clients to directly
request the transcoded version of the content in the format {requested-content-type}.
@@ -150,7 +149,7 @@ To make this work the requested [Enhance
Note also that because the metadata are omitted from responses to such requests,
it is recommended to configure/use a chain that does no further processing
on the transcoded content.
-#### Example 3: Parse multiple content versions
+### Example 3: Parse multiple content versions
This example will use the "httpmime" module of Apache HttpComponents
to create the Multipart MIME sent to the Stanbol Enhancer.
@@ -206,11 +205,11 @@ The created Multipart MIME content MUST
Note that for such requests [Metaxa](engines/metaxaengine.html) will still try
to extract metadata of the parsed MS Word document, but all other engines will
use the plain text version as parsed by the request for processing.
-#### Example 4: Parse existing free text annotations
+### Example 4: Parse existing free text annotations
This example shows how the multi-part content item API can be used to parse
already existing tags for a parsed content to the Stanbol Enhancer. For this
example it is important to understand that parsed metadata needs to conform to
the Stanbol Enhancement Structure. Because of that, this example consists of two
main steps:
-1. Convert user tags to TextAnnotations
+1. Convert user tags to <code>TextAnnotation</code>s
2. Send existing Metadata along with the Content to the Stanbol Enhancer
Also note that the code snippets use utilities provided by the
"org.apache.stanbol.enhancer.servicesapi" module. Clerezza is used as the RDF
framework. Both dependencies are easily replaceable.
@@ -238,7 +237,7 @@ The processing of parsed tags that use o
//in case users have assigned URIs
user = new UriRef("http://my.cms.org/users/rudof.huber");
-Now we can convert the information to TextAnnoations
+Now we can convert the User Tags to <code>TextAnnotation</code>s
:::java
//first create a URI for the text annotation. Here we use a random URN
@@ -261,7 +260,7 @@ Now we can convert the information to Te
graph.add(new TripleImpl(ta, Properties.DC_CREATOR,user));
}
-Now the 'graph' contains a valid TextAnnotation for the given user tag. This
should be done for all tags of the current content.
+Now the 'graph' contains a valid <code>TextAnnotation</code> for the given
user tag. This should be done for all tags of the current content.
In the next step we need to serialize the RDF data. Again, I will use
Clerezza as the API here, but any RDF framework will provide similar functionality.
@@ -289,7 +288,7 @@ Now we need to create the MultiPart MIME
}
});
-Note that because the <code>StringBody</code> class provided by the "httpmime"
framework does not set a Filename we need to override this method and return
the URI of the content item. This is essential, because we need to ensure that the
URI of the ContentItem is the same as the URI (variable '<code>ciUri</code>')
as used when creating the TextAnnotations for the user tags.
+Note that because the <code>StringBody</code> class provided by the "httpmime"
framework does not set a Filename we need to override this method and return
the URI of the content item. This is essential, because we need to ensure that the
URI of the <code>ContentItem</code> is the same as the URI (variable
'<code>ciUri</code>') as used when creating the <code>TextAnnotation</code>s
for the user tags.
For the following code snippet note that we can directly add the content to
the content item container. A <code>'multipart/alternate'</code> container is
only required if we need to send multiple alternate content versions (as shown
in 'Example 3').