Author: buildbot
Date: Fri Feb 17 10:57:16 2012
New Revision: 805168
Log:
Staging update by buildbot for stanbol
Modified:
websites/staging/stanbol/trunk/ (props changed)
websites/staging/stanbol/trunk/content/stanbol/css/stanbol.css
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.html
Propchange: websites/staging/stanbol/trunk/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Fri Feb 17 10:57:16 2012
@@ -1 +1 @@
-1245375
+1245386
Modified: websites/staging/stanbol/trunk/content/stanbol/css/stanbol.css
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/css/stanbol.css (original)
+++ websites/staging/stanbol/trunk/content/stanbol/css/stanbol.css Fri Feb 17
10:57:16 2012
@@ -137,16 +137,19 @@ div.codehilite {
border: 1px solid #bebab0;
line-height: 133%;
}
-span.c1, span.cm {
+span.c1, span.cm { /* comments */
color: #667f5b;
}
-span.k {
+span.nt { /* XML Nodes */
+ color: #667f5b;
+}
+span.k { /* keyword */
color: #8e2b75;
font-weight:bold;
}
-span.nd {
+span.nd { /* Java Annotations */
color: #646464;
}
-span.s , span.s2{
+span.s , span.s2 { /* Strings */
color: #1500ff;
}
\ No newline at end of file
Modified:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.html
==============================================================================
---
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.html
(original)
+++
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/enhancerrest.html
Fri Feb 17 10:57:16 2012
@@ -133,28 +133,28 @@ Requests that use an <code>Accept: {mime
<h3 id="parsing_multiple_contentparts">Parsing multiple ContentParts</h3>
<p>Requests to the Stanbol Enhancer with the <code>Content-Type:
multipart/form-data</code> are considered to contain a ContentItem serialized
as MultiPart MIME. The exact specification of the <a
href="contentitem.html#multipart_mime_serialization">MultiPart MIME format for
ContentItems</a> is provided by the documentation of the ContentItem.</p>
<p>The combination of <code>multipart/form-data</code> encoded requests with
QueryParameters as described above allows for the usage of the <a
href="contentitem.html#multipart_mime_serialization">MultiPart MIME format for
ContentItems</a> for both request and response.</p>
-<h3
id="example_usages_of_the_multi-part_content_item_restful_api_extensions">Example
usages of the multi-part content item RESTful API extensions</h3>
-<p>The following examples show some typical usages of the multi-part content
item RESTful API. Note that for better readability the values of the query
parameters are
-not URLEncoded.</p>
-<p>Return Metadata and transformed Content versions</p>
+<h2 id="using_the_multi-part_content_item_restful_api_extensions">Using the
multi-part content item RESTful API extensions</h2>
+<p>The following examples show typical usage scenarios of the multi-part
content item RESTful API. Note that for better readability the values of the
query parameters are not URLEncoded.</p>
+<h3 id="example_1_return_metadata_and_content">Example 1: Return metadata and
content</h3>
+<p>The first example shows how users can request both the metadata and
transcoded versions of the parsed content.
+This can be achieved relatively easily by using
"<code>outputContent=*/*</code>" in combination with
"<code>omitParsed=true</code>".</p>
<div class="codehilite"><pre>curl -v -X POST -H <span class="s2">"Accept:
multipart/form-data"</span> <span class="se">\</span>
-H <span class="s2">"Content-type: text/html;
charset=UTF-8"</span> <span class="se">\</span>
- --data <span
class="s2">"&lt;html&gt;&lt;body&gt;&lt;p&gt;John
Smith was born in
London.&lt;/p&gt;&lt;/body&gt;&lt;/html&gt;"</span>
<span class="se">\</span>
+ --data <span class="s2">"<html><body><p>John Smith
was born in London.</p></body></html>"</span> <span
class="se">\</span>
<span
class="s2">"${it.serviceUrl}?outputContent=*/*&omitParsed=true&rdfFormat=application/rdf+xml"</span>
</pre></div>
-<p><strong>Example 1: Return metadata and content</strong></p>
<p>This will result in a response with the mime type <code>"Content-Type:
multipart/form-data; charset=UTF-8; boundary=contentItem"</code> and the
Metadata as well as the plain text version of the parsed HTML document as
content.</p>
<div class="codehilite"><pre>--contentItem
Content-Disposition: form-data; name="metadata";
filename="urn:content-item-sha1-76e44d4b51c626bbed38ce88370be88702de9341"
Content-Type: application/rdf+xml; charset=UTF-8;
Content-Transfer-Encoding: 8bit
-&lt;rdf:RDF
+<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
[..the metadata formatted as RDF+XML..]
-&lt;/rdf:RDF&gt;
+</rdf:RDF>
--contentItem
Content-Disposition: form-data; name="content"
@@ -173,11 +173,12 @@ John Smith was born in London.
</pre></div>
-<p><strong>Example 2: Directly return the plain text version of parsed
content</strong></p>
-<p>This shows how the Apache Stanbol Enhancer can be used to transcode parsed
content.</p>
+<p>See also the formal specification of the <a
href="contentitem.html#multipart_mime_serialization">MultiPart MIME format for
ContentItems</a>.</p>
+<h4
id="example_2_directly_return_the_plain_text_version_of_parsed_content">Example
2: Directly return the plain text version of parsed content</h4>
+<p>By using '<code>omitMetadata=true</code>' together with "Accept:
{requested-content-type}", the multi-part content API allows clients to directly
request the transcoded version of the content in the format {requested-content-type}.
</p>
<div class="codehilite"><pre>curl -v -X POST -H "Accept: text/plain"
\
-H "Content-type: text/html; charset=UTF-8" \
- --data "<span class="ni">&lt;</span>html<span
class="ni">&gt;&lt;</span>body<span
class="ni">&gt;&lt;</span>p<span class="ni">&gt;</span>John Smith
was born in London.<span class="ni">&lt;</span>/p<span
class="ni">&gt;&lt;</span>/body<span
class="ni">&gt;&lt;</span>/html<span class="ni">&gt;</span>" \
+ --data "<span class="nt"><html><body><p></span>John
Smith was born in London.<span
class="nt"></p></body></html></span>" \
"<span class="cp">${</span><span class="n">it</span><span
class="o">.</span><span class="n">serviceUrl</span><span
class="cp">}</span>?omitMetadata=true"
</pre></div>
@@ -187,8 +188,9 @@ John Smith was born in London.
</pre></div>
-<p>To make this work the requested <a href="chains">Enhancement Chain</a> will
need to include an engine (e.g. <a href="engines/metaxaengine.html">Metaxa</a>)
that supports transcoding the parsed content. Not that because the metadata are
omitted by responses to such requests it is also recommended to configure/use a
chain that does no further processing on the transcoded content. </p>
-<p><strong>Example 3: Parse multiple content versions</strong></p>
+<p>To make this work the requested <a href="chains">Enhancement Chain</a> will
need to include an engine (e.g. <a href="engines/metaxaengine.html">Metaxa</a>)
that supports transcoding the parsed content. If no content of the requested
type is available, the request is answered with "<code>404 NOT FOUND</code>".
</p>
+<p>Note also that, because the metadata are omitted from responses to such
requests, it is recommended to configure/use a chain that does no further
processing on the transcoded content. </p>
+<h4 id="example_3_parse_multiple_content_versions">Example 3: Parse multiple
content versions</h4>
<p>This example will use the "httpmime" part of the Apache commons
httpcomponents to create the Multipart MIME sent to the Stanbol enhancer.</p>
<div class="codehilite"><pre><span class="nt"><dependency></span>
<span class="nt"><groupId></span>org.apache.httpcomponents<span
class="nt"></groupId></span>
@@ -223,8 +225,8 @@ John Smith was born in London.
<span
class="s">"application/vnd.openxmlformats-officedocument.wordprocessingml.document"</span><span
class="o">,</span>
<span class="s">"example.docx"</span><span
class="o">)));</span>
- <span class="c1">//now add the alternate plain text version</span>
- <span class="n">content</span><span class="o">.</span><span
class="na">addBodyPart</span><span class="o">(</span><span class="k">new</span>
<span class="n">FormBodyPart</span><span class="o">(</span>
+<span class="c1">//now add the alternate plain text version</span>
+<span class="n">content</span><span class="o">.</span><span
class="na">addBodyPart</span><span class="o">(</span><span class="k">new</span>
<span class="n">FormBodyPart</span><span class="o">(</span>
<span
class="s">"http://www.example.com/example.docx"</span><span
class="o">,</span> <span class="c1">//the id of the content part</span>
<span class="k">new</span> <span class="nf">StringBody</span><span
class="o">(</span> <span class="c1">//use a StringBody to avoid binary encoding
for text</span>
<span class="n">IOUtils</span><span class="o">.</span><span
class="na">toString</span><span class="o">(</span><span
class="n">plainIn</span><span class="o">),</span> <span class="c1">//apache
commons IO utility</span>
@@ -235,13 +237,13 @@ John Smith was born in London.
<span class="c1">//Stanbol Enhancer</span>
<span class="n">HttpPost</span> <span class="n">request</span> <span
class="o">=</span> <span class="k">new</span> <span
class="n">HttpPost</span><span class="o">(</span><span
class="s">"http://localhost:8080/enhancer"</span><span
class="o">);</span>
<span class="n">request</span><span class="o">.</span><span
class="na">setEntity</span><span class="o">(</span><span
class="n">contentItem</span><span class="o">);</span>
-<span class="n">request</span><span class="o">.</span><span
class="na">setHeader</span><span class="o">(</span><span
class="s">"Accept"</span><span class="o">,</span><span
class="err">"</span><span class="n">application</span><span
class="o">/</span><span class="n">rdf</span><span class="o">+</span><span
class="n">xml</span><span class="o">);</span>
+<span class="n">request</span><span class="o">.</span><span
class="na">setHeader</span><span class="o">(</span><span
class="s">"Accept"</span><span class="o">,</span><span
class="s">"application/rdf+xml"</span><span class="o">);</span>
<span class="n">Response</span> <span class="n">response</span> <span
class="o">=</span> <span class="n">httpClient</span><span
class="o">.</span><span class="na">execute</span><span class="o">(</span><span
class="n">request</span><span class="o">);</span>
</pre></div>
<p>Note that for such requests <a href="engines/metaxaengine.html">Metaxa</a>
will still try to extract metadata of the parsed MS Word document, but all
other engines will use the plain text version as parsed by the request for
processing.</p>
-<p><strong>Example 4: Parse existing free text annotations</strong></p>
+<h4 id="example_4_parse_existing_free_text_annotations">Example 4: Parse
existing free text annotations</h4>
<p>This example shows how the multi-part content item API can be used to parse
already existing tags for a parsed content to the Stanbol Enhancer. For this
example it is important to understand that parsed metadata need to conform to
the Stanbol Enhancement Structure. Because of that this example consists of two
main steps:</p>
<ol>
<li>Convert user tags to TextAnnotations</li>
@@ -296,31 +298,33 @@ John Smith was born in London.
<p>Now the 'graph' contains a valid TextAnnotation for the given user tag.
This should be done for all tags of the current content.</p>
<p>In the next step we need to serialize the RDF data. Again I will use
Clerezza as the API here, but any RDF framework will provide similar functionality.</p>
-<p>:::java
- ByteArrayOutputStream out = new ByteArrayOutputStream();
- //this tells the Serializer to create "application/rdf+xml"
- serializer.serialize(out, metadata, SupportedFormat.RDF_XML);
- String rdfContent = new String(out.toByteArray(),UTF8);</p>
+<div class="codehilite"><pre><span class="n">ByteArrayOutputStream</span>
<span class="n">out</span> <span class="o">=</span> <span class="k">new</span>
<span class="n">ByteArrayOutputStream</span><span class="o">();</span>
+<span class="c1">//this tells the Serializer to create
"application/rdf+xml"</span>
+<span class="n">serializer</span><span class="o">.</span><span
class="na">serialize</span><span class="o">(</span><span
class="n">out</span><span class="o">,</span> <span
class="n">metadata</span><span class="o">,</span> <span
class="n">SupportedFormat</span><span class="o">.</span><span
class="na">RDF_XML</span><span class="o">);</span>
+<span class="n">String</span> <span class="n">rdfContent</span> <span
class="o">=</span> <span class="k">new</span> <span
class="n">String</span><span class="o">(</span><span class="n">out</span><span
class="o">.</span><span class="na">toByteArray</span><span
class="o">(),</span><span class="n">UTF8</span><span class="o">);</span>
+</pre></div>
+
+
<p>Now we need to create the MultiPart MIME content item containing the
metadata and the content</p>
-<p>:::java
- String content; //the content we want to send to the Stanbol Enhancer</p>
-<div class="codehilite"><pre><span class="sr">//</span><span
class="n">the</span> <span class="n">container</span> <span
class="k">for</span> <span class="n">the</span> <span
class="n">ContentITem</span>
-<span class="n">MultipartEntity</span> <span class="n">contentItem</span>
<span class="o">=</span> <span class="k">new</span> <span
class="n">MultipartEntity</span><span class="p">(</span><span
class="n">null</span><span class="p">,</span> <span class="n">null</span> <span
class="p">,</span><span class="n">UTF8</span><span class="p">);</span>
-
-<span class="sr">//</span><span class="n">The</span> <span
class="n">Metadata</span> <span class="n">MUST</span> <span class="n">BE</span>
<span class="n">the</span> <span class="n">first</span> <span
class="n">element</span>
-<span class="n">contentItem</span><span class="o">.</span><span
class="n">addPart</span><span class="p">(</span>
- <span class="s">"metadata"</span><span class="p">,</span> <span
class="sr">//</span><span class="n">the</span> <span class="n">name</span>
<span class="n">MUST</span> <span class="n">BE</span> <span
class="s">"metadata"</span>
- <span class="k">new</span> <span class="n">StringBody</span><span
class="p">(</span><span class="n">rdfContent</span><span
class="p">,</span><span class="n">SupportedFormat</span><span
class="o">.</span><span class="n">RDF_XML</span><span class="p">,</span><span
class="n">UTF8</span><span class="p">){</span>
- <span class="nv">@Override</span>
- <span class="n">public</span> <span class="n">String</span> <span
class="n">getFilename</span><span class="p">()</span> <span class="p">{</span>
<span class="sr">//</span><span class="n">The</span> <span
class="n">filename</span> <span class="n">MUST</span> <span class="n">BE</span>
<span class="n">the</span>
- <span class="k">return</span> <span class="n">ciUri</span><span
class="o">.</span><span class="n">getUnicodeString</span><span
class="p">();</span> <span class="sr">//</span><span class="n">uri</span> <span
class="n">of</span> <span class="n">the</span> <span
class="n">ContentItem</span>
- <span class="p">}</span>
- <span class="p">});</span>
+<div class="codehilite"><pre><span class="n">String</span> <span
class="n">content</span><span class="o">;</span> <span class="c1">//the content
we want to send to the Stanbol Enhancer</span>
+
+<span class="c1">//the container for the ContentItem</span>
+<span class="n">MultipartEntity</span> <span class="n">contentItem</span>
<span class="o">=</span> <span class="k">new</span> <span
class="n">MultipartEntity</span><span class="o">(</span><span
class="kc">null</span><span class="o">,</span> <span class="kc">null</span>
<span class="o">,</span><span class="n">UTF8</span><span class="o">);</span>
+
+<span class="c1">//The Metadata MUST BE the first element</span>
+<span class="n">contentItem</span><span class="o">.</span><span
class="na">addPart</span><span class="o">(</span>
+ <span class="s">"metadata"</span><span class="o">,</span> <span
class="c1">//the name MUST BE "metadata" </span>
+ <span class="k">new</span> <span class="nf">StringBody</span><span
class="o">(</span><span class="n">rdfContent</span><span
class="o">,</span><span class="n">SupportedFormat</span><span
class="o">.</span><span class="na">RDF_XML</span><span class="o">,</span><span
class="n">UTF8</span><span class="o">){</span>
+ <span class="nd">@Override</span>
+ <span class="kd">public</span> <span class="n">String</span> <span
class="nf">getFilename</span><span class="o">()</span> <span class="o">{</span>
<span class="c1">//The filename MUST BE the</span>
+ <span class="k">return</span> <span class="n">ciUri</span><span
class="o">.</span><span class="na">getUnicodeString</span><span
class="o">();</span> <span class="c1">//uri of the ContentItem</span>
+ <span class="o">}</span>
+ <span class="o">});</span>
</pre></div>
-<p>Note that because the StringBody class provided my the "httpmime" framework
does not set a Filename we need to override this method and return the URI of
the content item. This is essential, because we need ensure that the URI of the
ContentItem is the same as the URI (variable 'ciUri') as used when creating the
TextAnnotations for the user tags.</p>
-<p>For the following code snippet note that we can directly add the content to
the content item container. Only if we would need to sent multiple alternate
content versions (as shown in 'Example 3') the usage of an
'multipart/alternate' container is required.</p>
+<p>Note that because the <code>StringBody</code> class provided by the
"httpmime" framework does not set a filename, we need to override this method
and return the URI of the content item. This is essential, because we need to
ensure that the URI of the ContentItem is the same as the URI (variable
'<code>ciUri</code>') used when creating the TextAnnotations for the user
tags.</p>
+<p>For the following code snippet note that we can directly add the content to
the content item container. Only if we needed to send multiple alternate
content versions (as shown in 'Example 3') would the usage of a
<code>'multipart/alternate'</code> container be required.</p>
<div class="codehilite"><pre><span class="c1">//Add the content as second mime
part</span>
<span class="n">contentItem</span><span class="o">.</span><span
class="na">addPart</span><span class="o">(</span>
<span class="s">"content"</span><span class="o">,</span> <span
class="c1">//the name MUST BE "content"</span>
@@ -330,7 +334,7 @@ John Smith was born in London.
<span class="c1">//Stanbol Enhancer</span>
<span class="n">HttpPost</span> <span class="n">request</span> <span
class="o">=</span> <span class="k">new</span> <span
class="n">HttpPost</span><span class="o">(</span><span
class="s">"http://localhost:8080/enhancer"</span><span
class="o">);</span>
<span class="n">request</span><span class="o">.</span><span
class="na">setEntity</span><span class="o">(</span><span
class="n">contentItem</span><span class="o">);</span>
-<span class="n">request</span><span class="o">.</span><span
class="na">setHeader</span><span class="o">(</span><span
class="s">"Accept"</span><span class="o">,</span><span
class="err">"</span><span class="n">application</span><span
class="o">/</span><span class="n">rdf</span><span class="o">+</span><span
class="n">xml</span><span class="o">);</span>
+<span class="n">request</span><span class="o">.</span><span
class="na">setHeader</span><span class="o">(</span><span
class="s">"Accept"</span><span class="o">,</span> <span
class="n">SupportedFormat</span><span class="o">.</span><span
class="na">RDF_XML</span><span class="o">);</span>
<span class="n">Response</span> <span class="n">response</span> <span
class="o">=</span> <span class="n">httpClient</span><span
class="o">.</span><span class="na">execute</span><span class="o">(</span><span
class="n">request</span><span class="o">);</span>
</pre></div>