contentitem.html

buildbot Tue, 14 Feb 2012 03:55:32 -0800

Author: buildbot
Date: Tue Feb 14 09:13:07 2012
New Revision: 804757

Log:
Staging update by buildbot for stanbol


Modified:
    websites/staging/stanbol/trunk/   (props changed)
    
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/contentitem.html

Propchange: websites/staging/stanbol/trunk/
------------------------------------------------------------------------------
    cms:source-revision = 1243838

Modified: 
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/contentitem.html
==============================================================================
--- 
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/contentitem.html
 (original)
+++ 
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/contentitem.html
 Tue Feb 14 09:13:07 2012
@@ -20,7 +20,7 @@
 -->
 
   <link href="/stanbol/css/stanbol.css" rel="stylesheet" type="text/css">
-  <title>Apache Stanbol - ContentItem</title>
+  <title>Apache Stanbol - Content Item</title>
   <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
   <link rel="icon" type="image/png" 
href="/stanbol/images/stanbol-logo/stanbol-favicon.png"/>
 </head>
@@ -56,30 +56,30 @@
   </div>
   
   <div id="content">
-    <h1 class="title">ContentItem</h1>
-    <p>The ContentItem is the Object that represents the Content that is 
enhanced by the Stanbol Enhancer. The ContentItem is create based on the data 
provided by the enhancement request and used throughout the enhancement process 
to store results. After the enhancement process finishes the ContentItem 
represents therefore the result of the Stanbol enhancement process.</p>
-<p>The following section describe the interface of the ContentItem in more 
details.</p>
-<h3 id="contentparts">ContentParts</h3>
-<p>ContentParts are used to represent the original content as well as 
transformations of the original content (typically created by pre-processing <a 
href="engines/enhancementengine.html">EnhancementEngine</a> such as <a 
href="engines/metaxaengine.html">Metaxa</a>)</p>
+    <h1 class="title">Content Item</h1>
+    <p>The content item is the object that represents the content that is 
enhanced by the Apache Stanbol enhancer. The content item is created based on 
the data provided by the enhancement request and used throughout the 
enhancement process to store results. Therefore, after the enhancement process 
has finished, the content item represents the result of the Apache Stanbol 
enhancement process.</p>
+<p>The following section describes the interface of the content item in more 
detail.</p>
+<h3 id="content_parts">Content Parts</h3>
+<p>Content parts are used to represent the original content as well as 
transformations of the original content (typically created by pre-processing <a 
href="engines/list.html">enhancement engines</a> such as the <a 
href="engines/metaxaengine.html">Metaxa engine</a>)</p>
 <div class="codehilite"><pre><span class="sr">/** Getter for the ContentPart 
based on the index */</span>
 <span class="n">getPart</span><span class="p">(</span><span 
class="nb">int</span> <span class="nb">index</span><span class="p">,</span> 
<span class="n">Class</span><span class="sr">&lt;T&gt;</span> <span 
class="n">type</span><span class="p">)</span> <span class="p">:</span> <span 
class="n">T</span>
 <span class="sr">/** Getter for the ContentPart based on its ID */</span>
 <span class="n">getPart</span><span class="p">(</span><span 
class="n">UriRef</span> <span class="n">uri</span><span class="p">,</span> 
<span class="n">Class</span><span class="sr">&lt;T&gt;</span> <span 
class="n">type</span><span class="p">)</span> <span class="p">:</span> <span 
class="n">T</span>
 <span class="sr">/** Getter for the ID based on the index */</span>
 <span class="n">getPartUri</span><span class="p">(</span><span 
class="nb">index</span> <span class="nb">index</span><span class="p">)</span> 
<span class="p">:</span> <span class="n">UriRef</span>
-<span class="sr">/** Adds a new ContentPart to the ContentItem */</span>
+<span class="sr">/** Adds a new ContentPart to the content item */</span>
 <span class="n">addPart</span><span class="p">(</span><span 
class="n">UriRef</span> <span class="n">uri</span><span class="p">,</span> 
<span class="n">Object</span> <span class="n">part</span><span 
class="p">)</span> <span class="p">:</span> <span class="n">Object</span>
 </pre></div>
 
 
-<p>ContentParts are accessible by the index AND by there URI formatted id. 
Re-adding an ContentPart will replace the old one. The index will not be 
changed by this operation.</p>
-<p>There are two types of ContentParts:</p>
+<p>Content parts are accessible by the index <em>and</em> by their URI 
formatted ID. Re-adding a content part will replace the old one. The index will 
not be changed by this operation.</p>
+<p>There are two types of content parts:</p>
 <ol>
-<li>ContentParts for that additional metadata are provided within the metadata 
of the ContentItem. Such ContentParts are typically used to store transformed 
versions of the original content. This allows e.g. engines that can only 
process plain text version to query for the content part containing this 
version of the parsed document.</li>
-<li>ContentParts that are registered under a predefined URI. Such ContentParts 
are typically not mentioned within the metadata of the ContentItem. Typically 
this is used to share intermediate enhancement results in-between enhancement 
engines. An example would be Tokens, Sentens, POS tags and Chunks as extracted 
by some NLP engine. Engines that want to consume such data need to know the 
predefined URI. They will typically check within the "canEnhance(..)" method if 
a ContentPart with this URI is present and if it has the correct type. </li>
+<li>Content parts that have additional metadata provided within the metadata 
of the content item. Such content parts are typically used to store transformed 
versions of the original content. This allows e.g. engines that can only 
process plain text versions to query for the content part containing this 
version of the parsed document.</li>
+<li>Content parts that are registered under a predefined URI. Such content 
parts are typically not mentioned within the metadata of the content item. This 
is used to share intermediate enhancement results between enhancement engines. 
An example would be tokens, sentences, POS tags and chunks that are extracted 
by some NLP engine. Engines that want to consume such data need to know the 
predefined URI of the content part holding this data. They will check within 
the <code>canEnhance(..)</code> method if a content part with an expected URI 
is present and if it has the correct type. </li>
 </ol>
-<h3 id="accessing_the_main_content_of_the_contentitem">Accessing the Main 
Content of the ContentItem</h3>
-<p>The main content of the ContentItem refers to the content parsed by the 
enhancement request (or downloaded from the URL provided by an request). For 
accessing this content the following methods are available</p>
+<h3 id="accessing_the_main_content_of_the_content_item">Accessing the Main 
Content of the Content Item</h3>
+<p>The main content of the content item refers to the content parsed by the 
enhancement request (or downloaded from the URL provided by a request). The 
following methods are available for accessing this content:</p>
 <div class="codehilite"><pre><span class="o">/**</span> <span 
class="n">Getter</span> <span class="k">for</span> <span class="n">the</span> 
<span class="n">InputStream</span> <span class="n">of</span> <span 
class="n">the</span> <span class="n">content</span> <span class="n">as</span> 
<span class="n">parsed</span>
     <span class="k">for</span> <span class="n">the</span> <span 
class="n">ContentItem</span> <span class="o">*/</span>
 <span class="o">+</span> <span class="n">getStream</span><span 
class="p">()</span> <span class="p">:</span> <span class="n">InputStream</span>
@@ -90,17 +90,17 @@
 </pre></div>
 
 
-<p>The "getStream()" and "getMimeType()" methods are shortcuts for the 
according methods of Blob. Calling "contentItem.getBlob.getStream()" will 
return an InputStream over the exact same content as directly calling 
"getStream()" on the ContentItem. Note that the Blob interface also provides a 
"getParameter()" method that allows to retrieve mime type parameters such as 
the charset of textual content.</p>
-<p>The content parsed by the user is stored as ContentPart at the index '0' 
with the URI of the ContentItem in the form of a Blob. Therefore calling</p>
+<p>The <code>getStream()</code> and <code>getMimeType()</code> methods are 
shortcuts for the corresponding methods of the content item's blob object. Calling 
<code>contentItem.getBlob().getStream()</code> will return an InputStream over 
the exact same content as directly calling <code>getStream()</code> on the 
content item. Note that the blob interface also provides a 
<code>getParameter()</code> method for retrieving mime type parameters 
such as the charset of textual content.</p>
+<p>The content parsed by the user is stored as a blob in the content part at 
index 0, under the URI of the content item. Therefore calling</p>
 <p>contentItem.getPart(0,Blob.class)
    contentItem.getPart(contentItem.getUri(),Blob.class)
    contentItem.getBlob()</p>
-<p>MUST return all the exact same Blob instance.</p>
-<h3 id="metadata_of_the_contentitem">Metadata of the ContentItem</h3>
-<p>The metadata of the ContentItem are managed by an LockableMGraph. This is 
basically a normal java.util.Collections for Triples. The only RDF specific 
method is support for filtered iterators that support wildcards for subjects, 
predicates and objects.</p>
-<p>This graph is used to store all enhancement results as well as metadata 
about the content item (such as content parts) and the enhancement process (see 
<a href="executionmetadata.html">Executionmetadata</a>.</p>
+<p>returns the same blob instance.</p>
+<h3 id="metadata_of_the_content_item">Metadata of the Content Item</h3>
+<p>The metadata of the content item is managed by a lockable MGraph. This is 
basically a normal <code>java.util.Collection</code> of triples. The only RDF 
specific feature is support for filtered iterators that accept wildcards for 
subjects, predicates and objects.</p>
+<p>This graph is used to store all enhancement results as well as metadata 
about the content item (such as content parts) and the enhancement process (see 
<a href="executionmetadata.html">execution metadata</a>).</p>
 <h3 id="readwrite_locks">Read/Write locks</h3>
-<p>During the Stanbol enhancement process as executed by the <a 
href="enhancementjobmanager.html">EnhancementJobManager</a> components running 
in multiple threads need to access the state of the ContentItem. Because of 
that the ContentItem provides the possibility to acquire locks.</p>
+<p>During the Apache Stanbol enhancement process, as executed by the <a 
href="enhancementjobmanager.html">enhancement job manager</a>, components 
running in multiple threads need to access the state of the content item. 
For this reason the content item provides the possibility to acquire locks.</p>
 <div class="codehilite"><pre><span class="sr">/** Getter for the ReadWirteLock 
of a ContentItem +/</span>
 <span class="o">+</span> <span class="n">getLock</span><span 
class="p">()</span> <span class="p">:</span> <span class="n">java</span><span 
class="o">.</span><span class="n">util</span><span class="o">.</span><span 
class="n">concurrent</span><span class="o">.</span><span 
class="n">ReadWriteLock</span>
 </pre></div>
@@ -112,9 +112,9 @@
 </pre></div>
 
 
-<p>will return the same ReadWriteLock instance.</p>
-<p>This lock can be used request read/write locks on the ContentItem. All 
methods of the ContentItem and also the MGrpah holding the metadata need to be 
protected by using the lock. That means that users that do not need to product 
whole sections of code do not need to brother with the usage of locks. Typical 
examples are working with ContentParts, final Classes like Blob or 
adding/removing a triple from the metadata.</p>
-<p>However whenever components need to ensure that the data are not changed by 
other threads while performing some calculations read/write locks MUST BE used. 
A typical example are iterations over data returned by the MGraph. In this case 
code iterating over the results should be protected against concurrent changes  
by</p>
+<p>will return the same <code>ReadWriteLock</code> instance.</p>
+<p>This lock can be used to request read/write locks on the content item. All 
methods of the content item and also the <code>MGraph</code> holding the 
metadata need to be protected by using the lock. That means that users who do 
not need to protect whole sections of code do not need to bother with the 
usage of locks. Typical examples are working with content parts, final classes 
like <code>Blob</code> or adding/removing a triple from the metadata.</p>
+<p>However, whenever components need to ensure that the data are not changed 
by other threads while performing some calculations, read/write locks <em>must 
be</em> used. A typical example is iterating over data returned by the 
MGraph. In this case code iterating over the results should be protected 
against concurrent changes by</p>
 <div class="codehilite"><pre><span class="n">contentItem</span><span 
class="o">.</span><span class="n">getLock</span><span class="p">()</span><span 
class="o">.</span><span class="n">readLock</span><span class="p">()</span><span 
class="o">.</span><span class="n">lock</span><span class="p">();</span>
 <span class="n">try</span> <span class="p">{</span>
     <span class="n">Iterator</span><span class="sr">&lt;Triple&gt;</span> 
<span class="n">it</span> <span class="o">=</span> <span 
class="n">contentItem</span><span class="o">.</span><span 
class="n">getMetadata</span><span class="p">()</span><span class="o">.</span>
@@ -129,7 +129,7 @@
 </pre></div>
 
 
-<p>While accessing ContentItems within an <a 
href="engines/enhancementengine.html">EnhancementEngine</a> there is an 
exception to this rule. If an engine declares that is only supports the 
SYNCHRONOUS enhancement mode the <a 
href="enhancementjobmanager.html">EnhancementJobManager</a> needs to take care 
the an engine has exclusive access to the ContentItem. In that case 
implementors of EnhancementEngines need not to care about using read/write 
locks.</p>
+<p>While accessing content items within an <a 
href="engines/enhancementengine.html">enhancement engine</a> there is an 
exception to this rule. If an engine declares that it only supports the 
<code>SYNCHRONOUS</code> enhancement mode, the <a 
href="enhancementjobmanager.html">enhancement job manager</a> needs to ensure 
that the engine has exclusive access to the content item. In that case, 
implementors of enhancement engines do not need to care about using read/write 
locks.</p>
   </div>
   
   <div id="footer">
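
Note on the content part semantics described in the page above: parts are reachable both by index and by URI, and re-adding a part under an existing URI replaces it without changing its index. The following is only a minimal sketch of that behaviour, not the Stanbol implementation. The class name PartStoreSketch is hypothetical, plain String URIs stand in for UriRef, and returning the replaced part from addPart is an assumption (the page only states that the method returns an Object).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical stand-in for the part handling of a content item. */
class PartStoreSketch {
    private final List<String> uris = new ArrayList<>();        // index -> URI
    private final Map<String, Object> parts = new HashMap<>();  // URI -> part

    /**
     * Adds a part, or replaces the part already registered under the URI.
     * Assumption: the replaced part (or null) is returned.
     */
    Object addPart(String uri, Object part) {
        if (!parts.containsKey(uri)) {
            uris.add(uri); // new URI: append, so existing indexes stay stable
        }
        return parts.put(uri, part);
    }

    /** Getter for the ID based on the index. */
    String getPartUri(int index) {
        return uris.get(index);
    }

    /** Getter for the part based on its ID. */
    <T> T getPart(String uri, Class<T> type) {
        Object part = parts.get(uri);
        if (part == null) {
            throw new NoSuchElementException("no part for " + uri);
        }
        return type.cast(part); // ClassCastException if the type does not match
    }

    /** Getter for the part based on the index. */
    <T> T getPart(int index, Class<T> type) {
        return getPart(getPartUri(index), type);
    }
}
```

In this sketch an engine that only handles plain text would look up the URI it expects and request String.class, which mirrors the type check the page describes for canEnhance(..).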

svn commit: r804757 - in /websites/staging/stanbol/trunk: ./ content/stanbol/docs/trunk/enhancer/contentitem.html
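
Note on the read/write lock section of the page: the locking example in the diff is cut off by the hunk boundary, so here is a self-contained sketch of the same pattern using only java.util.concurrent. The class LockedMetadataSketch and its methods are hypothetical, and a list of strings stands in for the triples of the metadata graph; it is an illustration under those assumptions, not Stanbol code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical stand-in for a content item with lock-protected metadata. */
class LockedMetadataSketch {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> triples = new ArrayList<>(); // stand-in for triples

    /** Always returns the same lock instance. */
    ReadWriteLock getLock() {
        return lock;
    }

    /** Single modification: protected by the write lock internally. */
    void add(String triple) {
        lock.writeLock().lock();
        try {
            triples.add(triple);
        } finally {
            lock.writeLock().unlock();
        }
    }

    /** Iteration: the read lock is held for the whole loop. */
    int countWithPrefix(String prefix) {
        lock.readLock().lock();
        try {
            int count = 0;
            for (String triple : triples) {
                if (triple.startsWith(prefix)) {
                    count++;
                }
            }
            return count;
        } finally {
            lock.readLock().unlock(); // always release, even on exceptions
        }
    }
}
```

The try/finally form matters because a lock that is not released after an exception would block every later writer.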
