Author: buildbot
Date: Mon Oct 10 14:02:38 2011
New Revision: 796847
Log:
Staging update by buildbot
Added:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/contenthub.html
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/geonamesengine.html
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/opencalaisengine.html
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/refactorengine.html
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/zemantaengine.html
Modified:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/index.html
Added: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/contenthub.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/docs/trunk/contenthub.html
(added)
+++ websites/staging/stanbol/trunk/content/stanbol/docs/trunk/contenthub.html
Mon Oct 10 14:02:38 2011
@@ -0,0 +1,106 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<!--
+
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+ <link href="/stanbol/css/stanbol.css" rel="stylesheet" type="text/css">
+ <title>Apache Stanbol - ContentHub</title>
+ <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+ <link rel="icon" type="image/png"
href="/stanbol/images/stanbol-logo/stanbol-favicon.png"/>
+</head>
+
+<body>
+ <div id="navigation">
+ <img alt="Apache Stanbol" width="220" height="101"
src="/stanbol/images/stanbol-logo/stanbol-2010-12-14.png"/>
+ <h1 id="stanbol_links">Stanbol links</h1>
+<ul>
+<li><a href="/stanbol/index.html">Home</a></li>
+<li><a href="/stanbol/team.html">Project Team</a></li>
+<li><a href="/stanbol/docs/trunk/">Documentation</a></li>
+</ul>
+<h1 id="asf_links">ASF links</h1>
+<ul>
+<li><a href="http://www.apache.org">Apache Software Foundation</a></li>
+<li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li>
+<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+<li><a href="http://www.apache.org/foundation/sponsorship.html">Become a
Sponsor</a></li>
+<li><a href="http://www.apache.org/security/">Security</a></li>
+</ul>
+ </div>
+
+ <div id="content">
+ <h1 class="title">ContentHub</h1>
+ <p>The Apache Stanbol Contenthub is a persistent document store that
supports the submission of text-based documents
+and provides semantic and faceted search over the submitted
documents.</p>
+<h2 id="technical_description_of_its_components">Technical Description of its
components</h2>
+<h3 id="contenthub_store">ContentHub Store</h3>
+<p>This subcomponent stores documents and their metadata
persistently. In the current implementation, only text/plain documents are
allowed.</p>
+<p>The storage part of the Contenthub provides basic methods such as create,
put, get and delete. When a document is submitted, the store delegates its textual
content to the Stanbol Enhancer to obtain enhancements. When submitting a
document, it is also possible to attach external metadata to it as field-value
pairs.</p>
+<p>The document itself and all specified external metadata are indexed through
an embedded Apache Solr core which is created specifically for Contenthub.
+Each document is given a unique ID when it is indexed, and this ID can be used
to retrieve the document from Contenthub or to delete it.
+ContentHub provides an HTML interface for its functionalities under the
following endpoint, which is available after running the full launcher of
Apache Stanbol:</p>
+<div class="codehilite"><pre><span class="n">http:</span><span
class="sr">//</span><span class="n">localhost:8080</span><span
class="o">/</span><span class="n">contenthub</span>
+</pre></div>
+
+
+<h3 id="contenthub_search">ContentHub Search</h3>
+<p>ContentHub has a semantic search subcomponent that allows searching over the
submitted documents. An HTML interface for the search functionality can be reached
under:</p>
+<div class="codehilite"><pre><span class="n">http:</span><span
class="sr">//</span><span class="n">localhost:8080</span><span
class="sr">/contenthub/s</span><span class="n">earch</span>
+</pre></div>
+
+
+<p>To start a search, enter a keyword and choose the search engines that
will execute the query. The first search results are returned together with all
facets and their values. When a facet constraint is then
chosen, the documents and facets are updated dynamically according to the chosen
constraint(s).</p>
+<p>The Contenthub Search API also provides a means of specifying an ontology
whose semantic information is used to enrich the search. How this
external ontology is exploited is explained in the search engine
documentation below. Furthermore, the Search API allows constraints to be specified for
the search operation. The aim is to provide faceted search functionality
through a Java interface based on the specified constraints.<br />
+</p>
+<p>The search part of this component is formed by several search engines that
work sequentially and contribute to the search results. Each search engine
works with a given search context. The initialization of the search context is
performed before the execution of any search engine. Each search engine makes
use of the information embedded in the search context and populates the context
with new results, such as resulting documents, related ontological resources
and new keywords.</p>
+<p>Currently, three search engines are active in the search subcomponent:</p>
+<h4 id="ontology_resource_search_engine">Ontology Resource Search Engine</h4>
+<p>This engine works when an additional ontology is specified at the beginning
of the search. A SPARQL query based on a LARQ index is executed on the
specified ontology to find individuals and classes related to the keyword.
When a class is found, it is added to the search context as a related class
resource. The subclasses, superclasses and instances of all these classes
are then found and added to the search context as well.</p>
+<p>When an individual related to the keyword is found, it is added to the search
context as a related individual resource and its classes are looked up. These classes
are added to the search context using the same methodology explained in the
previous paragraph.</p>
+<h4 id="enhancement_search_engine">Enhancement Search Engine</h4>
+<p>This engine is designed to work on the enhancement graph, which contains all
enhancements of the content items submitted to the Contenthub. </p>
+<p>When a document is submitted to ContentHub, its content is enhanced
automatically by the Enhancer component.
+All enhancements are kept together in a single Clerezza graph, and this
graph is indexed with LARQ. The LARQ index is automatically updated when a new
enhancement is added.</p>
+<p>The Enhancement Search Engine executes a SPARQL query on the enhancement graph to
find enhancements about the given keyword.
+When an enhancement is found, the document from which the enhancement was
obtained is added to the search context as a related document resource.</p>
+<h4 id="solr_search_engine">Solr Search Engine</h4>
+<p>The <a href="">Solr</a> Search Engine is the engine that gives full-text
and faceted search capabilities to the Contenthub.</p>
+<p>Since every document is indexed in Solr (in the core created for
Contenthub), it is possible to do full-text
+search over the documents' content and metadata. After the first search, all the
facet constraints of the resulting documents are available for faceted search.
When a facet constraint is chosen, the resulting documents and facet constraints
are updated dynamically. </p>
+<p>Next, the related class and individual resources for the keyword, which
were found by the Ontology Resource Search Engine, are searched for in Solr by
their resource names. </p>
+<p>Finally, the document resources found by the Enhancement Search Engine are
examined. Any document whose field values do not match the facet
constraints is removed from the search results.</p>
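+<p>The sequential flow described above can be sketched as follows. This is a
minimal illustration in Python, not the Contenthub implementation: the names
SearchContext, ontology_engine, enhancement_engine, solr_engine and search are
hypothetical, and plain substring matching stands in for the SPARQL/LARQ and
Solr queries.</p>

```python
from dataclasses import dataclass, field


@dataclass
class SearchContext:
    """Hypothetical shared state that every engine reads and extends."""
    keyword: str
    constraints: dict = field(default_factory=dict)  # facet name -> required value
    documents: dict = field(default_factory=dict)    # document id -> field values
    resources: list = field(default_factory=list)    # related ontology resource names


def ontology_engine(ctx, ontology):
    # Stand-in for the SPARQL/LARQ lookup: keep resources whose label
    # contains the keyword.
    for name, label in ontology.items():
        if ctx.keyword.lower() in label.lower():
            ctx.resources.append(name)


def enhancement_engine(ctx, enhancements, store):
    # Add every document that has an enhancement mentioning the keyword.
    for doc_id, labels in enhancements.items():
        if any(ctx.keyword.lower() in label.lower() for label in labels):
            ctx.documents[doc_id] = store[doc_id]


def solr_engine(ctx, store):
    # Full-text match on the keyword and on the resource names found earlier.
    terms = [ctx.keyword] + ctx.resources
    for doc_id, fields in store.items():
        text = fields.get("content", "").lower()
        if any(term.lower() in text for term in terms):
            ctx.documents[doc_id] = fields
    # Finally, drop documents whose field values violate a facet constraint.
    ctx.documents = {
        doc_id: fields
        for doc_id, fields in ctx.documents.items()
        if all(fields.get(k) == v for k, v in ctx.constraints.items())
    }


def search(keyword, store, ontology, enhancements, constraints=None):
    # The context is initialised once, then passed through each engine in turn.
    ctx = SearchContext(keyword, constraints or {})
    ontology_engine(ctx, ontology)
    enhancement_engine(ctx, enhancements, store)
    solr_engine(ctx, store)
    return ctx
```

+<p>Because every engine only reads from and writes to the shared context, the
order of the calls in search() is what lets the last engine reuse the resources
and documents contributed by the earlier ones.</p>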
+<h2 id="building_and_launching_contenthub">Building and Launching
ContentHub</h2>
+<p>Since ContentHub is built with Apache Stanbol, it can be launched with the
"Full Launcher". For detailed instructions on building and launching Apache Stanbol,
see the README file at the following link:</p>
+<div class="codehilite"><pre><span class="n">http:</span><span
class="sr">//s</span><span class="n">vn</span><span class="o">.</span><span
class="n">apache</span><span class="o">.</span><span class="n">org</span><span
class="sr">/repos/</span><span class="n">asf</span><span
class="sr">/incubator/s</span><span class="n">tanbol</span><span
class="sr">/trunk/</span><span class="n">README</span><span
class="o">.</span><span class="n">md</span>
+</pre></div>
+ </div>
+
+ <div id="footer">
+ <div class="copyright">
+ <p>
+ Copyright © 2010 The Apache Software Foundation, Licensed under
+ the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache
License, Version 2.0</a>.
+ <br />
+ Apache, Stanbol and the Apache feather and Stanbol logos are
trademarks of The Apache Software Foundation.
+ </p>
+ </div>
+ </div>
+
+</body>
+</html>
Added:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/geonamesengine.html
==============================================================================
---
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/geonamesengine.html
(added)
+++
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/geonamesengine.html
Mon Oct 10 14:02:38 2011
@@ -0,0 +1,286 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<!--
+
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+ <link href="/stanbol/css/stanbol.css" rel="stylesheet" type="text/css">
+ <title>Apache Stanbol - The Geonames Enhancement Engine</title>
+ <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+ <link rel="icon" type="image/png"
href="/stanbol/images/stanbol-logo/stanbol-favicon.png"/>
+</head>
+
+<body>
+ <div id="navigation">
+ <img alt="Apache Stanbol" width="220" height="101"
src="/stanbol/images/stanbol-logo/stanbol-2010-12-14.png"/>
+ <h1 id="stanbol_links">Stanbol links</h1>
+<ul>
+<li><a href="/stanbol/index.html">Home</a></li>
+<li><a href="/stanbol/team.html">Project Team</a></li>
+<li><a href="/stanbol/docs/trunk/">Documentation</a></li>
+</ul>
+<h1 id="asf_links">ASF links</h1>
+<ul>
+<li><a href="http://www.apache.org">Apache Software Foundation</a></li>
+<li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li>
+<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+<li><a href="http://www.apache.org/foundation/sponsorship.html">Become a
Sponsor</a></li>
+<li><a href="http://www.apache.org/security/">Security</a></li>
+</ul>
+ </div>
+
+ <div id="content">
+ <h1 class="title">The Geonames Enhancement Engine</h1>
+ <p>This engine creates fise:EntityAnnotations based on the
http://geonames.org dataset. It does not directly work on the parsed content,
but processes named entities extracted by some NLP (natural language
processing) engine. This engine creates EntityAnnotations for Features found
for named entities in the geonames.org data set. In addition, it adds
EntityAnnotations for the continent, country and administrative regions for
entities with a high confidence level.</p>
+<h2 id="processed_annotations_input">Processed Annotations (Input)</h2>
+<p>This engine consumes fise:TextAnnotations of type dbpedia:Place. More
concretely, it filters for enhancements that conform to the following two
requirements and consumes the text selected by the TextAnnotations:</p>
+<p>?textAnnotation rdf:type fise:TextAnnotation .
+ ?textAnnotation dc:type dbpedia:Place .
+ ?textAnnotation fise:selected-text ?text .</p>
+<p>Here is an example of such a TextAnnotation, selecting the text "Vienna"
from the content "The community Workshop will take place in Vienna".</p>
+<div class="codehilite"><pre><span
class="err">urn:enhancement:text-enhancement:id1</span>
+ <span class="err">a</span> <span class="err">fise:TextAnnotation</span>
<span class="err">,</span> <span class="err">fise:Enhancement</span> <span
class="err">;</span>
+ <span class="err">dc:type</span>
+ <span class="err">dbpedia:Place</span> <span class="err">;</span>
+ <span class="err">fise:selected-text</span>
+ <span class="err">"Vienna"^^xsd:string</span> <span
class="err">;</span>
+ <span class="err">fise:selection-context</span>
+ <span class="err">"The</span> <span class="err">community</span>
<span class="err">Workshop</span> <span class="err">will</span> <span
class="err">take</span> <span class="err">place</span> <span
class="err">in</span> <span class="err">Vienna"^^xsd:string</span> <span
class="err">;</span>
+ <span class="err">fise:start</span>
+ <span class="err">"46"^^xsd:int</span> <span
class="err">;</span>
+ <span class="err">fise:end</span>
+ <span class="err">"52"^^xsd:int</span> <span
class="err">;</span>
+ <span class="err">fise:confidence</span>
+ <span class="err">"0.9773640902587215"^^xsd:double</span>
<span class="err">;</span>
+ <span class="err">fise:extracted-from</span>
+ <span class="err">urn:content-item:id1</span> <span
class="err">.</span>
+</pre></div>
+
+
+<p>Typically such enhancements are created by engines that provide named
entity extraction based on some natural language processing framework.</p>
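+<p>The input filter described in this section can be sketched in a few lines
of Python. This is only an illustration under simplifying assumptions: the
enhancement graph is represented as a list of (subject, predicate, object)
tuples with prefixed names as plain strings, and the function name place_texts
is hypothetical.</p>

```python
def place_texts(triples):
    """Return {annotation: selected text} for TextAnnotations typed dbpedia:Place."""
    text_annotations, places, texts = set(), set(), {}
    for subject, predicate, obj in triples:
        if predicate == "rdf:type" and obj == "fise:TextAnnotation":
            text_annotations.add(subject)
        elif predicate == "dc:type" and obj == "dbpedia:Place":
            places.add(subject)
        elif predicate == "fise:selected-text":
            texts[subject] = obj
    # An annotation qualifies only if it meets both type requirements.
    return {s: t for s, t in texts.items() if s in text_annotations and s in places}


# The "Vienna" TextAnnotation from the example above, plus a non-matching one.
triples = [
    ("urn:enhancement:text-enhancement:id1", "rdf:type", "fise:TextAnnotation"),
    ("urn:enhancement:text-enhancement:id1", "dc:type", "dbpedia:Place"),
    ("urn:enhancement:text-enhancement:id1", "fise:selected-text", "Vienna"),
    ("urn:enhancement:text-enhancement:id2", "rdf:type", "fise:TextAnnotation"),
    ("urn:enhancement:text-enhancement:id2", "dc:type", "dbpedia:Person"),
    ("urn:enhancement:text-enhancement:id2", "fise:selected-text", "Mozart"),
]
selected = place_texts(triples)
```

+<p>Only the first annotation is selected, because the second one is not of type
dbpedia:Place.</p>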
+<h2 id="created_enhancements_output">Created Enhancements (Output)</h2>
+<p>The LocationEnhancementEngine creates two types of EntityAnnotations. First,
it suggests Entities for processed TextAnnotations; second, it creates
EntityAnnotations for the hierarchy of regions the suggested Entities are
located in. Suggested Entities are connected by the "dc:relation" attribute
to the TextAnnotation they enhance. EntityAnnotations representing the
hierarchy define a dc:requires attribute pointing to the EntityAnnotation.</p>
+<h3 id="entity_suggestions">Entity Suggestions</h3>
+<p>Entity suggestions are EntityAnnotations that suggest Features of the
geonames.org dataset for a processed TextAnnotation. These suggestions are
currently calculated based only on the fise:selected-text of the
TextAnnotation. </p>
+<p>The following example shows three EntityAnnotations for the TextAnnotation
used in the above example. See the dc:relation statements at the end of each
EntityAnnotation.</p>
+<p>The first Entity found in the geonames.org dataset is the capital city of
Austria with a confidence level of 1.0:</p>
+<div class="codehilite"><pre><span
class="err">urn:enhancement:entity-enhancement:id1</span>
+ <span class="err">a</span> <span
class="err">fise:EntityAnnotation</span> <span class="err">,</span> <span
class="err">fise:Enhancement</span> <span class="err">;</span>
+ <span class="err">fise:confidence</span>
+ <span class="err">"1.0"^^xsd:double</span> <span
class="err">;</span>
+ <span class="err">fise:entity-label</span>
+ <span class="err">"Vienna"^^xsd:string</span> <span
class="err">;</span>
+ <span class="err">fise:entity-reference</span>
+ <span class="err">http:</span><span
class="c1">//sws.geonames.org/2761369/ ;</span>
+ <span class="err">fise:entity-type</span>
+ <span class="err">geonames:Feature</span> <span class="err">,</span>
<span class="err">dbpedia:Place</span> <span class="err">,</span> <span
class="err">dbpedia:Settlement</span> <span class="err">,</span> <span
class="err">dbpedia:PopulatedPlace</span> <span class="err">,</span> <span
class="err">geonames:P.PPLC</span> <span class="err">;</span>
+ <span class="err">fise:extracted-from</span>
+ <span class="err">urn:content-item:id1</span> <span
class="err">;</span>
+<span class="err">dc:relation</span>
+ <span class="err">urn:enhancement:text-enhancement:id1</span> <span
class="err">.</span>
+</pre></div>
+
+
+<p>With lower confidence levels there are a lot of other populated places with
the name "Vienna" found in the geonames.org dataset.</p>
+<div class="codehilite"><pre><span
class="err">urn:enhancement:entity-enhancement:id2</span>
+ <span class="err">a</span> <span
class="err">fise:EntityAnnotation</span> <span class="err">,</span> <span
class="err">fise:Enhancement</span> <span class="err">;</span>
+ <span class="err">fise:confidence</span>
+ <span class="err">"0.42163702845573425"^^xsd:double</span>
<span class="err">;</span>
+ <span class="err">fise:entity-label</span>
+ <span class="err">"Vienna"^^xsd:string</span> <span
class="err">;</span>
+ <span class="err">fise:entity-reference</span>
+ <span class="err">http:</span><span
class="c1">//sws.geonames.org/4496671/ ;</span>
+ <span class="err">fise:entity-type</span>
+ <span class="err">geonames:Feature</span> <span class="err">,</span>
<span class="err">dbpedia:Place</span> <span class="err">,</span> <span
class="err">dbpedia:Settlement</span> <span class="err">,</span> <span
class="err">dbpedia:PopulatedPlace</span> <span class="err">,</span> <span
class="err">geonames:P.PPL</span> <span class="err">;</span>
+ <span class="err">fise:extracted-from</span>
+ <span class="err">urn:content-item:id1</span> <span
class="err">;</span>
+<span class="err">dc:relation</span>
+ <span class="err">urn:enhancement:text-enhancement:id1</span> <span
class="err">.</span>
+
+<span class="err">urn:enhancement:entity-enhancement:id3</span>
+ <span class="err">a</span> <span
class="err">fise:EntityAnnotation</span> <span class="err">,</span> <span
class="err">fise:Enhancement</span> <span class="err">;</span>
+ <span class="err">fise:confidence</span>
+ <span class="err">"0.42163702845573425"^^xsd:double</span>
<span class="err">;</span>
+ <span class="err">fise:entity-label</span>
+ <span class="err">"Vienna"^^xsd:string</span> <span
class="err">;</span>
+ <span class="err">fise:entity-reference</span>
+ <span class="err">http:</span><span
class="c1">//sws.geonames.org/4825976/ ;</span>
+ <span class="err">fise:entity-type</span>
+ <span class="err">geonames:Feature</span> <span class="err">,</span>
<span class="err">dbpedia:Place</span> <span class="err">,</span> <span
class="err">dbpedia:Settlement</span> <span class="err">,</span> <span
class="err">dbpedia:PopulatedPlace</span> <span class="err">,</span> <span
class="err">geonames:P.PPL</span> <span class="err">;</span>
+ <span class="err">fise:extracted-from</span>
+ <span class="err">urn:content-item:id1</span> <span
class="err">;</span>
+<span class="err">dc:relation</span>
+ <span class="err">urn:enhancement:text-enhancement:id1</span> <span
class="err">.</span>
+</pre></div>
+
+
+<h2 id="entity_hierarchy_enhancements">Entity Hierarchy Enhancements</h2>
+<p>Entity Hierarchy Enhancements describe the regions that contain suggested
Features based on the geonames.org dataset. Enhancements describing this
hierarchy are added for all suggested entities with a confidence level above
the value of
"eu.iksproject.fise.engines.geonames.locationEnhancementEngine.min-hierarchy-score".
</p>
+<p>The default value for this property is 0.7. The hierarchy web service
provided by geonames.org is used to calculate the regions.
+The following example shows the entity hierarchy enhancements for the
suggested entity for Vienna (Austria). <em>Please note the dc:requires relation
to this EntityAnnotation at the end of each of the following
enhancements.</em></p>
+<h3 id="continent_europe">Continent: Europe</h3>
+<p>First the enhancement for the continent Europe:</p>
+<div class="codehilite"><pre><span
class="err">urn:enhancement:entity-hierarchy-enhancement:id1</span>
+ <span class="err">a</span> <span
class="err">fise:EntityAnnotation</span> <span class="err">,</span> <span
class="err">fise:Enhancement</span> <span class="err">;</span>
+ <span class="err">fise:confidence</span>
+ <span class="err">"0.42163702845573425"^^xsd:double</span>
<span class="err">;</span>
+ <span class="err">fise:entity-label</span>
+ <span class="err">"Europe"^^xsd:string</span> <span
class="err">;</span>
+ <span class="err">fise:entity-reference</span>
+ <span class="err">http:</span><span
class="c1">//sws.geonames.org/6255148/ ;</span>
+ <span class="err">fise:entity-type</span>
+ <span class="err">geonames:Feature</span> <span class="err">,</span>
<span class="err">dbpedia:Place,</span> <span
class="err">geonames:L.CONT</span> <span class="err">;</span>
+ <span class="err">fise:extracted-from</span>
+ <span class="err">urn:content-item:id1</span> <span
class="err">;</span>
+<span class="err">dc:requires</span>
+ <span class="err">urn:enhancement:entity-enhancement:id1</span> <span
class="err">.</span>
+</pre></div>
+
+
+<h3 id="country_austria">Country: Austria</h3>
+<p>Next the enhancement for the country "Austria", classified as an
independent political entity within geonames.org:</p>
+<div class="codehilite"><pre><span
class="err">urn:enhancement:entity-hierarchy-enhancement:id2</span>
+ <span class="err">a</span> <span
class="err">fise:EntityAnnotation</span> <span class="err">,</span> <span
class="err">fise:Enhancement</span> <span class="err">;</span>
+ <span class="err">fise:confidence</span>
+ <span class="err">"0.42163702845573425"^^xsd:double</span>
<span class="err">;</span>
+ <span class="err">fise:entity-label</span>
+ <span class="err">"Austria"^^xsd:string</span> <span
class="err">;</span>
+ <span class="err">fise:entity-reference</span>
+ <span class="err">http:</span><span
class="c1">//sws.geonames.org/2782113/ ;</span>
+ <span class="err">fise:entity-type</span>
+ <span class="err">geonames:Feature</span> <span class="err">,</span>
<span class="err">dbpedia:Place,</span> <span
class="err">dbpedia:AdministrativeRegion,</span> <span
class="err">geonames:A.PCLI</span> <span class="err">;</span>
+ <span class="err">fise:extracted-from</span>
+ <span class="err">urn:content-item:id1</span> <span
class="err">;</span>
+<span class="err">dc:requires</span>
+ <span class="err">urn:enhancement:entity-enhancement:id1</span> <span
class="err">.</span>
+</pre></div>
+
+
+<h3 id="aadm1_-_a_county">A.ADM1 - A county</h3>
+<p>Now three enhancements describing the different levels of
administrative regions within Austria: first the "Bundesland", next the
"Stadtteil" and last the "Gemeindebezirk".</p>
+<div class="codehilite"><pre><span
class="err">urn:enhancement:entity-hierarchy-enhancement:id3</span>
+ <span class="err">a</span> <span
class="err">fise:EntityAnnotation</span> <span class="err">,</span> <span
class="err">fise:Enhancement</span> <span class="err">;</span>
+ <span class="err">fise:confidence</span>
+ <span class="err">"0.42163702845573425"^^xsd:double</span>
<span class="err">;</span>
+ <span class="err">fise:entity-label</span>
+ <span class="err">"Vienna"^^xsd:string</span> <span
class="err">;</span>
+ <span class="err">fise:entity-reference</span>
+ <span class="err">http:</span><span
class="c1">//sws.geonames.org/2761367/ ;</span>
+ <span class="err">fise:entity-type</span>
+ <span class="err">geonames:Feature</span> <span class="err">,</span>
<span class="err">dbpedia:Place,</span> <span
class="err">dbpedia:AdministrativeRegion,</span> <span
class="err">geonames:A.ADM1</span> <span class="err">;</span>
+ <span class="err">fise:extracted-from</span>
+ <span class="err">urn:content-item:id1</span> <span
class="err">;</span>
+<span class="err">dc:requires</span>
+ <span class="err">urn:enhancement:entity-enhancement:id1</span> <span
class="err">.</span>
+</pre></div>
+
+
+<h3 id="aadm2_-_a_city">A.ADM2 - A city</h3>
+<div class="codehilite"><pre><span
class="err">urn:enhancement:entity-hierarchy-enhancement:id4</span>
+ <span class="err">a</span> <span
class="err">fise:EntityAnnotation</span> <span class="err">,</span> <span
class="err">fise:Enhancement</span> <span class="err">;</span>
+ <span class="err">fise:confidence</span>
+ <span class="err">"0.42163702845573425"^^xsd:double</span>
<span class="err">;</span>
+ <span class="err">fise:entity-label</span>
+ <span class="err">"Politischer</span> <span
class="err">Bezirk</span> <span class="err">Wien</span> <span
class="err">(Stadt)"^^xsd:string</span> <span class="err">;</span>
+ <span class="err">fise:entity-reference</span>
+ <span class="err">http:</span><span
class="c1">//sws.geonames.org/2761333/ ;</span>
+ <span class="err">fise:entity-type</span>
+ <span class="err">geonames:Feature</span> <span class="err">,</span>
<span class="err">dbpedia:Place,</span> <span
class="err">dbpedia:AdministrativeRegion,</span> <span
class="err">geonames:A.ADM2</span> <span class="err">;</span>
+ <span class="err">fise:extracted-from</span>
+ <span class="err">urn:content-item:id1</span> <span
class="err">;</span>
+<span class="err">dc:requires</span>
+ <span class="err">urn:enhancement:entity-enhancement:id1</span> <span
class="err">.</span>
+</pre></div>
+
+
+<h3 id="aadm3_-_a_village">A.ADM3 - A village</h3>
+<div class="codehilite"><pre><span
class="err">urn:enhancement:entity-hierarchy-enhancement:id5</span>
+ <span class="err">a</span> <span
class="err">fise:EntityAnnotation</span> <span class="err">,</span> <span
class="err">fise:Enhancement</span> <span class="err">;</span>
+ <span class="err">fise:confidence</span>
+ <span class="err">"0.42163702845573425"^^xsd:double</span>
<span class="err">;</span>
+ <span class="err">fise:entity-label</span>
+ <span class="err">"Gemeindebezirk</span> <span
class="err">Innere</span> <span class="err">Stadt"^^xsd:string</span>
<span class="err">;</span>
+ <span class="err">fise:entity-reference</span>
+ <span class="err">http:</span><span
class="c1">//sws.geonames.org/2775259/ ;</span>
+ <span class="err">fise:entity-type</span>
+ <span class="err">geonames:Feature</span> <span class="err">,</span>
<span class="err">dbpedia:Place,</span> <span
class="err">dbpedia:AdministrativeRegion,</span> <span
class="err">geonames:A.ADM3</span> <span class="err">;</span>
+ <span class="err">fise:extracted-from</span>
+ <span class="err">urn:content-item:id1</span> <span
class="err">;</span>
+<span class="err">dc:requires</span>
+ <span class="err">urn:enhancement:entity-enhancement:id1</span> <span
class="err">.</span>
+</pre></div>
+
+
+<p>The last two hierarchy levels are no longer valid for the meaning of
"Vienna" as selected by the TextAnnotation. They are added because the geonames.org
dataset locates the Feature of a city exactly at its center. However, if the
TextAnnotation described a precise address, such hierarchy levels would
make complete sense.</p>
+<h2 id="configuration">Configuration</h2>
+<p>The LocationEnhancementEngine provides six configuration options.</p>
+<p>The first three can be used to optimise the behaviour of the engine:</p>
+<ul>
+<li>Minimum score (default = 0.33): the minimum score (confidence) that is
required for entity suggestions.</li>
+<li>Maximum Locations (default = 3): the maximum number of entity
suggestions added, regardless of whether there are more results with a
score above min-score.</li>
+<li>Minimum hierarchy score (default = 0.7): the minimum score (confidence)
that is required for hierarchy enhancements to be added for a suggested entity.
To add hierarchy enhancements for all suggested entities,
min-hierarchy-score needs to be set to a value less than or equal to
min-score.</li>
+</ul>
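+<p>How the three tuning options interact can be sketched as follows. This is a
simplified Python illustration, not the engine's code: the function name
select_suggestions and the candidate labels are hypothetical, and both
thresholds are assumed to be inclusive.</p>

```python
def select_suggestions(candidates, min_score=0.33, max_locations=3,
                       min_hierarchy_score=0.7):
    """candidates: list of (label, score) pairs.

    Returns (suggestions, with_hierarchy): the suggestions that are kept, and
    the subset for which hierarchy enhancements would be added.
    """
    # Keep candidates that reach the minimum score, best first, capped at
    # the maximum number of locations.
    ranked = sorted((c for c in candidates if c[1] >= min_score),
                    key=lambda c: c[1], reverse=True)
    suggestions = ranked[:max_locations]
    # Hierarchy enhancements are only added for high-confidence suggestions.
    with_hierarchy = [c for c in suggestions if c[1] >= min_hierarchy_score]
    return suggestions, with_hierarchy


# Hypothetical candidates with scores similar to the "Vienna" example.
candidates = [("A", 1.0), ("B", 0.42), ("C", 0.42), ("D", 0.35), ("E", 0.2)]
suggestions, with_hierarchy = select_suggestions(candidates)
```

+<p>With the default values, "D" is dropped by the maximum number of locations,
"E" by the minimum score, and only "A" receives hierarchy enhancements.</p>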
+<p>The other three are used to configure the geonames.org server:</p>
+<ul>
+<li>geonames.org Server: the URL of the geonames.org service. The default is the
free geonames.org webserver that works without user authentication. There
is a second free server at http://api.geonames.org/ that requires setting up
a free user account. Users with a premium account need to enter
their own URL here.</li>
+<li>User Name: the name of the account (can be empty if the configured
server does not require user authentication).</li>
+<li>Token: the token is usually the password of the user account.</li>
+</ul>
+<h3 id="howto_setup_a_free_user_account">HOWTO setup a free user account:</h3>
+<p>Such an account is required to be able to use the http://api.geonames.org/
server
+ that should support better performance and higher uptime than the default
+ free server available at http://ws.geonames.org/.</p>
+<p>To set up the free account:<br />
+(1) Go to www.geonames.org. In the top right corner you will find a "login"
link
+ that is also used to create new accounts.<br />
+(2) Choose a username and password. You will get a confirmation mail at the
provided
+ email address. When choosing the password, consider that it will be sent
+ unencrypted (as token) with every web service request. Therefore it is
+ strongly recommended not to use a password that is used for any other
+ account!<br />
+(3) Confirm the account.<br />
+(4) IMPORTANT: You need to activate the free web service for the account via
+ http://www.geonames.org/manageaccount. Log in first, then go back to this page.
+ At the bottom you should find the text "the account is not yet enabled to
+ use the free web services. Click here to enable".</p>
+<p>If you do not complete step (4), requests with your account will result in
+IOExceptions with the message
+ "user account not enabled to use the free webservice. Please enable it on
your account page: http://www.geonames.org/manageaccount"</p>
+ </div>
+
+ <div id="footer">
+ <div class="copyright">
+ <p>
+ Copyright © 2010 The Apache Software Foundation, Licensed under
+ the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache
License, Version 2.0</a>.
+ <br />
+ Apache, Stanbol and the Apache feather and Stanbol logos are
trademarks of The Apache Software Foundation.
+ </p>
+ </div>
+ </div>
+
+</body>
+</html>
Added:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/opencalaisengine.html
==============================================================================
---
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/opencalaisengine.html
(added)
+++
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/opencalaisengine.html
Mon Oct 10 14:02:38 2011
@@ -0,0 +1,150 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<!--
+
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+ <link href="/stanbol/css/stanbol.css" rel="stylesheet" type="text/css">
+ <title>Apache Stanbol - The OpenCalais Enhancement Engine</title>
+ <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+ <link rel="icon" type="image/png"
href="/stanbol/images/stanbol-logo/stanbol-favicon.png"/>
+</head>
+
+<body>
+ <div id="navigation">
+ <img alt="Apache Stanbol" width="220" height="101"
src="/stanbol/images/stanbol-logo/stanbol-2010-12-14.png"/>
+ <h1 id="stanbol_links">Stanbol links</h1>
+<ul>
+<li><a href="/stanbol/index.html">Home</a></li>
+<li><a href="/stanbol/team.html">Project Team</a></li>
+<li><a href="/stanbol/docs/trunk/">Documentation</a></li>
+</ul>
+<h1 id="asf_links">ASF links</h1>
+<ul>
+<li><a href="http://www.apache.org">Apache Software Foundation</a></li>
+<li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li>
+<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+<li><a href="http://www.apache.org/foundation/sponsorship.html">Become a
Sponsor</a></li>
+<li><a href="http://www.apache.org/security/">Security</a></li>
+</ul>
+ </div>
+
+ <div id="content">
+ <h1 class="title">The OpenCalais Enhancement Engine</h1>
+ <p>The <strong>OpenCalais Enhancement Engine</strong> provides an
interface to the <a href="http://www.opencalais.com/">OpenCalais
+Webservice</a> for Named Entity Recognition (NER).</p>
+<h2 id="technical_description">Technical description</h2>
+<p>The engine sends the text of the content item to the OpenCalais service and
+retrieves the NER annotations in RDF format. The OpenCalais annotations are
+added to the content item's metadata as Stanbol text enhancement
structures.</p>
+<p>The engine natively supports the mime types <em>text/plain</em> and
+<em>text/html</em>. Additionally, the engine can process text that is provided in
the content
+item's metadata as the value of the property</p>
+<div class="codehilite"><pre><span class="n">http:</span><span
class="sr">//</span><span class="n">www</span><span class="o">.</span><span
class="n">semanticdesktop</span><span class="o">.</span><span
class="n">org</span><span class="sr">/ontologies/</span><span
class="mi">2007</span><span class="sr">/01/</span><span
class="mi">19</span><span class="o">/</span><span class="n">nie</span><span
class="c1">#plainTextContent</span>
+</pre></div>
+
+
+<p>Supported languages are</p>
+<ul>
+<li>English (en)</li>
+<li>French (fr)</li>
+<li>Spanish (es)</li>
+</ul>
+<h2 id="requirements_for_use_and_configuration_options">Requirements for use
and configuration options</h2>
+<p>The use of this component requires an API key from OpenCalais. Without
+providing an API key, the engine will not do anything. Such a key can be
+obtained from <a
href="http://www.opencalais.com/APIkey">http://www.opencalais.com/APIkey</a>.</p>
+<p>In the OSGi configuration the key is set as value of the property</p>
+<div class="codehilite"><pre><span class="n">org</span><span
class="o">.</span><span class="n">apache</span><span class="o">.</span><span
class="n">stanbol</span><span class="o">.</span><span
class="n">enhancer</span><span class="o">.</span><span
class="n">engines</span><span class="o">.</span><span
class="n">opencalais</span><span class="o">.</span><span
class="n">license</span>
+</pre></div>
+
+
+<p>Also, the unit tests require the API key. Without the key some tests will be
+skipped. For Maven the key can be set as a system property on the command
line:</p>
+<div class="codehilite"><pre><span class="n">mvn</span> <span
class="o">-</span><span class="n">Dorg</span><span class="o">.</span><span
class="n">apache</span><span class="o">.</span><span
class="n">stanbol</span><span class="o">.</span><span
class="n">enhancer</span><span class="o">.</span><span
class="n">engines</span><span class="o">.</span><span
class="n">opencalais</span><span class="o">.</span><span
class="n">license</span><span class="o">=</span><span
class="n">YOUR_API_KEY</span> <span class="p">[</span><span
class="n">install</span><span class="o">|</span><span
class="n">test</span><span class="p">]</span>
+</pre></div>
+
+
+<p>The following configuration properties are defined:</p>
+<ul>
+<li>
+<p><tt>org.apache.stanbol.enhancer.engines.opencalais.license</tt></p>
+<p>The OpenCalais license key that <strong>must</strong> be defined.</p>
+</li>
+<li>
+<p><tt>org.apache.stanbol.enhancer.engines.opencalais.url</tt></p>
+<p>The URL of the OpenCalais RESTful service. This needs to be changed only
+if OpenCalais changes its web service address.</p>
+</li>
+<li>
+<p><tt>org.apache.stanbol.enhancer.engines.opencalais.typeMap</tt></p>
+<p>The value is the name
+of a file for mapping the NER types from OpenCalais to other types. By
+default, a mapping to the DBpedia types is provided in order to achieve
+compatibility with the Stanbol OpenNLP-NER engine. If no mapping is
+desired, one might pass an empty mapping file. Types for which no
+mapping is defined are passed as is to the metadata. The syntax of the
+mapping table is similar to that of Java property files. Each entry
+takes the form</p>
+<p>CalaisTypeURI=TargetTypeURI</p>
+</li>
+<li>
+<p><tt>org.apache.stanbol.enhancer.engines.opencalais.NERonly</tt></p>
+<p>A Boolean property that
+specifies whether, in addition to the NER enhancements, the OpenCalais
+Linked Data references are also included as entity references. By default,
+these are omitted.</p>
+</li>
+</ul>
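+<p>To illustrate the <tt>typeMap</tt> file format described above, here is a minimal Python sketch that reads <tt>CalaisTypeURI=TargetTypeURI</tt> entries and passes unmapped types through unchanged. It is not the engine's own parser: the function names and the example URIs are hypothetical, and skipping blank lines and <tt>#</tt> comments is an assumption borrowed from Java property files.</p>

```python
def parse_type_map(text):
    """Parse 'CalaisTypeURI=TargetTypeURI' lines into a dict."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comments are ignored, as in
        # Java property files. The engine documentation does not confirm this.
        if not line or line.startswith("#"):
            continue
        # Split on the first '=' only; lines without '=' are ignored.
        source, sep, target = line.partition("=")
        if not sep:
            continue
        mapping[source.strip()] = target.strip()
    return mapping


def map_type(mapping, calais_type):
    # Types for which no mapping is defined are passed through as is.
    return mapping.get(calais_type, calais_type)


# Hypothetical URIs, for illustration only.
EXAMPLE = """\
# OpenCalais type = target type
http://example.org/calais/Person=http://example.org/target/Person
http://example.org/calais/City=http://example.org/target/Place
"""

type_map = parse_type_map(EXAMPLE)
```

+<p>With this sketch, an empty mapping file yields an empty dictionary, so every type is passed to the metadata unchanged.</p>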
+<h2 id="usage">Usage</h2>
+<p>Assuming that the Stanbol endpoint with the full launcher is running at</p>
+<div class="codehilite"><pre><span class="n">http:</span><span
class="sr">//</span><span class="n">localhost:8080</span>
+</pre></div>
+
+
+<p>and that the license key has been defined and the engine is activated,
+commands like the following can be used from the command line to submit a text file as a
content item:</p>
+<ul>
+<li>
+<p>stateless interface</p>
+<p>curl -i -X POST -H "Content-Type:text/plain" -T testfile.txt
http://localhost:8080/engines</p>
+</li>
+<li>
+<p>stateful interface</p>
+<p>curl -i -X PUT -H "Content-Type:text/plain" -T testfile.txt
http://localhost:8080/contenthub/content/someFileId</p>
+</li>
+</ul>
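+<p>The stateless call above can also be issued from a script. The following Python sketch builds the same POST request using only the standard library. The helper name <tt>build_enhance_request</tt> and the sample sentence are hypothetical, and the request is only constructed here, not sent, since sending it requires a running Stanbol instance.</p>

```python
import urllib.request


def build_enhance_request(text, base_url="http://localhost:8080"):
    """Build (but do not send) a POST to the stateless /engines endpoint."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/engines",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


request = build_enhance_request("Paris is the capital of France.")

# To actually submit the text, a running Stanbol instance is needed:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```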
+<p>Alternatively, the Stanbol web interface can be used for submitting
documents
+and viewing the metadata at</p>
+<div class="codehilite"><pre><span class="n">http:</span><span
class="sr">//</span><span class="n">localhost:8080</span><span
class="o">/</span><span class="n">contenthub</span>
+</pre></div>
+ </div>
+
+ <div id="footer">
+ <div class="copyright">
+ <p>
+ Copyright © 2010 The Apache Software Foundation, Licensed under
+ the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache
License, Version 2.0</a>.
+ <br />
+ Apache, Stanbol and the Apache feather and Stanbol logos are
trademarks of The Apache Software Foundation.
+ </p>
+ </div>
+ </div>
+
+</body>
+</html>
Added:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/refactorengine.html
==============================================================================
---
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/refactorengine.html
(added)
+++
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/refactorengine.html
Mon Oct 10 14:02:38 2011
@@ -0,0 +1,73 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<!--
+
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+ <link href="/stanbol/css/stanbol.css" rel="stylesheet" type="text/css">
+ <title>Apache Stanbol - The Refactor Engine</title>
+ <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+ <link rel="icon" type="image/png"
href="/stanbol/images/stanbol-logo/stanbol-favicon.png"/>
+</head>
+
+<body>
+ <div id="navigation">
+ <img alt="Apache Stanbol" width="220" height="101"
src="/stanbol/images/stanbol-logo/stanbol-2010-12-14.png"/>
+ <h1 id="stanbol_links">Stanbol links</h1>
+<ul>
+<li><a href="/stanbol/index.html">Home</a></li>
+<li><a href="/stanbol/team.html">Project Team</a></li>
+<li><a href="/stanbol/docs/trunk/">Documentation</a></li>
+</ul>
+<h1 id="asf_links">ASF links</h1>
+<ul>
+<li><a href="http://www.apache.org">Apache Software Foundation</a></li>
+<li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li>
+<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+<li><a href="http://www.apache.org/foundation/sponsorship.html">Become a
Sponsor</a></li>
+<li><a href="http://www.apache.org/security/">Security</a></li>
+</ul>
+ </div>
+
+ <div id="content">
+ <h1 class="title">The Refactor Engine</h1>
+ <p>This enhancement engine requires the following components to be running:</p>
+<ul>
+<li>Stanbol Entityhub</li>
+<li>Stanbol Refactor</li>
+<li>Stanbol OntoNet</li>
+</ul>
+<p>It refactors the RDF graphs of recognized entities to a target vocabulary.
+The engine is provided with a default set of rules (a recipe) for the
refactoring, which
+produces an RDF graph according to the Google vocabulary. This default
recipe makes it possible to produce Google Rich
+Snippets.</p>
+ </div>
+
+ <div id="footer">
+ <div class="copyright">
+ <p>
+ Copyright © 2010 The Apache Software Foundation, Licensed under
+ the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache
License, Version 2.0</a>.
+ <br />
+ Apache, Stanbol and the Apache feather and Stanbol logos are
trademarks of The Apache Software Foundation.
+ </p>
+ </div>
+ </div>
+
+</body>
+</html>
Added:
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/zemantaengine.html
==============================================================================
---
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/zemantaengine.html
(added)
+++
websites/staging/stanbol/trunk/content/stanbol/docs/trunk/enhancer/engines/zemantaengine.html
Mon Oct 10 14:02:38 2011
@@ -0,0 +1,84 @@
+<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
+<html>
+<head>
+<!--
+
+ Licensed to the Apache Software Foundation (ASF) under one or more
+ contributor license agreements. See the NOTICE file distributed with
+ this work for additional information regarding copyright ownership.
+ The ASF licenses this file to You under the Apache License, Version 2.0
+ (the "License"); you may not use this file except in compliance with
+ the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
+-->
+
+ <link href="/stanbol/css/stanbol.css" rel="stylesheet" type="text/css">
+ <title>Apache Stanbol - The Zemanta enhancement engine</title>
+ <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
+ <link rel="icon" type="image/png"
href="/stanbol/images/stanbol-logo/stanbol-favicon.png"/>
+</head>
+
+<body>
+ <div id="navigation">
+ <img alt="Apache Stanbol" width="220" height="101"
src="/stanbol/images/stanbol-logo/stanbol-2010-12-14.png"/>
+ <h1 id="stanbol_links">Stanbol links</h1>
+<ul>
+<li><a href="/stanbol/index.html">Home</a></li>
+<li><a href="/stanbol/team.html">Project Team</a></li>
+<li><a href="/stanbol/docs/trunk/">Documentation</a></li>
+</ul>
+<h1 id="asf_links">ASF links</h1>
+<ul>
+<li><a href="http://www.apache.org">Apache Software Foundation</a></li>
+<li><a href="http://www.apache.org/licenses/LICENSE-2.0">License</a></li>
+<li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li>
+<li><a href="http://www.apache.org/foundation/sponsorship.html">Become a
Sponsor</a></li>
+<li><a href="http://www.apache.org/security/">Security</a></li>
+</ul>
+ </div>
+
+ <div id="content">
+ <h1 class="title">The Zemanta enhancement engine</h1>
+ <p>Enhancement engine that uses the Zemanta API. You need a Zemanta API
key to run this engine.</p>
+<h2 id="usage">Usage</h2>
+<ul>
+<li>
+<p>build ("mvn install") and deploy the Clerezza bundle
org.apache.clerezza.rdf.jena.parser</p>
+</li>
+<li>
+<p>build the jar ("mvn install")</p>
+</li>
+<li>
+<p>import the jar into the OSGi runtime</p>
+</li>
+<li>
+<p>In the OSGi web console, set the property
"org.apache.stanbol.enhancer.engines.zemanta.key" with your API key
+ (restart the bundle in the OSGi console)</p>
+</li>
+<li>
+<p>Watch the console when you add text using commands such as:</p>
+</li>
+</ul>
+<p>curl -T myText.txt -H Content-Type:text/plain
http://localhost:8080/fise/someId</p>
+ </div>
+
+ <div id="footer">
+ <div class="copyright">
+ <p>
+ Copyright © 2010 The Apache Software Foundation, Licensed under
+ the <a href="http://www.apache.org/licenses/LICENSE-2.0">Apache
License, Version 2.0</a>.
+ <br />
+ Apache, Stanbol and the Apache feather and Stanbol logos are
trademarks of The Apache Software Foundation.
+ </p>
+ </div>
+ </div>
+
+</body>
+</html>
Modified: websites/staging/stanbol/trunk/content/stanbol/docs/trunk/index.html
==============================================================================
--- websites/staging/stanbol/trunk/content/stanbol/docs/trunk/index.html
(original)
+++ websites/staging/stanbol/trunk/content/stanbol/docs/trunk/index.html Mon
Oct 10 14:02:38 2011
@@ -112,7 +112,7 @@ contains Stanbol's persistent data, depl
<li><a href="enhancer.html">Enhancer</a></li>
<li><a href="engines.html">Enhancement Engines</a></li>
<li><a href="entityhub.html">Entityhub</a></li>
-<li>Contenthub</li>
+<li><a href="contenthub.html">Contenthub</a></li>
<li>CMS Adapter </li>
<li>Ontology Manager</li>
<li>Reasoners</li>