Author: rwesten
Date: Mon May 16 10:34:52 2011
New Revision: 1103676

URL: http://svn.apache.org/viewvc?rev=1103676&view=rev
Log:
STANBOL-92: Adds support to use DBpedia, geonames.org and DBLP as Referenced 
Sites for the Entityhub.

The /data/sites folder will include bundles of popular Sites that can be used
out of the box with the Stanbol Entityhub.

See the README.md for more information.

Added:
    incubator/stanbol/trunk/data/
    incubator/stanbol/trunk/data/README.md
    incubator/stanbol/trunk/data/pom.xml   (with props)
    incubator/stanbol/trunk/data/sites/
    incubator/stanbol/trunk/data/sites/dblp/
    incubator/stanbol/trunk/data/sites/dblp/README.md
    incubator/stanbol/trunk/data/sites/dblp/pom.xml   (with props)
    incubator/stanbol/trunk/data/sites/dblp/src/
    incubator/stanbol/trunk/data/sites/dblp/src/main/
    incubator/stanbol/trunk/data/sites/dblp/src/main/resources/
    incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/
    incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/
    incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/
    incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/
    incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/
    incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/
    incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/dblp.solrindex.ref
    incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.core.site.CacheImpl-dblp.config
    incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.site.referencedSite-dblp.config
    incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dblpIndex.config
    incubator/stanbol/trunk/data/sites/dbpedia/
    incubator/stanbol/trunk/data/sites/dbpedia/README.md
    incubator/stanbol/trunk/data/sites/dbpedia/pom.xml   (with props)
    incubator/stanbol/trunk/data/sites/dbpedia/src/
    incubator/stanbol/trunk/data/sites/dbpedia/src/main/
    incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/
    incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/
    incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/
    incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/
    incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/
    incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/
    incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/
    incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/dbPedia.solrindex.ref
    incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.core.site.CacheImpl-DBpedia.config
    incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.site.referencedSite-DBpedia.config
    incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-DBpediaIndex.config
    incubator/stanbol/trunk/data/sites/geonames/
    incubator/stanbol/trunk/data/sites/geonames/README.md
    incubator/stanbol/trunk/data/sites/geonames/pom.xml   (with props)
    incubator/stanbol/trunk/data/sites/geonames/src/
    incubator/stanbol/trunk/data/sites/geonames/src/main/
    incubator/stanbol/trunk/data/sites/geonames/src/main/resources/
    incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/
    incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/
    incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/
    incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/
    incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/
    incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/
    incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/geonames.solrindex.ref
    incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.core.site.CacheImpl-geonames.config
    incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.site.referencedSite-geonames.config
    incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-geonamesIndex.config

Added: incubator/stanbol/trunk/data/README.md
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/README.md?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/README.md (added)
+++ incubator/stanbol/trunk/data/README.md Mon May 16 10:34:52 2011
@@ -0,0 +1,28 @@
+# Data files for optional extensions of the Stanbol distributions
+
+This source repository holds the pom.xml file and folder structure to build
+optional packages for Apache Stanbol.
+
+To avoid loading the Subversion repository with large binary files, these
+artifacts are typically not included but need to be built/precomputed or
+downloaded from other sites.
+See the documentation of the respective module for details.
+
+## DataFileProvider Service
+
+The DataFileProvider Service is typically used by components that need to load
+big binary files into Apache Stanbol.
+See {stanbol-root}/commons/stanboltools/datafileprovider for details.
+
+## Bundleprovider
+
+The Bundleprovider is an extension to the Apache Sling installer framework
+that supports loading multiple configuration files from a single bundle.
+
+It is intended to be used in cases where a single Stanbol module needs to
+package several configuration files (e.g. the configuration of several OSGi
+Services).
+
+See {stanbol-root}/commons/installer/bundleprovider for details.
+
+

Added: incubator/stanbol/trunk/data/pom.xml
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/pom.xml?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/pom.xml (added)
+++ incubator/stanbol/trunk/data/pom.xml Mon May 16 10:34:52 2011
@@ -0,0 +1,24 @@
+<?xml version="1.0"?>
+<project>
+    <modelVersion>4.0.0</modelVersion>
+
+    <groupId>org.apache.stanbol</groupId>
+    <artifactId>org.apache.stanbol.data.reactor</artifactId>
+    <version>0.9-SNAPSHOT</version>
+    <packaging>pom</packaging>
+    
+    <name>Apache Stanbol Data Reactor</name>
+    <scm>
+        <connection>
+            scm:svn:http://svn.apache.org/repos/asf/incubator/stanbol/trunk/data
+        </connection>
+        <developerConnection>
+            scm:svn:https://svn.apache.org/repos/asf/incubator/stanbol/trunk/data
+        </developerConnection>
+        <url>http://incubator.apache.org/stanbol/</url>
+    </scm>
+
+    <modules>
+        <module>sites</module>
+    </modules>
+</project>

Propchange: incubator/stanbol/trunk/data/pom.xml
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: incubator/stanbol/trunk/data/sites/dblp/README.md
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dblp/README.md?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dblp/README.md (added)
+++ incubator/stanbol/trunk/data/sites/dblp/README.md Mon May 16 10:34:52 2011
@@ -0,0 +1,54 @@
+# DBLP with local index for the Apache Stanbol Entityhub
+
+This builds a bundle that can be installed to add the [DBLP](http://dblp.uni-trier.de/)
+data set as a ReferencedSite to the Apache Stanbol Entityhub.
+
+The binary data for the local cache are not included but need to be
+downloaded (TODO: add download location as soon as available) or built locally
+by using the DBLP indexing utility.
+
+PLEASE NOTE that the DBLP dataset does not provide any License information. 
+
+
+## Installation
+
+First build the bundle by calling
+
+    mvn install
+
+If the command succeeds, the bundle is available in the target folder
+    
+    target/org.apache.stanbol.data.sites.dblp-.*.jar
+
+This bundle can now be installed to a running Stanbol instance e.g. by using
+the Apache Felix Webconsole.
+
+NOTE: This step requires the Sling Installer Framework as well as the
+Stanbol BundleInstaller extension to be active. Both are typically included
+within the Stanbol Launcher.
+
+After installing and starting this bundle, the Stanbol Data File Provider (a
+tab within the Apache Felix Webconsole) will show a request for the binary
+file for the local index.
+
+To finalise the installation you need to copy the requested file to the
+directory used by the Stanbol Data File Provider
+
+    sling/datafiles/
+    
+and then restart the SolrYard instance with the name
+    
+    dblpIndex
+    
+ 
+## Building the DBLP index
+
+To build a local index for DBLP, the Apache Stanbol Entityhub provides its own
+utility. The module is located at
+
+    {stanbol}/entityhub/indexing/dblp
+
+Detailed documentation on how to use this utility is provided by the
+module's README file.
+
+
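
Reviewer note: the copy step described in the README above can be scripted. The following is a minimal sketch, assuming the index archive has already been downloaded or built and that `sling/datafiles/` sits below the directory the launcher was started from. The function name `install_index_archive` and the `stanbol_home` parameter are hypothetical and not part of Stanbol.

```python
import shutil
import tempfile
from pathlib import Path


def install_index_archive(archive: Path, stanbol_home: Path) -> Path:
    """Copy an index archive into the sling/datafiles folder below stanbol_home.

    stanbol_home is an assumed name for the launcher's working directory.
    """
    if not archive.is_file():
        raise FileNotFoundError(f"index archive not found: {archive}")
    datafiles = stanbol_home / "sling" / "datafiles"
    datafiles.mkdir(parents=True, exist_ok=True)  # create the folder if missing
    target = datafiles / archive.name
    shutil.copy2(archive, target)  # copy file content and timestamps
    return target


if __name__ == "__main__":
    # Demo with throwaway directories; real paths depend on the installation.
    with tempfile.TemporaryDirectory() as tmp:
        demo_archive = Path(tmp) / "dblp.solrindex.zip"
        demo_archive.write_bytes(b"placeholder")
        print(install_index_archive(demo_archive, Path(tmp) / "stanbol"))
```

The sketch only covers the file copy. Restarting the SolrYard instance is still done by hand, for example through the Apache Felix Webconsole.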

Added: incubator/stanbol/trunk/data/sites/dblp/pom.xml
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dblp/pom.xml?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dblp/pom.xml (added)
+++ incubator/stanbol/trunk/data/sites/dblp/pom.xml Mon May 16 10:34:52 2011
@@ -0,0 +1,71 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+  <modelVersion>4.0.0</modelVersion>
+
+  <groupId>org.apache.stanbol</groupId>
+  <artifactId>org.apache.stanbol.data.sites.dblp</artifactId>
+  <!-- Fixed version number to avoid reuploading / redownloading large data jars
+    when not necessary.
+    We need to find a way to better distribute this and clarify the potential
+    legal distribution issues and the impact on the ASF release process.
+
+    See also:
+    https://issues.apache.org/jira/browse/OPENNLP-68
+  -->
+  <version>0.0.1</version>
+  <packaging>bundle</packaging>
+
+  <name>Apache Stanbol Data: DBLP</name>
+  <description>
+    This bundle installs DBLP as Referenced Site with a full local cache to
+    the Apache Stanbol Entityhub.
+    The data of the local cache are not included but MUST be either downloaded
+    or precomputed by using the DBLP indexing utility (see 
+    "{stanbol}/entityhub/indexing/dblp")
+  </description>
+
+  <inceptionYear>2011</inceptionYear>
+
+  <scm>
+    <connection>
+      scm:svn:http://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/sites/dblp
+    </connection>
+    <developerConnection>
+      scm:svn:https://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/sites/dblp
+    </developerConnection>
+    <url>http://incubator.apache.org/stanbol/</url>
+  </scm>
+  <properties>
+    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+  </properties>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.felix</groupId>
+        <artifactId>maven-bundle-plugin</artifactId>
+        <version>2.3.4</version>
+        <inherited>true</inherited>
+        <extensions>true</extensions>
+        <configuration>
+          <instructions>
+            <Bundle-Category>Stanbol Data</Bundle-Category>
+            <Bundle-DocURL>http://incubator.apache.org/stanbol</Bundle-DocURL>
+            <Bundle-Vendor>Apache Stanbol (Incubating)</Bundle-Vendor>
+            <Bundle-SymbolicName>${project.artifactId}</Bundle-SymbolicName>
+            <_versionpolicy>$${version;===;${@}}</_versionpolicy>
+            <Export-Package>
+              org.apache.stanbol.data.site.dblp.*;version="${pom.version}"
+            </Export-Package>
+            <Install-Path>
+              org/apache/stanbol/data
+            </Install-Path>
+          </instructions>
+        </configuration>
+      </plugin>
+    </plugins>
+  </build>
+
+</project>

Propchange: incubator/stanbol/trunk/data/sites/dblp/pom.xml
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/dblp.solrindex.ref
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/dblp.solrindex.ref?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/dblp.solrindex.ref (added)
+++ incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/dblp.solrindex.ref Mon May 16 10:34:52 2011
@@ -0,0 +1,5 @@
+Name=DBLP Computer Science Bibliography
+Description=DBLP provides bibliographic information on major computer science journals and proceedings. DBLP indexes more than one million articles and contains more than 10000 links to home pages of computer scientists.
+License=No License Information available
+Homepage=http://dblp.uni-trier.de/
+Index-Archive=dblp.solrindex.zip
\ No newline at end of file
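
Reviewer note: the `.solrindex.ref` file above is a list of `Key=Value` lines, and `Index-Archive` names the archive that has to be provided. The following is a minimal sketch of reading such a file, assuming one entry per line without escapes or continuation lines. The function name `parse_index_reference` is hypothetical, and the real loader inside Stanbol may behave differently.

```python
def parse_index_reference(text: str) -> dict:
    """Parse 'Key=Value' lines of a .solrindex.ref file into a dict."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines; treating '#' as a comment is an assumption
        key, sep, value = line.partition("=")  # split at the first '=' only
        if not sep:
            continue  # not a key/value line
        entries[key.strip()] = value.strip()
    return entries


# Sample lines taken from the dblp.solrindex.ref file in this commit.
sample = """\
Name=DBLP Computer Science Bibliography
License=No License Information available
Homepage=http://dblp.uni-trier.de/
Index-Archive=dblp.solrindex.zip"""

reference = parse_index_reference(sample)
print(reference["Index-Archive"])
```

Splitting at the first `=` only keeps values intact when they contain `=` themselves, which a URL with a query string would.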

Added: incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.core.site.CacheImpl-dblp.config
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.core.site.CacheImpl-dblp.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.core.site.CacheImpl-dblp.config (added)
+++ incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.core.site.CacheImpl-dblp.config Mon May 16 10:34:52 2011
@@ -0,0 +1,2 @@
+org.apache.stanbol.entityhub.yard.cacheYardId="dblpIndex"
+org.apache.stanbol.entityhub.yard.cache.additionalMappings=[""]

Added: incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.site.referencedSite-dblp.config
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.site.referencedSite-dblp.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.site.referencedSite-dblp.config (added)
+++ incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.site.referencedSite-dblp.config Mon May 16 10:34:52 2011
@@ -0,0 +1,14 @@
+org.apache.stanbol.entityhub.site.searcherType="org.apache.stanbol.entityhub.site.SparqlSearcher"
+org.apache.stanbol.entityhub.site.cacheId="dblpIndex"
+org.apache.stanbol.entityhub.site.entityPrefix=["http://dblp.l3s.de/d2r/resource/"]
+org.apache.stanbol.entityhub.site.dereferencerType="org.apache.stanbol.entityhub.site.CoolUriDereferencer"
+org.apache.stanbol.entityhub.site.name="DBLP"
+org.apache.stanbol.entityhub.site.fieldMappings=["swrc:*","rdfs:seeAlso\ |\ d\=entityhub:ref"]
+org.apache.stanbol.entityhub.site.accessUri="http://dblp.l3s.de/d2r/resource/"
+org.apache.stanbol.entityhub.site.cacheStrategy="all"
+org.apache.stanbol.entityhub.site.defaultExpireDuration="0"
+org.apache.stanbol.entityhub.site.queryUri="http://dblp.l3s.de/d2r/sparql"
+org.apache.stanbol.entityhub.site.id="dblp"
+org.apache.stanbol.entityhub.site.defaultMappedEntityState="proposed"
+org.apache.stanbol.entityhub.site.description="The\ DBLP\ Computer\ Science\ Bibliography"
+org.apache.stanbol.entityhub.site.defaultSymbolState="proposed"

Added: incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dblpIndex.config
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dblpIndex.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dblpIndex.config (added)
+++ incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dblpIndex.config Mon May 16 10:34:52 2011
@@ -0,0 +1,9 @@
+org.apache.stanbol.entityhub.yard.maxQueryResultNumber=I"-1"
+org.apache.stanbol.entityhub.yard.defaultQueryResultNumber=I"-1"
+org.apache.stanbol.entityhub.yard.description="Yard\ holding\ the\ local\ copy\ of\ the\ DBLP\ dataset\ (see\ http://dblp.uni-trier.de/)"
+org.apache.stanbol.entityhub.yard.solr.maxBooleanClauses=I"1024"
+org.apache.stanbol.entityhub.yard.solr.solrUri="dblp"
+org.apache.stanbol.entityhub.yard.solr.multiYardIndexLayout="false"
+org.apache.stanbol.entityhub.yard.solr.allowDefaultConfig="false"
+org.apache.stanbol.entityhub.yard.id="dblpIndex"
+org.apache.stanbol.entityhub.yard.name="DBLP\ Yard"
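
Reviewer note: the `.config` files in this commit use quoted values, an optional type prefix (`I"1024"` for the integer settings) and backslash-escaped spaces. The following is a minimal sketch of reading the scalar entries, assuming the `I` prefix marks an integer and that a backslash escapes the following character. Array values in `[...]` are deliberately skipped. The function name `parse_scalar_config` is hypothetical, and this is not the parser that the OSGi framework itself uses.

```python
import re

# key=, an optional single-letter type prefix, then a double-quoted value.
# Scalar entries only; array values such as ["a","b"] do not match.
_SCALAR_LINE = re.compile(r'^([^=\s]+)=([A-Za-z])?"((?:[^"\\]|\\.)*)"$')


def parse_scalar_config(text: str) -> dict:
    """Parse scalar key="value" lines; convert I"..." values to int."""
    result = {}
    for line in text.splitlines():
        match = _SCALAR_LINE.match(line.strip())
        if not match:
            continue  # blank lines, arrays, or anything not understood
        key, type_code, raw = match.groups()
        value = re.sub(r"\\(.)", r"\1", raw)  # drop the escaping backslashes
        if type_code == "I":  # assumption: 'I' marks an integer value
            value = int(value)
        result[key] = value
    return result


# Sample lines adapted from the SolrYard configuration in this commit.
sample = r'''
org.apache.stanbol.entityhub.yard.maxQueryResultNumber=I"-1"
org.apache.stanbol.entityhub.yard.solr.solrUri="dblp"
org.apache.stanbol.entityhub.yard.description="Yard\ holding\ the\ local\ copy"
'''

config = parse_scalar_config(sample)
print(config["org.apache.stanbol.entityhub.yard.solr.solrUri"])
```

Values with any other type prefix are kept as plain strings, so the sketch never guesses at a conversion it does not know.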

Added: incubator/stanbol/trunk/data/sites/dbpedia/README.md
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dbpedia/README.md?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dbpedia/README.md (added)
+++ incubator/stanbol/trunk/data/sites/dbpedia/README.md Mon May 16 10:34:52 2011
@@ -0,0 +1,55 @@
+# DBpedia.org with local index for the Apache Stanbol Entityhub
+
+This builds a bundle that can be installed to add DBpedia.org as a
+ReferencedSite to the Apache Stanbol Entityhub.
+
+It will override the "dbpedia" referenced site included in the default
+configuration of the "full" launcher of Apache Stanbol.
+
+The binary data for the local cache are not included but need to be
+downloaded (TODO: add download location as soon as available) or built locally
+by using the DBpedia.org indexing utility.
+
+
+## Installation
+
+First build the bundle by calling
+
+    mvn install
+
+If the command succeeds, the bundle is available in the target folder
+    
+    target/org.apache.stanbol.data.sites.dbpedia-.*.jar
+
+This bundle can now be installed to a running Stanbol instance e.g. by using
+the Apache Felix Webconsole.
+
+NOTE: This step requires the Sling Installer Framework as well as the
+Stanbol BundleInstaller extension to be active. Both are typically included
+within the Stanbol Launcher.
+
+After installing and starting this bundle, the Stanbol Data File Provider (a
+tab within the Apache Felix Webconsole) will show a request for the binary
+file for the local index.
+
+To finalise the installation you need to copy the requested file to the
+directory used by the Stanbol Data File Provider
+
+    sling/datafiles/
+    
+and then restart the SolrYard instance with the name
+    
+    dbPediaIndex
+    
+ 
+## Building the DBpedia.org index
+
+To build a local index for DBpedia, the Apache Stanbol Entityhub provides its
+own utility. The module is located at
+
+    {stanbol}/entityhub/indexing/dbpedia
+
+Detailed documentation on how to use this utility is provided by the
+module's README file.
+
+

Added: incubator/stanbol/trunk/data/sites/dbpedia/pom.xml
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dbpedia/pom.xml?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dbpedia/pom.xml (added)
+++ incubator/stanbol/trunk/data/sites/dbpedia/pom.xml Mon May 16 10:34:52 2011
@@ -0,0 +1,71 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+  <modelVersion>4.0.0</modelVersion>
+
+  <groupId>org.apache.stanbol</groupId>
+  <artifactId>org.apache.stanbol.data.sites.dbpedia</artifactId>
+  <!-- Fixed version number to avoid reuploading / redownloading large data jars
+    when not necessary.
+    We need to find a way to better distribute this and clarify the potential
+    legal distribution issues and the impact on the ASF release process.
+
+    See also:
+    https://issues.apache.org/jira/browse/OPENNLP-68
+  -->
+  <version>0.0.1</version>
+  <packaging>bundle</packaging>
+
+  <name>Apache Stanbol Data: DBpedia.org</name>
+  <description>
+    This bundle installs DBpedia as Referenced Site with a full local cache to
+    the Apache Stanbol Entityhub.
+    The data of the local cache are not included but MUST be either downloaded
+    or precomputed by using the DBpedia indexing utility (see 
+    "{stanbol}/entityhub/indexing/dbpedia")
+  </description>
+
+  <inceptionYear>2011</inceptionYear>
+
+  <scm>
+    <connection>
+      scm:svn:http://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/sites/dbpedia
+    </connection>
+    <developerConnection>
+      scm:svn:https://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/sites/dbpedia
+    </developerConnection>
+    <url>http://incubator.apache.org/stanbol/</url>
+  </scm>
+  <properties>
+    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+  </properties>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.felix</groupId>
+        <artifactId>maven-bundle-plugin</artifactId>
+        <version>2.0.1</version>
+        <inherited>true</inherited>
+        <extensions>true</extensions>
+        <configuration>
+          <instructions>
+            <Bundle-Category>Stanbol Data</Bundle-Category>
+            <Bundle-DocURL>http://incubator.apache.org/stanbol</Bundle-DocURL>
+            <Bundle-Vendor>Apache Stanbol (Incubating)</Bundle-Vendor>
+            <Bundle-SymbolicName>${project.artifactId}</Bundle-SymbolicName>
+            <Install-Path>
+              org/apache/stanbol/data
+            </Install-Path>
+            <_versionpolicy>$${version;===;${@}}</_versionpolicy>
+            <Export-Package>
+              org.apache.stanbol.data.site.dbpedia.*;version="${pom.version}"
+            </Export-Package>
+          </instructions>
+        </configuration>
+      </plugin>
+    </plugins>
+  </build>
+
+</project>

Propchange: incubator/stanbol/trunk/data/sites/dbpedia/pom.xml
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/dbPedia.solrindex.ref
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/dbPedia.solrindex.ref?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/dbPedia.solrindex.ref (added)
+++ incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/dbPedia.solrindex.ref Mon May 16 10:34:52 2011
@@ -0,0 +1,6 @@
+Name=DBpedia
+Description=DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data. We hope this will make it easier for the amazing amount of information in Wikipedia to be used in new and interesting ways, and that it might inspire new mechanisms for navigating, linking and improving the encyclopaedia itself.
+License=Creative Commons Attribution-ShareAlike 3.0
+License-Url=http://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License
+Homepage=http://dbpedia.org/
+Index-Archive=dbPedia.solrindex.zip
\ No newline at end of file

Added: incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.core.site.CacheImpl-DBpedia.config
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.core.site.CacheImpl-DBpedia.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.core.site.CacheImpl-DBpedia.config (added)
+++ incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.core.site.CacheImpl-DBpedia.config Mon May 16 10:34:52 2011
@@ -0,0 +1,2 @@
+org.apache.stanbol.entityhub.yard.cacheYardId="dbPediaIndex"
+org.apache.stanbol.entityhub.yard.cache.additionalMappings=["|\ @\=null;en;de;fr;it;es","*"]

Added: incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.site.referencedSite-DBpedia.config
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.site.referencedSite-DBpedia.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.site.referencedSite-DBpedia.config (added)
+++ incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.site.referencedSite-DBpedia.config Mon May 16 10:34:52 2011
@@ -0,0 +1,18 @@
+org.apache.stanbol.entityhub.site.accessUri="http://dbpedia.org/sparql/"
+org.apache.stanbol.entityhub.site.queryUri="http://dbpedia.org/sparql"
+org.apache.stanbol.entityhub.site.defaultExpireDuration="0"
+org.apache.stanbol.entityhub.site.id="dbpedia"
+org.apache.stanbol.entityhub.site.defaultMappedEntityState="proposed"
+org.apache.stanbol.entityhub.site.name="DB\ Pedia"
+org.apache.stanbol.entityhub.site.cacheId="dbPediaIndex"
+org.apache.stanbol.entityhub.site.dereferencerType="org.apache.stanbol.entityhub.site.SparqlDereferencer"
+org.apache.stanbol.entityhub.site.description="The\ LOD\ Endpoint\ for\ Wikipedia"
+org.apache.stanbol.entityhub.site.cacheStrategy="all"
+org.apache.stanbol.entityhub.site.fieldMappings=["dbp-ont:*","dbp-ont:thumbnail\ |\ d\=xsd:anyURI\ >\ foaf:depiction","dbp-prop:latitude\ |\ d\=xsd:decimal\ >\ geo:lat","dbp-prop:longitude\ |\ d\=xsd:decimal\ >\ geo:long","dbp-prop:population\ |\ d\=xsd:integer","dbp-prop:website\ |\ d\=xsd:anyURI\ >\ foaf:homepage"]
+org.apache.stanbol.entityhub.site.entityPrefix=["http://dbpedia.org/resource/","http://dbpedia.org/ontology/"]
+org.apache.stanbol.entityhub.site.searcherType="org.apache.stanbol.entityhub.site.VirtuosoSearcher"
+org.apache.stanbol.entityhub.site.defaultSymbolState="proposed"
+org.apache.stanbol.entityhub.site.licenseName=["Creative\ Commons\ Attribution-ShareAlike\ 3.0","GNU Free Documentation License"]
+org.apache.stanbol.entityhub.site.licenseUrl=["http://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License","http://en.wikipedia.org/wiki/Wikipedia:Text_of_the_GNU_Free_Documentation_License"]
+org.apache.stanbol.entityhub.site.attributionUrl="http://wiki.dbpedia.org/About"
+org.apache.stanbol.entityhub.site.attribution="DBpedia"

Added: incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-DBpediaIndex.config
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-DBpediaIndex.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-DBpediaIndex.config (added)
+++ incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-DBpediaIndex.config Mon May 16 10:34:52 2011
@@ -0,0 +1,9 @@
+org.apache.stanbol.entityhub.yard.maxQueryResultNumber=I"-1"
+org.apache.stanbol.entityhub.yard.defaultQueryResultNumber=I"-1"
+org.apache.stanbol.entityhub.yard.description="The\ cache\ for\ DBpedia\ (using\ the\ default\ cache\ index\ and\ multi\ yard\ layout)"
+org.apache.stanbol.entityhub.yard.solr.maxBooleanClauses=I"1024"
+org.apache.stanbol.entityhub.yard.solr.solrUri="dbPedia"
+org.apache.stanbol.entityhub.yard.solr.multiYardIndexLayout="false"
+org.apache.stanbol.entityhub.yard.solr.allowDefaultConfig="false"
+org.apache.stanbol.entityhub.yard.id="dbPediaIndex"
+org.apache.stanbol.entityhub.yard.name="dbPedia\ Cache"

Added: incubator/stanbol/trunk/data/sites/geonames/README.md
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/geonames/README.md?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/geonames/README.md (added)
+++ incubator/stanbol/trunk/data/sites/geonames/README.md Mon May 16 10:34:52 2011
@@ -0,0 +1,52 @@
+# geonames.org with local index for the Apache Stanbol Entityhub
+
+This builds a bundle that can be installed to add the [geonames.org](http://geonames.org/)
+data set as a ReferencedSite to the Apache Stanbol Entityhub.
+
+The binary data for the local cache are not included but need to be
+downloaded (TODO: add download location as soon as available) or built locally
+by using the geonames.org indexing utility.
+
+
+## Installation
+
+First build the bundle by calling
+
+    mvn install
+
+If the command succeeds, the bundle is available in the target folder
+    
+    target/org.apache.stanbol.data.sites.geonames-.*.jar
+
+This bundle can now be installed to a running Stanbol instance e.g. by using
+the Apache Felix Webconsole.
+
+NOTE: This step requires the Sling Installer Framework as well as the
+Stanbol BundleInstaller extension to be active. Both are typically included
+within the Stanbol Launcher.
+
+After installing and starting this bundle, the Stanbol Data File Provider (a
+tab within the Apache Felix Webconsole) will show a request for the binary
+file for the local index.
+
+To finalise the installation you need to copy the requested file to the
+directory used by the Stanbol Data File Provider
+
+    sling/datafiles/
+    
+and then restart the SolrYard instance with the name
+    
+    geonamesIndex
+    
+ 
+## Building the geonames.org index
+
+To build a local index for geonames.org the Apache Stanbol Entityhub provides its own
+utility. The module is located at
+
+    {stanbol}/entityhub/indexing/geonames
+
+Detailed documentation on how to use this utility is provided in the
+README file.
+
+

Added: incubator/stanbol/trunk/data/sites/geonames/pom.xml
URL: 
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/geonames/pom.xml?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/geonames/pom.xml (added)
+++ incubator/stanbol/trunk/data/sites/geonames/pom.xml Mon May 16 10:34:52 2011
@@ -0,0 +1,71 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+  <modelVersion>4.0.0</modelVersion>
+
+  <groupId>org.apache.stanbol</groupId>
+  <artifactId>org.apache.stanbol.data.sites.geonames</artifactId>
+  <!-- Fixed version number to avoid reuploading / redownloading large data jars
+    when not necessary.
+    We need to find a way to better distribute this and clarify the potential
+    legal distribution issues and the impact on the ASF release process.
+
+    See also:
+    https://issues.apache.org/jira/browse/OPENNLP-68
+  -->
+  <version>0.0.1</version>
+  <packaging>bundle</packaging>
+
+  <name>Apache Stanbol Data: geonames.org</name>
+  <description>
+    This bundle installs geonames.org as Referenced Site with a full local 
+    cache to the Apache Stanbol Entityhub.
+    The data of the local cache are not included but MUST be either downloaded
+    or precomputed by using the geonames.org indexing utility (see 
+    "{stanbol}/entityhub/indexing/geonames")
+  </description>
+
+  <inceptionYear>2011</inceptionYear>
+
+  <scm>
+    <connection>
+      scm:svn:http://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/sites/geonames
+    </connection>
+    <developerConnection>
+      scm:svn:http://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/sites/geonames
+    </developerConnection>
+    <url>http://incubator.apache.org/stanbol/</url>
+  </scm>
+  <properties>
+    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+  </properties>
+
+  <build>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.felix</groupId>
+        <artifactId>maven-bundle-plugin</artifactId>
+        <version>2.0.1</version>
+        <inherited>true</inherited>
+        <extensions>true</extensions>
+        <configuration>
+          <instructions>
+            <Bundle-Category>Stanbol Data</Bundle-Category>
+            <Bundle-DocURL>http://incubator.apache.org/stanbol</Bundle-DocURL>
+            <Bundle-Vendor>Apache Stanbol (Incubating)</Bundle-Vendor>
+            <Bundle-SymbolicName>${project.artifactId}</Bundle-SymbolicName>
+            <Install-Path>
+              org/apache/stanbol/data
+            </Install-Path>
+            <_versionpolicy>$${version;===;${@}}</_versionpolicy>
+            <Export-Package>
+              org.apache.stanbol.data.site.geonames.*;version="${pom.version}"
+            </Export-Package>
+          </instructions>
+        </configuration>
+      </plugin>
+    </plugins>
+  </build>
+
+</project>

Propchange: incubator/stanbol/trunk/data/sites/geonames/pom.xml
------------------------------------------------------------------------------
    svn:mime-type = text/plain

Added: 
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/geonames.solrindex.ref
URL: 
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/geonames.solrindex.ref?rev=1103676&view=auto
==============================================================================
--- 
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/geonames.solrindex.ref
 (added)
+++ 
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/geonames.solrindex.ref
 Mon May 16 10:34:52 2011
@@ -0,0 +1,6 @@
+Name=geonames.org
+Description=The GeoNames geographical database is available for download free of charge under a creative commons attribution license. It contains over 10 million geographical names and consists of 7.5 million unique features whereof 2.8 million populated places and 5.5 million alternate names. All features are categorized into one out of nine feature classes and further subcategorized into one out of 645 feature codes.
+License=Creative Commons Attribution 3.0 License
+License-Url=http://creativecommons.org/licenses/by/3.0/
+Homepage=http://www.geonames.org/
+Index-Archive=geonames.solrindex.zip
\ No newline at end of file
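
Editorial note: the `.solrindex.ref` file above is a small set of `Key=Value` metadata lines. The following is a minimal sketch of how such a file could be read, assuming one key per line and no escaping rules. The function name `parse_index_reference` and the required-key check are hypothetical and are not part of the Stanbol code base.

```python
# Hypothetical reader for the key=value lines of a *.solrindex.ref file.
# Assumes one entry per line; this is NOT the parser used by Stanbol.

REQUIRED_KEYS = ("Name", "Index-Archive")  # assumption made for this sketch


def parse_index_reference(text):
    """Return the key/value pairs of a solrindex.ref style document."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("missing '=' in line: " + line)
        entries[key.strip()] = value.strip()
    missing = [key for key in REQUIRED_KEYS if key not in entries]
    if missing:
        raise ValueError("missing keys: " + ", ".join(missing))
    return entries


SAMPLE_REF = """\
Name=geonames.org
License-Url=http://creativecommons.org/licenses/by/3.0/
Homepage=http://www.geonames.org/
Index-Archive=geonames.solrindex.zip
"""

if __name__ == "__main__":
    reference = parse_index_reference(SAMPLE_REF)
    print("archive to copy into the data file directory:", reference["Index-Archive"])
```

Splitting on the first `=` only keeps values that themselves contain `=` (for example URLs with query strings) intact.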

Added: 
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.core.site.CacheImpl-geonames.config
URL: 
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.core.site.CacheImpl-geonames.config?rev=1103676&view=auto
==============================================================================
--- 
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.core.site.CacheImpl-geonames.config
 (added)
+++ 
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.core.site.CacheImpl-geonames.config
 Mon May 16 10:34:52 2011
@@ -0,0 +1,2 @@
+org.apache.stanbol.entityhub.yard.cacheYardId="geonamesIndex"
+org.apache.stanbol.entityhub.yard.cache.additionalMappings=[""]

Added: 
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.site.referencedSite-geonames.config
URL: 
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.site.referencedSite-geonames.config?rev=1103676&view=auto
==============================================================================
--- 
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.site.referencedSite-geonames.config
 (added)
+++ 
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.site.referencedSite-geonames.config
 Mon May 16 10:34:52 2011
@@ -0,0 +1,14 @@
+org.apache.stanbol.entityhub.site.searcherType=""
+org.apache.stanbol.entityhub.site.name="geonames.org"
+org.apache.stanbol.entityhub.site.dereferencerType="org.apache.stanbol.entityhub.site.CoolUriDereferencer"
+org.apache.stanbol.entityhub.site.description="The\ GeoNames\ geographical\ 
database\ covers\ all\ countries\ and\ contains\ over\ eight\ million\ 
placenames\ that\ are\ available\ for\ download\ free\ of\ charge."
+org.apache.stanbol.entityhub.site.entityPrefix=["http://sws.geonames.org/"]
+org.apache.stanbol.entityhub.site.cacheId="geonamesIndex"
+org.apache.stanbol.entityhub.site.defaultMappedEntityState="proposed"
+org.apache.stanbol.entityhub.site.accessUri="http://sws.geonames.org/"
+org.apache.stanbol.entityhub.site.fieldMappings=["geonames:*","geonames:name\ 
>\ rick:label"]
+org.apache.stanbol.entityhub.site.cacheStrategy="all"
+org.apache.stanbol.entityhub.site.id="geonames"
+org.apache.stanbol.entityhub.site.queryUri=""
+org.apache.stanbol.entityhub.site.defaultSymbolState="proposed"
+org.apache.stanbol.entityhub.site.defaultExpireDuration="0"

Added: 
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-geonamesIndex.config
URL: 
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-geonamesIndex.config?rev=1103676&view=auto
==============================================================================
--- 
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-geonamesIndex.config
 (added)
+++ 
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-geonamesIndex.config
 Mon May 16 10:34:52 2011
@@ -0,0 +1,9 @@
+org.apache.stanbol.entityhub.yard.description="The\ full\ cache\ for\ 
geonames.org"
+org.apache.stanbol.entityhub.yard.solr.solrUri="geonames"
+org.apache.stanbol.entityhub.yard.defaultQueryResultNumber=I"-1"
+org.apache.stanbol.entityhub.yard.solr.multiYardIndexLayout="false"
+org.apache.stanbol.entityhub.yard.solr.allowDefaultConfig="false"
+org.apache.stanbol.entityhub.yard.id="geonamesIndex"
+org.apache.stanbol.entityhub.yard.name="geonames.org\ Cache"
+org.apache.stanbol.entityhub.yard.maxQueryResultNumber=I"-1"
+org.apache.stanbol.entityhub.yard.solr.maxBooleanClauses=I"1024"
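
Editorial note: the `.config` files in this commit use three value shapes: quoted strings with backslash-escaped spaces, integers written as `I"..."`, and bracketed arrays of quoted strings. The following is a simplified sketch of reading exactly those shapes, for example to check a configuration before installing the bundle. The helper names are hypothetical, other type prefixes and multi-line values are not handled, and this is not the parser used by Apache Felix.

```python
import re

# Simplified, hypothetical reader for the Felix-style .config lines shown above.
# Only plain strings, I"..." integers and arrays of strings are handled.

_QUOTED = re.compile(r'"((?:[^"\\]|\\.)*)"')


def unescape(text):
    """Remove the backslash in front of each escaped character."""
    return re.sub(r"\\(.)", r"\1", text)


def parse_value(raw):
    raw = raw.strip()
    if raw.startswith("[") and raw.endswith("]"):
        # array of quoted strings, e.g. ["a","b"]
        return [unescape(item) for item in _QUOTED.findall(raw)]
    if raw.startswith('I"') and raw.endswith('"'):
        # integer value, e.g. I"1024"
        return int(raw[2:-1])
    if raw.startswith('"') and raw.endswith('"'):
        # plain string value
        return unescape(raw[1:-1])
    raise ValueError("unsupported value: " + raw)


def parse_config(text):
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, raw = line.partition("=")
        if not sep:
            raise ValueError("missing '=' in line: " + line)
        config[key.strip()] = parse_value(raw)
    return config


SAMPLE = r'''
org.apache.stanbol.entityhub.yard.id="geonamesIndex"
org.apache.stanbol.entityhub.yard.name="geonames.org\ Cache"
org.apache.stanbol.entityhub.yard.maxQueryResultNumber=I"-1"
org.apache.stanbol.entityhub.yard.cache.additionalMappings=[""]
'''

if __name__ == "__main__":
    for key, value in parse_config(SAMPLE).items():
        print(key, "->", repr(value))
```

Each logical entry must be on a single line for this sketch to work, so values that the mail archive wrapped across lines would need to be rejoined first.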

