Author: rwesten
Date: Mon May 16 10:34:52 2011
New Revision: 1103676
URL: http://svn.apache.org/viewvc?rev=1103676&view=rev
Log:
STANBOL-92: Adds support to use DBpedia, geonames.org and DBLP as Referenced
Sites for the Entityhub.
The /data/sites folder will include bundles of popular sites that can be used
out of the box with the Stanbol Entityhub.
See the README.md for more information.
Added:
incubator/stanbol/trunk/data/
incubator/stanbol/trunk/data/README.md
incubator/stanbol/trunk/data/pom.xml (with props)
incubator/stanbol/trunk/data/sites/
incubator/stanbol/trunk/data/sites/dblp/
incubator/stanbol/trunk/data/sites/dblp/README.md
incubator/stanbol/trunk/data/sites/dblp/pom.xml (with props)
incubator/stanbol/trunk/data/sites/dblp/src/
incubator/stanbol/trunk/data/sites/dblp/src/main/
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/dblp.solrindex.ref
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.core.site.CacheImpl-dblp.config
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.site.referencedSite-dblp.config
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dblpIndex.config
incubator/stanbol/trunk/data/sites/dbpedia/
incubator/stanbol/trunk/data/sites/dbpedia/README.md
incubator/stanbol/trunk/data/sites/dbpedia/pom.xml (with props)
incubator/stanbol/trunk/data/sites/dbpedia/src/
incubator/stanbol/trunk/data/sites/dbpedia/src/main/
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/dbPedia.solrindex.ref
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.core.site.CacheImpl-DBpedia.config
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.site.referencedSite-DBpedia.config
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-DBpediaIndex.config
incubator/stanbol/trunk/data/sites/geonames/
incubator/stanbol/trunk/data/sites/geonames/README.md
incubator/stanbol/trunk/data/sites/geonames/pom.xml (with props)
incubator/stanbol/trunk/data/sites/geonames/src/
incubator/stanbol/trunk/data/sites/geonames/src/main/
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/geonames.solrindex.ref
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.core.site.CacheImpl-geonames.config
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.site.referencedSite-geonames.config
incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-geonamesIndex.config
Added: incubator/stanbol/trunk/data/README.md
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/README.md?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/README.md (added)
+++ incubator/stanbol/trunk/data/README.md Mon May 16 10:34:52 2011
@@ -0,0 +1,28 @@
+# Data files for optional extensions of the Stanbol distributions
+
+This source repository holds the pom.xml file and folder structure to build
+optional packages for Apache Stanbol.
+
+To avoid loading the Subversion repository with large binary files, these
+artifacts are typically not included but need to be built/precomputed or
+downloaded from other sites.
+See the documentation of the corresponding module for details.
+
+## DataFileProvider Service
+
+The DataFileProvider Service is typically used by components that need to load
+big binary files into Apache Stanbol.
+See {stanbol-root}/commons/stanboltools/datafileprovider for details.
+
+## Bundleprovider
+
+The Bundleprovider is an extension to the Apache Sling installer framework
+and supports loading multiple configuration files from a single bundle.
+
+It is intended to be used in cases where a single Stanbol module needs to
+package several configuration files (e.g. the configuration of several OSGi
+Services).
+
+See {stanbol-root}/commons/installer/bundleprovider for details.
+
+
Added: incubator/stanbol/trunk/data/pom.xml
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/pom.xml?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/pom.xml (added)
+++ incubator/stanbol/trunk/data/pom.xml Mon May 16 10:34:52 2011
@@ -0,0 +1,24 @@
+<?xml version="1.0"?>
+<project>
+ <modelVersion>4.0.0</modelVersion>
+
+ <groupId>org.apache.stanbol</groupId>
+ <artifactId>org.apache.stanbol.data.reactor</artifactId>
+ <version>0.9-SNAPSHOT</version>
+ <packaging>pom</packaging>
+
+ <name>Apache Stanbol Data Reactor</name>
+ <scm>
+ <connection>
+ scm:svn:http://svn.apache.org/repos/asf/incubator/stanbol/trunk/data
+ </connection>
+ <developerConnection>
+ scm:svn:https://svn.apache.org/repos/asf/incubator/stanbol/trunk/data
+ </developerConnection>
+ <url>http://incubator.apache.org/stanbol/</url>
+ </scm>
+
+ <modules>
+ <module>sites</module>
+ </modules>
+</project>
Propchange: incubator/stanbol/trunk/data/pom.xml
------------------------------------------------------------------------------
svn:mime-type = text/plain
Added: incubator/stanbol/trunk/data/sites/dblp/README.md
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dblp/README.md?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dblp/README.md (added)
+++ incubator/stanbol/trunk/data/sites/dblp/README.md Mon May 16 10:34:52 2011
@@ -0,0 +1,54 @@
+# DBLP with local index for the Apache Stanbol Entityhub
+
+This builds a bundle that can be installed to add the [DBLP](http://dblp.uni-trier.de/)
+data set as a ReferencedSite to the Apache Stanbol Entityhub.
+
+The binary data for the local cache are not included but need to be
+downloaded (TODO: add download location as soon as available) or built locally
+by using the DBLP indexing utility.
+
+PLEASE NOTE that the DBLP dataset does not provide any License information.
+
+
+## Installation
+
+First build the bundle by calling
+
+ mvn install
+
+If the command succeeds, the bundle is available in the target folder
+
+ target/org.apache.stanbol.data.sites.dblp-.*.jar
+
+This bundle can now be installed to a running Stanbol instance e.g. by using
+the Apache Felix Webconsole.
+
+NOTE: This step requires the Sling Installer Framework as well as the
+Stanbol BundleInstaller extension to be active. Both are typically included
+within the Stanbol Launcher.
+
+After installing and starting this Bundle the Stanbol Data File Provider (a
+tab within the Apache Felix Webconsole) will show a request for the binary
+file for the local index.
+
+To finalise the installation you need to copy the requested file to the
+directory used by the Stanbol Data File Provider
+
+ sling/datafiles/
+
+and then restart the SolrYard instance with the name
+
+ dblpIndex
+
+
+## Building the DBLP index
+
+To build a local index for DBLP, the Apache Stanbol Entityhub provides its own
+utility. The module is located at
+
+ {stanbol}/entityhub/indexing/dblp
+
+Detailed documentation on how to use this utility is provided in that
+module's README file.
+
+
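The last installation step in the README above (copying the requested index file into the Data File Provider directory) can be sketched as follows. This is only an illustration and not part of the commit: `install_index_archive` and its `stanbol_home` parameter are hypothetical names, and the sketch assumes that `sling/datafiles/` is resolved relative to the directory from which the Stanbol launcher was started. The archive name `dblp.solrindex.zip` is the `Index-Archive` value from `dblp.solrindex.ref` in this revision.

```python
import shutil
import tempfile
from pathlib import Path


def install_index_archive(archive: Path, stanbol_home: Path) -> Path:
    """Copy a prebuilt index archive into <stanbol_home>/sling/datafiles/.

    Assumption: the Data File Provider looks for requested files in this
    directory, as described in the README; the function name is hypothetical.
    """
    datafiles = stanbol_home / "sling" / "datafiles"
    datafiles.mkdir(parents=True, exist_ok=True)
    target = datafiles / archive.name
    shutil.copy2(archive, target)
    return target


if __name__ == "__main__":
    # Demo with throwaway directories instead of a real Stanbol installation.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        archive = root / "dblp.solrindex.zip"
        archive.write_bytes(b"placeholder for the real index archive")
        print(install_index_archive(archive, root / "stanbol"))
```

After the copy, the SolrYard still has to be restarted as the README describes; the sketch does not attempt that.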
Added: incubator/stanbol/trunk/data/sites/dblp/pom.xml
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dblp/pom.xml?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dblp/pom.xml (added)
+++ incubator/stanbol/trunk/data/sites/dblp/pom.xml Mon May 16 10:34:52 2011
@@ -0,0 +1,71 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+ <modelVersion>4.0.0</modelVersion>
+
+ <groupId>org.apache.stanbol</groupId>
+ <artifactId>org.apache.stanbol.data.sites.dblp</artifactId>
+ <!-- Fixed version number to avoid reuploading / redownloading large data jars
+ when not necessary
+ We need to find a way to better distribute this and clarify the potential
+ legal distribution issues and the impact on the ASF release process.
+
+ See also:
+ https://issues.apache.org/jira/browse/OPENNLP-68
+ -->
+ <version>0.0.1</version>
+ <packaging>bundle</packaging>
+
+ <name>Apache Stanbol Data: DBLP</name>
+ <description>
+ This bundle installs DBLP as Referenced Site with a full local cache to
+ the Apache Stanbol Entityhub.
+ The data of the local cache are not included but MUST be either downloaded
+ or precomputed by using the DBLP indexing utility (see
+ "{stanbol}/entityhub/indexing/dblp")
+ </description>
+
+ <inceptionYear>2011</inceptionYear>
+
+ <scm>
+ <connection>
+ scm:svn:http://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/sites/dblp
+ </connection>
+ <developerConnection>
+ scm:svn:https://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/sites/dblp
+ </developerConnection>
+ <url>http://incubator.apache.org/stanbol/</url>
+ </scm>
+ <properties>
+ <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+ </properties>
+
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.felix</groupId>
+ <artifactId>maven-bundle-plugin</artifactId>
+ <version>2.3.4</version>
+ <inherited>true</inherited>
+ <extensions>true</extensions>
+ <configuration>
+ <instructions>
+ <Bundle-Category>Stanbol Data</Bundle-Category>
+ <Bundle-DocURL>http://incubator.apache.org/stanbol</Bundle-DocURL>
+ <Bundle-Vendor>Apache Stanbol (Incubating)</Bundle-Vendor>
+ <Bundle-SymbolicName>${project.artifactId}</Bundle-SymbolicName>
+ <_versionpolicy>$${version;===;${@}}</_versionpolicy>
+ <Export-Package>
+ org.apache.stanbol.data.site.dblp.*;version="${pom.version}"
+ </Export-Package>
+ <Install-Path>
+ org/apache/stanbol/data
+ </Install-Path>
+ </instructions>
+ </configuration>
+ </plugin>
+ </plugins>
+ </build>
+
+</project>
Propchange: incubator/stanbol/trunk/data/sites/dblp/pom.xml
------------------------------------------------------------------------------
svn:mime-type = text/plain
Added:
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/dblp.solrindex.ref
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/dblp.solrindex.ref?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/dblp.solrindex.ref (added)
+++ incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/dblp.solrindex.ref Mon May 16 10:34:52 2011
@@ -0,0 +1,5 @@
+Name=DBLP Computer Science Bibliography
+Description=DBLP provides bibliographic information on major computer science journals and proceedings. DBLP indexes more than one million articles and contains more than 10000 links to home pages of computer scientists.
+License=No License Information available
+Homepage=http://dblp.uni-trier.de/
+Index-Archive=dblp.solrindex.zip
\ No newline at end of file
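The `.solrindex.ref` file above is a short list of `key=value` entries, and `Index-Archive` names the archive that has to be supplied separately. A minimal sketch of reading such a file is given below. It is an illustration only: `parse_solrindex_ref` is a hypothetical helper, and whether Stanbol itself reads the file with `java.util.Properties` semantics (escapes, line continuations) is not stated in this commit, so none of that is handled here.

```python
from typing import Dict


def parse_solrindex_ref(text: str) -> Dict[str, str]:
    """Parse simple key=value lines; blank lines and '#' comments are skipped.

    Assumption: one entry per line, split at the first '='.
    """
    entries: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            entries[key.strip()] = value.strip()
    return entries


# Sample content copied from dblp.solrindex.ref (Description omitted for brevity).
SAMPLE_REF = """Name=DBLP Computer Science Bibliography
License=No License Information available
Homepage=http://dblp.uni-trier.de/
Index-Archive=dblp.solrindex.zip"""

if __name__ == "__main__":
    ref = parse_solrindex_ref(SAMPLE_REF)
    print("archive to provide:", ref["Index-Archive"])
```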
Added:
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.core.site.CacheImpl-dblp.config
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.core.site.CacheImpl-dblp.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.core.site.CacheImpl-dblp.config (added)
+++ incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.core.site.CacheImpl-dblp.config Mon May 16 10:34:52 2011
@@ -0,0 +1,2 @@
+org.apache.stanbol.entityhub.yard.cacheYardId="dblpIndex"
+org.apache.stanbol.entityhub.yard.cache.additionalMappings=[""]
Added:
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.site.referencedSite-dblp.config
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.site.referencedSite-dblp.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.site.referencedSite-dblp.config (added)
+++ incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.site.referencedSite-dblp.config Mon May 16 10:34:52 2011
@@ -0,0 +1,14 @@
+org.apache.stanbol.entityhub.site.searcherType="org.apache.stanbol.entityhub.site.SparqlSearcher"
+org.apache.stanbol.entityhub.site.cacheId="dblpIndex"
+org.apache.stanbol.entityhub.site.entityPrefix=["http://dblp.l3s.de/d2r/resource/"]
+org.apache.stanbol.entityhub.site.dereferencerType="org.apache.stanbol.entityhub.site.CoolUriDereferencer"
+org.apache.stanbol.entityhub.site.name="DBLP"
+org.apache.stanbol.entityhub.site.fieldMappings=["swrc:*","rdfs:seeAlso\ |\ d\=entityhub:ref"]
+org.apache.stanbol.entityhub.site.accessUri="http://dblp.l3s.de/d2r/resource/"
+org.apache.stanbol.entityhub.site.cacheStrategy="all"
+org.apache.stanbol.entityhub.site.defaultExpireDuration="0"
+org.apache.stanbol.entityhub.site.queryUri="http://dblp.l3s.de/d2r/sparql"
+org.apache.stanbol.entityhub.site.id="dblp"
+org.apache.stanbol.entityhub.site.defaultMappedEntityState="proposed"
+org.apache.stanbol.entityhub.site.description="The\ DBLP\ Computer\ Science\ Bibliography"
+org.apache.stanbol.entityhub.site.defaultSymbolState="proposed"
Added:
incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dblpIndex.config
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dblpIndex.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dblpIndex.config (added)
+++ incubator/stanbol/trunk/data/sites/dblp/src/main/resources/org/apache/stanbol/data/site/dblp/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-dblpIndex.config Mon May 16 10:34:52 2011
@@ -0,0 +1,9 @@
+org.apache.stanbol.entityhub.yard.maxQueryResultNumber=I"-1"
+org.apache.stanbol.entityhub.yard.defaultQueryResultNumber=I"-1"
+org.apache.stanbol.entityhub.yard.description="Yard\ holding\ the\ local\ copy\ of\ the\ DBLP\ dataset\ (see\ http://dblp.uni-trier.de/)"
+org.apache.stanbol.entityhub.yard.solr.maxBooleanClauses=I"1024"
+org.apache.stanbol.entityhub.yard.solr.solrUri="dblp"
+org.apache.stanbol.entityhub.yard.solr.multiYardIndexLayout="false"
+org.apache.stanbol.entityhub.yard.solr.allowDefaultConfig="false"
+org.apache.stanbol.entityhub.yard.id="dblpIndex"
+org.apache.stanbol.entityhub.yard.name="DBLP\ Yard"
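The three DBLP configuration files above appear to be tied together by one identifier: `cacheYardId` in the CacheImpl config, `cacheId` in the referencedSite config and `yard.id` in the SolrYard config all carry the value `dblpIndex`. The sketch below checks that agreement. It is an assumption-laden illustration rather than Stanbol code: `read_scalar_props` and `ids_agree` are hypothetical helpers, and the regular expression only covers scalar entries such as `key="value"` or `key=I"-1"`, skipping array values.

```python
import re
from typing import Dict

# Scalar entries only: key="value" or key=<type letter>"value" (e.g. I"-1").
SCALAR = re.compile(r'^([\w.]+)=[A-Za-z]?"(.*)"$')


def read_scalar_props(text: str) -> Dict[str, str]:
    """Collect scalar entries from a config file body; other lines are ignored."""
    props: Dict[str, str] = {}
    for raw in text.splitlines():
        match = SCALAR.match(raw.strip())
        if match:
            props[match.group(1)] = match.group(2)
    return props


def ids_agree(cache_cfg: str, site_cfg: str, yard_cfg: str) -> bool:
    """True if cache, referenced site and yard all name the same identifier."""
    cache = read_scalar_props(cache_cfg).get("org.apache.stanbol.entityhub.yard.cacheYardId")
    site = read_scalar_props(site_cfg).get("org.apache.stanbol.entityhub.site.cacheId")
    yard = read_scalar_props(yard_cfg).get("org.apache.stanbol.entityhub.yard.id")
    return cache is not None and cache == site == yard


# Excerpts copied from the three DBLP config files in this revision.
CACHE_CFG = '''org.apache.stanbol.entityhub.yard.cacheYardId="dblpIndex"
org.apache.stanbol.entityhub.yard.cache.additionalMappings=[""]'''
SITE_CFG = '''org.apache.stanbol.entityhub.site.cacheId="dblpIndex"
org.apache.stanbol.entityhub.site.id="dblp"'''
YARD_CFG = '''org.apache.stanbol.entityhub.yard.maxQueryResultNumber=I"-1"
org.apache.stanbol.entityhub.yard.id="dblpIndex"'''

if __name__ == "__main__":
    print("identifiers consistent:", ids_agree(CACHE_CFG, SITE_CFG, YARD_CFG))
```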
Added: incubator/stanbol/trunk/data/sites/dbpedia/README.md
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dbpedia/README.md?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dbpedia/README.md (added)
+++ incubator/stanbol/trunk/data/sites/dbpedia/README.md Mon May 16 10:34:52 2011
@@ -0,0 +1,55 @@
+# DBpedia.org with local index for the Apache Stanbol Entityhub
+
+This builds a bundle that can be installed to add DBpedia.org as a
+ReferencedSite to the Apache Stanbol Entityhub.
+
+It will override the "dbpedia" referenced site included in the default
+configuration of the "full" launcher of Apache Stanbol.
+
+The binary data for the local cache are not included but need to be
+downloaded (TODO: add download location as soon as available) or built locally
+by using the DBpedia.org indexing utility.
+
+
+## Installation
+
+First build the bundle by calling
+
+ mvn install
+
+If the command succeeds, the bundle is available in the target folder
+
+ target/org.apache.stanbol.data.sites.dbpedia-.*.jar
+
+This bundle can now be installed to a running Stanbol instance e.g. by using
+the Apache Felix Webconsole.
+
+NOTE: This step requires the Sling Installer Framework as well as the
+Stanbol BundleInstaller extension to be active. Both are typically included
+within the Stanbol Launcher.
+
+After installing and starting this Bundle the Stanbol Data File Provider (a
+tab within the Apache Felix Webconsole) will show a request for the binary
+file for the local index.
+
+To finalise the installation you need to copy the requested file to the
+directory used by the Stanbol Data File Provider
+
+ sling/datafiles/
+
+and then restart the SolrYard instance with the name
+
+ dbPediaIndex
+
+
+## Building the DBpedia.org index
+
+To build a local index for DBpedia, the Apache Stanbol Entityhub provides its own
+utility. The module is located at
+
+ {stanbol}/entityhub/indexing/dbpedia
+
+Detailed documentation on how to use this utility is provided in that
+module's README file.
+
+
Added: incubator/stanbol/trunk/data/sites/dbpedia/pom.xml
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dbpedia/pom.xml?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dbpedia/pom.xml (added)
+++ incubator/stanbol/trunk/data/sites/dbpedia/pom.xml Mon May 16 10:34:52 2011
@@ -0,0 +1,71 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+ <modelVersion>4.0.0</modelVersion>
+
+ <groupId>org.apache.stanbol</groupId>
+ <artifactId>org.apache.stanbol.data.sites.dbpedia</artifactId>
+ <!-- Fixed version number to avoid reuploading / redownloading large data jars
+ when not necessary
+ We need to find a way to better distribute this and clarify the potential
+ legal distribution issues and the impact on the ASF release process.
+
+ See also:
+ https://issues.apache.org/jira/browse/OPENNLP-68
+ -->
+ <version>0.0.1</version>
+ <packaging>bundle</packaging>
+
+ <name>Apache Stanbol Data: DBpedia.org</name>
+ <description>
+ This bundle installs DBpedia as Referenced Site with a full local cache to
+ the Apache Stanbol Entityhub.
+ The data of the local cache are not included but MUST be either downloaded
+ or precomputed by using the DBpedia indexing utility (see
+ "{stanbol}/entityhub/indexing/dbpedia")
+ </description>
+
+ <inceptionYear>2011</inceptionYear>
+
+ <scm>
+ <connection>
+ scm:svn:http://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/sites/dbpedia
+ </connection>
+ <developerConnection>
+ scm:svn:https://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/sites/dbpedia
+ </developerConnection>
+ <url>http://incubator.apache.org/stanbol/</url>
+ </scm>
+ <properties>
+ <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+ </properties>
+
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.felix</groupId>
+ <artifactId>maven-bundle-plugin</artifactId>
+ <version>2.0.1</version>
+ <inherited>true</inherited>
+ <extensions>true</extensions>
+ <configuration>
+ <instructions>
+ <Bundle-Category>Stanbol Data</Bundle-Category>
+ <Bundle-DocURL>http://incubator.apache.org/stanbol</Bundle-DocURL>
+ <Bundle-Vendor>Apache Stanbol (Incubating)</Bundle-Vendor>
+ <Bundle-SymbolicName>${project.artifactId}</Bundle-SymbolicName>
+ <Install-Path>
+ org/apache/stanbol/data
+ </Install-Path>
+ <_versionpolicy>$${version;===;${@}}</_versionpolicy>
+ <Export-Package>
+ org.apache.stanbol.data.site.dbpedia.*;version="${pom.version}"
+ </Export-Package>
+ </instructions>
+ </configuration>
+ </plugin>
+ </plugins>
+ </build>
+
+</project>
Propchange: incubator/stanbol/trunk/data/sites/dbpedia/pom.xml
------------------------------------------------------------------------------
svn:mime-type = text/plain
Added:
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/dbPedia.solrindex.ref
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/dbPedia.solrindex.ref?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/dbPedia.solrindex.ref (added)
+++ incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/dbPedia.solrindex.ref Mon May 16 10:34:52 2011
@@ -0,0 +1,6 @@
+Name=DBpedia
+Description=DBpedia is a community effort to extract structured information from Wikipedia and to make this information available on the Web. DBpedia allows you to ask sophisticated queries against Wikipedia, and to link other data sets on the Web to Wikipedia data. We hope this will make it easier for the amazing amount of information in Wikipedia to be used in new and interesting ways, and that it might inspire new mechanisms for navigating, linking and improving the encyclopaedia itself.
+License=Creative Commons Attribution-ShareAlike 3.0
+License-Url=http://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License
+Homepage=http://dbpedia.org/
+Index-Archive=dbPedia.solrindex.zip
\ No newline at end of file
Added:
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.core.site.CacheImpl-DBpedia.config
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.core.site.CacheImpl-DBpedia.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.core.site.CacheImpl-DBpedia.config (added)
+++ incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.core.site.CacheImpl-DBpedia.config Mon May 16 10:34:52 2011
@@ -0,0 +1,2 @@
+org.apache.stanbol.entityhub.yard.cacheYardId="dbPediaIndex"
+org.apache.stanbol.entityhub.yard.cache.additionalMappings=["|\ @\=null;en;de;fr;it;es","*"]
Added:
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.site.referencedSite-DBpedia.config
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.site.referencedSite-DBpedia.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.site.referencedSite-DBpedia.config (added)
+++ incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.site.referencedSite-DBpedia.config Mon May 16 10:34:52 2011
@@ -0,0 +1,18 @@
+org.apache.stanbol.entityhub.site.accessUri="http://dbpedia.org/sparql/"
+org.apache.stanbol.entityhub.site.queryUri="http://dbpedia.org/sparql"
+org.apache.stanbol.entityhub.site.defaultExpireDuration="0"
+org.apache.stanbol.entityhub.site.id="dbpedia"
+org.apache.stanbol.entityhub.site.defaultMappedEntityState="proposed"
+org.apache.stanbol.entityhub.site.name="DB\ Pedia"
+org.apache.stanbol.entityhub.site.cacheId="dbPediaIndex"
+org.apache.stanbol.entityhub.site.dereferencerType="org.apache.stanbol.entityhub.site.SparqlDereferencer"
+org.apache.stanbol.entityhub.site.description="The\ OLD\ Endpoint\ for\ Wikipedia"
+org.apache.stanbol.entityhub.site.cacheStrategy="all"
+org.apache.stanbol.entityhub.site.fieldMappings=["dbp-ont:*","dbp-ont:thumbnail\ |\ d\=xsd:anyURI\ >\ foaf:depiction","dbp-prop:latitude\ |\ d\=xsd:decimal\ >\ geo:lat","dbp-prop:longitude\ |\ d\=xsd:decimal\ >\ geo:long","dbp-prop:population\ |\ d\=xsd:integer","dbp-prop:website\ |\ d\=xsd:anyURI\ >\ foaf:homepage"]
+org.apache.stanbol.entityhub.site.entityPrefix=["http://dbpedia.org/resource/","http://dbpedia.org/ontology/"]
+org.apache.stanbol.entityhub.site.searcherType="org.apache.stanbol.entityhub.site.VirtuosoSearcher"
+org.apache.stanbol.entityhub.site.defaultSymbolState="proposed"
+org.apache.stanbol.entityhub.site.licenseName=["Creative\ Commons\ Attribution-ShareAlike\ 3.0","GNU Free Documentation License"]
+org.apache.stanbol.entityhub.site.licenseUrl=["http://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License","http://en.wikipedia.org/wiki/Wikipedia:Text_of_the_GNU_Free_Documentation_License"]
+org.apache.stanbol.entityhub.site.attributionUrl="http://wiki.dbpedia.org/About"
+org.apache.stanbol.entityhub.site.attribution="DBpedia"
Added:
incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-DBpediaIndex.config
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-DBpediaIndex.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-DBpediaIndex.config (added)
+++ incubator/stanbol/trunk/data/sites/dbpedia/src/main/resources/org/apache/stanbol/data/site/dbpedia/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-DBpediaIndex.config Mon May 16 10:34:52 2011
@@ -0,0 +1,9 @@
+org.apache.stanbol.entityhub.yard.maxQueryResultNumber=I"-1"
+org.apache.stanbol.entityhub.yard.defaultQueryResultNumber=I"-1"
+org.apache.stanbol.entityhub.yard.description="The\ cache\ for\ DBpedia\ (using\ the\ default\ cache\ index\ and\ multi\ yard\ layout)"
+org.apache.stanbol.entityhub.yard.solr.maxBooleanClauses=I"1024"
+org.apache.stanbol.entityhub.yard.solr.solrUri="dbPedia"
+org.apache.stanbol.entityhub.yard.solr.multiYardIndexLayout="false"
+org.apache.stanbol.entityhub.yard.solr.allowDefaultConfig="false"
+org.apache.stanbol.entityhub.yard.id="dbPediaIndex"
+org.apache.stanbol.entityhub.yard.name="dbPedia\ Cache"
Added: incubator/stanbol/trunk/data/sites/geonames/README.md
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/geonames/README.md?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/geonames/README.md (added)
+++ incubator/stanbol/trunk/data/sites/geonames/README.md Mon May 16 10:34:52 2011
@@ -0,0 +1,52 @@
+# geonames.org with local index for the Apache Stanbol Entityhub
+
+This builds a bundle that can be installed to add the [geonames.org](http://geonames.org/)
+data set as a ReferencedSite to the Apache Stanbol Entityhub.
+
+The binary data for the local cache are not included but need to be
+downloaded (TODO: add download location as soon as available) or built locally
+by using the geonames.org indexing utility.
+
+
+## Installation
+
+First build the bundle by calling
+
+ mvn install
+
+If the command succeeds, the bundle is available in the target folder
+
+ target/org.apache.stanbol.data.sites.geonames-.*.jar
+
+This bundle can now be installed to a running Stanbol instance e.g. by using
+the Apache Felix Webconsole.
+
+NOTE: This step requires the Sling Installer Framework as well as the
+Stanbol BundleInstaller extension to be active. Both are typically included
+within the Stanbol Launcher.
+
+After installing and starting this Bundle the Stanbol Data File Provider (a
+tab within the Apache Felix Webconsole) will show a request for the binary
+file for the local index.
+
+To finalise the installation you need to copy the requested file to the
+directory used by the Stanbol Data File Provider
+
+ sling/datafiles/
+
+and then restart the SolrYard instance with the name
+
+ geonamesIndex
+
+
+## Building the geonames.org index
+
+To build a local index for geonames.org, the Apache Stanbol Entityhub provides
+its own utility. The module is located at
+
+ {stanbol}/entityhub/indexing/geonames
+
+Detailed documentation on how to use this utility is provided in that
+module's README file.
+
+
Added: incubator/stanbol/trunk/data/sites/geonames/pom.xml
URL:
http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/geonames/pom.xml?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/geonames/pom.xml (added)
+++ incubator/stanbol/trunk/data/sites/geonames/pom.xml Mon May 16 10:34:52 2011
@@ -0,0 +1,71 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+ <modelVersion>4.0.0</modelVersion>
+
+ <groupId>org.apache.stanbol</groupId>
+ <artifactId>org.apache.stanbol.data.sites.geonames</artifactId>
+ <!-- Fixed version number to avoid reuploading / redownloading large data jars
+ when not necessary
+ We need to find a way to better distribute this and clarify the potential
+ legal distribution issues and the impact on the ASF release process.
+
+ See also:
+ https://issues.apache.org/jira/browse/OPENNLP-68
+ -->
+ <version>0.0.1</version>
+ <packaging>bundle</packaging>
+
+ <name>Apache Stanbol Data: geonames.org</name>
+ <description>
+ This bundle installs geonames.org as Referenced Site with a full local
+ cache to the Apache Stanbol Entityhub.
+ The data of the local cache are not included but MUST be either downloaded
+ or precomputed by using the geonames.org indexing utility (see
+ "{stanbol}/entityhub/indexing/geonames")
+ </description>
+
+ <inceptionYear>2011</inceptionYear>
+
+ <scm>
+ <connection>
+ scm:svn:http://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/site/geonames
+ </connection>
+ <developerConnection>
+ scm:svn:http://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/site/geonames
+ </developerConnection>
+ <url>http://incubator.apache.org/stanbol/</url>
+ </scm>
+ <properties>
+ <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+ </properties>
+
+ <build>
+ <plugins>
+ <plugin>
+ <groupId>org.apache.felix</groupId>
+ <artifactId>maven-bundle-plugin</artifactId>
+ <version>2.0.1</version>
+ <inherited>true</inherited>
+ <extensions>true</extensions>
+ <configuration>
+ <instructions>
+ <Bundle-Category>Stanbol Data</Bundle-Category>
+ <Bundle-DocURL>http://incubator.apache.org/stanbol</Bundle-DocURL>
+ <Bundle-Vendor>Apache Stanbol (Incubating)</Bundle-Vendor>
+ <Bundle-SymbolicName>${project.artifactId}</Bundle-SymbolicName>
+ <Install-Path>
+ org/apache/stanbol/data
+ </Install-Path>
+ <_versionpolicy>$${version;===;${@}}</_versionpolicy>
+ <Export-Package>
+ org.apache.stanbol.data.site.geonames.*;version="${pom.version}"
+ </Export-Package>
+ </instructions>
+ </configuration>
+ </plugin>
+ </plugins>
+ </build>
+
+</project>
Propchange: incubator/stanbol/trunk/data/sites/geonames/pom.xml
------------------------------------------------------------------------------
svn:mime-type = text/plain
Added: incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/geonames.solrindex.ref
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/geonames.solrindex.ref?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/geonames.solrindex.ref (added)
+++ incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/geonames.solrindex.ref Mon May 16 10:34:52 2011
@@ -0,0 +1,6 @@
+Name=geonames.org
+Description=The GeoNames geographical database is available for download free of charge under a creative commons attribution license. It contains over 10 million geographical names and consists of 7.5 million unique features whereof 2.8 million populated places and 5.5 million alternate names. All features are categorized into one out of nine feature classes and further subcategorized into one out of 645 feature codes.
+License=Creative Commons Attribution 3.0 License
+License-Url=http://creativecommons.org/licenses/by/3.0/
+Homepage=http://www.geonames.org/
+Index-Archive=geonames.solrindex.zip
\ No newline at end of file
Added: incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.core.site.CacheImpl-geonames.config
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.core.site.CacheImpl-geonames.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.core.site.CacheImpl-geonames.config (added)
+++ incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.core.site.CacheImpl-geonames.config Mon May 16 10:34:52 2011
@@ -0,0 +1,2 @@
+org.apache.stanbol.entityhub.yard.cacheYardId="geonamesIndex"
+org.apache.stanbol.entityhub.yard.cache.additionalMappings=[""]
Added: incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.site.referencedSite-geonames.config
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.site.referencedSite-geonames.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.site.referencedSite-geonames.config (added)
+++ incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.site.referencedSite-geonames.config Mon May 16 10:34:52 2011
@@ -0,0 +1,14 @@
+org.apache.stanbol.entityhub.site.searcherType=""
+org.apache.stanbol.entityhub.site.name="geonames.org"
+org.apache.stanbol.entityhub.site.dereferencerType="org.apache.stanbol.entityhub.site.CoolUriDereferencer"
+org.apache.stanbol.entityhub.site.description="The\ GeoNames\ geographical\ database\ covers\ all\ countries\ and\ contains\ over\ eight\ million\ placenames\ that\ are\ available\ for\ download\ free\ of\ charge."
+org.apache.stanbol.entityhub.site.entityPrefix=["http://sws.geonames.org/"]
+org.apache.stanbol.entityhub.site.cacheId="geonamesIndex"
+org.apache.stanbol.entityhub.site.defaultMappedEntityState="proposed"
+org.apache.stanbol.entityhub.site.accessUri="http://sws.geonames.org/"
+org.apache.stanbol.entityhub.site.fieldMappings=["geonames:*","geonames:name\ >\ rick:label"]
+org.apache.stanbol.entityhub.site.cacheStrategy="all"
+org.apache.stanbol.entityhub.site.id="geonames"
+org.apache.stanbol.entityhub.site.queryUri=""
+org.apache.stanbol.entityhub.site.defaultSymbolState="proposed"
+org.apache.stanbol.entityhub.site.defaultExpireDuration="0"
Added: incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-geonamesIndex.config
URL: http://svn.apache.org/viewvc/incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-geonamesIndex.config?rev=1103676&view=auto
==============================================================================
--- incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-geonamesIndex.config (added)
+++ incubator/stanbol/trunk/data/sites/geonames/src/main/resources/org/apache/stanbol/data/site/geonames/org.apache.stanbol.entityhub.yard.solr.impl.SolrYard-geonamesIndex.config Mon May 16 10:34:52 2011
@@ -0,0 +1,9 @@
+org.apache.stanbol.entityhub.yard.description="The\ full\ cache\ for\ geonames.org"
+org.apache.stanbol.entityhub.yard.solr.solrUri="geonames"
+org.apache.stanbol.entityhub.yard.defaultQueryResultNumber=I"-1"
+org.apache.stanbol.entityhub.yard.solr.multiYardIndexLayout="false"
+org.apache.stanbol.entityhub.yard.solr.allowDefaultConfig="false"
+org.apache.stanbol.entityhub.yard.id="geonamesIndex"
+org.apache.stanbol.entityhub.yard.name="geonames.org\ Cache"
+org.apache.stanbol.entityhub.yard.maxQueryResultNumber=I"-1"
+org.apache.stanbol.entityhub.yard.solr.maxBooleanClauses=I"1024"
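The `.config` files in this commit use three value forms: plain quoted strings, `I"…"` for integers, and `[…]` for arrays of quoted strings, with spaces inside strings escaped by a backslash. The following is a minimal sketch for reading that subset, for example to sanity-check a file before building the bundle. It is not the Apache Felix parser: `parse_config_line` is a hypothetical helper, it handles only the forms that appear above, and it expects one complete property per line with the leading `+` of the diff already removed.

```python
import re

# A double-quoted string that may contain backslash escapes such as "\ ".
_STRING = re.compile(r'"((?:[^"\\]|\\.)*)"')


def _unescape(raw: str) -> str:
    # Drop the backslash in front of an escaped character.
    return re.sub(r"\\(.)", r"\1", raw)


def parse_config_line(line: str):
    """Return (key, value) for one property line of the subset shown above."""
    key, _, value = line.strip().partition("=")
    strings = [_unescape(s) for s in _STRING.findall(value)]
    if value.startswith("["):
        return key, strings          # array of strings
    if value.startswith('I"'):
        return key, int(strings[0])  # typed integer
    return key, strings[0]           # plain string
```

For instance, the `maxQueryResultNumber` line of the SolrYard configuration would be read as the integer `-1`, and the yard name as `geonames.org Cache` with the escaped space restored.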