Hi florent

On Sat, May 28, 2011 at 5:50 PM, florent andré
<[email protected]> wrote:
> Hi Stanbolers,
>
> I have in my hand a big skos file.
> My main question is : How I can create an entity site with this file and
> play with it ?
>
> I have read READMEs about indexing [1] and data set [2], but I'm not sure to
> get all :
> - can I get rid of set-up of a sparql server ? Or it's require ?

If you index the SKOS file, then you do not need a SPARQL server.

> - What is the goal of indexing ? Speed-up entity detection in text or
> speed-up rdf entity representation providing ?

* To get all the information into the Entityhub
* To support full-text queries based on the labels and descriptions
* To use the information for the detection of entities in text
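
To illustrate the last two points, here is a rough sketch in plain Python.
It is not how the Entityhub indexer or the Solr Yard is implemented; the
sample triples, the example.org URIs and all function names are made up for
illustration. It only shows the idea: the label and description values of a
SKOS file are turned into a token index, and that index is then used to look
up entities mentioned in a text.

```python
import re
from collections import defaultdict

SKOS = "http://www.w3.org/2004/02/skos/core#"

# Properties whose literal values are indexed (labels and descriptions).
TEXT_PROPERTIES = {SKOS + "prefLabel", SKOS + "altLabel", SKOS + "definition"}

# Hypothetical sample data: (subject, predicate, literal value) triples,
# standing in for the content of a parsed SKOS file.
TRIPLES = [
    ("http://example.org/c/1", SKOS + "prefLabel", "Renewable energy"),
    ("http://example.org/c/1", SKOS + "altLabel", "Green energy"),
    ("http://example.org/c/2", SKOS + "prefLabel", "Solar energy"),
    ("http://example.org/c/2", SKOS + "definition", "Energy from sunlight"),
    ("http://example.org/c/2", SKOS + "broader", "http://example.org/c/1"),
]


def tokenize(text):
    """Lower-case the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def build_text_index(triples):
    """Map each token of a label or description to the concepts using it."""
    index = defaultdict(set)
    for subject, predicate, value in triples:
        if predicate in TEXT_PROPERTIES:
            for token in tokenize(value):
                index[token].add(subject)
    return index


def find_entities(index, text):
    """Return all concepts that share at least one token with the text."""
    found = set()
    for token in tokenize(text):
        found |= index.get(token, set())
    return found


index = build_text_index(TRIPLES)
matches = find_entities(index, "Prices for solar panels")
```

With the sample data above, `matches` contains only the "Solar energy"
concept. A real index does much more (language handling, ranking, phrase
matching), but the lookup principle is the same, and no SPARQL server is
involved at query time.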

> - Stanbol data file provider, is something related to yard or not ?

* It is only used to load binary files that are too big to be managed in SVN
or included within a bundle.
* It is used, e.g., to load pre-computed Solr indexes and language models
for OpenNLP.
* The Data File Provider is only used for configuration.

> Something like a local dump of an rdf store ?

Yes, for loading big RDF files into Stanbol (e.g. into a Clerezza RDF store)
one would use the Data File Provider. However, such a feature is currently
not supported.

>
> A side question is about the differences between the clerezza and solr yard
> : what are they ? performance ? functionalities ? ...
The Solr Yard provides better performance, especially for big datasets.
It does not support regex constraints for text queries.
The Clerezza Yard is fine for smaller datasets. The RDF store used can be
controlled by the Clerezza configuration, so you have a lot of
options for how to store your data.

There is a plan to implement a "hybrid" Yard that uses the Clerezza
Yard implementation for storage and the Solr Yard implementation for
queries.

>
> Thanks for any pointers, RTMF links,...
> ++

While answering this mail I noticed that there is currently no
indexer utility configuration for generic RDF files.
I will add such a configuration in the coming days. It should also
work for indexing SKOS files.


best
Rupert

>
> [1]
> http://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/dbpedia/README.md
> [2]
> http://svn.apache.org/repos/asf/incubator/stanbol/trunk/data/sites/dbpedia/README.md
>



-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
