Olivier Grisel <[email protected]> wrote:

2011/8/16  <[email protected]>:
Hi Stanbol devs,

I've been working with Stanbol for two weeks. I need some information
about how I can define a new ontology about products, users... in Apache
Stanbol, and how I can extract entities from reasoners to automatically tag a
PDF document (I need to avoid tags from other sources and use only the
entities in my own OWL file).

Here is what I have done so far:

1. I've uploaded an OWL file containing the ontology to the ontonet module
using its curl call.

2. I created scopes and recipes.

I don't know how to configure the environment to analyze plain text and use
this customised ontology to extract tags. I need an overview of the problem
and, if possible, some examples.

If anyone can help, please don't hesitate to reply.

Hi,

You most probably don't need the reasoners to process *unstructured
data* such as natural language text content. Text analysis is achieved
thanks to Enhancement engines that can rely on the EntityHub as a
domain-specific knowledge base.
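
As a rough sketch of what submitting text to the Enhancement engines
could look like from Python: the code below only builds and sends a
plain-text POST. The base URL, the /engines path and the RDF/XML Accept
header are assumptions about a default local launcher; check the paths
your own instance exposes before relying on them.

```python
import urllib.request


def build_enhance_request(text, base_url="http://localhost:8080"):
    """Build a POST request carrying plain text to be enhanced.

    Assumption: the enhancer is reachable at <base_url>/engines.
    """
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/engines",
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # Assumption: the server can serialize enhancements as RDF/XML.
            "Accept": "application/rdf+xml",
        },
        method="POST",
    )


def enhance(text, base_url="http://localhost:8080"):
    """Send the text to a running instance and return the raw RDF response."""
    with urllib.request.urlopen(build_enhance_request(text, base_url)) as resp:
        return resp.read().decode("utf-8")
```

Calling enhance("some text") needs a running server; the returned RDF
holds the annotations produced by whichever engines are active.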

If the names of your entities are very specific to your domain (not
ambiguous) then the TaxonomyLinkingEngine coupled with a dedicated
referenced site in the EntityHub that indexes your knowledge base
sounds like the right approach.
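
To illustrate why unambiguous names make this workable, here is a toy
sketch of label-based linking in Python. It is not how the
TaxonomyLinkingEngine is implemented; the function name, the label
dictionary and the example.org URI are hypothetical and only show the
general idea of matching known entity labels in text.

```python
import re


def link_labels(text, labels):
    """Find occurrences of known entity labels in text.

    `labels` maps a label to an entity URI. Returns a list of
    (matched_text, entity_uri, start_offset) tuples sorted by offset.
    """
    matches = []
    for label, uri in labels.items():
        # Whole-word, case-insensitive match of the literal label.
        pattern = r"\b" + re.escape(label) + r"\b"
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            matches.append((m.group(0), uri, m.start()))
    return sorted(matches, key=lambda item: item[2])


# Hypothetical product catalogue entry.
catalogue = {"Acme Router X200": "http://example.org/product/x200"}
found = link_labels("The Acme Router X200 ships today.", catalogue)
```

With ambiguous labels (for instance a product named like a common
word), such plain matching would produce many false positives, which is
why it only suits very domain-specific names.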

To index your knowledge base within the EntityHub, you can follow
these examples (for DBpedia and DBLP respectively):

https://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/dbpedia/README.md
https://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/dblp/README.md

However, I don't have any usage example for configuring the
TaxonomyLinkingEngine, and Rupert, who is the original developer of this
module, is off for a couple of weeks AFAIK.

Note: reasoners are useful to process *structured* data (a.k.a.
knowledge): converting assertions already expressed in one RDF
vocabulary (e.g. dbpedia.org) into another (e.g. schema.org), checking
integrity constraints, reifying transitive and reflexive relationships
prior to indexing...
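
As a small illustration of the last point, here is a sketch in plain
Python (not using the Stanbol reasoners) of expanding a transitive
relationship into explicit assertions before indexing. The function
name and the sample pairs are made up for the example.

```python
def materialize_transitive(pairs):
    """Return the transitive closure of a set of (subject, object) pairs.

    If (a, b) and (b, c) hold for a transitive property, (a, c) is added,
    and the process repeats until no new pair can be derived.
    """
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# Hypothetical "is a kind of" assertions from a product ontology.
asserted = {("Laptop", "Computer"), ("Computer", "Product")}
expanded = materialize_transitive(asserted)
```

Here the expanded set also contains ("Laptop", "Product"), so an index
built from it can answer that query without any inference at lookup
time.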

--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel


Thank you, Olivier. I'm going to try this approach. Good job with Apache Stanbol, it is a great project :).

--
Jerónimo Fernández
Yerbabuena Software
