Olivier Grisel <[email protected]> wrote:
2011/8/16 <[email protected]>:
Hi Stanbol devs,
I've been working with Stanbol for two weeks. I need some information
about how I can define a new ontology (products, users...) in Apache
Stanbol and how I can extract entities from reasoners to automatically
tag a PDF document (I need to avoid tags from other sources and use only
the entities in my own OWL file).
The targets I've reached so far are the following:
1-. I've put an OWL file containing the ontology in the ontonet module
with its curl call.
2-. I created scopes and recipes.
I don't know how to configure the environment to analyze plain text and
use this customised ontology to extract tags. I need an overview of the
problem and, if possible, some examples.
If anyone can help me, please don't hesitate to answer.
Hi,
You most probably don't need the reasoners to process *unstructured
data* such as natural language text content. Text analysis is done by
Enhancement engines, which can rely on the EntityHub as a
domain-specific knowledge base.
If the names of your entities are very specific to your domain (not
ambiguous) then the TaxonomyLinkingEngine coupled with a dedicated
referenced site in the EntityHub that indexes your knowledge base
sounds like the right approach.
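
To make the idea concrete, here is a minimal sketch of label-based
linking against a closed vocabulary. It is not how the
TaxonomyLinkingEngine is implemented, only an illustration of the
principle; the URIs, the labels and the `link_entities` helper are all
hypothetical:

```python
import re

# Hypothetical domain vocabulary: entity URI -> known labels.
VOCABULARY = {
    "http://example.org/product/WidgetPro": ["WidgetPro", "Widget Pro"],
    "http://example.org/dept/Billing": ["Billing Department"],
}


def link_entities(text, vocabulary):
    """Return (start, end, matched text, uri) for each label found in text."""
    matches = []
    for uri, labels in vocabulary.items():
        for label in labels:
            # Whole-word, case-insensitive match of the literal label.
            pattern = r"\b" + re.escape(label) + r"\b"
            for m in re.finditer(pattern, text, flags=re.IGNORECASE):
                matches.append((m.start(), m.end(), m.group(0), uri))
    # Sort by position so tags come out in reading order.
    return sorted(matches)


if __name__ == "__main__":
    sample = "The Billing Department approved the Widget Pro order."
    for start, end, surface, uri in link_entities(sample, VOCABULARY):
        print(start, end, surface, uri)
```

Because only labels from your own vocabulary are looked up, no tags from
other sources can appear, which is why unambiguous names matter.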
To index your knowledge base within the EntityHub, you can follow
these examples (for DBpedia and DBLP respectively):
https://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/dbpedia/README.md
https://svn.apache.org/repos/asf/incubator/stanbol/trunk/entityhub/indexing/dblp/README.md
However, I don't have any usage example for configuring the
TaxonomyLinkingEngine, and Rupert, who is the original developer of
this module, is off for a couple of weeks AFAIK.
Note: reasoners are useful to process *structured* data (a.k.a.
knowledge): converting assertions already expressed in one RDF
vocabulary (e.g. dbpedia.org) into another (e.g. schema.org), checking
integrity constraints, reifying transitive and reflexive relationships
prior to indexing...
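
As a rough illustration of two of these tasks, here is a minimal sketch
in plain Python (no Stanbol API involved). Triples are modelled as
tuples, and the predicate mapping, the `ex:` names and both helper
functions are assumptions made up for the example:

```python
# Hypothetical mapping from a source vocabulary to a target vocabulary.
PREDICATE_MAP = {
    "http://dbpedia.org/ontology/birthPlace": "http://schema.org/birthPlace",
}


def translate(triples, predicate_map):
    """Rewrite predicates that have a mapping; keep the others unchanged."""
    return {(s, predicate_map.get(p, p), o) for s, p, o in triples}


def transitive_closure(triples, predicate):
    """Add the triples implied by treating `predicate` as transitive."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        edges = [(s, o) for s, p, o in closed if p == predicate]
        for s1, o1 in edges:
            for s2, o2 in edges:
                if o1 == s2 and (s1, predicate, o2) not in closed:
                    closed.add((s1, predicate, o2))
                    changed = True
    return closed


if __name__ == "__main__":
    data = {
        ("ex:Office", "ex:partOf", "ex:Building"),
        ("ex:Building", "ex:partOf", "ex:Campus"),
    }
    # The closure adds ("ex:Office", "ex:partOf", "ex:Campus").
    for triple in sorted(transitive_closure(data, "ex:partOf")):
        print(triple)
```

Materializing such inferred triples before indexing means the index can
answer queries without running a reasoner at lookup time.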
--
Olivier
http://twitter.com/ogrisel - http://github.com/ogrisel
Hi again,
I've been looking at the DBpedia and DBLP examples, but actually I need
to define my own ontology with customised relations (not based on
DBpedia definitions). I didn't explain my problem very well. I need
to extract tags from plain text, yes, but I also need to make inferences
over my domain (enterprise terms and concepts that are only used in a
closed environment). Is there any way to upload my ontology, previously
defined in Protégé, with the application/rdf+xml MIME type? I thought
that once I had uploaded the OWL file to a specific scope in ontonet, I
could configure Stanbol to use this ontology. I've already created this
ontology. If that's not possible, what steps do I need to follow in
Apache Stanbol to define my own entities that the Natural Language
Processing engine can extract from plain text? And then, once these
entities have been extracted, how can I connect them with my relations?
Thanks for your patience.
Best regards.
--
Jerónimo Fernández
Yerbabuena Software