Hi,
I am a newbie to Stanbol.
I want to use Stanbol to extract meaningful data from various pieces of
unstructured text.
The fields of interest are based on my custom vocabulary.
The data in the unstructured text will keep changing and cannot be indexed up front.

I have followed the instructions at this link:
http://stanbol.apache.org/docs/trunk/customvocabulary.html

From reading that page, I understand that I only need to follow the keyword
linking approach.

I did the following:

1. Created a Yard Site implementation

2. Uploaded my basic vocabulary into it using the curl command

3. Created an EntityHub linking engine

4. Created an enhancement chain with the following components

     *   langdetect (required, LanguageDetectionEnhancementEngine)
     *   opennlp-sentence (required, OpenNlpSentenceDetectionEngine)
     *   opennlp-token (required, OpenNlpTokenizerEngine)
     *   opennlp-pos (required, OpenNlpPosTaggingEngine)
     *   opennlp-chunker (required, OpenNlpChunkingEngine)
     *   opennlp-ner (required, NamedEntityExtractionEnhancementEngine)
     *   CustentityhubExtraction (required, EntityLinkingEngine)

5. When I run a query, I do not see entities from my vocabulary being
identified.

Note: Currently my vocabulary is very simple. More entities will be added
later.
It is an ontology which has only one entity, Person.
Person in turn has the following properties: Name, City, DateOfBirth

I think I have gone wrong in some configuration parameter.
I have doubts about the following:

1. In the EntityHub linking engine:

   a. What do we enter in the fields used for dereferencing?

      i. Do we delete the default mappings provided?

   b. What do we enter in the Type mappings?

   c. In processed languages, do we need to enter any special parameters?

2. In the Managed Site (yard site), what do we enter for field mappings?

   a. I have entered person:name > dbp:Person:birthName, but I am not sure
      whether this is correct.

   b. Do we need to retain the default mappings?

3. In the Solr Yard configuration I have not defined any Solr core. I have gone
   with the default core / create on initialization. Is this OK?

Please confirm whether the steps I followed are correct.
What do I need to do to make the custom vocabulary work?

I have spent most of last week on this but have been unable to get it working.
I would appreciate your help with this.


Thanks a lot,
Arthi

