Thanks for your assistance, Rafa. Unfortunately, I'm still stuck. I used the
following longer test string, which was detected as en: "This is really
English text and dengue hemorrhagic fever is a disease." However, there
were still no entity annotations returned. This was printed in my error.log:
```
01.06.2016 13:14:40.641 *INFO* [Thread-7]
org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine language
identified as en
01.06.2016 13:14:40.670 *INFO* [Thread-5]
org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
EntityLinking Statistics:
01.06.2016 13:14:40.670 *INFO* [Thread-5]
org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
    - overal: 7ms (text processing: 6%, lookup: 91%, matching 0%, ranking
0%, other 3%)
01.06.2016 13:14:40.670 *INFO* [Thread-5]
org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
  - Text Processing: 0.399572ms [count: 5 | time: 0.0799144ms
(max:0.366414, min:0.007158)]
01.06.2016 13:14:40.670 *INFO* [Thread-5]
org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
  - Vocabulary Lookup: 6.356819ms [count: 4 | time: 1.58920475ms
(max:2.560572, min:0.893326)]
01.06.2016 13:14:40.670 *INFO* [Thread-5]
org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
    - cache hits: 1 (25.0%)
01.06.2016 13:14:40.670 *INFO* [Thread-5]
org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
      - 0 query results (0 filtered - NaN%)
01.06.2016 13:14:40.670 *INFO* [Thread-5]
org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
  - Label Matching: 0.003802ms [count: 4 | time: 9.505E-4ms (max:0.001065,
min:8.85E-4)]
01.06.2016 13:14:40.670 *INFO* [Thread-5]
org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
  - Suggestion Ranking: 0.0ms [count: 0 | time: NaNms (max:-1.0E-6,
min:9.223372036854775E12)]
01.06.2016 13:14:40.671 *INFO* [qtp621234008-38]
org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
Execution of Chain doidEnhancerChain finished after 36ms for ContentItem
<urn:content-item-sha1-f506f062502e1c37eddbc5777073a1239cba0c4e>
01.06.2016 13:14:40.672 *INFO* [qtp621234008-38]
org.apache.stanbol.enhancer.servicesapi.EnhancementJobManager > processed
ContentItem
<urn:content-item-sha1-f506f062502e1c37eddbc5777073a1239cba0c4e> with Chain
'doidEnhancerChain' in 34ms | chain:[langid: 6ms (18%), tika: 0ms (0%),
opennlp-sentence: 1ms (3%), opennlp-token: 0ms (0%), opennlp-pos: 3ms (9%),
opennlp-ner: 5ms (15%), dbpediaLinking: 1ms (3%), entityhubExtraction: 18ms
(53%), doidEnhancer: 9ms (26%)], concurrency: 1.0 (0%)
```
I'm not sure what to make of the NER mentions in the logs. My enhancement
chain does not include an NER engine, unless it is being invoked by another
enhancer like opennlp-pos.
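
The chain summary at the end of the log lists every engine that ran, so it is a quick way to check this. Below is a minimal sketch, assuming the summary format shown in the log above; `engines_in_chain` is a name made up for illustration and is not part of Stanbol.

```python
import re

# Chain summary copied from the log above, with the mail client's line wraps removed.
SUMMARY = (
    "'doidEnhancerChain' in 34ms | chain:[langid: 6ms (18%), tika: 0ms (0%), "
    "opennlp-sentence: 1ms (3%), opennlp-token: 0ms (0%), opennlp-pos: 3ms (9%), "
    "opennlp-ner: 5ms (15%), dbpediaLinking: 1ms (3%), entityhubExtraction: 18ms "
    "(53%), doidEnhancer: 9ms (26%)], concurrency: 1.0 (0%)"
)


def engines_in_chain(summary):
    """Return (engine name, milliseconds) pairs from a 'chain:[...]' summary."""
    match = re.search(r"chain:\[(.*?)\]", summary)
    if match is None:
        return []
    pairs = re.findall(r"([\w-]+): (\d+)ms", match.group(1))
    return [(name, int(ms)) for name, ms in pairs]


engines = engines_in_chain(SUMMARY)
for name, ms in engines:
    print(f"{name}: {ms}ms")
```

On the summary above this lists nine engines, including opennlp-ner, dbpediaLinking and entityhubExtraction, so the chain is running more than the single linking engine.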
Regards,
-Nathan

On Wed, Jun 1, 2016 at 5:32 PM, Rafa Haro <rh...@apache.org> wrote:

> Hi Nathan,
>
> You are testing the enhancer with a very short sentence and the Language
> Detection engine is identifying 'no' (probably Norwegian) as the sentence
> language. By default, Stanbol uses the identified language code both for
> loading the OpenNLP models for that language and for restricting entity
> lookup to labels in that language. There are a couple of things you can do
> to avoid an empty annotation in these situations:
>
> 1. Force the language code as a header in your request (curl request in
> this case)
> 2. Configure English ('en'), or whatever language your dataset has entity
> labels in, as the Default Matching Language, which is missing in your
> configuration. More information here:
>
> https://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>
> Also, you would probably want to disable the NER engines for this kind of
> entity.
>
> Hope that helps,
> Rafa
>
> On Tue, May 31, 2016 at 6:13 PM Nathan Breit <br...@ecohealthalliance.org>
> wrote:
>
> > Hello,
> > I am trying to configure the Entityhub linking engine to use an Entityhub
> > site with vocabulary from the Disease Ontology (
> > http://disease-ontology.org/),
> > but when I enhance text with it, labels from the ontology are not being
> > annotated in the text. I am looking for advice on how to debug this. Here
> > is what I've tried so far:
> > - I used the genericrdf indexing tool to import the Disease Ontology into a
> > new Entityhub site. When I used the entityhub /find API endpoint to search
> > for the name "dengue hemorrhagic fever" a result from the Disease Ontology
> > was returned.
> > - I configured and built an EntityhubLinkingEngine and a WeightedChain
> > containing the linking engine. They show up on the Stanbol admin site and
> > felix console. These are the config files:
> >
> > https://github.com/ecohealthalliance/t11/tree/master/ansible/roles/stanbol/templates/enhancer
> > - When I used the following API call to enhance text containing the same
> > term I was able to find using the /find endpoint, the language detected is
> > the only annotation returned.
> >
> > curl -X POST -H "Accept: application/json" -H "Content-type: text/plain" \
> >   --data "Avoid dengue hemorrhagic fever." \
> >   http://54.197.175.163:3000/enhancer/chain/doidEnhancerChain
> >
> > This appears in the Stanbol error.log when the enhancement runs:
> >
> > ```
> > 31.05.2016 12:05:06.204 *INFO* [Thread-5]
> > org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine
> language
> > identified as no
> > 31.05.2016 12:05:06.206 *INFO* [Thread-5]
> >
> >
> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
> > No NER Model for person and language no available!
> > 31.05.2016 12:05:06.206 *INFO* [Thread-5]
> >
> >
> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
> > No NER Model for organization and language no available!
> > 31.05.2016 12:05:06.207 *INFO* [Thread-5]
> >
> >
> org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
> > No NER Model for location and language no available!
> > 31.05.2016 12:05:06.210 *INFO* [Thread-5]
> >
> >
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> > EntityLinking Statistics:
> > 31.05.2016 12:05:06.210 *INFO* [Thread-5]
> >
> >
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> >     - overal: 2ms (text processing: 4%, lookup: 127%, matching 0%,
> ranking
> > 0%, other -31%)
> > 31.05.2016 12:05:06.210 *INFO* [Thread-5]
> >
> >
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> >   - Text Processing: 0.071543ms [count: 4 | time: 0.01788575ms
> > (max:0.051031, min:0.005928)]
> > 31.05.2016 12:05:06.211 *INFO* [Thread-5]
> >
> >
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> >   - Vocabulary Lookup: 2.541598ms [count: 3 | time: 0.8471993333333333ms
> > (max:1.190281, min:0.667284)]
> > 31.05.2016 12:05:06.211 *INFO* [Thread-5]
> >
> >
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> >     - cache hits: 1 (33.333332%)
> > 31.05.2016 12:05:06.211 *INFO* [Thread-5]
> >
> >
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> >       - 0 query results (0 filtered - NaN%)
> > 31.05.2016 12:05:06.211 *INFO* [Thread-5]
> >
> >
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> >   - Label Matching: 0.00218ms [count: 3 | time: 7.266666666666667E-4ms
> > (max:7.55E-4, min:7.04E-4)]
> > 31.05.2016 12:05:06.211 *INFO* [Thread-5]
> >
> >
> org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine
> >   - Suggestion Ranking: 0.0ms [count: 0 | time: NaNms (max:-1.0E-6,
> > min:9.223372036854775E12)]
> > 31.05.2016 12:05:06.214 *INFO* [qtp1118916813-38]
> > org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl
> > Execution of Chain doidEnhancerChain finished after 14ms for ContentItem
> > <urn:content-item-sha1-d2851c0b02e12cc3b42bb6608fa2e1d50c43b17f>
> > 31.05.2016 12:05:06.215 *INFO* [qtp1118916813-38]
> > org.apache.stanbol.enhancer.servicesapi.EnhancementJobManager > processed
> > ContentItem
> > <urn:content-item-sha1-d2851c0b02e12cc3b42bb6608fa2e1d50c43b17f> with
> Chain
> > 'doidEnhancerChain' in 14ms | chain:[tika: 1ms (7%), langid: 3ms (21%),
> > opennlp-sentence: 0ms (0%), opennlp-token: 0ms (0%), opennlp-pos: 1ms
> (7%),
> > opennlp-ner: 1ms (7%), entityhubExtraction: 4ms (29%), doidEnhancer: 7ms
> > (50%), dbpediaLinking: 0ms (0%)], concurrency: 1.0 (0%)
> > ```
> >
> > The Ansible playbook here performs all the steps I have been using to set
> > up Stanbol: https://github.com/ecohealthalliance/t11/tree/master/ansible
> >
> > Thanks,
> > -Nathan Breit
> >
> > --
> >
> > Nathan Breit
> >
> > Software Developer
> >
> > EcoHealth Alliance
> >
> > 460 West 34th Street – 17th floor
> >
> > New York, NY 10001
> >
> > My Skype: nathanathan3 <http://is.gd/OyRVnD>
> >
> > My Phone Number: 1-425-296-1123
> >
> > www.ecohealthalliance.org
> >
> > EcoHealth Alliance leads cutting-edge research into the critical
> > connections between human and wildlife health and delicate ecosystems.
> With
> > this science we develop solutions that promote conservation and prevent
> > pandemics.
> >
>
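
Rafa's first suggestion in the thread above, forcing the language on the request, might look roughly like the sketch below. It assumes that the "header" he mentions is the standard HTTP `Content-Language` header and that the Stanbol version in use honors it; neither is confirmed in this thread, so check the enhancer documentation before relying on it. `build_enhance_request` is a hypothetical helper name, and the URL and text are the ones from the original curl call.

```python
import urllib.request


def build_enhance_request(chain_url, text, language):
    """Build (but do not send) a POST request that declares the content language."""
    return urllib.request.Request(
        chain_url,
        data=text.encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Content-Type": "text/plain",
            # Assumption: the server reads the language from this standard HTTP header.
            "Content-Language": language,
        },
        method="POST",
    )


request = build_enhance_request(
    "http://54.197.175.163:3000/enhancer/chain/doidEnhancerChain",
    "Avoid dengue hemorrhagic fever.",
    "en",
)
# To actually send it: urllib.request.urlopen(request).read()
```

Building the request separately from sending it makes it easy to inspect the headers first. If the header is honored, the short sentence should no longer depend on language detection guessing 'no'.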



