The entities' labels are in English but they don't have a language
attribute. If that is required, is there a way I can specify a mapping that
will give all the labels an @en attribute? I tried adding "rdfs:label >
rdfs:label@en" to the generic rdf reader's mappings.txt to no avail.
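
As a fallback, the labels could be tagged before indexing rather than through a mapping. The following is only a minimal sketch, assuming the ontology has first been converted to N-Triples (one triple per line) and that the indexer accepts that file. The names `tag_label` and `tag_lines` are made up for illustration, and I have not confirmed that a language tag is what the linking engine actually needs.

```python
import re

RDFS_LABEL = "<http://www.w3.org/2000/01/rdf-schema#label>"
XSD_STRING = "<http://www.w3.org/2001/XMLSchema#string>"

# An rdfs:label triple whose object is a plain or xsd:string literal with no
# language tag. Group 1 captures subject, predicate and the quoted literal.
UNTAGGED_LABEL = re.compile(
    r'^(\S+\s+' + re.escape(RDFS_LABEL) + r'\s+"(?:[^"\\]|\\.)*")'
    r'(?:\^\^' + re.escape(XSD_STRING) + r')?\s*\.\s*$'
)


def tag_label(line, lang="en"):
    """Add @lang to an untagged rdfs:label triple; return other lines unchanged.

    An explicit xsd:string datatype is dropped, because a literal cannot
    carry both a datatype and a language tag.
    """
    match = UNTAGGED_LABEL.match(line)
    if match is None:
        return line
    return f"{match.group(1)}@{lang} ."


def tag_lines(lines, lang="en"):
    """Apply tag_label to every line of an N-Triples document."""
    return [tag_label(line.rstrip("\n"), lang) for line in lines]
```

Labels that already carry a language tag, and triples with any other predicate, pass through untouched, so running it twice should be harmless.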
Thanks,
-Nathan


On Thu, Jun 2, 2016 at 7:07 AM, Rafa Haro <rh...@apache.org> wrote:

> Hi Nathan, can you check entities' labels language in your dataset?
>
> Cheers,
> Rafa
> On Wed, Jun 1, 2016 at 19:30, Nathan Breit <br...@ecohealthalliance.org>
> wrote:
>
> > Thanks for your assistance Rafa. Unfortunately, I'm still stuck. I used the
> > following longer test string that was detected as en, "This is really
> > English text and dengue hemorrhagic fever is a disease." However, there
> > were still no entity annotations returned. This was printed in my
> > error.log:
> > ```
> > 01.06.2016 13:14:40.641 *INFO* [Thread-7] org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine language identified as en
> > 01.06.2016 13:14:40.670 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine EntityLinking Statistics:
> > 01.06.2016 13:14:40.670 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine     - overal: 7ms (text processing: 6%, lookup: 91%, matching 0%, ranking 0%, other 3%)
> > 01.06.2016 13:14:40.670 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine   - Text Processing: 0.399572ms [count: 5 | time: 0.0799144ms (max:0.366414, min:0.007158)]
> > 01.06.2016 13:14:40.670 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine   - Vocabulary Lookup: 6.356819ms [count: 4 | time: 1.58920475ms (max:2.560572, min:0.893326)]
> > 01.06.2016 13:14:40.670 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine     - cache hits: 1 (25.0%)
> > 01.06.2016 13:14:40.670 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine       - 0 query results (0 filtered - NaN%)
> > 01.06.2016 13:14:40.670 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine   - Label Matching: 0.003802ms [count: 4 | time: 9.505E-4ms (max:0.001065, min:8.85E-4)]
> > 01.06.2016 13:14:40.670 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine   - Suggestion Ranking: 0.0ms [count: 0 | time: NaNms (max:-1.0E-6, min:9.223372036854775E12)]
> > 01.06.2016 13:14:40.671 *INFO* [qtp621234008-38] org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl Execution of Chain doidEnhancerChain finished after 36ms for ContentItem <urn:content-item-sha1-f506f062502e1c37eddbc5777073a1239cba0c4e>
> > 01.06.2016 13:14:40.672 *INFO* [qtp621234008-38] org.apache.stanbol.enhancer.servicesapi.EnhancementJobManager > processed ContentItem <urn:content-item-sha1-f506f062502e1c37eddbc5777073a1239cba0c4e> with Chain 'doidEnhancerChain' in 34ms | chain:[langid: 6ms (18%), tika: 0ms (0%), opennlp-sentence: 1ms (3%), opennlp-token: 0ms (0%), opennlp-pos: 3ms (9%), opennlp-ner: 5ms (15%), dbpediaLinking: 1ms (3%), entityhubExtraction: 18ms (53%), doidEnhancer: 9ms (26%)], concurrency: 1.0 (0%)
> > ```
> > I'm not sure what to make of the NER mentions in the logs. My enhancement
> > chain does not include an NER engine, unless it is being invoked by another
> > enhancer like opennlp-pos.
> > Regards,
> > -Nathan
> >
> > On Wed, Jun 1, 2016 at 5:32 PM, Rafa Haro <rh...@apache.org> wrote:
> >
> > > Hi Nathan,
> > >
> > > You are testing the enhancer with a very short sentence, and the Language
> > > Detection engine is identifying 'no' (probably Norwegian) as the sentence
> > > language. By default, Stanbol uses the identified language code both for
> > > loading the OpenNLP models for that language and for entity lookup, where
> > > it searches only entity labels in that language. There are a couple of
> > > things you can do to avoid an empty annotation in these situations:
> > >
> > > 1. Force the language code as a header in your request (a curl request in
> > > this case).
> > > 2. Configure English ('en'), or whatever language your dataset has entity
> > > labels for, as the Default Matching Language, which is missing in your
> > > configuration. More information here:
> > >
> > > https://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
> > >
> > > Also, you would probably want to disable the NER engines for this kind of
> > > entity.
> > >
> > > Hope that helps,
> > > Rafa
> > >
> > > On Tue, May 31, 2016 at 6:13 PM Nathan Breit <br...@ecohealthalliance.org>
> > > wrote:
> > >
> > > > Hello,
> > > > I am trying to configure the Entityhub linking engine to use an Entityhub
> > > > site with vocabulary from the Disease Ontology
> > > > (http://disease-ontology.org/), but when I enhance text with it, labels
> > > > from the ontology are not being annotated in the text. I am looking for
> > > > advice on how to debug this. Here is what I've tried so far:
> > > > - I used the genericrdf indexing tool to import the Disease Ontology into
> > > > a new Entityhub site. When I used the entityhub /find API endpoint to
> > > > search for the name "dengue hemorrhagic fever", a result from the Disease
> > > > Ontology was returned.
> > > > - I configured and built an EntityhubLinkingEngine and a WeightedChain
> > > > containing the linking engine. They show up on the Stanbol admin site and
> > > > Felix console. These are the config files:
> > > >
> > > > https://github.com/ecohealthalliance/t11/tree/master/ansible/roles/stanbol/templates/enhancer
> > > > - When I used the following API call to enhance text containing the same
> > > > term I was able to find using the /find endpoint, the detected language
> > > > is the only annotation returned.
> > > >
> > > > curl -X POST -H "Accept: application/json" -H "Content-type: text/plain" --data "Avoid dengue hemorrhagic fever." http://54.197.175.163:3000/enhancer/chain/doidEnhancerChain
> > > >
> > > > This appears in the Stanbol error.log when the enhancement runs:
> > > >
> > > > ```
> > > > 31.05.2016 12:05:06.204 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.langid.LangIdEnhancementEngine language identified as no
> > > > 31.05.2016 12:05:06.206 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine No NER Model for person and language no available!
> > > > 31.05.2016 12:05:06.206 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine No NER Model for organization and language no available!
> > > > 31.05.2016 12:05:06.207 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine No NER Model for location and language no available!
> > > > 31.05.2016 12:05:06.210 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine EntityLinking Statistics:
> > > > 31.05.2016 12:05:06.210 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine     - overal: 2ms (text processing: 4%, lookup: 127%, matching 0%, ranking 0%, other -31%)
> > > > 31.05.2016 12:05:06.210 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine   - Text Processing: 0.071543ms [count: 4 | time: 0.01788575ms (max:0.051031, min:0.005928)]
> > > > 31.05.2016 12:05:06.211 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine   - Vocabulary Lookup: 2.541598ms [count: 3 | time: 0.8471993333333333ms (max:1.190281, min:0.667284)]
> > > > 31.05.2016 12:05:06.211 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine     - cache hits: 1 (33.333332%)
> > > > 31.05.2016 12:05:06.211 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine       - 0 query results (0 filtered - NaN%)
> > > > 31.05.2016 12:05:06.211 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine   - Label Matching: 0.00218ms [count: 3 | time: 7.266666666666667E-4ms (max:7.55E-4, min:7.04E-4)]
> > > > 31.05.2016 12:05:06.211 *INFO* [Thread-5] org.apache.stanbol.enhancer.engines.entitylinking.engine.EntityLinkingEngine   - Suggestion Ranking: 0.0ms [count: 0 | time: NaNms (max:-1.0E-6, min:9.223372036854775E12)]
> > > > 31.05.2016 12:05:06.214 *INFO* [qtp1118916813-38] org.apache.stanbol.enhancer.jobmanager.event.impl.EventJobManagerImpl Execution of Chain doidEnhancerChain finished after 14ms for ContentItem <urn:content-item-sha1-d2851c0b02e12cc3b42bb6608fa2e1d50c43b17f>
> > > > 31.05.2016 12:05:06.215 *INFO* [qtp1118916813-38] org.apache.stanbol.enhancer.servicesapi.EnhancementJobManager > processed ContentItem <urn:content-item-sha1-d2851c0b02e12cc3b42bb6608fa2e1d50c43b17f> with Chain 'doidEnhancerChain' in 14ms | chain:[tika: 1ms (7%), langid: 3ms (21%), opennlp-sentence: 0ms (0%), opennlp-token: 0ms (0%), opennlp-pos: 1ms (7%), opennlp-ner: 1ms (7%), entityhubExtraction: 4ms (29%), doidEnhancer: 7ms (50%), dbpediaLinking: 0ms (0%)], concurrency: 1.0 (0%)
> > > > ```
> > > >
> > > > The Ansible playbook here performs all the steps I have been using to set
> > > > up Stanbol: https://github.com/ecohealthalliance/t11/tree/master/ansible
> > > >
> > > > Thanks,
> > > > -Nathan Breit
> > > >
> > > > --
> > > >
> > > > Nathan Breit
> > > >
> > > > Software Developer
> > > >
> > > > EcoHealth Alliance
> > > >
> > > > 460 West 34th Street – 17th floor
> > > >
> > > > New York, NY 10001
> > > >
> > > > My Skype: nathanathan3 <http://is.gd/OyRVnD>
> > > >
> > > > My Phone Number: 1-425-296-1123
> > > >
> > > > www.ecohealthalliance.org
> > > >
> > > > EcoHealth Alliance leads cutting-edge research into the critical
> > > > connections between human and wildlife health and delicate ecosystems.
> > > > With this science we develop solutions that promote conservation and
> > > > prevent pandemics.
> > > >
> > >
> >
> >
> >
> >
>



