Re: Stanbol NER questions

Rajan Shah Tue, 02 Jun 2015 08:03:31 -0700

Hi Rupert,

Thanks a lot for your detailed answer.


Some quick follow-up questions.

On Tue, Jun 2, 2015 at 10:32 AM, Rupert Westenthaler <
rupert.westentha...@gmail.com> wrote:

> Hi Rajan,
>
> On Mon, Jun 1, 2015 at 5:27 PM, Rajan Shah <raja...@gmail.com> wrote:
> >
> > *1. Same As:*
> >
> > How can I configure "Same As" within the Stanbol framework?
> >
> > For example:
> >
> > "JP Morgan Chase" is the same as "J.P. Morgan".
> >
>
> If you manage this in your own Controlled Vocabulary (CV), you would
> just add a preferred label and 0..n alternate labels. If you have an
> existing CV that uses owl:sameAs relations, you can convert them while
> indexing the CV with the Entityhub indexing tool (similar to what is
> done in the sHealth example), or collect them while dereferencing and
> process them on the client side.
>
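For the own-CV case, here is a rough sketch of what folding sameAs links into one entry with a preferred label and alternate labels could look like. This is plain Python, not Stanbol or Entityhub indexing-tool code; the `ex:` ids, the `merge_same_as` helper and the rule that the smallest id supplies the preferred label are all assumptions made up for illustration.

```python
from collections import defaultdict


def merge_same_as(labels, same_as):
    """Group entity ids connected by sameAs links and collect their labels.

    labels:  dict mapping entity id -> label
    same_as: iterable of (id_a, id_b) pairs; both ids must be keys of labels
    Returns a dict: representative id -> {"preferred": str, "alternate": [str]}
    """
    # Union-find: every id starts out as its own representative.
    parent = {entity: entity for entity in labels}

    def find(entity):
        # Walk up to the representative, shortening the path on the way.
        while parent[entity] != entity:
            parent[entity] = parent[parent[entity]]
            entity = parent[entity]
        return entity

    for a, b in same_as:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Assumption: the lexicographically smaller id becomes the representative.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    # Collect labels per group, in sorted id order.
    groups = defaultdict(list)
    for entity in sorted(labels):
        groups[find(entity)].append(labels[entity])

    return {
        root: {"preferred": names[0], "alternate": names[1:]}
        for root, names in groups.items()
    }


# Hypothetical vocabulary using the labels from the question.
labels = {
    "ex:1": "JP Morgan Chase",
    "ex:2": "J.P. Morgan",
    "ex:3": "Acme Corp",
}
merged = merge_same_as(labels, [("ex:1", "ex:2")])
```

With this input, `merged["ex:1"]` carries "JP Morgan Chase" as the preferred label and "J.P. Morgan" as an alternate label, so a label-based lookup would match either spelling against a single entity.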
>
> > *2. Entity Recognition:*
> >
> > Suppose that a person entity has four properties.
> >
> > a. Name
> > b. Title
> > c. Address
> > d. Company
> >
> > When NER runs, it only returns the property that matched. Suppose I
> > want to retrieve all properties associated with the entity for the
> > enhancer's front-end or graph (without an additional second query).
> > Is that possible?
>
> Have a look at the Dereference Engines
>
> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/list#dereference-entities
>
>
In this case, is it fair to assume that one needs to have both of these
yards?

a. Solr yard for fast search
b. Clerezza yard for dereferencing

Is this the optimal way to use Stanbol NER and leverage its full potential?
In a Referenced Site, I see that there is a searcher implementation. Could
someone provide some pointers on the real benefits of using such an
implementation?


> >
> > *3. Entity co-mention:*
> >
> > From the documentation, it's not entirely clear how this engine works.
> > Is it possible to provide a quick, concrete example in a couple of lines?
> >
> > Does it require the two entities to live in the same Solr index or namespace?
>
> IMO the example
>
>     ... Barack Obama gave a talk to members of the Labor Union ...
>     Obama specially mentioned ...
>
> describes it well. Because "Barack Obama" is already mentioned before,
> "Obama" is treated as a co-mention. The engine builds an index over
> mentions in previous fise:TextAnnotations. It only works on data
> already present in the ContentItem. It does not require the CV to be
> in any specific storage (e.g. the Entityhub).
>
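A toy sketch of the co-mention idea described above, in plain Python. It is not the engine's actual algorithm and uses no Stanbol API; `find_co_mentions`, the character-offset spans and the rule that any later token of a multi-word mention counts as a co-mention are simplifying assumptions.

```python
import re


def find_co_mentions(text, mentions):
    """Find later partial mentions of entities that are already annotated.

    text:     the document text
    mentions: list of (start, end) character spans of existing mentions
    Returns a list of (start, end, token, full_mention) tuples.
    """
    known = sorted(mentions)
    hits = []
    for start, end in known:
        full = text[start:end]
        tokens = full.split()
        if len(tokens) < 2:
            # A single-word mention has no shorter form to look for.
            continue
        for token in tokens:
            pattern = re.compile(r"\b" + re.escape(token) + r"\b")
            # Only search text that comes after the original mention.
            for match in pattern.finditer(text, end):
                inside_known = any(
                    s <= match.start() and match.end() <= e for s, e in known
                )
                if not inside_known:
                    hits.append((match.start(), match.end(), token, full))
    return hits


text = (
    "Barack Obama gave a talk to members of the Labor Union. "
    "Obama specially mentioned the members."
)
# Assume an earlier engine already annotated "Barack Obama" at offsets 0..12.
co_mentions = find_co_mentions(text, [(0, 12)])
```

Here the second "Obama" is reported as a co-mention of "Barack Obama". Only the text and the existing annotations are consulted, which matches the point that no vocabulary lookup is involved.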
>
Is there any plan to extend it to capture relations between different
entities, for example that "Researcher1" and "Researcher2" are two distinct
entities mentioned in a research paper published by both of them?


> >
> > *4. Sentence Detection:*
> >
> > Is it possible to provide an example configuration or a pointer that
> > describes the key features? Also, suppose the same word is used in two
> > sentences, for example:
> >
> > a. Same person detection
> >
> > 1. Mr. Smith - first sentence
> > 2. Smith's - following sentence
> >
> > Is it possible to recognize that both sentences refer to the same person
> > using Stanbol?
>
> There is currently no such engine. Cristian was working on extending the
> Stanbol NLP API to support dependency trees and co-reference. He also
> extended the Stanbol Stanford NLP integration to support those
> features. However, there is no engine supporting those features on the
> Stanbol side.
>
> >
> > b. Sentence pattern detection based on language grammar
> >
>
> no
>
> > Does it allow to detect sentences based on language grammar?
>
> no
>
> best
> Rupert
>
> --
> | Rupert Westenthaler             rupert.westentha...@gmail.com
> | Bodenlehenstraße 11                              ++43-699-11108907
> | A-5500 Bischofshofen
> | REDLINK.CO
> ..........................................................................
> | http://redlink.co/
>
