Hi Rupert,

thanks for the detailed explanations (as always). I see that it is already
on the radar. IMO it is a great design to decouple engines and entity
lookup.
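
A minimal sketch of that decoupling, as I understand it from the quoted mail below: the engine only depends on a lookup interface, and an adapter supplies the actual lookup. All names here (`EntityLookup`, `LinkingEngine`, `InMemoryLookup`) are hypothetical and are not the Stanbol API; the real EntitySearcher interface and the Entityhub adapter look different.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical lookup contract: the engine only knows this interface.
interface EntityLookup {
    // Returns the ids of entities whose label matches the selected text.
    List<String> lookup(String selectedText);
}

// Hypothetical engine: links mentions using whatever lookup it is given.
final class LinkingEngine {
    private final EntityLookup lookup;

    LinkingEngine(EntityLookup lookup) {
        this.lookup = lookup;
    }

    // Collects the entity ids suggested for each mention, in input order.
    List<String> link(List<String> mentions) {
        List<String> suggestions = new ArrayList<>();
        for (String mention : mentions) {
            suggestions.addAll(lookup.lookup(mention));
        }
        return suggestions;
    }
}

// Hypothetical adapter: an in-memory stand-in for an Entityhub-backed lookup.
final class InMemoryLookup implements EntityLookup {
    private final Map<String, String> idsByLabel;

    // Keys are expected to be lower-case labels.
    InMemoryLookup(Map<String, String> idsByLabel) {
        this.idsByLabel = idsByLabel;
    }

    @Override
    public List<String> lookup(String selectedText) {
        String id = idsByLabel.get(selectedText.toLowerCase(Locale.ROOT));
        return id == null
                ? Collections.<String>emptyList()
                : Collections.singletonList(id);
    }
}
```

With this shape, swapping the Entityhub for some other infrastructure would only mean passing a different `EntityLookup` to the `LinkingEngine` constructor; the engine code itself stays untouched.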

Best,
 - Fabian


2012/11/27 Rupert Westenthaler <rupert.westentha...@gmail.com>

> Hi Fabian
>
> Short version:
>
> I totally agree. Our vocabulary has changed over time, but the Engines
> still use the names from when they were introduced. Changing them
> (artifactIds and class names) is dangerous as this breaks
> backwards compatibility. So I would suggest changing names only if we
> can also come up with a better implementation/design.
>
> Regarding Vocabulary, I think we should prefer the terms
> "EntityLinking" and "NamedEntityLinking" and deprecate all others, such
> as "keyword" in place of "entity", or "extraction" and "tagging" in
> place of "linking".
>
> The 'engines/entitylinking' and 'engines/entityhublinking' introduced
> by STANBOL-733 do already use this new terminology. They also
> deprecate the 'engines/keywordextraction'.
>
> - - -
>
> Long version with more background information
>
> Regarding the linking of Entities there are currently two different
> principles:
>
> * "NamedEntityLinking": A "NamedEntity" has a 'selected text' AND a
> 'type'. So the selected text AND the type can be used for linking.
> * "EntityLinking": An "Entity" only has a 'selected text'. Here
> linking is only possible based on the selected text.
>
> The plan would be to also have two Engine implementations that support
> those linking models.
>
> * 'NamedEntityLinkingEngine' (currently /engines/entitytagging)
> * 'EntityLinkingEngine' (was /engines/keywordextraction, now
> deprecated; since yesterday /engines/entitylinking)
>
> Those should not have external dependencies (meaning no dependencies on
> Stanbol components other than Stanbol Commons and the Enhancer module, no
> other major frameworks such as Solr or OpenNLP, and no calls to external
> services). That would allow us to keep those Engines within the enhancer
> module, but it also means that those implementations cannot be directly
> used by the user, as the Service used for linking will be defined only
> by an Interface without an actual implementation.
>
> Because of that there will be "Engines" that are based on the above,
> but come with adapters to Services that support the EntityLookup.
> The default will be implementations based on the Stanbol Entityhub, but
> Stanbol users could also implement versions for their own
> infrastructure needs.
>
> The "EntityhubLinking" module [1] is the first example. When you look
> at the module you will recognize that it does not contain a single
> EnhancementEngine implementation. It only provides Entityhub specific
> implementations of the EntitySearcher interface defined by the
> "EntityLinkingEngine" and an OSGi component that allows users to
> configure an EntityLinkingEngine instance that uses the Entityhub to
> look up Entities.
>
> Current state:
>
> Currently we are not yet there. The '/engines/entitytagging' still
> implements both NamedEntityLinking AND Lookup via the Entityhub. This
> engine could be replaced by a 'engines/namedentitylinking' that
> follows the design as described above. The new
> '/engines/entitylinking' already implements the above design. However,
> it still depends on the Entityhub, because the EntitySearcher
> interface [3] is still using the Entityhub Model classes.
>
> 'engines/entityhublinking' currently provides the ability to do
> 'entitylinking' with the Entityhub. As soon as the
> 'engines/namedentitylinking' is available I would add named entity
> linking functionality to that module. In a last step this module will
> also move out of the /enhancer component (as already suggested by
> STANBOL-805 [4]).
>
>
> BTW this design was the result of this [2] discussion on the Stanbol
> dev mailing list.
>
> best
> Rupert
>
>
>
> [1]
> http://svn.apache.org/repos/asf/stanbol/trunk/enhancer/engines/entityhublinking/
> [2] http://markmail.org/message/nptkntyuthv7wwqh
> [3]
> http://stanbol.staging.apache.org/docs/trunk/components/enhancer/engines/entitylinking#entitysearcher
> [4] https://issues.apache.org/jira/browse/STANBOL-805
>
>
> On Tue, Nov 27, 2012 at 11:14 AM, Fabian Christ
> <christ.fab...@googlemail.com> wrote:
> > Hi,
> >
> > enhancement engines in Stanbol can have several names and this is confusing
> > me and very likely our users. Here are some examples that I came across
> > when trying to identify the running engines. I started to look at the
> > Web-UI and clicked through the OSGi console.
> >
> > dbpediaLinking (NamedEntityTaggingEngine) ->
> > Named Entity Tagging -> Entity Tagging ->
> > /engines/entitytagging
> >
> > entityhubExtraction (EntityLinkingEngine) ->
> > Entityhub Linking -> Entityhub Linking ->
> > /engines/entityhublinking
> >
> > Could we simplify this a bit to make it more obvious especially for new
> > users what is going on?
> >
> > Best,
> >  - Fabian
> >
> > --
> > Fabian
> > http://twitter.com/fctwitt
>
>
>
> --
> | Rupert Westenthaler             rupert.westentha...@gmail.com
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen
>



-- 
Fabian
http://twitter.com/fctwitt
