Hi Rupert, thanks for the detailed explanations (as always). I see that it is already on the radar. IMO it is a great design to decouple engines and entity lookup.
Best,
- Fabian

2012/11/27 Rupert Westenthaler <rupert.westentha...@gmail.com>

> Hi Fabian
>
> Short version:
>
> I totally agree. Our vocabulary has changed over time, but the Engines
> still use the names from when they were introduced. Changing them
> (artifactIds and class names) is dangerous as this breaks
> backwards compatibility. So I would suggest changing names only if we
> can also come up with a better implementation/design.
>
> Regarding vocabulary, I think we should prefer the terms
> "EntityLinking" and "NamedEntityLinking" and deprecate all others, like
> "keyword" instead of "entity" or "extraction" or "tagging" instead of
> "linking".
>
> The 'engines/entitylinking' and 'engines/entityhublinking' introduced
> by STANBOL-733 already use this new terminology. They also
> deprecate the 'engines/keywordextraction'.
>
> - - -
>
> Long version with more background information
>
> Regarding the linking of Entities there are currently two different
> principles:
>
> * "NamedEntityLinking": A "NamedEntity" has a 'selected text' AND a
>   'type'. So the selected text AND the type can be used for linking.
> * "EntityLinking": An "Entity" only has a 'selected text'. Here
>   linking is only possible based on the selected text.
>
> The plan would be to also have two Engine implementations that support
> those linking models.
>
> * 'NamedEntityLinkingEngine' (currently /engines/entitytagging)
> * 'EntityLinkingEngine' (was /engines/keywordextraction (now
>   deprecated); since yesterday /engines/entitylinking)
>
> Those should not have external dependencies (meaning no dependencies on
> Stanbol components other than Stanbol Commons and the Enhancer module;
> also no other major frameworks such as Solr or OpenNLP; no calls to
> external services).
> That would allow us to keep those Engines within the enhancer
> module, but it also means that those implementations cannot be used
> directly by the user (as the Service used for linking will be defined
> only by an Interface, without an actual implementation).
>
> Because of that there will be "Engines" that are based on the above,
> but come with adapters to Services that do support the EntityLookup.
> The default will be implementations based on the Stanbol Entityhub, but
> Stanbol users could also implement versions for their own
> infrastructure needs.
>
> The "EntityhubLinking" module [1] is the first example. When you look
> at the module you will notice that it does not contain a single
> EnhancementEngine implementation. It only provides Entityhub-specific
> implementations of the EntitySearcher interface defined by the
> "EntityLinkingEngine" and an OSGi component that allows users to
> configure an EntityLinkingEngine instance that uses the Entityhub to
> look up Entities.
>
> Current state:
>
> Currently we are not there yet. The '/engines/entitytagging' still
> implements both NamedEntityLinking AND lookup via the Entityhub. This
> engine could be replaced by an 'engines/namedentitylinking' that
> follows the design described above. The new
> '/engines/entitylinking' already implements the above design. However,
> it still depends on the Entityhub, because the EntitySearcher
> interface [3] is still using the Entityhub Model classes.
>
> 'engines/entityhublinking' currently provides the ability to do
> 'entitylinking' with the Entityhub. As soon as
> 'engines/namedentitylinking' is available I would add named entity
> linking functionality to that module. In a last step this module will
> also move out of the /enhancer component (as already suggested by
> STANBOL-805 [4]).
>
> BTW this design was the result of this [2] discussion on the Stanbol
> dev mailing list.
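
To make the decoupling concrete, here is a minimal Java sketch of the idea
described above: an engine that depends only on a lookup interface, plus an
adapter that supplies the actual lookup. All names and signatures in it
(LinkingSketch, the one-method EntitySearcher, InMemorySearcher, link()) are
assumptions for illustration. The real EntitySearcher interface [3] is
different and currently still uses the Entityhub model classes, and the
in-memory map only stands in for an Entityhub-backed adapter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class LinkingSketch {

    /** Hypothetical lookup contract; NOT the actual Stanbol EntitySearcher API. */
    public interface EntitySearcher {
        /** Returns candidate entity URIs for a selected text (empty if none). */
        List<String> lookup(String selectedText);
    }

    /** Engine side: knows only the interface, nothing about the lookup backend. */
    public static final class EntityLinkingEngine {
        private final EntitySearcher searcher;

        public EntityLinkingEngine(EntitySearcher searcher) {
            this.searcher = searcher;
        }

        /** Maps each mention that has at least one candidate to its candidates. */
        public Map<String, List<String>> link(List<String> mentions) {
            Map<String, List<String>> linked = new LinkedHashMap<>();
            for (String mention : mentions) {
                List<String> candidates = searcher.lookup(mention);
                if (!candidates.isEmpty()) {
                    linked.put(mention, candidates);
                }
            }
            return linked;
        }
    }

    /** Adapter side: an in-memory stand-in for an Entityhub-backed searcher. */
    public static final class InMemorySearcher implements EntitySearcher {
        private final Map<String, List<String>> index = new HashMap<>();

        public void add(String label, String uri) {
            String key = label.toLowerCase(Locale.ROOT);
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(uri);
        }

        @Override
        public List<String> lookup(String selectedText) {
            String key = selectedText.toLowerCase(Locale.ROOT);
            return index.getOrDefault(key, Collections.emptyList());
        }
    }

    public static void main(String[] args) {
        InMemorySearcher searcher = new InMemorySearcher();
        searcher.add("Paris", "http://dbpedia.org/resource/Paris");

        // Swapping the backend means passing a different EntitySearcher here.
        EntityLinkingEngine engine = new EntityLinkingEngine(searcher);
        System.out.println(engine.link(Arrays.asList("Paris", "Unknownville")));
    }
}
```

This sketch covers only the text-based "EntityLinking" case. A
"NamedEntityLinking" variant would presumably pass the type along with the
selected text, but that signature is not specified in this thread.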
>
> best
> Rupert
>
> [1] http://svn.apache.org/repos/asf/stanbol/trunk/enhancer/engines/entityhublinking/
> [2] http://markmail.org/message/nptkntyuthv7wwqh
> [3] http://stanbol.staging.apache.org/docs/trunk/components/enhancer/engines/entitylinking#entitysearcher
> [4] https://issues.apache.org/jira/browse/STANBOL-805
>
> On Tue, Nov 27, 2012 at 11:14 AM, Fabian Christ
> <christ.fab...@googlemail.com> wrote:
> > Hi,
> >
> > enhancement engines in Stanbol can have several names, and this is
> > confusing for me and very likely for our users. Here are some examples
> > that I came across when trying to identify the running engines. I
> > started by looking at the Web-UI and clicked through the OSGi console.
> >
> > dbpediaLinking (NamedEntityTaggingEngine) ->
> > Named Entity Tagging -> Entity Tagging ->
> > /engines/entitytagging
> >
> > entityhubExtraction (EntityLinkingEngine) ->
> > Entityhub Linking -> Entityhub Linking ->
> > /engines/entityhublinking
> >
> > Could we simplify this a bit to make it more obvious, especially for
> > new users, what is going on?
> >
> > Best,
> > - Fabian
> >
> > --
> > Fabian
> > http://twitter.com/fctwitt
>
> --
> | Rupert Westenthaler             rupert.westentha...@gmail.com
> | Bodenlehenstraße 11             ++43-699-11108907
> | A-5500 Bischofshofen

--
Fabian
http://twitter.com/fctwitt