Hi Rupert,
Thanks for the info. I will talk to Iavor about your suggestions for the
engines.


> BTW: do you plan to contribute your Enhancement Engines to Stanbol, or
> would you rather package/promote it via Spotlight?


We'd like to contribute them to Stanbol, so that it is easier for users to
use them directly from Stanbol. But we will also document them on our
website and promote them via our channels.

Cheers,
Pablo

On Mon, Jun 4, 2012 at 1:10 PM, Rupert Westenthaler <
[email protected]> wrote:

> On Mon, Jun 4, 2012 at 12:51 PM, Pablo Mendes <[email protected]>
> wrote:
> > Hi Rupert,
> >
> > 1. Do you think it might make sense to allow multiple EngineInstances
> >> using different Spotting algorithms?
> >>
> >
> > You mean that instead of a "/spot" you would have a /lingpipespot, /ner
> > and /keyphrases? Can be done, but do you have a use case for that? Could
> > this also be done as some sort of URL rewrite? From
> > /spot?spotter=LingPipeSpotter to /lingpipespotter
>
> No, much simpler. Just use the
>
>    @Component(configurationFactory = true, [..])
>
> to allow users to configure multiple instances of the DBpedia Spotlight
> Spotting engine. Then they can configure different spotting algorithms
> for the different instances.
>
> >
> >> > Our next step is to create an enhancement chain with two enhancement
> >> > engines: DBpedia Spotlight Spotting and DBpedia Spotlight
> >> > Disambiguation.
> >>
> >> So basically split this engine into two separate ones, right?
> >
> >
> > Correct. Split this engine into two, and have a chain connecting the
> > engines.
>
> Note also that you can configure the Engines and the chain by using the
>
>    <Install-Path>{path}</Install-Path>
>
> bundle extension.
>
> The OSGi service configurations then need to be located at
>
>    {module}/src/main/resources/{path}
>
> If you add this to the module of the engines, then users would
> automatically get the default configuration as soon as they add the
> engines to Stanbol.
> As an example you can use the default configuration module of the Enhancer [1]
>
> [1]
> http://svn.apache.org/repos/asf/incubator/stanbol/trunk/enhancer/defaults/pom.xml
>
> >
> >> I have not yet had time to look at the examples in detail, but if the
> >> license of [4] allows it and you agree, we could think about making
> >> them available as part of the Stanbol Enhancer.
> >
> >
> > I could not find licensing information. I can ask the authors directly.
> >
> >
> >> This is completely true. Can you open a Jira issue for that? I
> >> will definitely help with implementing this.
> >
> >
> > Sure. Here: https://issues.apache.org/jira/browse/STANBOL-652
> > I have already implemented quite a bit of evaluation code. If you want,
> > you can use it as a starting point. My approach is to go over the
> > dataset and
> > write out a log of each annotation attempt. Based on this log, I run a
> > series of R scripts to interpret the evaluation results.
> >
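The log-then-summarize approach described above could be sketched roughly as follows. This is only an illustration in Java, not the R scripts mentioned in the thread: the class name `AnnotationLogScorer`, the tab-separated log format, and the `TP`/`FP`/`FN` outcome labels are all assumptions made for the example.

```java
import java.util.List;

/** Hypothetical helper; not part of Stanbol or DBpedia Spotlight. */
public class AnnotationLogScorer {

    /** Precision and recall aggregated over the whole log. */
    public static final class Scores {
        public final double precision;
        public final double recall;

        Scores(double precision, double recall) {
            this.precision = precision;
            this.recall = recall;
        }
    }

    /**
     * Assumed log format, one annotation attempt per line, tab-separated:
     * documentId, surfaceForm, outcome (TP, FP or FN).
     */
    public static Scores score(List<String> logLines) {
        int tp = 0, fp = 0, fn = 0;
        for (String line : logLines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split("\t");
            if (fields.length < 3) {
                throw new IllegalArgumentException("Malformed log line: " + line);
            }
            String outcome = fields[2].trim();
            switch (outcome) {
                case "TP": tp++; break;
                case "FP": fp++; break;
                case "FN": fn++; break;
                default:
                    throw new IllegalArgumentException("Unknown outcome: " + outcome);
            }
        }
        // By convention in this sketch, a score is 0.0 when its denominator is 0.
        double precision = (tp + fp) == 0 ? 0.0 : (double) tp / (tp + fp);
        double recall = (tp + fn) == 0 ? 0.0 : (double) tp / (tp + fn);
        return new Scores(precision, recall);
    }

    public static void main(String[] args) {
        // Made-up example log, only to show the call.
        List<String> log = List.of(
                "doc1\tBerlin\tTP",
                "doc1\tParis\tTP",
                "doc1\tApple\tFP",
                "doc2\tRome\tFN");
        Scores s = score(log);
        System.out.printf("precision=%.3f recall=%.3f%n", s.precision, s.recall);
    }
}
```

Keeping the per-attempt log separate from the scoring step means the same log can be re-scored for different enhancement chains without re-running the annotation.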
>
> +1
>
> BTW: do you plan to contribute your Enhancement Engines to Stanbol, or
> would you rather package/promote it via Spotlight?
>
> best
> Rupert
>
> > Cheers,
> > Pablo
> >
> >
> > On Mon, Jun 4, 2012 at 11:45 AM, Rupert Westenthaler <
> > [email protected]> wrote:
> >
> >> Hi Pablo
> >>
> >> I ran some tests and the spotting looks great. I also tried some of
> >> the different Spotting algorithms (NER, LingPipeSpotter (very slow),
> >> and Kea).
> >>
> >> Here are some Questions/Suggestions related to the engine.
> >>
> >> 1. Do you think it might make sense to allow multiple EngineInstances
> >> using different Spotting algorithms?
> >>
> >> 2. I noticed that the created TextAnnotations do not have "dc-terms:type"
> >> information. This property is used by the Stanbol Enhancement Structure
> >> to represent the "nature" (e.g. Person, Organisation, Place in the case
> >> of Named Entities). So if such information is available, it would be
> >> great to set it.
> >>
> >> 3. I would suggest adding support for the type suggestion filter
> >> feature, as shown in the 2nd example of the user manual [1]
> >>
> >> [1] http://wiki.dbpedia.org/spotlight/usersmanual#h139-10
> >>
> >> On Fri, Jun 1, 2012 at 5:37 PM, Pablo Mendes <[email protected]>
> >> wrote:
> >> > Our next step is to create an enhancement chain with two enhancement
> >> > engines: DBpedia Spotlight Spotting and DBpedia Spotlight
> >> > Disambiguation.
> >>
> >> So basically split this engine into two separate ones, right?
> >>
> >> > We have performed preliminary evaluations of the new enhancement engine
> >> > using the Stanbol Benchmark Component (SBC). The SBC allows evaluating
> >> > content enhancement engines based on examples of desired and undesired
> >> > behavior defined through Benchmark Definition Language (BDL) statements.
> >> > We have transformed the dataset from Kulkarni et al. 2009 [4] into BDL.
> >> > The BDL data set is available from:
> >> > http://spotlight.dbpedia.org/download/stanbol/
> >> >
> >>
> >> I have not yet had time to look at the examples in detail, but if the
> >> license of [4] allows it and you agree, we could think about making
> >> them available as part of the Stanbol Enhancer.
> >>
> >> > The SBC is a nice way to perform manual inspection of the behavior of
> >> > the enhancement chain for different examples in the evaluation dataset.
> >> > However, for evaluations with several hundreds of examples, it would be
> >> > interesting to have scores that summarize the performance for the entire
> >> > dataset. We are in the process of conducting large scale experiments
> >> > with existing datasets, aiming at producing precision and recall figures
> >> > for different enhancement chains.
> >> >
> >>
> >> This is completely true. Can you open a Jira issue for that? I
> >> will definitely help with implementing this.
> >>
> >> best
> >> Rupert
> >>
> >>
> >>
> >> --
> >> | Rupert Westenthaler             [email protected]
> >> | Bodenlehenstraße 11                             ++43-699-11108907
> >> | A-5500 Bischofshofen
> >>
>
>
>
> --
> | Rupert Westenthaler             [email protected]
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen
>
