Hi Rupert,

thanks for your feedback! I would like to answer points 2 and 3 to save Pablo some time.

2) This is indeed a bug in my implementation, and it will be fixed. Thanks for pointing it out.

3) The "Types Restriction" parameter in the engine configuration is responsible for that. I'm pretty sure the bug mentioned in 2) currently prevents this functionality from working. It will be fixed as well.

Best regards,
Iavor

On 04.06.2012 11:45, Rupert Westenthaler wrote:
> Hi Pablo
>
> I made some tests and the spotting looks great. I also tried some of
> the different Spotting algorithms (NER, LingPipeSpotter (very slow)
> and Kea).
>
> Here are some questions/suggestions related to the engine.
>
> 1. Do you think it might make sense to allow multiple EngineInstances
> using different Spotting algorithms?
>
> 2. I noticed that the created TextAnnotations do not have "dc-terms:type"
> information. This property is used to represent the "nature" (e.g.
> Person, Organisation, Place in case of Named Entities) by the
> Stanbol Enhancement Structure. So if such information is available it
> would be great to set it.
>
> 3. I would suggest adding support for the type suggestion filter
> feature as shown in the 2nd example of the user manual [1]
>
> [1] http://wiki.dbpedia.org/spotlight/usersmanual#h139-10
>
> On Fri, Jun 1, 2012 at 5:37 PM, Pablo Mendes <[email protected]> wrote:
>> Our next step is to create an enhancement chain with two enhancement
>> engines: DBpedia Spotlight Spotting and DBpedia Spotlight Disambiguation.
>
> So basically to split this engine into two separate ones, right?
>
>> We have performed preliminary evaluations of the new enhancement engine
>> using the Stanbol Benchmark Component (SBC). The SBC allows evaluating
>> content enhancement engines based on examples of desired and undesired
>> behavior defined through Benchmark Definition Language (BDL) statements. We
>> have transformed the dataset from Kulkarni et al. 2009 [4] into BDL. The
>> BDL data set is available from:
>> http://spotlight.dbpedia.org/download/stanbol/
>>
>
> I have not yet had time to look at the examples in detail, but if the
> license of [4] allows it and you agree, we could think about making them
> available as part of the Stanbol Enhancer.
>
>> The SBC is a nice way to perform manual inspection of the behavior of the
>> enhancement chain for different examples in the evaluation dataset.
>> However, for evaluations with several hundreds of examples, it would be
>> interesting to have scores that summarize the performance for the entire
>> dataset. We are in the process of conducting large scale experiments with
>> existing datasets, aiming at producing precision and recall figures for
>> different enhancement chains.
>>
>
> This is completely true. Can you open a Jira issue for that? I
> will definitely help with implementing this.
>
> best
> Rupert
