The problem could be in any of several components:
- name spotting: perhaps a bug in handling accented characters?
- candidate mapping (associations between surface forms and URIs), as Jo said
- stored context (the words around each URI that end up in our store/index),
which is used for disambiguation
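
To rule out the accent hypothesis, a quick check along these lines might
help. It is only a sketch: the candidate map, the example URI and the
function names are made up for illustration and are not Spotlight's actual
API. The idea is that the same name can be encoded with a precomposed
character or with a base letter plus a combining accent, and a lookup will
miss if the text and the index disagree on the form.

```python
import unicodedata


def normalize_surface_form(text: str) -> str:
    """Return the NFC form so composed and decomposed accents compare equal."""
    return unicodedata.normalize("NFC", text)


# Hypothetical candidate map (surface form -> candidate URIs), keyed on NFC.
candidate_map = {
    normalize_surface_form("Bogot\u00e1"): ["http://es.dbpedia.org/resource/Bogot\u00e1"],
}


def lookup(surface_form: str) -> list:
    """Look up candidates after normalizing the incoming surface form."""
    return candidate_map.get(normalize_surface_form(surface_form), [])


composed = "Bogot\u00e1"      # 'a with acute' as a single code point
decomposed = "Bogota\u0301"   # plain 'a' followed by a combining acute accent

# The two strings render identically but are not equal as raw strings,
# so an unnormalized dictionary lookup misses the decomposed variant.
raw_hit = candidate_map.get(decomposed)

# With normalization on both sides, both variants find the same candidates.
normalized_hit = lookup(decomposed)
```

If the spotter finds noticeably fewer accented names than unaccented ones
in your Spanish test set, a mismatch of this kind would be a plausible
suspect.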

Your results are not entirely counter-intuitive to me. The Spanish
Wikipedia is smaller, so less context (fewer words) may be available for each
entity. The English Wikipedia has more words, and some of those words are
entity names, which often have the same spelling across languages. Entity
names are high-quality context words, so overall the English context may
still be richer than the Spanish, even for Spanish text.

Please let us know if updating the models solved the issue.


On Wed, Apr 16, 2014 at 1:05 AM, Joachim Daiber <[email protected]> wrote:

> Hi there,
>
> the models at
>
> http://spotlight.sztaki.hu/downloads/
>
> should be the same as the demo (updated Mar 23). If you want to be sure,
> you can compare with the folder http://spotlight.sztaki.hu/downloads/demo,
> which contains what the demo is running.
>
> Jo
>
>
> On Wed, Apr 16, 2014 at 9:56 AM, Arantxa Otegi <[email protected]> wrote:
>
>>  Hi Joachim,
>>
>>  this is indeed a surprising result. Did you ensure that the decoding of
>> the entities matches? We had a bug in the Spanish version that messed up
>> some of the redirect resolution, I am updating the demo today. Would you
>> mind re-running the test?
>>
>>
>> We aren't using the online demo for our experiments, we use our own
>> server. I guess when you updated the demo, you changed the Spanish model.
>> Which is the Spanish model that uses the demo? Can we get the newest
>> Spanish model to re-run the experiments in our own server?
>>
>> Thanks!
>>
>>
>>
>> Arantxa
>>
>>
>>
>> On Tue, Apr 1, 2014 at 3:24 PM, Arantxa Otegi <[email protected]> wrote:
>>
>>>  Dear all,
>>>
>>> Thanks for the great work on Spotlight!
>>>
>>> We have been evaluating DBpedia Spotlight (statistical V0.6) on the
>>> Spanish TAC KBP 2012 dataset and, to our surprise, found that running
>>> Spotlight with the English model produced much better results than with
>>> the Spanish model.
>>> Here is a breakdown of the results for the non-NIL entities:
>>>
>>>     recall  prec.  F1
>>> ES  25.57   67.05  37.02
>>> EN  58.50   81.82  68.22
>>>
>>> Note that the English model returns a non-NIL result for 71.51% of the
>>> target occurrences, while the Spanish model does so for only 38.14%.
>>>
>>> We wonder how it is possible that a model built on English contexts of
>>> occurrence can produce better results on Spanish text.
>>> We would be grateful for any hints!
>>>
>>> Best,
>>>
>>> Arantxa, Ander, Eneko, Jokin, Aitor
>>>
>>>
>>> ------------------------------------------------------------------------------
>>>
>>> _______________________________________________
>>> Dbp-spotlight-users mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users
>>>
>>>
>>
>>
>
>
> ------------------------------------------------------------------------------
> Learn Graph Databases - Download FREE O'Reilly Book
> "Graph Databases" is the definitive new guide to graph databases and their
> applications. Written by three acclaimed leaders in the field,
> this first edition is now available. Download your free book today!
> http://p.sf.net/sfu/NeoTech
> _______________________________________________
> Dbp-spotlight-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users
>
>


-- 

Pablo N. Mendes
http://pablomendes.com
