Hi Arantxa,

This is indeed a surprising result. Did you ensure that the decoding of the
entities matches? We had a bug in the Spanish version that messed up some
of the redirect resolution; I am updating the demo today. Would you mind
re-running the test?
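
To illustrate the kind of mismatch I mean: if one side uses percent-encoded
URIs or unresolved redirects and the other does not, correct answers get
scored as wrong. Below is a minimal Python sketch of normalizing both sides
before comparing. The function name `normalize_entity` and the `redirects`
dictionary are hypothetical and not part of Spotlight; the sketch assumes
you have a redirect mapping available from elsewhere.

```python
from urllib.parse import unquote


def normalize_entity(uri, redirects=None):
    """Return a canonical entity name for comparison (illustrative only)."""
    redirects = redirects or {}
    # Drop the namespace prefix if present, so full URIs and bare names compare equal.
    name = uri.split("/resource/", 1)[-1]
    # Percent-decode (e.g. %C3%B1 -> ñ) and use underscores instead of spaces.
    name = unquote(name).replace(" ", "_")
    # Follow redirects until a final target is reached, guarding against cycles.
    seen = set()
    while name in redirects and name not in seen:
        seen.add(name)
        name = redirects[name]
    return name


# Hypothetical redirect table, for illustration only.
example_redirects = {"EEUU": "Estados_Unidos"}
system_answer = normalize_entity("http://es.dbpedia.org/resource/EEUU", example_redirects)
gold_answer = normalize_entity("Estados Unidos")
print(system_answer == gold_answer)
```

Applying the same normalization to both the system output and the gold
standard rules out encoding and redirect differences as the cause.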

That said, these results are not entirely impossible. Spotlight mostly uses
observed surface forms to resolve entities, so in your case the English data
may simply contain many more of the surface form -> entity mappings that are
relevant for your Spanish text. That does sound rather unlikely, though, and
it might well be a bug.

Best,
Jo


On Tue, Apr 1, 2014 at 3:24 PM, Arantxa Otegi <[email protected]> wrote:

>  Dear all,
>
> Thanks for the great work on Spotlight!
>
> We have been evaluating DBpedia Spotlight (statistical v0.6) on the
> Spanish TAC KBP 2012 dataset and, to our surprise, found that running
> Spotlight with the English model produced much better results than with
> the Spanish model.
> Here is a breakdown of the results for the non-NIL entities:
>
>     recall  prec.  F1
> ES  25.57   67.05  37.02
> EN  58.50   81.82  68.22
>
> Note that the English model returns a non-NIL result for 71.51% of the
> target occurrences, while the Spanish model only 38.14%.
>
> We wonder how it is possible that a model built on the contexts of
> occurrence for English can produce better results on Spanish text.
> We would be grateful for any hints!
>
> Best,
>
> Arantxa, Ander, Eneko, Jokin, Aitor
>
>
> ------------------------------------------------------------------------------
>
> _______________________________________________
> Dbp-spotlight-users mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users
>
>