Look at http://en.wikipedia.org/wiki/Louis_Vuitton and http://en.wikipedia.org/wiki/Louis_Vuitton_(designer). In this case it's unambiguous, but in the Italian Wikipedia both the designer and the company are described on the same page (which is consequently "tagged" as both <person> and <company>). While parsing a Wikipedia dump, it is quite hard for me to decide which of the two tags to assign.

The simplest strategy may be to establish an ordering, or priority, between tags (e.g. between <company> and <person>, always assign <company>). The other option is to discard the tagging, so as not to introduce errors, but in that case I lose a precious tagged sentence. If I could assign both tags to the same entity, I see an advantage: it may be better to know that a certain entity is a <person> OR a <company> (or in some cases a <person> AND a <company>) than to have only one tag or none.
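
To make the three options concrete, here is a minimal Python sketch (not OpenNLP code). The names PRIORITY, resolve_tags and annotate are hypothetical, and the combined label "company_or_person" is only an assumption about how a multi-tag could be encoded as a single name type; the <START:type> ... <END> markup follows the name finder training format.

```python
from typing import List, Optional, Set

# Hypothetical priority order: earlier tags win when an entity is ambiguous.
PRIORITY = ["company", "person"]


def resolve_tags(tags: Set[str], strategy: str = "priority") -> Optional[str]:
    """Reduce the candidate tags of one entity to a single label (or None)."""
    if not tags:
        return None
    if len(tags) == 1:
        return next(iter(tags))
    if strategy == "priority":
        # Pick the tag listed first in PRIORITY; unknown tags rank last.
        def rank(tag: str):
            position = PRIORITY.index(tag) if tag in PRIORITY else len(PRIORITY)
            return (position, tag)
        return min(tags, key=rank)
    if strategy == "discard":
        # Drop the annotation entirely rather than risk a wrong tag.
        return None
    if strategy == "combine":
        # Assumption: encode "person OR company" as one synthetic label.
        return "_or_".join(sorted(tags))
    raise ValueError(f"unknown strategy: {strategy}")


def annotate(tokens: List[str], start: int, end: int, label: Optional[str]) -> str:
    """Wrap tokens[start:end] in <START:label> ... <END> markup."""
    if label is None:
        return " ".join(tokens)
    marked = tokens[:start] + [f"<START:{label}>"] + tokens[start:end] + ["<END>"] + tokens[end:]
    return " ".join(marked)


if __name__ == "__main__":
    tokens = ["Louis", "Vuitton", "opened", "a", "shop"]
    for strategy in ("priority", "discard", "combine"):
        label = resolve_tags({"person", "company"}, strategy)
        print(strategy, "->", annotate(tokens, 0, 2, label))
```

With the "combine" strategy the model would simply see a third name type, so the information "person OR company" is kept at the cost of one more class to learn.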

However, I wasn't interested specifically in Louis Vuitton, but in discussing multi-tagging of an entity in general. Imagine also examples like a dolphin, which may end up tagged as a <fish> and a <mammal> (set intersection), or a cat, which is an <animal> and a <felinae> (a subset). These cases, however, could be resolved with some ontological inference.
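
As a rough illustration of that kind of inference, the sketch below keeps only the most specific tags of an entity, given a tag hierarchy. The PARENT table is a hypothetical toy ontology made up for this example, not something extracted from Wikipedia.

```python
from typing import Dict, Optional, Set

# Hypothetical toy ontology: each tag maps to its parent (None for a root).
PARENT: Dict[str, Optional[str]] = {
    "animal": None,
    "mammal": "animal",
    "fish": "animal",
    "felinae": "mammal",
}


def ancestors(tag: str) -> Set[str]:
    """Return every tag above `tag` in the hierarchy."""
    found: Set[str] = set()
    current = PARENT.get(tag)
    while current is not None:
        found.add(current)
        current = PARENT.get(current)
    return found


def most_specific(tags: Set[str]) -> Set[str]:
    """Drop every tag that is already implied by a more specific tag."""
    implied: Set[str] = set()
    for tag in tags:
        implied |= ancestors(tag)
    return tags - implied


if __name__ == "__main__":
    print(most_specific({"animal", "felinae"}))  # subset case
    print(most_specific({"fish", "mammal"}))     # sibling tags
```

The subset case collapses to a single tag (<felinae> implies <animal>), while sibling tags such as <fish> and <mammal> both survive, so that kind of conflict would still need a priority rule or a multi-tag.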

Riccardo


On 12/01/2012 04:15, James Kosin wrote:
On 1/9/2012 8:43 AM, Jörn Kottmann wrote:
On 1/9/12 2:37 PM, Riccardo Tasso wrote:
Hi all,
     does it make sense using the Name Finder module with multiple
tags/entities?

My use case is the following: I have an ambiguous training set (in my
case extracted from wikipedia). For example for "Louis Vuitton" I
can't easily decide if it occurs as Company or as Person. However I
think it is better recognize that the entity found is a Person OR a
Company than not recognizing it at all.

Is it currently possible with openNLP?
That sounds like a very rare case to me, maybe just label it as one of
the both (in this case maybe as company),
then it will at least be detected?

Jörn

It may be better to label as a Company.  Louis Vuitton may not really
exist as a person.  I can't seem to find anything that suggests he is a
real person.

James
