Hi Maatari,
On Mon, Sep 22, 2014 at 8:22 AM, Maatari Daniel Okouya
<[email protected]> wrote:
> I’m a bit confused about few concept. Could someone clarify them a bit.
>
>
> When it comes to assigning some topics to a content resource, what would be
> the difference between entity linking and categorization ?
>
First, let's explain the terminology as used by Stanbol. For that I will
use one of today's headlines:
"Lewis Hamilton not thinking about title after winning Singapore GP"
Named Entity Recognition: Detects mentions of Entity types within the
text, typically Persons, Organizations and Locations.
* Lewis Hamilton -> person
* Singapore -> location
Entity Linking: Detects mentions of known Entities within the processed Text
* Lewis Hamilton -> http://en.wikipedia.org/wiki/Lewis_Hamilton
* Singapore Grand Prix -> http://en.wikipedia.org/wiki/Singapore_Grand_Prix
Categorization: Assigns the content to a fixed set of categories.
Categories might be hierarchical. A typical example is the IPTC Media
Topics vocabulary [1], which I will use for this example.
* sport -> http://cv.iptc.org/newscodes/mediatopic/15000000
* Formula One -> http://cv.iptc.org/newscodes/mediatopic/20000994
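The important point is that Entity Linking requires an actual mention
of the Entity in the text, while categories do not depend on such
mentions. The headline above never contains the words "sport" or
"Formula One", yet both categories apply.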
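
To make the distinction concrete, here is a minimal Python sketch of the
three kinds of results for the headline above. It is not Stanbol's API:
the Annotation class, its field names and the "kind" labels are made up
for illustration, and I assume "Singapore GP" is the surface form that
gets linked to the Singapore Grand Prix. It only shows that NER and
Entity Linking results point at a span of the text, while categories
carry no span at all.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

HEADLINE = "Lewis Hamilton not thinking about title after winning Singapore GP"


@dataclass
class Annotation:
    """Hypothetical container for one analysis result (not a Stanbol class)."""
    kind: str                      # "ner", "entity-link" or "category"
    value: str                     # entity type, entity URI or category URI
    mention: Optional[str] = None  # surface form in the text; None for categories

    def span(self, text: str) -> Optional[Tuple[int, int]]:
        """Return (start, end) of the mention in text, or None if there is none."""
        if self.mention is None:
            return None
        start = text.find(self.mention)
        return None if start < 0 else (start, start + len(self.mention))


RESULTS = [
    # Named Entity Recognition: a mention plus an entity type
    Annotation("ner", "person", "Lewis Hamilton"),
    Annotation("ner", "location", "Singapore"),
    # Entity Linking: a mention plus a known entity
    Annotation("entity-link", "http://en.wikipedia.org/wiki/Lewis_Hamilton",
               "Lewis Hamilton"),
    Annotation("entity-link", "http://en.wikipedia.org/wiki/Singapore_Grand_Prix",
               "Singapore GP"),  # assumed surface form
    # Categorization: applies to the whole content, no mention needed
    Annotation("category", "http://cv.iptc.org/newscodes/mediatopic/15000000"),
    Annotation("category", "http://cv.iptc.org/newscodes/mediatopic/20000994"),
]

if __name__ == "__main__":
    for a in RESULTS:
        print(a.kind, a.value, a.span(HEADLINE))
```

Running it prints a character span for every "ner" and "entity-link"
entry and None for the two categories.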
> What I see as of now, within some tools well established is the
> classification part. Usually it makes use of a control vocabulary to classify
> the content. Output = resource dc:Subject controledVocabularyTerm
>
> However, what i also see in the description of content resource online within
> some authority website is to link the document to external non skos resource
> via for instance the Foaf:Topic.
>
> In that second case, do we have both an entity linking and a classification ?
> or is it that both are the same, it is just that the knowledge base change,
> from external source to controlled vocabulary. Which would mean that in the
> world of linked data, content classification / categorization include entity
> linking? In that case i would say that, the same was happening when linking
> to a controlled vocabulary term.
>
IMO the properties used to represent analysis results do not
necessarily indicate whether the results express linked entities or
categorizations. Based on their definitions, both dc:subject and
foaf:topic should be used for categories.
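
As a rough illustration of why the property alone is not a reliable
signal, the following Python sketch classifies statements by looking at
the vocabulary of the object instead of the predicate. The document URI,
the kind_of() helper and the rule "everything under the IPTC Media Topic
namespace is a category" are assumptions made for this example, not
something Stanbol does.

```python
DC_SUBJECT = "http://purl.org/dc/elements/1.1/subject"
FOAF_TOPIC = "http://xmlns.com/foaf/0.1/topic"
IPTC_PREFIX = "http://cv.iptc.org/newscodes/mediatopic/"

DOC = "http://example.org/article/1"  # hypothetical document URI


def triple(subject: str, predicate: str, obj: str) -> str:
    """Format one statement as an N-Triples line (URIs only)."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def kind_of(obj_uri: str) -> str:
    """Guess the kind of result from the object's vocabulary (assumed rule)."""
    return "category" if obj_uri.startswith(IPTC_PREFIX) else "entity"


# Data found in the wild may use either property for either kind of target.
STATEMENTS = [
    (DOC, DC_SUBJECT, IPTC_PREFIX + "15000000"),
    (DOC, FOAF_TOPIC, IPTC_PREFIX + "20000994"),
    (DOC, FOAF_TOPIC, "http://en.wikipedia.org/wiki/Lewis_Hamilton"),
    (DOC, DC_SUBJECT, "http://en.wikipedia.org/wiki/Singapore_Grand_Prix"),
]

if __name__ == "__main__":
    for s, p, o in STATEMENTS:
        print(kind_of(o), triple(s, p, o))
```

Here the same predicate shows up with both kinds of objects, so a
consumer has to know which vocabulary the object comes from.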
>
> I'm little confused here. If someone, could clarify these notion i would
> appreciate.
Hope this helps.
Best,
Rupert
[1] http://cv.iptc.org/newscodes/mediatopic
--
| Rupert Westenthaler [email protected]
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO
..........................................................................
| http://redlink.co/