Re: Entity Disambiguation: Midterm

Rupert Westenthaler Thu, 09 Aug 2012 13:06:38 -0700

Hi,

Stanbol currently assumes very little about vocabularies. Basically you
only need a URI and a label to get an Entity suggested.


If you want to do some kind of disambiguation you will clearly need
more information about Entities.

Here the question is what kind of information the "Spotlight approach"
needs. AFAIK this approach is based on "surface forms" (labels used
to refer to an Entity) and "mentions" (sentences that mention an
Entity). Kritarth, please correct me if I get this wrong. But if this is
correct, users would need to provide "mentions" to be able to use
DBpedia Spotlight-like disambiguation.
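
To make the data requirement concrete, here is a minimal sketch in Python.
It is not DBpedia Spotlight's actual algorithm: a plain bag-of-words cosine
similarity stands in for its scoring, and the ENTITIES vocabulary, the
example.org URIs, and the function names are all hypothetical. It only shows
that each Entity needs surface forms to be found and mentions to be ranked.

```python
import math
import re
from collections import Counter

# Hypothetical toy vocabulary: every entity carries surface forms (labels)
# and mentions (example sentences that refer to it).
ENTITIES = {
    "http://example.org/Paris_France": {
        "surface_forms": ["Paris"],
        "mentions": ["Paris is the capital of France on the Seine"],
    },
    "http://example.org/Paris_Texas": {
        "surface_forms": ["Paris"],
        "mentions": ["Paris is a city in Lamar County Texas"],
    },
}


def tokens(text):
    """Lower-cased bag of words for a piece of text."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bags of words."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def disambiguate(surface_form, sentence):
    """Rank the entities sharing a surface form by how closely their
    mentions resemble the sentence the surface form was found in."""
    context = tokens(sentence)
    scored = []
    for uri, data in ENTITIES.items():
        if surface_form in data["surface_forms"]:
            mention_bag = tokens(" ".join(data["mentions"]))
            scored.append((cosine(context, mention_bag), uri))
    return sorted(scored, reverse=True)
```

A vocabulary that only provides URIs and labels could fill "surface_forms"
but would leave "mentions" empty, so every candidate would score 0.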

I think another rather typical kind of information would be the "semantic
context": other entities referenced by an Entity. Based on that one
can also do disambiguation (e.g. Solr MLT over the labels of the
semantic context with the labels of the current sentence; or MLT over
the URIs of the semantic context with the URIs of other extracted Entities
in the current sentence, text section, or the whole document).
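
The URI-based variant could be sketched roughly as follows. This is an
assumption-laden illustration, not Solr MLT: a simple Jaccard set overlap
stands in for the MoreLikeThis scoring, and the SEMANTIC_CONTEXT table, the
example.org URIs, and the function name are hypothetical.

```python
# Hypothetical semantic context: for each candidate entity, the URIs of the
# other entities it references in the vocabulary.
SEMANTIC_CONTEXT = {
    "http://example.org/Paris_France": {
        "http://example.org/France",
        "http://example.org/Seine",
    },
    "http://example.org/Paris_Texas": {
        "http://example.org/Texas",
        "http://example.org/Lamar_County",
    },
}


def rank_by_semantic_context(candidates, extracted_uris):
    """Rank candidate URIs by the Jaccard overlap between their semantic
    context and the URIs of the other entities extracted from the text."""
    extracted = set(extracted_uris)
    scored = []
    for uri in candidates:
        context = SEMANTIC_CONTEXT.get(uri, set())
        union = context | extracted
        score = len(context & extracted) / len(union) if union else 0.0
        scored.append((score, uri))
    return sorted(scored, reverse=True)
```

The extracted_uris argument can be collected from the current sentence, the
text section, or the whole document; a wider scope gives more evidence but
also more noise.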

best
Rupert

On Thu, Aug 9, 2012 at 7:27 PM, kritarth anand <[email protected]> wrote:
> I was not sure whether the Spotlight approach would work for all kinds of
> vocabularies that Stanbol might have.
>
> I was concerned that the vocabulary structure it assumes is satisfied by
> DBpedia but might not be satisfied by a custom vocabulary in some other
> deployment.
>
> On Thu, Aug 9, 2012 at 10:51 PM, Anuj Kumar <[email protected]> wrote:
>
>> Hi Kritarth,
>>
>> Thanks for the explanation. The Spotlight approach sounds good to me, but
>> if you have time, it would be good to compare it with the other two for
>> the purpose of this study.
>>
>> On the third point, I am still not clear. Do you want to convey that
>> Spotlight's disambiguation algorithm can work only with DBpedia?
>>
>> Regards,
>> Anuj
>>
>> On Thu, Aug 9, 2012 at 8:18 PM, kritarth anand <[email protected]> wrote:
>>
>> > Dear Anuj,
>> >
>> > Sorry for the delayed reply.
>> >
>> > 1. In the current implementation of Stanbol, what we see is essentially:
>> >       a. We find all the entities in the given paragraph.
>> >       b. For each entity, we query DBpedia, passing a string of the
>> >          other entities as additional info.
>> >       c. We then adjust the confidence values.
>> >
>> > 3. I'll answer this one first. I am not very sure what Stanbol expects
>> > from a vocabulary. None of the other papers I read made any assumptions
>> > about the vocabulary; they mainly used Wikipedia. I was confused about
>> > whether that meant more flexibility. After discussing with Pablo and
>> > Rupert, I think it is a way to go.
>> >
>> > 2. I am inclined towards using the Spotlight approach, as it seems to be
>> > better than the other two, and I would like comments from you guys on
>> > whether it is a good way to proceed.
>> >
>> > Kritarth
>> >
>> >
>> > On Sun, Jul 29, 2012 at 11:29 AM, Anuj Kumar <[email protected]> wrote:
>> >
>> > > Hi Kritarth,
>> > >
>> > > Thanks for sharing the details. I have a few questions:
>> > >
>> > > 1. Can you elaborate on the current implementation? Is it using the
>> > > existing MLT feature?
>> > > 2. Which one of the three algorithms are you planning to use?
>> > > 3. On the Spotlight part, can you explain more about why you say "I am
>> > > not sure if we can play around that much with any vocabulary and not
>> > > just DBpedia."?
>> > >
>> > > Also, there is a minor typo in the report under the Approach section:
>> > > "Yhe behavior can be explained as follows:"
>> > >
>> > > Thanks,
>> > > Anuj
>> > >
>> > > On Wed, Jul 25, 2012 at 3:20 PM, kritarth anand <[email protected]> wrote:
>> > >
>> > > > Hi all,
>> > > >
>> > > > I would like to start more interaction with the Stanbol community by
>> > > > sharing the first iteration of the Entity Disambiguation Engine. I
>> > > > would really like you all to take a look at it and give me your
>> > > > valuable opinion.
>> > > >
>> > > > https://github.com/kritarthanand/Disambiguation-Stanbol
>> > > >
>> > > > The repo consists of the engine's code. It is very easy to install;
>> > > > the instructions are in the Readme file.
>> > > >
>> > > > Besides the engine, it also contains my Mid Term Report, which
>> > > > describes the engine a little and also talks about possible future
>> > > > algorithms that can be used for Entity Disambiguation. Disambiguation
>> > > > is a complex problem, and we should have an approach that is efficient
>> > > > and performs well too. Therefore I would really like the Stanbol
>> > > > community to take part in the discussion with enthusiasm.
>> > > >
>> > > > Please share your views,
>> > > >
>> > > >
>> > > > Kritarth
>> > > >
>> > >
>> >
>>



-- 
| Rupert Westenthaler             [email protected]
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen
