Hi all,
First of all, I want to introduce myself. I'm Rafa Haro from Spain and I
just joined the mailing list. I'm currently working on integrating
Stanbol into Alfresco and at the same time I'm doing research on Entity
Linking for my PhD. By coincidence, the first emails I have received
are about this field :-).
As has been noted, Entity Disambiguation is a challenging task.
There are some simple approaches that usually don't work well with
complex documents. In response to Fabian's suggestion regarding a
scientific network in this field, you should take a look at the Entity
Linking task in the Knowledge Base Population (KBP) track at the NIST
TAC conference: http://www.nist.gov/tac/2012/KBP/index.html
This year is the fourth edition. You might be interested in taking a look
at the best proposals from the last three years. We are participating this year.
I would be glad to get involved in bringing Entity Disambiguation to
Stanbol and to collaborate on the project in general. Is that possible?
Regards
On 23/04/12 16:04, Pablo Mendes wrote:
Hi all,
I think you should start with a really simple solution for this and then
improve this first simple algorithm.
This was exactly the approach taken by the DBpedia Spotlight project. We
have built a few entity linkers (a.k.a. disambiguators) based on Lucene
first, and started incrementally making them more sophisticated. If you are
a fan of not repeating work, please feel free to look at what we've done.
http://spotlight.dbpedia.org
Our disambiguators will be integrated as EnhancementEngines in Stanbol
within the next couple of months.
If you're a fan of reimplementing things to make them better, I'd say you
should look elsewhere. There are some interesting approaches out there that
have not been open sourced, but that have papers describing their
algorithms. Implementing them would probably be more beneficial for the
community than reimplementing what we did.
Cheers,
Pablo
On Mon, Apr 23, 2012 at 3:56 PM, kritarth anand <[email protected]> wrote:
Thanks a lot, Fabian, for your inputs. I'll definitely incorporate them into my
proposal.
On Mon, Apr 23, 2012 at 7:23 PM, Fabian Christ <[email protected]> wrote:
Hi Kritarth,
I have read your proposal and building such a disambiguation engine is
a challenging task. Here are some thoughts:
- Did you think about restricting the domain, or the kind of text
that this engine would/should work best for? It is often the case that
you cannot implement a single engine that always works well. So
maybe you should think a little bit about the kind of content that you
would like to support.
- Do you have access to any scientific network? Perhaps looking in the
scientific world for published papers about entity disambiguation may
give you some ideas and would widen your view on the problem.
- I think you should start with a really simple solution for this and
then improve this first simple algorithm. Having a simple, trivial
solution gives you a baseline to compare against. Sometimes it
happens that the advanced algorithms are not any better than the
trivial ones. So try it ;)
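To make the last point concrete, here is a rough sketch of what such a trivial baseline could look like: pick the candidate entity whose description shares the most words with the text around the mention. This is only an illustration. The function names, the entity ids, and the toy candidate descriptions are all made up for this example and are not part of Stanbol or any existing engine.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def disambiguate(mention_context, candidates):
    """Return the id of the candidate sharing the most words with the context.

    candidates: dict mapping a (hypothetical) entity id to a short description.
    Ties are broken by the order in which the candidates were given.
    """
    context_tokens = tokenize(mention_context)
    best_id, best_score = None, -1
    for entity_id, description in candidates.items():
        # Score = number of distinct words shared by context and description.
        score = len(context_tokens & tokenize(description))
        if score > best_score:
            best_id, best_score = entity_id, score
    return best_id


# Toy, hand-written candidates for the ambiguous mention "Paris".
candidates = {
    "Paris_(France)": "capital city of France on the river Seine",
    "Paris_(Texas)": "city in Lamar County Texas United States",
}
result = disambiguate("Paris is the capital of France", candidates)
print(result)
```

Something this simple will fail on many real documents, but it gives a number to beat when a more advanced algorithm is evaluated on the same data.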
Best,
- Fabian
On 18 April 2012 11:02, kritarth anand <[email protected]> wrote:
Hi guys,
Hope you're doing well. I was advised by my supervisor, Dr. Rupert, that to
interest people in my application I should provide a short summary of my
proposal. Please do have a look at it below, in case you find it
interesting or would like to suggest something. You may
read the entire documents
My proposal is Entity Disambiguation as an Enhancement Engine in Stanbol.
You can have a look at its JIRA page,
https://issues.apache.org/jira/browse/STANBOL-223 . I propose to build it
over the summer as part of Google Summer of Code. Any advice from you guys
is most welcome.
Kritarth
On Tue, Apr 17, 2012 at 8:36 PM, kritarth anand <[email protected]> wrote:
Hi Guys,
Hope you're doing well. Please do take a few minutes to have a look at my
proposal. Your feedback is extremely valuable to me.
Kritarth
On Mon, Apr 16, 2012 at 12:23 AM, kritarth anand <[email protected]> wrote:
Dear Fabian,
Thanks for pointing it out.
@All
I have attached PDF versions of my proposal and background info to
this mail. You may also find the proposal in this Google Document:
https://docs.google.com/document/d/1BA0x9craA2kiFn0jM-66HSS7SFCk5Q5U5gyEWaRftIk/edit
It is editable, so you can add comments directly there and also
build on someone else's advice. You can always mail me as well.
Kritarth Anand
On Mon, Apr 16, 2012 at 12:13 AM, Fabian Christ <[email protected]> wrote:
Hi Kritarth,
and welcome to Stanbol. Could you share the proposal in any open
format like PDF, HTML, plain text or via a URL? Not all of us have
access to the newest M$ office suite.
Thanks, and looking forward to your contribution!
Best,
- Fabian
On 15 April 2012 10:21, kritarth anand <[email protected]> wrote:
Hi,
I would like to convey my warm greetings to the entire Stanbol
community. My name is Kritarth Anand. I study Computer Science at the
Indian Institute of Technology Delhi. I am a potential candidate working
on "Entity disambiguation in Stanbol enhancement engines" as part of
Google Summer of Code. If I am successful, I'll be coordinating with you guys.
I am writing to you all to request some feedback on my proposal, which I
have given below. You might be able to give me valuable suggestions to
improve my proposal, incorporate details, omit unnecessary ones and make
the timeline I have suggested more realistic.
Please feel free to discuss any matters whenever you like. I have
attached two documents to this mail. One of the two is the proposal
and the other gives a few details about my background.
Kritarth Anand
www.cse.iitd.ac.in/~cs5080213 <http://www.cse.iitd.ac.in/%7Ecs5080213>
--
Fabian
http://twitter.com/fctwitt