I agree with you. WSD should be included in OpenNLP once it performs
reasonably well.
On the other hand, I have seen few libraries or APIs doing WSD and almost
none doing it right. That may be indicative of how hard the problem is.

The only promising API I found is Babelfy: http://babelfy.org/about. It
uses a graph-based model built on their BabelNet knowledge base to
predict word senses. I think it's based on this paper:
http://www.aclweb.org/anthology/Q14-1019. Any thoughts on this?
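
To make the Lesk baseline from the quoted message below concrete, here is a
minimal sketch of the simplified Lesk variant (pick the sense whose gloss
shares the most words with the context). It is not the code from the sandbox
module, and it is not the original Lesk algorithm, which compares glosses of
neighbouring words against each other. The sense ids, glosses, and stopword
list are made up for illustration; a real setup would take glosses from
WordNet.

```python
import re

# Hypothetical toy sense inventory: word -> {sense id -> gloss}.
# A real system would load these glosses from WordNet or another dictionary.
TOY_SENSES = {
    "bank": {
        "bank.n.01": "sloping land beside a body of water such as a river",
        "bank.n.02": "a financial institution that accepts deposits and lends money",
    }
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "the", "of", "to", "and", "that", "as", "such",
             "in", "on", "at", "by", "i"}


def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stopwords; returns a set."""
    return {t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS}


def simplified_lesk(word, context, senses=TOY_SENSES):
    """Return the sense id whose gloss shares the most words with the context.

    Ties (including zero overlap) fall back to the first listed sense.
    """
    context_tokens = tokenize(context) - {word}
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses[word].items():
        overlap = len(context_tokens & tokenize(gloss))
        if overlap > best_overlap:  # strict '>' keeps the earliest sense on ties
            best_sense, best_overlap = sense_id, overlap
    return best_sense


if __name__ == "__main__":
    print(simplified_lesk("bank", "I deposited money at the bank"))
    print(simplified_lesk("bank", "We sat on the bank of the river"))
```

If the senses are listed in frequency order, the tie-breaking rule behaves like
a most-frequent-sense fallback whenever the context gives no overlap.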

On Sat, Feb 24, 2018 at 7:49 PM, Anthony Beylerian <
anthony.beyler...@gmail.com> wrote:

> Hey Cristian,
>
> We have tried different approaches such as:
>
> - Lesk (original) [1]
> - Most frequent sense from the data (MFS)
> - Extended Lesk (with different scoring functions)
> - It Makes Sense (IMS) [2]
> - A sense clustering approach (I don't immediately recall the reference)
>
> Lesk and MFS are meant to be used as baselines for evaluation purposes only.
> The extended version of Lesk is an effort to improve the original, through
> additional information from semantic relationships.
> Although it's not very accurate, it could be useful since it is an
> unsupervised method (no need for large training data).
> However, there were some caveats, as both approaches need to pre-load
> dictionaries as well as score a semantic graph from WordNet at runtime.
>
> IMS is a supervised method which we were hoping to mainly use, since it
> scored around 80% accuracy on SemEval. That figure is only for the
> coarse-grained case, though. In reality words have various degrees of
> polysemy, and when tested in the fine-grained case the results were much
> lower.
> We have also experimented with a simple clustering approach but the
> improvements were not considerable as far as I remember.
>
> I just checked the latest results on SemEval-2015 [3] and they look a bit
> better for the fine-grained case (~65% F1).
> However, in some particular domains it looks like the accuracy increases,
> so it could depend on the use case.
>
> On the other hand, there could be some more recent studies that could yield
> better results, but that would need some more investigation.
>
> There are also some other issues such as lack of direct multi-lingual
> support from WordNet, missing sense definitions etc.
> We were also still looking for a better source of sense definitions back
> then.
> In any case, I believe it would be better to have higher performance before
> putting this in the official distribution, however that highly depends on
> the team.
> Otherwise, different parts of the code just need some simple refactoring as
> well.
>
> Best,
>
> Anthony
>
> [1] : M. Lesk, Automatic sense disambiguation using machine readable
> dictionaries
> [2] : https://www.comp.nus.edu.sg/~nght/pubs/ims.pdf
> [3] : http://alt.qcri.org/semeval2015/task13/index.php?id=results
>
> On Wed, Feb 21, 2018 at 5:26 AM, Cristian Petroaca <
> cristian.petro...@gmail.com> wrote:
>
> > Hi Anthony,
> >
> > I'd be interested to discuss this further.
> > What are the WSD methods used? Any links to papers?
> > How does the module perform when being evaluated against Senseval?
> >
> > How much work do you think is necessary in order to have a functioning
> > WSD module in the context of OpenNLP?
> >
> > Thanks,
> > Cristian
> >
> >
> >
> > On Tue, Feb 20, 2018 at 8:09 AM, Anthony Beylerian <
> > anthony.beyler...@gmail.com> wrote:
> >
> >> Hi Cristian,
> >>
> >> Thank you for your interest.
> >>
> >> The WSD module is currently experimental, so as far as I am aware there
> >> is no timeline for it.
> >>
> >> You can find the sandboxed version here:
> >> https://github.com/apache/opennlp-sandbox/tree/master/opennlp-wsd
> >>
> >> I personally haven't had the time to revisit this for a while and there
> >> are still some details to work out.
> >> But if you are really interested, you are welcome to discuss and
> >> contribute.
> >> I will assist as much as possible.
> >>
> >> Best,
> >>
> >> Anthony
> >>
> >> On Sun, Feb 18, 2018 at 5:52 AM, Cristian Petroaca <
> >> cristian.petro...@gmail.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I'm interested in word sense disambiguation (particularly based on
> >>> WordNet). I noticed that the latest OpenNLP version doesn't have any,
> >>> but I remember that a couple of years ago there was somebody working on
> >>> implementing it. Why isn't it in the official OpenNLP jar? Is there a
> >>> timeline for adding it?
> >>>
> >>> Thanks,
> >>> Cristian
> >>>
> >>
> >>
> >
>
