Re: Word Sense Disambiguation

Aliaksandr Autayeu Sat, 14 Feb 2015 02:16:30 -0800

Dear Anthony,


> Thank you for your suggestions and please excuse the lack of clarity.
> We're not sure about the process used by the team since we just joined.

Me neither :) Just a suggestion, since it seems to me that ad-hoc
evaluation in the UI might be subjective. It's nice to play with it,
though, to get a feeling. And it would be perfect for demoing the
algorithms!


> So we thought the web-app would be convenient to test and visualize the
> performances of what is in the pipeline before any code reviews.
>
Convenient to test - yes. Visualize - yes. Demoing - perfect. Playground -
yes. Evaluate performance - only in a subjective way. Unless I'm missing or
misunderstanding something.
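
To make the contrast with ad-hoc checks in the UI concrete, here is a
minimal sketch of an objective measure: plain accuracy of predicted senses
against gold annotations. The class name WsdEvalSketch and the sense labels
are made up for illustration; this is not code from the project or from
OpenNLP, and a real evaluation on SemEval data would follow that task's own
scoring rules.

```java
import java.util.List;

// Hypothetical helper, for illustration only.
class WsdEvalSketch {

    /** Fraction of instances whose predicted sense equals the gold sense. */
    static double accuracy(List<String> gold, List<String> predicted) {
        if (gold.isEmpty() || gold.size() != predicted.size()) {
            throw new IllegalArgumentException("need two non-empty lists of equal size");
        }
        int correct = 0;
        for (int i = 0; i < gold.size(); i++) {
            // A missing (null) prediction simply counts as wrong.
            if (gold.get(i).equals(predicted.get(i))) {
                correct++;
            }
        }
        return (double) correct / gold.size();
    }

    public static void main(String[] args) {
        List<String> gold = List.of("bank%finance", "bank%river", "bass%fish", "bass%music");
        List<String> predicted = List.of("bank%finance", "bank%river", "bass%music", "bass%music");
        System.out.println(accuracy(gold, predicted)); // 3 of 4 correct
    }
}
```

The same number computed over a fixed, public test set is comparable between
algorithms and between runs, which is what the UI alone cannot give.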


> Currently, we used a sample from the SemEval-2 dataset which seemed popular
> just for testing, but as you mentioned we will perform the tests on public
> corpora and update the Test Results section with the results and references.
>
That would be great! SemEval is popular, so it is a good choice as long as
it is easily downloadable (many research datasets require registration,
emailing, or faxing forms).


> However to move further, we were wondering about the interface to make
> with OpenNLP as previously mentioned.
>
Since you're perhaps deeper into this than others, you seem to be the best
candidate to make a proposal: check the state-of-the-art algorithms and
devise an interface general enough for all or most of them. One way could be
to see what the algorithms typically require, how diverse the sources of
senses are (WordNet alone has multiple different interfaces to access it),
and which options the algorithms take, and start from there. The interface
should be flexible enough to accommodate that diversity, be able to do some
built-in checks (such as detecting an algorithm trained on one source of
senses being used with another, or an algorithm relying on a relation which
is missing from the sense source), and be similar to the rest of OpenNLP. We
might even end up with two interfaces (e.g. one for the sense provider and
one for WSD itself).
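
To give the idea some shape, here is a rough sketch of the two-interface
split with the built-in checks mentioned above. Everything in it is
hypothetical: SenseProvider, Disambiguator, checkCompatible, the toy
inventory id "toy-1" and the sense keys are invented for illustration, none
of them exist in OpenNLP, and a real proposal would need to follow OpenNLP's
own conventions.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// All names below are hypothetical; none of them exist in OpenNLP.
class WsdSketch {

    /** Source of senses, e.g. a wrapper around one WordNet access library. */
    interface SenseProvider {
        String id();                        // identifies the sense inventory
        Set<String> relations();            // relations this source can answer
        List<String> senses(String lemma);  // candidate sense keys, possibly empty
    }

    /** A WSD algorithm that picks one sense for a target token in context. */
    interface Disambiguator {
        String trainedOn();                 // id of the inventory it expects
        Set<String> requiredRelations();    // relations it relies on
        String disambiguate(List<String> tokens, int target, SenseProvider source);
    }

    /** Built-in check: reject mismatched inventories and missing relations. */
    static void checkCompatible(Disambiguator wsd, SenseProvider source) {
        if (!wsd.trainedOn().equals(source.id())) {
            throw new IllegalArgumentException("trained on " + wsd.trainedOn()
                    + " but given " + source.id());
        }
        if (!source.relations().containsAll(wsd.requiredRelations())) {
            throw new IllegalArgumentException("sense source lacks a required relation");
        }
    }

    /** Toy in-memory sense source, standing in for a real inventory. */
    static SenseProvider mapProvider(String id, Set<String> relations,
                                     Map<String, List<String>> entries) {
        return new SenseProvider() {
            public String id() { return id; }
            public Set<String> relations() { return relations; }
            public List<String> senses(String lemma) {
                return entries.getOrDefault(lemma, List.of());
            }
        };
    }

    /** First-listed-sense baseline; it needs no relations at all. */
    static Disambiguator firstSense(String inventoryId) {
        return new Disambiguator() {
            public String trainedOn() { return inventoryId; }
            public Set<String> requiredRelations() { return Set.of(); }
            public String disambiguate(List<String> tokens, int target,
                                       SenseProvider source) {
                checkCompatible(this, source);
                List<String> candidates = source.senses(tokens.get(target));
                return candidates.isEmpty() ? null : candidates.get(0);
            }
        };
    }

    public static void main(String[] args) {
        SenseProvider toy = mapProvider("toy-1", Set.of("hypernym"),
                Map.of("bank", List.of("bank%finance", "bank%river")));
        Disambiguator baseline = firstSense("toy-1");
        System.out.println(baseline.disambiguate(List.of("the", "bank", "closed"), 1, toy));
    }
}
```

Keeping the sense source behind its own interface means an algorithm never
talks to a particular WordNet library directly, and the compatibility check
can run in one place before any disambiguation happens.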

What do you think about this way?

best regards,
Aliaksandr
