Hi Anthony,

I had a chance to test the WSD component. Great work, thanks.
One question: is it possible to return the WordNet type (or database ID) of
the disambiguated word?
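
For context, a minimal sketch of how the sense key could be pulled out of one
result string today. It assumes each string holds three whitespace-separated
fields in the order described in the quoted message (Source SenseKey Score).
The class name, the "WordNet" source label and the sample sense key are
made up for illustration and are not part of the wsd component:

```java
import java.util.Optional;

public class SenseKeyExtractor {

    /** The three fields assumed to be in one result string. */
    public static final class Parsed {
        public final String source;
        public final String senseKey;
        public final double score;

        Parsed(String source, String senseKey, double score) {
            this.source = source;
            this.senseKey = senseKey;
            this.score = score;
        }
    }

    /**
     * Splits a result string on whitespace and returns its fields,
     * or an empty Optional if it does not have exactly three fields
     * or the last field is not a number.
     */
    public static Optional<Parsed> parse(String result) {
        if (result == null) {
            return Optional.empty();
        }
        String[] parts = result.trim().split("\\s+");
        if (parts.length != 3) {
            return Optional.empty();
        }
        try {
            double score = Double.parseDouble(parts[2]);
            return Optional.of(new Parsed(parts[0], parts[1], score));
        } catch (NumberFormatException e) {
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample; the real output may look different.
        String sample = "WordNet bank%1:14:00:: 3.0";
        parse(sample).ifPresent(p -> System.out.println(p.senseKey));
    }
}
```

This only recovers the key as text. Mapping it to a WordNet synset or
database ID would still need a lookup on the library side, which is the
part I am asking about.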

Thanks,
Cristian

On Fri, Jul 24, 2015 at 1:14 PM, Anthony Beylerian <
[email protected]> wrote:

> Hi,
>
> To try out the ongoing implementations, after checking out the sandbox
> repository please try these steps :
> 1- Create a resource models directory:
>
> - src
>   - test
>     - resources
>       + models
>
> 2- Include the following pre-trained models and dictionary in that
> directory:
> You can find those here [1] if you like or pre-train your own models.
>
> {
> en-token.bin,
> en-pos-maxent.bin,
> en-sent.bin,
> en-ner-person.bin,
> en-lemmatizer.dict
> }
>
> To train the IMS approach you also need to include training data such as
> Senseval-3 [2].
> For now, please add these folders:
> - src
>   - test
>     - resources
>        - supervised
>          + raw
>          + models
>          + dictionary
>
> You can find the data files here [2].
>
> 3- We included two examples [LeskTester.java] and [IMSTester.java] that
> you can run directly, or make your own tests.
>
> To run a custom test, you need at minimum a tokenized text or sentence,
> for example for Lesk:
>
>           String[] words = Loader.getTokenizer().tokenize(sentence);
>
> Choose the index of the word to disambiguate in the token array:
>
>           int wordIndex = 6;
>
> Then just create a WSDisambiguator object, for example for Lesk:
>
>           Lesk lesk = new Lesk();
>
> And you can call the default disambiguation method:
>
>           lesk.disambiguate(words, wordIndex);
>
> You will get an array of strings with the following format :
>
> Lesk : [Source SenseKey Score]
>
> To read the sense definitions you can use the method :
> [opennlp.tools.disambiguator.Constants.printResults]
>
> To use the variations of Lesk, you need to create and configure a
> parameters object:
>
>           LeskParameters leskParams = new LeskParameters();
>           leskParams.setLeskType(LeskParameters.LESK_TYPE.LESK_BASIC_CTXT_WIN_BF);
>           leskParams.setWin_b_size(4);
>           leskParams.setDepth(3);
>           lesk.setParams(leskParams);
>
> Typically, IMS should perform better than Lesk: Lesk is a classic method,
> but it is usually used as a baseline, along with the most frequent sense
> (MFS) heuristic.
> However, we will be testing and adding more techniques.
>
> In any case, please feel free to ask for more details.
>
> Best,
>
> Anthony
>
> [1] :
> https://drive.google.com/folderview?id=0B67Iu3pf6WucfjdYNGhDc3hkTXd1a3FORnNUYzd3dV9YeWlyMFczeHU0SE1TcWwyU1lhZFU&usp=sharing
> [2] :
> https://drive.google.com/file/d/0ByL0dmKXzHVfSXA3SVZiMnVfOGc/view?usp=sharing
> > Date: Fri, 24 Jul 2015 09:54:02 +0200
> > Subject: Re: Word Sense Disambiguator
> > From: [email protected]
> > To: [email protected]
> >
> > It would be nice if you could share instructions on how to run it.
> > I also would like to give it a try.
> >
> > Jörn
> >
> > On Fri, Jul 24, 2015 at 4:54 AM, Anthony Beylerian <
> > [email protected]> wrote:
> >
> > > Hello,
> > > Yes, for the moment we are only using WordNet for sense definitions. The
> > > plan is to complete the package by mid to late August, but if you like
> > > you can follow up on the progress from the sandbox.
> > > Best regards,
> > > Anthony
> > > > Date: Thu, 23 Jul 2015 15:36:57 +0300
> > > > Subject: Word Sense Disambiguator
> > > > From: [email protected]
> > > > To: [email protected]
> > > >
> > > > Hi,
> > > >
> > > > I saw that there are people actively working on a Word Sense
> > > > Disambiguator.
> > > > Do you know when the module will be ready to use? Also, I assume that
> > > > WordNet is used to define the disambiguated word's meaning?
> > > >
> > > > Thanks,
> > > > Cristian
> > >
> > >
>
>
