**** Sorry, there was a typo in the original posting in the URL where
**** this package is available. Please excuse the error! The URL
**** is corrected in this message.

We are happy to announce the release of a small utility package called
nlm2sval2, which converts the NLM WSD test collection into the
Senseval-2 lexical sample format. It is written in Perl and is
freely available from our data conversion page below:

*** corrected ***  http://www.d.umn.edu/~tpederse/tools.html

The NLM WSD Test Collection consists of 5000 medical journal abstracts,
each of which contains one sense-tagged target word. There are 50
target words, and 100 instances per target word. This data is freely
available from the NLM, although you do need to register with them before
you download. See http://wsd.nlm.nih.gov for more details. We plan to
experiment on this data using systems derived from our supervised Duluth
systems, and our unsupervised SenseClusters package.

This data includes information that we don't normally get with lexical
sample data (like the title of the article that the abstract describes),
so we have done our best to incorporate that into the Senseval-2 format.
We are certainly open to suggestions on how to handle this, though. Please
let us know if you have any comments or questions about this package!
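
For readers who have not seen the Senseval-2 lexical sample format, here is
a rough Python sketch of what one converted instance might look like. It is
not the nlm2sval2 code (which is Perl) and does not claim to match its
output. The function name make_instance, the instance id, the sense label,
the title, and the sentence are all made up for illustration, and putting
the title in front of the abstract text is only one assumed way of carrying
it along.

```python
import xml.etree.ElementTree as ET


def make_instance(word, inst_id, sense, title, before, target, after):
    """Build one Senseval-2 style <lexelt> holding a single instance.

    Hypothetical helper for illustration; not part of nlm2sval2.
    """
    lexelt = ET.Element("lexelt", item=word)
    instance = ET.SubElement(lexelt, "instance", id=inst_id)
    # The sense tag for this instance goes in an <answer> element.
    ET.SubElement(instance, "answer", instance=inst_id, senseid=sense)
    context = ET.SubElement(instance, "context")
    # Assumption: the article title is simply placed ahead of the abstract.
    context.text = f"{title} {before} "
    # The target word is marked with <head> inside the context.
    head = ET.SubElement(context, "head")
    head.text = target
    head.tail = f" {after}"
    return lexelt


# All values below are invented examples, not real NLM WSD data.
example = make_instance(
    "cold", "cold.1", "M1",
    "A made-up title.",
    "Patients exposed to", "cold", "air were examined.",
)
print(ET.tostring(example, encoding="unicode"))
```

The only point here is the overall shape: a lexelt per target word, an
instance per abstract, an answer carrying the sense tag, and a context with
the target word marked by head.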

Enjoy,
Ted and Mahesh

--
Ted Pedersen
http://www.d.umn.edu/~tpederse

