Sorry, there was a typo in the original posting with respect to the URL where this is available. Please excuse the error! The URL is corrected in this message.
We are happy to announce the release of a small utility package called nlm2sval2, which converts the NLM WSD test collection into the Senseval-2 lexical sample format. It is written in Perl and is freely available from our data conversion page:

*** corrected *** http://www.d.umn.edu/~tpederse/tools.html

The NLM WSD Test Collection consists of 5000 medical journal abstracts, each of which contains one sense-tagged target word. There are 50 target words and 100 instances per target word. This data is freely available from the NLM, although you do need to register with them before you download. See http://wsd.nlm.nih.gov for more details.

We plan to experiment on this data using systems derived from our supervised Duluth systems and our unsupervised SenseClusters package.

This data includes information that we don't normally get with lexical sample data (like the title of the article that the abstract describes), so we have done our best to incorporate that into the Senseval-2 format. But we are certainly open to suggestions on how to handle things.

Please let us know if you have any comments or questions about this package!

Enjoy,
Ted and Mahesh

--
Ted Pedersen
http://www.d.umn.edu/~tpederse

_______________________________________________
senseclusters-users mailing list
https://lists.sourceforge.net/lists/listinfo/senseclusters-users
