I'll be presenting a new SenseClusters-related paper next week at a
Health Informatics conference.

The Effect of Different Context Representations on Word Sense
Discrimination in Biomedical Texts  (Pedersen) - To Appear in the
Proceedings of the 1st ACM International Health Informatics Symposium,
November 2010, Arlington, VA
http://www.d.umn.edu/~tpederse/Pubs/acm-ihi-2010-pedersen.pdf

This paper compares the two ways that SenseClusters represents
second order features (word by word co-occurrences and word by
context co-occurrences) and generally finds that word by word
co-occurrences are more effective. It also finds that PK2 continues
to be the most reliable of our cluster stopping methods, and that
SVD has a minimal or negative effect on these representations.
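
For readers new to second order features, here is a minimal sketch of the two representations described above. It is illustrative Python, not SenseClusters code (SenseClusters itself is a Perl package), and the function names and toy contexts are hypothetical. It assumes the usual formulation in which a context is represented by averaging the first order vectors of the words it contains, and it leaves out feature selection, SVD, and cluster stopping entirely.

```python
from collections import Counter, defaultdict


def word_by_word(contexts):
    """First order vectors: each word maps to counts of the words
    it co-occurs with inside the same context."""
    vectors = defaultdict(Counter)
    for tokens in contexts:
        for i, word in enumerate(tokens):
            for j, other in enumerate(tokens):
                if i != j:
                    vectors[word][other] += 1
    return vectors


def word_by_context(contexts):
    """First order vectors: each word maps to counts of the
    contexts (identified by index) in which it occurs."""
    vectors = defaultdict(Counter)
    for idx, tokens in enumerate(contexts):
        for word in tokens:
            vectors[word][idx] += 1
    return vectors


def second_order(tokens, vectors):
    """Represent a context as the average of the first order
    vectors of its words; words without a vector are skipped."""
    known = [w for w in tokens if w in vectors]
    total = Counter()
    for word in known:
        total.update(vectors[word])
    return {key: count / len(known) for key, count in total.items()} if known else {}


# Toy training contexts (made up for illustration only).
contexts = [["cold", "virus", "fever"], ["cold", "winter", "snow"]]

print(second_order(["virus", "fever"], word_by_word(contexts)))
# {'cold': 1.0, 'fever': 0.5, 'virus': 0.5}
print(second_order(["virus", "fever"], word_by_context(contexts)))
# {0: 1.0}
```

In this sketch the two representations differ only in what the first order vectors count; the averaging step is the same for both.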

All of these findings are consistent with previous work we've done,
but I thought this was worth reporting since it deals with biomedical
texts, whereas our previous work has been on general English (newswire
text and so forth).

In any case, your comments or suggestions are always welcome.

Cordially,
Ted

-- 
Ted Pedersen
http://www.d.umn.edu/~tpederse

