I'll be presenting a new SenseClusters-related paper next week at a Health Informatics conference.

The Effect of Different Context Representations on Word Sense Discrimination in Biomedical Texts (Pedersen) - To Appear in the Proceedings of the 1st ACM International Health Informatics Symposium, November 2010, Arlington, VA

http://www.d.umn.edu/~tpederse/Pubs/acm-ihi-2010-pedersen.pdf

This paper compares the two different ways that SenseClusters represents second order features (word by word co-occurrences and word by context co-occurrences) and generally finds that word by word co-occurrences are more effective. It also finds that PK2 cluster stopping continues to be the most reliable of our cluster stopping methods, and that SVD has a minimal or negative effect on these representations.

All of these findings are consistent with previous work we've done, but I thought this was worth reporting since it deals with biomedical texts whereas our previous work has been in general English (newswire text and so forth).

In any case, your comments or suggestions are always welcome.

Cordially,
Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
