We are pleased to announce the release of SenseClusters, a free software package that does unsupervised discovery of word senses by clustering together instances of a word (or words) that are used in similar contexts in raw text. It supports a wide range of clustering techniques based on both context vectors and similarity matrices.
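
To make the idea concrete, here is a minimal sketch in plain Python of clustering instances of a word by the similarity of their contexts. It is not SenseClusters code and does not use its interface: the function names, the tiny stoplist, the four example sentences, and the choice of cosine similarity with average-link agglomerative clustering are all assumptions made purely for illustration.

```python
import math
from collections import Counter

# Hypothetical stoplist for this sketch; a real system would select features more carefully.
STOPWORDS = {"the", "and", "at", "a"}

def context_vector(context, target):
    """Bag-of-words counts for one context, ignoring stopwords and the target word."""
    words = context.lower().split()
    return Counter(w for w in words if w != target and w not in STOPWORDS)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def similarity_matrix(vectors):
    """Pairwise cosine similarities between all context vectors."""
    return [[cosine(u, v) for v in vectors] for u in vectors]

def agglomerative(sim, k):
    """Average-link agglomerative clustering: merge the closest pair until k clusters remain."""
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                score = sum(sim[i][j] for i, j in pairs) / len(pairs)
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Four made-up instances of the ambiguous word "bank".
contexts = [
    "she deposited the check at the bank before noon",
    "the bank approved the loan and the check cleared",
    "we walked along the river bank watching the water",
    "the river flooded the bank and the water rose",
]
vectors = [context_vector(c, "bank") for c in contexts]
clusters = agglomerative(similarity_matrix(vectors), k=2)
print(clusters)
```

In this toy run the two financial contexts (indices 0 and 1) end up in one cluster and the two river contexts (indices 2 and 3) in the other, because each pair shares content words such as "check" or "river" and "water".
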
SenseClusters is flexible and can be used in any application that requires clustering of similar instances of text. Examples include word sense discrimination, synonymy identification, text classification, and summarization. It can also be used to implement models such as Latent Semantic Analysis (LSA).

SenseClusters takes a user through the entire process of unsupervised learning of word senses, including text preprocessing, feature selection, context vector and similarity matrix construction, dimensionality reduction via singular value decomposition (SVD), and clustering via both agglomerative and partitional algorithms.

SenseClusters provides a great deal of native functionality, and also provides seamless interfaces to a number of powerful tools, including Cluto (a clustering toolkit), SVDPACKC (which carries out singular value decomposition), and the Ngram Statistics Package.

For general information please visit:
http://senseclusters.sourceforge.net

For immediate download of the first public release (0.47) please visit:
http://sourceforge.net/projects/senseclusters/

This is an active project, and the principal designer and lead developer (Amruta Purandare, [EMAIL PROTECTED]) and I would be delighted to hear any comments, requests, or even bug reports that you might have. You can see some of our future plans in our Todo list, which is distributed with the package.

Cordially,
Ted and Amruta

PS To subscribe to the SenseClusters mailing lists, visit:
http://lists.sourceforge.net/lists/listinfo/senseclusters-users (discussion)
http://lists.sourceforge.net/lists/listinfo/senseclusters-news (announcements)

--
# Ted Pedersen                      http://www.umn.edu/~tpederse #
# Department of Computer Science    [EMAIL PROTECTED]            #
# University of Minnesota, Duluth                                #
# Duluth, MN 55812                  (218) 726-8770               #
