We are happy to have a paper at CICLING-2006 (http://www.cicling.org/2006) that is based on SenseClusters. This paper shows that the methodology of SenseClusters is generally language independent. We evaluated SenseClusters on name discrimination problems in English, Spanish, Bulgarian, and Romanian, and found that SenseClusters performed well in all four languages.
This is the paper that is to be presented at CICLING:

An Unsupervised Language Independent Method of Name Discrimination Using Second Order Co-occurrence Features (Pedersen, Kulkarni, Angheluta, Kozareva, and Solorio) - Appears in the Proceedings of the Seventh International Conference on Intelligent Text Processing and Computational Linguistics, February 19-25, 2006, Mexico City.

You can download the paper from:
http://www.d.umn.edu/~tpederse/Pubs/cicling2006.pdf

You can also get our name discrimination data (in Romanian, Bulgarian, Spanish, and English) here:
http://www.d.umn.edu/~tpederse/Data/cicling2006-data.zip

Finally, the stoplists we used for those languages are here:
http://www.d.umn.edu/~tpederse/Data/cicling2006-stoplists.zip

Please let us know if you have any questions!

Enjoy,
Ted, Anagha, Roxana, Zori, and Thamar

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
