SenseClusters participated in the recent sense induction task that was held as a part of Semeval-1/Senseval-4. A few more details on the task can be found at the URL below, but the basic idea is to take contexts/instances in which a target word has been designated, and cluster the instances of that word to discover its senses.
http://nlp.cs.swarthmore.edu/semeval/tasks/task02/description.shtml

I am in the process of preparing a paper that will describe how we used SenseClusters in this task, but perhaps the most important point to make is that we used relatively common settings, without any knowledge of the data we were clustering, and came back with reasonable results. The data used in the task was from the English lexical sample task of Semeval-1, which consists of 100 words and 27,132 instances.

The short summary of our system is that we used second order context vectors, where the bigram features were selected using PMI (pointwise mutual information). A large window size of 12 was used to identify bigrams, given the relatively small amount of data available for each word. I did not use SVD, and the number of clusters was automatically determined via the adapted gap statistic. The clustering method was direct (k-means). More details will be in the task paper, which I'll make available when it's finished...

One issue that has been very interesting to reflect upon is how evaluation of unsupervised clustering systems can and should be done. There were two evaluation methods used in the sense induction task, and they are different from the built-in evaluation method supported in SenseClusters. I've been discussing these issues on the sense induction task mailing list, and will start to relay some of that information here to this mailing list, since you may well wonder what your various options for evaluation are, and why one sees such different results reported for unsupervised clustering of word senses and related problems.

As you know, SenseClusters provides its own method of evaluation, and also supports Cluto's built-in evaluations, which include purity and entropy (these were used as one of the evaluation methods in the sense induction task).
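To give a concrete feel for the bigram feature selection step, here is a minimal sketch of scoring word pairs by pointwise mutual information within a fixed window. This is not the SenseClusters/NSP code itself; the function name, the minimum count cutoff, and the exact interpretation of the window (two words counting as a bigram if they occur within `window` tokens of each other) are illustrative assumptions only.

```python
import math
from collections import Counter

def pmi_bigrams(contexts, window=12, min_count=2):
    """Score ordered word pairs that co-occur within `window` tokens by
    pointwise mutual information: pmi(x, y) = log2(p(x, y) / (p(x) * p(y))).
    A sketch of the general technique, not the actual SenseClusters code.
    Returns the pairs sorted from highest to lowest PMI."""
    word_counts = Counter()   # unigram frequencies
    pair_counts = Counter()   # windowed bigram frequencies
    n_words = 0
    n_pairs = 0
    for tokens in contexts:
        word_counts.update(tokens)
        n_words += len(tokens)
        for i, w1 in enumerate(tokens):
            # Pair w1 with each later word inside the window.
            for w2 in tokens[i + 1 : i + window]:
                pair_counts[(w1, w2)] += 1
                n_pairs += 1
    scores = {}
    for (w1, w2), c in pair_counts.items():
        if c < min_count:
            continue  # drop rare pairs, which inflate PMI
        p_xy = c / n_pairs
        p_x = word_counts[w1] / n_words
        p_y = word_counts[w2] / n_words
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return sorted(scores, key=scores.get, reverse=True)
```

The top-ranked pairs would then serve as the bigram features from which second order context vectors are built; the minimum count cutoff matters in practice, since PMI is known to overvalue pairs of rare words.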
Note that all of these evaluations are based on comparing to a "gold standard" clustering of the data, which is available when one is clustering text that has been manually sense tagged (the sense tags are ignored during clustering, but then used during evaluation to compare against the discovered clusters). In any case, I will start to forward some of that correspondence, and also add to it, just to explain a bit more about how we do evaluation in SenseClusters and what other alternatives might exist.

The most important point, though, is that there really doesn't seem to be a standardized method for evaluating unsupervised clustering of word senses, so before making any comparisons to other results it's quite important to understand what evaluation measures were used and how they were defined. That's part of the motivation for discussing those issues here. One important observation is that the SenseClusters evaluation method is pretty harsh, and tends to produce lower scores than most of the other methods I've seen out there. I don't think that's a problem, unless one starts to compare SenseClusters results with such measures, in which case SenseClusters usually fares worse, when in fact that's simply the product of different evaluation techniques.

Well, enough prelude. :)

BTW, the discussion group for the sense induction task is found here:

http://groups.google.com/group/senseinduction

Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse

_______________________________________________
senseclusters-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/senseclusters-users
