On the morning of July 27, 2005, I gave a tutorial on SenseClusters at EuroLAN 2005. It was followed in the afternoon by a comparative evaluation in which a group of EuroLAN participants tried to achieve the best F-measure possible using the SenseClusters web interface on a set of data I provided.
This event took place in Cluj-Napoca, Romania, which is the capital of Transylvania. Hence I referred to it as the First Transylvanian Bake-Off. :)

The data provided for evaluation came from our sample data in Split_Small_Demo_Data.tar.gz, specifically the file M_B-test.xml. This is a set of approx 300 contexts where Mexico and Brazil have been conflated into a single ambiguous term. A set of training data, M_B-training.xml, was also made available in case a team wanted to get its features from a corpus other than the contexts being clustered.

There were 10 teams that reported results, and I think approximately 15 teams attempted the exercise (meaning that 5 teams decided not to submit results). There were up to 20 machines running at any given time, so the SenseClusters web interface got a very good workout. I'm pleased to report that it did well, and there were no major bottlenecks or problems despite the heavy load. A total of 303 different complete runs were made during the 3 hour practical session.

The baseline performance achieved by putting all contexts into one cluster was approx 52%. The distribution of F-measure scores reported by the competing teams was as follows:

74, 70, 68, 67, 66, 65, 63, 63, 61, 54

Below is a link to the complete output files for the winning entry, which reported an F-measure of 73.55! This was really a nice result - as you can see from the above distribution, it was a clear winner.

http://marimba.d.umn.edu/SC-htdocs/rox161122473185/

The author of the winning entry was Roxana Angheluta, originally from Romania and now living and working in Belgium. She was a one-person team, and did battle against mostly multi-person teams! For her valiant efforts she won a novel of her choosing, and soon thereafter I added a bottle of wine to her winnings.
Below are the settings in the parameter file, which show that her winning approach was based on using unigram features with a rather high remove cutoff of 10. That remove cutoff seemed to be what separated this approach from the others, which tended to use lower remove cutoffs such as 2 and 5. Also note that she used the order 1 representation, and got her features from the same data she was clustering. I noticed she had other entries that used bagglo and rbr clustering and got the same result of 73.55 - so apparently the choice of clustering algorithm was not too critical (rb, rbr, and bagglo all did equally well).

  TEST="rox16-test.xml"
  TOKEN="token.regex"
  PREFIX="rox16"
  FEATURE=uni
  FORMAT=f16.04
  CONTEXT=o1
  SPACE=vector
  STOP=stopfile
  REMOVE=10
  CLUSTERS=2
  CLMETHOD=rb
  CRFUN=i2
  SIM=cos
  LABEL_STOP=label_stopfile
  LABEL_REMOVE=5
  LABEL_STAT=ll
  LABEL_STAT_RANK=10
  EVAL=ON

And finally, here is the confusion matrix for the winning entry:

              S1       S0    TOTAL
  C0:        106       24      130   (40.88)
  C1:         59      129      188   (59.12)
  TOTAL      165      153      318
         (51.89)  (48.11)

  Precision = 73.90 (235/318)
  Recall    = 73.21 (235/(318+3))
  F-Measure = 73.55

  Legend of Sense Tags
  S0 = Brazil
  S1 = Mexico

If you are interested in seeing other systems, you can browse results at:

http://marimba.d.umn.edu/SC-htdocs/

The entries in the Transylvanian Bake-Off were created on July 27. Keep in mind there are 303 runs for that date, so you will probably want to look for a specific team rather than browse at random. Most of the teams used prefixes to identify their entries, so you can get a sense of what a particular team tried by scanning through the similarly named files.

So, this was a fun event, and I think it showed that SenseClusters can be used via the web interface in a reasonably effective and informed way after a fairly short introduction. I do not believe any of the participants had used SenseClusters prior to that date, so the results reported above are based on no more than a few hours' worth of experience.
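For anyone who wants to check the arithmetic behind the confusion matrix for the winning entry, here is a minimal Python sketch. It is not the SenseClusters evaluation code. It assumes that the "+3" in the recall denominator counts contexts that were left unclustered, that clusters are mapped one-to-one onto senses so as to maximize agreement, and that the one-cluster baseline is scored the same way. The names MATRIX, UNCLUSTERED, best_mapping_correct, scores, and one_cluster_baseline are made up for this illustration.

```python
from itertools import permutations

# Confusion matrix of the winning entry: rows are clusters (C0, C1),
# columns are senses (S1 = Mexico, S0 = Brazil).
MATRIX = [
    [106, 24],   # C0
    [59, 129],   # C1
]
UNCLUSTERED = 3  # assumption: the "+3" in the reported recall denominator


def best_mapping_correct(matrix):
    """Largest agreement over all one-to-one cluster-to-sense mappings."""
    n = len(matrix)
    return max(
        sum(matrix[row][col] for row, col in enumerate(perm))
        for perm in permutations(range(n))
    )


def scores(correct, clustered, unclustered=0):
    """Return (precision, recall, F-measure) as percentages."""
    precision = 100.0 * correct / clustered
    recall = 100.0 * correct / (clustered + unclustered)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def one_cluster_baseline(matrix, unclustered=0):
    """Scores when every clustered context is placed in a single cluster."""
    sense_totals = [sum(column) for column in zip(*matrix)]
    return scores(max(sense_totals), sum(sense_totals), unclustered)


if __name__ == "__main__":
    clustered = sum(sum(row) for row in MATRIX)
    p, r, f = scores(best_mapping_correct(MATRIX), clustered, UNCLUSTERED)
    print(f"Precision = {p:.2f}  Recall = {r:.2f}  F-Measure = {f:.2f}")
    bp, br, bf = one_cluster_baseline(MATRIX, UNCLUSTERED)
    print(f"Baseline F-Measure = {bf:.2f}")
```

Under those assumptions the sketch reproduces the reported 73.90 / 73.21 / 73.55, and the one-cluster baseline comes out a little under 52, in line with the approx 52% mentioned earlier.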
We'll do another one of these somewhere, somehow. :)

Ted

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
