On the morning of July 27, 2005, I gave a tutorial on SenseClusters at
EuroLAN 2005. It was followed in the afternoon by a comparative
evaluation in which teams of EuroLAN participants tried to achieve the
best possible F-measure on a data set I provided, using the SenseClusters
web interface.

This event took place in Cluj-Napoca, Romania, which is the capital of
Transylvania. Hence I referred to it as the First Transylvanian Bake-Off.
:)

The evaluation data was the file M_B-test.xml from our sample data in
Split_Small_Demo_Data.tar.gz. This is a set of approximately 300 contexts
where Mexico and Brazil have been conflated into a single ambiguous term.
A set of training data, M_B-training.xml, was also made available in case
a team wanted to get its features from a corpus other than the contexts
being clustered.

Ten teams reported results, and I think approximately 15 teams attempted
the exercise (meaning that about 5 decided not to submit results). Up to
20 machines were running at any given time, so the SenseClusters web
interface got a very good workout. I'm pleased to report that it did
well, with no major bottlenecks or problems despite the heavy load. A
total of 303 separate complete runs were made during the 3-hour practical
session.

The baseline performance achieved by putting all contexts into one cluster
was approximately 52%. The F-measure scores reported by the competing
teams were as follows: 74, 70, 68, 67, 66, 65, 63, 63, 61, 54.

Below is a link to the complete output files for the winning entry, which
reported an F-measure of 73.55 (the 74 in the list above)! This was a
really nice result; as you can see from the distribution, it was a clear
winner.

http://marimba.d.umn.edu/SC-htdocs/rox161122473185/

The author of the winning entry was Roxana Angheluta, originally from
Romania and now living and working in Belgium. She was a one-person team
and did battle against mostly multi-person teams! For her valiant efforts
she won a novel of her choosing, and soon thereafter I added a bottle
of wine to her winnings.

Below are the settings in her parameter file, which show that her
winning approach was based on unigram features with a rather high
remove cutoff of 10 (REMOVE=10). That remove cutoff seemed to be what
separated this approach from the others, which tended to use lower remove
cutoffs such as 2 and 5. Also note that she used the order 1 context
representation (CONTEXT=o1) and got her features from the same data she
was clustering. I noticed she had other entries that used bagglo and rbr
clustering and got the same result of 73.55, so apparently the choice of
clustering algorithm was not too critical (rb, rbr, and bagglo all did
equally well).

TEST="rox16-test.xml"
TOKEN="token.regex"
PREFIX="rox16"
FEATURE=uni
FORMAT=f16.04
CONTEXT=o1
SPACE=vector
STOP=stopfile
REMOVE=10
CLUSTERS=2
CLMETHOD=rb
CRFUN=i2
SIM=cos
LABEL_STOP=label_stopfile
LABEL_REMOVE=5
LABEL_STAT=ll
LABEL_STAT_RANK=10
EVAL=ON
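
To give a feel for what a setting like REMOVE=10 does, here is a small
Python sketch of a frequency cutoff on unigram features. It is only an
illustration, not the SenseClusters code: the function name
unigram_features, the whitespace tokenization, and the toy contexts are
all made up for this example, and I am assuming the cutoff means "drop
any unigram that occurs fewer than N times in the data."

```python
from collections import Counter


def unigram_features(contexts, remove=10, stop=frozenset()):
    """Count unigrams over all contexts and keep those seen >= `remove` times.

    Hypothetical helper for illustration; tokenization is plain lowercase
    whitespace splitting, which is simpler than a real token regex file.
    """
    counts = Counter(
        token
        for context in contexts
        for token in context.lower().split()
        if token not in stop
    )
    # Apply the remove cutoff: low-frequency unigrams are discarded.
    return {token: n for token, n in counts.items() if n >= remove}


# Toy contexts (invented), with a cutoff of 2 instead of 10.
contexts = [
    "the peso fell against the dollar",
    "the soccer team won the cup",
    "soccer fans filled the stadium",
    "the peso and the dollar both rose",
]
features = unigram_features(contexts, remove=2, stop={"the", "and", "against"})
print(sorted(features))  # ['dollar', 'peso', 'soccer']
```

With a higher cutoff, only words that recur across many contexts survive
as features, which is presumably why it mattered on a set of only about
300 contexts.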

And finally, here is the confusion matrix for the winning entry:

            S1        S0           TOTAL
  C0:      106        24             130        (40.88)
  C1:       59       129             188        (59.12)
 TOTAL     165       153             318
         (51.89)   (48.11)
Precision = 73.90(235/318)
Recall = 73.21(235/318+3)
F-Measure = 73.55

Legend of Sense Tags
S0 = Brazil
S1 = Mexico
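
For anyone who wants to check the arithmetic, the numbers in the report
above can be reproduced with a few lines of Python. This is a sketch
under two assumptions of mine: that the "318+3" in the recall line means
3 contexts were left unclustered (so the denominator is 321), and that the
one-cluster baseline corresponds to the share of the larger sense. The
function names are made up for the example.

```python
def scores(matrix, unclustered=0):
    """Precision, recall and F-measure (in percent) from a confusion matrix.

    Assumes the columns are already ordered so that the diagonal holds the
    contexts counted as correct (here C0 -> S1 and C1 -> S0).
    """
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    clustered = sum(map(sum, matrix))
    precision = 100 * correct / clustered
    recall = 100 * correct / (clustered + unclustered)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


def majority_share(matrix):
    """Percentage of clustered contexts that belong to the largest sense."""
    column_totals = [sum(column) for column in zip(*matrix)]
    return 100 * max(column_totals) / sum(column_totals)


# Rows are clusters C0, C1; columns are senses S1 (Mexico), S0 (Brazil).
winning = [[106, 24], [59, 129]]
p, r, f = scores(winning, unclustered=3)
print(f"{p:.2f} {r:.2f} {f:.2f}")        # 73.90 73.21 73.55
print(f"{majority_share(winning):.2f}")  # 51.89
```

The last figure lines up with the roughly 52% baseline mentioned earlier.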

If you are interested in seeing other systems you can browse results at:

http://marimba.d.umn.edu/SC-htdocs/

The entries in the Transylvanian Bake-Off were created on July 27. Keep in
mind there are 303 runs for that date, so you will probably want to look
for a specific team rather than browse at random. Most of the teams used
prefixes to identify their entries, so you can get a sense of what a
particular team tried by scanning through the similarly named files.

So, this was a fun event, and I think it showed that SenseClusters can be
used via the web interface in a reasonably effective and informed way
after a fairly short introduction. I do not believe any of the
participants had used SenseClusters prior to that date, so the results
reported above are based on no more than a few hours' worth of experience.

We'll do another one of these somewhere, somehow. :)

Ted


--
Ted Pedersen
http://www.d.umn.edu/~tpederse

