http://plg.uwaterloo.ca/~gvcormac/legal10/ml.html
Machine Learning and Data Mining Tasks at TREC 2010
TREC -- The Text Retrieval Conference -- is one of the premier events in
the information retrieval research calendar.
This year, TREC includes two new tasks which may be of particular
interest to researchers in machine learning and data mining:
* The Learning task of the TREC 2010 Legal Track
* The Spam task of the TREC 2010 Web Track
The Learning task requires participants to find documents that are
responsive, in the legal sense, to discovery requests arising in the
course of civil litigation. The text of the request is given, as are a
few hundred labeled examples. Participants are required to submit an
estimate of the probability of responsiveness for each document in a
collection of approximately 1 million. Evaluation will be based on two
criteria: the effectiveness of ranking by probability, measured by area
under the receiver operating characteristic curve (AUC), and the
effectiveness of the probability estimates, measured by information
gain (IG) relative to a coin toss. Participants may enter any or all of
three categories: fully automatic, fully manual, or technology
assisted.
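As a rough illustration of the two criteria, the sketch below computes
AUC and an information gain score for a handful of made-up probability
estimates. It is not the track's official scoring code. The function
names and the toy data are hypothetical, and the IG formula is an
assumption: the mean number of bits gained per document over a
constant estimate of 0.5, using log base 2.

```python
import math


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def information_gain(probs, labels, eps=1e-12):
    """Mean bits gained per document relative to a coin toss (p = 0.5).

    Assumed definition, not taken from the track guidelines:
    log2(q) - log2(0.5), where q is the probability the estimate
    assigned to the true label.
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        q = p if y == 1 else 1.0 - p
        total += 1.0 + math.log2(q)
    return total / len(probs)


# Toy data: 1 = responsive, 0 = not responsive.
probs = [0.9, 0.8, 0.3, 0.6, 0.1]
labels = [1, 1, 0, 0, 0]

print("AUC:", auc(probs, labels))
print("IG :", information_gain(probs, labels))
```

Under this assumed definition, estimating 0.5 for every document scores
an IG of zero, and confident wrong estimates are penalized heavily, so a
good ranking alone does not guarantee a good IG.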
The Spam task requires participants to rank the 500 million English
pages of the ClueWeb09 Web Dataset according to how likely they are to
be junk; that is, not useful results for any reasonable web query. A
baseline ranking is available to participants. Effectiveness will be
measured by AUC on examples identified and assessed during the course of
TREC. For this reason, there are no training examples per se.
Participants are free to gather information from any source for
training, including past and present web spam competitions at AIRWeb and
ECML/PKDD; see, for example, the methods used to create the baseline
ranking.
TREC Participation Requirements
TREC participants must register for TREC, and are expected to submit a
paper for the workshop and proceedings. Attendance at TREC is restricted
to registered participants who make a bona fide submission to at least
one track.
Gordon V. Cormack
TREC 2010 Legal Track co-coordinator