On Thu, 16 Jul 2009, Justin Mason wrote:

I plan to ask (on users@, on my blog etc.) for submissions of archives of ham. Submissions of _just_ false positives are OK, as long as they're labelled as such, because they'll have differing profiles and too many FPs in the corpus will cause trouble for the score generation step.

I'll then have a quick go at hand-classifying the submitted corpora, spotting obvious FNs that slipped in, etc., and will then leave them on the zone for nightly mass-checks to use as well. So the corpora won't be private submissions.

Thoughts?

Liability? Someone who provides you with a corpus voluntarily is implying they don't care if it becomes public; you might want to require a liability release.

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 [email protected]    FALaholic #11174     pgpk -a [email protected]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  You know things are bad when Pravda says we [the USA] have gone
  too far to the left.                                 -- Joe Huffman
-----------------------------------------------------------------------
 Today: the 64th anniversary of the dawn of the Atomic Age
