-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Warren Togami writes:
> Theo Van Dinter wrote:
> > On Mon, Nov 21, 2005 at 08:38:05PM -0800, Justin Mason wrote:
> >
> >>well, it's more than that. with a small number of corpora, the
> >>scores will be over-optimised for those people. It's a tricky
> >>problem....
> >
> > I've actually been thinking about this a bit. Our normal mass-check runs
> > are heavily weighted towards a small number of people already. For 3.1,
> > we used 9 people's logs. It totalled 1766844 messages (bmenschel's
> > wasn't included apparently). Breaking it down:
> >
> > Percent Provider
> > ------- ----------
> >   33.93 jm
> >   31.00 theo
> >    9.35 daf
> >    7.68 rod
> >    6.05 parkerm
> >    5.62 bzoetekouw
> >    5.11 quinlan
> >    1.20 cthielen
> >    0.07 misak
> >
> > So basically Justin is 34%, I'm 31%, and everyone else combined is 35%.
> > So in reality, the scores are far more tuned for Justin and myself than
> > any other single person.
> >
> > This is something I've been trying to think about wrt doing weekly score
> > generations for use by sa-update, but no real solution has come to mind yet.
>
> We seriously need to improve documentation and tools to make it easier
> for people to understand and do this. At our company we need to almost
> cripple our Asian office spamassassin because of the FP levels. We need
> better representation especially from non-Western users in mass checks.
>
> I for example am trying to get a few native Japanese employees at my
> office to participate because of the total lack of Asian representation
> currently in mass check. They misunderstood the sorting directions at
> first, so I need to train them myself to make sure they do a good job at it.

true. although without useful rules that work on Asian spam, the results
aren't going to be great.

by the way, I've decided I'll run a 3.0.5 mass-check on my corpora, if
needed.

- --j.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Exmh CVS

iD8DBQFDg7FOMJF5cimLx9ARAishAKC2Qqs1x10Kn7vzY+8YH+AFIemkYQCgqifE
b2vxt1b8Mq3Lq2nFoO+KQjU=
=43nt
-----END PGP SIGNATURE-----
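
For anyone who wants to check the arithmetic behind the "34% / 31% / 35%" split quoted above, here is a rough sketch in Python. The percentages and the 1766844 total are copied from Theo's breakdown; the per-contributor message counts it prints are only approximations back-computed from those rounded percentages, and the helper name top_share is made up for illustration. It does not touch any mass-check logs or SpamAssassin tooling.

```python
# Percent of the 3.1 mass-check corpus per contributor, as quoted in the thread.
shares = {
    "jm": 33.93, "theo": 31.00, "daf": 9.35, "rod": 7.68, "parkerm": 6.05,
    "bzoetekouw": 5.62, "quinlan": 5.11, "cthielen": 1.20, "misak": 0.07,
}
TOTAL_MESSAGES = 1766844  # total quoted in the thread


def top_share(shares, n):
    """Combined percentage contributed by the n largest contributors (hypothetical helper)."""
    return sum(sorted(shares.values(), reverse=True)[:n])


top_two = top_share(shares, 2)
rest = sum(shares.values()) - top_two

for name, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    # Approximate count only: derived from a percentage rounded to two decimals.
    approx = round(TOTAL_MESSAGES * pct / 100)
    print(f"{pct:6.2f}%  ~{approx:>7} msgs  {name}")

print(f"top two: {top_two:.2f}%, everyone else: {rest:.2f}%")
```

The quoted percentages add up to slightly more than 100 because each one is rounded, so the "everyone else" figure here comes out a hair above 35.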
