Hello Alexander,

   On Tue, 7 Dec 2004 00:24:43 +0100 (07.12.2004 4:24 my local time),
   received Tuesday, December 7, 2004 at 12:49:27 +0500,
   you wrote about "We need your help with BayesIt",
   at least in part:

ASK> "localization" option where I'm asked to build a translation table in the
ASK> style of ä=a (or the other way around?), I've never seen this in any other
ASK> Bayes filter (SpamPal's Bayesian, PopFile, K9 - I've used them all), yet
ASK> they've all worked quite good. Can anyone explain what this function is
ASK> about?
It lets the filter detect deliberately degraded (obfuscated) text, so that it can
- classify tokens correctly
- store only one variant of a token instead of every possible spelling

How it works: as you probably know, some alphabets contain characters
that look the same (or almost the same) as a character from another
alphabet, or as another character from the same alphabet.

This substitution game was very popular in spam (it is less so now),
because it can fool a Bayes filter, though not for long.

The token "Viagra" is not identical to "V1ägrä" as far as a Bayes filter
is concerned, but the latter is still readable and understandable for a
human. By creating pairs, you let BIT test not only the token it found
but also all its possible spelling variants, which (ideally) improves
detection quality.
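
A rough sketch of the idea in Python, for illustration only. This is not
BayesIt's actual code or its table syntax; the pairs in LOOKALIKES and the
names normalize and count_tokens are made up for the example. It assumes
each pair maps one look-alike character to one canonical character.

```python
import unicodedata

# Hypothetical substitution pairs: look-alike character -> canonical character.
# Illustrative only; a real table would be built by the user for their language.
LOOKALIKES = {
    "ä": "a",
    "1": "i",
    "0": "o",
    "@": "a",
    "$": "s",
}

_TABLE = str.maketrans(LOOKALIKES)


def normalize(token):
    """Fold look-alike characters so spelling variants share one form."""
    # NFC first, so an "a" followed by a combining diaeresis becomes a single "ä".
    composed = unicodedata.normalize("NFC", token)
    return composed.lower().translate(_TABLE)


def count_tokens(tokens):
    """Count tokens by normalized form, storing one entry per word."""
    counts = {}
    for token in tokens:
        key = normalize(token)
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    print(normalize("V1ägrä"))
    print(count_tokens(["Viagra", "V1ägrä", "viagra"]))
```

With these pairs, "Viagra" and "V1ägrä" end up as the same token, so the
filter keeps a single entry for both. The catch is that a pair such as 1=i
also rewrites ordinary numbers, so the table has to be chosen with care.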

-- 
Best regards,
 Alexander Leschinsky

Powered by
 • The Bat! 3.0.2.10 • POP3 Catcher 2.0.923.1620
 • MyMacros 1.11a • AnotherMacros 0.3.21 /24ED1B1E0/ • Universal Macro eXtender 2.1.1074 rc3
Weakened by Windows XP 5.1.2600



________________________________________________________
 Current beta is 3.0.2.10 | 'Using TBBETA' information:
http://www.silverstones.com/thebat/TBUDLInfo.html
IMPORTANT: To register as a Beta tester, use this link first -
http://www.ritlabs.com/en/partners/testers/
