Hello, tbdev.
One bug has been fixed in the Bayesian filter.
The bug was that if a letter contained a token consisting entirely of
'!' characters, an error occurred during degeneration and the filter
failed. As a result, any letter containing such a token was treated as
non-spam.
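
To illustrate the kind of failure, here is a minimal sketch in Python.
It is not the plugin's code. It assumes that degeneration works roughly
as Paul Graham describes it (try less specific variants of a token that
is not in the dictionary), and it simply returns the first variant
found. The names degenerate and lookup and the default of 0.4 for
unknown tokens are my assumptions for the example. The point is that
stripping trailing '!' must stop before the token becomes empty.

```python
def degenerate(token):
    """Return less specific variants of `token`, most specific first.

    Assumed scheme: strip trailing '!' one at a time and also try the
    lowercase form of each variant.
    """
    forms = [token]
    # Stop while at least one character is left, so a token made only
    # of '!' never degenerates into an empty string.
    while forms[-1].endswith("!") and len(forms[-1]) > 1:
        forms.append(forms[-1][:-1])

    variants = []
    for form in forms:
        for candidate in (form, form.lower()):
            if candidate != token and candidate not in variants:
                variants.append(candidate)
    return variants


def lookup(token, probabilities, default=0.4):
    """Spam probability of `token`, falling back to its degenerate forms.

    `probabilities` maps token -> spam probability; `default` is used
    when neither the token nor any variant is known (assumed value).
    """
    for candidate in [token] + degenerate(token):
        if candidate in probabilities:
            return probabilities[candidate]
    return default


print(degenerate("FREE!!"))
print(degenerate("!!!"))
print(lookup("FREE!!!", {"free": 0.9}))
```

With this guard a token such as "!!!" yields only "!!" and "!", and an
unknown one falls back to the default instead of raising an error.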
The fixed version can be downloaded here:
http://klirik.narod.ru/arc/baesnolog.tbp
http://klirik.narod.ru/arc/baes.tbp
(I still recommend using the latter (logging) version, so that you can
send me a log if any bug arises.)
At the moment no other serious errors have been found.
In my own testing: since the first build I have received 92 spam
letters and about 25 non-spam ones (now you understand why I began to
write the filter :). Among these letters I had no false positives (i.e.
none of my good mail was accidentally deleted as spam) and 1 false
negative (i.e. one spam letter came to my mailbox). There were also
about 10 false negatives caused by the bug just fixed. I refiltered
these letters afterwards and all of them were classified as spam. So
the total effectiveness (for the moment) is:
0% (0 of 25) false positives and
1.1% (1 of 92) false negatives.
My regard base contains 650 spam and about 800 non-spam letters.
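
The two percentages are just the number of errors divided by the number
of letters of that kind. A small Python check of the arithmetic
(error_rate is a hypothetical helper, not part of the plugin):

```python
def error_rate(errors, total):
    """Return errors as a percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * errors / total


# Counts taken from the figures above.
false_positive_rate = error_rate(0, 25)   # good letters marked as spam
false_negative_rate = error_rate(1, 92)   # spam letters that got through

print(f"{false_positive_rate:.1f}% false positives")
print(f"{false_negative_rate:.1f}% false negatives")
```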
Plans for the future:
1. A new rbd-generating engine (the principle is the same, but the
user interface will be changed and some options added). It also seems
a good idea to recognize PGP- or S/MIME-encrypted messages
automatically and do something with them: discard them altogether, or
at least keep them as hash values in order to reduce the dictionary.
2. Filter settings will be stored in the registry. Alternatively, I
found that if TBP_NeedConfig returns -1, The Bat! itself adds a
[Filterdata] section to TBPlugin.INI. This section is empty for now,
but I think that in future The Bat! developers will make it possible
to store settings locally for every mailbox (settings in the registry
would be global).
3. Adapt rbd generation to other mailbase formats, because as far as
I know SecureBat also exists and keeps its mailbases encrypted.
For that particular program the problem can be solved by supporting
other importable mailbase formats, for example unix-mailbox.
4. A self-training feature. For now I imagine it as a question put to
the user after every 50 received letters (for example), asking him to
confirm the grade of all letters or, alternatively, to confirm only
the questionable letters, i.e. those automatically graded within some
definite interval of spaminess (21-80%, for example). After that the
new grades will be appended to regard.rbd. So the base will always be
fresh and it won't be necessary to use the rbd-generating engine to
refresh it.
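
A rough sketch of how the selection in item 4 might look, in Python
for brevity. Everything here is hypothetical: the function name, the
(message id, spaminess) pairs and the parameter names are invented for
the example, and only the numbers 50 and 21-80% come from the item
above. Appending the confirmed grades to regard.rbd is left out.

```python
def select_for_review(graded, batch_size=50, low=21.0, high=80.0):
    """Pick the letters the user should be asked to confirm.

    `graded` is a list of (message_id, spaminess) pairs in arrival
    order, with spaminess given in percent. Nothing is returned until
    a full batch has arrived; then only the letters of the latest
    batch whose spaminess lies within [low, high] are returned.
    """
    if len(graded) < batch_size:
        return []
    latest = graded[-batch_size:]
    return [msg_id for msg_id, spaminess in latest
            if low <= spaminess <= high]


# Tiny batch of three letters for demonstration.
letters = [("a", 5.0), ("b", 50.0), ("c", 99.0)]
print(select_for_review(letters, batch_size=3))
```

Only letter "b" falls into the questionable interval here, so the user
would be asked about that one alone.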
These are my own ideas. Does anyone else have any?
--
Sincerely,
Alexey.
Using TB 1.63b7 on WinXP SP1 Corp + MUI RU, spelling by ORFO2002
mailto:[EMAIL PROTECTED]
Current version is 1.62 | Using TBDEV information:
http://www.silverstones.com/thebat/TBUDLInfo.html