TBDEV Mission Statement
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Greetings Dev Listers,

This is your monthly message from the moderation team to remind you of the primary purpose of this discussion list. To review the list rules, follow the link at the end of this message.

TBDEV Mission statement

The TBDEV list has been set up for the purpose of discussing programming issues relating to The Bat!, particularly issues that deal with writing plug-ins. Simply send messages to [EMAIL PROTECTED] to send it to the whole list. See the notes at the end of this message for details about how to manage your list membership or to leave the list.

It would probably be a good idea for you to set up a folder to keep TBDEV messages in. If you do this, the next most useful thing to have in place is an automatic filter to move mail from your inbox into your TBDEV folder. Set up a filter for incoming mail which looks for

    Strings             Location   Presence
    Reply-to:.*TBDEV@   Kludges    Yes

    Options: Regular expressions (Checked).

TBOT - The Bat off topic discussion list

One of our members has created a list for those occasional off topic discussions of public interest. Please feel free to join this list, where many of our readership currently participate.

Addresses:
    Post message: [EMAIL PROTECTED]
    Subscribe:    [EMAIL PROTECTED]
    Unsubscribe:  [EMAIL PROTECTED]

Important disclaimer

The moderators and list administrators are not affiliated with RIT Labs or the development of TB, although the developers are themselves members of the list and will sometimes chip in. If you wish to contact the developers, please use The Bat! main menu Help .. Feedback options or, if you need to write to the programmers directly, send mail to:

    Stefan Tanurkov [EMAIL PROTECTED]
or
    Max Masyutin [EMAIL PROTECTED]

Thank you for joining, and we hope that you find this list of use.

Contacts

THE FOLLOWING E-MAIL ADDRESSES SHOULD ONLY BE USED WHEN YOU NEED TO CONTACT THE ** LIST MODERATORS OR LIST ADMINISTRATORS **. DO NOT SEND THE BAT! RELATED QUESTIONS HERE; THEY WILL *NOT* GET ANSWERED.

If you are having difficulties with the list, or one of its members, please contact one of the list moderators. If you need to send a message to the list moderation team, please send mail to: [EMAIL PROTECTED] or [EMAIL PROTECTED]

This list is moderated by the following persons of ill repute:

    Marck D Pearlstone [EMAIL PROTECTED]
    Leif Gregory [EMAIL PROTECTED]

Primary list administration is performed by Marck D Pearlstone [EMAIL PROTECTED]

The TBDEV list is hosted free of charge by Johannes Posel.

For a full list of the rules for the use of this list please refer to:
http://www.silverstones.com/thebat/subtbdev.txt
or click here: mailto:[EMAIL PROTECTED]

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.1rc1-nr1 (Windows 2000) - GPGshell v2.60

iD8DBQE+JstkOeQkq5KdzaARAg46AJ0YPxoifRrafTbUxfmJAZQL/OBYhQCeO2oA
WoLFsOAycdoz7Rx/EaiW1/Q=
=IPCM
-----END PGP SIGNATURE-----

Current version is 1.62 | Using TBDEV information: http://www.silverstones.com/thebat/TBUDLInfo.html
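The sorting rule given in the mission statement above (string `Reply-to:.*TBDEV@`, regular expressions enabled) can be tried outside The Bat! with a small sketch like the one below. It only illustrates what the pattern matches. The function name and the sample address `tbdev@example.com` are hypothetical, and case-insensitive matching is an assumption rather than documented behaviour of The Bat!.

```python
import re

# Pattern taken from the filter rule in the mission statement.
# IGNORECASE is an assumption; the post does not say how The Bat! treats case.
TBDEV_PATTERN = re.compile(r"Reply-to:.*TBDEV@", re.IGNORECASE)


def is_tbdev_message(raw_headers: str) -> bool:
    """Return True if any header line matches the TBDEV filter pattern."""
    return any(TBDEV_PATTERN.search(line) for line in raw_headers.splitlines())


# Hypothetical headers; the real list address is masked in the archive.
sample = "From: someone@example.com\nReply-To: TBDEV <tbdev@example.com>\n"
print(is_tbdev_message(sample))
```

Matching on the Reply-To header rather than on From or Subject catches every post to the list regardless of who wrote it.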
Bayesian filter - bug fixed (still a test pre-release!)
Hello, tbdev.

One bug has been fixed in the Bayesian filter. If a letter contained a token consisting entirely of "!" characters, an error occurred during degeneration and the filter failed. As a result, any letter containing such a token was treated as non-spam.

You can download the fixed version here:
http://klirik.narod.ru/arc/baesnolog.tbp
http://klirik.narod.ru/arc/baes.tbp
(I still recommend using the latter, logging version, so that you can send me a log if any bug shows up.)

No other serious errors have been found so far.

In my own testing: since the first build I have received 92 spam letters and about 25 non-spam ones (now you understand why I began writing the filter :). Among these letters I had no false positives (i.e. none of my good mail was accidentally deleted as spam) and 1 false negative (i.e. one spam letter reached my mailbox). There were also about 10 letters misclassified because of the bug just fixed. I refiltered those letters after the fix and all of them were classified as spam. So the overall effectiveness (so far) is 0% (0 of 25) false positives and 1.1% (1 of 92) false negatives. I use a regarding base built from 650 spam and about 800 non-spam letters.

Plans for the future:

1. A new rbd-generating engine (the principle stays the same, but the user interface will change and some options will be added). It also seems a good idea to recognize PGP- or S/MIME-encrypted messages automatically and do something with them: either discard them altogether or at least keep them as hash values to keep the dictionary small.

2. Filter settings will be stored in the registry. Alternatively, I found that if TBP_NeedConfig returns -1, The Bat! itself adds a [Filterdata] section to TBPlugin.INI. This section is empty for now, but I think the The Bat! developers will eventually make it possible to store settings locally for each mailbox (settings in the registry would be global).

3. Adapting rbd generation to other mailbase formats, because as far as I know SecureBat also exists and keeps its mailbases encrypted. For that particular program the problem can be solved by supporting other mailbase import formats, for example unix-mailbox.

4. A self-training feature. My current guess is that it could work as a question to the user after every 50 received letters (for example), asking him to confirm the grade of all letters or, alternatively, to confirm only the questionable ones, i.e. those automatically graded within some definite interval of spaminess (21-80%, for example). The confirmed grades would then be appended to regard.rbd. The base would always stay fresh, and it would not be necessary to run the rbd-generating engine to refresh it.

These are my own ideas. Does anyone else have some?

--
Sincerely, Alexey.
Using TB 1.63b7 on WinXP SP1 Corp + MUI RU, spelling by ORFO2002
mailto:[EMAIL PROTECTED]
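The self-training idea in point 4 above, together with the error rates quoted in the post, can be sketched roughly as follows. This is only an illustration under assumptions: the `Letter` type, the function names and the 0.0-1.0 spaminess scale are hypothetical and are not taken from the plug-in; only the 21-80% interval and the counts (0 of 25, 1 of 92) come from the post.

```python
from dataclasses import dataclass


@dataclass
class Letter:
    subject: str
    spaminess: float  # filter score on a 0.0-1.0 scale (assumed representation)


# Interval of "questionable" spaminess suggested in the post (21-80%).
LOW, HIGH = 0.21, 0.80


def triage(batch):
    """Split a batch into letters graded confidently and letters to confirm."""
    confident, questionable = [], []
    for letter in batch:
        if LOW <= letter.spaminess <= HIGH:
            questionable.append(letter)  # ask the user about these
        else:
            confident.append(letter)     # keep the automatic grade
    return confident, questionable


def error_rate_percent(errors: int, total: int) -> float:
    """Error rate as a percentage, rounded to one decimal place."""
    return round(100.0 * errors / total, 1)


batch = [Letter("invoice", 0.05), Letter("newsletter", 0.50), Letter("pills", 0.95)]
confident, questionable = triage(batch)
print([l.subject for l in questionable])   # ['newsletter']
print(error_rate_percent(0, 25))           # 0.0  (false positives)
print(error_rate_percent(1, 92))           # 1.1  (false negatives)
```

Asking only about the questionable interval keeps the number of prompts small, at the cost of never correcting a confident but wrong grade.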
Re[2]: problems with bayesian filter
Hello, rhabib001.

You wrote in mid:[EMAIL PROTECTED]

ryc> One thing I don't understand about the log is that if multiple
ryc> messages are downloaded, the log doesn't reflect this. Is this a bug?

As far as I understand, The Bat! can run several instances of the filtering procedure at the same time. For this reason it gives every call a number, usually starting from 1001. That is why I also write this number to the log; it is unique within the current The Bat! session. You can see these numbers at the beginning of every line logged during a mail check. There are no such numbers on lines for global plug-in functions such as getname or getversion. Your log combines the logs of the first version and the latest one, so there are no such numbers at all at the beginning of the log.

If any of the logged letters is spam, then the real problem is in your regarding base: either you mixed up the spam and non-spam corpora when you created it, or the base is not large enough yet.

ryc> I have trained the good dictionary on 745 letters, but the spam
ryc> dictionary on only 35 letters. Could this be the problem (I have
ryc> attached the regard.rdb file).

This is a property of the method itself (you can examine it from a mathematical point of view): the sizes of the spam and non-spam bases, counted in letters, ought to be roughly equal. Simply put, your base knows very well what is not spam, but has only a relatively hazy idea of what spam is. You need more spam for it to work, but that is a general problem with this method of filtering!

You can, of course, download a spam base from somewhere, but the problem is that spam differs from country to country. The main strength of this method is that every user's regarding base is different, because its token grades also reflect that user's own private mail. So it is very hard for spammers to fool many such filters at once. On the other hand, the spam part of the base does not seem to differ much from user to user, because spam is mass mailing. So you can ask a friend to send you a lot of (real) spam and build a better base. Or you can just take fewer good letters and build a new base with roughly equal quantities of spam and non-spam.

--
Sincerely, Alexey.
Using TB 1.63b7 on WinXP SP1 Corp + MUI RU, spelling by ORFO2002
mailto:[EMAIL PROTECTED]
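The point about unequal corpora in the reply above can be illustrated with a small calculation. The post does not say which formula the plug-in uses, so this sketch assumes a simple Graham-style per-token estimate (relative frequency in spam divided by the sum of relative frequencies in spam and non-spam); the function name and the token counts are hypothetical, and only the corpus sizes 745 and 35 come from the thread.

```python
def token_spam_probability(spam_hits: int, good_hits: int,
                           n_spam: int, n_good: int) -> float:
    """Assumed Graham-style estimate of how 'spammy' one token is.

    spam_hits / good_hits: number of spam / good letters containing the token.
    n_spam / n_good: total number of letters in each corpus.
    """
    if n_spam <= 0 or n_good <= 0:
        raise ValueError("both corpora must contain at least one letter")
    spam_freq = min(1.0, spam_hits / n_spam)
    good_freq = min(1.0, good_hits / n_good)
    if spam_freq + good_freq == 0:
        return 0.5  # token never seen: treat as neutral (an assumption)
    return spam_freq / (spam_freq + good_freq)


# A token found in 10 of 745 good letters and in a single spam letter.
small_spam_corpus = token_spam_probability(1, 10, n_spam=35, n_good=745)
balanced_corpus = token_spam_probability(1, 10, n_spam=745, n_good=745)
print(f"35 spam letters:  {small_spam_corpus:.2f}")   # about 0.68
print(f"745 spam letters: {balanced_corpus:.2f}")     # about 0.09
```

Under this assumed formula, one stray spam letter moves the token from 0 to about 0.68 when the spam corpus holds only 35 letters, which is one way to see why a small spam base gives a hazy idea of what spam is.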