Hello Eugene and all,

I'm sorry for top-posting, but you might not be aware that dspam-community was only ever a temporary project, in place since SN gave up on DSPAM. All list messages should be addressed to [email protected] and [email protected].
Regards,
Hugo Monteiro.

Eugene wrote:
> Hi people!
>
> First of all, I am happy that there is a chance of revival for this valuable
> project.
> I'll try to help as much as I can =)
>
> For perspective, about a year ago I switched our server (about 20 users,
> 200-3000 messages per user per day -- most of them spam, of course) from
> SpamAssassin to DSPAM for three reasons:
> (1) to get quicker and more automated handling of 'new' kinds of spam that
> appear daily
> (2) to improve performance compared to a complex Perl-based system
> (3) to get automated reclassifying using the dovecot-antispam plugin from
> Johannes Berg
> Unfortunately, these goals have not been fully realized yet =(
>
> So, the improvements I would like to see:
> 1) The false negative rate is much higher than we would prefer, on the order
> of 10-15% and sometimes even higher. And it is not much fun to manually handle
> 100-300 spam messages each morning and then 10-50 each hour.
>
> Even more distressing is that if you check the reported DSPAM-Factors, you
> will see lots of entirely unrelated patterns like
> "Received*<my-email-address>" or "Content-Type*8859+1" with entirely
> counter-intuitive ratings like 0.95 for the first one. On the other hand,
> more reasonable patterns from headers and body (like "Replica+Watches") are
> often not listed at all or have ratings like 0.10. Thus, it is not
> surprising that DSPAM often keeps passing virtually the same spam messages
> as false negatives over and over again. It is amazing that I have not yet
> had a single false positive (though I could have missed them among heaps of
> spam).
>
> I tried larger patterns but the performance was unacceptable (mostly due to
> the DSPAM process using all available memory). And I am not sure that would
> help anyway, given the above strangeness of learning.
>
> 2) Although the performance is much better than SpamAssassin, there are
> moments of terrible delays (e.g. when reclassifying). As far as I can tell,
> they too are related to DSPAM using up all available memory (size of a
> process above 150M).
> Granted, the server machine is not state-of-the-art (we will be moving to a
> newer machine this week), but this is too much, I suppose.
>
> 3) More informative documentation would be very helpful -- that is, not just
> a list of parameters, but a brief overview of what each option does and when
> I should prefer one or another. Especially as not all DSPAM users are
> experts in statistical learning.
>
> 4) Speaking of which... as a longer-term goal, I think we should consider
> going beyond basically linear methods and adding support for neural nets
> and/or SVM-based classifiers.
>
> I hope these thoughts will be useful =)
>
> Best wishes
> Eugene
>
> _______________________________________________
> Dspam-community-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/dspam-community-devel
>

-- 
ci.fct.unl.pt:~# cat .signature

Hugo Monteiro
Email : [email protected]
Phone : +351 212948300 Ext.15307
Web   : http://hmonteiro.net

Centro de Informática
Faculdade de Ciências e Tecnologia da
Universidade Nova de Lisboa
Quinta da Torre 2829-516 Caparica Portugal
Phone: +351 212948596 Fax: +351 212948548
www.ci.fct.unl.pt [email protected]

ci.fct.unl.pt:~# _
_______________________________________________
Dspam-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspam-devel
