Lars Stavholm wrote:
> Lars Stavholm wrote:
>> David Reid wrote:
>>> Sorry to have to ask again, but despite trying a lot of variations the
>>> situation still isn't clear and isn't improving for the affected users.
>>>
>>> The situation is that some domains have a catch-all address, ie
>>> <anything>@domain maps to a single email address. In this situation the
>>> training works on the address that the mail was sent to - which is as
>>> expected. My question is whether there is a way to have all training for
>>> any domain address used for all domain addresses? Can some form of
>>> groups setup be used?
>> Take a look in the 3.6.8 README, section 2.1 CONFIGURING GROUPS.
>
> In addition, here's my working setup, thanks to Tony Earnshow:
>
> Postfix -> DSPAM -> Cyrus IMAP
>
> # dspam --version
> DSPAM Anti-Spam Suite 3.6.8 (agent/library)
> Copyright (c) 2002-2006 Jonathan A. Zdziarski
> http://dspam.nuclearelephant.com
> DSPAM may be copied only under the terms of the GNU General Public
> License, a copy of which can be found with the DSPAM distribution kit.
> Configuration parameters: --prefix=/usr --sysconfdir=/etc
> --with-dspam-home=/var/lib/dspam --mandir=/usr/share/man --enable-daemon
> --enable-debug --enable-clamav --enable-syslog --enable-homedir
>
> # cat /var/lib/dspam/group
> users:shared:[EMAIL PROTECTED]
>
> # egrep -v '^#|^$' /etc/dspam.conf
> Home /var/lib/dspam
> TrustedDeliveryAgent "/usr/lib/cyrus/bin/deliver"
> DeliveryHost 127.0.0.1
> DeliveryPort 10026
> DeliveryIdent localhost
> DeliveryProto SMTP
> OnFail error
> Trust root
> Trust mail
> Trust dspam
> Trust wwwrun
> TrainingMode teft
> TestConditionalTraining on
> Feature noise
> Feature chained
> Feature whitelist
> Algorithm graham burton
> PValue graham
> ImprobabilityDrive on
> Preference "spamAction=deliver"
> Preference "signatureLocation=headers" # 'message' or 'headers'
> Preference "showFactors=off"
> AllowOverride trainingMode
> AllowOverride spamAction
> AllowOverride spamSubject
> AllowOverride statisticalSedation
> AllowOverride enableBNR
> AllowOverride enableWhitelist
> AllowOverride signatureLocation
> AllowOverride showFactors
> AllowOverride optIn optOut
> AllowOverride whitelistThreshold
> HashRecMax 98317
> HashAutoExtend on
> HashMaxExtents 0
> HashExtentSize 49157
> HashMaxSeek 100
> HashConnectionCache 10
> Lookup "rabl.nuclearelephant.com"
> RBLInoculate on
> Notifications off
> PurgeSignatures 14
> PurgeNeutral 90
> PurgeUnused 90
> PurgeHapaxes 30
> PurgeHits1S 15
> PurgeHits1I 15
> LocalMX 127.0.0.1
> SystemLog on
> UserLog off
> TrainPristine on
> Opt out
> Broken lineStripping
> ClamAVPort 3310
> ClamAVHost 127.0.0.1
> ClamAVResponse spam
> ServerPID /var/run/dspam.pid
> ServerMode auto
> ServerParameters "--deliver=innocent,spam -d %u"
> ServerIdent "mail.domain.tld"
> ServerDomainSocketPath "/var/tmp/dspam.sock"
> ClientHost /var/tmp/dspam.sock
> ProcessorBias on
>
> With this setup, however, the webui doesn't work,
> except for the global statistics page.
>
> As you can see, we use the hash driver and shared groups,
> works like a charm.
>
> For user mail training we use a simple script that collects
> misclassified ham/spam on an hourly basis from dedicated
> user IMAP folders like so:
>
> #!/bin/bash
> # $Id: dspam_learn.sh.in 1971 2007-03-16 22:18:02Z stava $
> # @(#) Look for user/$user/spam/{ham,train} and if all those directories exist,
> # @(#) and there's at least one mail message to learn from,
> # @(#) perform the training and the subsequent cleanup (remove the mails).
>
> id="`id | cut -d= -f2 | cut -d\( -f1`"
> [ "$id" = "0" ] || { echo >&2 "$0: must be root"; exit 1; }
>
> # look here for cyrus imap users...
> basedir="/var/spool/imap/user"
>
> # establish working directory...
> cd /var/tmp
>
> # loop through all users...
> for u in $basedir/*; do
>   user="`basename $u`"; ham=; spam=
>   # if all user directories (folders) exist, and only then...
>   [ -d $u/Spam ] && [ -d $u/Spam/train ] && \
>   [ -d $u/Spam/train/ham ] && [ -d $u/Spam/train/spam ] && {
>     ls $u/Spam/train/ham/[0-9]*. &> /dev/null && {
>       echo -n "ham: "
>       for mail in $u/Spam/train/ham/[0-9]*.; do
>         echo -n "`basename $mail`"
>         sed '/^X-DSPAM-/d' $mail | \
>           dspam --user users --class=innocent --deliver=innocent --source=error
>         [ $? = 0 ] && rm $mail
>       done
>       echo ""
>       ham=.
>     }
>     ls $u/Spam/train/spam/[0-9]*. &> /dev/null && {
>       echo -n "spam: "
>       for mail in $u/Spam/train/spam/[0-9]*.; do
>         echo -n "`basename $mail`"
>         sed '/^X-DSPAM-/d' $mail | \
>           dspam --user users --class=spam --deliver=spam --source=error
>         [ $? = 0 ] && rm $mail
>       done
>       echo ""
>       spam=.
>     }
>     # tell cyrus that we removed some mail messages...
>     [ $ham ] && su - cyrus -c "reconstruct -r user/$user/Spam/train/ham"
>     [ $spam ] && su - cyrus -c "reconstruct -r user/$user/Spam/train/spam"
>   }
> done
> exit 0
>
> This all works beautifully now.
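One detail in the training loop above: sed '/^X-DSPAM-/d' deletes matching lines anywhere in the message, body included. A minimal sketch of a variant that only touches the header block, assuming the headers end at the first empty line. The function name strip_dspam_headers is made up for illustration, and folded continuation lines of an X-DSPAM- header are not handled:

```shell
#!/bin/bash
# Hypothetical helper, not part of DSPAM or of the script quoted above.
# Reads a mail message on stdin and writes it to stdout with X-DSPAM-*
# lines removed from the header block only.
# Assumption: the header block runs from line 1 to the first empty line.
strip_dspam_headers() {
    # The address range 1,/^$/ limits the delete command to the headers;
    # everything after the first empty line is passed through untouched.
    sed -e '1,/^$/{/^X-DSPAM-/d;}'
}
```

It would slot in where the loop calls sed, e.g. strip_dspam_headers < $mail | dspam ..., with the dspam arguments left exactly as they are in the script.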
> After a few days only,
> just a few hundred mails, on a low volume site, we get:
>
> dspam_stats -H
> users:
> TP True Positives: 136
> TN True Negatives: 392
> FP False Positives: 5
> FN False Negatives: 33
> SC Spam Corpusfed: 0
> NC Nonspam Corpusfed: 0
> TL Training Left: 2103
> SHR Spam Hit Rate 80.47%
> HSR Ham Strike Rate: 1.26%
> OCA Overall Accuracy: 93.29%
>
> ...where the Overall Accuracy is climbing rapidly.
>
> Kudos to Tony who helped me to get thus far.
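For anyone wanting to sanity-check those percentages, here is a rough sketch of the arithmetic. The formulas are my reading of the labels (hits over all spam, false positives over all ham, correct results over everything); they do reproduce the three figures quoted above, but whether dspam_stats computes them exactly this way internally is an assumption. The function name dspam_rates is made up:

```shell
#!/bin/bash
# Hypothetical helper, not part of DSPAM.
# Usage: dspam_rates TP TN FP FN
# Assumed formulas (they match the figures in the quoted output):
#   SHR = TP / (TP + FN)                   share of spam that was caught
#   HSR = FP / (FP + TN)                   share of ham wrongly flagged
#   OCA = (TP + TN) / (TP + TN + FP + FN)  share of all mail classified right
# No guard against zero totals; a fresh user with no mail would divide by zero.
dspam_rates() {
    # LC_ALL=C keeps the decimal point a "." regardless of the system locale.
    LC_ALL=C awk -v tp="$1" -v tn="$2" -v fp="$3" -v fn="$4" 'BEGIN {
        printf "SHR %.2f%%\n", 100 * tp / (tp + fn)
        printf "HSR %.2f%%\n", 100 * fp / (fp + tn)
        printf "OCA %.2f%%\n", 100 * (tp + tn) / (tp + tn + fp + fn)
    }'
}
```

Running dspam_rates 136 392 5 33 with the counts above prints SHR 80.47%, HSR 1.26% and OCA 93.29%, one per line.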
Many thanks! I'll try the shared group for the domain in question :-)
>
> If of any use, our dspam is packaged as an rpm which
> works right-out-of-the-box on a SuSE Linux 10.1 platform:
> <http://www.linadd.org/download/mail/dspam-3.6.8-1.i586.rpm>.
>
> Hope this helps
> /Lars
