Lars Stavholm wrote:
> David Reid wrote:
>> Sorry to have to ask again, but despite trying a lot of variations the
>> situation still isn't clear and isn't improving for the affected users.
>>
>> The situation is that some domains have a catch-all address, ie
>> <anything>@domain maps to a single email address. In this situation the
>> training works on the address that the mail was sent to - which is as
>> expected. My question is whether there is a way to have all training for
>> any domain address used for all domain addresses? Can some form of
>> groups setup be used?
>
> Take a look in the 3.6.8 README, section 2.1 CONFIGURING GROUPS.

In addition, here's my working setup, thanks to Tony Earnshow:

Postfix -> DSPAM -> Cyrus IMAP

# dspam --version

DSPAM Anti-Spam Suite 3.6.8 (agent/library)

Copyright (c) 2002-2006 Jonathan A. Zdziarski
http://dspam.nuclearelephant.com

DSPAM may be copied only under the terms of the GNU General Public License,
a copy of which can be found with the DSPAM distribution kit.

Configuration parameters: --prefix=/usr --sysconfdir=/etc --with-dspam-home=/var/lib/dspam --mandir=/usr/share/man --enable-daemon --enable-debug --enable-clamav --enable-syslog --enable-homedir

# cat /var/lib/dspam/group
users:shared:[EMAIL PROTECTED]

# egrep -v '^#|^$' /etc/dspam.conf
Home /var/lib/dspam
TrustedDeliveryAgent "/usr/lib/cyrus/bin/deliver"
DeliveryHost 127.0.0.1
DeliveryPort 10026
DeliveryIdent localhost
DeliveryProto SMTP
OnFail error
Trust root
Trust mail
Trust dspam
Trust wwwrun
TrainingMode teft
TestConditionalTraining on
Feature noise
Feature chained
Feature whitelist
Algorithm graham burton
PValue graham
ImprobabilityDrive on
Preference "spamAction=deliver"
Preference "signatureLocation=headers" # 'message' or 'headers'
Preference "showFactors=off"
AllowOverride trainingMode
AllowOverride spamAction
AllowOverride spamSubject
AllowOverride statisticalSedation
AllowOverride enableBNR
AllowOverride enableWhitelist
AllowOverride signatureLocation
AllowOverride showFactors
AllowOverride optIn optOut
AllowOverride whitelistThreshold
HashRecMax 98317
HashAutoExtend on
HashMaxExtents 0
HashExtentSize 49157
HashMaxSeek 100
HashConnectionCache 10
Lookup "rabl.nuclearelephant.com"
RBLInoculate on
Notifications off
PurgeSignatures 14
PurgeNeutral 90
PurgeUnused 90
PurgeHapaxes 30
PurgeHits1S 15
PurgeHits1I 15
LocalMX 127.0.0.1
SystemLog on
UserLog off
TrainPristine on
Opt out
Broken lineStripping
ClamAVPort 3310
ClamAVHost 127.0.0.1
ClamAVResponse spam
ServerPID /var/run/dspam.pid
ServerMode auto
ServerParameters "--deliver=innocent,spam -d %u"
ServerIdent "mail.domain.tld"
ServerDomainSocketPath "/var/tmp/dspam.sock"
ClientHost /var/tmp/dspam.sock
ProcessorBias on

With this setup, however, the webui doesn't work, except for the global
statistics page. As you can see, we use the hash driver and shared groups,
which works like a charm.

For user mail training we use a simple script that collects misclassified
ham/spam on an hourly basis from dedicated user IMAP folders, like so:

#!/bin/bash
# $Id: dspam_learn.sh.in 1971 2007-03-16 22:18:02Z stava $
# @(#) Look for user/$user/Spam/train/{ham,spam} and if all those directories exist,
# @(#) and there's at least one mail message to learn from,
# @(#) perform the training and the subsequent cleanup (remove the mails).

id="`id | cut -d= -f2 | cut -d\( -f1`"
[ "$id" = "0" ] || { echo >&2 "$0: must be root"; exit 1; }

# look here for cyrus imap users...
basedir="/var/spool/imap/user"

# establish working directory...
cd /var/tmp

# loop through all users...
for u in $basedir/*; do
  user="`basename $u`"; ham=; spam=
  # if all user directories (folders) exist, and only then...
  [ -d $u/Spam ] && [ -d $u/Spam/train ] && \
  [ -d $u/Spam/train/ham ] && [ -d $u/Spam/train/spam ] && {
    ls $u/Spam/train/ham/[0-9]*. &> /dev/null && {
      echo -n "ham: "
      for mail in $u/Spam/train/ham/[0-9]*.; do
        echo -n "`basename $mail`"
        sed '/^X-DSPAM-/d' $mail | \
          dspam --user users --class=innocent --deliver=innocent --source=error
        [ $? = 0 ] && rm $mail
      done
      echo ""
      ham=.
    }
    ls $u/Spam/train/spam/[0-9]*. &> /dev/null && {
      echo -n "spam: "
      for mail in $u/Spam/train/spam/[0-9]*.; do
        echo -n "`basename $mail`"
        sed '/^X-DSPAM-/d' $mail | \
          dspam --user users --class=spam --deliver=spam --source=error
        [ $? = 0 ] && rm $mail
      done
      echo ""
      spam=.
    }
    # tell cyrus that we removed some mail messages...
    [ $ham ] && su - cyrus -c "reconstruct -r user/$user/Spam/train/ham"
    [ $spam ] && su - cyrus -c "reconstruct -r user/$user/Spam/train/spam"
  }
done
exit 0

This all works beautifully now.

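To see what the sed filter in the training script does before a message is handed to dspam, here is a minimal sketch that can be run without DSPAM installed. The helper names strip_dspam_headers and sample_message, and the sample message itself, are made up for illustration; only the sed expression comes from the script above.

```shell
#!/bin/bash
# Sketch only: strip_dspam_headers and sample_message are hypothetical
# helper names, and the message below is invented for illustration.

strip_dspam_headers() {
    # Delete every line that begins with "X-DSPAM-"; pass everything else through.
    sed '/^X-DSPAM-/d'
}

sample_message() {
    printf '%s\n' \
        'Subject: test' \
        'X-DSPAM-Result: Innocent' \
        'X-DSPAM-Signature: 0123456789abcdef' \
        '' \
        'Hello world'
}

# Prints the message with both X-DSPAM-* lines removed.
sample_message | strip_dspam_headers
```

Note that the filter works line by line over the whole message, so it would also drop a body line that happens to start with "X-DSPAM-", and it would not remove a folded continuation line of such a header.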
After only a few days and just a few hundred mails, on a low-volume site,
we get:

dspam_stats -H users:
    TP True Positives: 136
    TN True Negatives: 392
    FP False Positives: 5
    FN False Negatives: 33
    SC Spam Corpusfed: 0
    NC Nonspam Corpusfed: 0
    TL Training Left: 2103
    SHR Spam Hit Rate 80.47%
    HSR Ham Strike Rate: 1.26%
    OCA Overall Accuracy: 93.29%

...where the Overall Accuracy is climbing rapidly.

Kudos to Tony, who helped me get this far.

In case it is of any use, our dspam is packaged as an rpm that works right
out of the box on a SuSE Linux 10.1 platform:
<http://www.linadd.org/download/mail/dspam-3.6.8-1.i586.rpm>.

Hope this helps
/Lars
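For anyone who wants to sanity-check the percentages in the dspam_stats output above, they can be recomputed from the raw counts. The sketch below assumes the usual ratios (hit rate = TP/(TP+FN), strike rate = FP/(FP+TN), accuracy = (TP+TN)/all); that assumption is not taken from the DSPAM sources, but it does reproduce the three figures shown. The helper name pct is hypothetical.

```shell
#!/bin/bash
# Sketch only: pct is a hypothetical helper, and the formulas are assumed
# definitions that happen to match the dspam_stats figures quoted above.

# pct NUM DEN: print NUM/DEN as a percentage with two decimals.
pct() {
    awk -v n="$1" -v d="$2" \
        'BEGIN { if (d == 0) print "n/a"; else printf "%.2f%%\n", 100 * n / d }'
}

# Raw counts copied from the dspam_stats output.
tp=136; tn=392; fp=5; fn=33

echo "SHR Spam Hit Rate:    $(pct "$tp" $((tp + fn)))"
echo "HSR Ham Strike Rate:  $(pct "$fp" $((fp + tn)))"
echo "OCA Overall Accuracy: $(pct $((tp + tn)) $((tp + tn + fp + fn)))"
```

Plugging in fresh counts from a later dspam_stats run gives a quick way to track how the accuracy moves over time.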
