Package: dspam
Version: 3.6.4-4
Severity: important

OK, I installed dspam, dspam-webfrontend, dspam-doc, and libdspam7-drv-sqlite3. Now what? Understand that I have experience using an older version of dspam on a Red Hat box, but I am at a loss as to how to proceed with the customized Debian (sarge + some sid) version. README.Debian is of no help. It doesn't tell you what options the program was compiled with.

DSPAM bugs, DSPAM documentation bugs, and Debian packaging bugs all compound to make a real mess. The configuration I am trying to use is exim -> maildrop -> dspam, and I tried both standalone and server modes.

The DSPAM website wiki provides no link to the Debian package maintainers' web page, and that web page has no useful information either. It just says that there is great documentation at the DSPAM site, which is definitely not true.

The problems I have had indicate some areas that need to be documented.

- How do you tell dspam to initialize the database for a user? Apparently this is automatic (it wasn't in older versions) if you use one of the simple databases (Berkeley DB or SQLite) and dspam actually does something, which it usually doesn't. The docs need to say this. The database was automagically created when I ran dspam_stats, for a real or a bogus user, so typos create new users. But no database was created when I tried the "dspam" or "dspam_corpus" commands.

- Training fails silently. Actually, it verbosely claims that it is doing something:

    dspam_corpus --spam <mailbox>
    command: '/usr/bin/dspam' --class=spam --source=corpus --user 'whitis'
    /usr/bin/dspam_corpus: 2788 messages, 00:00:27 elapsed, 103.26 msgs./sec.

  This is true whether the training is done as an ordinary user or as root. Meanwhile, /var/spool/dspam/data/local/whitis/whitis.sdb is only 8192 bytes and dspam_stats reports all zeros.

- Many of the utilities don't run as an ordinary user. dspam_stats, for example, should run for ordinary users.

- The web frontend doesn't run (you see the text of the CGIs). Now that apache2 uses configuration directories instead of a monolithic file, it should be possible to make http://localhost/dspam/ a CGI directory on package install. Maybe /etc/apache2/sites-available/default needs an "Include /etc/apache2/sites-available/default.d/" line to pull in a directory of snippets.

  I tried symlinking /etc/dspam/dspam-apache2.conf into /etc/apache2/sites-enabled, but apache died on the SuexecUserGroup statement. Apparently the dspam package failed to install the files in /etc/apache2/mods-enabled/ that load suexec. (Should this be a separate Debian package so it doesn't conflict with other attempts to load the module?) Note that suexec doesn't appear to play nicely with others anyway, since you can't define it in a per-directory context (i.e. /dspam runs as dspam, /analog runs as analog, etc.). This might be a step in the right direction:

    echo "LoadModule suexec_module /usr/lib/apache2/modules/mod_suexec.so" >> /etc/apache2/mods-enabled/suexec.load

  Given the limitations of apache suexec, dspam probably needs to create a virtual host on a different port number:

    <VirtualHost localhost:1234>

- It appears that you intend people to run dspam in client/server mode. README.Debian needs to point the user to /etc/default/dspam to initialize the server. The dspam configuration files also need to be initialized so the client and server can actually talk. It appears that secret authentication tokens are needed, but how to set them is clear as mud. The package could poke a randomly generated token into the files when it creates them:

    token=`dd if=/dev/random bs=512 count=1 | md5sum`

  Presumably the token needs to be set in two places in the config file, but where is the second place?

    dspamc --user whitis --classify

  says "... unable to authenticate client".

- Do world permissions on /etc/dspam/dspam.conf need to be set so dspamc can read the file?

- How does an individual user set options like opt-in? Do ~/.dspam files work? That depends on how dspam was compiled and on the configuration files. The dspam docs mention that much, but they don't actually tell you what the options are to enable or disable .dspam files. So, how does an ordinary user set opt-in? It certainly isn't adequately documented at the DSPAM website.

  - touch /var/dspam/opt-in/local/user.whitis
    fails with "permission denied".

  - dspam_admin add preference whitis optin yes
    fails with "Program mode requires special privileges, e.t. root or Trusted User".
    dspam_admin sort of works as an ordinary user, but if you give an incorrect option like "add preference whitis bogons yes" you get an error message like this:

        Unable to open file for writing: /var/spool/dspam/data/local/whitis/whitis.prefs.bak:

    So that doesn't inspire confidence. Is the correct spelling "optin", "OptIn", "optIn", or "opt-in"?

  - touch ~/.dspam
    depends on whether homedirs is set, but the DSPAM site doesn't bother to tell you where one sets this, so you can't even check how it is set. And strace shows that the program is not checking ~/.dspam, which is probably a serious mistake, since permissions leave an ordinary user no other way to set opt-in or opt-out.

- How does a user choose between:
  - not running dspam at all;
  - not running dspam as an Exim filter rule, so they can run it from within maildrop (called from .forward). This is trickier if a filter rule is installed, since it appears that if you opt out, dspam simply fails silently even when it wasn't invoked by the MTA;
  - running dspam as an Exim filter rule?

  It appears there is a serious design flaw in dspam: it ignores all commands based on opt-in/opt-out status. Instead, the opt-in check should be requested with a special option, such as "dspam --check-optinout", and only when dspam is called from an MTA filter rule rather than from maildrop, procmail, a manual command, dspam_corpus, or some other program that is using it to classify mail.

- dspam run as an ordinary user gets a permission error on /var/spool/dspam/data/local/whitis.

- Since DSPAM appears to be set up for opt-in, the dspam package could install dspam support that works as soon as a user opts in, at least if a suitable parameter is set somewhere with something like update-alternatives (i.e. we need to control whether there is a symbolic link in the filter directories of Exim and other MTAs).

- There is no MTA integration, not even with the default Exim MTA. Given Exim's split configuration files, the package could include the necessary setup file for the filter rule.

- The problems with the program failing silently were the same with the sqlite and hash database backends.

- If you really intend people to use client/server mode, doesn't that require changing "dspam" to "dspamc" in the sample Exim configs?

- dspam does not seem to work in either standalone or client/server mode.

- With email notifications on:

    Unable to open file for reading: firstrun.txt: No such file or directory

- Creating /var/spool/dspam/data/opt-in/local/user.whitis made a difference, in spite of the fact that an ordinary user can't do that and that /var/run/debug/ clearly shows that it read the preference set with dspam_admin:

    29813: [05/06/2006 13:21:13] Loading preference 'optin' = 'yes'

  Now, when run as whitis, I get:

    Unable to create directory: /var/spool/dspam/data/local/whitis: Permission denied

  And as root, it actually classifies the spam:

    dspam --user whitis --classify --debug --stdout

- What do they mean by optin in the dspam_admin preferences? Does it mean that you are opted in, as you would expect, or that you must touch the file in the opt-in/local directory?

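To see which preferences dspam actually loads during a run, the debug output can be filtered for lines like the "Loading preference" line quoted above. This is only a minimal sketch: it assumes the debug output keeps exactly that line format, and loaded_prefs is a name made up for this example, not a dspam tool.

```shell
# Print "name=value" for every preference reported in dspam debug output
# read from stdin. Assumes lines shaped like the one quoted in this report:
#   29813: [05/06/2006 13:21:13] Loading preference 'optin' = 'yes'
loaded_prefs() {
    sed -n "s/.*Loading preference '\([^']*\)' = '\([^']*\)'.*/\1=\2/p"
}

# Demonstration on the line from this report.
sample="29813: [05/06/2006 13:21:13] Loading preference 'optin' = 'yes'"
result=$(printf '%s\n' "$sample" | loaded_prefs)
echo "$result"
```

Feeding it the real debug log (wherever the package writes it) would show whether a preference such as optin is being read at all, independently of whether dspam then honors it.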
- Fixing permission problems:

    chown whitis /var/spool/dspam/data/local/whitis
    chmod o+rx /var/spool/dspam/
    chmod o+rx /var/spool/dspam/data/
    chmod o+rx /var/spool/dspam/data/local/

  But when I do an operation (as whitis) that requires writing to the database, such as dspam_corpus, I get:

    query error: attempt to write a readonly database: see sql.errors for more details
    Unable to open file for writing: /var/log/dspam//sql.errors: Permission denied

  So:

    chown whitis.dspam /var/spool/dspam/data/local/whitis/*
    chmod ug+rw /var/spool/dspam/data/local/whitis/*

  But the program later creates whitis.sdb-journal, which doesn't have group write permission and could cause trouble later.

    chmod ug+rw /var/spool/dspam/data/local/whitis/*

- Performance. On a 3000 MHz AMD Athlon 64 with 1 GB of dual-channel PC3200 RAM, I am seeing very low dspam_corpus performance of about 0.25 messages per second. This improved somewhat after killing the unused dspam daemon. The problem seems to be related to sqlite database locking: running dspam on a test message under strace shows that the program pauses on an lseek() on the database file shortly after opening it. Now processing a message sometimes takes about a tenth of a second, but other times (dspam_corpus running in the background, but no inbound mail filtering) it takes 1-2 seconds. A quick benchmark with formail on a mailbox of 217 messages shows a throughput of 4.5 messages per second.

  dspam_corpus is still showing rates well under 0.5 msgs per second, but it was in the middle of a 200 MB spam folder when I killed the daemon, so the long-term average may be confusing things. Actually, I killed it in the middle, and dspam_corpus is still only reporting 0.2 messages per second. CPU load is under 10%. Disk activity is negligible (the entire database is sucked into the disk cache). Apparently, training operations (database writes) are much slower than read-only classify operations.

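For anyone who wants to reproduce throughput figures like these, here is a rough sketch of how a messages-per-second number can be measured for any command that reads a mailbox on stdin. The helper names count_msgs and rate_msgs are made up for this example. It assumes an mbox file whose messages start with "From " lines, so the count is approximate if a message body contains an unescaped "From " line, and it only has whole-second resolution.

```shell
# Count messages in an mbox file by its "From " separator lines.
count_msgs() {
    grep -c '^From ' "$1"
}

# Feed an mbox to a command on stdin and report messages per second.
# Usage: rate_msgs MBOX COMMAND [ARGS...]
rate_msgs() {
    mbox=$1; shift
    n=$(count_msgs "$mbox")
    start=$(date +%s)
    "$@" < "$mbox" > /dev/null
    end=$(date +%s)
    elapsed=$((end - start))
    [ "$elapsed" -gt 0 ] || elapsed=1   # avoid dividing by zero on very fast runs
    awk -v n="$n" -v t="$elapsed" \
        'BEGIN { printf "%d messages, %d s, %.2f msgs/sec\n", n, t, n / t }'
}

# Self-contained demonstration: a two-message mbox piped through cat.
demo=$(mktemp)
printf '%s\n' \
    'From a@example.org Sat May  6 13:21:13 2006' 'Subject: one' '' 'body' '' \
    'From b@example.org Sat May  6 13:21:14 2006' 'Subject: two' '' 'body' \
    > "$demo"
report=$(rate_msgs "$demo" cat)
echo "$report"
rm -f "$demo"
```

With dspam installed, the same helper could wrap the commands used in this report, for example `rate_msgs /home/whitis/mail/prism formail -s dspam --user whitis --class=innocent --source=corpus`.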
  With dspam_corpus running in the background:

    cat /tmp/spam1 | time dspam --user whitis --class=spam --source=corpus

  takes 4.26 seconds. But:

    cat /home/whitis/mail/prism | time formail -s dspam --user whitis --class=innocent --source=corpus

  was killed after an hour (it made dspam_corpus run slower, too).

    cat /tmp/spam1 | time dspam --user whitis --class=innocent --source=corpus

  took 7.5 seconds, and it took 7.9 seconds to retrain that message as spam. Classifying that same message took 0.84 seconds.

  A script that runs dspam_corpus on most of my mailboxes (spam and innocent, probably about 2 GB) has taken over 24 hours. Corpus training was very slow (and mail filtering was slow, too) on my old box, but that box was about a tenth as fast and was using the Berkeley DB library (which also crashed).

Note that dspam was somewhat troublesome when I used it before:
- slow training
- repeated database (Berkeley DB) corruption caused mail bounces

I am currently looking at other spam filters. SpamAssassin is able to train 5-100 msgs per second. I am not sure why the spread is so wide, but there appears to be overhead on small mailboxes, and the rate also appears to depend on content.

-- System Information:
Debian Release: 3.1
  APT prefers unstable
    (odd, since /etc/apt/apt.conf says: APT::Default-Release "stable";)
  APT policy: (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.8-2-386
Locale: LANG=en_US, LC_CTYPE=en_US (charmap=ISO-8859-1)

Versions of packages dspam depends on:
ii  adduser    3.63      Add and remove users and groups
ii  libc6      2.3.6-7   GNU C Library: Shared libraries
ii  libdspam7  3.6.4-4   DSPAM is a scalable and statistica
ii  libldap2   2.1.30-8  OpenLDAP libraries
ii  procmail   3.22-11   Versatile e-mail processor

-- no debconf information