Re: spamassassin management by file deletion
> Hmm. Do you have shell access? It's not necessary, but it'll make things easier if you do.

No, I don't have shell access. I can access the file space by FTP, though.

> How big is each of those files?

auto-whitelist = 0.7MB
bayes_journal  = 70kB
bayes_seen     = 0.3MB
bayes_toks     = 5.3MB
user_prefs     = 1.5kB

> You'll probably want to disable the AWL and delete auto-whitelist;

How do I disable AWL?

> You'll probably also want to fiddle with the Bayes directive that controls how large the Bayes data files get; while it works on number of tokens rather than disk size, it can give a rough estimate of disk use. The default bayes_expiry_max_db_size of 150,000 tokens may be too large, but it looks like you can't make it much smaller.

My user_prefs file does not currently have a bayes_expiry_max_db_size option in it. Do I simply add one, setting a smaller value? If so, how do I get a sense of what a good value is?

> Over the longer term, you can delete bayes_journal and bayes_seen; those are not critical to proper operation of the Bayes subsystem. However, if you remove bayes_seen, you'll end up re-learning messages over and over again if you regularly re-learn a folder that you don't empty.

Got it. They don't seem to be a big problem right now.

Is there nothing that I can do with bayes_toks?

Thank you for the above, Kris: it helps a good deal.

Best,
Colin

--
View this message in context: http://www.nabble.com/spamassassin-management-by-file-deletion-tf4431882.html#a12652462
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: spamassassin management by file deletion
newby 23 wrote:

> How do I disable AWL?

Not sure; check the docs for the version of SA you're using. It *has* changed more than once in the last year or so IIRC.

> My user_prefs file does not currently have a bayes_expiry_max_db_size option in it. Do I simply add one, setting a smaller value?

Yep.

> If so, how do I get a sense of what a good value is?

Trial and error. :/ FWIW, I have a system with a global Bayes DB set for 1,500,000 tokens, running ~45M, and my account on my personal system runs ~5.5M with the default 150,000 tokens. It looks like it's pretty much hardcoded to keep at least 100,000 tokens, so you might get down as far as ~3M with that setting.

> There's nothing that I can do with bayes_toks?

To actually trim the file, you may have to discard the Bayes database you've got and train a new one with the new bayes_expiry_max_db_size value. Berkeley DB doesn't shrink the file when a record is deleted; you usually have to copy the live data to a new file and move it over top of the old one.

-kgd
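The size figures quoted in this message can be turned into a rough bytes-per-token estimate. The Python sketch below is only arithmetic on those two data points (1,500,000 tokens at ~45M and 150,000 tokens at ~5.5M); the function names are made up for illustration, 1M is assumed to mean 10**6 bytes, and the real on-disk size depends on the database backend and its history.

```python
# Two observations quoted in the thread: (token limit, approximate file size in MB).
# They come from one poster's systems and are not constants of SpamAssassin.
OBSERVATIONS = [
    (1_500_000, 45.0),
    (150_000, 5.5),
]


def bytes_per_token(tokens, size_mb):
    """Average on-disk bytes per token for one observation (1 MB = 10**6 bytes)."""
    return size_mb * 1_000_000 / tokens


def estimate_size_mb(tokens, per_token_bytes):
    """Projected file size in MB for a given token count and bytes-per-token rate."""
    return tokens * per_token_bytes / 1_000_000


rates = [bytes_per_token(tokens, mb) for tokens, mb in OBSERVATIONS]
low, high = min(rates), max(rates)

# 100,000 tokens is the minimum mentioned in the thread.
minimum_tokens = 100_000
print(f"bytes per token: {low:.1f} to {high:.1f}")
print(
    f"estimated size at {minimum_tokens} tokens: "
    f"{estimate_size_mb(minimum_tokens, low):.1f} to "
    f"{estimate_size_mb(minimum_tokens, high):.1f} MB"
)
```

On these numbers the 100,000-token floor works out to somewhere a little above 3 MB, which is consistent with the "~3M" guess in the message.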
Re: spamassassin management by file deletion
On Thu, Sep 13, 2007 at 10:24:08AM -0400, Kris Deugau wrote:

>> How do I disable AWL?
>
> Not sure; check the docs for the version of SA you're using. It *has* changed more than once in the last year or so IIRC.

It's still the same: use_auto_whitelist 0, though it's recommended to just not load the plugin if possible (which it isn't in this case).

>> My user_prefs file does not currently have a bayes_expiry_max_db_size option in it. Do I simply add one, setting a smaller value?
>
> Yep.

I would recommend against this. 150k is a good running value; making it lower will potentially limit the effectiveness of Bayes. It's also worth noting that the expiry system has a forced minimum value of 100k here.

>> There's nothing that I can do with bayes_toks?
>
> To actually trim the file, you may have to discard the Bayes database you've got and train a new one with the new bayes_expiry_max_db_size value. Berkeley DB doesn't shrink the file when a record is deleted; you usually have to copy the live data to a new file and move it over top of the old one.

FWIW, that's exactly what the bayes expiry system does.

--
Randomly Selected Tagline:
Phenomenal Cosmic Powers, Itty Little Living Space. - Aladdin
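For someone with FTP access only, the setting has to be added by downloading user_prefs, editing it, and uploading it again. The Python sketch below shows one way to do the edit step without leaving duplicate lines behind. The helper name set_directive is hypothetical and not part of SpamAssassin; it assumes the file holds one "name value" directive per line and that lines starting with # are comments.

```python
def set_directive(prefs_text, name, value):
    """Return prefs_text with `name value` set exactly once.

    An existing uncommented line for `name` is replaced in place; any
    further duplicates are dropped; if none exists the line is appended.
    """
    new_line = f"{name} {value}"
    out = []
    found = False
    for line in prefs_text.splitlines():
        parts = line.split()
        if parts and parts[0] == name:
            if not found:
                out.append(new_line)
                found = True
            continue  # skip duplicate settings of the same directive
        out.append(line)
    if not found:
        out.append(new_line)
    return "\n".join(out) + "\n"


# Example: a small prefs file without the AWL setting.
original = "# my prefs\nrequired_score 5\n"
print(set_directive(original, "use_auto_whitelist", 0), end="")
```

The result is plain text that can be saved as user_prefs and uploaded over the old copy.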
Re: spamassassin management by file deletion
Theo Van Dinter-2 wrote:

> On Thu, Sep 13, 2007 at 10:24:08AM -0400, Kris Deugau wrote:
>
>>> How do I disable AWL?
>>
>> Not sure; check the docs for the version of SA you're using. It *has* changed more than once in the last year or so IIRC.
>
> It's still the same: use_auto_whitelist 0, though it's recommended to just not load the plugin if possible (which it isn't in this case).

I've set use_auto_whitelist 0 in user_prefs. Can I now delete my auto-whitelist?

I'll hold off on changing the bayes_expiry_max_db_size. This being the case, is there anything I can do to prune the bayes_toks file?

Colin
Re: spamassassin management by file deletion
newby 23 wrote:

> I use a domain managed by HOSTROUTE, which has installed spamassassin as a mail filter. My filespace is limited to 10MB,

O_o That sounds awfully low, even for cheap-to-free hosting. According to http://www.hostroute.co.uk/hostingplans.html, the smallest plan is 20M; you might want to contact them and see why you apparently only have 10M.

> of which some 7.7MB are currently devoted to spamassassin. Thus, I need to prune this quickly to maintain service. As I do not maintain the system, I cannot manage spamassassin in the usual ways. Instead, I think that I am limited to deleting files and altering the user_prefs file.

Hmm. Do you have shell access? It's not necessary, but it'll make things easier if you do.

> The following files are present in my .spamassassin directory: auto-whitelist, bayes_journal, bayes_seen, bayes_toks, user_prefs

How big is each of those files?

You'll probably want to disable the AWL and delete auto-whitelist; it tends to grow without bound, and while *I've* never had functional trouble from it, quite a few others on this list have reported problems of one kind or another aside from the disk usage. (I wrote a script a long time ago to actually clean out old entries and trim the file size; google for trim_whitelist. Note that you pretty much REQUIRE shell access to use this.)

You'll probably also want to fiddle with the Bayes directive that controls how large the Bayes data files get; while it works on number of tokens rather than disk size, it can give a rough estimate of disk use. The default bayes_expiry_max_db_size of 150,000 tokens may be too large, but it looks like you can't make it much smaller.

Running man Mail::SpamAssassin::Conf from a shell on your webhost should give you details on configuration directives, but I'm pretty sure the same listing is available on the SA site somewhere under the Docs link.

Over the longer term, you can delete bayes_journal and bayes_seen; those are not critical to proper operation of the Bayes subsystem. However, if you remove bayes_seen, you'll end up re-learning messages over and over again if you regularly re-learn a folder that you don't empty.

-kgd
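Without shell access, the size check and the deletions described above can be done over FTP. The following is a minimal Python sketch using the standard ftplib module. The host name, login, and directory in main() are placeholders, the helper names are made up for illustration, and it assumes the server supports the SIZE command. Deletion defaults to a dry run, and bayes_toks is deliberately left out of the delete list.

```python
from ftplib import FTP, all_errors

# Files this message describes as removable (auto-whitelist only after the
# AWL has been disabled). bayes_toks is intentionally not listed.
DELETABLE = ("auto-whitelist", "bayes_journal", "bayes_seen")


def report_sizes(ftp, names):
    """Return {name: size in bytes, or None if the size could not be read}."""
    sizes = {}
    for name in names:
        try:
            sizes[name] = ftp.size(name)
        except all_errors:  # missing file, or SIZE not supported
            sizes[name] = None
    return sizes


def delete_files(ftp, names, dry_run=True):
    """Delete the named files; with dry_run=True nothing is removed.

    Returns the list of names that were (or would be) deleted.
    """
    selected = []
    for name in names:
        if not dry_run:
            ftp.delete(name)
        selected.append(name)
    return selected


def main():
    # Placeholder host and credentials; substitute your own before calling.
    with FTP("ftp.example.com") as ftp:
        ftp.login("username", "password")
        ftp.voidcmd("TYPE I")  # some servers only answer SIZE in binary mode
        ftp.cwd(".spamassassin")
        print(report_sizes(ftp, DELETABLE + ("bayes_toks", "user_prefs")))
        print(delete_files(ftp, DELETABLE, dry_run=True))
```

Calling main() with real credentials prints the sizes and the files that would be removed; change dry_run to False only once the list looks right.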