Re: Large-scale global Bayes tuning?
On Wed, 9 Apr 2008, Kris Deugau wrote:

> autolearn is picking up ~1.5M+ from ~300K messages on a daily basis.

Push your autolearn thresholds out to reduce the overall volume of learned spam and ham?

-- 
John Hardin KA7OHZ  http://www.impsec.org/~jhardin/
[EMAIL PROTECTED]  FALaholic #11174  pgpk -a [EMAIL PROTECTED]
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
People seem to have this obsession with objects and tools as being dangerous in and of themselves, as though a weapon will act of its own accord to cause harm. A weapon is just a force multiplier. It's *humans* that are (or are not) dangerous.
---
4 days until Thomas Jefferson's 265th Birthday
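[Editor's note: "pushing the thresholds out" means widening the gap between the autolearn cutoffs in local.cf. A sketch with illustrative values only -- SpamAssassin's defaults are 0.1 and 12.0; pick values to suit your traffic:]

```
# local.cf -- widen the autolearn window so fewer borderline
# messages are learned (example values, not recommendations)
bayes_auto_learn_threshold_nonspam -0.5   # default 0.1
bayes_auto_learn_threshold_spam    15.0   # default 12.0
```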
Re: Large-scale global Bayes tuning?
From: Kris Deugau [EMAIL PROTECTED]
Organization: ViaNet Internet Solutions
Date: Wed, 09 Apr 2008 12:12:43 -0400
To: users@spamassassin.apache.org
Subject: Large-scale global Bayes tuning?

> Anyone have any suggestions on tuning a large global Bayes db for stability and sanity? I've got my fingers in the pie of a moderately large mail cluster, but I haven't yet found a Bayes configuration that's sane and stable for any extended period. Wiping it completely about once a week seems to provide acceptable filtering performance (we have a number of addon rulesets), but I still see spam in my inbox with BAYES_00 - a sure sign of a mistuned Bayes database.

Bayes on a cluster begs the question: what if you didn't replicate the bayes tables, and left them server-specific? Since (depending on configuration) some of the servers might get 'spam only' (higher MX records), maybe just take one of the 'valid' bayes tables and manually copy it (sa-learn backup, sa-learn clear, restore) every week or so. That's the only way I could get a cluster of 9 to work right.

-- 
Michael Scheidell, CTO | SECNAP Network Security
Winner 2008 Network Products Guide Hot Companies
FreeBSD SpamAssassin Ports maintainer
Charter member, ICSA labs anti-spam consortium

_ This email has been scanned and certified safe by SpammerTrap(tm). For Information please see http://www.spammertrap.com _
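[Editor's note: the copy procedure Scheidell describes maps onto sa-learn's backup/restore flags. A sketch, assuming a location both servers can reach -- the path is illustrative:]

```
# On the server whose Bayes data is known-good:
sa-learn --backup > /shared/bayes-backup.txt

# On each server to be reseeded:
sa-learn --clear
sa-learn --restore /shared/bayes-backup.txt
```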
Re: Large-scale global Bayes tuning?
Michael Scheidell wrote:

> Bayes on cluster begs the question: what if you didn't replicate the bayes tables, and left them server specific?

It may yet come to that. :( (If only for overall cluster reliability - any one of the current three machines could handle the current load without any trouble, but we're likely going to stuff ClamAV on them as well.) Unfortunately that means doing mistake-training on *each* machine - autolearn on its own just doesn't cut it. I'm dogfooding pretty much that exact scenario on one machine; it's got its own local Bayes DB that I'm hand-training with my own mail.

> Since (depending on configurations) some of the servers might get 'spam only' (higher mx records), maybe just take one of the 'valid' bayes tables and manually copy it (sa-learn backup, sa-learn clear, restore) every week or so.

Mmmh. Access is for both inbound and outbound mail, through a load-balancer; the type of mail seen on any one system is pretty much identical over time.
Re: Large-scale global Bayes tuning?
John Hardin wrote:

> On Wed, 9 Apr 2008, Kris Deugau wrote:
>> autolearn is picking up ~1.5M+ from ~300K messages on a daily basis.
>
> Push your autolearn thresholds out to reduce the overall volume of learned spam and ham?

I've thought about that. It makes it more difficult to get Bayes data on the critical messages in that middle range though. :(

-kgd
Large-scale global Bayes tuning?
Anyone have any suggestions on tuning a large global Bayes db for stability and sanity? I've got my fingers in the pie of a moderately large mail cluster, but I haven't yet found a Bayes configuration that's sane and stable for any extended period. Wiping it completely about once a week seems to provide acceptable filtering performance (we have a number of addon rulesets), but I still see spam in my inbox with BAYES_00 - a sure sign of a mistuned Bayes database.

Past experience with (much) smaller systems has shown stable behaviour with bayes_expiry_max_db_size set to 150 (~40M BDB Bayes); daily expiry runs delete ~25-35K tokens; mail volume ~3K/day. However, the larger system (MySQL, currently set with max_db_size at 300, on-disk files running ~100M) only seems to be expiring that same 25-35K tokens, even though autolearn is picking up ~1.5M+ from ~300K messages on a daily basis. Reading through the docs on token expiry, I would guess it should be far more aggressive than it is. (Among other things, I really don't want to bump up max_db_size by two orders of magnitude; up to ~5M should be fine, and I could see as high as 7.5M if really necessary.)

I'm not even really sure what questions to ask to get more detail; sa-learn -D doesn't really spit out *enough* detail about the expiry process to know for sure if something is going wrong there.

-kgd
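[Editor's note: for reference, the knobs involved look like this in local.cf. The size value is a token count, not bytes, and the numbers below are illustrative -- SpamAssassin's default for bayes_expiry_max_db_size is 150000:]

```
# local.cf -- Bayes expiry tuning (example values)
bayes_auto_expire        1          # let SA run expiry opportunistically
bayes_expiry_max_db_size 5000000    # target token count, not bytes
```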
Re: Large-scale global Bayes tuning?
On Wed, 9 Apr 2008, Kris Deugau wrote:

> John Hardin wrote:
>> On Wed, 9 Apr 2008, Kris Deugau wrote:
>>> autolearn is picking up ~1.5M+ from ~300K messages on a daily basis.
>>
>> Push your autolearn thresholds out to reduce the overall volume of learned spam and ham?
>
> I've thought about that. It makes it more difficult to get Bayes data on the critical messages in that middle range though. :(

How varied is the character of your message traffic? Is manual learning an option, especially with larger autolearn thresholds? Then at least you'd be able to reseed your bayes with a known-good corpus.

-- 
John Hardin KA7OHZ
Re: Large-scale global Bayes tuning?
From: Kris Deugau [EMAIL PROTECTED]
Organization: ViaNet Internet Solutions
Reply-To: users@spamassassin.apache.org
Date: Wed, 09 Apr 2008 12:36:56 -0400
To: users@spamassassin.apache.org
Subject: Re: Large-scale global Bayes tuning?

> Michael Scheidell wrote:
>> Bayes on cluster begs the question: what if you didn't replicate the bayes tables, and left them server specific?
>
> It may yet take that. :( (If only for overall cluster reliability - any one of the current three machines could handle the current load without any trouble, but we're likely going to stuff ClamAV on them as well.) Unfortunately that means doing mistake-training on *each* machine - autolearn on its own just doesn't cut it. I'm dogfooding pretty much that exact scenario on one machine; it's got its own local Bayes DB that I'm hand-training with my own mail.

You could also take mysql off of one or several, have them load-balance to the other mysql servers, and run a caching (global) DNS server and clamav on one of them. What about DCC? I assume with those volumes you are running a local DCC server, and having the other boxes talk to it?

>> Since (depending on configurations) some of the servers might get 'spam only' (higher mx records), maybe just take one of the 'valid' bayes tables and manually copy it (sa-learn backup, sa-learn clear, restore) every week or so.
>
> Mmmh. Access is for both inbound and outbound mail, through a

Keep a couple for outbound only; you won't need bayes too much on those. We have an engineering spec for a 9x9 (9 nodes in a cluster, 9 clusters in a group) to support up to 2MM users, and we do a lot of task and load splitting like that.

-- 
Michael Scheidell, CTO | SECNAP Network Security
Re: Large-scale global Bayes tuning?
John Hardin wrote:

> How varied is the character of your message traffic? Is manual learning an option, especially with larger autolearn thresholds?

What is this... manual learning... you speak of? <g>

Not really an option in the short term, although in the long term I'd *like* to have a system similar to what I've mostly trained users to do on the much smaller systems - forward misclassified mail to a suitable role account as an attachment for manual processing (whitelist, blacklist, feed to Bayes, write/adjust rules, etc). Of course, that requires someone to *do* the manual processing. :( I've been taking my own FNs and feeding them back in; that's really the only misclassified mail I have easy access to. No FPs noticed so far.

> Then at least you'd be able to reseed your bayes with a known-good corpus.

*nod* I've thought about exporting the database from the smaller system and pulling it in to the cluster to see how the accuracy is. "Tokens don't get expired, according to my understanding of the expiry algorithm" about sums up the immediate problem; overall filter accuracy is pretty good on the whole.

-kgd
Re: Large-scale global Bayes tuning?
Hi Kris,

At 09:12 09-04-2008, Kris Deugau wrote:

> Anyone have any suggestions on tuning a large global Bayes db for stability and sanity? I've got my fingers in the pie of a moderately large mail cluster, but I haven't yet found a Bayes configuration that's sane and stable for any extended period. Wiping it completely about once a week seems to provide acceptable filtering performance (we have a number of addon rulesets), but I still see spam in my inbox with BAYES_00 - a sure sign of a mistuned Bayes database.

Spam hitting BAYES_00 points to the bayes database being polluted. That can happen if the autolearn levels are not low enough. Some manual learning can help to keep the Bayes database in tune. A more aggressive expiry won't necessarily prevent mistuning. You'll have to do some MySQL tuning for performance.

In a large setup, manual learning isn't always possible. You can have some rules to identify some good and bad messages which are representative of the userbase.

Regards,
-sm
Global Bayes
Just upgraded to 3.2.4. I am running spamassassin as a normal user, not root. I keep seeing this in the log files:

bayes: cannot open bayes databases /var/sabayes/.spamassassin/bayes_* R/W: lock failed: File exists

There are about 20 lock files in the directory. Is spamassassin not cleaning up the lock files properly, or is it even working? Looks like there have been changes here since version 3.2.0.
Re: Global Bayes
On Mar 24, 2008, at 11:08 AM, Mike Fahey wrote:

> Just upgraded to 3.2.4. I am running spamassassin as a normal user, not root. I keep seeing this in the log files:
>
> bayes: cannot open bayes databases /var/sabayes/.spamassassin/bayes_* R/W: lock failed: File exists
>
> There are about 20 lock files in the directory. Is spamassassin not cleaning up the lock files properly or is it even working? Looks like there have been changes here since version 3.2.0

I don't know of any specific changes between versions, but... whenever I noticed this happen it was almost always due to disk space, permissions, or the fact you have autoexpire turned on. Double-check the permissions on your folders and make sure the user you run SpamAssassin under has the right privs required. Stop SA, clean up the files, and try restarting.

A good idea (if you're running global bayes) is to turn off auto-expire and run an sa-learn force expire at a normal interval. We've been running this way for years and it seems to perform just fine under 3.2.4.

-- 
Robert Blayzor INOC
[EMAIL PROTECTED]
http://www.inoc.net/~rblayzor/

Mac OS X. Because making Unix user-friendly is easier than debugging Windows.
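[Editor's note: Blayzor's approach translates to one line in local.cf plus a cron job. A sketch with illustrative timing:]

```
# local.cf: disable opportunistic expiry during scans
bayes_auto_expire 0

# crontab entry: force an expiry run nightly at 03:00 instead
0 3 * * * sa-learn --force-expire 2>&1 | logger -t sa-expire
```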
Re: Global Bayes
Mike Fahey wrote:

> Just upgraded to 3.2.4. I am running spamassassin as a normal user, not root. I keep seeing this in the log files:
>
> bayes: cannot open bayes databases /var/sabayes/.spamassassin/bayes_* R/W: lock failed: File exists
>
> There are about 20 lock files in the directory. Is spamassassin not cleaning up the lock files properly or is it even working? Looks like there have been changes here since version 3.2.0

In your global config (local.cf), try:

lock_method flock

and delete all those pesky lock files for good. You might see a speed improvement also; it definitely works much better under heavy loads.

JCH
Global Bayes and AWL
Hi,

I have read this thread: http://www.nabble.com/forum/ViewPost.jtp?post=819176framed=y

This is also what I am searching to do: make SpamAssassin score against both an AWL/Bayes by the user and an AWL/Bayes by the system. What I was thinking of was to make a new set of rules for SA that checks against the AWL and Bayes again, but this time as a specific user, like default. I copied /usr/share/spamassassin/60_awl.cf and 23_bayes.cf to /etc/mail/spamassassin and renamed all BAYES_* and AWL to GLOBAL_BAYES_* and GLOBAL_AWL. Then I added user_awl_sql_override_username and user_bayes_sql_override_username to the new rules.

This however made CGPSA, which I use against CommuniGate Pro, run AWL saves against the MySQL table as default too. It also wrote output like "Merging duplicate GLOBAL_AWL and AWL".

Is this not possible at all? Has someone made this work?
RE: Global Bayes and AWL
-----Original Message-----
From: Magnus Anderson [mailto:[EMAIL PROTECTED]
Sent: Saturday, October 13, 2007 5:40 PM

> Hi, I have read this thread, http://www.nabble.com/forum/ViewPost.jtp?post=819176framed=y This is also what I am searching for to do. Make SpamAssassin score against both a AWL/Bayes by the user and a AWL/Bayes by the system. What I was thinking on was to make a new set of rules for SA that checks against the AWL and Bayes again, but this time as a specific user, like default. I copied the /usr/share/spamassassin/60_awl.cf and 23_bayes.cf to /etc/mail/spamassassin and renamed all BAYES_* and AWL to GLOBAL_BAYES_* and GLOBAL_AWL. Then I added user_awl_sql_override_username and user_bayes_sql_override_username to the new rules. This however made CGPSA, that I use against CommuniGate Pro, run AWL saves against the MySQL table as default too. It also wrote output like Merging duplicate GLOBAL_AWL and AWL. Is this not possible at all, has someone made this work?

It is not impossible, but it would bear its own speed cost. Some time ago I wished to have a three-level layered Bayes: site level, organization level and mailbox level. The idea was to reshape the bayes DB store code and, probably, the scoring code, such that a) during mail scanning, a token unknown to the user would get scored thanks to the organizational or site one (if any); b) new tokens learned (or auto-learned) by Bayes would contribute to all three levels. From a store standpoint, this means that tokens shouldn't have any ham/spam count anymore; instead there should be a table listing tokens belonging to a given mail and a table listing mails received by each user. In this latter table, there should be a ham/spam flag.
When an incoming mail is scanned and tokens are extracted, for each token the code should count how many times the user (auto-)reported that token as being ham or spam or, if there are no occurrences of that token in the user layer, how many times that token had been reported as ham or spam at the organizational level (that is: by all users in a domain/organization). Then, if there is again no occurrence of the token, how many times that token had been tagged as spammy at the site level (that is: by every user in every organization), if any. This reasoning could even be changed somehow in order to statistically prioritize user preferences over organizational ones over site ones, which would be much preferable to the previous idea, since simply spreading the mail corpus across three levels would easily result in an unreliably small user (and even organizational) virtual corpus. However, this would mean tuning the well-known Bayes classification equations to this need, which should be done carefully and not released before a review from some Bayes'-theory-savvy person.

A further benefit stemming from a multi-layer approach would be easy and reliable expiration of bayes entries, by simply deleting mails that arrived before the expiry period, then deleting tokens no longer referred to by any e-mail. This is something most serious SQL servers could even do automatically, after deleting any token whose last-seen time is before a given threshold. Also, AWL currently owns its own table to do its work. This design could instead use two further fields in the mails table, with the source mail address and IP address in them, and a further field in the usermails table with the computed SA score in it. AWL could use this data in order to do its dirty job, thereby obtaining data expiration for free. Of course, since there would be so much impact on the Bayes code, I would surely prefer this design to be in the mainstream SA code, in order to avoid reinventing the wheel each time I had to update SA.
The problem is that this design would be much more complex than the current one, and the question is: would it be viable for everybody but the tiniest ISPs using SA? It would probably be good for me, with some hundreds of e-mails received per day. But what if one has to scan millions of mails/day? Sure, one can use smart SQL servers with statistical query optimizers and the like, but even this way, computing the bayes score of an incoming mail would probably take a couple of seconds on average, as opposed to the current few tenths of a second... So, flexibility often comes at the expense of speed, and I guess many on this list would not appreciate that.

Giampaolo
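[Editor's note: the user-then-organization-then-site fallback described in this thread can be sketched in a few lines. The dict-of-dicts store below is a toy stand-in for the proposed SQL tables; all names and data are illustrative assumptions, not SpamAssassin's actual Bayes schema.]

```python
# Toy sketch of a three-layer token lookup: each layer maps
# token -> (n_spam, n_ham), and the first layer that has ever
# seen the token answers for it.

def token_counts(token, layers):
    """Return (n_spam, n_ham) for `token`, trying each layer in order."""
    for layer in layers:
        if token in layer:
            return layer[token]
    return (0, 0)  # token unknown at every level

# Example data: a pharma-industry user legitimately receives drug names.
user = {"viagra": (0, 12)}
org  = {"invoice": (3, 40)}
site = {"viagra": (900, 5), "lottery": (130, 1)}

# The user layer overrides the spammy site-wide statistics:
print(token_counts("viagra", [user, org, site]))   # (0, 12)
# Unknown to the user, so the org layer answers:
print(token_counts("invoice", [user, org, site]))  # (3, 40)
# Only the site layer has seen this one:
print(token_counts("lottery", [user, org, site]))  # (130, 1)
```

Prioritizing layers statistically (rather than first-hit-wins) would instead blend the three counts, which is where the careful retuning of the Bayes equations mentioned above comes in.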
Re: switching from global bayes to per-user bayes
No comments whatsoever?
Re: switching from global bayes to per-user bayes
> I am looking into switching from a global bayes/awl/setting environment to a per-user environment with MySQL as a backend. *puts on asbestos suit* Would anyone care to offer an opinion as to whether and/or to what degree this might make a difference in overall effectiveness? Anyone back up that opinion with cold hard facts? Will I be able to migrate small sets of users from global to per-user or will I have to make the jump for all my end-users/domains at once?

I have a suggestion to spare for the settings environment, which works pretty well for me and would avoid the per-user/global question. My amavis settings are in a postgres db in which each organization (more or less = domain) has a schema. Each organization has a table with user-defined settings. My public schema (i.e. the default one) has a table too, with organizational settings. Also, the public schema has a view which is the means by which I get the amavis settings for a given user. It attempts fetching the per-user settings table in the organizational schema and, if they are missing, it attempts fetching the per-organization settings. If these too are missing, it uses some defaults which may be thought of as local (i.e. server-wide) settings. I did find this three-layer way of handling settings pretty useful: a user wants to get .exe attachments without having them wrapped into a warning message? Put a record in the per-user table. An org wants to have threats reported to a specific user? Tell it to the per-organization table.

However, I don't have bayes data in the db (it's in the global bdb). This is because most of my customers use pop3 to download messages, so I have essentially no way to train bayes efficiently. Do I? Also, awl settings are in a database only to ease their adjustment (and, in future, replication), but are global as well: I'm serving small communities in my town, so often a source ip/e-mail's scores may reasonably be used for all my potential recipients.
Cheers,
Giampaolo

> I'd like to preload the bayes db for each user so that it's 'primed' and ready to go. Obviously, it would be preferable to preload with their specific mail, but is it possible to feed bayes for each user with a generic set of spam/ham?
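[Editor's note: the user -> organization -> server-default settings fallback Giampaolo describes can be expressed with COALESCE over scalar subqueries. A minimal SQLite stand-in for his Postgres view; the table and column names, and the example tag levels, are illustrative assumptions, not his actual schema.]

```python
# Sketch of a layered-settings lookup: first non-NULL wins --
# per-user, then per-organization, then the server-wide default.
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE user_settings    (username TEXT PRIMARY KEY, spam_tag_level REAL);
CREATE TABLE org_settings     (org      TEXT PRIMARY KEY, spam_tag_level REAL);
CREATE TABLE default_settings (spam_tag_level REAL);
INSERT INTO default_settings VALUES (5.0);
INSERT INTO org_settings     VALUES ('example.org', 6.5);
INSERT INTO user_settings    VALUES ('alice@example.org', 8.0);
""")

def effective_tag_level(username, org):
    # COALESCE picks the first non-NULL layer, mirroring the view.
    row = db.execute("""
        SELECT COALESCE(
            (SELECT spam_tag_level FROM user_settings    WHERE username = ?),
            (SELECT spam_tag_level FROM org_settings     WHERE org = ?),
            (SELECT spam_tag_level FROM default_settings))
    """, (username, org)).fetchone()
    return row[0]

print(effective_tag_level('alice@example.org', 'example.org'))  # 8.0  (user override)
print(effective_tag_level('bob@example.org',   'example.org'))  # 6.5  (org setting)
print(effective_tag_level('bob@other.net',     'other.net'))    # 5.0  (server default)
```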
switching from global bayes to per-user bayes
I am looking into switching from a global bayes/awl/setting environment to a per-user environment with MySQL as a backend. *puts on asbestos suit* Would anyone care to offer an opinion as to whether and/or to what degree this might make a difference in overall effectiveness? Anyone back up that opinion with cold hard facts? Will I be able to migrate small sets of users from global to per-user or will I have to make the jump for all my end-users/domains at once?

I'd like to preload the bayes db for each user so that it's 'primed' and ready to go. Obviously, it would be preferable to preload with their specific mail, but is it possible to feed bayes for each user with a generic set of spam/ham?
Bayes and SQL and Vpopmail and /user + global bayes
Hello,

SA 3.1.4, started as: exec /usr/bin/spamd -v -m 32 -D -q -u vpopmail -s stderr 2>&1

I am using a vpopmail installation, and use /user prefs; per-user bayes and other user conf is stored in SQL.

Problem: If a mail comes in and no real vpopmail user is present (smtproutes), then SA picks a random real vpopmail user and works with that bayes db. I can't configure a global fallback, or stuff like that. If I try to set a @GLOBAL bayes_path to SQL, then SA says this is an administrator config param and it is not allowed there.

Any solution?

And a not-so-important question: Can I use both site-wide and /user bayes for one incoming mail?

Peter
combine user and global Bayes with SQL?
I guess the subject line says it all. I'm running SA 3.1.1 with Bayes stored in MySQL. Is it possible to learn messages as a global user and have the tokens apply when evaluating individual users' email? (Never mind if it would be truly effective; this is more of a theoretical question.)
Re: per-user or global bayes (was: HUGE bayes DB (non-sitewide) advice?)
bump

--- Michael Monnerie [EMAIL PROTECTED] wrote:

> My users are quite happy with overall markup of the spam. We occasionally get a HAM marked as SPAM. We have an odd client base though.
>
> The question is: when to use global and when per-user bayes? On our server, we have people of different languages, communicating with different countries all over the world, in different areas (advertising, production, IT, etc.). I thought in that case a per-user bayes would be much better, as viagra is something good for the one, but bad for the other. What's the general recommendation for bayes?
Re: per-user or global bayes (was: HUGE bayes DB (non-sitewide) advice?)
On Wednesday, 9 November 2005 08:04, Gary W. Smith wrote:

> My users are quite happy with overall markup of the spam. We occasionally get a HAM marked as SPAM. We have an odd client base though.

The question is: when to use global and when per-user bayes? On our server, we have people of different languages, communicating with different countries all over the world, in different areas (advertising, production, IT, etc.). I thought in that case a per-user bayes would be much better, as viagra is something good for the one, but bad for the other. What's the general recommendation for bayes?

mfg zmi
-- 
// Michael Monnerie, Ing.BSc --- it-management Michael Monnerie
// http://zmi.at  Tel: 0660/4156531  Linux 2.6.11
// PGP Key: lynx -source http://zmi.at/zmi2.asc | gpg --import
// Fingerprint: EB93 ED8A 1DCD BB6C F952 F7F4 3911 B933 7054 5879
// Keyserver: www.keyserver.net  Key-ID: 0x70545879
Re: [sa-list] Re: global bayes database?
On Fri, 9 Sep 2005, Michael Parker wrote:

> Oh, you want an entirely different Bayes storage module, one that doesn't exist. You're more than welcome to create your own; perldoc Mail::SpamAssassin::BayesStore to get a sense of the API that you must implement. I'll leave the issues surrounding combining of bayes databases as an exercise to the reader (suggest you search the archives for previous msgs on this topic).

Interesting. The API looks fairly straightforward. Support for full-blown multi-bayes isn't something I could see myself implementing right now, based purely on time constraints, but I don't think it's particularly hard. Still, the tweak to add a call that does nspam_nham_get and, if it's less than the required number for effective bayes, uses the system bayes DBs for scanning (but not learning, if autolearn is deemed appropriate) should be easy enough.

I've searched the users list -- my issues with token collision are nil -- I'm sure everything I've got is in the new format, since any attempts I made to try and get my original users' stuff in crashed my systems.

I'm going to note this more for anyone else who searches this list than for myself -- scoring on multiple bayes counts could have disastrous circumstances. Since it *has* to be read-only... (since everyone gets the same spam -- see http://article.gmane.org/gmane.mail.spam.spamassassin.general/60376 ) ...any admin must realize that for all they know, their user could work for Pfizer or SmithKline -- and you could be tagging all their legit work mail as bad. Normal bayes prevents this (or forces the user to accept that since they don't consider the names of drugs a bad thing, they have to deal with the spam). To pull an old phrase... one man's junk is another's treasure. Still, this could be as simple as calling the bayes algorithm twice, once as $user, once as $system -- and maintaining a different (probably slightly lower) set of scores for $system.
Granted, maybe the multi-bayes option should even be off by default for users with a good corpus (define good... 200 messages? a thousand?). But I know that since I get more email, use pine and a shell, and religiously shuffle all my spam to spamassassin -r, I'm more likely to have a complete corpus than those users who use outlook and have to rely on the automatic learning features.

Okay, I've babbled enough.

-Dan

-- 
Dan Mahoney
Techie, Sysadmin, WebGeek
Gushi on efnet/undernet IRC
ICQ: 13735144  AIM: LarpGM
Site: http://www.gushi.org
Re: global bayes database?
At 05:04 AM 9/8/2005, Dan Mahoney, System Admin wrote:

> As my bayes database and my training of spamassassin is much better than that of most of my users... Is there any way to augment user bayes DB with a call to mine?

One way is, in your local.cf, to just set everyone to use the same bayes database and make it world accessible:

bayes_path {somepath}/bayes
bayes_file_mode 0777

Notes: bayes_path needs to end in /bayes, as it's really a path plus half a filename. SA will append _toks and _seen to this to create bayes_toks and bayes_seen. bayes_file_mode needs to be 0777, not 0666; it's sometimes used in directory creation and works more like a umask than a mode.
Re: [sa-list] Re: global bayes database?
On Thu, 8 Sep 2005, Matt Kettler wrote:

> At 05:04 AM 9/8/2005, Dan Mahoney, System Admin wrote:
>> As my bayes database and my training of spamassassin is much better than that of most of my users... Is there any way to augment user bayes DB with a call to mine?
>
> one way is in your local.cf just set everyone to use the same bayes database and make it world accessible:

I'm using SQL. I'm sorry, I should have mentioned that. The SQL docs don't say anything about it.

-Dan
Re: [sa-list] Re: global bayes database?
Dan Mahoney, System Admin wrote:

> On Thu, 8 Sep 2005, Matt Kettler wrote:
>> one way is in your local.cf just set everyone to use the same bayes database and make it world accessible:
>
> I'm using SQL. I'm sorry, I should have mentioned that. The SQL docs don't say anything about it.

You mean this portion of the SQL docs?

  In addition to the global configuration directives there is a user preference:

    bayes_sql_override_username someusername

  This directive, if used, will override the username used for storing data in the database. This could be used to group users together to share bayesian filter data. You can also use this config option to trick sa-learn to learn data as a specific user.

Michael
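[Editor's note: as a concrete illustration of the directive quoted above, sharing one Bayes dataset between several SQL users could look like this in each user's preferences -- the shared name is an illustrative assumption:]

```
# user_prefs (or the userpref SQL table) for each user in the group:
bayes_sql_override_username shared-bayes
```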
Re: global bayes database?
Dan Mahoney, System Admin wrote:

> Right, but this isn't exactly what I was looking for. Basically, I'm looking for a system whereby if a user's bayes corpus isn't primed properly, it can refer to others, as sort of an if-then system, rather than manually overriding it. As SQL becomes the recommended standard, I'm hoping this feature becomes more popular, as I would *love* to see SA do scoring and training based on BOTH system-bayes files as well as user-bayes. Maybe I should just ask for a pony :)

Oh, you want an entirely different Bayes storage module, one that doesn't exist. You're more than welcome to create your own; perldoc Mail::SpamAssassin::BayesStore to get a sense of the API that you must implement. I'll leave the issues surrounding combining of bayes databases as an exercise to the reader (suggest you search the archives for previous msgs on this topic).

Michael
spamassassin global bayes database
What do I have to do to get spamassassin to use a global bayes database for all users on the system, rather than per user?
Re: spamassassin global bayes database
Matt wrote:

> What do I have to do to get spamassassin to use a global bayes database for all users on the system, rather than per user?

http://wiki.apache.org/spamassassin/SiteWideBayesSetup

Steven
-- 
Steven Dickenson [EMAIL PROTECTED] http://www.mrchuckles.net
Re: spamassassin global bayes database
Matt wrote:

> What do I have to do to get spamassassin to use a global bayes database for all users on the system, rather than per user?

Read the wiki :) http://wiki.apache.org/spamassassin/SiteWideBayesSetup
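[Editor's note: for the impatient, the core of the wiki recipe boils down to two local.cf lines plus consistent permissions. The path here is illustrative; see the wiki page for the full procedure:]

```
# local.cf -- point every user at one shared database
bayes_path      /var/spamassassin/bayes/bayes   # SA appends _toks / _seen
bayes_file_mode 0777                            # works like a umask, not a mode
```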