Re: bayes autolearn off but journal updated
On Thu, Jan 22, 2009 at 02:48, Matt Kettler mkettler...@verizon.net wrote:
> Matus UHLAR - fantomas wrote:
>> On 20.01.09 19:45, Matt Kettler wrote:
>>> Yes, more specifically, it's mostly going to be updating the atime, or time of last access, records for tokens. This time is used by the expiry process to drop the least recently used tokens.
>>
>> What does SA do if it can't r/w open the bayes database? Will it skip BAYES checks or just tie it r/o? (I notice occasional missing BAYES in X-Spam headers)
>
> Well, first let's be clear: it's R/W opening the journal, not the database itself. The main _toks and _seen files are only locked R/W if one of the following is going on:
>
> - learning without bayes_learn_to_journal set
> - a journal sync
> - token expiry is running
>
> As for write locks to the journal, if for some reason there's a conflict, the update is just dropped with a warning. This isn't very likely unless your bayes is really busy, as journal updates are pretty short in nature.

On POSIX filesystems, this should be virtually impossible, since the file is opened for append with atomic writes.

--j.

> If you look at /lib/Mail/SpamAssassin/BayesStore/DBM.pm and find sub cleanup in it. Snippets of that code:
>
>     my $path = $self->_get_journal_filename();
>     ...
>     if (!open (OUT, ">>".$path)) {
>       warn "bayes: cannot write to $path, bayes db update ignored: $!\n";
>       umask $umask; # reset umask
>       return;
>     }
Re: bayes autolearn off but journal updated
>>>> Yes, more specifically, it's mostly going to be updating the atime, or time of last access, records for tokens. This time is used by the expiry process to drop the least recently used tokens.
>>>
>>> What does SA do if it can't r/w open the bayes database? Will it skip BAYES checks or just tie it r/o? (I notice occasional missing BAYES in X-Spam headers)
>>
>> Well, first let's be clear: it's R/W opening the journal, not the database itself. The main _toks and _seen files are only locked R/W if one of the following is going on:
>>
>> - learning without bayes_learn_to_journal set
>> - a journal sync
>> - token expiry is running
>>
>> As for write locks to the journal, if for some reason there's a conflict, the update is just dropped with a warning. This isn't very likely unless your bayes is really busy, as journal updates are pretty short in nature.
>
> on POSIX filesystems, this should be virtually impossible, since the file is opened for append with atomic writes.

It is quite common on Solaris with 40+ working spamds and really high traffic volume. Some time ago we had such a situation. The server was 50% idle while the spamds were struggling to lock the journal (auto_learn and auto_expire disabled) rather than moving on to the next message. I.e. the machine was 50% idle but unable to handle more messages, and the bottleneck was in journal updates.

--
Paweł Sasin
Re: bayes autolearn off but journal updated
>>>> On 20.01.09 19:45, Matt Kettler wrote:
>>>> Yes, more specifically, it's mostly going to be updating the atime, or time of last access, records for tokens. This time is used by the expiry process to drop the least recently used tokens.

>>> Matus UHLAR - fantomas wrote:
>>> What does SA do if it can't r/w open the bayes database? Will it skip BAYES checks or just tie it r/o? (I notice occasional missing BAYES in X-Spam headers)

>> On Thu, Jan 22, 2009 at 02:48, Matt Kettler mkettler...@verizon.net wrote:
>> Well, first let's be clear: it's R/W opening the journal, not the database itself.

well, sorry, OK.

>> As for write locks to the journal, if for some reason there's a conflict, the update is just dropped with a warning. This isn't very likely unless your bayes is really busy, as journal updates are pretty short in nature.

Yes, this is what I wanted to know...

> On 22.01.09 09:47, Justin Mason wrote:
> on POSIX filesystems, this should be virtually impossible, since the file is opened for append with atomic writes.

We have mailboxes on NFS, accessed from multiple machines; I guess that may be the reason.

--
Matus UHLAR - fantomas, uh...@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
I feel like I'm diagonally parked in a parallel universe.
Re: bayes autolearn off but journal updated
On Thu, Jan 22, 2009 at 10:05, Paweł Sasin hanni...@wp-sa.pl wrote:
>>>>> Yes, more specifically, it's mostly going to be updating the atime, or time of last access, records for tokens. This time is used by the expiry process to drop the least recently used tokens.
>>>>
>>>> What does SA do if it can't r/w open the bayes database? Will it skip BAYES checks or just tie it r/o? (I notice occasional missing BAYES in X-Spam headers)
>>>
>>> Well, first let's be clear: it's R/W opening the journal, not the database itself. The main _toks and _seen files are only locked R/W if one of the following is going on:
>>>
>>> - learning without bayes_learn_to_journal set
>>> - a journal sync
>>> - token expiry is running
>>>
>>> As for write locks to the journal, if for some reason there's a conflict, the update is just dropped with a warning. This isn't very likely unless your bayes is really busy, as journal updates are pretty short in nature.
>>
>> on POSIX filesystems, this should be virtually impossible, since the file is opened for append with atomic writes.
>
> It is quite common on Solaris with 40+ working spamds and really high traffic volume. Some time ago we had such situation. The server had 50% idle while the spamds were striving to lock the journal (auto_learn and auto_expire disabled) rather than going on to handle a next message. Ie the machine was 50% idle but was unable to handle more messages and the bottleneck was in journal updates.

You definitely mean the journal, right? Not the bayes dbs? Interesting to hear this, I haven't encountered it before...

--j.
Re: bayes autolearn off but journal updated
>>> On Tue, Jan 20, 2009 at 04:49:12PM +0100, Matus UHLAR - fantomas wrote:
>>> Why does it update the journal? Why does it try to open journal in R/W mode?

>> Theo Van Dinter wrote:
>> $ man sa-learn

Oh, sorry for missing that in docs :(

>> In other words, the journal isn't just for learning.

> On 20.01.09 19:45, Matt Kettler wrote:
> Yes, more specifically, it's mostly going to be updating the atime, or time of last access, records for tokens. This time is used by the expiry process to drop the least recently used tokens.

What does SA do if it can't r/w open the bayes database? Will it skip BAYES checks or just tie it r/o? (I notice occasional missing BAYES in X-Spam headers)

--
Matus UHLAR - fantomas
Re: bayes autolearn off but journal updated
Matus UHLAR - fantomas wrote:
>> On 20.01.09 19:45, Matt Kettler wrote:
>> Yes, more specifically, it's mostly going to be updating the atime, or time of last access, records for tokens. This time is used by the expiry process to drop the least recently used tokens.
>
> What does SA do if it can't r/w open the bayes database? Will it skip BAYES checks or just tie it r/o? (I notice occasional missing BAYES in X-Spam headers)

Well, first let's be clear: it's R/W opening the journal, not the database itself. The main _toks and _seen files are only locked R/W if one of the following is going on:

- learning without bayes_learn_to_journal set
- a journal sync
- token expiry is running

As for write locks to the journal, if for some reason there's a conflict, the update is just dropped with a warning. This isn't very likely unless your bayes is really busy, as journal updates are pretty short in nature.

If you look at /lib/Mail/SpamAssassin/BayesStore/DBM.pm and find sub cleanup in it. Snippets of that code:

    my $path = $self->_get_journal_filename();
    ...
    if (!open (OUT, ">>".$path)) {
      warn "bayes: cannot write to $path, bayes db update ignored: $!\n";
      umask $umask; # reset umask
      return;
    }
bayes autolearn off but journal updated
Hello,

on my systems I turned the bayes filter off by default:

    cd /etc/mail/spamassassin/
    grep bayes *
    local.cf:use_bayes 0
    local.cf:bayes_auto_learn 0
    local.cf:bayes_auto_expire 0
    local.cf:bayes_learn_to_journal 1

...I keep journalling on by default so that any user who turns bayes on would use the journal even for manual learning.

One of the users has BAYES turned on, without changing the value of auto_learn or anything else:

    # we will fill the bayes database...
    use_bayes 1
    bayes_auto_learn 0
    bayes_auto_expire 0

However, this user's bayes_journal keeps being changed, even without manual intervention. I also occasionally get this error in the logs:

    Jan 20 16:33:22 t02 spamd[5073]: bayes: cannot open bayes databases /.../.spamassassin/bayes_* R/W: lock failed: File exists

Why does it update the journal? Why does it try to open the journal in R/W mode?

--
Matus UHLAR - fantomas
Re: bayes autolearn off but journal updated
On Tue, Jan 20, 2009 at 04:49:12PM +0100, Matus UHLAR - fantomas wrote:
> Why does it update the journal? Why does it try to open journal in R/W mode?

$ man sa-learn
[...]
    bayes_journal
        While SpamAssassin is scanning mails, it needs to track which tokens it uses in its calculations. To avoid the contention of having each SpamAssassin process attempting to gain write access to the Bayes DB, the token timestamps are written to a 'journal' file which will later (either automatically or via sa-learn --sync) be used to synchronize the Bayes DB.

In other words, the journal isn't just for learning.
Re: bayes autolearn off but journal updated
Theo Van Dinter wrote:
>> On Tue, Jan 20, 2009 at 04:49:12PM +0100, Matus UHLAR - fantomas wrote:
>> Why does it update the journal? Why does it try to open journal in R/W mode?
>
> $ man sa-learn
> [...]
>     bayes_journal
>         While SpamAssassin is scanning mails, it needs to track which tokens it uses in its calculations. To avoid the contention of having each SpamAssassin process attempting to gain write access to the Bayes DB, the token timestamps are written to a 'journal' file which will later (either automatically or via sa-learn --sync) be used to synchronize the Bayes DB.
>
> In other words, the journal isn't just for learning.

Yes, more specifically, it's mostly going to be updating the atime, or time of last access, records for tokens. This time is used by the expiry process to drop the least recently used tokens.
Bayes and last journal sync atime
I find that last journal sync atime is 0 on my Bayes setups that use MySQL. So, can I assume that there is no journal (well, there's no table and file for it, anyway) and stuff is added directly to the database? (which makes sense). However, looking at my setups that still use dbm files I find that the last journal sync atime is completely wrong on them. e.g. if I do a sa-learn --sync the last journal sync atime doesn't change and it's months old. Kai -- Kai Schätzl, Berlin, Germany Get your web at Conactive Internet Services: http://www.conactive.com
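A minimal sketch of how the value discussed above could be checked before and after a manual sync. It only relies on `sa-learn --sync` and `sa-learn --dump magic`; the `SA_LEARN` variable and the function name are hypothetical conveniences, not part of SpamAssassin, and the function should be run as the user who owns the Bayes database.

```shell
#!/bin/sh
# Sketch only: force a journal sync, then print the bookkeeping line
# that holds the "last journal sync atime".
# SA_LEARN is a hypothetical override for the sa-learn command.
SA_LEARN="${SA_LEARN:-sa-learn}"

bayes_sync_and_report() {
    # Flush the journal into the Bayes DB; give up if that fails.
    "$SA_LEARN" --sync || return 1
    # --dump magic lists the non-token entries; keep only the one of interest.
    "$SA_LEARN" --dump magic | grep 'last journal sync atime'
}

# Usage (not executed here):
#   bayes_sync_and_report
```

Running it twice a few minutes apart makes it easy to see whether the reported atime moves at all on a given setup.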
Re: sa-learn journal location for teaching spamassassin on multiple hosts
Hey Jake,

Thx for your reply. I got this same tip off-list (from Jonas Eckerman). I liked the idea and I have already done some successful testing of centralized bayes-data storage in a MySQL database. We are using an SQL back-end for storing 'all things e-mail' anywayz, so this was easily fitted in. I will be rolling stuff out as soon as it is ready for production.

Also, the READMEs in the distribution were very useful for setting this up. I did not need any other resources and there were zero issues.

Thx to Jonas, Jake and the list for helping out, gj ;)

Regards,
Samy

I'm keeping these full messages in here, as they may present a (kinda) full problem and solution for others having similar issues.

On Nov 11, 2008, at 11:51 PM, Jake Maul wrote:
> On Fri, Nov 7, 2008 at 4:45 AM, Samy Ascha, Xel Media B.V. [EMAIL PROTECTED] wrote:
>> I have recently set up a mailbox and an sa-learn script to start teaching SpamAssassin. This was all no problem, but: we have an MX group of usually about 3 MTAs, which all run their own content filter (amavis) and thus use their own SpamAssassin database. When we start teaching SpamAssassin with sa-learn, I need to somehow sync the results in the journal to all these hosts.
>>
>> I've checked out the --no-sync and --sync options and I think these options will give me exactly the tools I need for this job. I need to know the location of the journal, though, and I need to know if there are any pitfalls when syncing a SpamAssassin with a journal from another one on another server.
>>
>> Has anyone got experience with syncing sa-learn between multiple MTAs? How did you solve this? Can SA sync with a journal in an arbitrary location, or does it look for it in one preconfigured place?
>>
>> I hope you have some interesting thoughts about this issue.
>
> Ultimately, you're not syncing 'sa-learn', you're syncing the bayes DB that sa-learn (and spamd) records to. There are a few ways to go about sharing the bayesian database.
>
> Probably the best bet would be to store the bayes DB in MySQL, and point SA on all 3 servers to it, ideally with the database on a 4th server (hey, you can put the AWL info into MySQL as well... may as well hit that up at the same time).
>
> You could probably go the --sync and --no-sync route if you fiddled with it enough (never tried it), but honestly a single MySQL DB for bayes would probably be a lot simpler if you have any experience at all with MySQL. It's been good for performance for us even when used on a single server, and it's pretty bulletproof for us; it's been in use for years. The only tip you really need here is to run OPTIMIZE TABLE every now and then.
>
> An alternative hacky solution: turn off autolearn on 2 of the 3, and do sa-learns and autolearning on the 3rd. Then nightly rsync all the bayes DB files over to the other 2 servers and restart spamd. Not pretty, but it should work.
>
> Jake
Re: sa-learn journal location for teaching spamassassin on multiple hosts
On Fri, Nov 7, 2008 at 4:45 AM, Samy Ascha, Xel Media B.V. [EMAIL PROTECTED] wrote:
> I have recently set up a mailbox and an sa-learn script to start teaching SpamAssassin. This was all no problem, but: we have an MX group of usually about 3 MTAs, which all run their own content filter (amavis) and thus use their own SpamAssassin database. When we start teaching SpamAssassin with sa-learn, I need to somehow sync the results in the journal to all these hosts.
>
> I've checked out the --no-sync and --sync options and I think these options will give me exactly the tools I need for this job. I need to know the location of the journal, though, and I need to know if there are any pitfalls when syncing a SpamAssassin with a journal from another one on another server.
>
> Has anyone got experience with syncing sa-learn between multiple MTAs? How did you solve this? Can SA sync with a journal in an arbitrary location, or does it look for it in one preconfigured place?
>
> I hope you have some interesting thoughts about this issue.

Ultimately, you're not syncing 'sa-learn', you're syncing the bayes DB that sa-learn (and spamd) records to. There are a few ways to go about sharing the bayesian database.

Probably the best bet would be to store the bayes DB in MySQL, and point SA on all 3 servers to it, ideally with the database on a 4th server (hey, you can put the AWL info into MySQL as well... may as well hit that up at the same time).

You could probably go the --sync and --no-sync route if you fiddled with it enough (never tried it), but honestly a single MySQL DB for bayes would probably be a lot simpler if you have any experience at all with MySQL. It's been good for performance for us even when used on a single server, and it's pretty bulletproof for us; it's been in use for years. The only tip you really need here is to run OPTIMIZE TABLE every now and then.

An alternative hacky solution: turn off autolearn on 2 of the 3, and do sa-learns and autolearning on the 3rd. Then nightly rsync all the bayes DB files over to the other 2 servers and restart spamd. Not pretty, but it should work.

Jake
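The "hacky" rsync alternative could be sketched roughly as follows. This is an illustration under assumptions, not a tested deployment: it assumes DBM-format Bayes files named bayes_toks and bayes_seen in the same directory on every host, and that nothing is writing to them while they are copied. `RUN` and `SPAMD_RESTART` are hypothetical knobs invented for the sketch; the default restart command is a placeholder for whatever restarts spamd on your systems.

```shell
#!/bin/sh
# push_bayes DIR HOST...
# Copy the Bayes DB files from DIR on the learning host to the same DIR on
# each HOST, then restart spamd there.
#   RUN=echo           print the commands instead of running them (dry run)
#   SPAMD_RESTART=...  command that restarts spamd on the remote host
push_bayes() {
    dir=$1
    shift
    for host in "$@"; do
        # The journal is deliberately not copied; only the DB files are.
        ${RUN:-} rsync -a "$dir/bayes_toks" "$dir/bayes_seen" "$host:$dir/" || return 1
        ${RUN:-} ssh "$host" "${SPAMD_RESTART:-/etc/init.d/spamd restart}" || return 1
    done
}

# Usage (dry run, not executed here):
#   RUN=echo push_bayes /path/to/bayes-dir mx2 mx3
```

Running `sa-learn --sync` on the learning host before the copy would make sure pending journal entries are included, though that ordering is an assumption rather than something stated in the thread.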
sa-learn journal location for teaching spamassassin on multiple hosts
Dear members,

I have recently set up a mailbox and an sa-learn script to start teaching SpamAssassin. This was all no problem, but: we have an MX group of usually about 3 MTAs, which all run their own content filter (amavis) and thus use their own SpamAssassin database. When we start teaching SpamAssassin with sa-learn, I need to somehow sync the results in the journal to all these hosts.

I've checked out the --no-sync and --sync options and I think these options will give me exactly the tools I need for this job. I need to know the location of the journal, though, and I need to know if there are any pitfalls when syncing a SpamAssassin with a journal from another one on another server.

Has anyone got experience with syncing sa-learn between multiple MTAs? How did you solve this? Can SA sync with a journal in an arbitrary location, or does it look for it in one preconfigured place?

I hope you have some interesting thoughts about this issue.

Thx much and regards,
Samy Ascha
Xel Media Internet Services
Re: sa-learn journal location for teaching spamassassin on multiple hosts
On 07.11.08 12:45, Samy Ascha, Xel Media B.V. wrote:
> I have recently set up a mailbox and an sa-learn script to start teaching SpamAssassin. This was all no problem, but: we have an MX group of usually about 3 MTAs, which all run their own content filter (amavis) and thus use their own SpamAssassin database. When we start teaching SpamAssassin with sa-learn, I need to somehow sync the results in the journal to all these hosts.

We have a group of four MTA servers. However, they don't run SA at the MTA level (yet). We have users' mailboxes on a shared storage cluster, so their bayes DB is on shared space.

I'd solve your case by configuring the MTAs without BAYES, or maybe by using users' configs, if possible: if the mail is sent to one user, that should not be a problem. For mail sent to multiple users, a somewhat generic configuration and filtering will be used, so users may want to have the mail rechecked for spamminess.

> Has anyone got experience with syncing sa-learn between multiple MTAs? How did you solve this? Can SA sync with a journal in an arbitrary location, or does it look for it in one preconfigured place?

I am not sure if it's safe to use a journal or bayes DB that is NFS-mounted...

--
Matus UHLAR - fantomas
Understanding Bayes journal sync
I have started to use a different method to call SA on some of my machines than I used in the past, because the web interface I chose (ISPConfig) integrates with SA and clamav (via clamassassin). This is now classic SA calling via procmail. The other methods I used before, and still use on other machines, are MailScanner and a special spamc-like milter. There I have never seen this problem.

So, what happens is that users get completely blank mail after the first one or two weeks of use. When I ran sa -D it became apparent that it was trying to sync the Bayes journal and couldn't acquire a lock because there already were two lockfiles: bayes_journal.lock and bayes_journal.FQDN.lock or some such. The journal had grown to about 55 MB. That somehow led to a timeout and the empty mail. Once I removed the lock files and ran a --sync it took only a few seconds to finish the sync.

I would like to know how this locking problem can happen, as it could frequently spoil the user experience. I assume it could happen (similar to bayes expiry) when it's time to sync and the sa run or the sync itself times out and is killed by the procmail process, leaving behind the lockfile, or so? Other possible causes?

What's the best method to avoid it? There's no setting like bayes_auto_expire for the journal sync. Should I set bayes_journal_max_size to 100 MB or so and then run a nightly sync?

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
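Leftover lock files like the ones described above could be spotted with a small check such as this one. It is a sketch under assumptions: the `*.lock*` name pattern is a guess based on the file names mentioned in the message, the function name is made up, and `-maxdepth`/`-mmin` are find extensions available in GNU and BSD find rather than strict POSIX. It only lists candidates; removing them is only reasonable when no spamd, spamassassin or sa-learn process is still using that Bayes directory.

```shell
#!/bin/sh
# list_stale_bayes_locks DIR MINUTES
# Print lock files directly inside DIR that were last modified more than
# MINUTES minutes ago. Nothing is deleted.
list_stale_bayes_locks() {
    find "$1" -maxdepth 1 -type f -name '*.lock*' -mmin "+$2" -print
}

# Usage (not executed here): lock files untouched for over an hour
#   list_stale_bayes_locks "$HOME/.spamassassin" 60
```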
Re: Understanding Bayes journal sync
Kai Schaetzl wrote on Sun, 17 Aug 2008 14:09:17 +0200: Should I set the bayes_journal_max_size to 100 MB or so and then run a nightly sync? I reread the conf page. Of course, 0 would be the correct setting to stop the fly-by syncing. Kai -- Kai Schätzl, Berlin, Germany Get your web at Conactive Internet Services: http://www.conactive.com
Usage of journal in Bayesian Filtering.
Hi,

I am trying to understand the usage of the journal in Bayesian filtering.

- If bayes_learn_to_journal is set to 1, SA stores newly learnt tokens in the journal.
- When the Bayesian filter is activated, SA reads tokens from BOTH the 'bayes_tokens' database and 'bayes_journal' while scanning a message.
- While scanning a message, tokens found in the bayes_tokens database are written to bayes_journal with a modified timestamp.

Is my understanding correct? Please correct me if it is wrong.

regards,
Srilatha
Re: My bayes journal just keeps growing
On Thu, Dec 14, 2006 at 12:48:34PM +0530, Ramprasad wrote:
> The problem is my bayes_journal file grows immensely ( around 500Mb a day ) but the bayes_toks files hardly gets touched

It sounds like syncing is not working for you.

> When I do a bayes-expiry the process seems to hang (after even 3-4 hours ) and I simply resort to deleting the journal file. Because I cant

Why do you delete the journal, which has nothing to do with expiry? Have you run in debug mode to see what is going on?
My bayes journal just keeps growing
I run SA 3.1.5 with MailScanner. I have in my cf file:

    bayes_learn_to_journal 1
    use_bayes 1
    bayes_path /var/spool/MailScanner/spamassassin/bayes
    bayes_file_mode 0666
    bayes_auto_expire 0

The problem is that my bayes_journal file grows immensely (around 500 MB a day) but the bayes_toks file hardly gets touched.

When I do a bayes expiry the process seems to hang (even after 3-4 hours) and I simply resort to deleting the journal file, because I can't keep waiting for the expiry to complete. (We get HUGE traffic of around 7 million mails a day on 14 load-balanced servers.)

I am looking at MySQL-based bayes, but that will take time to implement.

What is the best way of setting up bayes for high-traffic servers?

Thanks
Ram
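A journal that grows without ever being synced is easier to catch early with a simple size check. The following is only a monitoring sketch: the function name and the threshold are made up, and the journal path in the usage comment assumes that bayes_path is a file-name prefix, so that the journal sits next to bayes_toks as bayes_journal.

```shell
#!/bin/sh
# journal_over_limit FILE LIMIT_BYTES
# Exit status 0 when FILE exists and is larger than LIMIT_BYTES, 1 otherwise.
journal_over_limit() {
    [ -f "$1" ] || return 1
    # The arithmetic expansion strips any padding some wc versions print.
    size=$(( $(wc -c < "$1") ))
    [ "$size" -gt "$2" ]
}

# Usage (not executed here): warn once the journal passes roughly 50 MB
#   journal_over_limit /var/spool/MailScanner/spamassassin/bayes_journal 52428800 \
#       && echo "bayes_journal is large; check that the journal sync is working"
```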
Bayes journal problem
Dear all,

I've been using spamassassin 3.1.0 without problems. Starting today I see the following messages in the log files of my mail server:

    Jan 24 16:35:13 alpha spamd[8295]: partial write to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal (4040 of 118632), recovering.
    Jan 24 16:35:14 alpha spamd[8293]: partial write to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal (4040 of 111984), recovering.
    Jan 24 16:35:14 alpha spamd[8294]: partial write to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal (4040 of 117408), recovering.
    Jan 24 16:35:14 alpha spamd[8294]: cannot write to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal, aborting!
    Jan 24 16:35:14 alpha spamd[8294]: Exiting subroutine via last at /usr/lib/perl5/vendor_perl/5.8.6/Mail/SpamAssassin/BayesStore/DBM.pm line 1073.
    Jan 24 16:35:14 alpha last message repeated 2 times
    Jan 24 16:35:14 alpha spamd[8294]: Exiting eval via last at /usr/lib/perl5/vendor_perl/5.8.6/Mail/SpamAssassin/BayesStore/DBM.pm line 1073.
    Jan 24 16:35:14 alpha spamd[8295]: partial write to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal (4040 of 118632), recovering.

What is happening?

--
ENRICO MORELLI
University of Florence
CERM - via Sacconi, 6 - 50019 Sesto Fiorentino (FI) - ITALY
email: [EMAIL PROTECTED]
phone: +39 055 4574269
fax: +39 055 4574253
Cultivate Linux, since Windows crashes all by itself.
Re: Bayes journal problem
Perhaps you have per-user Bayes and the user has gone over-quota on disk space? Loren
Re: Bayes journal problem
On Wed, 25 Jan 2006 03:14:30 -0800 Loren Wilton [EMAIL PROTECTED] wrote:
> Perhaps you have per-user Bayes and the user has gone over-quota on disk space?
>
> Loren

I checked and yes, I have per-user Bayes and some users were out of quota. Is this the problem?

Thanks a lot.

--
Enrico Morelli
Re: Bayes journal problem
On Wed, 25 Jan 2006 12:31:02 +0100 Enrico Morelli [EMAIL PROTECTED] wrote:
> On Wed, 25 Jan 2006 03:14:30 -0800 Loren Wilton [EMAIL PROTECTED] wrote:
>> Perhaps you have per-user Bayes and the user has gone over-quota on disk space?
>>
>> Loren
>
> I checked and yes, I have per-user Bayes and some users were out of quota. Is this the problem?
>
> Thanks a lot.

I added some disk quota for the users that were out of quota, but the problem seems unresolved.

    Jan 25 12:31:13 alpha spamd[2021]: bayes: write failed to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal (0 of 263856)!
    Jan 25 12:31:13 alpha spamd[2021]: Exiting subroutine via last at /usr/lib/perl5/vendor_perl/5.8.6/Mail/SpamAssassin/BayesStore/DBM.pm line 1126.
    Jan 25 12:31:13 alpha last message repeated 2 times
    Jan 25 12:31:13 alpha spamd[2021]: Exiting eval via last at /usr/lib/perl5/vendor_perl/5.8.6/Mail/SpamAssassin/BayesStore/DBM.pm line 1126.

--
Enrico Morelli
Re: Bayes journal problem
> I add some disk quota to the users that was out of quota, but the problem seems unresolved.
>
> Jan 25 12:31:13 alpha spamd[2021]: bayes: write failed to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal (0 of 263856)!
> Jan 25 12:31:13 alpha spamd[2021]: Exiting subroutine via last

This still looks like a quota or permissions problem, or maybe a missing home directory for some user.

I think this is saying that Bayes could not write to a file in /etc/mail/spamassassin/BAYES. I'm guessing this is a common file rather than a per-user file. So perhaps the user does not have permission to write to this directory?

Loren
Re: Bayes journal problem
On Wed, Jan 25, 2006 at 04:04:28AM -0800, Loren Wilton wrote:
>> Jan 25 12:31:13 alpha spamd[2021]: bayes: write failed to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal (0 of 263856)!
>> Jan 25 12:31:13 alpha spamd[2021]: Exiting subroutine via last
>
> This still looks like a quota or permissions problem, or maybe a missing home directory for some user.

FWIW, the error occurs when the journal has been opened for writing (so in theory the permissions and such should be ok), but any attempt to actually put data into the journal fails; specifically, syswrite() (and therefore the system's write() function) returns an error.

I've attached a patch which you could use against M::SA::BayesStore::DBM. It makes that error message include the system's error string, which should hopefully be useful.

Index: lib/Mail/SpamAssassin/BayesStore/DBM.pm
===================================================================
--- lib/Mail/SpamAssassin/BayesStore/DBM.pm (revision 366688)
+++ lib/Mail/SpamAssassin/BayesStore/DBM.pm (working copy)
@@ -1122,8 +1122,12 @@
       # argh, write failure, give up
       if (!defined $len || $len < 0) {
-        $len = 0 unless (defined $len);
-        warn "bayes: write failed to Bayes journal $path ($len of $nbytes)!\n";
+        my $err = '';
+        if (!defined $len) {
+          $len = 0;
+          $err = " ($!)";
+        }
+        warn "bayes: write failed to Bayes journal $path ($len of $nbytes)!$err\n";
         last;
       }
Re: Bayes journal problem
On Wed, 25 Jan 2006 04:04:28 -0800 Loren Wilton [EMAIL PROTECTED] wrote:
>> I add some disk quota to the users that was out of quota, but the problem seems unresolved.
>>
>> Jan 25 12:31:13 alpha spamd[2021]: bayes: write failed to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal (0 of 263856)!
>> Jan 25 12:31:13 alpha spamd[2021]: Exiting subroutine via last
>
> This still looks like a quota or permissions problem, or maybe a missing home directory for some user.
>
> I think this is saying that Bayes could not write to a file in etc/mail/spamassassin/BAYES. I'm guessing this is a common file rather than a per-user file. So perhaps the user does not have permission to write to this directory?
>
> Loren

BAYES is a directory containing the DBs used by spamassassin; the files are shared, not per-user files.

    # ls -la /etc/mail/spamassassin
    drwx------  2 spamc spamc     4096 Jan 25 10:38 BAYES

    # ls -la /etc/mail/spamassassin/BAYES
    drwx------  2 spamc spamc     4096 Jan 25 10:38 .
    drwxr-xr-x  3 root  root      4096 Jan 24 16:45 ..
    -rw-------  1 spamc spamc     5154 Jan 25 10:38 bayes.mutex
    -rw-rw-rw-  1 spamc spamc        0 Jan 25 13:17 bayes_journal
    -rw-------  1 spamc spamc      154 May 27  2004 bayes_journal.orig
    -rw-------  1 spamc spamc 41598976 Jan 25 07:58 bayes_seen
    -rw-rw-rw-  1 spamc spamc  5414912 Jan 25 07:58 bayes_toks

--
Enrico Morelli
Re: Bayes journal problem
Enrico Morelli wrote:
> Dear all,
>
> I'm using spamassassin 3.1.0 without problems. Starting from today I see the following messages in the log files of my mail server:
>
> Jan 24 16:35:13 alpha spamd[8295]: partial write to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal (4040 of 118632), recovering.
> Jan 24 16:35:14 alpha spamd[8293]: partial write to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal (4040 of 111984), recovering.
> Jan 24 16:35:14 alpha spamd[8294]: partial write to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal (4040 of 117408), recovering.
> Jan 24 16:35:14 alpha spamd[8294]: cannot write to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal, aborting!
> Jan 24 16:35:14 alpha spamd[8294]: Exiting subroutine via last at /usr/lib/perl5/vendor_perl/5.8.6/Mail/SpamAssassin/BayesStore/DBM.pm line 1073.
> Jan 24 16:35:14 alpha last message repeated 2 times
> Jan 24 16:35:14 alpha spamd[8294]: Exiting eval via last at /usr/lib/perl5/vendor_perl/5.8.6/Mail/SpamAssassin/BayesStore/DBM.pm line 1073.
> Jan 24 16:35:14 alpha spamd[8295]: partial write to Bayes journal /etc/mail/spamassassin/BAYES/bayes_journal (4040 of 118632), recovering.
>
> What's happen?

SA tried to write a large block of data to disk and the OS only allowed it to write 4040 bytes.

Possible causes:

- 99% chance of disk full or user quota exceeded.
- 1% chance of hard disk failure.

Check your system logs and df.
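The df check suggested above can be scripted so that a nearly full filesystem is noticed before journal writes start failing. This is a minimal sketch: the function name and the 95% threshold are arbitrary choices for illustration, it reads the "Capacity" column of `df -P`, and it does not look at per-user quotas at all.

```shell
#!/bin/sh
# fs_usage_percent PATH
# Print how full the filesystem holding PATH is, as a bare number (no % sign).
fs_usage_percent() {
    # With -P, line 2 holds the data and field 5 is the Capacity column.
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage (not executed here):
#   used=$(fs_usage_percent /etc/mail/spamassassin/BAYES)
#   [ "$used" -ge 95 ] && echo "filesystem holding the Bayes journal is ${used}% full"
```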
Re: Bayes journal problem
On Wed, 25 Jan 2006 10:06:13 -0500 Matt Kettler [EMAIL PROTECTED] wrote:

> Enrico Morelli wrote:
>> Dear all,
>
> SA tried to write a large block of data to disk and the OS only allowed it
> to write 4040 bytes.
>
> Possible causes:
> 99% chance of disk full or user quota exceeded.
> 1% chance of hard disk failure.
>
> Check your system logs and df

Yeah!!! Solved. Thanks. In fact, the / filesystem was 100% full.

--
Enrico Morelli
Bayes journal options and SQL
Do the Bayes journal options (bayes_journal_max_size, bayes_learn_to_journal) have any effect when you use MySQL as the Bayes database?
Re: Bayes journal options and SQL
On Fri, Jan 07, 2005 at 05:39:44PM -0500, Rosenbaum, Larry M. wrote:

> Do the Bayes journal options (bayes_journal_max_size,
> bayes_learn_to_journal) have any effect when you use MySQL as the Bayes
> database?

No.

Michael
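To spot journal settings that are left over in a configuration after a move to SQL, a rough Python sketch along these lines may help. It is not a SpamAssassin tool. It assumes the storage backend is selected with a `bayes_store_module` line and treats any module name containing "SQL" as an SQL backend; the thread itself only confirms the answer for MySQL. The function name `unused_journal_options` is made up for illustration.

```python
# The two journal options named in the question above.
JOURNAL_OPTIONS = ("bayes_journal_max_size", "bayes_learn_to_journal")


def unused_journal_options(config_text):
    """Return journal options that are set while an SQL store is configured.

    Assumption: a bayes_store_module value containing "SQL" means the
    journal is not used, as stated for MySQL in this thread.
    """
    store = None
    seen = []
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        if key == "bayes_store_module":
            store = value
        elif key in JOURNAL_OPTIONS:
            seen.append(key)
    if store is not None and "SQL" in store:
        return seen
    return []


if __name__ == "__main__":
    example = """
    bayes_store_module Mail::SpamAssassin::BayesStore::SQL
    bayes_learn_to_journal 1     # has no effect with an SQL store
    """
    print(unused_journal_options(example))
```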
Journal?
For a while now (years) I have employed a single bayes db for my entire server of users. It used bayes and kept three files: bayes_seen, bayes_toks and bayes_journal. All of these files had current creation (modification) dates, so I know they were being used. Thing is, I never used the rebuild feature to sync the journal; it just happened. Is that normal?

But on to my question...

I recently started allowing the use of user_prefs for individual users. Everything is working, but only two files are being added to the user home folders: bayes_toks and bayes_seen (no journal). Is this something I should be concerned about? The accounts are fairly light, not high traffic. Is there even a need to use journaling on the individual accounts?

Thanks for any help.

--
Mike Yrabedra (President)
323 Incorporated
Our Sites: MacDock.com MacAgent.com iTuneAgent.com MacSurfShop.com
W: http://www.323inc.com/
P: 770.382.1195
F: 734.448.5164
E: [EMAIL PROTECTED]
I: ichatmacdock

Whatever you do, work at it with all your heart, as working for the Lord, not for men. ~Colossians 3:23
[2.64] Bayes journal: gibberish entry found
This just appeared in the SA-logs:

Oct 11 17:10:06 hostname spamd[16864]: info: setuid to user succeeded
Oct 11 17:10:06 hostname spamd[16864]: processing message [EMAIL PROTECTED] for user:531.
Oct 11 17:10:07 hostname spamd[16864]: Bayes journal: gibberish entry found:
Oct 11 17:10:07 hostname spamd[16864]: Bayes journal: gibberish entry found: hdu4+AdBAHUFuOwHQQBQV+h8bQAAaOAHQQBX6HFtAACDxBCF23QuahnoozQAAFmFwHQiagTopzQA
Oct 11 17:10:07 hostname spamd[16864]: Bayes journal: gibberish entry found: AEBAUI2FJP7//1DoPDUAAFBX6EFtAACDxBTrFo2FJP7//1DoKEAAAFBX6CltAACDxAyF23QsaNgH
Oct 11 17:10:07 hostname spamd[16864]: Bayes journal: gibberish entry found: QQBX6BdtAABoOC9BAGgAL0EAaNAHQQCNhXz+//9Q/xW04UAAg8QY6yFoyAdBAFfo62wAAFlZaCwB
Oct 11 17:10:07 hostname spamd[16864]: Bayes journal: gibberish entry found: AACNhXz+//9QagD/FWTgQABofAdBAFfoymwAAI2FfP7//1BX6M4PAABWV+i2bAAAjUW8UFforGwA
Oct 11 17:10:07 hostname spamd[16864]: Bayes journal: gibberish entry found: AGhwB0EAV+ihbAAAg8Qo6wdqAVjDi2Xog038/4tN8GSJDQBfXlvJw6QnQAC5J0AAwCdAAMcn
Oct 11 17:10:07 hostname spamd[16864]: Bayes journal: gibberish entry found: QADOJ0AA1SdAANwnQADjJ0AA6idAANEsQADYLEAA3yxAAOYsQADtLEAA9CxAAPssQAACLUAACS1A
Oct 11 17:10:07 hostname spamd[16864]: Bayes journal: gibberish entry found: ABAtQAAXLUAAHi1AACUtQABVi+yD7DRWi3UIaPAQQQBWgCYA6BJsAAAPt0UMSFmD+AtZd2H/JIUz
Oct 11 17:10:07 hostname spamd[16864]: Bayes journal: gibberish entry found: N0AAaOgQQQDrS2jcEEEA60Ro1BBBAOs9aMwQQQDrNmjIEEEA6y9owBBBAOsoaLgQQQDrIWiwEEEA
Oct 11 17:10:07 hostname spamd[16864]: Bayes journal: gibberish entry found: 6xpopBBBAOsTaJwQQQDrDGiQEEEA6wVohBBBAFbop2sAAFlZD7dFEFt 1097498524 Relief
Oct 11 17:10:11 hostname spamd[16864]: identified spam (10.4/5.0) for user:531 in 5.4 seconds, 2419 bytes.

Is this anything to worry about?

TIA
Martin

--
Martin Schröder, [EMAIL PROTECTED]
ArtCom GmbH, Lise-Meitner-Str 5, 28359 Bremen, Germany
Voice +49 421 20419-44 / Fax +49 421 20419-10
http://www.artcom-gmbh.de
Re: SA 3.0-RC2 producing extremely large bayes journal files
Kai Schaetzl wrote on Sat, 25 Sep 2004 22:34:10 +0200:

> FWIW, the problem seems to have been RC2-specific

I spoke too soon; I just needed to wait another day. So it took 15 days to surface this time. I'm going to open a bug on this if I can't find it on Bugzilla.

Kai

--
Kai Schätzl, Berlin, Germany
Get your web at Conactive Internet Services: http://www.conactive.com
IE-Center: http://ie5.de http://msie.winware.org
Re: SA 3.0-RC2 producing extremely large bayes journal files
FWIW, the problem seems to have been RC2-specific. Didn't occur after it, now going from RC4 to RTM next week. Thanks for all the great work!

Kai
Re: SA 3.0-RC2 producing extremely large bayes journal files
Daniel Quinlan wrote on 11 Sep 2004 13:55:08 -0700:

> If you haven't already, please file a bug.

No, I haven't filed a bug yet. It happened twice during the testing of RC2 on the RC2 machine. It didn't happen on the RC3 machine. I applied RC4 three days ago to both machines and am still waiting for it to happen. I suppose it doesn't make much sense to file a bug against RC2 for a problem that may already have been fixed.

Kai
SA 3.0-RC2 producing extremely large bayes journal files
For about a week I've been seeing SA time-outs in MailScanner (120 sec time-out), and on investigating, the cause seems to be extremely large bayes journal files. I ran sa-learn -D --sync and that took quite long, about two minutes. As I understand it, SA should try to sync once a day? So it seems that when the sync should happen, it takes so long that it times out with MS. I then took a look at the bayes dir and found this:

-rw-rw-rw- 1 spamd www        36 Sep 11 12:43 bayes.mutex
-rw-rw-rw- 1 root  www     12968 Sep 11 13:13 bayes_journal
-rw-rw-rw- 1 root  www 170591392 Sep 11 12:04 bayes_journal.old
-rw-rw-rw- 1 spamd www   2408448 Sep 11 11:47 bayes_seen
-rw-rw-rw- 1 spamd www  20951040 Sep 11 11:47 bayes_toks

There was a message received at 12:04, then the --sync apparently was about to happen, but never finished? There are only a few thousand messages arriving per day, many of them whitelisted. It's nearly impossible that the journal could grow so large. Not only should it sync automatically after some time, I also forced a sync only a few days ago.

Also, when this happens I find that a lot of swap space is allocated although there's still 100 MB or more of free RAM available, and only a restart of MailScanner frees that up. (I sent a message about this to the MS list as well.) But the basic underlying problem seems to be this massive journal bloat.

This is SA 3.0-RC2 on Suse 9.0 with MailScanner 4.32.5 (I think). I have RC3 on an almost identical system and haven't seen the same there yet. Were there any changes after RC2 in that area, so that testing RC4 might prove useful? Also, could it be any of the Perl modules involved? If so, which should I check or upgrade?

Kai
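A journal that has grown to 170591392 bytes, like the bayes_journal.old in the listing above, is easy to miss until scans start timing out. The following Python sketch, which is not part of SpamAssassin or MailScanner, lists journal files above a size limit so the check can be run from cron. The function name `oversized_journals` and the `limit_bytes` parameter are assumptions for illustration; the limit is whatever the administrator considers suspicious, not a SpamAssassin default.

```python
import os


def oversized_journals(bayes_dir, limit_bytes):
    """Return [(file_name, size)] for bayes_journal* files over limit_bytes.

    Results are sorted largest first. The "bayes_journal" prefix is taken
    from the file names shown in this thread.
    """
    found = []
    for name in os.listdir(bayes_dir):
        if not name.startswith("bayes_journal"):
            continue
        path = os.path.join(bayes_dir, name)
        if not os.path.isfile(path):
            continue
        size = os.path.getsize(path)
        if size > limit_bytes:
            found.append((name, size))
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    import sys

    # Usage: python check_journal.py /path/to/bayes/dir [limit_in_bytes]
    directory = sys.argv[1]
    limit = int(sys.argv[2]) if len(sys.argv) > 2 else 10 * 1024 * 1024
    for name, size in oversized_journals(directory, limit):
        print("%s is %d bytes" % (name, size))
```

When it reports a file, running sa-learn --sync by hand (as described above) outside of mail processing avoids the scanner time-out while the cause is investigated.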
Re: SA 3.0-RC2 producing extremely large bayes journal files
Kai Schaetzl [EMAIL PROTECTED] writes:

> For about a week I've been seeing SA time-outs in MailScanner (120 sec
> time-out) and on investigating it seems the reason are extremely large
> bayes journal files. I ran sa-learn -D --sync and that took quite long,
> about two minutes. As I understand SA should try to sync once a day? So,

If you haven't already, please file a bug.

Daniel

--
Daniel Quinlan
http://www.pathname.com/~quinlan/
Re: Cannot write to journal and others
On Sun, 5 Sep 2004 13:41:15 -0500 John Fleming [EMAIL PROTECTED] wrote:

> Sep 5 13:23:56 Luke spamd[29971]: cannot write to /var/.spamassassin/bayes_journal, Bayes db update ignored

Check the permissions on the directory /var/.spamassassin

I believe it should be readable/writeable by all mail processes ... by all users.

--
Raquel

The person born with a talent they are meant to use will find their greatest happiness in using it. --Johann Wolfgang Von Goethe