Re: Conversion Spamassassin(bayes) database to SDBM
Axb wrote:
On 2011-08-01 16:50, monolit939 wrote:
Axb wrote:
On 2011-08-01 9:52, monolit939 wrote:
Axb wrote:

wrong! http://spamassassin.apache.org/full/3.3.x/doc/Mail_SpamAssassin_Conf.txt
see bayes_path
in your case:
bayes_path /var/mail/.spamassassin/bayes

Hello,
first of all, thank you for your advice. I added bayes_path /var/mail/.spamassassin/bayes to local.cf and followed the steps you recommended in your previous post, BUT I performed them as user root. I think the conversion from Berkeley DB to SDBM was successful. Unfortunately, SpamAssassin gives the same results with Berkeley DB and with SDBM, so I am not sure whether SpamAssassin really uses the SDBM database when scanning mail.

I performed the following as root:

1) stop spamd
2) sa-learn --backup > /tmp/bayes_export
3) add the following lines to local.cf
   bayes_store_module Mail::SpamAssassin::BayesStore::SDBM
   bayes_path /var/mail/.spamassassin/bayes
4) sa-learn --restore /tmp/bayes_export

test change:
5) spamassassin -D --lint 2>&1 | grep -i bayes   # I didn't notice any error

Jul 31 19:53:39.813 [2485] dbg: config: read file /usr/share/spamassassin/23_bayes.cf
Jul 31 19:53:39.887 [2485] dbg: plugin: loading Mail::SpamAssassin::Plugin::Bayes from @INC
Jul 31 19:53:40.688 [2485] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0xae6a2a0) implements 'learner_new', priority 0
Jul 31 19:53:40.688 [2485] dbg: bayes: learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0xae6a2a0), bayes_store_module=Mail::SpamAssassin::BayesStore::SDBM
Jul 31 19:53:40.702 [2485] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::SDBM=HASH(0xb167590)
Jul 31 19:53:40.702 [2485] dbg: plugin: Mail::SpamAssassin::Plugin::Bayes=HASH(0xae6a2a0) implements 'learner_is_scan_available', priority 0
Jul 31 19:53:40.703 [2485] dbg: bayes: tie-ing to DB file R/O /var/mail/.spamassassin/bayes_toks
Jul 31 19:53:40.703 [2485] dbg: bayes: tie-ing to DB file R/O /var/mail/.spamassassin/bayes_seen
Jul 31 19:53:40.703 [2485] dbg: bayes: found bayes db version 3
Jul 31 19:53:40.703 [2485] dbg: bayes: DB journal sync: last sync: 0
Jul 31 19:53:40.729 [2485] dbg: bayes: DB journal sync: last sync: 0
Jul 31 19:53:40.730 [2485] dbg: bayes: corpus size: nspam = 311537, nham = 240966
Jul 31 19:53:40.734 [2485] dbg: bayes: score = 0.468256978075479
Jul 31 19:53:40.735 [2485] dbg: bayes: DB expiry: tokens in DB: 118976, Expiry max size: 15, Oldest atime: 1255330288, Newest atime: 1266342672, Last expire: 0, Current time: 1312134820
Jul 31 19:53:40.735 [2485] dbg: bayes: DB journal sync: last sync: 0
Jul 31 19:53:40.745 [2485] dbg: bayes: untie-ing
Jul 31 19:53:41.074 [2485] dbg: rules: ran eval rule BAYES_50 == got hit (1)
Jul 31 19:53:41.135 [2485] dbg: check: tests=BAYES_50,MISSING_DATE,MISSING_HEADERS,NO_RECEIVED,NO_RELAYS
Jul 31 19:53:41.136 [2485] dbg: timing: total 1327 ms - init: 896 (67.5%), parse: 0.71 (0.1%), extract_message_metadata: 1.30 (0.1%), get_uri_detail_list: 1.11 (0.1%), tests_pri_-1000: 8 (0.6%), compile_gen: 151 (11.4%), compile_eval: 17 (1.3%), tests_pri_-950: 5 (0.3%), tests_pri_-900: 5 (0.4%), tests_pri_-400: 21 (1.6%), check_bayes: 16 (1.2%), tests_pri_0: 337 (25.4%), tests_pri_500: 51 (3.8%)

if you see no errors
6) restart spamd

7) ls -lh /var/mail/.spamassassin/*

-rw-r--r-- 1 mail root  12K 2010-02-16 19:39 /var/mail/.spamassassin/auto-whitelist
-rw-r--r-- 1 mail root    6 2010-02-16 19:39 /var/mail/.spamassassin/auto-whitelist.mutex
-rw-r--r-- 1 mail root 2.7K 2011-07-31 19:53 /var/mail/.spamassassin/bayes_journal
-rw-rw-r-- 1 mail root 3.8K 2011-07-31 19:50 /var/mail/.spamassassin/bayes.mutex
-rw-r--r-- 1 mail root  78M 2010-02-09 12:40 /var/mail/.spamassassin/bayes_seen
-rwr-- 1 root root  16K 2011-07-31 19:51 /var/mail/.spamassassin/bayes_seen.dir
-rwr-- 1 root root 128M 2011-07-31 19:51 /var/mail/.spamassassin/bayes_seen.pag
-rw-r--r-- 1 mail root 5.1M 2010-02-16 18:51 /var/mail/.spamassassin/bayes_toks
-rwr-- 1 root root 4.0K 2011-07-31 19:51 /var/mail/.spamassassin/bayes_toks.dir
-rwr-- 1 root root 4.0M 2011-07-31 19:51 /var/mail/.spamassassin/bayes_toks.pag
-rw-r--r-- 1 mail root 1.2K 2010-02-09 10:20 /var/mail/.spamassassin/user_prefs

file /var/mail/.spamassassin/*

/var/mail/.spamassassin/auto-whitelist: Berkeley DB (Hash, version 8, native byte-order)
/var/mail/.spamassassin/auto-whitelist.mutex: ASCII text
/var/mail/.spamassassin/bayes_journal: ASCII text
/var/mail/.spamassassin/bayes.mutex: ASCII text
/var/mail/.spamassassin/bayes_seen: Berkeley DB (Hash, version 8, native byte-order)
/var/mail/.spamassassin/bayes_seen.dir: DOS executable (device driver) for DOS
/var/mail/.spamassassin/bayes_seen.pag: data
/var/mail/.spamassassin/bayes_toks: Berkeley DB (Hash, version 9, native byte-order)
/var/mail/.spamassassin/bayes_toks.dir:
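The atime values in the "bayes: DB expiry" debug line are Unix timestamps, which are hard to compare with the dates in the ls listing. The following is a minimal Python sketch for converting them; it is not part of SpamAssassin, and the helper names to_utc_date and days_between are made up for this illustration. The three numbers are copied from the debug line in the post.

```python
from datetime import datetime, timezone

# Values copied from the "bayes: DB expiry" debug line in the post above.
ATIMES = {
    "oldest atime": 1255330288,
    "newest atime": 1266342672,
    "current time": 1312134820,
}


def to_utc_date(epoch_seconds):
    """Return the UTC calendar date (YYYY-MM-DD) for a Unix timestamp."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).date().isoformat()


def days_between(earlier, later):
    """Whole days elapsed between two Unix timestamps."""
    return (later - earlier) // 86400


if __name__ == "__main__":
    for label, value in ATIMES.items():
        print(f"{label}: {to_utc_date(value)}")
    gap = days_between(ATIMES["newest atime"], ATIMES["current time"])
    print(f"days between newest token atime and lint run: {gap}")
```

Read this way, the newest token atime falls on the same day as the last modification of the old Berkeley DB bayes_toks file in the listing, which would be consistent with the restored database carrying over the token data from the backup. This is only a reading aid and does not by itself show which store spamd uses at scan time.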
How to change a database (configuration)
Hello, I am new to SpamAssassin and need your help with its configuration. I use Debian Lenny with SpamAssassin 3.2.5, and I want to change the database from Berkeley DB to SDBM (because this database was recommended to me on this forum). As I understand it: 1) I have to dump the Berkeley DB and then import it as SDBM. 2) I have to change the SpamAssassin configuration so that SDBM is the working database. I found a guide on how to change the database, but I don't understand exactly what I have to do. The guide: http://wiki.mailscanner.info/doku.php?id=documentation:anti_spam:spamassassin:bayes:sdbm Could you give me a clue? Thanks a lot. -- View this message in context: http://old.nabble.com/How-to-change-a-database-%28configuration%29-tp31749051p31749051.html Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: Spamassasin - SQLITE as storage database
Thanks for your post. Unfortunately, my boss wants to use SQLite only.

Lawrence @ Rogers wrote:
On 17/05/2011 12:06 PM, monolit939 wrote:

Hello, do you have any experience with using an SQLite database as storage for SpamAssassin? SpamAssassin uses Berkeley DB, but I need to replace it. I could not find any manual, guide, or even a forum discussion about using SpamAssassin together with SQLite. I appreciate any advice. Thanks a lot

I have no experience with this, but I do have experience with using MySQL with InnoDB tables. The performance is actually much better than Berkeley DBs.

Regards,
Lawrence
Re: SA-learn (spamassassin)
Good morning. After Bayes learning, the output of sa-learn --dump magic shows nspam/nham increased by 1. I tried the command several times.

I wrote a mail with Subject: viagra and body: viagra and sent it from my first account to my second account (score 0.4). Then I used sa-learn --spam on this mail. I wrote the same mail again and sent it from the first account to the second; it got a higher score, 2.4. I took this mail and used sa-learn --spam on it. I wrote the same mail once more and repeated the sending (from the first account to the second). The score was higher again, 3.4. I tried it several more times, but the score did not grow any further. That was my small experiment with Bayes scoring.

My spamd process runs as root, and I started sa-learn as root. BUT there is a database in the /root directory and the same database in the /home/spamfilter directory. spamfilter is the user that is stated in master.cf. In the SpamAssassin configuration (local.cf) I have an entry for the Bayes database, and the path is /home/spamfilter... After I ran sa-learn as root, I checked the modification times of the databases: the database under user spamfilter is updated correctly (the one under root is not updated).

I know it is strange and confusing to use two users for this. I wish everything ran under one user, but I don't know how to start spamd as spamfilter. I am also not sure whether that would be right; maybe spamd should run as root.

Here is my modification of master.cf (Postfix). This modification is recommended by the SpamAssassin web pages.

smtp inet n - n - - smtpd
  -o content_filter=spamfilter:dummy
# Interfaces to non-Postfix software. Be sure to examine the manual
# pages of the non-Postfix software to find out what options it wants.
#
spamfilter unix - n n - - pipe
  flags=Rq user=spamfilter argv=/usr/local/bin/spamfilter -f ${sender} -- ${recipient}

Thank you for explaining how Bayes works and for the time you devoted to me.
Re: SA-learn (spamassassin)
I gave you the output of the command sa-learn --dump magic. As for the end of your reply: it cannot be AWL, because I have AWL disabled. I was lucky today: my boss was busy. I will present my solution tomorrow :)
Re: SA-learn (spamassassin)
If you are so clever (because I am a bad English speaker), you can explain this topic to me by mail (in Slovak). Is that a problem for you? I did not find enough good material on this subject in Czech.
Re: SA-learn (spamassassin)
Dear Karl, I don't know which command output you need. You said sa-learn --dump magic. What am I supposed to make of your request? I am totally confused by you...
Re: SA-learn (spamassassin)
I am really sorry, I am tired from work. I made a mistake with your name. This task is serious, please don't joke.

(Oh, yeah, you do. We established that before. After all, this entire thread started as a copy-n-paste from an Ubuntu forum. )

I started the thread here and then copied it to the Ubuntu forum. As I said, I urgently need help. ...and I am not anonymous, I have a nick...

The command didn't run!

[r...@localhost 3.002005]# sa-learn --dump magic //start
Unrecognized escape \g passed through in regex; marked by <-- HERE in m/(?i)\g <-- HERE irls\b/ at /usr/lib/perl5/vendor_perl/5.8.8/Mail/SpamAssassin/Conf/Parser.pm line 1173.
0.000 0 3 0 non-token data: bayes db version
0.000 0 67 0 non-token data: nspam
0.000 0 29 0 non-token data: nham
0.000 0 1588 0 non-token data: ntokens
0.000 0 1247338497 0 non-token data: oldest atime
0.000 0 1249317365 0 non-token data: newest atime
0.000 0 1249317143 0 non-token data: last journal sync atime
0.000 0 0 0 non-token data: last expiry atime
0.000 0 0 0 non-token data: last expire atime delta
0.000 0 0 0 non-token data: last expire reduction count
[r...@localhost 3.002005]# //stop

This is the output... the command is not running. But I gave almost the same output to the forum. The only difference was that my first post did not include the prompt.
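The sa-learn --dump magic output quoted in this post can be read mechanically. Below is a small Python sketch, assuming the layout shown in the post (the value sits in the third column, followed by a "non-token data:" label); the function name parse_dump_magic is made up for this illustration and is not part of SpamAssassin. Lines without the label, such as the Perl regex warning or the shell prompt, are skipped.

```python
# Sample lines copied from the sa-learn --dump magic output in the post above.
SAMPLE = """\
Unrecognized escape \\g passed through in regex (warning line, ignored)
0.000 0 3 0 non-token data: bayes db version
0.000 0 67 0 non-token data: nspam
0.000 0 29 0 non-token data: nham
0.000 0 1588 0 non-token data: ntokens
"""

MARKER = "non-token data:"


def parse_dump_magic(text):
    """Map each 'non-token data' label to the integer in the third column."""
    values = {}
    for line in text.splitlines():
        if MARKER not in line:
            continue  # skips warnings, prompts, and blank lines
        columns, label = line.split(MARKER, 1)
        fields = columns.split()
        if len(fields) >= 3:
            values[label.strip()] = int(fields[2])
    return values


if __name__ == "__main__":
    for label, value in parse_dump_magic(SAMPLE).items():
        print(f"{label}: {value}")
```

Parsed like this, the listing reports 67 learned spam messages, 29 learned ham messages and 1588 tokens, so the output does contain the database counters even though a warning is printed before them.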
Re: Razor, spamassassin - network test
I am really sorry, it was a mistake. I was very tired yesterday.

Back on-list. I'm not a personal help-line.

When I use spamassassin -t -D razor2 /tmp/spam, I don't get the hash and so on, only the content analysis details, the Bayes classification and so on. I expected messages like:

debug: Razor is available
debug: Razor Agents 1.20, protocol version 2.
debug: Read server list from /home/jgb/.razor.lst
debug: 72636 seconds before closest server discovery
debug: Closest server is 209.204.62.150
debug: Connecting to 209.204.62.150...
debug: Connection established
debug: Signature: 48e74b8496877ba45072b201b41eebed7038186b
debug: Server version: 1.11, protocol version 2
debug: Server response: Negative 48e74b8496877ba45072b201b41eebed7038186b
debug: Message 1 NOT found in the catalogue

I have no idea how to make Razor work. This command (spamassassin -t -D razor2 /tmp/spam) is without --lint, and it is recommended by the SpamAssassin web pages. I am a beginner in this field and therefore I need precise advice. Thanks for your help
Re: SA-learn (spamassassin)
I read the SpamAssassin docs and found the following:

sa-learn --spam
Learn the input message(s) as spam. If you have previously learnt any of the messages as ham, SpamAssassin will forget them first, then re-learn them as spam. Alternatively, if you have previously learnt them as spam, it'll skip them this time around. If the messages have already been filtered through SpamAssassin, the learner will ignore any modifications SpamAssassin may have made.

...and the following:

bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
To be accurate, the Bayes system does not activate until a certain number of ham (non-spam) and spam have been learned. The default is 200 of each ham and spam, but you can tune these up or down with these two settings.

I changed the value to 1 (I use this only for testing and self-study; it is my homework). As I understand it, Bayes spam learning was thereby activated. When I use sa-learn, Bayes learns that the mail is spam, and Bayes learns the signatures... So it is strange to me that when I send the same mail again, Bayes does not mark it as spam. I don't understand this. I met all the conditions: sa-learn --spam --file mail, and bayes_min_spam_num 1. The modification date of the database changed too (but the size stayed the same), and nspam increased...

I really don't understand what SA-LEARN is for! I have the feeling that Bayes does not work correctly and ignores sa-learn. Perhaps I am silly, but I don't understand how it works :(( What I want to know is how to tell Bayes (using sa-learn): THIS MAIL IS SPAM, AND WHEN THE SAME MAIL COMES AGAIN YOU HAVE TO MARK IT AS SPAM! I know that Bayes finds similar elements between mails and decides accordingly. But when I mark a mail as spam and the next mail is 100% similar, Bayes HAS TO mark it as SPAM. That seems logical to me.
Re: Razor, spamassassin - network test
I understand that I must read the whole output, including the top of it. But the output of this command scrolls by very fast and stops at the end, so I cannot catch the top. I tried piping it to | more, but that didn't help. I tried redirecting the output to a file, but that doesn't work either: the file was empty :( I don't know how I can read the top of the output.

The last thing from the SpamAssassin web site is:

Edit your spamd start-up script, or start-up options file (depending on which OS you're running, these may be different). There should be a -L or --local switch in that file. Remove it to enable network tests.

I can't find the file with this switch. I use the CentOS distro.
Re: SA-learn (spamassassin)
FROM SA WWW:

bayes_min_ham_num (Default: 200)
bayes_min_spam_num (Default: 200)
To be accurate, the Bayes system does not activate until a certain number of ham (non-spam) and spam have been learned. The default is 200 of each ham and spam, but you can tune these up or down with these two settings.

I have a theory... I know you will think it is wrong, but I am trying to explain how I understand the SA documentation. When I set bayes_min_spam_num 1, it means that the Bayes learning system will be activated. For example: I get a mail. I use sa-learn --spam --file mail. SA saves the mail (or some signature of it) to the database. When I get the same mail again, Bayes looks in the database and says: this is the same mail as one in my database that is marked as spam, and it marks the mail as spam. To me that is logical.

What is strange is that when I use SA-LEARN, the database does not grow in size, but its modification time matches the time when I started sa-learn.
Re: Razor, spamassassin - network test
Your command works! In the output of spamassassin -D razor2 sample.msg 2>&1 | less I found the following:

check[9444]: [ 6] a=c&e=4&ep4=7542-10&s=4uO_brp3_KWEDuqMYXBVHI-4-FwA

But I don't know how to tell whether that is the signature (hash) of the mail. In the old version it was clearly marked, for example: debug: Signature: 48e74b8496877ba45072b201b41eebed7038186b.

My second question: when I send a mail, for example from XP station a) to XP station b), SpamAssassin writes X-Spam-Status and so on into the mail header. From that I can tell that the mail was checked using the SA rules and Bayes (autolearn), but how can I tell that the mail was really checked by Razor? There is no information in the mail header, and razor.log also contains no information (about checking the mail).
Re: SA-learn (spamassassin)
The question is logical. When SA learns new spam/ham, it has to write new information to the database, so I think the database has to grow in size. For example, if you have a *.doc file and you modify it by adding several words, the *.doc file gets bigger.
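The comparison with a *.doc file assumes that every learned mail appends new data. A store that keeps one counter per distinct token behaves differently: learning a mail whose tokens are already known only updates existing counters. The Python sketch below is a toy model of that idea, not SpamAssassin's storage code; the class ToyTokenStore and the crude whitespace tokenizer are assumptions made for this illustration.

```python
from collections import Counter


def tokenize(message):
    """Crude tokenizer: lowercase words, each counted once per message."""
    return set(message.lower().split())


class ToyTokenStore:
    """Toy stand-in for a token database: one counter per distinct token."""

    def __init__(self):
        self.spam_counts = Counter()
        self.nspam = 0

    def learn_spam(self, message):
        """Record one more spam message and bump the counter of each token."""
        self.nspam += 1
        for token in tokenize(message):
            self.spam_counts[token] += 1  # an existing entry is updated in place


if __name__ == "__main__":
    store = ToyTokenStore()
    store.learn_spam("Subject: viagra viagra")
    entries_before = len(store.spam_counts)
    store.learn_spam("Subject: viagra viagra")
    print("nspam:", store.nspam)
    print("entries before/after:", entries_before, len(store.spam_counts))
```

In this toy model the message counter goes up with every learned mail while the number of stored entries stays the same, which is one possible reason for a changed modification time with an unchanged size. On-disk hash databases may additionally allocate space in fixed-size blocks, so even new entries would not necessarily change the file size right away.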
Re: SA-learn (spamassassin)
To Benny Pedersen: I understand your explanation about the growth of the SpamAssassin database. Your example with md5 is clear. OK, thank you very much!

To Karsten Bräckelmann-2: I want to apologize for my approach. I use the Ubuntu forum and other forums because I am desperate: my homework was to install, configure and run an antispam setup (spamassassin, ClamAV, Clamsmtp, razor, postfix). Now I am under pressure because tomorrow I have to deliver my solution to my boss... I must explain to him how it works and so on.

the number of spam exceeding the bayes_min_spam_num value does not activate Bayes *learn*ing. It means that Bayes will classify mail -- based on what it learned before. it keeps track of *tokens*, and the number they have been seen in ham or spam.

Your explanation is confusing to me, because you claim that the min_spam_num value means that Bayes will classify mail -- based on what it learned before. My min_spam_num value is 1. I get the first mail, Subject: viagra; body: viagra. I use sa-learn --spam on this mail. I get a new mail: Subject: viagra; body: viagra. What will Bayes do, according to you? Keep in mind your words:

The bayes_min_(ham|spam)_num values ONLY control, how many messages Bayes needs to have learned, before it should start classifying mail.

= my Bayes can classify mail (because the min_spam_num value is 1, so the condition is met). What now? Will my new mail be marked as spam? Or will it get a higher score...?

And again, 1 is not a sane number.

I am trying to explain to you that this is only homework. Why the number 1? Because I want to see with my own eyes how Bayes works. I don't have time to find a lot of real spam (I know the number must be bigger, about 1000; that's OK, I knew it).
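The two quoted points (token counts, and thresholds that only gate classification) can be sketched in a few lines of Python. This is a deliberately simplified toy, not SpamAssassin's algorithm: the class ToyBayes, the whitespace tokenizer and the plain averaging of per-token ratios are assumptions made for this illustration, and the real combining method is more elaborate.

```python
from collections import Counter


def tokenize(message):
    """Crude tokenizer: lowercase words, each counted once per message."""
    return set(message.lower().split())


class ToyBayes:
    """Toy token-count classifier; NOT SpamAssassin's actual algorithm."""

    def __init__(self, min_spam_num=200, min_ham_num=200):
        self.min_spam_num = min_spam_num
        self.min_ham_num = min_ham_num
        self.nspam = 0
        self.nham = 0
        self.spam_tokens = Counter()
        self.ham_tokens = Counter()

    def learn(self, message, is_spam):
        """Count each token of the message as seen in spam or in ham."""
        if is_spam:
            self.nspam += 1
            self.spam_tokens.update(tokenize(message))
        else:
            self.nham += 1
            self.ham_tokens.update(tokenize(message))

    def can_classify(self):
        """Both thresholds must be met before any message is scored."""
        return self.nspam >= self.min_spam_num and self.nham >= self.min_ham_num

    def spam_probability(self, message):
        """Average per-token spam ratio, or None while classification is off."""
        if not self.can_classify():
            return None
        ratios = []
        for token in tokenize(message):
            spam_freq = self.spam_tokens[token] / self.nspam if self.nspam else 0.0
            ham_freq = self.ham_tokens[token] / self.nham if self.nham else 0.0
            if spam_freq + ham_freq > 0:  # token was seen during learning
                ratios.append(spam_freq / (spam_freq + ham_freq))
        return sum(ratios) / len(ratios) if ratios else 0.5


if __name__ == "__main__":
    bayes = ToyBayes(min_spam_num=1, min_ham_num=1)
    bayes.learn("viagra viagra", is_spam=True)
    print("after one spam only:", bayes.spam_probability("viagra viagra"))
    bayes.learn("meeting notes attached", is_spam=False)
    print("after one ham too:", bayes.spam_probability("viagra viagra"))
```

In the toy, learning always updates the counts, while scoring stays switched off until both the spam and the ham threshold are reached. Once it is on, the result depends on the tokens a message shares with what was learned, not on the message being recognised as a whole; in SpamAssassin the Bayes result is additionally only one rule (such as BAYES_50) among the tests that make up the final score.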
SA-learn (spamassassin)
Hello, I found out the following: my spamd daemon is running as root. But in master.cf (the Postfix configuration file) I have the following lines:

# Postfix master process configuration file. For details on the format
# of the file, see the master(5) manual page (command: man 5 master).
#
# ==========================================================================
# service type  private unpriv  chroot  wakeup  maxproc command + args
#               (yes)   (yes)   (yes)   (never) (100)
# ==========================================================================
smtp      inet  n       -       n       -       -       smtpd
  -o content_filter=spamfilter:dummy
# ==========================================================================
# Interfaces to non-Postfix software. Be sure to examine the manual
# pages of the non-Postfix software to find out what options it wants.
#
# Many of the following services use the Postfix pipe(8) delivery
# agent. See the pipe(8) man page for information about ${recipient}
# and other message envelope options.
# ==========================================================================
spamfilter unix -       n       n       -       -       pipe
  flags=Rq user=spamfilter argv=/usr/local/bin/spamfilter -f ${sender} -- ${recipient}

spamfilter is the user for SpamAssassin (spamd), so it is strange to me that spamd is running as root. I configured master.cf according to h-t-t-p://onetforum.com/fourm/viewtopic.php?p=27 (Kalinga's Community Support Forum, View topic - Integrating Spam Assassin with Postfix; replace h-t-t-p with http). It is recommended by the original SpamAssassin web pages.

In local.cf I have: bayes_path /home/spamfilter/.spamassassin/bayes.

And now my first problem: when I send a mail (for example at 21:00) that SpamAssassin marks with autolearn=spam and I look into /home/spamfilter/.spamassassin/, I can see that the files bayes_toks and bayes_seen were modified at 21:00, but their size did not change. How is that possible? When SpamAssassin changes the files, their size should increase... When I run sa-learn --dump magic, I can see that the nspam row increased by 1. This confirms that autolearn works (but the database does not grow).

My second problem: I get a mail marked autolearn=ham. I take the mail and run the following command: sa-learn --spam --file mail (at 21:55). When I run sa-learn --dump magic, I can see that nspam increased by 1, which is OK. But when I look into /home/spamfilter/.spamassassin, I can see that the database files were modified but their size did not change. Is that normal???

And the last problem: when I get a mail marked autolearn=ham, I run sa-learn --spam --file mail on it. When I get the same mail again, SpamAssassin again marks it autolearn=ham. How is that possible when I taught Bayes by hand (sa-learn --spam --file mail) that this mail is spam? I have explicitly set bayes_min_spam_num 1 in local.cf. As I understand it, this means that one mail is sufficient for Bayes to learn. But it doesn't work.

Thanks for any advice (I need it urgently). Sorry for my terrible English.
Re: Razor, spamassassin - network test
I tried it without --lint, just spamassassin --lint -D razor2, and the command line freezes (doesn't work). When I use spamassassin -t -D razor2 /tmp/spam, I don't get the hash and so on, only the content analysis details, the Bayes classification and so on. I expected messages like:

debug: Razor is available
debug: Razor Agents 1.20, protocol version 2.
debug: Read server list from /home/jgb/.razor.lst
debug: 72636 seconds before closest server discovery
debug: Closest server is 209.204.62.150
debug: Connecting to 209.204.62.150...
debug: Connection established
debug: Signature: 48e74b8496877ba45072b201b41eebed7038186b
debug: Server version: 1.11, protocol version 2
debug: Server response: Negative 48e74b8496877ba45072b201b41eebed7038186b
debug: Message 1 NOT found in the catalogue

Can you give me the exact command for using Razor? I want to test the mail: create the hash, send it to the server, and get the answer (spam or ham).