Re: Bayes_vars records on MySQL not created automatically
Thanks for your answer, Michael.

Yes, you are right: running sa-learn --sync as one of the SA users creates the proper record in the bayes_vars table. So I guess the only problem was that this user had not yet received enough ham/spam email.

Matteo

From: Michael Parker par...@herk.net
To: Matteo Dessalvi mte...@yahoo.it
Cc: users@spamassassin.apache.org
Sent: Wednesday, 8 May 2013 18:43
Subject: Re: Bayes_vars records on MySQL not created automatically

On May 8, 2013, at 8:06 AM, Matteo Dessalvi mte...@yahoo.it wrote:

> I always thought that SA would be able to operate autonomously and that it will create the proper records in all the tables of the DB. Am I missing something? Is this the designed behavior?

It's been a while since I wrote and looked at the code, but I'm pretty sure that the bayes_vars entry won't be created until you learn something as that user. Try doing an sa-learn or an auto-learn for that user and see what happens.

If memory serves, the behavior was deliberate so that you wouldn't get hundreds of entries in bayes_vars when messages are checked for users who may not be real.

Michael
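Editor's sketch of the behavior Michael describes, as a toy model rather than SpamAssassin's actual SQL code: a read-only lookup during scanning never creates a row, and only learning does. The two-counter table, the function names and the in-memory SQLite database are assumptions made for illustration; the real bayes_vars table has more columns.

```python
import sqlite3

def open_db():
    """In-memory stand-in for the Bayes SQL store (simplified, hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE bayes_vars ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " username TEXT UNIQUE NOT NULL,"
        " spam_count INTEGER NOT NULL DEFAULT 0,"
        " ham_count INTEGER NOT NULL DEFAULT 0)"
    )
    return db

def scan(db, username):
    """Read-only lookup used while checking mail: never creates a row."""
    return db.execute(
        "SELECT spam_count, ham_count FROM bayes_vars WHERE username = ?",
        (username,),
    ).fetchone()  # None means "no Bayes data yet for this user"

def learn(db, username, is_spam):
    """Learning is the only path that creates the per-user row."""
    db.execute("INSERT OR IGNORE INTO bayes_vars (username) VALUES (?)", (username,))
    column = "spam_count" if is_spam else "ham_count"  # fixed set, safe to interpolate
    db.execute(
        f"UPDATE bayes_vars SET {column} = {column} + 1 WHERE username = ?",
        (username,),
    )

db = open_db()
scan(db, "bogus-user")             # checking mail for an unknown user adds nothing
learn(db, "matteo", is_spam=True)  # the first learn creates the record
rows = db.execute("SELECT username, spam_count, ham_count FROM bayes_vars").fetchall()
print(rows)
```

With this split, mail checked for addresses that do not correspond to real users leaves the table untouched, which is the motivation given above.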
Re: autolearn_discriminator callback not getting called.
So I was able to write a plugin that overrides the AWL check_from_in_auto_whitelist() eval rule. Thanks for the help, Karsten.

= /etc/spamassassin/25_AwlIgnore.cf

## AWL ignore address types (periods in names are not supported)
## @see AwlIgnore.pm
awl_ignore_from postmaster mailer-daemon

## Overridden AWL params
#
header AWL eval:awl_ignore_check_from_in_auto_whitelist()
describe AWL From: address is in the auto white-list
tflags AWL userconf noautolearn
priority AWL 1000

= /etc/spamassassin/init_AwlIgnore.pre

## Enable the AWL Ignore plugin
loadplugin Mail::SpamAssassin::Plugin::AwlIgnore

= AwlIgnore.pm

package Mail::SpamAssassin::Plugin::AwlIgnore;
#
# This plugin overrides the AWL check_from_in_auto_whitelist() method in order
# to keep specific types of addresses out of the whitelist database.
# For example, don't add addresses like postmas...@example.com to the AWL database.
#
# To activate this plugin,
# 1) Enable the Mail::SpamAssassin::Plugin::AWL plugin
# 2) Enable this plugin, e.g. loadplugin Mail::SpamAssassin::Plugin::AwlIgnore
#    (check /etc/spamassassin/init_AwlIgnore.pre)
# 3) Update your config. Check 25_AwlIgnore.cf. It should look something like this:
#
# ## AWL ignore address types (periods in names are not supported)
# ## @see AwlIgnore.pm
# awl_ignore_from postmaster mailer-daemon
#
# ## Overridden AWL params
# #
# header AWL eval:awl_ignore_check_from_in_auto_whitelist()
# describe AWL From: address is in the auto white-list
# tflags AWL userconf noautolearn
# priority AWL 1000
#
#
# @todo - support ignoring local addresses w/ . in them, i.e. user.name...@example.com

use Mail::SpamAssassin::Plugin;
use Mail::SpamAssassin::Logger;   # provides dbg(), which is called below
use strict;
use vars qw(@ISA);
@ISA = qw(Mail::SpamAssassin::Plugin);

# constructor: register the eval rule
sub new {
  my $class = shift;
  my $mailsaobject = shift;

  # some boilerplate...
  $class = ref($class) || $class;
  my $self = $class->SUPER::new($mailsaobject);
  bless ($self, $class);

  $self->set_config($mailsaobject->{conf});
  $self->register_eval_rule ('awl_ignore_check_from_in_auto_whitelist');
  return $self;
}

#
# Load params from config
#
sub set_config {
  my($self, $conf) = @_;
  my @cmds;

=item awl_ignore_from

Ignore address types from going into the AWL database.

=cut

  push (@cmds, {
    setting => 'awl_ignore_from',
    type => $Mail::SpamAssassin::Conf::CONF_TYPE_ADDRLIST
  });

  $conf->{parser}->register_commands(\@cmds);
}

#
# Replace check_from_in_auto_whitelist()
#
sub awl_ignore_check_from_in_auto_whitelist {
  my ($self, $pms) = @_;

  return 0 unless ($pms->{conf}->{use_auto_whitelist});

  my $timer = $self->{main}->time_method("total_awl");

  my $from = lc $pms->get('From:addr');
  return 0 unless $from =~ /\S/;

  ## ignore addresses in awl_ignore_from
  foreach (keys %{$pms->{conf}->{awl_ignore_from}}) {
    if ($from =~ /$_\@/) {
      dbg("auto-whitelist: AWL ignoring " . $from);
      return 0;
    }
  }

  # find the earliest usable "originating IP". ignore private nets
  my $origip;
  foreach my $rly (reverse (@{$pms->{relays_trusted}}, @{$pms->{relays_untrusted}}))
  {
    next if ($rly->{ip_private});
    if ($rly->{ip}) {
      $origip = $rly->{ip}; last;
    }
  }

  my $scores = $pms->{conf}->{scores};
  my $tflags = $pms->{conf}->{tflags};
  my $points = 0;
  my $signedby = $pms->get_tag('DKIMDOMAIN');
  undef $signedby if defined $signedby && $signedby eq '';

  foreach my $test (@{$pms->{test_names_hit}}) {
    # ignore tests with 0 score in this scoreset,
    # or if the test is marked as "noautolearn"
    next if !$scores->{$test};
    next if exists $tflags->{$test} && $tflags->{$test} =~ /\bnoautolearn\b/;
    $points += $scores->{$test};
  }

  my $awlpoints = (sprintf "%0.3f", $points) + 0;

  # Create the AWL object
  my $whitelist;
  eval {
    $whitelist = Mail::SpamAssassin::AutoWhitelist->new($pms->{main});

    my $meanscore;
    { # check
      my $timer = $self->{main}->time_method("check_awl");
      $meanscore = $whitelist->check_address($from, $origip, $signedby);
    }
    my $delta = 0;

    dbg("auto-whitelist: AWL active, pre-score: %s, autolearn score: %s, ".
        "mean: %s, IP: %s, address: %s %s",
        $pms->{score}, $awlpoints,
        !defined $meanscore ? 'undef' : sprintf("%.3f",$meanscore),
        $origip || 'undef',
        $from, $signedby ? "signed by $signedby" : '(not signed)');

    if (defined $meanscore) {
      $delta = $meanscore - $awlpoints;
      $delta *= $pms->{main}->{conf}->{auto_whitelist_factor};

      $pms->set_tag('AWL', sprintf("%2.1f",$delta));
      if (defined $meanscore) {
        $pms->set_tag('AWLMEAN', sprintf("%2.1f", $meanscore));
      }
      $pms->set_tag('AWLCOUNT', sprintf("%2.1f", $whitelist->count()));
      $pms->set_tag('AWLPRESCORE', sprintf("%2.1f",
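Editor's sketch of the score arithmetic visible in the listing before it is cut off: the adjustment is the difference between the sender's historical mean and the current message's points, scaled by auto_whitelist_factor. The function name is hypothetical, and the default factor of 0.5 is an assumption here (it is believed to be SpamAssassin's documented default, but check your version).

```python
def awl_adjustment(mean_score, current_points, factor=0.5):
    """Return the AWL score delta, or 0.0 when the sender has no history.

    mean_score     -- historical mean for this sender, or None if unknown
    current_points -- score of the current message before AWL is applied
    factor         -- stands in for auto_whitelist_factor (assumed default 0.5)
    """
    if mean_score is None:
        return 0.0
    return (mean_score - current_points) * factor

# A sender with a low historical mean pulls a high-scoring message down...
low_history = awl_adjustment(2.0, 10.0)
# ...and a sender with a spammy history pushes a clean-looking message up.
high_history = awl_adjustment(10.0, 2.0)
print(low_history, high_history, awl_adjustment(None, 10.0))
```

Addresses listed in awl_ignore_from return before this step, so they neither receive an adjustment nor contribute to the stored mean.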
OT: installing on CentOS 6.4
I'm installing the latest CentOS and would like to have SA on it too. But to my disappointment, it has only SA 3.1.1 and neither Razor nor Pyzor.

What would be the best way to get a reasonably up-to-date SA onto this box?

--
You can rent this space for only $5 a week.
Re: OT: installing on CentOS 6.4
On 5/10/2013 12:22 PM, Jari Fredriksson wrote:
> I'm installing the latest CentOS and would like to have SA on it too. But to my disappointment, it has only SA 3.1.1 and neither Razor nor Pyzor.
>
> What would be the best way to get a reasonably up-to-date SA onto this box?

rpmforge has SA 3.3.2.

http://wiki.centos.org/AdditionalResources/Repositories/RPMForge

--
Bowie
RE: installing on CentOS 6.4
pyzor and perl-Razor-Agent are in EPEL.

Cheers,

Phil

-----Original Message-----
From: Jari Fredriksson [mailto:ja...@iki.fi]
Sent: 10 May 2013 17:22
To: SpamAssassin Users
Subject: OT: installing on CentOS 6.4

I'm installing the latest CentOS and would like to have SA on it too. But to my disappointment, it has only SA 3.1.1 and neither Razor nor Pyzor.

What would be the best way to get a reasonably up-to-date SA onto this box?

--
You can rent this space for only $5 a week.

Hoople Ltd, Registered in England and Wales No. 7556595
Registered office: Plough Lane, Hereford, HR4 OLE

Any opinion expressed in this e-mail or any attached files are those of the individual and not necessarily those of Hoople Ltd. You should be aware that Hoople Ltd. monitors its email service. This e-mail and any attached files are confidential and intended solely for the use of the addressee. This communication may contain material protected by law from being passed on. If you are not the intended recipient and have received this e-mail in error, you are advised that any use, dissemination, forwarding, printing or copying of this e-mail is strictly prohibited. If you have received this e-mail in error please contact the sender immediately and destroy all copies of it.
Re: OT: installing on CentOS 6.4
10.05.2013 19:27, Bowie Bailey wrote:
> On 5/10/2013 12:22 PM, Jari Fredriksson wrote:
>> I'm installing the latest CentOS and would like to have SA on it too. But to my disappointment, it has only SA 3.1.1 and neither Razor nor Pyzor.
>>
>> What would be the best way to get a reasonably up-to-date SA onto this box?
>
> rpmforge has SA 3.3.2.
>
> http://wiki.centos.org/AdditionalResources/Repositories/RPMForge

Installed rpmforge, but it still offers 3.3.1. I guess I have to use CPAN.

--
Q: How many Zen masters does it take to screw in a light bulb?
A: None. The Universe spins the bulb, and the Zen master stays out of the way.
Re: OT: installing on CentOS 6.4
On 5/10/2013 1:09 PM, Jari Fredriksson wrote:
> 10.05.2013 19:27, Bowie Bailey wrote:
>> On 5/10/2013 12:22 PM, Jari Fredriksson wrote:
>>> I'm installing the latest CentOS and would like to have SA on it too. But to my disappointment, it has only SA 3.1.1 and neither Razor nor Pyzor.
>>>
>>> What would be the best way to get a reasonably up-to-date SA onto this box?
>>
>> rpmforge has SA 3.3.2.
>>
>> http://wiki.centos.org/AdditionalResources/Repositories/RPMForge
>
> Installed rpmforge, but it still offers 3.3.1. I guess I have to use CPAN.

That's strange. Now that I actually try to install it myself, I see the same thing (my home server uses this repo, but I haven't updated in a while). But if you follow the link for the list of packages on the wiki page, it lists 3.3.2 for CentOS 4, 5, and 6. And, even stranger, if I browse directly to the repo URL, I can't find spamassassin at all.

Maybe you should ask on their list.

http://lists.repoforge.org/mailman/listinfo/users

--
Bowie
RE: Default Bayes Database
You all are keeping me sane and grounded as I deal with the Powers That Be here trying to set this up. It's good to know that I'm not wrong (I agree with everything everyone has said, and pointed out from the beginning that a default database would be awful).

And this:

"If he insists on starting with a pre-populated Bayes database, he sure knows why. Other than 'I'm the boss, I want'."

... is exactly right too.

We're implementing it locally with auto-learning enabled this weekend (oh, yeah, the boss didn't want auto-learning enabled either..). So here goes!!

Thanks for all your help.

-----Original Message-----
From: Karsten Bräckelmann [mailto:guent...@rudersport.de]
Sent: Wednesday, May 08, 2013 8:18 PM
To: users@spamassassin.apache.org
Subject: Re: Default Bayes Database

On Wed, 2013-05-08 at 14:09 -0400, Andrew Talbot wrote:
> Well, I certainly hope someone offers to help! Heh!

I am really confident Alex didn't mean to be rude, nor that he actually hopes no one will help you. Quite the contrary...

He DID try to help you by explaining why a default Bayes database is a bad idea in the first place. And that was his way of telling you...

> If only to say there is no default database.

That. :)  There is none, and there never has been.

> As we've spoken about off-list, my boss is being very particular about the deployment of Bayes, and it sounds like one of his caveats is that we don't start from a blank database.

I can see how the idea of basing off of some "known to be classified" tokens sounds tempting. However, there is no such token. None. Just try to imagine working in an industry where e.g. Viagra and Cialis are totally legit phrases to use...

Feel free to direct your boss here. If he insists on starting with a pre-populated Bayes database, he sure knows why. Other than "I'm the boss, I want".

Anyway, Andrew, your idea of that whole "blank slate" is inaccurate.

If you import someone else's data, before importing your database has been empty. If you collect some ham and spam for initial training, before training your database has been empty. You even do NOT have to deploy SA prior to that.

I don't know the size of your user base, but it seems it shouldn't be hard to have a few of the users chip in. Get a few of them to collect hand-classified ham and spam for you. Train Bayes with that. After that, deploy SA to your mail processing chain.

There you go! A pre-populated Bayes database, based on YOUR particular ham and spam tokens, before deploying SA in production.

--
char *t="\10pse\0r\0dtu\0.@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4";
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;i<l;i++){ i%8? c<<=1:
(c=*++x); c&128 && (s+=h); if (!(h>>=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
Re: Default Bayes Database
On Wed, 08 May 2013 19:32:26 +0200 Axb axb.li...@gmail.com wrote:

> - your HAM is somebody else's SPAM

Do you have evidence for that?

The reason I ask is that one of the main features of our (commercial) anti-spam solution is a very large Bayes database. Once a night, we aggregate all the tokens from votes from all of our customers and push out a Bayes database containing tokens for the last 21 days from about 3.2 million spam and 3.4 million ham messages.

It works really well and we find that even our highly diverse customer database agrees substantially on spam vs. ham.

There was a USENIX paper on this topic quite a while ago:

http://static.usenix.org/event/lisa04/tech/blosser/blosser_html/

It won the best paper award for LISA '04.

> - A decent Bayes DB is highly dynamic and yesterday's tokens from someone else's traffic will be useless to your traffic, today.

Not true. Bayes data remains relevant for several days, if not weeks or months. Obviously, our system *also* includes individual Bayes databases that adapt to specific users' mail flows and update more than once a day, but even the daily-updated central database is surprisingly good. (It seems that a large sample size is the key.)

Karsten Bräckelmann wrote:

> Just try to imagine working in an industry where e.g. Viagra and Cialis are totally legit phrases to use...

Actually, we find that is not a problem because spammers use things like Vi@gr@ and C1AL1S that are far more damning than the unmodified words themselves. Also, our Bayes implementation uses word pairs as well as individual words, which improves its selectivity.

Anyway, my main point is this: Don't dismiss a shared Bayes database without supplying evidence that it's a bad idea. :)

Regards,

David.
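Editor's sketch of what "word pairs as well as individual words" can look like in a tokenizer. This is a guess at the general idea only; the message does not describe the commercial product's tokenizer, and the function name and whitespace-based splitting are assumptions.

```python
import re

def tokens_with_pairs(text):
    """Return single-word tokens followed by adjacent word-pair tokens."""
    words = re.findall(r"\S+", text.lower())
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs

tokens = tokens_with_pairs("Cheap Vi@gr@ here")
print(tokens)
```

An obfuscated spelling such as "vi@gr@" becomes its own token, distinct from the plain word, and the pair tokens add context that a single word lacks. Both points match the argument made above.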
Re: Default Bayes Database
On Fri, 2013-05-10 at 15:51 -0400, David F. Skoll wrote:
> On Wed, 08 May 2013 19:32:26 +0200 Axb axb.li...@gmail.com wrote:
>
> > - your HAM is somebody else's SPAM
>
> Do you have evidence for that?

Evidence... examples, rather.

I happened to be the lucky recipient of specific spam campaigns in languages I do not speak. "Campaign" referring to quite a few samples during a specific, relatively short time period. This definitely happened with French, Spanish, and Turkish. Odds are high for any word in those languages being on the seriously spammy side. Unlike for anyone actually speaking these languages...

Being easily associated with particular water sports is like a magnet for getting spammed with totally unrelated water sports. One style is good, all others are bad-ish. That would be the same for other folks, though with different signs.

I do receive quite specific campaigns, plain text, no obfuscation, offering private health insurance ("Private Krankenversicherung" in German). That is a totally valid phrase. Unlike English, German tends to concatenate words to form specifics -- "Krankenversicherung" is pretty much a word-by-word translation of health insurance. This makes the word more rare; "health" on its own in comparison hardly gives a hint. And the totally legit word is spammy for me, because I usually do not talk about that topic in mail. My next door neighbor probably would disagree...

"Your ham is someone else's spam" on a different level: There are quite a few reports in bugzilla where an obfuscation pattern matches a legit word in non-English languages.

Accents are good for obfuscation. But accents also are entirely legit.

Paypal. And them notifying their customers about changes in the terms of use. And actually sending out the full terms of use in the same mail. In this case, again, German -- but they managed to score a whopping 12.2 once for me. Yes, of course, BAYES_99. Plus some other shady-business indicating rules, triggered various times: FUZZY_CREDIT, TRACKER_ID, URI_DOT_INFO. Oh, lovely. That 2009 sample has FUZZY_VLIUM and FRT_VALIUMx.

> Karsten Bräckelmann wrote:
>
> > Just try to imagine working in an industry where e.g. Viagra and Cialis are totally legit phrases to use...
>
> Actually, we find that is not a problem because spammers use things like Vi@gr@ and C1AL1S that are far more damning than the unmodified words themselves.

That was one quick example. See above for a similar scenario not involving medication, but sports.

> Also, our Bayes implementation uses word pairs as well as individual words which improves its selectivity.

Good for you, but that is irrelevant to the discussion at hand, which is about the Bayes engine in SA.
Re: Default Bayes Database
David F. Skoll wrote:
> Axb wrote:
> > - your HAM is somebody else's SPAM
>
> Do you have evidence for that?
>
> The reason I ask is that one of the main features of our (commercial) anti-spam solution is a very large Bayes database. Once a night, we aggregate all the tokens from votes from all of our customers and push out a Bayes database containing tokens for the last 21 days from about 3.2 million spam and 3.4 million ham messages.
>
> It works really well and we find that even our highly diverse customer database agrees substantially on spam vs. ham.

The weasel wording "agrees substantially" is telling. If it isn't 100% with no false positives then at least one of those messages does not agree. That would be the evidence requested.

I am not saying that your technique isn't useful. It is very pragmatic. I am sure it is very effective. I would probably do that myself. But it isn't 100%.

And would you suggest distributing your well-averaged database to people who install SpamAssassin so as to seed their Bayes? How would that be distributed for users to use when installing SpamAssassin? And if you did, how would this large corpus of learned symbols affect the smaller number of messages the user trains with when they train-on-error? Would it swamp it by the much larger numbers? It is trouble.

I think having users start with a blank slate and then start learning from their own messages makes the most sense. And users can always learn from their current mailbox of past messages, so it isn't much hardship.

Bob
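Editor's sketch of the swamping concern Bob raises, using a deliberately naive per-token estimate (spam hits divided by total hits). SpamAssassin's real Bayes math is more involved; the function names and all counts below are made up for illustration.

```python
def spam_probability(spam_hits, ham_hits):
    """Naive per-token spam estimate; 0.5 when the token has never been seen."""
    total = spam_hits + ham_hits
    return 0.5 if total == 0 else spam_hits / total

def after_local_training(seed_spam, seed_ham, local_spam, local_ham):
    """Estimate for one token after local training on top of seeded counts."""
    return spam_probability(seed_spam + local_spam, seed_ham + local_ham)

# A user trains 20 legitimate messages containing the token (train-on-error).
blank = after_local_training(0, 0, 0, 20)        # starting from an empty database
seeded = after_local_training(5000, 50, 0, 20)   # starting from a large shipped database
print(blank, round(seeded, 2))
```

Starting from a blank database, 20 corrections are decisive for the token. On top of 5000 seeded spam hits, the same 20 corrections barely move the estimate, which is why per-user data or some weighting scheme would be needed alongside a large shared database.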
Re: Default Bayes Database
On Fri, 10 May 2013 15:34:13 -0600 Bob Proulx b...@proulx.com wrote:

> The weasel words agrees substantially is telling. If it isn't 100% with no false positives then at least one of those messages does not agree. That would be the evidence requested.
>
> I am not saying that your technique isn't useful. It is very pragmatic. I am sure it is very effective. I would probably do that myself. But it isn't 100%.

Nothing is 100%. Even personal Bayes databases are not 100%.

> And would you suggest distributing your well-averaged database to people who install SpamAssassin to as to seed their Bayes?

We have a distribution mechanism built into our software.

> I think having users start with a blank slate and then start learning from their own messages makes the most sense.

Maybe. But I know that our (commercial) customers expect high catch rates out of the box, and we get that with our shared Bayes database.

> And users can always learn from their current mailbox of past messages so it isn't much hardship.

Right; pretend you're a salesperson trying to sell an anti-spam product. "Oh, you just have to go through your old mailbox and classify a few hundred messages by hand... then the system will work great!"

No sale.

Regards,

David.
Re: Default Bayes Database
On Fri, 10 May 2013 23:14:36 +0200 Karsten Bräckelmann guent...@rudersport.de wrote:

> I happened to be the lucky recipient of specific spam campaigns in languages I do not speak. Campaign referring to quite a few samples during a specific, relatively short time period. This definitely happened with French, Spanish, and Turkish. Odds are high for any word in those languages being on the seriously spammy side. Unlike for anyone actually speaking these languages...

We (probably) have a much larger sample population, so this tends not to be as much of a problem for us.

> I do receive quite specific campaigns, plain text, no obfuscation, offering private health insurance (Private Krankenversicherung in German). That is a totally valid phrase. Unlike English, German tends to concatenate words to form specifics -- Krankenversicherung is pretty much a word-by-word translation of health insurance. This makes the word more rare, health on its own in comparison hardly gives a hint. And the totally legit word is spammy for me, because I usually do not talk about that topic in mail. My next door neighbor probably would disagree...

Again, the key is a large sample size.

> Your ham is someone else's spam on a different level: There are quite a few reports in bugzilla, where an obfuscation pattern matches a legit word in non-English languages.

These are edge cases that are pretty easily handled with personal Bayes databases or whitelisting if the system keeps getting it wrong.

> Accents are good for obfuscation. But accents also are entirely legit.

And we can tell which is which, based on a large sample size.

> Paypal. And them notifying their customers about changes in the terms of use. And actually sending out the full terms of use in the same mail. In this case, again, German -- but they managed to score a whopping 12.2 once for me. Yes, of course, BAYES_99.

Was this with your personal Bayes data? Even that can be wrong sometimes...

Regards,

David.
Bayes Data Base
Hello,

I was curious whether somebody out there publishes a SpamAssassin Bayes spam/ham database that someone could buy or subscribe to? If so, please provide details if known.

Thanks,

Rick
Re: [SA-Users] Re: OT: installing on CentOS 6.4
On Fri, May 10, 2013 at 02:42:21PM -0400, Bowie Bailey wrote:
> That's strange. Now that I actually try to install it myself, I see the same thing (my home server uses this repo, but I haven't updated in a while). But if you follow the link for the list of packages on the wiki page, it lists 3.3.2 for Centos 4, 5, and 6. And, even stranger, if I browse directly to the repo url, I can't find spamassassin at all.
>
> Maybe you should ask on their list.
>
> http://lists.repoforge.org/mailman/listinfo/users

Because the rpmforge package stomps on the SA package in CentOS base, it is in the rpmforge-extras repo, not the mainline rpmforge repo.

Add

    exclude=spamassassin

to /etc/yum.repos.d/CentOS-Base.repo to prevent it from being installed/updated from base/updates, and then just install it via yum with

    yum --enablerepo=rpmforge-extras install spamassassin

John

--
The basic problem can be summed up with four numbers: 0.26% of Americans give more than $200 in a congressional election; 0.05% max out; 0.01% give more than $10,000; .63% -- 196 Americans -- have given more than 80% of the superPAC money spent so far in this election.

-- Larry Lessig: The corruption of the American political system, Durham, NC, posted 13 June 2012 by Melanie Chernoff
Re: autolearn_discriminator callback not getting called.
On Fri, 2013-05-10 at 08:57 -0700, psychobyte wrote:
> So i was able to write a plugin that overrides the AWL check_from_in_auto_whitelist() eval rule. Thanks for the help Karsten.

You're welcome. Was actually fun digging through the code.

> # awl_ignore_from postmaster mailer-daemon

Configuration option, nice, yeah.

> ## Overridden AWL params
> #
> # header AWL eval:awl_ignore_check_from_in_auto_whitelist()
> # describe AWL From: address is in the auto white-list
> # tflags AWL userconf noautolearn
> # priority AWL 1000

> # Replace check_from_in_auto_whitelist()
> #
> sub awl_ignore_check_from_in_auto_whitelist {
>   my ($self, $pms) = @_;
[...]
>   ## ignore addresses in awl_ignore_from
>   foreach (keys %{$pms->{conf}->{awl_ignore_from}}) {
>     if ($from =~ /$_\@/) {
>       dbg("auto-whitelist: AWL ignoring " . $from);
>       return 0;
>     }
>   }
>
>   # find the earliest usable "originating IP". ignore private nets
>   my $origip;

Whoa, is this... Yes, a copy of the check_from_in_auto_whitelist() function from the AWL plugin. Code duplication.

Any reason you didn't just hack the AWL.pm code? All you would need is the contents of your plugin's sub set_config, and the single foreach loop doing the actual work. Slightly more than 10 lines, including your POD. (Yay for that, btw!)

No overriding of the existing AWL rule definition, just a single conf line. No naughty code duplication.
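Editor's sketch of one detail in the quoted foreach loop: the pattern is not anchored, so it matches anywhere before the "@". The two hypothetical Python functions below only approximate the Perl test (the Perl version also leaves regex metacharacters unescaped); they are an illustration, not a proposed patch.

```python
import re

def is_ignored_unanchored(from_addr, ignored):
    """Approximates the posted test: pattern followed by '@', found anywhere."""
    addr = from_addr.lower()
    return any(re.search(re.escape(p.lower()) + "@", addr) for p in ignored)

def is_ignored_exact(from_addr, ignored):
    """Stricter alternative: the whole local part must equal an ignored name."""
    local_part, _, _domain = from_addr.lower().partition("@")
    return local_part in {p.lower() for p in ignored}

ignored = ["postmaster", "mailer-daemon"]
for addr in ["postmaster@example.com", "notpostmaster@example.com"]:
    print(addr, is_ignored_unanchored(addr, ignored), is_ignored_exact(addr, ignored))
```

Both functions skip postmaster@example.com, but only the unanchored one also skips notpostmaster@example.com. Whether that looser match is wanted depends on what "address types" is meant to cover.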
Re: Default Bayes Database
On Fri, 2013-05-10 at 17:58 -0400, David F. Skoll wrote:
> On Fri, 10 May 2013 23:14:36 +0200 Karsten Bräckelmann wrote:

> We (probably) have a much larger sample population, so this tends not to be as much of a problem for us.

This thread is about a default Bayes database, suitable for distribution. Not a humongous database with millions of tokens. It also would have to be usable on small sites, as well as company wide. Train on error should not be overruled by the sheer number of tokens and occurrences of them.

> Again, the key is a large sample size.

Yup. In the outlined case, the large sample size would most likely push that token towards no man's land. It is, after all, a totally valid and actually used word.

You asked for cases of "your ham is someone else's spam". That is precisely one such case. Your repeated counter-argument / solution of a large sample size translates to "neither ham nor spam". Not helpful.

We're talking Bayes, thus in tokens. Spam for me, ham for my neighbor (yes, literally).

> These are edge cases that are pretty easily handled with personal Bayes databases or whitelisting if the system keeps getting it wrong.

Exactly. Personal Bayes databases. The opposite of a default database.

> > Paypal. And them notifying their customers about changes in the terms of use. And actually sending out the full terms of use in the same mail. In this case, again, German -- but they managed to score a whopping 12.2 once for me. Yes, of course, BAYES_99.
>
> Was this with your personal Bayes data? Even that can be wrong sometimes...

Yes, it was. And yes, it can. :)
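Editor's sketch of the "no man's land" point: a token that is pure spam for one user and pure ham for another ends up neutral once their counts are pooled. The estimate is deliberately naive and the counts are invented; this is not SpamAssassin's actual Bayes formula.

```python
def token_probability(spam_hits, ham_hits):
    """Naive per-token spam estimate from raw hit counts."""
    return spam_hits / (spam_hits + ham_hits)

def pooled_probability(per_user_counts):
    """Estimate after summing (spam_hits, ham_hits) pairs across users."""
    spam = sum(s for s, _ in per_user_counts)
    ham = sum(h for _, h in per_user_counts)
    return token_probability(spam, ham)

# Hypothetical counts for the token "krankenversicherung":
recipient = (40, 0)   # only ever seen in spam
neighbor = (0, 40)    # only ever seen in legitimate mail

personal = [token_probability(*recipient), token_probability(*neighbor)]
pooled = pooled_probability([recipient, neighbor])
print(personal, pooled)
```

Each personal database treats the token as a strong signal, in opposite directions. The pooled estimate of 0.5 tells neither user anything, which is the "neither ham nor spam" outcome described above.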
Re: Bayes Data Base
On Fri, 2013-05-10 at 16:02 -0600, Rick Cone wrote:
> I was curious if somebody out there publishes a Spamassassin Bayes SPAM/HAM data base that someone could buy or subscribe to? If so, please provide details if known.

Wow, I'm floored.

Reading the last 3 days' worth of posts might get you a pretty clear picture and answer to your question.
Re: Default Bayes Database
On Fri, 10 May 2013, David F. Skoll wrote:

> Anyway, my main point is this: Don't dismiss a shared Bayes database without supplying evidence that it's a bad idea. :)

Care to share your database? :)

--
 John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
 From the Liberty perspective, it doesn't matter if it's a jackboot or a Birkenstock smashing your face. -- Robb Allen
---
 344 days since the first successful private support mission to ISS (SpaceX)
Re: Default Bayes Database
David F. Skoll wrote:
> Bob Proulx wrote:
> > And would you suggest distributing your well-averaged database to people who install SpamAssassin to as to seed their Bayes?
>
> We have a distribution mechanism built into our software.
>
> > I think having users start with a blank slate and then start learning from their own messages makes the most sense.
>
> Maybe. But I know that our (commercial) customers expect high catch rates out of the box, and we get that with our shared Bayes database.
>
> > And users can always learn from their current mailbox of past messages so it isn't much hardship.
>
> Right; pretend you're a salesperson trying to sell an anti-spam product. "Oh, you just have to go through your old mailbox and classify a few hundred messages by hand... then the system will work great!"
>
> No sale.

Your database sounds just simply wonderful. Where can I download this database so that I can start using it?

Bob
Re: [SA-Users] Re: OT: installing on CentOS 6.4
11.05.2013 01:03, John R. Dennison wrote:
> On Fri, May 10, 2013 at 02:42:21PM -0400, Bowie Bailey wrote:
>> That's strange. Now that I actually try to install it myself, I see the same thing (my home server uses this repo, but I haven't updated in a while). But if you follow the link for the list of packages on the wiki page, it lists 3.3.2 for Centos 4, 5, and 6. And, even stranger, if I browse directly to the repo url, I can't find spamassassin at all.
>>
>> Maybe you should ask on their list.
>>
>> http://lists.repoforge.org/mailman/listinfo/users
>
> Due to the fact that the rpmforge package stomps on the SA package in CentOS base it is in the rpmforge-extras repo not the mainline rpmforge repo.
>
> Add exclude=spamassassin to /etc/yum.repos.d/CentOS-Base.repo to prevent it from being installed/updated from base/updates and then just install it via yum with yum --enablerepo=rpmforge-extras install spamassassin.

Thank You Very Much :) This works.

--
Its name is Public Opinion. It is held in reverence. It settles everything. Some think it is the voice of God. -- Mark Twain
Re: Default Bayes Database
On Fri, 2013-05-10 at 17:49 -0400, David F. Skoll wrote:
> Right; pretend you're a salesperson trying to sell an anti-spam product. "Oh, you just have to go through your old mailbox and classify a few hundred messages by hand... then the system will work great!"
>
> No sale.

Most likely, and no one argued against it.

Last time I checked, SA was a project aiming at the admin type of guy, not the pointy-haired boss who wants to simply buy a device and get over the issue, nor the end-user. Also, SA itself is not for sale...

The OP, Andrew, clearly is the admin type. I'd guess the mere fact that he's actively discussing, and turned to the SA users list in the first place, is a telltale sign.

On the topic of Bayes: No one argued against a shared database. As a matter of fact, SA does deliberately support site-wide shared Bayes, and offers documentation.

However, shared != default
Re: installing on CentOS 6.4
10.05.2013 19:31, Randal, Phil wrote:
> pyzor and perl-Razor-Agent are in epel.
>
> Cheers,
>
> Phil

EPEL is a must, I wonder how I forgot it. Now I have it, and those are running. Thanks!

> -----Original Message-----
> From: Jari Fredriksson [mailto:ja...@iki.fi]
> Sent: 10 May 2013 17:22
> To: SpamAssassin Users
> Subject: OT: installing on CentOS 6.4
>
> I'm installing the latest CentOS and would like to have SA on it too. But to my disappointment, it has only SA 3.1.1 and neither Razor nor Pyzor.
>
> What would be the best way to get a reasonably up-to-date SA onto this box?
>
> --
> You can rent this space for only $5 a week.

--
Good day for a change of scene. Repaper the bedroom wall.
Re: Default Bayes Database
On Fri, 10 May 2013 15:52:01 -0700 (PDT) John Hardin jhar...@impsec.org wrote:

> > Anyway, my main point is this: Don't dismiss a shared Bayes database without supplying evidence that it's a bad idea. :)
>
> Care to share your database? :)

Ah... hmm. :)

I would be happy to share it with SA developers who might be contemplating making some sort of shared Bayes feature in SA and who would only use the database for research purposes. But I can't make it generally available.

If anyone is really interested, please contact me off-list.

Regards,

David.