Re: Blocking Malformed "From" Headers
The SMTP protocol RFCs are pretty clear: anything in angle brackets ('<' & '>') takes priority in defining an address field. So technically that's a legit local address, and sendmail is doing default MSA processing on it (IE treating it as a bare username that needs the local hostname added).

Is this sendmail instance just an incoming MTA, or is it also used as an outgoing MSA for your users? If it's just an incoming MTA (IE your users have another instance they're using for outgoing MSA service), then just turn off the MSA feature for that specific sendmail instance to stop that processing:

FEATURE(`no_default_msa')

On Wed, 17 Jul 2024, Kirk Ismay wrote:

> I have a spammer using a malformed From header, as follows:
>
> From: sha...@marketcrank.com
>
> The envelope from is: direcc...@delher.com.mx, and I've set up blocks for that address.
>
> Sendmail is munging the From: header to change to , so it ends up looking like a local address to my users.
>
> How do I detect similar mangled From headers in Spamassassin?
>
> Also does anyone know how to prevent Sendmail from rewriting the From header like this? The documentation for confFROM_HEADER is somewhat cryptic:
> https://www.sendmail.org/~ca/email/doc8.12/cf/m4/tweaking_config.html#confFROM_HEADER
>
> I'd rather it say instead, or reject it entirely.
>
> Thanks, Kirk

--
Dave Funk
University of Iowa College of Engineering
319/335-5751 FAX: 319/384-0549
1256 Seamans Center, 103 S Capitol St.
Sys_admin/Postmaster/cell_admin
Iowa City, IA 52242-1527
#include
Better is not better, 'standard' is better. B{
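For the detection half of the question, the condition described in the reply (an angle-bracket address with no domain part) can be sketched outside SpamAssassin. This is only an illustration in Python, not an SA rule; the helper name and the sample header values are made up for the example.

```python
import re

# Hypothetical helper: capture whatever sits between '<' and '>' in a From value.
ANGLE_ADDR = re.compile(r"<([^<>]*)>")

def angle_addr_lacks_domain(from_value):
    """Return True when the header carries an angle-bracket address without an '@'."""
    match = ANGLE_ADDR.search(from_value)
    if match is None:
        return False  # no angle brackets at all, nothing to flag here
    return "@" not in match.group(1)

# Illustrative header values (assumed, not taken from the original message).
print(angle_addr_lacks_domain('"someone@example.com" <localname>'))
print(angle_addr_lacks_domain("Some One <someone@example.com>"))
```

The first call reports True and the second False. An empty `<>` would also be flagged, so a real rule would probably need to treat the null sender separately.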
Re: whitelist_auth return_path / from
On Wed, 3 Jul 2024, Simon Wilson via users wrote:

> Does whitelist_auth work on From header, or Return-Path?
>
> Reason I ask: I have two emails from “support .at. wasabi.com”. Due to their emails usually triggering KAM rules I have (in /etc/mail/spamassassin/local.cf):
>
> ## Whitelist Wasabi, subject to passing of auth
> whitelist_auth supp...@wasabi.com
>
> [snip..]
>
> The other is not triggering whitelist_auth and is marked as spam due to the KAM rule fails. It has:
>
> Return-Path: ...
> From: Wasabi ...
> Reply-To: supp...@wasabi.com
>
> Despite passing SPF and DKIM, not whitelisted:
>
> X-Spam-Score: 20.212
> X-Spam-Level:
> X-Spam-Status: Yes, score=20.212 tagged_above=-999 required=6.2 tests=[BAYES_00=-1.9, DCC_CHECK=1.1, DCC_REPUT_99_100=1.4, DKIM_INVALID=0.1, DKIM_SIGNED=0.1, HTML_MESSAGE=0.001, KAM_BODY_MARKETINGBL_PCCC=0.001, KAM_BODY_URIBL_PCCC=9, KAM_FROM_URIBL_PCCC=9, KAM_MARKETINGBL_PCCC=1, KAM_REALLYHUGEIMGSRC=0.5, LR_DMARC_PASS=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_KAM_HTML_FONT_INVALID=0.01] autolearn=no autolearn_force=no
>
> [snip]
>
> Thanks. Simon.

You say "passing SPF and DKIM"; however, the SA rules report clearly says:

DKIM_SIGNED=0.1, DKIM_INVALID=0.1

So even though you think it 'passed DKIM', SA clearly does NOT think so. That DKIM_INVALID will prevent whitelist_auth from firing, so you need to investigate what's going wrong there.
Re: Catch a rejected message ?
That depends on the milter you're using to "glue" SA to postfix. IE if you're using a milter (the thing that's triggering that "milter-reject" response), this means that Postfix is passing the messages to the milter, and the milter is passing them to SA-spamd, getting the response, and then feeding back the results of its interpretation of SA's evaluation of the message. That milter-reject status is the milter's response to Postfix.

So you need to look at the capabilities of your milter to customize its response for the particular message(s) in question.

Dave

On Fri, 1 Dec 2023, White, Daniel E. (GSFC-770.0)[AEGIS] via users wrote:

> We are using SpamAssassin 3.4.6-1 with Postfix 3.5.8-4 on RHEL 8
>
> We are seeing occasional blocked messages that say “milter-reject” with a spam score of 8
>
> Is there a way to capture the offending messages to figure out the problem ?
>
> Thanks
Re: Really hard-to-filter spam
On Wed, 2 Aug 2023, Thomas Cameron via users wrote:

> Wow! What a charming response! You must be a LOT of fun at parties, and have lots of friends!

Please don't feed the troll. There's a reason that Reindl is blocked from this list.

> No, I did not get that response. I don't have any of those specific spam to sample, as I have not gotten one today. But the last spam I got that slipped through SA had this score:
>
> X-Spam-Status: No, score=-5.1 required=5.0 tests=BAYES_00,DEAR_SOMETHING, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, HTML_MESSAGE,RCVD_IN_DNSWL_HI,RCVD_IN_MSPIKE_H2,RCVD_IN_PBL, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE shortcircuit=no
>
> So nothing about any tests not working, or queries being rejected. Nothing that looks like misconfiguration on my end. I am not saying there are no misconfigurations on my end, but if there are, it's not super obvious to me.

The fact that you're getting BAYES_00 on that message indicates that Bayes -really- thinks it's ham. Given that you've trained multiple instances of this kind of message to Bayes as spam but it still gets a BAYES_00 score, this means one of two things:

1) Either you've got thousands of instances of similar messages that were learned as 'ham',
2) or the database that Bayes in your running SA instance is using is not the same one that you were training. This could be configuration issues or pilot error (using the wrong identity when doing the training, training on the wrong machine, etc).

On your SA machine, what does the output of "sa-learn --dump magic" show you? (IE what are the nspam & nham counts, how many tokens, what is the newest "atime", etc).

If careful config & log inspection doesn't give clues, try this brute-force test. Shut down your SA, move the directory containing your Bayes database out of the way and create a new empty one. ("sa-learn --dump magic" should now show 0 tokens).

Then train a few ham & spam messages (only a dozen or so), and recheck the --dump magic output to see that there are now some tokens in the database but not too many. Restart your SA and watch the log results. If there are fewer than 200 messages of either kind (ham or spam) in your Bayes database then SA won't use it, so make sure that's the case: your new database should be too empty for SA to be willing to use it. So if you -are- getting Bayes scores, that indicates that SA is using some database other than the one you think it is.

Now start manually training more messages (spam & ham). When you hit the 200 count threshold, Bayes scores should start showing up in your logs.

Good luck.
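The logic of that brute-force test can be written down as a small decision table. The Python below is only a sketch of the reasoning in the post: the function names are invented, and the 200 defaults are assumed to correspond to SA's `bayes_min_spam_num` and `bayes_min_ham_num` settings.

```python
def bayes_would_be_used(nspam, nham, min_spam=200, min_ham=200):
    """True once both the spam and ham counts have reached their minimums."""
    return nspam >= min_spam and nham >= min_ham

def diagnose(nspam, nham, bayes_hits_seen):
    """Compare what the trained database should do with what the logs show."""
    expected = bayes_would_be_used(nspam, nham)
    if bayes_hits_seen and not expected:
        return "SA is scoring with some other Bayes database"
    if expected and not bayes_hits_seen:
        return "database is large enough, but Bayes is not being applied"
    return "consistent with SA using this database"

# A dozen trained messages but BAYES_* hits in the log: wrong database in use.
print(diagnose(nspam=6, nham=6, bayes_hits_seen=True))
```

The counts passed in would come from the nspam and nham lines of "sa-learn --dump magic", and `bayes_hits_seen` from whether any BAYES_* rule shows up in the scan logs.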
Re: authres missing when ran from spamass-milter
On Wed, 31 May 2023, Matus UHLAR - fantomas wrote:

> [snip..]
> milter adds own synthetised Received: header at the very beginning, which is most possibly the correct reason.
>
> spamass-milter should add this header behind locally added Authentication-Results: headers, but it needs change in spamass-milter.

tl;dr: if those 'Authentication-Results:' headers are generated by the MTA itself, the milter may not ever see them.

Which agent in the whole MTA system is adding those 'Authentication-Results:' headers? Is it the master MTA itself (EG: postfix or sendmail) or is it some other milter component?

A milter can only work with what it's handed by the master MTA; if the Authentication-Results: headers aren't in its input stream then it cannot work with them. In the original sendmail incarnation of the milter API it was designed so that a milter received the message input stream -before- local headers were added, thus the need for spamassassin 'glue' milters to do that Received: header synthesis.

If those Authentication-Results: headers are being generated by another milter then the solution is easy: just set the MTA configuration to run that milter before the spamassassin 'glue' milter. Milter results are chained, so any headers explicitly added by one milter are passed on to succeeding milters.

If those headers are being generated by the MTA then it may not be possible for milters to see them without hacking the MTA itself.
Re: comparing sender domain against recipient domain
On Fri, 12 May 2023, Matija Nalis wrote:

> On Thu, May 11, 2023 at 09:41:34PM +, Marc wrote:
>> I was wondering if spamassassin is applying some sort of algorithm to comparing sender domain against recipient domain to detect a phishing attempt?
>
> [snip..]
>
> That is because those domains are not EQUAL? Or did you want a rule that checks only on SIMILAR domain names (e.g. with lowercase letter "L" replaced with number "1" as in your example)?

Now I get it: the OP is looking for some kind of comparison function that does an "apparent linguistic distance" evaluation of two strings and returns a score that indicates a "visual similarity" value. (EG replacing 'l' with '1' or 'O' with '0', etc).

Several years ago there was a flood of phish messages that had a 'From' address that used 'PayPaI' to try to fool people. I've also seen attempts using European character sets with letters that look like O or e to fake common domain names.

I've hand coded rules to check for this stuff when frequently abused, but I don't know of a programmatic algorithm to do it automagically.

Dave
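One simple way to approximate the "visual similarity" idea is to fold look-alike characters onto a common form before comparing. The sketch below is a toy under stated assumptions: the four-entry substitution table and both function names are invented for illustration, it is not a SpamAssassin feature, and it ignores the non-ASCII look-alikes mentioned above.

```python
# Assumed, deliberately tiny table of look-alike characters (not exhaustive).
LOOKALIKES = str.maketrans({"1": "l", "I": "l", "0": "o", "O": "o"})

def skeleton(name):
    """Fold look-alike characters together, then lowercase."""
    return name.translate(LOOKALIKES).lower()

def looks_like(candidate, genuine):
    """True when two names differ as strings but collapse to the same skeleton."""
    if candidate.lower() == genuine.lower():
        return False  # same name, nothing deceptive about it
    return skeleton(candidate) == skeleton(genuine)

print(looks_like("PayPaI.com", "paypal.com"))   # capital I standing in for l
print(looks_like("paypa1.com", "paypal.com"))   # digit 1 standing in for l
print(looks_like("example.com", "paypal.com"))  # unrelated name
```

The first two calls report True and the last False. The translation has to happen before lowercasing, otherwise the capital 'I' would turn into 'i' and no longer match 'l'.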
methodless URLs bypass DecodeShortURLs link shortener checking
Today I found some spammy messages which contained tinyurl links that were not checked by my DecodeShortURLs checker. Checking the tinyurl by hand using wget, I found that the destination was a URL that hit some of my URIBL lists.

The issue is that if the method is omitted from the URL, it is not considered for DecodeShortURLs checking. EG:

<a href="tinyurl.com/REDACTED">Click here</a>

does not get checked, but

<a href="http://tinyurl.com/REDACTED">Click here</a>

does get checked.

This happens with SA 3.4.6. Note that this is specific to DecodeShortURLs; a methodless URL is still checked via direct URIBL rules.

Is this an issue with the DecodeShortURLs plugin or with SA? Where would I find the most recent version of the DecodeShortURLs plugin?

Thanks, Dave
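As a general illustration of why a methodless link can slip past host-based checks, here is how a generic URL parser treats the two forms. This uses Python's standard `urllib.parse` and says nothing about how the DecodeShortURLs plugin itself parses links; `host_of` is a hypothetical helper.

```python
from urllib.parse import urlsplit

def host_of(link):
    """Return the host of a link, tolerating a missing method (scheme)."""
    parts = urlsplit(link)
    if not parts.netloc and "://" not in link:
        # Without a scheme the whole string is parsed as a path,
        # so retry with an assumed default of http://
        parts = urlsplit("http://" + link)
    return (parts.hostname or "").lower()

print(repr(urlsplit("tinyurl.com/REDACTED").netloc))  # empty: no host recognised
print(host_of("tinyurl.com/REDACTED"))
print(host_of("http://tinyurl.com/REDACTED"))
```

A checker that only looks at the parsed host would see nothing for the first form, while the normalising helper yields tinyurl.com for both.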
Re: Matching on missing To field?
On Wed, 20 Jul 2022, Alex wrote:

> Hi, I have a number of rules that match on the To field, but what to do if the To field is missing?
>
> Received: from test.com (wsip-72-214-24-18.sd.sd.cox.net [72.214.24.18]) by mail01.example.com (Postfix) with SMTP id 12425B9B for ; Fri, 15 Jul 2022 18:50:34 -0400 (EDT)
>
> I realize I can match on the Received header here, but that would require creating an additional rule for each corresponding To rule. Perhaps there's a way to combine them, or a tag that can be used for both?

Depending on your MTA and the message, that 'for ' element may be completely missing (for example if there are multiple recipients of a message).

Can you configure your "glue" to synthesize an additional header from the envelope-to address of the message? Envelope recipient addrs must always exist; it's just a question of what you need to do to get them visible to SA. Look at the "envelope_sender_header" entry in the SA docs and apply the same concept to the envelope recipient data. In the milter I use, I create both envelope-From & envelope-To headers.

> I'm also aware of using ALL, but I think that may be too broad and may catch instances that shouldn't be. Can someone explain how this rule works and if something similar would apply to my situation?
>
> header __HDRS_MISSP ALL:raw =~ /^(?:Subject|From|To|Reply-To):\S/ism

That rule just says: look at all the raw header data and match if any Subject, From, To or Reply-To header has a non-space character directly after the colon. IE a malformed (mis-spaced) header; it does not test for a missing header.

Dave
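To see what the `__HDRS_MISSP` pattern does on concrete input, it can be tried with Python's `re` module, assuming the Perl and Python regex engines agree on these constructs (they do for this pattern). The three header samples are invented for the demonstration.

```python
import re

# Same pattern as the rule, with the /ism flags mapped to Python equivalents.
HDRS_MISSP = re.compile(r"^(?:Subject|From|To|Reply-To):\S", re.I | re.S | re.M)

well_formed = "From: a@example.com\nTo: b@example.com\nSubject: hi\n"
mis_spaced = "From:a@example.com\nTo: b@example.com\nSubject: hi\n"
no_to_or_from = "Received: from x.example\nDate: Fri, 15 Jul 2022 18:50:34 -0400\n"

for label, headers in [("well formed", well_formed),
                       ("mis-spaced", mis_spaced),
                       ("no To/From at all", no_to_or_from)]:
    print(label, bool(HDRS_MISSP.search(headers)))
```

Only the mis-spaced sample matches. A header block with no To or From at all does not match, so this pattern is not a way to detect a missing To header.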
Re: Add header, not beginning with X?
On Mon, 14 Feb 2022, joea- lists wrote:

> The reason has to do with "reply" and "reply to all" with the email client/system I am using and prefer to continue using for now.
>
> Being subscribed to several lists, I find some variation between them regarding the headers they provide and how my "reply" feature works.
>
> Those that provide "Reply-to: somelist" act as expected and place the list address in the To: field. Those that do not (users@spamassassin.apache.org included) find the address of the poster rather than the list in the To: field.
>
> While this is not a new issue, I do occasionally fail to correct the address issue and an email goes astray.
>
> I'm aware that "modern" clients can deal with this and there are more "practical" solutions, but I view this as an opportunity for "exercise" and perverse amusement.
>
> Does not appear to be something that can or should be done in SA, just exploring possible avenues, or, abandoning the idea completely.

If you want this done for everybody on the system, then modifying your MTA is the way to go (EG: at the postfix/sendmail level).

If you want to do it just for your own messages, then some kind of custom delivery filter (EG procmail) would be the way to go.
Re: page.link spam
On Sun, 31 Oct 2021, Axb wrote:

> On 10/31/21 5:26 PM, Matus UHLAR - fantomas wrote:
>> Hello,
>>
>> it looks like google has registered page.link domain and users are already using it for spamming:
>>
>> https://secretadultnightclub.page.link/...
>>
>> I have added it to my local domain-based blocklist.
>>
>> any idea/tip what to do with it next?
>
> blacklist_uri_host page.link

Been there, done that, got the FP wounds to show the risks of doing it. My retirement account financial adviser sends me reports that include name.page.link URLs.

So selectively blacklist full entries like secretadultnightclub.page.link, but not just page.link.

Think of it like you would link shortener URLs (EG bit.ly).
SA 3.4.6 add From:addr host to URIHOSTS list?
In SA 3.4.1 the host value of From:addr was automagically added to the URIHOSTS list and thus exposed to URIBL lookups. SA 3.4.6 does not do that. Is there a configuration option to reactivate that feature?

Thanks, Dave
Re: handle_user and connect to spamd failed
On Mon, 18 Oct 2021, Linkcheck wrote:

> On 18/10/2021 11:20 am, Matus UHLAR - fantomas wrote:
>> spamd by default tries to find recipients' home directories and user preferences in them. try passing following option to spamd:
>
>> instruct spamd to connect to 127.0.0.1
>
> Sorry, I'm not sure where to do that. I've tried as noted in the OP; I can't find anywhere else (remembering I've dropped spamfilter.sh).

Actually that timeout error is coming from "spamc". spamass-milter uses spamc under the hood to connect to spamd. It's spamc that is trying to connect to "localhost", which contains that IPv6 reference. Add an option to spamass-milter telling it to pass on to spamc that the connect-to host is 127.0.0.1, not localhost.

> This made no difference. I also have /etc/default/spamass-milter with the options:
> OPTIONS="-u spamass-milter -i 127.0.0.1 -4"

IE: add the option "-D 127.0.0.1" in that spamass-milter OPTIONS.
Re: handle_user and connect to spamd failed
On Mon, 18 Oct 2021, Linkcheck wrote:

> On 18/10/2021 11:20 am, Matus UHLAR - fantomas wrote:
>> spamd by default tries to find recipients' home directories and user preferences in them. try passing following option to spamd:
>>
>> -x, --nouser-config, --user-config
>
> Thanks. Where would I actually add that? Which file / command?

Those options need to get used in your spamd startup arguments. They go in the same place you've got things like --max-children. But if you're going that nouserconfig route, omit the --create-prefs option.

>> -H directory, --helper-home-dir=directory
>
> Is that the literal 'directory'? I took that to mean an actual directory.

Matus is saying that your '--helper-home-dir' option syntax in your spamd settings is wrong. You say that you have those set to:

OPTIONS="--create-prefs -4 --max-children 5 --helper-home-dir /var/lib/spamassassin -u debian-spamd"

Matus is saying that it should be:

OPTIONS="--create-prefs -4 --max-children 5 --helper-home-dir=/var/lib/spamassassin -u debian-spamd"

Or:

OPTIONS="--create-prefs -4 --max-children 5 -H /var/lib/spamassassin -u debian-spamd"

IE the '--helper-home-dir' option needs an '=' with no spaces, or use the -H form.
Re: elf signature for clamav
On Sun, 26 Sep 2021, Benny Pedersen wrote:

> # cat local_elf.ndb from /var/lib/clamav (databasedir in clamd)
>
> Sanesecurity.ELF.1:6:0:7F454C46
>
> took me 5 mins to make :)
>
> thanks to KAM on this
>
> its very simple, i like feed back from mimedefang and amavisd users

If you use the "ClamAV" SA plugin ( http://wiki.apache.org/spamassassin/ClamAVPlugin ) then you can use the full power of ClamAV scanning/detection in SA without the need for external connectors like mimedefang or amavisd.

This has the advantage of being open to all SA users and makes it possible to write special meta rules combining the results of ClamAV scans with other SA filtering, such as welcome_auth validated trusted sources.

I run two copies of the ClamAV engine:

1) standard ClamAV with standard rules, called from milters in my front line MX servers to outright block known malware.
2) a customized ClamAV with full bells-&-whistles such as Heuristics and lots of custom add-in signatures (EG https://github.com/extremeshok/clamav-unofficial-sigs).

The latter can have a moderate FP risk, but run from within SA I can use other rules such as welcome_auth to control that risk, or use them at a low score but meta them with other things such as Bayes to jack up the score.
Re: Message-ID with IPv6 domain-literal
On Tue, 21 Sep 2021, Bill Cole wrote:

> On 2021-09-21 at 12:25:30 UTC-0400 (Tue, 21 Sep 2021 10:25:30 -0600) Grant Taylor is rumored to have said:
>
>> But why the penalty for using non-public addresses* in a Message-ID: string?
>
> Empirical evidence. The use of a non-public address in a Message-ID correlates to a message being spam. In my experience, so does using an IP literal of any sort in a Message-ID, but that may be an idiosyncrasy in my mail.
>
>> I was not aware that Message-ID had any requirements that the content had to mean anything beyond being syntactically correct. As such I would expect private / non-globally routed content to be allowed. After all, isn't the purpose of the Message-ID to be a universally unique identifier? If so, why does it matter what the contents is as long as it's syntactically correct? What am I missing?
>
> Private IP addresses in general cannot specify globally unique devices (consider 127.0.0.1 or the very-popular 192.168.1.1) and therefore a Message-ID using an IP literal as the RHS part with a non-public IP cannot assure uniqueness.

That is valid for private IP addresses. However, "[IPv6:::193.168.1.30]" is the representation of IPv4 193.168.1.30, which is a public IP address, thus that 'hit' is in error. This should be considered a parsing bug.
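The point about the embedded IPv4 address can be checked with Python's standard `ipaddress` module. This is a sketch of the arithmetic only, not of how SpamAssassin parses Message-IDs; the helper `embedded_ipv4` is hypothetical and assumes the upper 96 bits of the literal are zero.

```python
import ipaddress

def embedded_ipv4(literal):
    """Return the IPv4 address carried in the low 32 bits of an '[IPv6:::a.b.c.d]' literal."""
    text = literal.strip("[]")
    if text.lower().startswith("ipv6:"):
        text = text[5:]
    addr = ipaddress.IPv6Address(text)
    if int(addr) >> 32:
        raise ValueError("upper 96 bits are not zero; not an IPv4-compatible literal")
    return ipaddress.IPv4Address(int(addr) & 0xFFFFFFFF)

public = embedded_ipv4("[IPv6:::193.168.1.30]")
private = embedded_ipv4("[IPv6:::192.168.1.1]")
print(public, public.is_private)
print(private, private.is_private)
```

193.168.1.30 is reported as not private, while 192.168.1.1 is. The two differ only in the first octet, which is an easy thing for a hand-written pattern to get wrong.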
Re: An interesting bit of HTML from a spam
On Sun, 12 Sep 2021, Loren Wilton wrote:

> I found this little wonder in a bunch of spams I've been getting for the last few days:
>
> http://"; http://"; http://"; http://"; http://"; http://"; href="http:/mi.wey.vandalized655bccemetries -dot- cleaning/id>">unsubscribe here
>
> I have no idea if that actually works, since I'm not about to try it.

The base hostname in that URL (I bowdlerized it in this message) is listed in a couple of different URIBLs.

SA 3.4.1 is able to spot/extract that name from the garbage and trigger URIBL rules. In debug mode for this message its 'URIDOMAINS' contains:

ARY:[oxsus-vadesecure.net,uiowa.edu,uiowa.edu,avg.com,vandalized655bccemetries.cleaning,oxsus-vadesecure.net]

SA 3.4.6, not so much. It doesn't seem to "see" that href/URL at all. Its 'URIDOMAINS' contains:

value: avg.com

So why is SA 3.4.6 much less sensitive about picking up hosts in URLs?
Re: spamass-milter (sa daemon loads config different to shell ?)
On Tue, 27 Jul 2021, David Bürgin wrote:

> Dipl-Inform. Frank Gadegast:
>> On 27.07.21 14:18, David Bürgin wrote:
>>> Dipl-Inform. Frank Gadegast:
>>>> Seems to be, that spamass-milter simply strips out any X-Spam* header lines, not caring, if the own call to spamd sets them, hm.
>>>>
>>>> Im really not getting, why spamass-milter should strip X-Spam-lines of the header AFTER SA was running. If Im right, SA is stripping them of anyway, before running or modifying anything ...
>>>>
>>>> Anybody an idea how to get around this ?
>>>
>>> There is an alternative milter (which I maintain) that adds all X-Spam-* headers received from spamd.
>>>
>>> https://crates.io/crates/spamassassin-milter
>>
>> Looks like your milter needs to fork a spamc, which then talks to the spamd. This will start lots of spamc processes and is not recommended. Would then not be any different to call spamc directly f.e. via procmail.
>>
>> You should rewrite your milter to talk directly to the spamd via socket or port.
>
> Yes, it communicates using spamc, just like spamass-milter. I have been told that it has been working fine in a somewhat larger deployment. I didn’t mean to derail the thread so will leave it at that.

Having a spam filtering milter fork off a shell and then run "spamc" to communicate with spamd does simplify the milter code (and insulates it from changes in the spamd protocol) but adds the risk of shell escape attacks (as well as additional overhead). There have already been security related patches needed by spamass-milter specifically because of this issue.

Writing a milter that directly talks the spamd protocol via a socket (local or network) is more work but safer and more efficient. (Been there, done that, got the code to prove it.)
Re: SA 3.4.5 meta with RBL rules not working.
Ugg, I was afraid of that. For decades I've rolled my own install of things like sendmail, SA & ClamAV, but this time I wanted to try the release supplied by our server OS vendor (SuSE). Unfortunately that's SA 3.4.5.

OK, back to the salt-mines.

Thanks

On Mon, 19 Jul 2021, Henrik K wrote:

> How about upgrading to latest 3.4.6?
>
> This release includes fixes for the following:
> - Fixed URIDNSBL not triggering meta rules
>
> On Mon, Jul 19, 2021 at 01:42:51AM -0500, Dave Funk wrote:
>> I recently updated from SA 3.4.1 to 3.4.5 and noticed that a number of my "meta" rules quit working.
>>
>> I have a number of meta rules that combine RBL/URIBL rules with other rules and they no longer fire, even though the various components are firing.
>>
>> EG, a rule like:
>>
>> meta L_TEST_NS2c ( URIBL_ABUSE_SURBL && HTML_MESSAGE )
>> describe L_TEST_NS2c abusive HTML message
>> score L_TEST_NS2c 1.1
>>
>> does not fire even though the message under test triggers both URIBL_ABUSE_SURBL & HTML_MESSAGE. This used to work as expected under 3.4.1.
>>
>> Running a message thru "spamassassin -D" does not give any clues what's going wrong.
>>
>> Any suggestions about how to debug this?
>>
>> Thanks, Dave
SA 3.4.5 meta with RBL rules not working.
I recently updated from SA 3.4.1 to 3.4.5 and noticed that a number of my "meta" rules quit working.

I have a number of meta rules that combine RBL/URIBL rules with other rules and they no longer fire, even though the various components are firing.

EG, a rule like:

meta L_TEST_NS2c ( URIBL_ABUSE_SURBL && HTML_MESSAGE )
describe L_TEST_NS2c abusive HTML message
score L_TEST_NS2c 1.1

does not fire even though the message under test triggers both URIBL_ABUSE_SURBL & HTML_MESSAGE. This used to work as expected under 3.4.1.

Running a message thru "spamassassin -D" does not give any clues what's going wrong.

Any suggestions about how to debug this?

Thanks, Dave
Re: Email Phishing and Zloader: Such a Disappointment
On Sun, 11 Jul 2021, Kevin A. McGrail wrote:

> On 7/11/2021 5:11 PM, John Hardin wrote:
>> "The other parts contain an application/vnd.ms-officetheme and an application/x-mso file. Which (in addition to the text/xml files) are used by Microsoft Word to load the embedded Word document."
>>
>> Would the presence of all three of those MIME types be a scorable indicator?
>
> If you can get me a spample, I'm sure I can tell you but in general we block macros so that's all that's needed. Likely the OLEVBMacro plugin and KAM ruleset is blocking all of these already if you have the plugin enabled.
>
> Regards, KAM

Aren't there already rules and heuristics in ClamAV for detecting VBmacros in office docs?

I've got two copies of ClamAV running: one used as a blocking direct milter with default rules, and another one feeding into the SA "clamav.pm" plugin with extra rules and heuristics/algorithms enabled.
Re: spamass.sock - No such file or directory
Make sure to start spamass-milter before postfix. The milter creates the socket, which must exist for postfix to be able to open it.

Start the milter, then use "lsof" to make sure that it has created the socket and that it's in the place where postfix expects to find it. Also make sure that the directories on the path to the socket are traversable by postfix and that the permissions on the socket itself give postfix 'rw' rights.

On Sun, 27 Jun 2021, Dominic Raferd wrote:

> Try unix:/run/spamass/spamass.sock
>
> On Sun, 27 Jun 2021, 18:28 , wrote:
>> Still the same
>>
>> Jun 27 19:21:03 nmail postfix/smtps/smtpd[4946]: warning: connect to Milter service unix:spamass/spamass.sock: No such file or directory
>>
>> Jun 27 19:25:37 nmail postfix/smtps/smtpd[5552]: warning: connect to Milter service unix:run/spamass/spamass.sock: No such file or directory
>>
>> Thanks for any update
>>
>> -----Original Message-----
>> From: Reindl Harald
>> Sent: Saturday, 26 June 2021 12:15
>> To: mau...@gmx.ch; users@spamassassin.apache.org
>> Subject: Re: spamass.sock - No such file or directory
>>
>> why do you think "/run/spamass" and "unix:/spamass/" are the same path?
>>
>> On 26.06.21 at 09:37, mau...@gmx.ch wrote:
>>> Run with Debian 10
>>>
>>> I don't see why this “spamass.sock: No such file or directory” message appears
>>>
>>> >>mail.log
>>>
>>> Jun 26 09:27:12 nmail postfix/smtps/smtpd[9509]: warning: connect to Milter service unix:/spamass/spamass.sock: No such file or directory
>>>
>>> >>main.cf
>>>
>>> smtpd_milters = unix:/spamass/spamass.sock, unix:opendkim/opendkim.sock, unix:opendmarc/opendmarc.sock
>>>
>>> >>/run/spamass# ls -la
>>>
>>> -rw-r--r-- 1 spamass-milter spamass-milter 5 Jun 26 09:26 spamass.pid
>>> srw-rw 1 postfix postfix 0 Jun 26 09:26 spamass.sock
>>>
>>> or
>>>
>>> srw-rw 1 spamass-milter spamass-milter 0 Jun 26 09:26 spamass.sock
>>>
>>> >/etc/group
>>> spamass-milter:x:128:postfix
>>>
>>> thanks for any help
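The checks described in the reply (the socket exists, the path to it is traversable, and it is readable and writable) can be scripted. The following is a rough Python sketch; `check_milter_socket` is a made-up helper, and it only tells you something useful if it is run as the same user that postfix's smtpd runs as.

```python
import os
import stat

def check_milter_socket(path):
    """Return a list of problems found with a milter socket path (empty if none)."""
    try:
        info = os.stat(path)
    except FileNotFoundError:
        return [path + ": no such file or directory"]
    except PermissionError:
        return [path + ": a directory on the path is not traversable by this user"]
    problems = []
    if not stat.S_ISSOCK(info.st_mode):
        problems.append(path + ": exists but is not a socket")
    if not os.access(path, os.R_OK | os.W_OK):
        problems.append(path + ": not readable and writable by this user")
    return problems

# Path taken from the suggestion quoted above; adjust to the local layout.
for problem in check_milter_socket("/run/spamass/spamass.sock"):
    print(problem)
```

No output means all three checks passed for the user running the script.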
Re: Scan Attachment Content Using Spamassassin
On Thu, 3 Jun 2021, Henrik K wrote:

> On Thu, Jun 03, 2021 at 09:32:28AM +0200, Matus UHLAR - fantomas wrote:
>> On 03.06.21 09:23, Henrik K wrote:
>>> That's just outdated information. It's fine to scan even 20MB+ messages, it just requires some memory.
>>
>> and CPU and time...
>
> Those are affected very little by message size. And all that is pretty much negated by large messages being uncommon.

Be that as it may, the OP wanted to do DLP scanning of messages containing PPTx, XLSx, etc, and it's uncommon to see a small PPTx file; large is more common w/ such media.

Also, spamassassin does not have a native built-in component for parsing such media attachments; it would need to be some kind of add-in (EG the "fuzzy ocr" plugin that was the rage a while ago). As such it adds an additional complication that needs to be integrated/managed/updated etc.

Probably better to use a whole different tool that comes with that kind of capability built-in (EG ClamAV).
Re: Scan Attachment Content Using Spamassassin
On Thu, 3 Jun 2021, KADAM, SIDDHESH wrote:

> Hello Folks,
>
> Is there any possible way we can scan for the content of an attachment ie .doc/pdf/.xls/ppt etc...
>
> Planning is to have a DLP kind of protection with the help of Spamassassin.
>
> Regards, Siddhesh

spamassassin really isn't the best tool for this job. It's really designed for looking at text stuff, and how do you squeeze the text out of a ppt or xls in a meaningful way?

Even more limiting, spamassassin is designed for small to medium size messages; scanning anything over 500KB or so is going to be a resource hog.

What would be better is a tool that is already designed for scanning .doc / pdf / .xls / ppt etc.: an anti-virus program with custom rules for the kinds of info you want to detect.

ClamAV has builtin DLP rules for standard kinds of PII (EG CC#s, SSNs, etc) and comes with tools to help you craft custom rules if you have particular kinds of info you need DLP for.

Start with a mail scanning framework (EG amavis or mimedefang) and plug in spamassassin for spam and two instances of ClamAV, one with standard anti-virus rulesets and another with your DLP rules. Then you can use the framework to take whatever kinds of actions you want based on which components 'fired'.
Counting number of instances of a particular header
I'm trying to create a rule to count the number of instances of a particular header. IE in email messages there could be zero or more instances of a particular header and I want to know how many there are so I can use that info in a meta to detect a spam sign.

I first crafted a rule:

  header   L_MY_HEADER  X-My-Header !~ /^UNSET$/ [if-unset: UNSET]
  describe L_MY_HEADER  has X-My_header
  score    L_MY_HEADER  0.1

Which did correctly detect the existence of 'X-My-Header'. Then to count the number of them I added a 'tflags':

  tflags   L_MY_HEADER  multiple maxhits=10

But that would always fire 10 times if there were any instances of 'X-My-Header' (even if there was only one). So I modified the pattern match part of the rule:

  header   L_MY_HEADER  X-My-Header =~ /./

Which had the same effect as the first form (IE either zero or 10 firings). As the header would have at least 6 characters but less than 150 I then tried:

  header   L_MY_HEADER  X-My-Header =~ /^.{5,200}/

Which would fire only once, even if there were 5 or more instances of the header.

What am I doing wrong? How should I craft a rule to count the number of instances of that header?

Thanks, Dave

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-05491256 Seamans Center, 103 S Capitol St. Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
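For checking what count a rule like this ought to report on a given test message, a small script outside SpamAssassin can help. This is only a sketch using Python's standard email parser; it is not a SpamAssassin rule and says nothing about how 'tflags multiple' behaves. The function name count_header and the sample message are made up for illustration.

```python
from email import message_from_string


def count_header(raw_message: str, name: str) -> int:
    """Return how many times header `name` occurs (matching is case-insensitive)."""
    msg = message_from_string(raw_message)
    # get_all() returns the fallback value when the header is absent.
    return len(msg.get_all(name, []))


# Hypothetical test message with two instances of the header of interest.
SAMPLE = (
    "From: someone@example.com\n"
    "X-My-Header: first value\n"
    "X-My-Header: second value\n"
    "Subject: test\n"
    "\n"
    "body text\n"
)
```

Running count_header(SAMPLE, "X-My-Header") on a saved message gives a number to compare against however many times the rule fired.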
Re: Error "cannot open bayes databases" lock failed: File exists
On Wed, 20 Jan 2021, Matus UHLAR - fantomas wrote:

On 20.01.21 11:07, Emanuel Gonzalez wrote:

Date: Wed, 20 Jan 2021 11:07:59 + From: Emanuel Gonzalez To: SA Mailing list Subject: Re: Error "cannot open bayes databases" lock failed: File exists

Hello everyone, I'm back from my vacation. I tried to solve this problem but could not. I still see the mentioned error in the spamassassin error logs:

  bayes_learn_to_journal 1
  use_bayes yes
  bayes_path /var/spamassassin/bayesdb/bayes
  bayes_auto_learn 0
  bayes_auto_expire 0

try:

  ls -la /var/spamassassin/bayesdb/bayes
  lsof /var/spamassassin/bayesdb/bayes_journal /var/spamassassin/bayesdb/bayes_seen /var/spamassassin/bayesdb/bayes_toks

Umm, the command:

  ls -la /var/spamassassin/bayesdb/bayes

should get you the error:

  ls: cannot access /var/spamassassin/bayesdb/bayes : No such file or directory

On the other hand:

  ls -la /var/spamassassin/bayesdb/bayes*

(taken from the bayes_path parameter) should get you what you want. Even better:

  ls -la /var/spamassassin/bayesdb/

(to see if there are any leftover lock files in that directory)

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-05491256 Seamans Center, 103 S Capitol St. Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: BCC Rule and Subject change for specific rule
On Tue, 5 Jan 2021, John Hardin wrote: On Tue, 5 Jan 2021, Giovanni Bechis wrote: On Mon, Jan 04, 2021 at 05:23:30PM -0800, John Hardin wrote: I'm pretty sure SA only allows setting the subject tag by language, not based on rule hits. Starting from 3.4.3 you can add a prefix to the email subject like that: header FROM_ME From:name =~ /Me/ subjprefix FROM_ME [From Me] Cool, I missed that at the time. Thanks! The documentation does mention it exists but does not give an example of using it... Does this work if you're using a milter for your glue? Is there some special status/command that spamd returns to the milter for this kind of modification? If so the milters may need to be recoded to implement it. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-05491256 Seamans Center, 103 S Capitol St. Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Bypass RBL checks for specific address
On Wed, 23 Dec 2020, Grant Taylor wrote:

Context is Sendmail, spamass-milter, and SpamAssassin (spamd). I didn't see any way to have spamass-milter bypass, much less conditionally bypass. Nor did I see a way to have Sendmail conditionally bypass a milter.

If all you want is for a particular class of recipients (at the envelope RCPT level) not to be passed to spamass-milter inside sendmail, that can be done with a bit of hacking of your sendmail config and the milter. I run my own customized miltrassassin milter which has support for custom macros handed to it from sendmail and it takes special action based on what it gets handed. For example if the 'skip_check' macro is defined, the milter just returns an 'OK' and doesn't call SA at all. If the 'no_reject' macro is set then the milter will not generate a "550" SMTP status regardless of how high the SA score is (needed for "postmaster" messages).

What version of sendmail are you using?

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-05491256 Seamans Center, 103 S Capitol St. Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Bypass RBL checks for specific address
That may not work for what the OP wanted. Because it's assumed that DNS related stuff may take some time, those rules (if configured to run) are launched early in the processing of a message. So if the OP wants to completely avoid running RBL checks (as opposed to just ignoring their scores/results) he may need to do some special tricks. One thing would be to have a separate SA instance with its own configuration which has the RBL stuff removed and then configure his MTA to select that particular SA filter when the special user address is detected.

This raises the question, what is the need to completely avoid running RBL checks for that special recipient? What is supposed to happen when a message comes in that is addressed to multiple recipients, including the special recipient? This could get messy.

On Wed, 23 Dec 2020, Iulian Stan wrote:

Hello all, You can create a meta rule with very high priority (actually check that it is higher than your RBL), match what you need from email headers and then use shortcircuit to skip additional tests. Best regards, Iulian Stan Sent from my Galaxy

Original message From: Grant Taylor Date: 12/23/20 20:59 (GMT+02:00) To: users@spamassassin.apache.org Subject: Re: Bypass RBL checks for specific address

On 12/22/20 11:56 PM, Axb wrote:
> whitelist_to ?

My understanding is that whitelist_to, more_spam_to, and all_spam_to behave the same way and effectively just alter the scoring offset. It seems as if the tests are still run, and it's just the score is artificially offset based on which setting is used. I'm wanting to not run RBL tests for the specific recipient email address. -- Grant. . . . unix || die

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-05491256 Seamans Center, 103 S Capitol St. Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: adding AV scanning to working Postfix/SA system
On Wed, 2 Dec 2020, Joe Acquisto-j4 wrote:

Hacking away, seem to have it working?, Using CLAMAVPlugin. At least mail does not appear "broken". But EICAR is not detected. I "think" it is being scanned as I see this:

  *
  X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on auxilary
  X-Spam-Level: *
  X-Spam-Status: No, score=1.0 required=5.0 tests=BAYES_00,FREEMAIL_FROM,
      HTML_MESSAGE,SPOOFED_FREEMAIL_NO_RDNS,TVD_SPACE_RATIO autolearn=no
      autolearn_force=no version=3.4.2
  X-Spam-Virus: _CLAMAVRESULT
  X-Spam-Report:
      * -1.5 BAYES_00 BODY: Bayes spam probability is 0 to 1%
      *      [score: 0.]
      *  1.0 FREEMAIL_FROM Sender email is commonly abused enduser mail
      *      provider (joe.acquisto[at]gmail.com)
      *  0.0 HTML_MESSAGE BODY: HTML included in message
      *  0.0 TVD_SPACE_RATIO No description available.
      *  1.5 SPOOFED_FREEMAIL_NO_RDNS From SPOOFED_FREEMAIL and no rDNS
      *

Is that proof it is being scanned and the non detection issue lies elsewhere? joe a.

What, specifically, is the config you're using to invoke CLAMAVPlugin? You need to have at least two things set up in your spamassassin config files:

1) load the plugin in a "v*.pre"
2) invoke the check_clamav() procedure

EG: in v320.pre

  # AntiVirus - some simple anti-virus checks, this is not a replacement
  # for an anti-virus filter like Clam AntiVirus
  #
  #loadplugin Mail::SpamAssassin::Plugin::AntiVirus
  #
  loadplugin ClamAV /usr/local/etc/mail/spamassassin/plugins/clamav.pm

Note that line depends on the path to where you've installed the plugin.

In a ".cf" rules file (I call mine clamav.cf):

  #
  # config file for using the ClamAV plugin "clamav.pm"
  #
  full     L_CLAMAV  eval:check_clamav()
  describe L_CLAMAV  Clam AntiVirus detected a virus
  score    L_CLAMAV  5
  #
  header   T__MY_CLAMAV       X-Spam-Virus =~ /Yes/i
  header   T__MY_CLAMAV_SANE  X-Spam-Virus =~ /Yes.{1,50}Sanesecurity/i
  #

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-05491256 Seamans Center, 103 S Capitol St. Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: adding AV scanning to working Postfix/SA system
On Wed, 2 Dec 2020, Tom Hendrikx wrote:

On 02-12-2020 16:18, Joe Acquisto-j4 wrote:

X-Spam-Virus: _CLAMAVRESULT

I never integrated Clam using this plugin, but this seems a config typo to me: there should be a Yes/No in there, and optionally a virus name.

Yes, it looks like he's got a typo in there: the template tag is missing its trailing underscore. The config line should be:

  add_header all Virus _CLAMAVRESULT_

in a .cf someplace. (The header name given to add_header gets "X-Spam-" put in front of it, so 'Virus' is what yields 'X-Spam-Virus:'.) Then the plugin will add that 'X-Spam-Virus:' header with the text "Yes" followed by the name of the virus detected. You can then use the value of that header in other rules to add points for various kinds of things detected or "meta"ed with other rules.

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-05491256 Seamans Center, 103 S Capitol St. Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: amazonses.com doubble dkim sign
On Tue, 10 Nov 2020, Benny Pedersen wrote: DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/simple; s=n4atlko3yvgxyqpwp7palysab6occe3l; d=fing.com; t=1604971038; h=From:To:Message-ID:Subject:MIME-Version:Content-Type:Date; bh=0LT5Ztzk2B+Ecm2NPRzroGl6fTFNX9TpP6X0036qmf4=; b=Rtc9ieWPMuaNZ9iRZPZMEfuGj7pnaXu6TPjT9px08NGKZt0+rbCLyz083FG3djhk UTdHNgkEc6xGCCRN0JzbrdYaHWptG2U42qOYEajdE59uuR/Ucy+rGJA8Vr2roe/Ssvm jYWosu47Ndl6M56u9m3aNpAuBOgNmQHWoMVyWXZU= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/simple; s=shh3fegwg5fppqsuzphvschd53n6ihuv; d=amazonses.com; t=1604971038; h=From:To:Message-ID:Subject:MIME-Version:Content-Type:Date:Feedback-ID; bh=0LT5Ztzk2B+Ecm2NPRzroGl6fTFNX9TpP6X0036qmf4=; b=lihzmRF2B+mUjB1E89LLJ8JkbpbQQIpnPd5JtQjAGB5uSurBWfv6VrGHgbCy2O1e q7AWlXPTcwdca5K4iB0pormV/lgvfZV+kgwfSrLPlgWBwlB9hRi2TCsFhT9v9tbEm1b dZBXrPRFO9r+uDtLfR6OgaOtXq7RjMiAUqcDBm0k= From: Fing Alert why ? Two signatures, one for the 'From:' address (message creator) and one for the issuing SMTP system. Look at the signing domain (the 'd=D.N' part) to see who the creator of a given signature is. There's nothing to prevent each system in the SMTP hand-off chain from adding their own signature, provided they do nothing to invalidate earlier signatures. More than two is unusual/overkill, but it's not uncommon to see two. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-05491256 Seamans Center, 103 S Capitol St. Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
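To see at a glance who added each signature, the 'd=' tag can be pulled out of every DKIM-Signature header. The following is only a rough sketch: it splits the tag list on ';' and does no verification at all, and the function name and the shortened sample values (with the b= data elided) are made up for illustration.

```python
def signing_domains(signature_headers):
    """Extract the d= tag from each DKIM-Signature header value."""
    domains = []
    for value in signature_headers:
        # Tags are separated by ';' and each has the form name=value.
        for tag in value.split(";"):
            name, _, val = tag.strip().partition("=")
            if name.strip() == "d":
                domains.append(val.strip())
    return domains


# Shortened versions of the two signatures quoted above (b= data elided).
SAMPLE = [
    "v=1; a=rsa-sha256; q=dns/txt; c=relaxed/simple; "
    "s=n4atlko3yvgxyqpwp7palysab6occe3l; d=fing.com; t=1604971038; b=...",
    "v=1; a=rsa-sha256; q=dns/txt; c=relaxed/simple; "
    "s=shh3fegwg5fppqsuzphvschd53n6ihuv; d=amazonses.com; t=1604971038; b=...",
]
```

For the message above this lists fing.com (the 'From:' domain) and amazonses.com (the sending service), in header order.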
Re: questions on spamassassin
On Sat, 5 Sep 2020, Rajesh M wrote: dear friends, had a few questions 1) what is the sequence based on which the rules are processed ? is there any documentation on this ? how is the rule number example 20_dnsbl_tests.cf or 25_uribl.cf related to the sequence of rule processing ? Are you asking about rule sequencing or configuration file sequencing? "20_dnsbl_tests.cf" is a configuration file which contains zero or more rules. During startup spamassassin reads all configuration files that are found in a list of specific directories (which are distro dependent). The directories are searched in list order for configuration files (name.cf), the files are read in lexical order. So if you have a rule (EG: "MY_RULE_2") in file 20_my_rules.cf and another instance of "MY_RULE_2" in 99_my_rules.cf (in the same directory) the "99" file will be read after the "20" file and the latter definition of "MY_RULE_2" will over-ride (replace) the one from "20". Also the system provided rules directories are processed before the user supplied directories (intentionally) so a user can over-ride a system rule if they don't like how that particular rule works. See: https://cwiki.apache.org/confluence/display/SPAMASSASSIN/WhereDoLocalSettingsGo Once all the rules are read and parsed spamassassin has an internal order to how specific rules get run. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-05491256 Seamans Center, 103 S Capitol St. Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
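The "later definition wins" behaviour of config file reading can be modelled in a few lines. This is only a toy model of the ordering described above, not SpamAssassin's actual loader; the function name, the directory contents and the rule definitions are all invented for illustration.

```python
def effective_rules(directories):
    """Model config reading: directories in list order, files in lexical order.

    `directories` is a list of dicts mapping a filename to a dict of
    {rule_name: definition}. A later definition replaces an earlier one.
    """
    rules = {}
    for files in directories:
        for filename in sorted(files):
            if filename.endswith(".cf"):
                rules.update(files[filename])
    return rules


# Hypothetical layout: system rules directory first, then the local one.
SYSTEM_DIR = {"20_dnsbl_tests.cf": {"SOME_SYSTEM_RULE": "system version"}}
LOCAL_DIR = {
    "99_my_rules.cf": {"MY_RULE_2": "definition from 99"},
    "20_my_rules.cf": {
        "MY_RULE_2": "definition from 20",
        "SOME_SYSTEM_RULE": "local override",
    },
}
```

With these inputs effective_rules([SYSTEM_DIR, LOCAL_DIR]) ends up with the "99" definition of MY_RULE_2 and the local override of the system rule.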
Re: Thanks to Guardian Digital & LinuxSecurity for the nice post about SpamAssassin's upcoming change
On Thu, 23 Jul 2020, Antony Stone wrote: On Thursday 23 July 2020 at 04:36:41, Olivier wrote: I am wondering what grey list should be renamed... Why - has the zombie population started complaining about racial slurs? You have just pissed off Oscar the gray geriatric grouch. ;) This is the letter G brought to you by Oscar the grouch. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-05491256 Seamans Center, 103 S Capitol St. Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: score sender domains with 4+ chars in TLD?
On Sat, 13 Jun 2020, RW wrote:

On Fri, 12 Jun 2020 09:22:40 -0400 AJ Weber wrote:

I want to try adding a score for a sender whose address uses a TLD with > 3 chars. I realize there are some legit ones, but I'm going to test it with a low score and see what it catches.

What I did was grep my mail for TLDs seen in ham and then create a rule __NORMAL_TLD. I then score a point for: __HAS_FROM && ! __NORMAL_TLD. This probably won't scale well beyond a few users though. If I were a bit more energetic I'd autogenerate the rule from cron.

This sounds like a perfect application for a custom DNS-bl lookup/list. Create a local custom rbldnsd server "dnset" zone from a data file with your blessed TLDs, then a rule doing a rbl check using the hostname from the From address with custom scoring. You can easily update the rbldnsd zone data (just write/update the data file, no need to restart spamd) and could create a custom scoring value based on the DNS data (EG 127.0.0.2 for really 'good' TLDs, 127.0.0.4 for 'so-so' and 127.0.0.8 for truly spammy names).

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-05491256 Seamans Center, 103 S Capitol St. Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
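The scoring side of that idea can be sketched as a simple lookup table. This is only an illustration of mapping the three return codes mentioned above to scores; it does no DNS lookup, the score values are invented, and in a real setup the mapping would live in SA rules rather than in a script.

```python
# Hypothetical scores for the three return codes suggested above.
TLD_SCORES = {
    "127.0.0.2": -0.5,  # really 'good' TLDs
    "127.0.0.4": 0.5,   # 'so-so' TLDs
    "127.0.0.8": 2.0,   # spammy TLDs
}


def sender_tld(from_address):
    """Return the lower-cased TLD of a bare address such as user@host.example."""
    host = from_address.rsplit("@", 1)[-1]
    return host.rsplit(".", 1)[-1].lower()


def score_for(answer, unlisted_score=1.0):
    """Map a DNS answer to a score; None stands for 'not listed'."""
    return TLD_SCORES.get(answer, unlisted_score)
```

The unlisted_score default is also an assumption: it treats a TLD that is absent from the zone as mildly suspicious, which matches the original "not a normal TLD" rule.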
Re: Somewhat OT: DMARC and this list
On Sat, 20 May 2017, David Jones wrote: From: David B Funk [snip..] The message from you that I'm replying to here (both the one that came directly to me and the copy I got thru the Apache list server) are -totally- devoid of DKIM headers. (If you'd like to see it I can put it up in paste-bin.) I figured out what was going on. Microsoft must have recently (past few months or so) started sending our outbound mail through another IP range. I have updated my opendkim.conf to cover all Office 365 outbound servers. This is one of the things that I dislike/fear about being dependent on cloud based services. Many traditional system paradigms use the concept of trusted IP addresses (EG: internal_networks, trusted_networks, etc) for making operational decisions. When using cloud based services you have no control over their IP addresses and have to worry about when they might change with out notice, whom else they might be servicing using those same addrs, AND when they might abandon them only for somebody else to start using them. It also reduces the usefulness of RBLS and can even adversely affect the performance of things such as Bayes. When you get major amounts of Ham from O-365 most of the tokens derived from O-365 messages get 0.000 score. So when spammers use O-365 even blatant spam gets a Bayes score of 00%. (and this is after putting all the O-365 headers in bayes_ignore_header statements). (Our institution recently moved the majority of users' mail to O-365 so this is a battle I'm fighting now). Bottom line, in this brave new world address based auth(n/z) decisions are going to be increasingly problematic and an increasing reliance on things such as digital signatures. Dave -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: R: learn ham
On Thu, 5 Jan 2017, Nicola Piazzi wrote: Each minute it learn messages of the last minute so it read and learn one time only for each message Messages are that it sends from internal, so il learn that words are not spam Internal messages are not spam Until one of your users gets their account hacked/phished and spammers then use it to abuse your server to send out megabytes of spam. (or they may have had an account on Yahoo that used the same password). Careless users happen to the best of us. ;( John's point is still valid; blind un-vetted automated Bayes learning is asking for trouble. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Detecting Valid Message Replies
On Tue, 3 Jan 2017, ma...@assembly.state.ny.us wrote: On 1/3/2017 8:12 AM, Christoffer G. Thomsen wrote: blacklist or increase score for mails that reply to unknown message IDs. Remember that someone out in the world might do a "Reply all" to a message which was also Cc'd to one of your users. This would show up as an unknown message ID. Of course,to remedy this, you could also keep track of incoming message IDs. That would make the wrong decision in the following scenario: A sends message to B B replies to A and also adds C to the "CC" list (as B thinks that C should be involved in the conversation) In this case C would receive a "reply" to a message that she's never seen before, but is a legitimate communication. This scenario may seem contrived but I've seen it happen around me with some regularity (both as a recipient & creator). And then there's the case where somebody forwards to you a reply that they got so you get a message "Re: blah de blah (fwd)" -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: DNS Terminology
On Fri, 23 Sep 2016, Lindsay Haisley wrote: On Fri, 2016-09-23 at 19:03 -0400, listsb-spamassas...@bitrate.net wrote: consider that, to do the work described as "forwarding" in many of these references, the nameserver must perform a recursive query [e.g. it must perform a query with the rd bit set]. "A forwarding DNS server offers the same advantage of maintaining a cache to improve DNS resolution times for clients. However, it actually does none of the recursive querying itself. Instead, it forwards all requests to an outside resolving server and then caches the results to use for later queries." What am I missing? Justin Ellingwood, who wrote the DigitalOcean piece, is a very experienced documenter. From his rather impressive resume, I'd be inclined to trust what he posts. This is the difference between asking a question (formulating a query potentially with the "want recursion" bit set) and then doing the work of chasing down all the different stake-holders necessary to answer the question (performing the recursive query) VS handing the query off to a 3'rd party and letting them do the dirty work (forwarding) -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Spam by IP-address? Spamassassin with geoiplookup?
On Thu, 22 Sep 2016, Thomas Barth wrote: And what about filter poisening? In the last 10 hours my company address got 43 mails classified as spam (even a virus mail detected today). And there was one mail classified as spam due to my rule (bad country, message-id. X-Spam-Status: Yes, score=7.474 tag=2 tag2=6.31 kill=6.31 tests=[MESSAGEID_LOCAL=3, RDNS_NONE=1.274, RELAYCOUNTRY_BAD=3.2] autolearn=no autolearn_force=no The content of the mail is: From: "Lupe Monroe" To: "my boss address" Subject: Payment approved MIME-Version: 1.0 Content-Type: multipart/related; boundary="boundary_af9c8db46eb73fca8b315aafef01" Message-Id: <20160922063255.e11d3e5...@static.vnpt.vn.local> Date: Thu, 22 Sep 2016 06:32:55 +0700 --boundary_af9c8db46eb73fca8b315aafef01 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 8bit Dear so, Your payment has been approved. Your account will be debited within two days. You can email us for any query regarding your account. Thank you. Lupe Monroe Support --boundary_af9c8db46eb73fca8b315aafef01 Content-Type: application/x-zip-compressed; name="e6dfa16bdb.zip.virus-scan-me.virus-scan-me" Content-Transfer-Encoding: base64 Content-Disposition: attachment; filename="e6dfa16bdb.zip.virus-scan-me.virus-scan-me" There is no spam content, am I right? Normal words and content that a normal person can use. I dont need spam learning for all the mails already classified as spam with high score. Spam with low score are interesting for spam learning like this one. But when I use these mails for spam learning there is a risk of false positive some day, because it has learned that normal mails are also spam? You are missing the point that Bayes uses more than just body words from a message. It also looks at headers and meta-data. So those particular body words could become "neutral" (neither spam nor ham indicators) but the other components of that message (such as that '.vn.local' message ID) would be learned as spam signs. 
This is why you MUST also train your Bayes with HAM messages (and train them with the --ham flag) so Bayes knows how to recognise 'hammy' or 'neutral' tokens to prevent false-positives. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Spam by IP-address? Spamassassin with geoiplookup?
On Thu, 22 Sep 2016, Thomas Barth wrote:

Hi ho, a virus was found: Sanesecurity.Malware.26327.JsHeur.UNOFFICIAL Scanner detecting a virus: ClamAV-clamd Content type: Virus Internal reference code for the message is 35123-18/WRf_y9XIIOFq First upstream SMTP client IP address: [103.230.105.6] According to a 'Received:' trace, the message apparently originated at: [103.230.105.6], [103.230.107.6] unknown [103.230.105.6]

You REALLY should get your DNSBL problem fixed. Once you get DNSBLs working it will help a lot. That particular IP address hit almost a dozen different RBLs here, including some that I use at the SMTP level to outright block incoming traffic (such as cbl.abuseat.org, Spamhaus PBL, SBL).

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: scan an HTML file, possible?
On Wed, 3 Aug 2016, Robert Boyl wrote:

Hi, everyone I have a very nice regex a friend passed me that catches those emails that have an HTML attached with a redirect html command to some malefic website. He has some tool in Exim that scans text in attachments. But I wanted to use a spamassassin rule. Is there some plugin/way in Spamassassin to scan text of an html attachment?

You can write 'full' rules that will work with raw HTML in recognized html attachments. The problem is that SA has business logic that ignores non-textual attachments, and that can be fooled by mime-typing. So if the attachment has a mime-type of "text/html" SA will scan it. If it has a mime-type of "application/octet-stream" SA will ignore it, but if the attachment has a filename ending in ".htm" most client programs will treat it as HTML and open it as such. I once wrote a rule to detect such obfuscation but it had too many FPs.

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
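The mismatch described above (declared type "application/octet-stream" but a filename ending in ".htm") is easy to spot on a saved message with Python's standard email parser. This is a sketch for inspecting test messages, not a SpamAssassin rule, and as noted above the same check used as a spam sign produced too many FPs. The function name and the sample message are made up for illustration.

```python
from email import message_from_string


def disguised_html_parts(raw_message):
    """List filenames of parts typed as octet-stream but named like HTML."""
    msg = message_from_string(raw_message)
    hits = []
    for part in msg.walk():
        filename = part.get_filename() or ""
        if (part.get_content_type() == "application/octet-stream"
                and filename.lower().endswith((".htm", ".html"))):
            hits.append(filename)
    return hits


# Hypothetical two-part message with an HTML file declared as octet-stream.
SAMPLE = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "see attachment\n"
    "--XYZ\n"
    "Content-Type: application/octet-stream\n"
    'Content-Disposition: attachment; filename="invoice.htm"\n'
    "\n"
    "<html><body>hi</body></html>\n"
    "--XYZ--\n"
)
```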
Re: Paragraph Length Limit (new rule)
Use a 'full' not 'rawbody' rule. IE:

  full B_PLL /(?:(?!<\/p>).){999,}<\/p>/msi

Why are you doing a "tflags __B_PLL multiple maxhits=1"? If you have "maxhits=1" what's the point of "multiple" at all?

On Wed, 3 Aug 2016, Ruga wrote:

Hello, We received a new type of spam, twice, and we are not willing to give them a third chance. The body includes a long html paragraph (...) of headlines from the news. The following works at the command line:

  perl -p0e 's/((?:(?!<\/p>).){999,}<\/p>)/-->$1<--/msig' example.eml
  perl -n0e '/((?:(?!<\/p>).){999,}<\/p>)/msig and print "--->$1<---"' example.eml

The following SA rule, however, does not work at all:

  rawbody  __B_PLL  /(?:(?!<\/p>).){999,}<\/p>/msi
  tflags   __B_PLL  multiple maxhits=1
  meta     B_PLL    __B_PLL
  describe B_PLL    Body: Paragraph Length Limit
  score    B_PLL    1.0

I would be most grateful if you could spot the bug in the above rule.

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: SA bayes file db permission issue
On Sat, 11 Jun 2016, RW wrote:

On Fri, 10 Jun 2016 15:38:44 -0400 Joseph Brennan wrote:

This is a nice test I found:

  echo -n I | od -to2 | awk '{ print substr($2,6,1); exit}'

  1 = little-endian
  0 = big-endian

I don't see how this can output anything other than 1. Endianness is about the addressing of bytes within integer words. This is looking at the ordering of human-readable octal digits displaying the contents of a single byte.

On big-endian system:

  $ echo -n I | od -to2
  0000000 044400
  0000001

On little-endian system:

  # echo -n I | od -to2
  0000000 000111
  0000001

So it works. It's a single data byte but since the display field is a two byte object, where within that two byte object does that single byte show up?

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
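The same effect can be reproduced without od by padding the single byte to a two-byte word and reading that word in each byte order. This is a sketch that mimics the octal display only; the function name od_word is made up for illustration.

```python
import struct


def od_word(byte_char, byteorder):
    """Show one data byte, zero-padded to a 2-byte word, as 6 octal digits."""
    data = byte_char.encode("ascii") + b"\x00"
    # '<H' reads the two bytes as a little-endian 16-bit word, '>H' as big-endian.
    fmt = "<H" if byteorder == "little" else ">H"
    (word,) = struct.unpack(fmt, data)
    return format(word, "06o")
```

The letter 'I' is octal 111, so the little-endian reading puts it in the low byte (sixth digit '1') and the big-endian reading puts it in the high byte (sixth digit '0'), which is the digit the awk substr() picks out. Passing sys.byteorder as the second argument shows the result for the machine the script runs on.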
Re: Spamassassin not capturing obvious Spam
OK, So you are testing to see how SA scores artificial mail messages. However SA is designed to evaluate real mail messages, not botched fabrications of them, so I don't understand what you are trying to achieve. You have (either deliberately or unknowingly) omitted the necessary information that SA needs to perform meaningful network based tests. If you want to test SA with network based tests explicitly disabled there are command line (or configuration) options to achieve that. When you use those options it causes SA to "shift gears" and changes how various remaining parts are utilized. So in a way you are crippling SA by withholding info it needs for network based tests but not telling it that you are doing that so it doesn't "know" to bring full force of the non-network components to bear. I'm not surprised that its performance is sub-par in this situation. What are you trying to achieve with this artificial scenario? On Mon, 30 May 2016, Shivram Krishnan wrote: 1) The message is indeed fabricated. I had to generate a RFC 2822 mail from JSON. I am harvesting SPAM mails from mailinator.com (public email's). So that is an error in my generation of the RFC 2822. I did not change it as spamassassin did not assign a score. 2) I have set a threshold of -10 to see how spamassassin assigns a score for every mail. On Mon, May 30, 2016 at 8:25 PM, Dave Funk wrote: That message is either a fabrication or something from a messed up system. There's no sign of an IP address (neither IPv4 nor IPv6) in it. There are two identical 'Received:' headers which have '()' where there should be at least the IP address of the incoming connection. This indicates that the message has either been tampered with or is from a postfix system that somebody has messed up the configuration. 
-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Spamassassin not capturing obvious Spam
That message is either a fabrication or something from a messed up system. There's no sign of an IP address (neither IPv4 nor IPv6) in it. There are two identical 'Received:' headers which have '()' where there should be at least the IP address of the incoming connection. This indicates that the message has either been tampered with or is from a postfix system that somebody has messed up the configuration. On Mon, 30 May 2016, Shivram Krishnan wrote: Hey guys, I am testing spamassassin on a SPAM/HAM corpus of mails. Spamassassin is not picking up an obvious spam like in this case http://pastebin.com/MbNRNFWy . I have followed the guidelines on https://wiki.apache.org/spamassassin/ImproveAccuracy . Let me know how to catch these type of Spams. It would be interesting to know what your spamassassin assigns the score for this spam. spamassassin assigned this score - Content analysis details: (3.9 points, -10.0 required) pts rule name description -- -- 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% [score: 0.4292] 0.0 HTML_MESSAGE BODY: HTML included in message 0.7 MIME_HTML_ONLY BODY: Message only has text/html MIME parts 0.4 HTML_MIME_NO_HTML_TAG HTML-only message, but there is no HTML tag 0.0 UNPARSEABLE_RELAY Informational: message has unparseable relay lines 2.0 XPRIO Has X-Priority header Notice that none of the other body tags are triggered. Thanks, Shivram -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: malware campaign: javascript in ".tgz"
On Thu, 21 Apr 2016, Reindl Harald wrote: [snip..] Content-Type: application/octet-stream; name="0005500922.tgz" I wonder how common octet-stream is with legitimate .tgz files sadly you need to expect "application/octet-stream" for nearly any filetype, learned the hard way by doing mime-checks on webservers +1 for this, similar experience here. I've seen "application/octet-stream" typing on ".htm" components of mail messages created by major brand e-mail clients. The lazy authors assume that the correct file extension is all that is needed. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: HEADER_HOST_IN_BLACKLIST
On Sat, 12 Mar 2016, @lbutlr wrote: Where is the blacklist for HEADER_HOST_IN_BLACKLIST? I am hitting that on a non-spam mail from email.amctheatres.com It is the result of somebody putting that hostname in a 'enlist_uri_host' directive in your local SA configuration. Look up enlist_uri_host in your SA Conf documentation. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Missed spam, suggestions?
RANK  RULE NAME            COUNT  %OFRULES  %OFMAIL  %OFSPAM  %OFHAM
   1  HTML_MESSAGE         16473      9.13    50.51    87.85   90.80
   2  DKIM_SIGNED          13776      7.64    42.24    13.81   75.93
   3  TXREP                13228      7.33    40.56    91.00   72.91
   4  DKIM_VALID           12962      7.19    39.74    11.93   71.44
   5  RCVD_IN_DNSWL_NONE    9941      5.51    30.48     8.08   54.79
   6  DKIM_VALID_AU         8711      4.83    26.71     7.99   48.01
   7  BAYES_00              8390      4.65    25.72     1.84   46.24
   8  RCVD_IN_JMF_W         7369      4.09    22.59     2.54   40.62
   9  RCVD_IN_MSPIKE_WL     6713      3.72    20.58     4.39   37.00
  10  BAYES_50              6201      3.44    19.01    25.56   34.18

Based upon your stats it looks like you need more Bayes training. Your Bayes 00/99 hits should rank higher in the rules-fired stats and BAYES_50 shouldn't be in the top-10 at all. (Of course if you've only been training for a week that would explain it.)

For example, here's my top-10 hits (for a one month interval).

TOP SPAM RULES FIRED
--
RANK  RULE NAME            COUNT  %OFMAIL  %OFSPAM  %OFHAM  S/O
--
   1  T__BOTNET_NOTRUST   114907    60.32    86.81   42.66  0.5755
   2  BAYES_99            109138    32.98    82.45    0.01  0.9998
   3  BAYES_999           104903    31.70    79.25    0.01  0.
   4  HTML_MESSAGE         90850    79.41    68.63   86.59  0.3456
   5  URIBL_BLACK          90845    27.61    68.63    0.27  0.9942
   6  T_QUARANTINE_1       90640    27.40    68.47    0.02  0.9996
   7  URIBL_DBL_SPAM       79152    24.02    59.79    0.17  0.9956
   8  KAM_VERY_BLACK_DBL   74301    22.45    56.13    0.00  1.
   9  L_FROM_SPAMMER1k     73667    22.26    55.65    0.00  1.
  10  T__RECEIVED_1        72413    42.60    54.70   34.54  0.5135

TOP HAM RULES FIRED
--
RANK  RULE NAME            COUNT  %OFMAIL  %OFSPAM  %OFHAM  S/O
--
   1  BAYES_00            182674    56.03     2.11   91.97  0.0150
   2  HTML_MESSAGE        171992    79.41    68.63   86.59  0.3456
   3  SPF_PASS            136623    63.08    54.52   68.78  0.3457
   4  T_RP_MATCHES_RCVD   130879    53.75    35.54   65.89  0.2644
   5  T__RECEIVED_2       125492    53.76    39.62   63.18  0.2947
   6  DKIM_SIGNED         114808    38.57     9.72   57.80  0.1008
   7  DKIM_VALID          105385    34.70     7.16   53.06  0.0825
   8  RCVD_IN_DNSWL_NONE   92951    29.90     4.56   46.80  0.0609
   9  T__BOTNET_NOTRUST    84741    60.32    86.81   42.66  0.5755
  10  KHOP_RCVD_TRUST      84623    26.44     2.19   42.60  0.0331

Note how highly BAYES 00/99 ranked. What you don't see is that BAYES_50 is way down in the mud (below 50 rank).
BTW, this is with a Bayes that is mostly fed via auto-learning. I occasionally hand feed corner cases that get mis-classified (usually things like phishes, or conference announcements that can look shaky).

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{

> Robert Chalmers rob...@chalmers.com.au Quantum Radio: http://tinyurl.com/lwwddov Mac mini 6.2 - 2012, Intel Core i7,2.3 GHz, Memory:16 GB. El-Capitan 10.11. XCode 7.2.1 2TB: Drive 0:HGST HTS721010A9E630. Upper bay. Drive 1:ST1000LM024 HN-M101MBB. Lower Bay
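For readers wondering how the S/O column relates to the hit counts: the figures in the two tables above are consistent with S/O being a rule's spam hit count divided by its total (spam plus ham) hit count. That formula is inferred from the numbers in this post, not taken from the stats script itself, so treat this Python sketch as an illustration under that assumption.

```python
def so_ratio(spam_hits, ham_hits):
    """Fraction of a rule's hits that landed on spam (assumed meaning of S/O)."""
    total = spam_hits + ham_hits
    return spam_hits / total if total else 0.0

# Counts copied from the spam table and the ham table quoted above.
print(round(so_ratio(90850, 171992), 4))   # HTML_MESSAGE      -> 0.3456
print(round(so_ratio(114907, 84741), 4))   # T__BOTNET_NOTRUST -> 0.5755
```

A well-trained BAYES_99 sits close to 1.0 on this scale and BAYES_00 close to 0.0, which is what the tables show.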
Re: Interesting rule combo results
On Tue, 8 Mar 2016, Marc Perkel wrote:

> This is the for what it's worth department. I've generated the following rules combination lists. The ham list are rule combinations sorted by the number of ham hits that have 0 spam hits. The spam list are rule combinations sorted by the number of spam hits that have 0 ham hits. There are some of my personal rules mixed in.
>
> Just posting this just to see if anyone sees any value in this.
>
> SPAM RULES:
>
> 11648 HTML_MESSAGE RAZOR2_CF_RANGE_51_100 SUBJ_GROUP
> 11308 HTML_MESSAGE RAZOR2_CF_RANGE_E8_51_100 SUBJ_GROUP
> 11212 RAZOR2_CF_RANGE_51_100 RAZOR2_CF_RANGE_E8_51_100 SUBJ_GROUP
> 10749 RAZOR2_CF_RANGE_51_100 RAZOR2_CHECK SUBJ_GROUP
> 10646 RAZOR2_CF_RANGE_E8_51_100 RAZOR2_CHECK SUBJ_GROUP
> 5042 DKIM_VALID MIME_HTML_ONLY MISSING_DATE
> 5024 DKIM_VALID_AU MIME_HTML_ONLY MISSING_DATE
> [snip..]
>
> HAM RULES:
>
> 132983 DKIM_SIGNED MAILTO_LINK RDNS_DYNAMIC
> 132558 DKIM_VALID MAILTO_LINK RDNS_DYNAMIC
> 131916 DKIM_VALID_AU MAILTO_LINK RDNS_DYNAMIC
> [snip..]
> 80056 HTML_MESSAGE
> 78472 DKIM_SIGNED MAILTO_LINK UNPARSEABLE_RELAY
> 77994 DKIM_VALID MAILTO_LINK UNPARSEABLE_RELAY
> 77635 DKIM_VALID_AU MAILTO_LINK UNPARSEABLE_RELAY
> 76959 HTML_MESSAGE RDNS_DYNAMIC UNPARSEABLE_RELAY
> 72949 MAILTO_LINK RDNS_DYNAMIC UNPARSEABLE_RELAY
> 59189 DKIM_SIGNED
> 56792 DKIM_VALID
> [snip..]

Marc,

Maybe I'm misunderstanding your list but it looks like you've got HTML_MESSAGE by itself in the HAM RULES (IE zero spam hits on HTML_MESSAGE) but you've also got a rule combo of HTML_MESSAGE RAZOR2_CF_RANGE_51_100 SUBJ_GROUP as the top SPAM RULES (which implies that there is SPAM that hits HTML_MESSAGE too). Similar situation for DKIM_SIGNED & DKIM_VALID.

Also how can you have 132983 hits on the combo of DKIM_SIGNED MAILTO_LINK RDNS_DYNAMIC but only 59189 hits on DKIM_SIGNED by itself?

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
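The consistency question above can be checked mechanically. The Python sketch below assumes a combination is counted whenever all of its rules fire on a message; Marc's actual script is not shown in the thread, so that counting scheme and the sample data are assumptions. Under that scheme a combination can never score more hits than any one of its member rules.

```python
from collections import Counter
from itertools import combinations

def combo_counts(messages, max_size=3):
    """Count every rule combination (up to max_size rules) across messages.

    `messages` is an iterable of sets of rule names that fired on each message.
    """
    counts = Counter()
    for rules in messages:
        ordered = sorted(rules)          # fixed order so each combo has one key
        for size in range(1, max_size + 1):
            for combo in combinations(ordered, size):
                counts[combo] += 1
    return counts

# Made-up sample data, for illustration only.
ham = [
    {"DKIM_SIGNED", "MAILTO_LINK", "RDNS_DYNAMIC"},
    {"DKIM_SIGNED", "MAILTO_LINK", "RDNS_DYNAMIC", "HTML_MESSAGE"},
    {"DKIM_SIGNED", "HTML_MESSAGE"},
]
counts = combo_counts(ham)
triple = ("DKIM_SIGNED", "MAILTO_LINK", "RDNS_DYNAMIC")
print(counts[triple])             # 2
print(counts[("DKIM_SIGNED",)])   # 3
```

If a list shows a three-rule combination with more hits than one of its own rules alone, the single-rule line is probably being counted differently (for example "only this rule fired"), which would explain the numbers Dave is asking about.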
Re: URIBL/DNSBL from a database
On Sat, 13 Feb 2016, Alex wrote:

> I've now got rbldnsd implemented. I've also known for a while it's faster/better than bind, but bind has always been in place.
>
> I have rbldnsd running on port 530, alongside bind on 53. How do I specify a urirhsbl in spamassassin to query the DNS server running on 530 instead of 53?

One way to do this is to set up a "forward only" zone in your bind config.

For example, assume you're authoritative for "example.com" and you've got your rbldnsd set up to serve up your data as zone "mybl.example.com" and it's bound to 192.168.124.23/530

Then in your bind config file create a zone:

    zone "mybl.example.com" {
        type forward;
        forward only;
        forwarders { 192.168.124.23 port 530; };
    };

Then when your clients (spamd or regular dns tools) query "blah.com.mybl.example.com" it will hit your bind and then get passed on to your rbldnsd for an answer.

If you want to hide that resource from the world put that zone in a private 'view' in your bind. You could control access via an ACL but by putting it inside a private view they'll never even see it to try pounding on it.

To provide fault tolerance, you can set up rbldnsd's on multiple machines and put multiple addresses in that 'forwarders' stanza. You will need to put that zone definition in your primary bind and each secondary.

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
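To see what a client actually asks bind for in this setup, here is a small Python sketch. The zone name "mybl.example.com" comes from the post; the helper names are made up for illustration, and `is_listed` goes through the system resolver, so it only gives a meaningful answer on a host whose resolver is the bind instance carrying the forward zone.

```python
import socket

def rhsbl_query_name(domain, zone="mybl.example.com"):
    """Build the DNS name to look up for `domain` in a right-hand-side blocklist zone."""
    return domain.strip(".").lower() + "." + zone.strip(".")

def is_listed(domain, zone="mybl.example.com"):
    """Return True if the zone answers with an A record for the domain.

    Not called here: it needs a resolver that can reach the zone.
    """
    try:
        socket.gethostbyname(rhsbl_query_name(domain, zone))
        return True
    except socket.gaierror:
        return False

print(rhsbl_query_name("blah.com"))   # blah.com.mybl.example.com
```

Because the forwarding is done inside bind, neither this sketch nor spamd needs to know about port 530 at all.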
Re: Question about spam report header
You can do that but it requires editing all your rule files, although then you see those matches in all your reports.

If you just want to test one particular message, just use the -D option to spamassassin and grep for ' got hit: '

Mar 11 21:51:44.203 [5074] dbg: rules: ran header rule __MIME_VERSION ==> got hit: ""
Mar 11 21:51:44.204 [5074] dbg: rules: ran header rule __TO_HEADER_EXISTS ==> got hit: "<"
Mar 11 21:51:44.204 [5074] dbg: rules: ran header rule __TOCC_EXISTS ==> got hit: ""
Mar 11 21:51:44.204 [5074] dbg: rules: ran header rule __KAM_UPS2 ==> got hit: "negative match"
Mar 11 21:51:44.204 [5074] dbg: rules: ran header rule __KAM_JURY3 ==> got hit: "negative match"
Mar 11 21:51:44.205 [5074] dbg: rules: ran header rule __HAS_FROM ==> got hit: ""

(Yes, Marc, you probably already know this, this is for the other people who might be following this thread ;)

On Tue, 2 Feb 2016, Marc Perkel wrote:

> Never mind I found that if I change __ to T_ that it does what I want.
>
> On 02/02/16 18:05, Marc Perkel wrote:
>> On 02/02/16 17:55, Marc Perkel wrote:
>>> Normally SA creates a header that has a list of the names of rules that matched. It skips the listing of hidden rules that start with __ . Is there a command where I can easily tell SA to include the hidden rules in the report in the headers so I can see all of it?
>>
>> I'm also - I suppose asking it to list rules that match that produce no scores.
>>
>> body __LATE_RICH_RELATIVE /\blate .{0,15}(?:father|wife|widow|husband|general|president|daughter|son|minister|client)/i
>> body __CT_CLICK /\b(click(ing)? (here|now|this|on|below|.{0,9}(hyper)?link))|visit(ing)?this link\b/i
>> body __BENEFICIARY /\bbeneficiary\b/i
>> body __CT_BEGGER /\b(kind assist[ae]nce|feed my family|need (of )?your help|donat(e|ion))\b/i
>> body __CT_CONTACT /\b((contact(?:ing) you|contact (information|me|email|number|us)|your contact))|to (inform|email) you/i
>> body __CT_REPLY_TO_ME /\b(reply to me|please reply|my email address|private email|contact me|prompt response|reply from you|hearing from you|assist me)/i
>> body __CT_DYING /\b(diagnosed with|months to live|dying of|transplant)\b/i
>> body __CT_UNITED_NATIONS /\bUnited Nations?\b/i
>>
>> meta __CT_STRANGER CT_MY_NAME_IS || CT_DEAR_FRIEND || CT_DEAR_SOMETHING || CT_SIR_MADAM || CT_INTRODUCE
>> meta __CT_MONEY CT_TRANSFER_MONEY || CT_THE_SUM_OF || CT_EARN_MONEY || LOTS_OF_MONEY || MILLION_USD || FUZZY_MILLION || GIVE_YOU_MONEY || __CT_BANK || BILLION_DOLLARS || US_DOLLARS_2 || ADVA$
>> meta __CT_VICTIM __BENEFICIARY || CT_LATE_PRESIDENT || CT_LATE_RICH_RELATIVE || __CT_DYING
>> meta __CT_FORM FILL_THIS_FORM || FILL_THIS_FORM_LONG || T_FILL_THIS_FORM_SHORT
>> meta __CT_CONFIDENTIAL CT_PRIVATE_EMAIL || CT_PRIVATE_PHONE || CONFIDENTIAL_SCAM1 || CONFIDENTIAL_SCAM2
>> meta __CT_NOW CT_ACT_NOW || CT_DO_IT_TODAY || CT_URGENT_RESPOND
>>
>> meta CT_GOD_BENEFICIARY __CT_GOD && __CT_VICTIM
>> describe CT_GOD_BENEFICIARY God and Beneficiary
>> score CT_GOD_BENEFICIARY 4
>>
>> meta CT_GOD_BEGGER __CT_GOD && __CT_BEGGER
>> describe CT_GOD_BEGGER Begging in Religious Language
>> score CT_GOD_BEGGER 3

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: OUTPUT OF SPAMASSASSIN
On Sun, 24 Jan 2016, Reindl Harald wrote:

> On 24.01.2016 at 20:45, Shawn Bakhtiar wrote:
>> On Jan 24, 2016, at 11:29 AM, Martin Gregorie wrote:
>>> On Mon, 2016-01-25 at 00:07 +0530, Sarang Shrivastava wrote:
>>>> I am just a newbie who has started using SA. Someone on the mailing list suggested me to use -D option. So if this option is for debugging then how do we classify it ?
>>>
>>> You don't classify it: that's SA's job. It only scores messages and sets the Yes/No flag before adding the X-Spam-* headers to the message. Nothing else. What you do with mail that SA has classified as spam is the responsibility of your additional software and/or your users.
>>
>> [snip..]
>
> * the point is that he is analyzing *local* files
> * so he needs to pass eml files to spamc/spamassassin
> * SA adds a header "X-Spam-Flag: Yes" in case of it reached spam-score
> * that output needs to be parsed
> * that's it

Simpler yet, get spamd running and just use "spamc -c < mail.eml". It emits a score and sets the exit code. No "parsing" needed, just test the exit code.

EG, suppose I have two messages, one known ham "ham.eml" and one known spam "spam.eml". Then:

    if (spamc -c < spam.eml) ; then
        echo "is ham"
    else
        echo "is spam"
    fi

will execute the 'echo "is spam"' clause, and if you feed it the ham.eml it will execute the 'echo "is ham"' clause.

(This presupposes a bash shell variant; coding for other shell types is left as an exercise for the reader. ;)

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
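For anyone driving this from a script rather than a shell, the same exit-code test can be sketched in Python. This assumes spamd is running and spamc is on the PATH, and it relies on the behaviour described above (as I read the spamc man page: exit status 1 for spam, 0 otherwise). The function names are made up for illustration.

```python
import subprocess

def verdict_from_exit_code(code):
    """Map the exit status of `spamc -c` to a label (1 means spam, 0 means not spam)."""
    return "spam" if code == 1 else "ham"

def check_file(path):
    """Feed one message file to `spamc -c` and return 'spam' or 'ham'.

    Not called here; it needs a working spamc/spamd installation.
    """
    with open(path, "rb") as msg:
        result = subprocess.run(["spamc", "-c"], stdin=msg,
                                stdout=subprocess.DEVNULL)
    return verdict_from_exit_code(result.returncode)

print(verdict_from_exit_code(1))   # spam
print(verdict_from_exit_code(0))   # ham
```

One caveat worth keeping in mind: as far as I know, `spamc -c` also exits 0 when it cannot reach spamd, so a "ham" verdict from this check does not prove the message was actually scanned.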
Re: Help with RegEx Rule
On Sun, 20 Sep 2015, AK wrote:

[..snip..]
> Still no joy after removal. However, at least the rule now hits if I replace:
>
> /(^\.\n){5,}/ with /(^\.\n)*/
>
> But that looks like it might bring about some FPs. Any other suggestions?

Do you realize that rule will -always- fire on -any- message? The '*' repeat operator is "zero or more" instances. So that pattern degenerates to // which will match everything. Guaranteed FP generator.

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
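The "zero or more" point is easy to demonstrate outside SA. The sketch below uses Python's `re` module as a stand-in for Perl's regex engine (the two agree on this behaviour); the sample strings are made up, and the MULTILINE flag is switched on for the `{5,}` variant only so that the two quantifiers can be compared on equal footing.

```python
import re

always = re.compile(r"(^\.\n)*")                       # zero or more: empty match allowed
at_least_5 = re.compile(r"(^\.\n){5,}", re.MULTILINE)  # needs five real dot-only lines

ordinary = "Hello,\nthis message has no dot-only lines.\n"
dotted = "intro\n" + ".\n" * 6 + "outro\n"

def fires(pattern, text):
    """Return True if the pattern matches anywhere in text, as a rule would."""
    return pattern.search(text) is not None

print(fires(always, ordinary))       # True  (matches the empty string)
print(fires(always, dotted))         # True
print(fires(at_least_5, ordinary))   # False
print(fires(at_least_5, dotted))     # True
```

The first line is the false-positive generator: the `*` version "hits" on a message that contains no dot-only lines at all.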
Re: Help with RegEx Rule
On Sun, 20 Sep 2015, AK wrote:

> Hi all. I'm getting hit with lots of JUNK mail that has multiple lines with just a '.' on several lines [0]. Most of the JUNK email has at least 5 and at most 10 lines (so far) with just this '.' character somewhere in the middle of the message.
>
> I've copied the message source to RegexBuddy [1] and have been able to come up with a regex that matches what I want using the Perl 5.20 engine:
>
> (^\.\n){5,}
>
> However, adding this rule to /etc/spamassassin/local.cf doesn't hit at all when I run it against my test message as follows:
>
> = Start Rule Block =
> rawbody __MANY_PERIODS_1 ALL =~ /(^\.\n){5,}/
> meta MANY_PERIODS __MANY_PERIODS_1
> score MANY_PERIODS 2.0
> describe MANY_PERIODS JUNK mail with several lines that contain single dot
> = End Rule Block =
>
> = Begin Test Command =
> spamassassin -L -t test.msg
> = End Test Command =
>
> Please help me understand what I'm doing wrong as this is my first attempt at creating a rule. Previously I've just copied and pasted what I've found here in the forums, but this time I'm trying to do it myself but failing.
>
> Regards,
> ak.

SA does some interesting pre-processing on mail messages before applying rules, so you need to understand that. Try this:

    rawbody  T__LOCAL_MANY_PERIODS  /\n(?:\.\n){5}?/
    describe T__LOCAL_MANY_PERIODS  Many lines with just a single "dot"

Notes:

1) Due to SA pre-processing collapsing the body into one long line, you cannot match on '^' repeatedly; you need to look for '\n' as the line break indicator. Find the start of a line and then the following repeats of ".\n".

2) Use '(?:' as a grouping optimization unless you care about capture.

3) For the terminal match clause use '{5}' not '{5,}', as we're done as soon as we see at least 5 matches and don't care if there are more.

4) Use the "non-greedy" match quantifier '}?' to look for the first hit on that pattern and not try to go for more.

Un-optimised pattern: /\n(\.\n){5}/

Note the use of the "testing" rule name format, that "T_" prefix. Remove the leading 'T' to make it into a silent rule for combining with metas.

Personal convention: I interpolate '_LOCAL_' (or '_L_') in locally created rule names to distinguish them for debugging. Then when things don't work as expected (EG: FPs) it helps to determine if the problem is self-inflicted.

Final note: now that we've discussed this spam sign, it will probably become useless as spammers follow this list and mutate their crap accordingly to dodge our rules. ;(

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
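The difference between the '^'-anchored attempt and the '\n'-based pattern can be tried out on a plain string. This is a sketch only: Python's `re` stands in for Perl, the body text is made up, and it models just the "one long string, no multi-line mode" situation described in note 1, not SA's real pre-processing.

```python
import re

ANCHORED = re.compile(r"(^\.\n){5,}")        # original attempt: ^ only matches at string start
MANY_PERIODS = re.compile(r"\n(?:\.\n){5}")  # suggested form: \n marks each line break

def has_many_periods(body):
    """Return True if body contains five consecutive dot-only lines."""
    return MANY_PERIODS.search(body) is not None

spammy = "Some text\n" + ".\n" * 7 + "More text\n"
few = "Some text\n" + ".\n" * 4 + "More text\n"

print(ANCHORED.search(spammy) is not None)   # False
print(has_many_periods(spammy))              # True
print(has_many_periods(few))                 # False
```

Note that the '\n' prefix means a run of dot lines at the very top of the body would not match; for the junk described here (dots "somewhere in the middle of the message") that is not a problem.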
Re: URIBL_BLOCKED while using local BIND
However you did not empty your ISP's DNS server cache. That 2 msec response time is from its cache; the 543 msec for your server is when the answer is not in your server's cache. So you're not making a fair comparison.

A response from a cache is always going to be faster, that's why people use caching servers. However with everybody & his cat using your ISP's server it gets query blocked and thus is caching the bad (blocked) response. So either you get bad data fast or good data slowly.

Once you get a second spam with similar contents, queries for that copy will be in your cache and be fast.

Given that a modern SA parallelizes DNS queries, a somewhat slow DNS response (hundreds of msec) won't have too much overall effect on the spam processing time.

On Tue, 15 Sep 2015, Marc Richter wrote:

> Yes
>
> On 15.09.2015 at 13:30, Axb wrote:
>> On 09/15/2015 01:23 PM, Marc Richter wrote:
>>> Also, you shouldn't make assumptions without measuring something:
>>>
>>> 1. without forwarding:
>>> ;; Query time: 543 msec
>>> ;; SERVER: 127.0.0.1#53(127.0.0.1)
>>>
>>> 2. with forwarding to my ISP's servers:
>>> ;; Query time: 2 msec
>>> ;; SERVER: 127.0.0.1#53(127.0.0.1)
>>>
>>> That's 271 times faster than root-servers's lookup.
>>
>> did you EMPTY cache after each query?

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Bayes Filtering
On Sun, 2 Aug 2015, Christian Jaeger wrote:

> On August 2, 2015 6:40:10 PM CEST, Reindl Harald wrote:
>> no idea what you are talking about by saying "I can't find anything about this in the docs"
>
> I'm talking about the bundled docs. The man / perldoc pages of Mail::SpamAssassin::Plugin::Bayes / Mail::SpamAssassin::*Bayes* and the default config files. That's where I expected this info to be. It's something simple and basic, i.e. something that the writer of the software can foresee the need for documentation, so it makes sense that it's in the same files that the programmers wrote. That's where I start looking. That's where qpsmtpd, which I'm configuring around the same time, has its basic docs.
>
> Ch.

In the man page for the spamassassin config file there is a paragraph:

    bayes_min_ham_num (Default: 200)
    bayes_min_spam_num (Default: 200)
        To be accurate, the Bayes system does not activate until a certain number of ham (non-spam) and spam have been learned. The default is 200 of each ham and spam, but you can tune these up or down with these two settings.

You might argue about the clarity, but the info is there.

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Classifying mail as unsolicited
On Mon, 6 Jul 2015, Alex wrote:

> Hi,
>
> We have a system with a few hundred users, many of which forward their mail off the server to their gmail or yahoo account. Lately I've started to notice quite a few messages are being tagged by gmail and delayed being received as unsolicited.
>
> I know the KAM rules contain a marketing rule, and razor helps too, but too many of these marketing messages are not being tagged.
>
> I'm referring to warnings such as this:
>
> Jul 6 22:54:20 bwipropemail postfix/smtp[25057]: C09F4885EA2BC: to=<44...@gmail.com>, orig_to=<44...@example.com>, relay=alt1.gmail-smtp-in.l.google.com[173.194.208.26]:25, delay=38223, delays=38220/1.3/1/0.22, dsn=4.7.0, status=deferred (host alt1.gmail-smtp-in.l.google.com[173.194.208.26] said: 421-4.7.0 [66.XXX.XXX.100 15] Our system has detected an unusual rate of 421-4.7.0 unsolicited mail originating from your IP address. To protect our 421-4.7.0 users from spam, mail sent from your IP address has been temporarily 421-4.7.0 rate limited. Please visit 421-4.7.0 https://support.google.com/mail/answer/81126 to review our Bulk Email 421 4.7.0 Senders Guidelines. 5si23309629qks.82 - gsmtp (in reply to end of DATA command))

Yes, gmail does that to almost anything they decide is relayed spam.

> Here is an example message: http://pastebin.com/kaD3AQMz

It came from ymlpsv.net. Blacklist them (and their other names such as ymlpsv.com, ymlpsrv.net, ymlpserver.net, ymlpsrv.com) unless one of your clients -really- wants crap from them, then selectively whitelist. They are a spammy MSP. I regularly find garbage from them in my spamtraps.

> I realize bayes may be a problem on this one, but do you have any suggestions for blocking these more effectively before they're forwarded on to gmail?

As others have alluded to, forwarding opens up a whole can-of-worms but forwarding to gmail is the most problematic.
-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: local.cf, user_prefs etc
On Thu, 21 May 2015, Dmitry Baronov wrote:

> Hello folks!
>
> I use 3.4.1 freebsd version with compiled rules. Please, give me advice how I could use local config file to override downloaded default values? All my attempts were unsuccessful. I placed local.cf and user_prefs files in
>
> /root/spamassassin
> /etc/mail/spamassassin
> /usr/local/etc/mail/spamassassin
> /usr/local/share/spamassassin
>
> - no way to replace default values like blacklist_from or required_score. I need help :)
>
> Rgds, db

First, how are you using spamassassin?

Are you using the 'spamd' daemon and feeding it spam via "spamc" (from a procmail recipe or a postfix filter, or a milter)?

Are you using the spamassassin program itself from procmail?

Are you using some kind of dedicated mail filtering package such as mimedefang or amavis which instantiates an instance of spamassassin within its own process via the spamassassin APIs?

The first two methods use the standard spamassassin config files; the last one may ignore standard spamassassin config files and use its own.

For the first two you need to determine which config files are being used, as it's possible that your SA kit was built with non-standard internal settings. Invoke spamassassin with the "--lint -D" flags and it will tell you which config files it's using. The 'local' variants of the config files that it says it's reading are the ones you want to modify.

For the last method you'll have to consult the relevant documentation.

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Rejecting without backscatter (was Re: Spamassassin not catching spam (Follow-up))
On Thu, 26 Mar 2015, Kris Deugau wrote:

> David F. Skoll wrote:
>> On Thu, 26 Mar 2015 15:05:06 +0100 Reindl Harald wrote:
>>> * spamass-milter -r 8.0
>>> * messages above 8.0 are *rejected*
>>
>> Silently? Or do you generate an NDR?
>>
>> I'm genuinely curious as to how you:
>>
>> 1) Accept mail for some recipients
>> 2) Reject mail for others
>> 3) Without generating backscatter
>> 4) Given that the messages are sent in the same SMTP session with multiple RCPTs and only one DATA.
>
> For those of you still a little puzzled, here's an example of what David is asking about. In the following SMTP transaction, how do you reject the message for recip1, while accepting the message for recip2?
>
> $ telnet mx.example.org 25
> << 220 example.org, talk to me
> helo sending.server
> << 250 Hello, friend!
> mail from:imma.spam...@example.com
> << 250 OK, send this to who?
> rcpt to:rec...@example.org
> << 250 OK
> rcpt to:rec...@example.org
> << 250 OK
> DATA
> << 354 Now for the message
> .
>
> At this point you have one message, scoring > 8 points. Recipient 1 absolutely requires all mail to be delivered to their Inbox, with a Subject tag in the case of mail considered spam. Recipient 2 wants mail scoring > 8 points to be rejected.
>
> What SMTP response do you send? You can only send one response, since you only have one message, but you have two recipients with conflicting filter policies.

At that stage you're stuck, there is no way out of that box.

To achieve the desired results you need business logic in your pre-queue / milter filter to do a triage during the 'rcpt' stage. You need a database of recipient classes to indicate whether the recipient is a spam-lover or a spam-hater. At the first recipient you look up that address and set a state variable for that session (call it love-hate). As each additional recipient comes in you compare their class against the love-hate setting for the current session. If they are compatible you respond with a 250, if not with a 452 (or other 45* type reply).

This way the sender is responsible for queuing those recipients and trying again in another SMTP session. Then all the recipients in one session can be treated equally WRT the handling of reject/accept based upon some future state (EG spamminess of the message).

That logic can be extended to more than just spam love/hate status; you just need some kind of business logic that sets the compatibility matrix at the beginning of a session and 452's any recipient that isn't compatible.

Note that Gmail is already doing something like this (the "multiple destinations not supported in one transaction" status).

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
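The RCPT-stage triage described above can be sketched as a small state machine. This Python outline is hypothetical throughout: the recipient table, the class names and the `Session` helper are invented for illustration, and the 550 replies are just one reasonable choice of permanent-failure code. A real milter would hook the same decisions into its envelope-recipient and end-of-message callbacks.

```python
# Hypothetical recipient-class table; a real filter would query a database.
RECIPIENT_CLASS = {
    "lover@example.org": "spam-lover",
    "hater@example.org": "spam-hater",
    "other@example.org": "spam-hater",
}

class Session:
    """State for one SMTP transaction (one MAIL FROM, many RCPT TO, one DATA)."""

    def __init__(self):
        self.love_hate = None   # class of the first accepted recipient
        self.accepted = []

    def rcpt(self, address):
        """Return the SMTP reply code to give for this RCPT TO."""
        cls = RECIPIENT_CLASS.get(address)
        if cls is None:
            return 550                # unknown user: permanent reject
        if self.love_hate is None:
            self.love_hate = cls      # first recipient sets the session policy
        if cls != self.love_hate:
            return 452                # temp-fail: sender retries in another session
        self.accepted.append(address)
        return 250

    def data_reply(self, is_spam):
        """One reply at end of DATA is now safe: all recipients share a policy."""
        if is_spam and self.love_hate == "spam-hater":
            return 550
        return 250

session = Session()
print(session.rcpt("lover@example.org"))   # 250
print(session.rcpt("hater@example.org"))   # 452
print(session.data_reply(is_spam=True))    # 250
```

The deferred recipient costs the sender one extra SMTP session, but no bounce is ever generated by the receiving side, which is the whole point of the exercise.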
Re: Handling very large messages (was Re: Which milter do you prefer?)
On Sun, 15 Mar 2015, Reindl Harald wrote:

> On 15.03.2015 at 19:15, Axb wrote:
> On 03/15/2015 07:09 PM, Reindl Harald wrote:
>
> [snip..]
>
> IMO, deciding what chunk of a msg should be scanned should be managed by the glue and not by SA.
>
> true but if the glue (spamass-milter) would truncate the message it passes to spamc it would get back that truncated message with the added headers (which are used to decide reject or pass) and so finally *deliver* the truncated version
>
> then spamass-milter is the wrong choice
>
> how else should it work? it hardly can invent the report-headers SA adds by itself which needs to land in the final message, spamc/spamd are doing the message work and the milter is just the glue to bring the MTA and SA together

However that glue can be intelligent and contain business logic. If the author of the milter knows what they are doing (and cares) this is a very straightforward thing to do (I know because I did it with milterassassin).

In the milter you must take an explicit extra step if you want to mess with the body of the message (smfi_replacebody). It's actually easier to just add/replace headers (smfi_addheader/smfi_chgheader) than it is to mess with the body (not to mention faster & more efficient).

So the logic is: the milter receives a -copy- of the message from sendmail, the milter passes the 'REPORT' command & the (optionally truncated) message to spamd, and gets back a headers-only report. The milter then tells sendmail to add the new/modified headers and doesn't mess with the body.

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
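The "scan a truncated copy, tag the untouched original" flow can be modelled in a few lines. This Python sketch is not milter code: `tag_message` and `fake_scanner` are invented names, the scanner is a stand-in for a spamd REPORT round trip, and prepending header lines to the raw bytes merely imitates what smfi_addheader asks the MTA to do.

```python
def tag_message(raw, scanner, max_scan_bytes=256 * 1024):
    """Scan a (possibly truncated) copy of the message, then return the full
    original with the scanner's report headers prepended."""
    report = scanner(raw[:max_scan_bytes])      # only the copy is truncated
    new_headers = b"".join(
        f"{name}: {value}\r\n".encode("ascii") for name, value in report.items()
    )
    return new_headers + raw                    # body is passed through untouched

seen_sizes = []

def fake_scanner(chunk):
    """Hypothetical stand-in for a spamd REPORT call; always answers 'spam'."""
    seen_sizes.append(len(chunk))
    return {"X-Spam-Flag": "YES"}

original = b"Subject: hi\r\n\r\n" + b"x" * 1000
tagged = tag_message(original, fake_scanner, max_scan_bytes=100)
print(seen_sizes)                    # [100]
print(tagged.endswith(original))     # True
```

The scanner only ever sees the first 100 bytes, yet the delivered message still ends with the complete original, which is the property the spamass-milter discussion was worried about losing.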
Re: whitelist_from_rcvd not working, WAIDW
On Fri, 27 Feb 2015, Ian Zimmerman wrote:

> Header of test message, massaged for privacy, is here:
>
> http://pastebin.com/EV6g15aN
>
> I have this in user_prefs:
>
> trusted_networks 198.1.2.3/32
> [...lots snipped...]
> whitelist_from_rcvd *@wetransfer.com *.wetransfer.com
>
> Why is the whitelist not firing?

whitelist_from_rcvd can be a bit fragile because it depends upon multiple factors (trust chain, full-circle-DNS) working correctly.

First thing, that second parameter is not an address but part of a DNS name, so use 'wetransfer.com' instead of that *.wet...

Second thing, check to see if your trust chain is working as you expect. whitelist_from_rcvd is applied at the point of the first trusted relay (IE where the last untrusted relay hands the message to the first trusted relay). Add the 'X-Spam-Relays-Trusted' and 'X-Spam-Relays-Untrusted' pseudo headers to your report to see if things are working as expected.

Note that a DNS fubar (even a temporary one) will break whitelist_from_rcvd. Also, if the sender changes MSP it will break, so it is a maintenance headache.

I see that message has a valid DKIM signature, so why not use whitelist_auth? Same goodness with fewer headaches.

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: no BAYES checking
On Wed, 25 Feb 2015, James wrote:

> I don't think I have the Bayesian filter working. This is some spam that wasn't marked as spam, shouldn't one of the tests be BAYES_00?
>
> X-Spam-Status: No, score=4.5 required=5.0 tests=FREEMAIL_FROM,FREEMAIL_REPLYTO, FSL_MY_NAME_IS,HTML_MESSAGE,RDNS_DYNAMIC,T_OBFU_JPG_ATTACH autolearn=no version=3.3.2
>
> $ sudo sa-learn --username=debian-spamd --dump magic
> 0.000 0 3 0 non-token data: bayes db version
> 0.000 0 5902 0 non-token data: nspam
> 0.000 0 4985 0 non-token data: nham
> 0.000 0 422427 0 non-token data: ntokens
> 0.000 0 1159486049 0 non-token data: oldest atime
> 0.000 0 1424827990 0 non-token data: newest atime
> 0.000 0 1424843976 0 non-token data: last journal sync atime
> 0.000 0 1424830068 0 non-token data: last expiry atime
> 0.000 0 0 0 non-token data: last expire atime delta
> 0.000 0 0 0 non-token data: last expire reduction count
>
> Doesn't that show I have 5902 spam and 2462 ham messages?
>
> /etc/spamassassin/local.cf
> use_bayes 1
> bayes_auto_learn 1
>
> $ sudo -u debian-spamd spamassassin -D --lint 2>t
> $ less t
> $ grep bayes t
> Feb 25 21:07:47.606 [27839] dbg: config: fixed relative path: /var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf
> Feb 25 21:07:47.607 [27839] dbg: config: using "/var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf" for included file
> Feb 25 21:07:47.607 [27839] dbg: config: read file /var/lib/spamassassin/3.003002/updates_spamassassin_org/23_bayes.cf
> Feb 25 21:07:55.270 [27839] dbg: bayes: learner_new self=Mail::SpamAssassin::Plugin::Bayes=HASH(0x1de4868), bayes_store_module=Mail::SpamAssassin::BayesStore::DBM
> Feb 25 21:07:55.353 [27839] dbg: bayes: learner_new: got store=Mail::SpamAssassin::BayesStore::DBM=HASH(0x230eb58)
> Feb 25 21:07:55.356 [27839] dbg: bayes: tie-ing to DB file R/O /var/lib/spamassassin/.spamassassin/bayes_toks
> Feb 25 21:07:55.359 [27839] dbg: bayes: tie-ing to DB file R/O /var/lib/spamassassin/.spamassassin/bayes_seen
> Feb 25 21:07:55.363 [27839] dbg: bayes: found bayes db version 3
> Feb 25 21:07:55.365 [27839] dbg: bayes: DB journal sync: last sync: 0
> Feb 25 21:07:55.366 [27839] dbg: bayes: not available for scanning, only 0 ham(s) in bayes DB < 200
> Feb 25 21:07:55.367 [27839] dbg: bayes: untie-ing
> Feb 25 21:07:55.379 [27839] dbg: bayes: tie-ing to DB file R/O /var/lib/spamassassin/.spamassassin/bayes_toks
> Feb 25 21:07:55.382 [27839] dbg: bayes: tie-ing to DB file R/O /var/lib/spamassassin/.spamassassin/bayes_seen
> Feb 25 21:07:55.385 [27839] dbg: bayes: found bayes db version 3
> Feb 25 21:07:55.386 [27839] dbg: bayes: DB journal sync: last sync: 0
> Feb 25 21:07:55.388 [27839] dbg: bayes: not available for scanning, only 0 ham(s) in bayes DB < 200
> Feb 25 21:07:55.388 [27839] dbg: bayes: untie-ing
>
> Why does it say not enough ham?

It looks like you either have a permissions problem or a confusion problem. Your run of 'sa-learn --dump magic' is looking at some Bayes database which has enough ham/spam, but whatever your spamassassin is looking at doesn't.

Your 'sudo' isn't running that sa-learn --dump magic as UID 'debian-spamd'. It's running it as root but telling sa-learn to emulate user 'debian-spamd', so there could be a permissions problem. Try running sa-learn in the same way that you're running spamassassin:

    $ sudo -u debian-spamd sa-learn --dump magic

and see what you get.

The other possibility is that sa-learn is looking at a different bayes database. Try running that "sa-learn --dump magic" with the "-D" option to see what bayes database it's looking at.

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
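When comparing what two different invocations see, it can help to pull the counters out of the `--dump magic` output and test them against the 200/200 activation threshold mentioned in the debug log. The Python sketch below assumes the column layout shown in the output quoted above (the counter value is the third field); the function names are made up for illustration.

```python
def parse_dump_magic(text):
    """Extract the non-token counters from `sa-learn --dump magic` output."""
    counters = {}
    for line in text.splitlines():
        if "non-token data:" not in line:
            continue
        fields, name = line.split("non-token data:", 1)
        parts = fields.split()
        counters[name.strip()] = int(parts[2])   # third column holds the value
    return counters

def bayes_usable(counters, min_spam=200, min_ham=200):
    """Mirror the 'not available for scanning' check: both counts must reach the minimum."""
    return (counters.get("nspam", 0) >= min_spam
            and counters.get("nham", 0) >= min_ham)

sample = """\
0.000 0 3 0 non-token data: bayes db version
0.000 0 5902 0 non-token data: nspam
0.000 0 4985 0 non-token data: nham
"""
counters = parse_dump_magic(sample)
print(counters["nspam"], counters["nham"])   # 5902 4985
print(bayes_usable(counters))                # True
```

Running the same parse on the output of `sudo -u debian-spamd sa-learn --dump magic` would show quickly whether that user sees an empty database (nham 0), which is what the lint debug log suggests.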
Re: Recent spate of Malicious VB attachments II
On Thu, 19 Feb 2015, David F. Skoll wrote:

> On Thu, 19 Feb 2015 07:46:16 -0600 Chad M Stewart wrote:
>> I use amavis-new and block based on file type. My users should never get legit executables via email, so they are sent to a quarantine.
>
> Unfortunately, we're finding those simple-minded rules are running out of gas. :( We've seen a zip file containing an Excel spreadsheet with a macro virus in it. ClamAV is essentially useless at detecting viruses, so it's a real problem... any ideas?

I thought that ClamAV knew how to unpack zip/rar/tar/gzip/etc... and scan the cruft inside them. Are you saying that doesn't work, or are you saying that the malware is mutating fast enough that the ClamAV signatures aren't keeping up with it?

If the latter case, is there -any- AV kit that is? Are the Sanesecurity add-in ClamAV signatures helpful?

--
Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Recent spate of Malicious VB attachments II
On Thu, 19 Feb 2015, Reindl Harald wrote: well, that can you achieve directly on the MTA but that won't help in case of "emails containing MS office attachments with a Malicious VB script" cat /etc/postfix/mime_header_checks.cf /^Content-(?:Disposition|Type):(?:.*?;)? \s*(?:file)?name \s* = \s*"?(.*?(\.|=2E)(386|acm|ade|adp|awx|ax|bas|bat|bin|cdf|chm|class|cmd|cnv|com|cpl|crt|csh|dll|dlo|drv|exe|hlp|hta|inf|ins|isp|jar|jse|lnk|mde|mdt|mdw|msc|msi|msp|mst|nws|ocx|ops|pcd|pif|pl|prf|rar|reg|scf|scr|script|sct|sh|shb|shm|shs|so|sys|tlb|vb|vbe|vbs|vbx|vxd|wiz|wll|wpc|wsc|wsf|wsh))(?:\?=)?"?\s*(;|$)/x REJECT Attachment Blocked (Executables And RAR-Files Not Allowed) "$1" (.rar because ClamAV can't scan the content on Fedora) Is that a politically inspired limitation? If you build ClamAV from source it can scan RAR. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: regex: chars to escape bsides @
On Sat, 3 Jan 2015, Reindl Harald wrote: by writing some custom rules like below i found out that @ needs to be escaped additionally to http://php.net/manual/de/function.preg-quote.php are there other chars which needs special handling?

header CUST_MANY_SPAM_TO X-Local-Envelope-To =~ /^(\)$/i
score CUST_MANY_SPAM_TO -4.0
describe CUST_MANY_SPAM_TO Custom Scoring

Umm, SA is written in Perl, not PHP. So you should look at Perl regex documentation, not PHP docs.

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Gmail password reset FPs
On Wed, 17 Dec 2014, Joe Quinn wrote: We've been having password reset emails marked as spam by Gmail. We've tried rephrasing the email body/subject/from email, to no avail. We've even tried registering as a bulk sender (https://support.google.com/mail/contact/bulk_send_new?rd=1) and googling for anyone having similar issues. Has anyone else dealt with this before and managed to get it fixed? I see that you've got SPF set up for your domain, do you have DKIM signing enabled too? What about TLS transport on your outgoing MTA? Not sure they'll make any difference but those are things that I've done here to help improve deliverability. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Honeypot email addresses
On Thu, 4 Dec 2014, Noel Butler wrote: On 04/12/2014 00:54, Christian Grunfeld wrote: "It would be very rare, and if so you would ever more rare CC the entire list of addresses on your spam message - sure this was a lot more common in years gone by, but I've not seen any such evidence of it in almost 10 years, and if you did, well, that's not my problem, its the problem of your provider who obviously doesn't care enough to educate its users of the dangers of spam, period.." lol ! ! ! is it possible to educate users against spam? if that were the case this list would not be needed and we would be free ourselves from reading your posts, period !

you must be doing it wrong, the users of today are far wiser than they were 5 years ago, even my almost 80yo dad knows to handle spam, although its hard to do, get your users to *read* their welcome emails, and dont have a lawyer write the stuff, write it so an 10yo kid can understand it, its also rare spam gets passed SA anyway with our myriad of custom rules, and we block at MTA level from multiple DNSBL's amongst many other milter tricks which I'm not going into in a public forum :) So educate them well, and let SA do its job, and we wouldnt need to read your posts either.

I have to agree with Dave, Christian, et al. It's not frequent but not rare to see a reply-all "Take me off this list!!!". Even if you've got the smartest, best educated users who will never make that mistake and a totally perfect spam filtering system that never has a FN, there are other people/systems in the world on that "shotgun" spam recipient list which may be less than perfect.

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Honeypot email addresses
Another way to seed spamtrap addresses is to make up some and then feed them into "unsubscribe" links in spam sent to regular users. I've got some of those I started that way 15 years ago and they're still going strong. On Sat, 22 Nov 2014, Ted Mittelstaedt wrote: That's a lot of work, there's a much easier way Just search your /var/log/maillog for user unknown messages, and create email addresses for the unknown users which are showing up multiple times over multiple days. It's a great trick because it gets spammers who already have email addresses in their spamlists and who are too lazy to remove them when they get a user unknown message from the mailserver. I have a pretty old domain - I've seen user unknown messages for users who cancelled mailboxes on the domain over a decade ago. I figure 10 years of getting user unknown messages is long enough for any real humans and for legitimate mailing lists to remove those entries. Ted On 11/21/2014 8:10 AM, Joe Quinn wrote: We are setting up some honeypot email addresses, and were wondering if anyone here had tips on how to include those addresses on webpages and other places. We're currently going with a pretty simple HTML comment. Is that too obvious? Should we put it into a CSS invisible div as well? Any other ideas? -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
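Ted's log-mining approach can be sketched in a few lines of Python. This is only a rough illustration: the log line layout assumed by the regex (a syslog date, an address in angle brackets, then "User unknown") is hypothetical and will need adjusting for your MTA, and the function name and `min_days` parameter are made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical syslog-style layout; real maillog formats differ between MTAs.
UNKNOWN_RE = re.compile(
    r"^(?P<day>\w{3}\s+\d{1,2})\s.*<(?P<addr>[^<>\s]+@[^<>\s]+)>.*user unknown",
    re.IGNORECASE,
)

def spamtrap_candidates(lines, min_days=3):
    """Return addresses rejected as 'user unknown' on at least min_days distinct days."""
    days_seen = defaultdict(set)
    for line in lines:
        m = UNKNOWN_RE.search(line)
        if m:
            # Normalize padded dates such as "Nov  3" to "Nov 3".
            day = " ".join(m.group("day").split())
            days_seen[m.group("addr").lower()].add(day)
    return sorted(a for a, days in days_seen.items() if len(days) >= min_days)
```

Feeding it the lines of /var/log/maillog gives a list of candidate addresses; whether an address has been dead long enough to become a spamtrap is still a judgment call.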
Re: URIBL_RHS_DOB #fail
On Sun, 9 Nov 2014, Axb wrote: On 11/09/2014 09:51 PM, Alex Regan wrote: Hi guys, One of my user's hotel reservations almost got tagged incorrectly: * 1.5 URIBL_RHS_DOB Contains an URI of a new domain (Day Old Bread) * [URIs: bestwestern.com] I looked around for a place to report an FP, but also thought everyone else should know about this, since it's so obviously incorrect. Their whois looks like the record was updated on the 31st. Not exactly a day ago, but could that even have something to do with it?

DOB owner has been notified.

I think DOB was having a "bad hair day" this morning. I saw a number of FP hits on DOB for stuff that hadn't changed in years (EG amtrak.com). It looks better now.

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: yahoo rcvd bug?
On Mon, 20 Oct 2014, Quinn Comendant wrote: I'm getting FORGED_YAHOO_RCVD false positives for messages with yahoo received headers that do not match the search pattern defined in check_for_forged_yahoo_received_headers(). I'm using SpamAssassin 3.3.2 with latest rules as per `sa-update` rule channels `sought.rules.yerp.org` and `updates.spamassassin.org`. The spamassassin rule that is firing: * 1.6 FORGED_YAHOO_RCVD 'From' yahoo.com does not match 'Received' headers The received-by header in question: Received: from unknown (HELO nm46-vm10.bullet.mail.bf1.yahoo.com) (216.109.114.203) Full mail headers available at https://cloudup.com/cbmG8tJF71k And finally here's the `check_for_forged_yahoo_received_headers` function that parses this, which doesn't contain the correct regex for this hostname: [snip..] return 1; }

You have two different rules that have fired there (FORGED_YAHOO_RCVD & RDNS_NONE) because your MTA was not able to resolve that IP address to its registered domain name. The SA code correctly parsed the info that your MTA gave it; it's just that the info was incorrect, either due to local DNS issues or a network issue. Then, because you (or somebody configuring your SA) lowered the spam threshold from 5.0 to 3.0, it caused a FP on this message. I don't think that it is valid to declare a bug in SA because of an issue local to your system (problematic MTA/DNS & local config choices). I see that you also have a hit on URIBL_BLOCKED, which tends to indicate that you have local DNS issues that should be addressed.

Suggestions:
1) Work on improving your DNS system.
2) Put the spam threshold back to default to reduce FPs triggered by DNS issues.
3) Create a meta rule that takes the DKIM_VALID detection to nullify the effect of that FORGED_YAHOO_RCVD (in case you cannot get your DNS to work correctly).

If you lowered that spam threshold because of too many FNs, I think that getting the DNS fixed so RBL tests work will take care of that too.
There have been plenty of posts to this list about URIBL_BLOCKED and how to fix it. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: .link TLD spammer haven?
On Mon, 13 Oct 2014, Philip Prindeville wrote: Every connection I’ve gotten from a hostname resolving to *.link or saying helo *.link has been spam (I block the connections with MIMEDefang). Has anyone actually seen a legitimate email from a host in the .link TLD? I’ve seen (last week alone): bgo.blc-onlineconsumer140.link ratio.allgiftcardsonlinefriendly.link ratio.autodealersstarted.link [snip..] Is it worth having a rule that triggers on the relay’s hostname being *.link? Also, I noticed that every message we saw was missing a Received: header… -Philip

I'll second that and add a similar comment about ".link" URLs inside the message. Last week I created a uri rule to fire on any ".link" hosted URL and so far haven't seen a single FP.

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
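For anyone wanting to try the same check outside of SA (say, in a log-analysis script), here is a small Python sketch of the test a ".link" URI rule performs. It is not the rule I'm running; the function name and the `tld` parameter are invented for the illustration, and it only looks at the URL's hostname.

```python
from urllib.parse import urlsplit

def host_in_tld(url, tld="link"):
    """True if the URL's hostname ends in the given top-level domain."""
    host = urlsplit(url).hostname or ""
    # Strip a trailing root dot so "example.link." is treated like "example.link".
    return host.lower().rstrip(".").endswith("." + tld)
```

Note that it deliberately ignores ".link" appearing in the path, so something like example.com/page.link does not count.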
Re: punctuation in subjects
On Mon, 1 Sep 2014, Martin Gregorie wrote: On Mon, 2014-09-01 at 03:17 -0400, Jude DaShiell wrote: Messages with question marks and spaces have been showing up in my inbox on another account. To blacklist these [? ] would take care of those characters in a Subject: line. Would such a regular expression effectively blacklist any message having just those two kinds of characters in its Subject: line in any combination?

No: a regex along these lines /[? ]/ will hit all subject lines containing either a space or a question mark, i.e. just about every subject line you'll ever see. This one /[? ].*[? ]/ will only hit subjects with both characters in any order, but is probably also far too general to use by itself. Make it a subrule (name starts double underscore) and use a metarule to combine it with another subrule that fires on something that usually only appears in spam and you may have the basis of something more useful.

Martin's proposed rule would hit a string that contained at least two '?' or space characters as well as other characters (EG: '?junk?' or 'this one hit'). If you want to be sure to hit subjects that contain ONLY question marks and spaces (and at least one of each) it will take two sub-rules combined into a metarule. EG:

header __SUBJECT_SPACE_QM Subject =~ /(?:\? | \?)/
header __SUBJECT_MORE_THAN_SP_QM Subject =~ /[^? ]/
meta SUBJECT_SPACE_QM __SUBJECT_SPACE_QM && ! __SUBJECT_MORE_THAN_SP_QM

(untested)

FWIW, I would expect such a rule to have a limited useful life-span. Now that it's been discussed here spammers will adapt their garbage to avoid it (IE add one other kind of character to the subject, etc). Spammers do monitor this list and just the act of discussing spam characteristics can cause them to adapt their tactics.

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
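A quick way to sanity-check that two-subrule logic without touching a live SA install is to mirror it in Python. This is just a sketch: the two patterns are copied from the rules in the message, but Python's `re` stands in for Perl's regex engine and the function name is made up.

```python
import re

SPACE_QM = re.compile(r"(?:\? | \?)")   # mirrors __SUBJECT_SPACE_QM
OTHER_CHAR = re.compile(r"[^? ]")       # mirrors __SUBJECT_MORE_THAN_SP_QM

def subject_space_qm(subject):
    """True only when the subject has an adjacent '?' and space, and nothing else."""
    return bool(SPACE_QM.search(subject)) and not OTHER_CHAR.search(subject)
```

Subjects such as '?junk?' or 'this one hit' fail the first pattern, and anything containing an ordinary character is vetoed by the second, so only strings built purely from question marks and spaces get through.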
users@spamassassin.apache.org
On Sat, 16 Aug 2014, Rajesh M. wrote: hi we are getting spam with a lot of hashes & Ꭼmа i checked out KAM.cf but not able to trap such emails any solution please ? thanks rajesh Search the July archive of this list for postings with the subject of: "More text/plain questions" There were a couple of possible solutions discussed, including new features added to the latest version (trunk) of spamassassin. I took one of them (new functions in MIMEEval) back-ported it to my SA kit and it has been hitting pretty regularly on that kind of spam. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Somewhat OT - how do I whitelist a host which is in a DNSBL in sendmail?
On Thu, 24 Jul 2014, Thomas Cameron wrote: Howdy - I have two VMs at Digital Ocean, one on the east coast, one on the west. I'm running Sendmail-8.14.8-2.fc20.x86_64. I have several DNSBLs listed: FEATURE(`dnsbl',`in.dnsbl.org ')dnl FEATURE(`dnsbl',`sbl-xbl.spamhaus.org')dnl FEATURE(`dnsbl',`cbl.abuseat.org')dnl FEATURE(`dnsbl',`dul.dnsbl.sorbs.net')dnl Unfortunately, my home network is attached to a cable provider which shows up in dul.dnsbl.sorbs.net. Can I whitelist my IP address so that I can send mail through my mail servers? Right now, it gets rejected. Yeah, I know, I can always use my ISP's smtp server, I guess. But that kind of sucks. I would rather use mine. Purely a pride thing, I know. Thomas

Thomas, do you have the 'MSA' port enabled for your sendmail (IE port 587) and SMTP-AUTH? Then just skip the dnsbl checks for auth'ed mail submissions. You could whitelist your client IP address in your 'access' file but what happens when that address changes? (I assume your ISP gives you a DHCP address).

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Bayes, Manual and Auto Learning Strategies
On Wed, 2 Jul 2014, Steve Bergman wrote: Well... I just turned on autolearn for a moment, deleted the bayes_* files on the test account I use, and sent myself a message from my usual outside account. And new bayes_* files were created. So I was wrong, and I win. More options. So now I can proceed to the "what does this mean?" phase. If I leave things as they are, then training is perfect if the users are diligent. But if they are not, then... what? I see plenty of spams getting through with a 0.0 score. IIRC, the autolearn spam threshold is 7? Pretty much everything there is spam. But I'm not sure I quite buy having the static rules of SA training Bayes. Isn't Bayes just learning to emulate the static rules, with all their imperfections? Unless you've explicitly disabled them, the network based rules (razor, pyzor, dcc, DNS based rules, RBLs, URIBLs, etc) constitute an external 'reputation' system to pass judgment on messages. It's not uncommon to take a low-scoring spam and find that it gets a higher score on retest as it has been added to various bad-boy lists. This is also one way that gray-listing helps. If you stiff-arm the first pass of a spam run a later check may hit it more accurately as it's been added to block-lists in the mean-time. If it starts going wrong, doesn't that mean the errors are going to spiral out of control? That is a possible risk of relying solely on auto-learning. The autolearn system has been carefully crafted and tuned over the years to try to prevent a feed-back loop from throwing it into a tail-spin. For example the internal scoring system used to determine if a message is spam or ham WRT the choice for auto-learning explicitly excludes the Bayes score (and other particular kinds of scores such as white/black lists) to try to prevent tail-eating. Occasional judicious manual learning can help to 'tweak' things when Bayes looks like it's not in top shape. (IE manual learning of FPs & FNs). 
I've used site-wide Bayes with auto-learning at a site with ~3000 users and have had to flush & restart our Bayes database twice in 10 years. Dave -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
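To make the "excludes the Bayes score" point concrete, here is a toy Python sketch of that decision. It is not SA's code: the function name and the excluded prefixes are illustrative, the thresholds are placeholders to be checked against the bayes_auto_learn_threshold_* settings in your own config, and the real autolearn logic applies further conditions that are left out here.

```python
def autolearn_decision(hits, spam_threshold=12.0, ham_threshold=0.1,
                       excluded_prefixes=("BAYES_", "USER_IN_")):
    """hits maps rule name -> score; returns 'spam', 'ham' or 'none'."""
    # Bayes and white/black list hits are left out of the learning score,
    # so Bayes cannot vote on its own training data.
    score = sum(s for name, s in hits.items()
                if not name.startswith(excluded_prefixes))
    if score >= spam_threshold:
        return "spam"
    if score <= ham_threshold:
        return "ham"
    return "none"
```

The middle band, where nothing is learned at all, is a large part of what keeps the feedback loop damped: only messages the other rules are fairly sure about get fed back into the dictionary.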
Re: Bayes, Manual and Auto Learning Strategies
On Wed, 2 Jul 2014, Steve Bergman wrote: On 07/01/2014 11:49 PM, Karsten Bräckelmann wrote: Those do not tell you about using file or SQL based databases? They do. But not specifically with respect to autolearn. You never thought about googling for "spamassassin per user" and friends? You never checked the SA wiki? I have, indeed. No reference to autolearn and persistent storage. The lack of mention is notable. I'd expect people to be lining up to tell me I'm mistaken if I absolutely were. Can you point me to a change log somewhere documenting autolearn moving from in-memory and system-wide to per user and persistent? I don't hold a strong opinion on this. It would be nice if I were wrong. It would open more options. I'm just waiting for evidence that it's the case. My perception is that It's not. -Steve

Steve, for some reason you seem to be hung up on Bayes "autolearning". Is it possible that you're confusing it with "Auto-White listing"? (which is now deprecated and has -nothing- to do with Bayes).

SA's Bayesian scorer is a system that parses a message, extracts 'tokens' from it and uses an algorithm to calculate a score for the message based upon a dictionary of previously seen tokens and their relative merit. The dictionary is created and updated by a process called 'learning', wherein already-classified messages are tokenized and their tokens are stored in the dictionary along with a merit value derived from their instance count and a factor taken from being classified as spam or ham. This learning process can be either externally driven (known as 'manual' learning) or run by an automated process from within SA as it scores messages (known as 'auto' learning). So regardless of whether manual or auto learning is utilized, tokens are added to the dictionary. It's also possible to employ both auto & manual learning methods in the same installation.
There can be one dictionary used for scoring all messages processed (called "site wide Bayes") or many separate dictionaries, one used for each recognized user ("per user Bayes"). Either way, the dictionary(s) need to be updated (and the update process could be either manual, auto, or both). The Bayes dictionary(s) need to be stored some how, the usual method is via some kind of database. It could be a simple file based DB, some kind of fancy SQL server based system or something else. This is a DBA'ish kind of choice as to what particular technology is used to store the dictionary DB. (usually on disk in some way, could be in some kind of memory resident set of tables, or something else???). So you have a multi-dimensional matrix WRT your Bayes system configuration, and manual VS auto learning is just one factor. It's been this way for the past 10+ years AFAIK (well, maybe 10 years ago it didn't have as many options for back-end database storage, mostly limited to Berkeley-DB type methods). I hope this helps you. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
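If it helps to see the moving parts, here is a deliberately tiny Python model of a token dictionary with a learn step and a score step. It is only a sketch of the idea described above: the class name, the tokenizer and the plain averaging of token merits are all invented for illustration and are not SA's actual tokenizer or combining algorithm.

```python
import re
from collections import Counter

class ToyBayes:
    """Toy token dictionary; one instance could be site-wide or per user."""

    def __init__(self):
        self.spam = Counter()   # token -> number of spam messages containing it
        self.ham = Counter()    # token -> number of ham messages containing it

    @staticmethod
    def tokenize(text):
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def learn(self, text, is_spam):
        # The same call serves manual learning and auto learning; only the caller differs.
        (self.spam if is_spam else self.ham).update(self.tokenize(text))

    def token_merit(self, token):
        s, h = self.spam[token], self.ham[token]
        return None if s + h == 0 else s / (s + h)

    def score(self, text):
        merits = [m for m in map(self.token_merit, self.tokenize(text)) if m is not None]
        return sum(merits) / len(merits) if merits else 0.5
```

Where the two Counters live (flat file, SQL, memory) is the storage question, whether there is one instance or one per user is the site-wide vs per-user question, and who calls learn() is the manual vs auto question. The three choices are independent of each other.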
Re: SA rule to detect prior SA pass?
On Sat, 28 Jun 2014, RW wrote: On Fri, 27 Jun 2014 20:43:19 -0500 (CDT) David B Funk wrote: Looking at my mail streams I see evidence that spammers sometimes add faked "SpamAssassin" headers to their messages (I assume to try to trick recipients into thinking that the message has already been given a clean bill-of-health). I wrote a few test rules to look for these pre-existing "X-Spam-" headers to test to see if it could be used as a spam detector. However I got no hits on these rules even on hand crafted test messages that contained such stuff. Checking the SA source I found in PerMsgStatus.pm a line of code: $self->{msg}->delete_header('X-Spam-.*'); that ran before any tests. So looking for SA headers inside of SA is pointless. So does anybody have any ideas how to test for evidence of a prior SA pass?

You could simply rewrite "X-Spam-" to "X-Original-Spam-".

That's what I was afraid of. As I'm using a "milter" as my glue (so I can SMTP reject high scoring spam) the usual MTA rewrite functions don't do any good, so I'll have to hack the milter. I was hoping for something more portable.

I doubt this is going to be very useful because too much legitimate mail has X-Spam- headers. Most of the mailing lists I read have them. Some servers add them to outgoing mail. You may have users that receive scanned mail forwarded from ESPs etc.

I'm aware that by itself the presence of those headers isn't a definitive spam sign, but I was hoping to combine that info with other clues to create meta rules. However, I cannot test this hypothesis without the ability to detect those headers.

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: FYI - ahbl.org and BIND DNS errors
On Tue, 10 Jun 2014, Andrew Daviel wrote: Per http://ahbl.org/content/changes-ahbl, AHBL is going away (still used in spamassassin-3.3.1) Meanwhile, AHBL is serving strange DNS responses, e.g. (from wireshark)

1 0.00 142.90.100.186 -> 162.243.209.249 DNS 93 Standard query 0xc828 A zuz.rhsbl.ahbl.org
2 0.072481 162.243.209.249 -> 142.90.100.186 DNS 246 Standard query response 0xc828
Authoritative nameservers
rhsbl.ahbl.org: type NS, class IN, ns invalid.ahbl.org
rhsbl.ahbl.org: type NS, class IN, ns unresponsive.ahbl.org
rhsbl.ahbl.org: type NS, class IN, ns unresponsive2.ahbl.org Name Server: unresponsive2.ahbl.org
Additional records
invalid.ahbl.org: type A, class IN, addr 244.254.254.254 Addr: 244.254.254.254 (244.254.254.254)
unresponsive.ahbl.org: type A, class IN, addr 10.230.230.230 Addr: 10.230.230.230 (10.230.230.230)
unresponsive2.ahbl.org: type A, class IN, addr 192.168.230.230 Addr: 192.168.230.230 (192.168.230.230)
invalid.ahbl.org: type , class IN, addr fe80:: Addr: fe80::

This last one, fe80::, is an IPv6 scope-link address that causes the BIND nameserver to log a weird error named[31365]: socket.c:4373: unexpected error: named[31365]: 22/Invalid argument Per http://www.mail-archive.com/bind-users@lists.isc.org/msg05240.html connect() fails as it is missing scoping information.

Umm, with a name like "invalid.ahbl.org" what do you expect? That's truth in advertising. It's 'invalid'. As a matter of fact, none of those addresses are usable: they're either RFC-1918, reserved, or link-local scope. So none of those are valid for remote queries. Do NOT use rhsbl.ahbl.org. period. end of song.

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
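If you want to check name server addresses like these yourself, Python's standard `ipaddress` module can tell you which special-purpose block an address falls in. A minimal sketch follows; the function name and the returned labels are made up for the example.

```python
import ipaddress

def why_unusable(addr):
    """Return a label if addr is not a routable unicast address, else None."""
    ip = ipaddress.ip_address(addr)
    if ip.is_link_local:
        return "link-local"
    if ip.is_multicast:
        return "multicast"
    if ip.is_reserved:
        return "reserved"
    if ip.is_private:
        return "private"
    return None
```

Run over the four addresses in the additional records above, it labels every one of them as private, reserved, or link-local, which is why a resolver can never reach those "name servers".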
Re: some questions on sa-compile
On Sat, 3 May 2014, RW wrote: On Fri, 02 May 2014 21:51:02 +0200 Axb wrote: 2) The non-amenable rules are processed, but may be slower than if they weren't compiled? yep It means they get processed as normal in perl, so they don't get speeded-up, but they aren't slowed-down. One thing, rules which cannot be compiled are often rules (such as ones that use negative look-ahead/look-behind ) that are potential major CPU hogs. If used in limited scope they aren't usually a problem but if not carefully written can be CPU sucks. Not to say that they shouldn't be used at all but just with care. So if you see that warning about uncompileable rules, take a second look at those specific rules. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: Missing header when skipping mail
On Fri, 18 Apr 2014, Kevin A. McGrail wrote: On 4/18/2014 6:18 AM, Erik Logtenberg wrote: The tool that hands the message to spamasassin (spampd in your case) imposes the size limit. The message is never seen by spamassassin. You're barking up the wrong tree ;) Tom Ah, I agree. This is where that happens: $self->log(2, "skipped large message (". $size / 1024 ."KB)"); I'll send my question to Maxim Paperno, author of spampd, instead. Spamassassin is a program AND an API. Right now you are using the API so your wrapper (spampd) can do anything it wants but we could also consider this feature for spamc/spamd and for spamassassin as a parameter to enable. Please open a bugzilla bug if you would like it considered. Another way to deal with this problem is, via the glue agent, to truncate large messages and send just the first X-kbytes of the message (for some appropriate value of X, I'm using 256K). The idea is that a large spam message is probably large because of some attachment and the payload is in the first part. I'm using a milter to connect to spamd so I can do SMTP rejections of high scoring spam. I coded the truncation feature in the milter so no need to modify the MTA nor spamd. -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
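The truncation idea is simple enough to show in a few lines. This Python sketch is not my milter code (that is a separate program); the function name is invented and the 256 KiB default just mirrors the value mentioned above.

```python
def truncate_for_scan(raw_message: bytes, limit: int = 256 * 1024):
    """Return (data_to_scan, was_truncated) for handing to the scanner."""
    if len(raw_message) <= limit:
        return raw_message, False
    # Keep only the head of the message: headers plus the first body parts,
    # which is where the spam payload usually lives.
    return raw_message[:limit], True
```

The glue agent still delivers (or rejects) the full original message; only the copy handed to the scanner is cut short.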
Re: meta test HEXHASH_WORD has undefined dependency '__KAM_BODY_LENGTH_LT_512'
On Sun, 6 Apr 2014, Helmut Schneider wrote: Hi, over the last weeks I constantly run into issues when I cannot get SA up again because of "broken" rule sets. Today it's Apr 6 17:06:01.960 [31092] dbg: rules: meta test HEXHASH_WORD has undefined dependency '__KAM_BODY_LENGTH_LT_512' Is something wrong in my process or do we have a problem with QA these days. Don't get me wrong, I appreciate your work very much. Thanks, Helmut

What, exactly, do you mean by 'I cannot get SA up again because of "broken" rule sets. Today it's'? That is effectively a warning, not a fatal error message. That one particular kind of warning should not stop SA from running. It means that you've got a meta rule that is a combination of other rules and one of the other rules is missing. For example:

meta DIGEST_MULTIPLE RAZOR2_CHECK + DCC_CHECK + PYZOR_CHECK > 1

If you don't have PYZOR installed you'll get a warning for that particular rule, but it will just be lacking the input from that particular potential component of the rule; it will work just fine with the other two components. Worst case, a given meta rule won't fire at all because it's missing some necessary component and thus that rule will be effectively disabled, but the whole SA engine should still run.

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
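The behaviour described for DIGEST_MULTIPLE can be mimicked in Python by treating an undefined sub-rule as contributing zero. This is a sketch of the arithmetic only, not of SA's meta rule evaluator, and the function name is made up.

```python
def digest_multiple(hits):
    """hits maps rule name -> 1 if it fired; undefined rules are simply absent."""
    names = ("RAZOR2_CHECK", "DCC_CHECK", "PYZOR_CHECK")
    # A missing dependency counts as 0, so the rule degrades instead of failing.
    return sum(hits.get(name, 0) for name in names) > 1
```

With PYZOR_CHECK never defined, the rule still fires when both Razor and DCC hit; it just has one fewer way of reaching the "> 1" threshold.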
Re: Remove spam results from mail header
On Sun, 16 Mar 2014, Re@lබණ්ඩා™ wrote: Hi All, Is there a way to disable spam results from been published to the mail header? Usually, depending upon how spamassassin is hooked into your mail system. Looking at your example message I don't see any "Checker-Version" header which normally SA systems add. This tends to indicate that your system is doing some kind of custom header/results processing. Normal SA systems use configuration options (see BASIC MESSAGE TAGGING OPTIONS section of SA documentation) to control this however it appears that you're using SA-Exim, you'll need to check its documentation for how to configure that. And I could see two sections of spam results in the mail header as follows. What could be the reason for that? At a guess, it appears that this message has gone through some kind of list processing system, so maybe SA processed twice. Usual SA practice is to remove (or overwrite) previously existing SA headers when processing a message. However this, again, is dependent upon how SA is hooked into your mail system. Bottom line, you've got an unusual SA kit there, time to do some code diving. From: xxx To: xxx Thread-Topic: Thread-Index: Ac8/0magIx/1eKsLQWmx3TflU2Na0Q== Date: Fri, 14 Mar 2014 22:25:53 + Message-ID: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [x.x.x.x] MIME-Version: 1.0 X-Spam_score: -1.9 X-Spam_score_int: -18 X-Spam_bar: - X-Spam_report: Spam detection software, running on the system "xx.com", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. Content preview: xx [...] Content analysis details: (-1.9 points, 4.8 required) pts rule name description -- -- -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.] 
0.0 HTML_MESSAGE BODY: HTML included in message Subject: x X-BeenThere: X-Mailman-Version: 2.1.13 Precedence: list List-Id: List-Unsubscribe: List-Archive: List-Post: <mailto:x> List-Help: <mailto:xx>, <mailto:x> Content-Type: multipart/mixed; boundary="===153982665457238==" Sender: Errors-To: X-Spam_score: 1.6 X-Spam_score_int: 16 X-Spam_bar: + X-Spam_report: Spam detection software, running on the system "xx.com", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see @@CONTACT_ADDRESS@@ for details. Content preview: [...] Content analysis details: (1.6 points, 4.8 required) pts rule name description -- -- -0.0 SPF_PASS SPF: sender matches SPF record 0.0 HTML_MESSAGE BODY: HTML included in message 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% [score: 0.4901] 0.8 RDNS_NONE Delivered to internal network by a host with no rDNS X-SA-Exim-Connect-IP: x.x.x.x X-SA-Exim-Mail-From: xxx X-SA-Exim-Scanned: No (on ); SAEximRunCond expanded to false -- Re@lBanda -- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: tons of forged bills in german
On Sat, 18 Jan 2014, Michael Monnerie wrote: Dear list, since this week there are tons of very good forged bills that look like real, from big companies like telekom, vodafone, etc. They look like the original, and just the link in the middle, where it says "download your bill here", goes to a site containing trojans. [snip..] domain. Also, as Vodafone uses SPF, I'd like to check if I hit VODAFONEgood && !SPF signature in the mail. The problem with all this is, that there are MANY companies, so does someone have a better idea?

For companies who use SPF or DKIM, create a whitelist_auth entry for them, then either blacklist them or create rules to hit on any sign of the company's messages. The whitelist_auth will override any rules so real messages will get thru and the blacklist/targeted rules will hit the impostors.

-- Dave Funk University of Iowa College of Engineering 319/335-5751 FAX: 319/384-0549 1256 Seamans Center Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527 #include Better is not better, 'standard' is better. B{
Re: dependency hell (completely off-topic...)
On Fri, 15 Nov 2013, David F. Skoll wrote:

> On Fri, 15 Nov 2013 16:25:30 + RW wrote:
>
>> Why not just email yourself the package files?
>
> Or write an IP-over-email network driver that tunnels to an exterior friendly machine... (/me ducks...)
>
> Regards, David.

That would earn him a visit by the MiB who snoop all incoming & outgoing emails (would perplex the c**p outta them, they'd assume he was up to something ;).

--
Dave Funk
Re: Explanation of message of RDNS_NONE??
On Tue, 22 Oct 2013, Kai Schaetzl wrote:

> Webmaster DKDB wrote on Tue, 22 Oct 2013 08:08:01 +0200:
>
>> dkdb.dk.37.66.77.in-addr.arpa
>
> Probably because of this. This reverse DNS is not under an existing top-level-domain and looks very much like a normal reverse lookup (and not the result). Have them set it to a real public hostname.
>
> Kai

Kai, .in-addr.arpa. -is- the official top-level DNS zone for reverse map data.

Webmaster, that name shows up because the reverse-map entry for 119 in the 37.66.77.in-addr.arpa zone file is missing a period at its end, so the zone name gets appended to the hostname. That's a DNS admin error. Send email to hostmas...@ngdc.net and ask them to fix that.

--
Dave Funk
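To make the missing-period point concrete, here is a rough Python sketch of how a name in a zone file gets expanded. The function name `qualify` is made up for illustration, and the hostname `dkdb.dk` is only inferred from the string quoted above; the real zone file was never posted.

```python
def qualify(name: str, origin: str) -> str:
    """Expand a zone-file name roughly the way a DNS server does."""
    if name.endswith("."):
        return name               # trailing period: already fully qualified
    return f"{name}.{origin}"     # relative name: the zone origin is appended


if __name__ == "__main__":
    origin = "37.66.77.in-addr.arpa."
    # PTR target written without the trailing period (the suspected mistake)
    print(qualify("dkdb.dk", origin))
    # PTR target written with the trailing period (what was presumably intended)
    print(qualify("dkdb.dk.", origin))
```

The first call yields a name ending in the reverse zone, which matches what the original poster saw.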
Re: How do I find a parent rule for a test?
score and affect the overall message score.

So, at the most basic level, any rule having a name that starts with two underscores is _inherently_ a base for other rules. In order to determine *which* rules it's a base for, you have to look for that rule name in the config files. This isn't too easy to do online; you pretty much have to grep the rules files in a local install.

--
 John Hardin KA7OHZ    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174    pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
 WSJ on the Financial Stimulus package: "...today there are 700,000 fewer jobs than [the administration] predicted we would have if we had done nothing at all."
---
 Tomorrow: the 226th anniversary of the signing of the U.S. Constitution

--
Dave Funk
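The "grep the rules files" step can be sketched in Python as below. This is only a minimal illustration: `find_parents` and the `DEMO` rule names are invented for the example, and it looks only at `meta` lines, on the assumption that this is where sub-rules get combined.

```python
import re
import tempfile
from pathlib import Path

# Matches "meta RULE_NAME <expression>", allowing leading indentation.
META_RE = re.compile(r"^\s*meta\s+(\S+)\s+(.*)$")


def find_parents(subrule: str, rules_dir: str) -> list[str]:
    """Return the names of meta rules whose expression mentions `subrule`."""
    # Match the name only as a whole token, so __FOO does not match __FOO_BAR.
    token = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(subrule) + r"(?![A-Za-z0-9_])"
    )
    parents = []
    for cf in sorted(Path(rules_dir).rglob("*.cf")):
        for line in cf.read_text(errors="replace").splitlines():
            m = META_RE.match(line)
            if m is None:
                continue
            expression = m.group(2).split("#", 1)[0]  # drop a trailing comment
            if token.search(expression):
                parents.append(m.group(1))
    return parents


if __name__ == "__main__":
    # Made-up rules, written to a temporary directory for the demo.
    sample = (
        "header __DEMO_SUB   Subject =~ /invoice/i\n"
        "meta   DEMO_PARENT  (__DEMO_SUB && !__DEMO_OTHER)\n"
        "meta   DEMO_OTHER   (__DEMO_SUB_LONGER)\n"
    )
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "demo.cf").write_text(sample)
        print(find_parents("__DEMO_SUB", tmp))
```

On a real install you would point `rules_dir` at wherever your distribution keeps the rule files.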
Re: Rules not working
On Sun, 8 Sep 2013, Raymond Jette wrote:

> When I add custom rules to /etc/mail/spamassassin/local.cf the rules work as expected. If I create any *.cf file and put the rules in they do not work. My test rule is:
>
>     body test_match_all /.*/
>     score test_match_all -0.01
>
> Rules only work if they are in local.cf. If I run the following command:
>
>     echo | spamassassin --debug
>
> I can see my custom rules that are in files other than local.cf get called. Why would they work this way but never get called when spamd is called from exim?
>
> Thanks for any help you can provide,
> Ray

File system permissions issues? Are the new rules files readable by the "exim" user?

--
Dave Funk
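One quick way to follow up on the permissions question is to list rule files that are not world-readable. The sketch below is an approximation only: it checks the "other" read bit and ignores group membership and ACLs, and the function name is made up. The directory is the one mentioned in the message.

```python
import stat
from pathlib import Path


def not_world_readable(config_dir: str) -> list[str]:
    """Names of *.cf files in config_dir without the 'other' read bit set."""
    flagged = []
    for cf in sorted(Path(config_dir).glob("*.cf")):
        if not cf.stat().st_mode & stat.S_IROTH:
            flagged.append(cf.name)
    return flagged


if __name__ == "__main__":
    # Path taken from the message above; adjust for your install.
    print(not_world_readable("/etc/mail/spamassassin"))
```

A file that shows up here may still be readable through its group, so treat the result as a hint rather than a verdict.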
Re: Catching fake LinkedIn invites
On Thu, 29 Aug 2013, Michael Schaap wrote:

> On 29-Aug-2013 00:30, John Hardin wrote:
>> On Wed, 28 Aug 2013, Michael Schaap wrote:
>>> Hi, I'm getting loads of fake LinkedIn invites, most of which aren't caught by SpamAssassin. Does anyone have a good SpamAssassin rule to catch those, while letting real LinkedIn invites through?
>>
>> Do they fail SPF or DKIM?
>
> Unfortunately not, for the most part. (The "From:" header is at linkedin dot com, but the envelope sender is a random address, and I guess SPF and DKIM run on the envelope sender only.)
>
>> If they do, and the legit ones pass SPF or DKIM, then the standard solution is to add a header rule to detect that the message claims to be from that domain (e.g. using the domain part of the From or Reply-To headers), and then either give that rule some points and also define whitelist_from_auth for the domain, or meta that rule with (SPF_FAIL || DKIM_FAIL) and give the meta some points. There were some examples of doing this for facebook within the last couple of weeks, check the list archives.
>
> Hmm, legit ones have SPF_PASS. So I guess I could set up a rule that punishes messages “From:” linkedin which don't have SPF_PASS. I might give that a try, once I find some time to figure out how...

Untested but try:

    whitelist_auth *@bounce.linkedin.com
    whitelist_auth *@linkedin.com
    blacklist_from *@linkedin.com

The whitelist_auth will kick in on any message from @linkedin.com which passes SPF or DKIM, so it will cancel the bad points from the blacklist_from and the message ends up neutral. Any purported linkedin.com message not getting the whitelist_auth boost will be clobbered by the blacklist_from.

One caveat: a transient DNS failure might cause the SPF/DKIM check to not verify, so a legit linkedin message would miss its boost.

There is a low-power version of whitelist_auth called def_whitelist_auth which only boosts by 15 points (I use it for a lot of stuff). However there isn't a def_blacklist_from, so you have to use the "full strength" versions of both white/black list (-100/+100) to make them balance each other out.

--
Dave Funk
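The balancing act can be checked with a few lines of arithmetic. This sketch uses only the magnitudes quoted in the message (100 and 15); the constant and function names are invented, and nothing here is read from a real SpamAssassin install.

```python
# Point values as quoted in the message above (assumed, not read from config).
WHITELIST_POINTS = -100.0      # full-strength whitelist_auth
DEF_WHITELIST_POINTS = -15.0   # low-power def_whitelist_auth
BLACKLIST_POINTS = 100.0       # blacklist_from


def net_list_score(authenticated: bool, low_power: bool = False) -> float:
    """Net contribution of the white/black list entries for one message."""
    score = BLACKLIST_POINTS   # blacklist_from hits every claimed sender
    if authenticated:          # SPF or DKIM verified, so the whitelist fires
        score += DEF_WHITELIST_POINTS if low_power else WHITELIST_POINTS
    return score


if __name__ == "__main__":
    print("real, full whitelist:", net_list_score(True))
    print("forged:", net_list_score(False))
    print("real, low-power whitelist:", net_list_score(True, low_power=True))
```

The last case shows why the low-power whitelist does not pair with blacklist_from: a genuine message would still be left with 85 points.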
Re: Errors when processing mail.
On Sun, 14 Jul 2013, Christian Dysthe wrote:

> Hi, I am very new to Spamassassin and trying to have it work with the Citadel mail server which has support for spamassassin. I'm running Spamassassin 3.3.2-2ubuntu1 on Ubuntu Server Edition 12.04 LTS x64. The entries I get in the mail.log when mail is being delivered are:
>
>     Jul 14 16:52:21 concerto spamd[7687]: plugin: eval failed: bayes: (in learn) locker: safe_lock: cannot create tmp lockfile /nonexistent/.spamassassin/bayes.lock.concerto..com.7687 for /nonexistent/.spamassassin/bayes.lock: No such file or directory
>     Jul 14 16:52:21 concerto spamd[7687]: spamd: clean message (-0.7/5.0) for (unknown):65534 in 0.2 seconds, 1496 bytes.
>     Jul 14 16:52:21 concerto spamd[7687]: spamd: result: . 0 - FREEMAIL_FROM,MSGID_FROM_MTA_HEADER,RCVD_IN_DNSWL_LOW,SPF_PASS,T_DKIM_INVALID scantime= 0.2,size=1496,user=(unknown),uid=65534,required_score=5.0,rhost=localhost,raddr=127.0.0.1,rport=43706,mid=,autolearn=unavailable
>     Jul 14 16:52:21 concerto spamd[7686]: prefork: child states: II
>
> Could someone point me in the right direction so I can solve this issue, or these issues even?

The error indicates that the Bayes component of your spamassassin cannot create the lock file "/nonexistent/.spamassassin/bayes.lock.concerto..com.7687".

First order of business: what do you get when you do

    ls -ld /nonexistent/.spamassassin

Does that directory exist? What are its ownership and permissions? Is it writable by the UID that your spamd is running under?

Bottom line, the spamassassin Bayes module needs a writable working directory. Your error messages imply that the directory your spamassassin configuration is telling Bayes to use (that "/nonexistent/.spamassassin" thing) has issues. So you need to either fix that directory or fix your configuration to tell it where the directory it -should- be using is.

I don't know that "Citadel" kit; you may be better off finding a discussion list which is specifically about it. Just guessing by that directory name ("/nonexistent/"), it's something that you need to explicitly create and change your configuration to point to.

--
Dave Funk
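The three questions above (exists? permissions? writable?) can be rolled into one small check. This is a sketch under two assumptions: the function name is made up, and `os.access` answers for the user running the script, so it only tells you something useful when run as the same UID as spamd (uid 65534 in the log, which on Debian and Ubuntu is normally "nobody" with home /nonexistent).

```python
import os


def check_state_dir(path: str) -> str:
    """Classify a would-be Bayes working directory for the current user."""
    if not os.path.isdir(path):
        return "missing"
    # Creating a lock file needs both write and search (execute) permission.
    if not os.access(path, os.W_OK | os.X_OK):
        return "not writable"
    return "ok"


if __name__ == "__main__":
    # Directory named in the log lines quoted above.
    print(check_state_dir("/nonexistent/.spamassassin"))
```

If this reports "missing", the directory has to be created somewhere real and the configuration pointed at it, as suggested above.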
Re: False negatives/positives on debian
On Sat, 22 Jun 2013, Robert S wrote:

> I've eliminated this problem by using openDNS servers:
>
>     # cat /etc/resolv.conf
>     domain mydomain.net.au
>     search mydomain.net.au
>     nameserver 192.168.0.33 #<--- My server IP
>     nameserver 208.67.220.220
>     nameserver 208.67.222.222
>
> Is this likely to have untoward consequences? I've also looked at using unbound - which looks quite straightforward.

Assuming that your dnsmasq (or other DNS server) is running on the same machine as your SA, use the loopback IP addr (127.0.0.1) instead of the explicit IP addr of your server's ethernet interface. I.e., in your resolv.conf use:

    domain mydomain.net.au
    search mydomain.net.au
    nameserver 127.0.0.1
    nameserver 208...stuff
    nameserver some.other.server..

This is for several reasons:

1) ease of maintenance: it always works, even after changing your server's IP addr for whatever reason.
2) security: you can then change your DNS server to only listen for queries on the loopback addr and make it more immune to remote attacks.
3) performance: DNS queries work best if they fit in a single UDP packet. The loopback has a larger MTU than standard enet interfaces, so it is more likely to handle large DNS queries w/o fragmentation or TCP fallback.

Now if you're also using that DNS server to provide DNS service for other client machines on your local LAN then you cannot make the change in (2) (make the DNS server only listen to loopback), but it still simplifies configuration (allow all queries on lo0 and selected queries on eth*).

--
Dave Funk
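As a small sanity check on the advice above, the following Python sketch reads resolv.conf-style text and reports whether the first nameserver is a loopback address. The function names are invented, and the parsing is deliberately simplistic (it just drops anything after a `#` or `;`).

```python
import ipaddress


def nameservers(resolv_conf_text: str) -> list[str]:
    """Return the nameserver addresses in the order they are listed."""
    servers = []
    for line in resolv_conf_text.splitlines():
        for marker in ("#", ";"):              # strip comments
            line = line.split(marker, 1)[0]
        parts = line.split()
        if len(parts) >= 2 and parts[0] == "nameserver":
            servers.append(parts[1])
    return servers


def first_is_loopback(resolv_conf_text: str) -> bool:
    """True if the first listed nameserver is a loopback address."""
    servers = nameservers(resolv_conf_text)
    return bool(servers) and ipaddress.ip_address(servers[0]).is_loopback


if __name__ == "__main__":
    before = "nameserver 192.168.0.33 #<--- My server IP\nnameserver 208.67.220.220\n"
    after = "nameserver 127.0.0.1\nnameserver 208.67.220.220\n"
    print(first_is_loopback(before), first_is_loopback(after))
```

The resolver tries nameservers in listed order, which is why only the first entry is checked here.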
Re: False negatives/positives on debian
On Sat, 22 Jun 2013, Robert S wrote:

> I am running spamassassin_3.3.2-5 on debian Wheezy on a small business server (x86). I am getting numerous complaints about mail being falsely categorised as spam/ham. I also use version 3.3.2 on my home server using gentoo (amd64) and don't have these problems.
>
> I have removed all customisations and have reinstalled spamassassin on my debian machine. There still seem to be problems - here's an example using the provided sample files. Can anybody help? This message seems to get blocked in a lot of blocklists (which also seem to happen to my users' messages).
>
> Options for SA are:
>
>     # ps ax |grep spam
>     22408 ? Ss 0:02 /usr/sbin/spamd --create-prefs --max-children 5 --helper-home-dir -d --pidfile=/var/run/spamd.pid
>
> /etc/procmailrc includes this:
>
>     * < 256000
>     | /usr/bin/spamc
>
>     $ spamc < sample-nonspam.txt
>     Received: from localhost by debian.myserver.net.au with SpamAssassin (version 3.3.2); Sat, 22 Jun 2013 12:06:12 +1000
>     From: Keith Dawson
>     To: t...@world.std.com
>     Subject: TBTF ping for 2001-04-20: Reviving
>     Date: Fri, 20 Apr 2001 16:59:58 -0400
>     Message-Id:
>     X-Spam-Checker-Version: SpamAssassin 3.3.2 (2011-06-06) on debian.myserver.net.au
>     X-Spam-Flag: YES
>     X-Spam-Level:
>     X-Spam-Status: Yes, score=8.5 required=5.0 tests=RP_MATCHES_RCVD,SAGREY, URIBL_AB_SURBL,URIBL_BLOCKED,URIBL_GREY,URIBL_MW_SURBL,URIBL_PH_SURBL, URIBL_RED,URIBL_WS_SURBL autolearn=no version=3.3.2
>     MIME-Version: 1.0
>     Content-Type: multipart/mixed; boundary="--=_51C50694.B9FC2455"
>
>     This is a multi-part message in MIME format.
>
>     =_51C50694.B9FC2455
>     Content-Type: text/plain; charset=iso-8859-1
>     Content-Disposition: inline
>     Content-Transfer-Encoding: 8bit
>
>     Spam detection software, running on the system "debian.myserver.net.au", has identified this incoming email as possible spam. The original message has been attached to this so you can view it (if it isn't spam) or label similar future email. If you have any questions, see the administrator of that system for details.
>
>     Content preview: -BEGIN PGP SIGNED MESSAGE- TBTF ping for 2001-04-20: Reviving T a s t y B i t s f r o m t h e T e c h n o l o g y F r o n t [...]
>
>     Content analysis details: (8.5 points, 5.0 required)
>
>      pts rule name        description
>     ---- ---------------- --------------------------------------------------
>     -1.5 RP_MATCHES_RCVD  Envelope sender domain matches handover relay domain
>      0.0 URIBL_RED        Contains an URL listed in the URIBL redlist [URIs: tbtf.com]
>      0.0 URIBL_BLOCKED    ADMINISTRATOR NOTICE: The query to URIBL was blocked. See http://wiki.apache.org/spamassassin/DnsBlocklists#dnsbl-block for more information. [URIs: tbtf.com]
>      1.1 URIBL_GREY       Contains an URL listed in the URIBL greylist [URIs: tbtf.com]
>      0.0 URIBL_PH_SURBL   Contains an URL listed in the PH SURBL blocklist [URIs: tbtf.com]
>      4.5 URIBL_AB_SURBL   Contains an URL listed in the AB SURBL blocklist [URIs: tbtf.com]
>      1.7 URIBL_WS_SURBL   Contains an URL listed in the WS SURBL blocklist [URIs: tbtf.com]
>      1.7 URIBL_MW_SURBL   Contains a Malware Domain or IP listed in the MW SURBL blocklist [URIs: tbtf.com]
>      1.0 SAGREY           Adds 1.0 to spam from first-time senders
>
>     =_51C50694.B9FC2455
>     Content-Type: message/rfc822; x-spam-type=original
>     Content-Description: original message before SpamAssassin
> [snip..]

Clearly the bulk of those points come from those URI-RBL type rules, which look like FPs. At least that "tbtf.com" domain isn't listed right now; it -might- have been when this message was processed.

However, given that the "URIBL_BLOCKED" rule fired, it looks more like there's something wrong with your setup which is causing all those URI-RBLs to FP. Have you looked at the web page that the URIBL_BLOCKED rule references? Have you investigated why it fired? Have you tried taking any of the advice on that page as to how to deal with this problem?

To go beyond the advice on that page we'd need to know more details about how your DNS/network is configured on your SA scanner machine (are you running a local caching DNS server? Are you using some explicit DNS forwarder? Does your ISP do anything special with DNS queries? ...)

--
Dave Funk
Re: MariaDB instead of MySQL
On Fri, 17 May 2013, David F. Skoll wrote:

> On Fri, 17 May 2013 08:58:53 -0700 Quanah Gibson-Mount wrote:
>
>> Personally I wish SA supported LMDB.
>
> We've had very good luck with CDB, Dan Bernstein's "constant database" format. Reads are unbelievably fast. The only downside to CDB is that you cannot update a CDB file. You need to generate a new one from scratch. Still, even that is quick enough that we use it.
>
> One of my colleagues benchmarked CDB versus Berkeley DB and the difference was dramatic:
>
> http://www.dmo.ca/blog/benchmarking-hash-databases-on-large-data/
>
> CDB was about 6-7 times as fast on random reads as Berkeley DB.

If CDB is read-only, how do you store the a-time values on lookups so you know which tokens aren't being used to facilitate expiry?

--
Dave Funk
Re: .pw / Palau URL domains in spam
Donesh,

Thanks for your prompt response. Do you just want the domain names or do you also want copies of the spam?

Dave

On Sun, 5 May 2013, doneshlaher wrote:

> Hello Dave Funk,
>
> Thank you for providing us with the list of domain names. We are acting on them and will be taken down within 24/48 hours. We request you to report the domain names at abuse.al...@registry.pw and also cc the same mail to abuse.al...@directi.com.
>
> Regards
> Donesh Laher
> Cyber Security Analyst
> .PW Registry

--
Dave Funk
Re: .pw / Palau URL domains in spam
On Wed, 1 May 2013, doneshlaher wrote:

> Hello Axb,
>
> Thank you for providing the domain names. We will be suspending all these reported domain names. However, in the meantime may I know what kind of spam has been received? Also, can you please forward us the email headers for a few of the reported domain names? This would help us to analyse the headers and understand whether the account is compromised or not.
>
> Regards
> Donesh Laher
> Cyber Security Analyst
> .PW Registry

Donesh,

How many dozen spams a day would you like to receive? Should I send them to your personal address or is there some other reporting address I should use?

We are not a large site (only a few thousand users) but in the past few weeks have been receiving hundreds of spams a day advertising ".pw" domains. Here's a partial list from the past 3 days (this list would be much larger except that I've been black-listing the IP addresses of their hosting providers as fast as I can identify them):

    vision-virtuahosting1.pw visionsvirtualwebhost4.pw allsupremedeal.pw
    alltopdeals.pw amerivalues.pw autopricefind.pw autopricefinder.pw
    banesgroup.pw dallyhost.pw dimehosts.pw dursidis.pw efulan.pw
    efundess.pw ekmsgroup.pw ezhotdealz.pw getgreatwins.pw gethotdealz.pw
    grevaluaqu.pw igreatness.pw imaginec1.pw iradjead.pw islity.pw
    metagreatwins.pw neathotdealz.pw newgreatdealz.pw progreatdealz.pw
    servermaximum.pw sharpgreatdealz.pw sleekgreatdealz.pw specialzhome.pw
    specialzland.pw specialztoday.pw successtopdeals.pw superbtopdeals.pw
    supertopdeals.pw usdirects1.pw vision-virtualhosting12.pw
    vision-virtualhosting14.pw visionsvirtualwebhost2.pw zbidnow.pw
    avanheertyu.pw getsuperiordeal.pw sleeplessdaysnow.pw gwampuer.pw
    treelendnews.pw getmatchednows.pw

--
Dave Funk
Re: rule problem basing on X-Spam-ASN - not a rule problem
On Thu, 25 Apr 2013, Frank Gadegast wrote:

> And SA is doing it right, to remove all X-Spam-lines before it starts, so that spammers cannot trick SA. And whatever line is inserted by ASN.pm, it needs to be stripped too, and that's why it's programmed like it is.
>
> But I still have no idea how to get it done in a perfect order, like the following:
>
> - SA strips the X-Spam-lines
> - ASN.pm inserts its line including the AS
> - SA runs its rules and triggers also on the X-Spam-ASN-line
>
> Is it time to ask the developers or file a bug ?

This doesn't help unless the plugin adds a pseudo-header, but in the case of plugins that do, you can change the priority of your rules to get them to run after the plugin.

I ran into this issue when writing rules to check the results of the ClamAV plugin. E.g., in my "clamav.cf" file I invoke the plugin with an "eval", then have rules to trigger off the pseudo-header that it adds. In the rules I have lines like:

    #loadplugin ClamAV /etc/mail/spamassassin/plugins/clamav.pm   now done in v310.pre
    #
    full L_CLAMAV eval:check_clamav()
    describe L_CLAMAV Clam AntiVirus detected a virus
    score L_CLAMAV 3
    #
    header T__MY_CLAMAV X-Spam-Virus =~ /Yes/i
    header T__MY_CLAMAV_SANE X-Spam-Virus =~ /Yes.{1,50}Sanesecurity/i
    header T__MY_CLAMAV_MSRBL X-Spam-Virus =~ /Yes.{1,50}(?:MSRBL|MBL)/
    header T__MY_CLAMAV_PHISH X-Spam-Virus =~ /Yes.{1,50}Phish/
    header L_UI_PHISHs X-Spam-Virus =~ /Yes.{1,50}Phishing/
    # Need to set the 'X-Spam-Virus' header rules to a "high" priority
    # so they run late and will be evaluated -after- the plugin runs
    priority T__MY_CLAMAV
    priority T__MY_CLAMAV_SANE
    priority T__MY_CLAMAV_MSRBL
    priority T__MY_CLAMAV_PHISH
    priority L_UI_PHISHs
    #
    meta MY_CLAMAV_SANE (L_CLAMAV && T__MY_CLAMAV_SANE)
    meta MY_CLAMAV_MSRBL (L_CLAMAV && T__MY_CLAMAV_MSRBL)
    [snip..]

--
Dave Funk
Re: re-learning ? was - bayes - large message
On Sun, 21 Apr 2013, Joe Acquisto-j4 wrote:

> Thanks. This has cleared most of my fog.
>
> I had chosen to forward as it seemed simpler at the time, given the SA learning curve. Still on the uphill part.
>
> Just set up shared folders into which I can drag and drop spam and mis-caught spam, unaltered. And I can access that folder using an imap client, from the SA box. It's "alpine", which came with the distro I am using, opensuse 12.2.
>
> Is there a linux imap client that can be scripted (bash preferred)? Relatively easily? Or perhaps someone knows of something already crafted for this purpose, that needs only minor tweaking?
>
> joe a.

Included in the UWash IMAP kit is a program called "mailutil" which may be available ready-built for your OS distro (e.g. http://linux.die.net/man/1/mailutil).

One of the things that mailutil can do is transfer mailboxes (mail "folders") from one mail server to another (e.g. from a traditional mbox into an IMAP server, or from one IMAP server to another). Use it to copy your IMAP spam/ham folders to local (on your SA server) 'mbox' format folders and then learn from them.

Dave

--
Dave Funk
Re: spamass-milter rejecting messages because no score found in large emails
On Sat, 23 Mar 2013, Matus UHLAR - fantomas wrote:

>>> Am 22.03.2013 22:31, schrieb Benny Pedersen:
>>> are spamass-milter using spamc ?
>
>> On 23.03.13 00:34, Robert Schetterer wrote:
>> at my knowledge spamass-milter uses spamd, the deamon vers of spamc
>
> no, no, spamd is the daemon and spamc is an utility that talks to spam daemon :-)
> and yes, spamass-milter uses spamc. you can pass extra flags to it, e.g. -s to send all mail up to given size to spamd (default:500KB)

It is true that spamass-milter uses the spamc utility, but not all spamassassin-connecting milters do. I'm using a customized version of miltrassassin which speaks the 'SPAMC' network protocol directly to spamd, with no use of the "spamc" client program at all.

There are some milters that don't even use spamd; they directly instantiate the spamassassin engine within themselves.

--
Dave Funk
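For anyone curious what "speaking the SPAMC protocol directly" involves, here is a rough Python sketch of building a CHECK request and reading the reply. It is written from memory of the spamd protocol description, so treat the wire format (request line, `Content-length` and `User` headers, the `Spam:` reply header) as an assumption to verify against your spamd's documentation. The function names are made up, and no socket code is included; the bytes would be sent over a TCP connection to spamd.

```python
import re

# Reply header of the form "Spam: True ; 15.0 / 5.0" (assumed format).
SPAM_HEADER = re.compile(
    rb"^Spam:\s*(True|False)\s*;\s*(-?\d+(?:\.\d+)?)\s*/\s*(-?\d+(?:\.\d+)?)",
    re.MULTILINE,
)


def build_check_request(message: bytes, user: str = "") -> bytes:
    """Build a CHECK request: request line, headers, blank line, message."""
    lines = [b"CHECK SPAMC/1.2", b"Content-length: %d" % len(message)]
    if user:
        lines.append(b"User: " + user.encode("ascii"))
    return b"\r\n".join(lines) + b"\r\n\r\n" + message


def parse_check_response(raw: bytes) -> tuple[bool, float, float]:
    """Return (is_spam, score, threshold) from a spamd reply."""
    status = raw.split(b"\r\n", 1)[0].split()
    if len(status) < 2 or not status[0].startswith(b"SPAMD/") or status[1] != b"0":
        raise ValueError(f"unexpected status line: {status!r}")
    match = SPAM_HEADER.search(raw)
    if match is None:
        raise ValueError("no Spam: header in response")
    return (
        match.group(1) == b"True",
        float(match.group(2).decode("ascii")),
        float(match.group(3).decode("ascii")),
    )


if __name__ == "__main__":
    msg = b"Subject: hi\r\n\r\nbody\r\n"
    print(build_check_request(msg, user="exim"))
    print(parse_check_response(b"SPAMD/1.1 0 EX_OK\r\nSpam: True ; 15.0 / 5.0\r\n\r\n"))
```

A milter that does this skips the spamc binary entirely, which is the arrangement described above.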
Re: Hot News
On Fri, 15 Mar 2013, Kevin A. McGrail wrote:

> On 3/15/2013 9:17 AM, Tom Kinghorn wrote:
>> On 15/03/2013 15:11, Christopher Nido wrote:
>>> http://www.naturalstonesinc-munged.com/aah/pabfjd/pgrezs
>>
>> Now this is a guy with "cahona's grande' " for spamming the spamassassin list. Poor sucker.
>
> It's a compromised Yahoo! account. One of the #1 spamming issues right now for us.
>
> Regards,
> KAM

Not only a compromised Yahoo! account but also a compromised website, so listing the URLs in some kind of RBL will be problematic for FPs.

--
Dave Funk
Re: Spamassassin not parsing email messages
That implies that whatever mechanism you're using in the original process is adding a blank line (or bare 'nl' or 'cr') to the beginning of the message that you're then handing to SA.

Idiot question: are you doing (or not) a "chomp" in the initial read process?

On Fri, 28 Dec 2012, Sean Tout wrote:

> Hi Henrik & Jeff,
>
> One more input that might shed more light. I copied one of the emails from the above 3 emails into its own file and ran spamassassin from the command line in test mode against it and it worked fine. The command is
>
>     spamassassin --test-mode < /spamemails/singleemail.spam
>
> where singleemail.spam contains a single spam email.
>
> Regards,
> -Sean.

--
Dave Funk
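Why a stray leading blank line matters can be shown with any RFC 5322 style parser: the first empty line ends the header block, so every header after it is treated as body text. The sketch below uses Python's standard `email` package purely as a stand-in; it is not SpamAssassin's own parser, and the sample message is made up.

```python
from email import message_from_string

RAW = "From: a@example.com\nSubject: test\n\nbody text\n"


def subject_of(text: str):
    """Return the Subject header as the parser sees it (None if absent)."""
    return message_from_string(text)["Subject"]


if __name__ == "__main__":
    # Intact message: headers are found.
    print(repr(subject_of(RAW)))
    # Same message with one blank line prepended: headers become body text.
    print(repr(subject_of("\n" + RAW)))
    # Stripping leading CR/LF before handing the message on restores them.
    print(repr(subject_of(("\n" + RAW).lstrip("\r\n"))))
```

If the feeding script reads messages line by line, dropping leading empty lines before passing each message along is the equivalent fix.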
Re: Scoring Yahoo mail from certain continents/countries ?
On Sun, 9 Dec 2012, Frederic De Mees wrote:

> Dear list,
>
> Here is the context. The French-speaking countries receive tons of e-mails, mostly fraud attempts, fake lotteries, originating from West-Africa and sent by Yahoomail users. Often those messages contain big attachments. The payload (text of the message) is embedded in a 1MB jpeg with fake certificates of a lawyer, a logo, or whatever.
>
> Spamassassin misses 100% of them because:
>
> - the sender IP (Yahoo) is genuine and has a good reputation
> - the analysis of the message text shows nothing bad, as the millions of euros are in the picture attachment
> - due to the message size, the analysis is skipped anyway.
>
> If no customer of the mail server in question expects any mail from any Yahoo user in Africa, a simple 'header_checks' Postfix directive like this will match such messages if their sender IP starts with 41.
>
>     /^Received: from .41\..*web.*mail.*yahoo\.com via HTTP/i
>
> I admit this is rough albeit effective. On one side, not all Africa is 41. On the other side, I do not want to block all 41.
>
> I would have loved to do it with SA. This means that the line "Received: from [ip.add.res.ss].*web.*mail.*yahoo\.com via HTTP" should be detected and analysed. The ip address should be extracted. The whois of the address should be queried. The country code of the IP address would return certain number of SA points from a list of "Yahoousers bad countries" I would manage.

Because of its size, your message didn't get processed by SA at all. Try a test run with the max-size parameter bumped up high enough that SA will take a crack at it. You might find that SA is already able to deal with that garbage.

If that works then you just need to figure out how to deal with bloated image spams. Recently there have already been a couple of different threads on this list about exactly that issue (ranging from just increasing the max-size for everything, to making a special connector that truncates bloated spams).

Until you get SA to actually process these messages, there's no point in discussing added bells-and-whistles.

--
Dave Funk
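As a side note on the quoted header_checks pattern, it can be tried out in Python before it goes anywhere near a live Postfix. The Received lines below are invented samples, not headers from real mail, and the tightened pattern assumes the bracketed `[ip.add.res.ss]` form described in the message.

```python
import re

# The pattern from the message, with /i expressed as re.IGNORECASE.
ORIGINAL = re.compile(
    r"^Received: from .41\..*web.*mail.*yahoo\.com via HTTP", re.IGNORECASE
)
# A tighter variant (an assumption, not from the thread): require "[41."
TIGHTER = re.compile(
    r"^Received: from \[41\..*web.*mail.*yahoo\.com via HTTP", re.IGNORECASE
)

# Hypothetical sample headers for illustration only.
SAMPLES = [
    "Received: from [41.1.2.3] by web1234.mail.yahoo.com via HTTP",
    "Received: from [196.1.2.3] by web1234.mail.yahoo.com via HTTP",
    "Received: from 141.1.2.3 by web1234.mail.yahoo.com via HTTP",
]

if __name__ == "__main__":
    for line in SAMPLES:
        print(bool(ORIGINAL.search(line)), bool(TIGHTER.search(line)), line)
```

The third sample shows that the leading `.` in the original pattern matches any character, so an unbracketed address beginning with 141 would also hit; whether that form ever occurs in practice depends on how the headers are written.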