RE: Question on Rule
> > Hello,
> >
> > Can someone explain what this actually means and maybe provide an
> > example?
> >
> > Rule Name: FROM_MISSP_DYNIP
> > Rule Definition: misspaced + dynamic rDNS
> >
> > Getting a high score on this and having trouble finding an actual real
> > definition and example. I get the dynamic rDNS I believe, but not sure
> > about the misspaced meaning for sure.
>
> It means that there is no space between the display name and the '<', e.g.
>
> From: John Smith
>
> If you are seeing anything very different?

Thanks, however, I do see a space between the name and the '<'.

This is what it looks like:

From: =?UTF-8?Q?Name?=
RE: Question on Rule
> On 27.01.20 at 17:22, Charles Amstutz wrote:
> > Can someone explain what this actually means and maybe provide an
> > example?
> >
> > Rule Name: FROM_MISSP_DYNIP
> >
> > Rule Definition: misspaced + dynamic rDNS
> >
> > Getting a high score on this and having trouble finding an actual real
> > definition and example. I get the dynamic rDNS I believe, but not sure
> > about the misspaced meaning for sure
>
> misspaced FROM header which leaves the question open why you don't
> provide any useful information like, well, the headers or better raw-mail at
> pastebin

From your explanation, I think I found what might be causing the rule to
trigger. I think it is the weird characters in subject, from and to?

This is redacted a bit, of course.

Return-Path:
Delivered-To: recipi...@email.com
Received: (qmail 4989 invoked by alias); 25 Jan 2020 15:13:45 -0600
Delivered-To: recipi...@email.com
Received: (qmail 4975 invoked from network); 25 Jan 2020 15:13:45 -0600
Received: from SMTP Server (HELO SMTP Server) (internal IP)
  by mailserver with ESMTP; 25 Jan 2020 15:13:45 -0600
Received: (qmail 81888 invoked from network); 25 Jan 2020 15:13:35 -0600
Received: from dynamic RDNS (HELO HP511DF8) (Dynamic IP)
  by smtp external DNS name with ESMTP; 25 Jan 2020 15:13:35 -0600
Received-SPF: softfail (SMTP Server: transitioning SPF record at domain does not designate dynamic IP as permitted sender)
From: =?UTF-8?Q?Sender_name?=
To: =?UTF-8?Q?Recipient_name?=
Subject: =?UTF-8?Q?Subject?=
Date: Sat, 25 Jan 2020 19:35:07 +
Message-ID: <1815052843-1579980907@>
Content-Type: multipart/mixed;
  boundary="=_Part_Boundary_004b_6b102fb7.6b102fb7"
MIME-Version: 1.0
Question on Rule
Hello,

Can someone explain what this actually means and maybe provide an example?

Rule Name: FROM_MISSP_DYNIP
Rule Definition: misspaced + dynamic rDNS

Getting a high score on this and having trouble finding an actual real definition and example. I get the dynamic rDNS I believe, but not sure about the misspaced meaning for sure.

Thanks
RE: MISSING_SUBJECT rule on email with subject
> Hi Charles,
>
> My apologies, I forgot to provide feedback to the mailing list. Bad regex was
> the cause of this problem for us, too. As soon as the custom rule was fixed,
> the problem went away.

If I can ask, was it an incorrectly escaped special character? I think it is the @ symbol breaking mine.
RE: MISSING_SUBJECT rule on email with subject
> But as has already been pointed out it has the combination of
> MISSING_FROM and HK_RANDOM_FROM, and the latter is based on a
> From:addr test.

I saw this too; however, I thought I noticed a potentially bad regex (from another custom rule) breaking mine. I think this is the case, since MISSING_SUBJECT stopped hitting when I removed that rule. However, I'm still testing.
low scoring spam
Hello,

I keep having spam come through that hits almost zero rules (or very few). I get that this is definitely possible, but it's annoying as it's obviously spam. I guess my question is: if what we have in place isn't hitting on much, then aside from training Bayes on it, what do we do? Even that isn't enough, it seems, as after training it hits BAYES_50 and not BAYES_99. Even BAYES_99 is typically not enough to catch it as spam if it doesn't trip anything else (as BAYES_99 is only 3.5 and many users are set to a threshold of 4 or 5).
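To make the arithmetic in the message above concrete, here is a minimal Python sketch. It is not SpamAssassin code; it only assumes that a message is tagged as spam once the summed rule scores reach the user's required score. The 3.5 for BAYES_99 and the thresholds of 4 and 5 are the figures quoted above; the function name `is_spam` is made up for illustration.

```python
# Illustration only: sum the scores of the rules that hit and compare the
# total with the per-user threshold. `is_spam` is a hypothetical helper.
BAYES_99_SCORE = 3.5  # score quoted in the message above


def is_spam(rule_scores, required_score):
    """Return True when the summed rule scores reach the threshold."""
    return sum(rule_scores) >= required_score


# BAYES_99 on its own, checked against the two thresholds mentioned above.
for threshold in (4.0, 5.0):
    print(threshold, is_spam([BAYES_99_SCORE], threshold))
```

With no other rule hitting, BAYES_99 alone stays below both thresholds, which is why at least one more scoring rule (an RBL hit, a content rule, or a meta rule) is needed to tip the message over.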
RE: "bout u" campaign
I'm starting mine out at 0.5 until I see what happens.

Infinite Systems
Charles Amstutz | Systems Administrator
charl...@infinitesys.com
402.477.2474
134 S 13th Street, Suite 302 | Lincoln, NE 68508

-----Original Message-----
From: David Jones [mailto:djo...@ena.com]
Sent: Thursday, July 13, 2017 11:13 AM
To: users@spamassassin.apache.org
Subject: Re: "bout u" campaign

On 07/13/2017 10:56 AM, RW wrote:
> On Thu, 13 Jul 2017 09:33:04 -0400
> Alex wrote:
>
>> On Thu, Jul 13, 2017 at 9:29 AM, Charles Amstutz wrote:
>>> How do you use lashback? It says that it is free to use for
>>> commercial and non commercial use. How do I set it up?
>>
>> Drop this into your local.cf or similar:
>>
>> header RCVD_IN_LASHBACK eval:check_rbl('LASHBACK',
>> 'ubl.unsubscore.com')
>
> I have it as lastexternal:
>
> header RCVD_IN_UNSUBBL eval:check_rbl('ubl-lastexternal',
> 'ubl.unsubscore.com')
>
> I've found there to be quite a lot of ISP pool addresses in it, so
> deep checks are probably unsafe.
>

I started mine with lastexternal and didn't find much added value over other major RBLs, since my MTA was blocking mostly with IVM and Spamhaus RBLs that overlapped Lashback. I also wanted to check outbound mail where the second or a later hop was from an infected device, most likely under botnet control. It would have helped in the OP spam.

> I've also found it has quite a high FP rate of ~2%.
>

I am working with them to fix these FPs (they include major mail providers like Comcast, Microsoft and Google, which are pointless) and potentially be included in the default SA rules. It's still a valuable RBL to help with an overall score even with a ~2% FP. I wouldn't score it too high like you can with Spamhaus and IVM. I also have it at 1.2.

--
David Jones
RE: "bout u" campaign
Hello,

For the inexperienced, what is the difference between lashback and lastexternal?

Infinite Systems
Charles Amstutz | Systems Administrator
charl...@infinitesys.com
402.477.2474
134 S 13th Street, Suite 302 | Lincoln, NE 68508

-----Original Message-----
From: RW [mailto:rwmailli...@googlemail.com]
Sent: Thursday, July 13, 2017 10:57 AM
To: users@spamassassin.apache.org
Subject: Re: "bout u" campaign

On Thu, 13 Jul 2017 09:33:04 -0400
Alex wrote:

> On Thu, Jul 13, 2017 at 9:29 AM, Charles Amstutz wrote:
> > How do you use lashback? It says that it is free to use for
> > commercial and non commercial use. How do I set it up?
>
> Drop this into your local.cf or similar:
>
> header RCVD_IN_LASHBACK eval:check_rbl('LASHBACK',
> 'ubl.unsubscore.com')

I have it as lastexternal:

header RCVD_IN_UNSUBBL eval:check_rbl('ubl-lastexternal',
'ubl.unsubscore.com')

I've found there to be quite a lot of ISP pool addresses in it, so deep checks are probably unsafe.

I've also found it has quite a high FP rate of ~2%.
RE: "bout u" campaign
Thanks,

I was looking at the default RBL lists:

https://wiki.apache.org/spamassassin/DnsBlocklists

But I was looking for other things that are free for commercial use. I found this one, which looks like a possibility:

http://0spam.fusionzero.com/

but I don't know if anyone has experience with it, or could make other recommendations.

>Drop this into your local.cf or similar:
>header RCVD_IN_LASHBACK eval:check_rbl('LASHBACK', 'ubl.unsubscore.com')
>describe RCVD_IN_LASHBACK LashBack Unsubscribe Blacklist
>tflags RCVD_IN_LASHBACK net
>score RCVD_IN_LASHBACK 1.2

> I've scored it at 1.2. You may wish to change that, perhaps lower for a
> while, while you see how it works in your organization.
RE: "bout u" campaign
Thanks

Infinite Systems
Charles Amstutz | Systems Administrator
charl...@infinitesys.com
402.477.2474
134 S 13th Street, Suite 302 | Lincoln, NE 68508

-----Original Message-----
From: Alex [mailto:mysqlstud...@gmail.com]
Sent: Thursday, July 13, 2017 8:33 AM
To: Charles Amstutz; SA Mailing list
Subject: Re: "bout u" campaign

On Thu, Jul 13, 2017 at 9:29 AM, Charles Amstutz wrote:
> How do you use lashback? It says that it is free to use for commercial and
> non commercial use. How do I set it up?

Drop this into your local.cf or similar:

header RCVD_IN_LASHBACK eval:check_rbl('LASHBACK', 'ubl.unsubscore.com')
describe RCVD_IN_LASHBACK LashBack Unsubscribe Blacklist
tflags RCVD_IN_LASHBACK net
score RCVD_IN_LASHBACK 1.2

I've scored it at 1.2. You may wish to change that, perhaps lower for a while, while you see how it works in your organization.
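For readers wondering what a lookup against `ubl.unsubscore.com` amounts to on the wire: DNS blocklists are conventionally queried by reversing the IPv4 octets of the relay address and appending the list's zone. The following is a minimal Python sketch of that naming convention only. It does not perform the DNS query, it is not how SpamAssassin is implemented internally, and the helper name `dnsbl_query_name` is hypothetical. The IP address is the one that appears in a report quoted elsewhere in this thread.

```python
import ipaddress


def dnsbl_query_name(ip, zone):
    """Build the DNS name a DNSBL client would look up for an IPv4 address."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


print(dnsbl_query_name("204.29.186.60", "ubl.unsubscore.com"))
# prints 60.186.29.204.ubl.unsubscore.com
```

By convention, an A-record answer for that name means the address is listed, and NXDOMAIN means it is not.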
RE: "bout u" campaign
As a follow-up: it says how to do the DNS, just not how to list it in the .cf files. Maybe I can copy another blacklist's syntax?

Infinite Systems
Charles Amstutz | Systems Administrator
charl...@infinitesys.com
402.477.2474
134 S 13th Street, Suite 302 | Lincoln, NE 68508

-----Original Message-----
From: David Jones [mailto:djo...@ena.com]
Sent: Thursday, July 13, 2017 8:17 AM
To: users@spamassassin.apache.org
Subject: Re: "bout u" campaign

On 07/12/2017 09:50 PM, Alex wrote:
> Hi,
>
>> pretty high mainly due to DCC and BAYES_99.
>
> Are you paying for DCC? I think we're over their limit and they
> blacklisted us long ago, lol.

I have my own DCC server joined into the DCC network.
https://www.dcc-servers.net/dcc/

>
>> I guess I have well trained Bayes.
>
> I think you just don't have many one-liner emails as a regular course
> of business?

I am classifying about 10K ham and 8K spam each day, which I also use in the masscheck processing (currently on hold). Since I started doing this about a month or so ago, my BAYES scores seem to be more accurate. Maybe I wasn't training enough ham/spam before? I don't know for sure yet.

>
>> 1.2 RCVD_IN_LASHBACK RBL: Received is listed in Lashback
>> usb.unsubscore.com
>> [204.29.186.60 listed in
>> ubl.unsubscore.com]
>
> I forgot about this. I have it in postscreen (+1) but now also added it in SA.
>
>> 2.2 RCVD_IN_SORBS_SPAM RBL: SORBS: sender is a spam source
>
> We do have some in SORBS, but only score it 0.5. Do you really
> recommend scoring it so high?
>

Obviously I do because it's working well in my platform. I have other WL rules that subtract points to offset this one. If there are no other WL (i.e. list.dnswl.org) hits then this will stand out more. Do some analysis of your emails that hit this rule and what the scores were. My threshold for blocking is 6.0 (default for MailScanner). If your threshold is 5.0 and your ham with this rule hit is scoring below 3.3 (5.0 - 1.7), then you would be fine setting this to score 2.2.

>> 0.0 OS_UNKNOWN Relay runs on unknown OS
>
> That's an interesting one. Fingerprinting?
>

Yeh. I thought it might be a useful data point for making meta rules but it turns out not to be. I will probably leave this out when I rebuild my filters in the next couple of months on CentOS 7.

>> 1.2 FREEMAIL_FROM Sender email is commonly abused enduser mail
>
> This is also scored *much* lower here - we have many freemail senders.
> The default score is 0.001, so you must have changed it.
>

Yep. Again my block threshold is 6.0 in MailScanner and I have less default trust for FREEMAIL senders. I also have meta rules based on FREEMAIL and other hits that add to the score based on combinations I have seen over the years. FREEMAIL senders are very difficult to accurately filter but I feel like my rules are pretty good. I have to postwhite exclude most freemail providers since they are listed on some RBLs, which makes no sense to me. You can't block the big ones like Yahoo, Hotmail, Comcast, etc. just because they are so large and there are many legit senders in the middle of the spammers.

>> -2.2 RCVD_IN_SENDERSCORE_90_100 Senderscore.org score of 90 to 100
>
> For 90_100, I think we're only subtracting -0.2.
>

For my mail flow, I have noticed that senders in the 90's are normally very trustworthy. If you separate your rules into 2 main categories, then you can set up scores based on their category to balance out the other category.

1. IP and domain reputation
2. Message content

Good IP reputation can offset questionable message content and vice versa. I tend to go heavy on the reputation side at the MTA and in SA, which has served me well in the past several years. Before that, I was constantly adjusting content rule scores and writing custom rules to react to the latest spam campaign, where I was always behind.

I have a huge list of whitelist_auth based on domain reputation which allows me to crank up some content scores and not let Bayes block good reputation senders based on content.

>> 2.2 ENA_DIGEST_FREEMAIL Freemail account hitting message digest spam
>> seen by the Internet (DCC, Pyzor, or Razor).
>
> The problem I always had with pyzor/dcc was that it works on very
> small blocks of text, no? Perhaps it works well for small messages,
> but isn't it problematic for larger messages?
>

I have no idea. I just analyzed my mail scoring and noticed combinations like DCC and FREEMAIL are common in my spam.

>> 1.2 ENA_DIGEST_MULTIPLE_MSPIKE_H2 Dcc, Razor, or
RE: "bout u" campaign
How do you use lashback? It says that it is free to use for commercial and non commercial use. How do I set it up?

Infinite Systems
Charles Amstutz | Systems Administrator
charl...@infinitesys.com
402.477.2474
134 S 13th Street, Suite 302 | Lincoln, NE 68508
RE: "bout u" campaign
I find it challenging to constantly keep up with campaigns. My guess with the phone number is that it is there to make the message seem more legitimate. More recently, I try to look for general characteristics and match on those, in order to future-proof rules. However, there are always legitimate emails being sent that would trigger a potential rule (depending on what you are matching on).

>> What is even the point of spam with a phone number?
RE: Random word spams and wiki spams
Mostly autolearned ham and some trained spam; I have found that one account needed ham, though. Most user accounts in question are at least 200/200, and most are well over a few thousand each (I believe).

>> I need to read up bayes a bit, I was surprised to learn that after
>> using sa-learn --spam, then bayes only tagged it at Bayes_50 instead
>> of Bayes_99, Unless I did something incorrect.

>There is a minimum level of both spam *and ham* that Bayes must be trained
>with before it will start providing scoreable analysis.
>How much have you trained it with?
RE: Random word spams and wiki spams
>> I find many don't contribute (despite it being open source) for fear of
>> spammers using these ideas against us, but the project suffers as a result.

I think others don't due to IP rights. I'm glad people do though.
RE: Random word spams and wiki spams
I need to read up on Bayes a bit. I was surprised to learn that after using sa-learn --spam, Bayes only tagged the message as BAYES_50 instead of BAYES_99, unless I did something incorrectly.

Note: I do not use Bayes files in user profiles; I keep the Bayes data in a MySQL database.
RE: Random word spams and wiki spams
Has anyone ever gotten something like machine learning (I get that this is kind of what Bayes is) or R working with SpamAssassin? I've seen books on this; maybe they were referring to Bayes, but I'm not sure.
RE: Random word spams and wiki spams
I set up spamdyke to block .top and many other TLDs where mostly spam came from. Unfortunately, I had to remove them, and now have to rely on content analysis together with *BLs. When I set up pattern matching in an effort to future-proof blocking, it catches legitimate email that uses characters to form tables (this happens occasionally). The only thing I could think of was to set the individual scores lower but the meta scores high. I appreciate the options for Postfix, but I do not run it on incoming mail servers.

Infinite Systems
Charles Amstutz | Systems Administrator
charl...@infinitesys.com
402.477.2474
134 S 13th Street, Suite 302 | Lincoln, NE 68508

-----Original Message-----
From: David Jones [mailto:djo...@ena.com]
Sent: Friday, July 7, 2017 11:15 AM
To: Charles Amstutz; 'users@spamassassin.apache.org'
Subject: Re: Random word spams and wiki spams

On 07/07/2017 11:04 AM, Charles Amstutz wrote:
> Thank you everyone for the suggestions, I will look into it. One thing
> I've noticed is that sometimes it takes a day for any *BL's to pick up
> some of the spam, and by that time, the run could be done. Greylisting
> isn't an option. It sometimes feels like always reactive vs pro-active
> in filtering. For example, I try to block the old runs of "Ford
> Warranties", write a few rules, then never receive them again :)
>
> This is a slight over exaggeration, but close.
>

No. I completely understand. A couple of years ago I was doing the same thing, always reacting to new spam campaigns. It took a lot of my time and I never felt like I was winning those one-day battles.

Now I have tuned my MTA (Postfix with postscreen) to reject the majority of junk before it ever reaches SA. See the archives for these Postscreen weighted RBLs if you are running Postfix. With about 24 RBLs including invaluement, I am able to be aggressive with many RBLs adding up to a block threshold of 8 in postscreen.

On the other side of this, you have to set up postwhite to whitelist major mail providers like comcast.net, aol, google, yahoo.com, etc. and let SA score them.

Now I rarely get any reports of spam getting through unless it's from a compromised account. These will always be difficult to block for zero-hour spam campaigns from botnets.

Also, set up the KAM.cf rules and extra signatures for ClamAV from Sanesecurity. These often help with new spam campaigns. I can post which signature DBs I am using if that would be helpful.

--
Dave
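The weighted-RBL idea Dave describes can be sketched in a few lines of Python. This illustrates the scoring logic only and is not Postfix configuration. The list names and weights below are invented for the example, and only the block threshold of 8 comes from the message above. A negative weight stands in for a whitelist hit of the kind postwhite is meant to provide.

```python
# Hypothetical per-list weights; only the threshold of 8 is from the thread.
RBL_WEIGHTS = {
    "rbl-a.example": 5,
    "rbl-b.example": 3,
    "rbl-c.example": 2,
    "whitelist.example": -4,  # negative weight models a DNS whitelist hit
}
BLOCK_THRESHOLD = 8


def should_reject(listed_on):
    """Sum the weights of every list the client IP appears on."""
    total = sum(RBL_WEIGHTS.get(name, 0) for name in listed_on)
    return total >= BLOCK_THRESHOLD


print(should_reject(["rbl-a.example", "rbl-b.example"]))  # 5 + 3 = 8
print(should_reject(["rbl-a.example", "rbl-c.example"]))  # 5 + 2 = 7
```

The point of the design is that no single list has to be trusted enough to block on its own: several moderately weighted hits combine to reach the threshold, while a whitelist hit pulls a large provider back under it so that SA can score the message instead.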
RE: Random word spams and wiki spams
Thank you everyone for the suggestions; I will look into them. One thing I've noticed is that sometimes it takes a day for any *BLs to pick up some of the spam, and by that time the run could be done. Greylisting isn't an option. Filtering sometimes feels like it is always reactive rather than proactive. For example, I try to block the old runs of "Ford Warranties", write a few rules, then never receive them again :)

This is a slight exaggeration, but close.
Random word spams and wiki spams
Hello,

I am new to the group, but have experience with writing some rules and some meta rules. Has anyone come up with a good way to detect spam that has random words in paragraph form (usually at the bottom of the message body), or that looks like it copies parts of various wikis or news sources?

Thanks
Charles