Re: Gmail confidential mode

2022-11-17 Thread Dave Warren

On 2022-10-16 10:38, Alex wrote:


 > What do you know about "Gmail confidential mode" emails? I'm starting to
 > see a few of these come in to users now, and not sure how to treat them.
 > They are sent through gmail, but require a one-time passcode sent to the
 > recipient,

Did you actually look at them?  What do they look like?  What does the
recipient have to do to actually get the mail?  Does this only work
gmail to gmail?


Some of those questions I was hoping others could help me to answer. 
This is a legitimate email service provided by gmail. It was routed 
through google's servers only. It passed DKIM and SPF, but not DMARC. I 
don't think it's only gmail-to-gmail, as the recipient is not a gmail 
account.


I neglected to send my reply and found it in drafts, sorry for the late 
reply.


This isn't e-mail, it's a hosted text document and a link sent by email. 
It is functionally the same as putting something on a (vaguely) private 
PasteBin and telling your recipient where to go look at it.


ProtonMail has their own thing, when you send an "encrypted" message to 
someone not on ProtonMail...


Luckily these things don't usually take off since most people use email 
because they want email.


Google is completely unable to address their outbound spam problem, so it 
is unlikely they'll manage to address their 
spam-via-online-documents-that-bypass-spam-filters problem either. Spammers 
are good at finding ways to send messages that hide within something 
otherwise legit looking.




Re: Avoid processing upsteam trusted mail with X-Spam-Flag: YES?

2022-01-06 Thread Dave Warren

On 2022-01-06 11:13, Benny Pedersen wrote:

On 2022-01-06 18:20, Grant Taylor wrote:


Q:  Does the upstream MSA not do filtering of inbound messages from
clients?  I would think that this filtering would cover messages
originating from the upstream organization to the downstream
organization.


that header should be on same host as the email clients read there 
mails, if its trusted outside of local mta, then its forged say 
X-Spam-Flag: NO


do we want to trust it ?

spamassassin removes upstream results, to make X-Spam-Flag trusted to be 
local


Personally, I'll always trust anyone that declares "X-Spam-Flag: YES" or 
similar, because if even the sender says it is spam, who am I to argue?


In the past I had some rules that tripped on M365 mail that they decided 
was spam but sent anyway and had minimal false positives here, and I 
think I dabbled with reading SA's header and assigning some points on 
that basis for those senders who ran SA on outbound mail.


I was lucky enough to be able to read the original headers, and make 
scoring decisions outside of just SA too.


Still, I would happily trust nearly any "this is likely spam" signal a 
sender cares to forge into their messages, because who would do that on 
mail that actually should be delivered?


And since the vast majority of delivery attempts are spam, if the 
concern is some sort of resource consumption, trusting an upstream's 
"this is spam" signal is probably your biggest resource saver.


Trusting a not-spam signal is more complicated and I don't think I would 
try in SA alone, unless the upstream adds some clues that would let me 
identify messages that passed through their system specifically.
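
A minimal sketch of the "trust any spam signal" idea, assuming the check runs on the original headers before SpamAssassin sees the message (as noted earlier in the thread, SA replaces upstream results with its own). The function name is hypothetical and this is not part of SA:

```python
from email import message_from_string


def upstream_says_spam(raw_message: str) -> bool:
    """Return True if any X-Spam-Flag header on the message says YES.

    Only the positive signal is trusted; "NO" or a missing header
    tells us nothing and the message is filtered as usual.
    """
    msg = message_from_string(raw_message)
    # get_all() returns None when the header is absent.
    flags = msg.get_all("X-Spam-Flag") or []
    return any(value.strip().upper().startswith("YES") for value in flags)


sample = "From: a@example.com\nX-Spam-Flag: YES\nSubject: hi\n\nbody\n"
print(upstream_says_spam(sample))
```

Treating the flag as one-directional keeps a forged "NO" from buying a sender anything.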




Re: SPF_NONE scoring

2021-12-02 Thread Dave Warren

On 2021-11-30 12:24, Greg Troxel wrote:

Lots of people think SPF is silly.  And spammers spamming from a domain
they control can even dkim/dmarc.


Domain based reputation is an extremely powerful tool, but it is only 
useful when you know the actual sender of a message. The benefit isn't 
in blocklisting, it is in letting legitimate mail through so that you 
can filter everything else more aggressively.


This applies a lot less to smaller operations as you really need a large 
amount of data (and the skills to use it), but even at small scales 
being able to bypass spam filters for mail you know you want is 
incredibly useful, especially when you want mail from a particular 
company even though they use a garbage ESP or service provider that you 
would really rather block.
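
As a rough sketch of that bypass, assuming some earlier step (DKIM or SPF verification, not shown) has already produced an authenticated sending domain. The function and parameter names are hypothetical:

```python
def should_bypass_filter(authenticated_domain, wanted_domains):
    """Return True when mail is authenticated as coming from a wanted domain.

    authenticated_domain: domain proven by DKIM/SPF, or None if nothing
    could be authenticated (in which case we never bypass).
    wanted_domains: domains whose mail we always want to receive.
    """
    if authenticated_domain is None:
        return False
    domain = authenticated_domain.lower().rstrip(".")
    for wanted in wanted_domains:
        wanted = wanted.lower().rstrip(".")
        # Match the domain itself or any subdomain of it.
        if domain == wanted or domain.endswith("." + wanted):
            return True
    return False


print(should_bypass_filter("mail.example.com", {"example.com"}))
```

The point is that the bypass keys on the authenticated domain, not on whichever ESP happened to relay the message.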





Re: updates.spamassassin.org not resolving

2021-07-23 Thread Dave Warren

On 2021-07-23 06:54, Benny Pedersen wrote:

On 2021-07-23 14:35, Kevin A. McGrail wrote:

TL;DR: Everything looks good to me.


+1


I think you are just doing DNS calls that are either invalid or look
like you are trying to do discovery through recursion.  For example:

dig -t txt 0.0.4.updates.spamassassin.org @ns2.pccc.com


why is specific version needed ?, rules updates works imho with all 
spamassassin versions if version is used in rule sets ?


Today that is true, but as ClamAV has discovered, old versions sometimes 
behave badly in a way that can generate a lot of pain in the future.


Having requests already going to versioned URLs provides a lot of 
options, for example, asking outdated versions to stay on outdated 
rulesets:


0.2.3.updates.spamassassin.org. 3600 IN TXT "895075"

Which is better than feeding them a ruleset in a format that they won't 
understand, and can't meaningfully use while the current release gets 
new rulesets:


0.3.3.updates.spamassassin.org. 3600 IN TXT "1891700"
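
For illustration, the two names above follow from reversing the version components and appending the channel name. A small sketch (the helper name is made up, and the real sa-update logic may differ in detail):

```python
def update_txt_name(version: str, channel: str = "updates.spamassassin.org") -> str:
    """Build the versioned DNS name that holds the ruleset revision in a TXT record."""
    parts = version.split(".")
    # 3.3.0 -> 0.3.3, then append the channel name.
    return ".".join(reversed(parts)) + "." + channel


print(update_txt_name("3.3.0"))  # 0.3.3.updates.spamassassin.org
print(update_txt_name("3.2.0"))  # 0.2.3.updates.spamassassin.org
```

Because the version is part of the name, the project can pin each release line to a different ruleset revision just by publishing different TXT values.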



Re: MALFORMED_FREEMAIL

2019-11-01 Thread Dave Warren
In general it is the concept of sending from a particular domain in a 
format that the infrastructure on that domain will not send.


A really easy to grasp concept: I know that example.com's mail server 
always adds a X-Yup-We-Sent-It: True header, so I will consider anything 
claiming to be coming from example.com but missing that header to be 
suspicious.


Similar to messages with a header indicating they were written in a 
client but yet formatted in a way that that client does not produce.
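
The example.com case could be sketched like this in Python. The header name comes from the example above; the function name is hypothetical and this is not how the SA rule is implemented:

```python
from email import message_from_string
from email.utils import parseaddr


def missing_expected_header(raw_message, domain="example.com",
                            header="X-Yup-We-Sent-It"):
    """Return True if the mail claims to be from `domain` but lacks the
    header that domain's real infrastructure always adds."""
    msg = message_from_string(raw_message)
    # parseaddr() splits "Name <user@host>" into (name, address).
    _, address = parseaddr(msg.get("From", ""))
    from_domain = address.rpartition("@")[2].lower()
    return from_domain == domain and msg.get(header) is None


print(missing_expected_header("From: Bob <bob@example.com>\n\nhi"))
```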



On 2019-11-01 10:55, Axb wrote:

What is a "faked mail" ?

On 11/1/19 3:15 PM, Joseph Brennan wrote:

MALFORMED_FREEMAIL is a meta on:
(MISSING_HEADERS||__HDRS_LCASE) && FREEMAIL_FROM

So that and MISSING_HEADERS itself add up to 3.0 points. This seems high.

We rejected a message from gmail that hit MALFORMED_FREEMAIL and
MISSING_HEADERS, and a few other low-scoring things. Because it was
rejected I do not have the message. I believe the sender tried to BCC a
group of people. If I recall correctly MISSING_HEADERS, which refers only
to the To: header, hits when To: exists but is blank. People (ab)using BCC
instead of a list for legit mail is not that uncommon.

The case with  __HDRS_LCASE strikes me as very different and much more
likely to be faked mail. I don't know of any freemail providers that write
header names in all lower case. A check against the corpus obviously needs
to back up my guess but I think I'm right.







Re: SpamAssassin Scoring For MDAEMON_DNSBL

2019-05-26 Thread Dave Warren

On 2019-05-14 09:17, John Hardin wrote:

On Tue, 14 May 2019, cyflhn wrote:

It has happened many times that the emails from our server were identified as
spam. I have checked the emails which were not identified as spam. But I
found that the SpamAssassin Scoring For MDAEMON_DNSBL is quite high, the
score of MDAEMON_DNSBL is always 4. I also checked the logs of SpamAssassin
and here are some messages:


Is this a local SA install, or some third party testing service? If the 
latter, who?



Performing DNS-BL lookup
* zen.spamhaus.org - passed
* bl.spamcop.net - passed
* bad.psky.me - failed - 198.54.117.200


That doesn't appear to be SA related. Is that just informational related 
data?



* 1.6 BAYES_50 BODY: Bayes spam probability is 40 to 60%
* [score: 0.5000]
* 4.0 MDAEMON_DNSBL MDaemon: marked by MDaemon\'s DNSBL
* 2.1 FREEMAIL_FORGED_REPLYTO Freemail in Reply-To, but not From



I still don't know what's reason for such a high score for MDAEMON_DNSBL


That rule is not in the base SA ruleset so we can't help you analyze it. 
I suggest you contact MDaemon to see why you're listed.


I've been away on a family matter, but I can provide a bit of context 
about this particular rule. I previously worked with MDaemon (then Alt-N 
Technologies) and although this was some years ago I'm still familiar 
with the product, so feel free to reach out on or off list as 
applicable.


The "Performing DNS-BL lookup" header (above) shows the DNS-BLs which 
are configured in MDaemon and the results for each. bad.psky.me has not 
(to my knowledge) ever been a default in MDaemon.


Normally you should only use DNS-BLs for outright blocking at this stage 
(and let SpamAssassin's own DNS-BL functionality score) as this feature 
provides pre-DATA message rejection, but if you choose to accept 
messages that hit MDaemon's DNS-BL then you can pass points into 
SpamAssassin via the MDAEMON_DNSBL rule.


There are a few reasons for this, but mainly it comes down to the fact 
that MDaemon's DNS-BL implementation predated SpamAssassin being 
supported by MDaemon, and in the initial implementation there were a 
number of issues with SpamAssassin's implementation when running under 
Windows. These issues are long since resolved, but there is no incentive 
to remove the integration.


Getting the IP removed from bad.psky.me will cause the MDAEMON_DNSBL hit 
in SpamAssassin to disappear. Since bad.psky.me seems to be in a "list 
the world" phase, the MDaemon administrator should completely remove this 
DNS-BL from the MDaemon configuration.


There is nothing a sender can do, only the receiving MDaemon server's 
administrator can make changes here.


Re: Amazon continues to get tagged as spam

2019-04-02 Thread Dave Warren

On 2019-04-02 06:01, RW wrote:

On Mon, 01 Apr 2019 20:14:13 -0400
Dave Warren wrote:



1.8 DKIM_ADSP_DISCARD  No valid author signature, domain signs
all mail and suggests discarding the rest



This is a bit odd too, I don't see ADSP records on Amazon's
various .com domains (although there is one on at least one
country-specific domain, but I only see .com in the pasted header).

Perhaps someone else can comment if SpamAssassin overloads the
DKIM_ADSP_DISCARD rule with other meanings beyond literal ADSP
records? I don't have a good way to read through the .cf files from
this location.



There are overrides in 60_adsp_override_dkim.cf. They were put there
to give  ADSP rules something to work with until ADSP caught on, which
it never really did.



Ahh. Well that's unfortunate, it really should note that it failed an 
invented test -- Or better yet, consider dropping the whole thing now 
that DMARC has replaced what ADSP tried to accomplish.


Re: Amazon continues to get tagged as spam

2019-04-01 Thread Dave Warren
On Mon, Apr 1, 2019, at 17:11, @lbutlr wrote:
> 3.5 BAYES_99   BODY: Bayes spam probability is 99 to 100%
> [score: 1.]
> 0.2 BAYES_999  BODY: Bayes spam probability is 99.9 to 100%
> [score: 1.]

These two are both a bit of a bad sign: they indicate that the bayes system is 
very *very* sure that this message is spam. While this shouldn't override 
whitelisting, I would probably investigate the training methods being used here.


> 1.8 DKIM_ADSP_DISCARD  No valid author signature, domain signs all
> mail and suggests discarding the rest


This is a bit odd too, I don't see ADSP records on Amazon's various .com 
domains (although there is one on at least one country-specific domain, but I 
only see .com in the pasted header).

Perhaps someone else can comment if SpamAssassin overloads the 
DKIM_ADSP_DISCARD rule with other meanings beyond literal ADSP records? I don't 
have a good way to read through the .cf files from this location.



Re: Filtering at border routers: Is it possible?

2019-03-25 Thread Dave Warren

On 2019-03-22 21:43, Grant Taylor wrote:

On 3/22/19 7:01 PM, Dave Warren wrote:
To me, the big one is this: It sets your users up for failure. If a 
user configures their client on a network that allows unrestricted 
port 25 access and later moves (temporarily or permanently) to a 
network that does restrict port 25, they'll get an error and you'll 
get a support ticket.


Valid as that is, that is addressing a client issue, not a server issue.


It isn't really a server or client issue, rather, it is a user issue and 
a technical support issue.



You'll save yourself a lot of hassle if you get clients set up right 
from the start rather than fixing user configurations after the fact.


Agreed.  But configuring clients to use port 587 or 465 does not 
preclude allowing SMTP Authentication on port 25.


This isn't really true.

By rejecting authentication on port 25 upfront you force clients to be 
configured properly from the start whereas when you allow authentication 
on port 25 a client will often guess at port 25, see that it works and 
the user will not reconfigure anything despite what the instructions 
recommend.



One other consideration, although this is more opinion than fact: In 
my experience users/clients that still default to port 25 often don't 
default to STARTTLS and therefore will transmit an unencrypted 
password at least once (even if you refuse it and instruct them to 
authenticate, the damage could already have been done). Forcing 465 is 
the only way to ensure that this can't happen, but clients that 
default to 587 are far more likely to default to using encryption.


There is another way.  You can configure the server to not offer SMTP 
Authentication until after encryption is established with STARTTLS.


That doesn't work because some (poorly written) clients blindly throw 
authentication commands hoping to get a response.


This is an admittedly minor issue as it would require an attacker in a 
MITM position to have a chance at intercepting it, but it is still less 
than ideal.
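
For anyone wanting to check what their own server advertises, here is a minimal sketch that parses an EHLO reply and reports whether AUTH is offered. The function names are hypothetical and the reply text is assumed to have been captured already (for example from a manual telnet session). As noted above, hiding AUTH does not stop a client that sends credentials blindly.

```python
def advertised_extensions(ehlo_reply: str) -> set:
    """Parse a multi-line EHLO reply into a set of extension keywords."""
    keywords = set()
    lines = ehlo_reply.strip().splitlines()
    for line in lines[1:]:  # the first line is only the server greeting
        # Each line looks like "250-KEYWORD params" or "250 KEYWORD params".
        text = line[4:].strip()
        if text:
            # Tolerate the legacy "AUTH=PLAIN LOGIN" spelling as well.
            keywords.add(text.replace("=", " ").split()[0].upper())
    return keywords


def auth_offered(ehlo_reply: str) -> bool:
    """Return True if the reply advertises SMTP AUTH."""
    return "AUTH" in advertised_extensions(ehlo_reply)


before_tls = "250-mail.example.com\n250-PIPELINING\n250-STARTTLS\n250 8BITMIME"
after_tls = "250-mail.example.com\n250-PIPELINING\n250-AUTH PLAIN LOGIN\n250 8BITMIME"
print(auth_offered(before_tls), auth_offered(after_tls))
```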




Re: Filtering at border routers: Is it possible?

2019-03-22 Thread Dave Warren

On 2019-03-22 18:37, Grant Taylor wrote:

On 3/22/19 3:23 PM, Benny Pedersen wrote:

you only need sasl auth


You should do the SMTP Authentication across STARTTLS to protect 
credentials.


do not enable sasl auth on port 25, if it lists AUTH on port 25 ehlo, 
you will need to remove  it in postfix main.cf


enable sasl auth only on port 465 and 587


What is wrong with having SMTP Authentication on the MTA port as an 
/option/?


To me, the big one is this: It sets your users up for failure. If a user 
configures their client on a network that allows unrestricted port 25 
access and later moves (temporarily or permanently) to a network that 
does restrict port 25, they'll get an error and you'll get a support ticket.


You'll save yourself a lot of hassle if you get clients set up right 
from the start rather than fixing user configurations after the fact.


One other consideration, although this is more opinion than fact: In my 
experience users/clients that still default to port 25 often don't 
default to STARTTLS and therefore will transmit an unencrypted password 
at least once (even if you refuse it and instruct them to authenticate, 
the damage could already have been done). Forcing 465 is the only way to 
ensure that this can't happen, but clients that default to 587 are far 
more likely to default to using encryption.


Re: Filtering at border routers: Is it possible?

2019-03-22 Thread Dave Warren

On 2019-03-22 18:39, Grant Taylor wrote:

On 3/22/19 3:29 PM, Benny Pedersen wrote:

custommers wish for port 25 open relay ?


Having unfettered access to send traffic to TCP port 25 is /not/ the 
same thing as an open relay.


Especially if you are a host with your clients running self-managed 
servers and you therefore cannot guess at what software they might run.


I like the idea of restricting port 25 access by default although it 
should be easy to unblock -- The point isn't to annoy customers, just to 
reduce the odds of a compromised website/script being able to spew spam.


I also wouldn't offer unblocking of port 25 under a free trial, I would 
instead suggest offering a very generous refund policy for the same 
duration as a trial if your business model offers free trials. I don't 
know if this is still the case, but in the past spammers would sign up 
using free or ultra-cheap services to get a few days worth of spamming 
out of an account.




Re: more spam is getting through :-(

2019-03-20 Thread Dave Warren

On 2019-03-18 23:39, Duane Hill wrote:

Hello Dave,

Tuesday, March 19, 2019, 12:11:40 AM, you wrote:

On 2019-03-18 17:40, @lbutlr wrote:

On 18 Mar 2019, at 13:59, James <bjloc...@lockie.ca> wrote:

On 2019-03-17 5:43 p.m., @lbutlr wrote:

On 17 Mar 2019, at 15:03, James <bjloc...@lockie.ca> wrote:

I run sa-learn --ham on my inboxes.

You inboxes likely contain spam messages that haven't been caught, so training 
on inbox will poison your bayes in favor of more spam. Unless your inbox is 
perfect (entirely devoid of spam and containing only desired messages) you 
should not do this.



The documentation says to use your inbox. :-)



It does not.



It certainly suggests it by example.


That suggests by example. It doesn't mean you cannot learn any other 
mailbox folder as either ham or spam.


I'm not suggesting that you can't learn other folders, but rather, that 
the example explicitly suggests using the inbox, along with a couple 
other folders for both ham and spam training.




Again, if you are like most people, there is spam in your inbox. Reality, you 
know?



I've been pondering this, what happens when you learn a message as
non-spam and then ham a few minutes later? As I understand it you do not
need to explicitly --forget, SpamAssassin is smart enough to handle this
situation, no? And if so, learning your Inbox should be fine as long as
you move messages to Spam (and don't just delete) when appropriate.


non-spam and ham are the exact same. Therefore, it would not make any 
difference. According to documentation, the --ham switch means to learn 
as ham (non-spam). Therefore, the same thing.


Editing error.

What is the result when you train inbound spam as ham first, then as 
spam? As I understand it, forgetting is not required, SpamAssassin will 
handle this automatically. So as long as users move spam into the spam 
training folder (not deleting spam directly) it should be safe to train 
the inbox as ham.
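
To make the question concrete, here is a toy model of the behaviour I'm assuming: relearning a message under the opposite label first undoes the earlier learning. This is only an illustration with made-up names and whitespace tokenizing, not SpamAssassin's actual Bayes code:

```python
from collections import Counter


class ToyBayesStore:
    """Toy token store that remembers how each message was learned."""

    def __init__(self):
        self.tokens = {"ham": Counter(), "spam": Counter()}
        self.learned = {}  # message id -> (label, tokens)

    def learn(self, msg_id, text, label):
        previous = self.learned.get(msg_id)
        if previous is not None:
            old_label, old_tokens = previous
            if old_label == label:
                return  # already learned this way, nothing to do
            # Implicit "forget": Counter subtraction drops tokens that hit zero.
            self.tokens[old_label] = self.tokens[old_label] - old_tokens
        new_tokens = Counter(text.lower().split())
        self.tokens[label].update(new_tokens)
        self.learned[msg_id] = (label, new_tokens)


store = ToyBayesStore()
store.learn("m1", "cheap pills now", "ham")   # inbox trained as ham
store.learn("m1", "cheap pills now", "spam")  # user later moves it to Spam
print(store.tokens["ham"], store.tokens["spam"])
```

If the real implementation behaves like this, the only lasting damage from training the inbox is spam that users delete instead of moving.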




Re: more spam is getting through :-(

2019-03-18 Thread Dave Warren

On 2019-03-18 17:40, @lbutlr wrote:




On 18 Mar 2019, at 13:59, James  wrote:

On 2019-03-17 5:43 p.m., @lbutlr wrote:

On 17 Mar 2019, at 15:03, James  wrote:

I run sa-learn --ham on my inboxes.

You inboxes likely contain spam messages that haven't been caught, so training 
on inbox will poison your bayes in favor of more spam. Unless your inbox is 
perfect (entirely devoid of spam and containing only desired messages) you 
should not do this.


The documentation says to use your inbox. :-)


It does not.


It certainly suggests it by example.




http://svn.apache.org/repos/asf/spamassassin/branches/3.4/README

Learning

Apache SpamAssassin includes a Bayesian learning filter, so it is worthwhile
training Apache SpamAssassin with your collection of non-spam and spam,


It says to train it with spam and not spam.


if possible.  This will make it more accurate for your incoming mail.
Do this using the "sa-learn" tools, like so:
sa-learn --spam ~/Mail/saved-spam-folder
sa-learn --ham ~/Mail/inbox
sa-learn --ham ~/Mail/other-nonspam-folder


Again, if you are like most people, there is spam in your inbox. Reality, you 
know?


I've been pondering this, what happens when you learn a message as 
non-spam and then ham a few minutes later? As I understand it you do not 
need to explicitly --forget, SpamAssassin is smart enough to handle this 
situation, no? And if so, learning your Inbox should be fine as long as 
you move messages to Spam (and don't just delete) when appropriate.





Re: more spam is getting through :-(

2019-03-18 Thread Dave Warren
On Sun, Mar 17, 2019, at 22:45, John Hardin wrote:
> On Sun, 17 Mar 2019, James wrote:
> > $ sudo sa-learn --dump magic
> > 0.000  04665448  0  non-token data: nspam
> > 0.000  0   51031938  0  non-token data: nham
> 
> I'd generally expect those numbers to be somewhat reversed as most people 
> get more spam than ham...


That really depends. If you don't use auto-learn, reject most inbound spam 
before it hits the mailbox, and train all messages received into the mailbox, 
then I'd expect to see a lot more ham than spam.

This is roughly my configuration, I train ham from my "Archives" folder, and 
Spam from a Spam-Confirmed folder. I periodically review the Spam folder and 
sort messages out into either Inbox/Archives or Spam-Confirmed.

I've had reasonably good success with this method.



Re: Is it weird to worry I'm getting too little spam? (success of RBLs)

2019-01-26 Thread Dave Warren
In my experience, the right combination of DNSBLs is extremely
effective: typically well over 90% of delivery attempts can be
rejected before the DATA command (and therefore before SpamAssassin)
with a combination of DNSBLs, RFC validations (greet pause of 11
seconds, early talkers rejected), rDNS validation, and EHLO validation
(rejecting localhost, your own hostname and domain names, etc).
I tend to use a hair-trigger on each of these and trigger greylisting,
which gives fast-acting DNSBLs another 30 minutes to detect and
list new spammers.
But ultimately DNSBLs alone are very effective and a significant part
of pre-DATA filtering.
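
As one small example of the EHLO validation mentioned above, a sketch in Python. The function name and the sample hostnames are hypothetical; a real MTA would do this in its own policy configuration:

```python
def ehlo_is_suspicious(ehlo_name, own_names):
    """Return True if a remote client greets us as localhost or as ourselves.

    ehlo_name: the hostname the remote client sent in EHLO/HELO.
    own_names: our own hostnames and domain names; no outside client
    has a legitimate reason to present any of them.
    """
    name = ehlo_name.strip().lower().rstrip(".")
    if name in ("localhost", "localhost.localdomain"):
        return True
    owned = {n.lower().rstrip(".") for n in own_names}
    return name in owned


mine = {"mail.example.com", "example.com"}
print(ehlo_is_suspicious("localhost", mine))
print(ehlo_is_suspicious("mx.sender.test", mine))
```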


On Sat, Jan 26, 2019, at 14:02, Ian Evans wrote:
> Background: I run a small postfix/dovecot server on my site server.
> Just a handful of careful users. My spam folder would only have about
> 10-30 messages a day marked as spam by spamassassin. Server's running
> denyhosts to help block bad actors.
>
> Recently checked my logs and noticed that the rbl checks in postfix or
> SA were sometimes getting blocked. So I finally installed a caching
> DNS server.
>
> Suddenly the spam that gets to my spam folder is down to five or so a
> day. Seems postfix is dropping a lot of connections due to RBL checks
> before they even get to SA.
>
> Are the RBLs that good? Is it crazy to worry that not enough spam is
> getting to my spam folder? :-)


Re: FPs on FORGED_MUA_MOZILLA (for my own hand-typed messages from my latest-version Thunderbird)

2018-10-02 Thread Dave Warren

> On Oct 2, 2018, at 13:49, Bill Cole  
> wrote:
> 
> On 2 Oct 2018, at 13:39, Matus UHLAR - fantomas wrote:
> 
>>> On 2 Oct 2018, at 9:36, Rob McEwen wrote:
> >>>> SIDE NOTE: I don't think there was any domain my message that was 
> >>>> blacklisted on URIBL - so I can't explain the "URIBL_BLOCKED", but that 
> >>>> only scored 0.001, so that was innocuous. I suspect that that rule is 
> >>>> malfunctioning on their end, and then they changed the score to .001 - so 
> >>>> just please ignore that for the purpose of this discussion.
>> 
>> On 02.10.18 11:48, Bill Cole wrote:
>>> No, "URIBL_BLOCKED" means that the URIBL DNS returned a value that is 
>>> supposed to be a message to a mail admin that they are using URIBL wrong
>> 
>>> A mail filtering system that gets URIBL_BLOCKED hits is broken. A mail 
>>> filtering system that gets them chronically is mismanaged.
>> 
>> Nonsense. There is no such implication here. While URIBL_BLOCKED may and
>> most of the time apparently does mean that system uses DNS server shared
>> with too many clients, any system that receives and checks too much mail may
>> get URIBL_BLOCKED just because they have crossed the limit, withous using it
>> wrong or being broken.
> 
> Operating a system in a manner which chronically crosses that limit is 
> abusive.
> 
> The DNS reply that results in URIBL_BLOCKED is not "free" for the URIBL 
> operators and depending on their software may be as expensive as sending a 
> real reply. It has the advantage over simply dropping abusive queries that it 
> does not impose timeout delays on abusive queriers and sends a clear signal 
> that can and should be acted upon.


The DNSBL operator can also choose to use a frontend firewall/router/etc system 
to redirect the queries to a dedicated server which can reduce the packet per 
second rate that the authoritative DNS servers need to cope with.

Abusive queries can almost definitely be handled much faster by a 
small/dedicated server that does nothing but return one single wild carded 
response, reducing the impact that abusive users can have on the primary 
infrastructure.




Re: IADB whitelist - again

2018-03-05 Thread Dave Warren

On 2018-03-04 05:46, David Jones wrote:
That's great.  It means you know what you are doing when you change the 
default threshold to less than 5.0.  In that case you need to change a 
lot of other scores down too including RCVD_IN_IADB_* and the KAM.cf 
rules probably score way too high for you as well.


Maybe this is just me, but I'm a firm believer that if you change the 
thresholds, you don't get to complain about the scores of any rules. The 
rules are all balanced against the target threshold of 5.0, and if you 
set your threshold differently it is quite likely that some rules will 
be scored too high or too low for your needs.




Re: From:name spoofing

2018-02-17 Thread Dave Warren

On 2018-02-17 01:11, Daniele Duca wrote:

On 17/02/2018 00:41, John Hardin wrote:



Not necessarily safe. If your MTA receives a message without a 
Message-ID, it is supposed to generate one. And if it does so, it will 
probably do so using your (recipient) domain...


Isn't MID creation responsability of the MUA and not the MTA? If every 
MTA would generate a MID when not found in inbound emails rules like 
SA's MISSING_MID would be useless.


MID creation should be done by the MUA, and if missing, should be added 
by the MSA. Think of it as a belt-and-suspenders approach. This is also 
why such rules are useful: spambots are often garbage and skip important 
steps that any properly designed software would do.


(Lowercase should, read the RFCs if you want literal SHOULD/etc from the 
specs).


A receiving MTA shouldn't add a Message-ID, but it does happen, 
particularly in infrastructures that need a Message-ID internally.


Also keep forwarding in mind, I might choose to accept an inbound 
message without a Message-ID but I won't forward it on without adding a 
Message-ID, so in this case the final receiving MTA will see a 
Message-ID that is unrelated to the original message in any way.
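
A minimal sketch of that forwarding step, using Python's standard email package. The function name and the forwarder.example domain are hypothetical; a real forwarder would do this inside the MTA:

```python
from email import message_from_string
from email.utils import make_msgid


def ensure_message_id(raw_message, domain):
    """Add a locally generated Message-ID only if the message has none."""
    msg = message_from_string(raw_message)
    if not msg.get("Message-ID", "").strip():
        # Deleting a missing header is a no-op; this also clears an empty one.
        del msg["Message-ID"]
        msg["Message-ID"] = make_msgid(domain=domain)
    return msg


forwarded = ensure_message_id("From: a@example.org\n\nhi", "forwarder.example")
print(forwarded["Message-ID"])
```

Note the generated ID carries the forwarder's domain, which is exactly why the final recipient can't assume it says anything about the original sender.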


In an ideal world, it's just a random string (with a bit of formatting 
requirements), but in reality it obviously has some value as different 
senders (and types of senders) will leave a fingerprint behind which may 
be useful for categorization.


Re: Email filtering theory and the definition of spam

2018-02-07 Thread Dave Warren
On Wed, Feb 7, 2018, at 15:52, Martin Gregorie wrote:
> > Technically, you asked for the email and they have a valid opt-out 
> > process that will stop sending you email.  Yes, the site has scummy 
> > practices but that is not spam by my definition.
> > 
> Yes, under EU/UK that counts as spam because the regulations say that
> the signer-upper must explicitly choose to receive e-mail from the
> site, and by-default sign-in doesn't count as 'informed sign-in'.

Canadian law is the same, this is absolutely spam without any ambiguity.


Re: Barracuda Reputation Block List (BRBL) removal from the SA ruleset

2018-02-06 Thread Dave Warren

On 2018-02-05 09:12, Benny Pedersen wrote:

Kevin A. McGrail skrev den 2018-02-05 16:53:


I don't think that will apply will it because it will be looking up
something like 1.2.3.4.bb.barracuda.blah which isn't cached.


the first qurry can make a qurry with very low ttl, so it would not be 
cached, that means number 2 query still mkae dns query to that zone :(


How low are the TTLs? I'm seeing 300 seconds on 127.0.0.2 which is more 
than sufficient time for a single message to finish processing, such 
that multiple queries from one message would absolutely be cached (or 
more likely, the first would still be pending and the second would get 
the same answer as the first).


;; ANSWER SECTION:
2.0.0.127.bb.barracudacentral.org. 300 IN A 127.0.0.2

Maybe the TTLs are different for other records?

I am also noticing very intermittent response times, sometimes taking 
over a second to get a response, other times taking under 50ms.
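
For reference, the name in the answer section above is the usual DNSBL form: the IPv4 octets reversed, followed by the list's zone (127.0.0.2 is the conventional test entry). A small sketch, with a made-up helper name:

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    # IPv4Address() validates the input and raises ValueError on garbage.
    address = ipaddress.IPv4Address(ip)
    octets = str(address).split(".")
    return ".".join(reversed(octets)) + "." + zone


print(dnsbl_query_name("127.0.0.2", "bb.barracudacentral.org"))
# 2.0.0.127.bb.barracudacentral.org
```

Since every sending IP maps to its own name, only repeat lookups of the same IP within the TTL can be answered from cache.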




Re: Using Cloud AutoML as an AI for an Anti-spam filter ?

2018-01-23 Thread Dave Warren
On Tue, Jan 23, 2018, at 02:55, Zulma Pape wrote:
> In other words, can we integrate the Cloud AutoML into our server's
> spam filter and make it behave the same way Gmail behave ?
In short, not without a *lot* of work.

Gmail implements a lot more complexity, and they have a lot more data
than you. One example is that they track user interaction with email,
things like what messages does a user delete without reading, what
messages are opened and for how long, are links clicked, replies
generated, etc.
They also have a very wide view of all the email around the world, and
therefore are very likely to spot new botnets, changes in spammer
techniques, and also changes in legitimate mail far faster than almost
anyone else.
Bayesian is good, per-user bayesian is better, but Gmail can build
bayesian databases without the user's help simply based on their
activity combined with generalized multiple user filters. They can also
use this type of learning to split out mailing lists, receipts,
advertising, scams and others in a general sense, and then apply some
logic to determine if this particular user is likely receptive to the
classifications of messages.
You could reproduce all of this to the best of your data, but you also
need a relatively massive dataset and ability to collect a lot of
details about your user activity to really make it work.
On the other hand, you can make unilateral decisions under the "my
server, my rules" policy to customize and tweak your own filters in a
way that Google cannot.



Re: NOTE: Warning to Abusers of Update Servers

2017-11-24 Thread Dave Warren
Alright, it might be live at http://sa-update.razx.cloud/

Currently I don't do any logging of mirror traffic, although this may
change in the near future.


On Fri, Nov 24, 2017, at 05:02, Kevin A. McGrail wrote:
> I really don't pay too much attention to bandwidth and you will want
> to use http.  We typically set new mirrors at the weight of 1 and then
> you can let us know if we can bump it up.
>
> Regards,
> KAM
>
> On November 23, 2017 10:08:06 PM EST, Dave Warren
> <d...@thedave.ca> wrote:
>> On Thu, Nov 23, 2017, at 16:01, Kevin A. McGrail wrote:
>>> On 11/23/2017 6:31 PM, Dave Warren wrote:
>>>> Would more mirrors be useful? I've got a ton of spare upstream
>>>> bandwidth and am in the process of setting up a few mirrors for
>>>> other projects.
>>>
>>> Sure.  Always helps to spread the load more.
>>>
>>> All you have to do is setup sa-update.XYZ.tld and add an rsync
>>> command every 10 minutes.  Then we add you to the mirrors list
>>> weighted by how much traffic you can bear.
>>
>> Any idea what sort of traffic I should expect?
>>
>> Also, is it better to serve traffic on HTTP, HTTPS, or both? And if
>> both, should HTTP be redirected to HTTPS?
>>
>> My standard configuration does redirect, but this can be disabled if
>> appropriate.



Re: NOTE: Warning to Abusers of Update Servers

2017-11-24 Thread Dave Warren
On Fri, Nov 24, 2017, at 09:45, RW wrote:
> On Fri, 24 Nov 2017 08:23:21 -0700
> Dave wrote:
> > >> It mostly shouldn't, but when I was supporting a mail server that 
> > >> included a SpamAssassin integration, we ran into a non-zero number
> > >> of installations where DNS checks failed and they fell back on
> > >> direct connections.  
> > > 
> > > I don't follow that. sa-update needs the result of the dns lookup to
> > > construct the download URL.   
> > 
> > My recollection is that something was eating the TXT results; but not
> > the A records. 
>  
> 
> But it's the version number in the TXT result that determines the
> names of the files to download. If it's not available the channel is
> supposed to be skipped. 

I believe there is/was an HTTP-based fallback. I could be remembering
wrong; it's been a few years since I was professionally involved.
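
For readers following this exchange, here is a minimal Python sketch of the check RW describes: the TXT lookup yields the current ruleset revision, and a client that already holds that revision has nothing to download. The reversed-version query name reflects my understanding of how sa-update builds the lookup and should be treated as an assumption; the function names and revision numbers are hypothetical, and no real DNS query is made. Whether any HTTP fallback exists is not settled in this thread, so the sketch simply skips the channel when the lookup fails.

```python
def txt_query_name(sa_version, channel="updates.spamassassin.org"):
    """Build the DNS name whose TXT record carries the ruleset revision.

    Assumption: the version components are reversed and prepended to the
    channel name, e.g. 3.4.1 -> 1.4.3.updates.spamassassin.org.
    """
    reversed_version = ".".join(reversed(sa_version.split(".")))
    return f"{reversed_version}.{channel}"


def should_download(local_revision, dns_revision):
    """Decide whether a download is needed, given the TXT lookup result.

    dns_revision is None when the TXT lookup failed; in that case the
    channel is skipped rather than fetched blindly.
    """
    if dns_revision is None:
        return False
    return local_revision != dns_revision


print(txt_query_name("3.4.1"))
print(should_download("1815298", "1815298"))
```

Run hourly, a client following this logic costs the mirrors one DNS query per run and an HTTP download only when the revision changes.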


Re: NOTE: Warning to Abusers of Update Servers

2017-11-23 Thread Dave Warren
On Thu, Nov 23, 2017, at 16:01, Kevin A. McGrail wrote:
> On 11/23/2017 6:31 PM, Dave Warren wrote:
> > Would more mirrors be useful? I've got a ton of spare upstream
> > bandwidth and am in the process of setting up a few mirrors for other
> > projects.
> >
> Sure.  Always helps to spread the load more.
> 
> All you have to do is setup sa-update.XYZ.tld and add an rsync command 
> every 10 minutes.  Then we add you to the mirrors list weighted by how 
> much traffic you can bear.

Any idea what sort of traffic I should expect? 

Also, is it better to serve traffic on HTTP, HTTPS, or both? And if
both, should HTTP be redirected to HTTPS? 

My standard configuration does redirect, but this can be disabled if
appropriate.
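
The weighting KAM mentions above can be pictured as a weighted random choice over the mirror list. The sketch below is only an illustration: the `<url> weight=<n>` line format and the example hostnames are assumptions, not a statement of how the project's mirror file is actually laid out.

```python
import random


def parse_mirrors(text):
    """Parse lines of the form '<url> weight=<n>' into (url, weight) pairs.

    Lines without an explicit weight default to 1; blank lines and
    '#' comments are ignored.
    """
    mirrors = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        url, *options = line.split()
        weight = 1
        for option in options:
            key, _, value = option.partition("=")
            if key == "weight":
                weight = int(value)
        mirrors.append((url, weight))
    return mirrors


def pick_mirror(mirrors, rng=random):
    """Choose one mirror at random, proportionally to its weight."""
    urls = [url for url, _ in mirrors]
    weights = [weight for _, weight in mirrors]
    return rng.choices(urls, weights=weights, k=1)[0]
```

With one mirror at weight 1 and another at weight 5, the second would be expected to receive roughly five times the requests, which is why a new mirror starting at weight 1 sees only a small share of the traffic.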


Re: NOTE: Warning to Abusers of Update Servers

2017-11-23 Thread Dave Warren

On 2017-11-21 11:57, RW wrote:
> On Tue, 21 Nov 2017 08:55:34 -0600
> David Jones wrote:
>
>> You are correct.  I haven't dug into the code to verify but it
>> appears that 3.4.x sa-update does use the DNS TXT record to know when
>> to download so it doesn't hurt anything to run this version hourly.
>
> By the sound of it this warning doesn't apply at all to anyone with a
> normal up to date installation.



It mostly shouldn't, but when I was supporting a mail server that 
included a SpamAssassin integration, we ran into a non-zero number of 
installations where DNS checks failed and they fell back on direct 
connections.


The default update frequency was sane, but nothing stopped a user from 
implementing their own schedules.




Re: NOTE: Warning to Abusers of Update Servers

2017-11-23 Thread Dave Warren
Would more mirrors be useful? I've got a ton of spare upstream bandwidth
and am in the process of setting up a few mirrors for other projects.



On 2017-11-21 10:47, Kevin A. McGrail wrote:
> My goal is to stop abuse without causing undue grief or fps. It may come
> to more draconian steps as you suggest.
>
> Regards,
> KAM
>
> On November 21, 2017 10:13:38 AM EST, AJ Weber wrote:
>
>>> The major offenders are sa-update 3.3.x and generic curl clients
>>> based on the user agent in the logs running from every minute to
>>> every 15 minutes and blindly pulling down the same rulesets over
>>> and over.
>>
>> My "vote" counts for very, very little, but since these clients already
>> have the latest rules (multiple times, apparently), why not just block them?
>>
>> If they are actually monitoring their update scripts at all (seems
>> doubtful), it should get their heads out of the sand (was going to use a
>> similar metaphor but wanted to be nice).  They will probably look for a
>> resolution and find these latest posts.
>>
>> If they're not monitoring their updates on any regular basis, it doesn't
>> seem like they care if they get them anyway.





Re: Blocking senders that are whitelisted

2017-10-04 Thread Dave Warren

On 2017-10-04 10:26, Ian Zimmerman wrote:
> On 2017-10-04 10:52, David Jones wrote:
>
>> I bet this user signed up for this email somehow, possibly a while ago and has
>> forgotten about doing so.  So many times, when you register for accounts on
>> websites, the check box to opt-in to a mailing list is already checked and most
>> users don't take the time to read the page and uncheck the box before clicking
>
> Then it's not really opt-in except to a lawyer.
>
> Sorry, I know this is beating a dead horse.



In Canada, and many parts of Europe, pre-filled checkboxes no longer 
qualify as consent either unless the only purpose of the form is to 
subscribe to a mailing list.


Transactions still result in implied consent, but this is limited and 
somewhat risky to rely upon.


Your legal jurisdiction may vary, and if you rely on random mailing list 
participants for legal advice, well, you'll get what you paid for.


Nonetheless, MailChimp will honour unsubscribes and their abuse 
department is at least somewhat responsive.






Re: Direct download link detection

2017-07-24 Thread Dave Warren
On Mon, Jul 24, 2017, at 15:00, Alex wrote:
> Hi,
> 
> We're currently experiencing a new spam campaign that involves some
> text pertaining to invoicing then a link that immediately downloads a
> Word macro file.
> 
> http://sdeflores.com/PHJC579907/
> 
> What would be involved in following these links in SA to determine if
> they immediately download a file (other than a web page)? Would that
> even be a reliable indicator?

You want to be very careful with your implementation. Many "Verify this
account" or Subscribe/Unsubscribe links act on a single GET rather than
following standards and using a POST, so this type of activity can
trigger an action.

It is possible that a HEAD would be safer and still provide the needed
information. I'm not clear whether it would trigger any actions, but
given how terrible many implementations are, it wouldn't shock me if
there are a few home-brewed beasts that take action on a HEAD request.
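
As a rough illustration of the HEAD idea, the Python sketch below sends a single HEAD request without following redirects and then guesses from the response headers whether the link serves a file rather than a page. The function names and the list of "page-like" content types are my own assumptions, and the caveat above still applies: a badly written endpoint could act on the HEAD itself, and servers are free to lie in their headers.

```python
import http.client
from urllib.parse import urlsplit

# Content types treated as ordinary pages (an illustrative choice).
PAGE_TYPES = ("", "text/html", "application/xhtml+xml", "text/plain")


def looks_like_download(headers):
    """Guess from response headers whether a URL serves a file, not a page."""
    normalized = {key.lower(): value for key, value in headers.items()}
    disposition = normalized.get("content-disposition", "").lower()
    content_type = normalized.get("content-type", "").split(";", 1)[0].strip().lower()
    if disposition.startswith("attachment"):
        return True
    return content_type not in PAGE_TYPES


def head_headers(url, timeout=10):
    """Send one HEAD request (redirects are not followed); return (status, headers)."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        connection = http.client.HTTPSConnection(parts.hostname, parts.port, timeout=timeout)
    else:
        connection = http.client.HTTPConnection(parts.hostname, parts.port, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        connection.request("HEAD", path)
        response = connection.getresponse()
        return response.status, dict(response.getheaders())
    finally:
        connection.close()
```

Not following redirects is deliberate here: it keeps the probe to exactly one request with one known method.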




Re: updates.spamassassin.org gone?

2017-07-06 Thread Dave Warren
Did you read any of the thread? 

There shouldn't be an A record, and there literally can't be a (valid)
PTR record.

On Thu, Jul 6, 2017, at 15:48, jdow wrote:
> No A or PTR record:
> 
> ===8<---
> [jdow@thursday ~]$ dig updates.spamassassin.org ns1.apache.org all
> 
> ; <<>> DiG 9.9.4-RedHat-9.9.4-50.el7_3.1 <<>> updates.spamassassin.org 
> ns1.apache.org all
> ;; global options: +cmd
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7892
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
> 
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 4096
> ;; QUESTION SECTION:
> ;updates.spamassassin.org.  IN  A
> 
> ;; AUTHORITY SECTION:
> spamassassin.org.   3415    IN  SOA ns2.pccc.com. 
> pmc.spamassassin.apache.org. 2017062901 7200 3600 604800 3600
> 
> ;; Query time: 0 msec
> ;; SERVER: 127.0.0.1#53(127.0.0.1)
> ;; WHEN: Thu Jul 06 15:28:44 PDT 2017
> ;; MSG SIZE  rcvd: 125
> 
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 39442
> ;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
> 
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 4096
> ;; QUESTION SECTION:
> ;ns1.apache.org.    IN  A
> 
> ;; AUTHORITY SECTION:
> apache.org. 1800    IN  SOA ns2.surfnet.nl. 
> hostmaster-2005-alpha.apache.org. 2017070600 3600 900 604800 3600
> 
> ;; Query time: 321 msec
> ;; SERVER: 127.0.0.1#53(127.0.0.1)
> ;; WHEN: Thu Jul 06 15:28:45 PDT 2017
> ;; MSG SIZE  rcvd: 115
> 
> ;; Got answer:
> ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 56671
> ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
> 
> ;; OPT PSEUDOSECTION:
> ; EDNS: version: 0, flags:; udp: 4096
> ;; QUESTION SECTION:
> ;all.   IN  A
> 
> ;; AUTHORITY SECTION:
> .   10740   IN  SOA a.root-servers.net. 
> nstld.verisign-grs.com. 2017070601 1800 900 604800 86400
> 
> ;; Query time: 0 msec
> ;; SERVER: 127.0.0.1#53(127.0.0.1)
> ;; WHEN: Thu Jul 06 15:28:45 PDT 2017
> ;; MSG SIZE  rcvd: 107
> ===8<---
> 
> It looks like somebody fat fingered when updating the Apache.org NS
> records. (Or perhaps it was ns.pccc.com that was fscked up perhaps on
> June 29th.)
> 
> {^_^}   Joanne
> 
> On 2017-07-06 11:39, RW wrote:
> > On Thu, 6 Jul 2017 12:14:02 -0500
> > David Jones wrote:
> > 
> >> On 07/06/2017 12:02 PM, Rainer Sokoll wrote:
> >>>
> >>>> On 06.07.2017 at 18:55, David Jones wrote:
> >>>>
> >>>> You can also run 'sa-update -vvv' to see more information on what
> >>>> it's looking for.
> >>>
> >>> Now I'm really confused: it works the way described in this thread:
> >>>
> > 
> >>>
> >>
> >> That's odd.  Nothing has changed in DNS for almost 2 weeks so that
> >> shouldn't have happened.
> > 
> > Both DNS txt lookups worked, all the HTTP downloads failed. If the
> > script has worked in the past, and nothing has changed, it was most
> > likely just a temporary networking problem.
> > 
> 



Re: The nice thing about standards (was Re: Legit Yahoo mail servers list)

2017-01-31 Thread Dave Warren

On 2017-01-30 08:06, Dianne Skoll wrote:

> On Mon, 30 Jan 2017 09:06:34 -0500
> Rob McEwen wrote:
>
>> On 1/30/2017 8:54 AM, Matus UHLAR - fantomas wrote:
>>> they do and it has been mentioned:
>>> https://help.yahoo.com/kb/SLN23997.html
>
> Cool.  So Yahoo uses an HTML page that's a pain to process by
> computer.


They publish SPF records and DKIM sign everything for competent SMTP
receivers to handle in real-time, AND they publish an HTML version for
humans, and yet someone still finds a reason to complain?


Maybe it's just me, but hand-maintaining a list of IPs to whitelist is 
so 1997s. The real value of SPF and DKIM is that you don't do any of 
that, you can whitelist by domain and let the sending domain tell you, 
in real time, whether or not the inbound message should be trusted.


Or, if you insist on doing things manually, glance at the HTML source 
and spend a good strong 3 minutes with your favourite regex parser and 
you're good to go.


 
has both the answer and shows my work.


But remember, this list is only valid until it isn't. Even big providers
move things around, sometimes frequently, so expect to update the list
frequently (or again, don't; just use the tools that exist to do it in
real time and go watch a movie instead).
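
For anyone who does want the manual route, the "three minutes with a regex" step might look like the Python sketch below. It is not the work referred to above; the pattern, the function name, and the sample HTML (which uses documentation address ranges, not Yahoo's) are all illustrative assumptions.

```python
import ipaddress
import re

# Dotted quads with an optional /prefix; validity is checked afterwards.
CIDR_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}(?:/\d{1,2})?\b")


def extract_networks(html):
    """Return the unique, valid IPv4 addresses/networks found in a page, in order."""
    networks = []
    for candidate in CIDR_PATTERN.findall(html):
        try:
            network = ipaddress.ip_network(candidate, strict=False)
        except ValueError:
            continue  # e.g. 999.1.1.1 or an impossible prefix length
        if network not in networks:
            networks.append(network)
    return networks


sample = "<li>192.0.2.0/24</li><li>198.51.100.7</li><p>build 999.1.1.1</p>"
print([str(network) for network in extract_networks(sample)])
```

The validation step matters because a bare regex will happily match version strings and other dotted numbers that are not addresses.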





Re: No rule updates since 1/1/17

2017-01-17 Thread Dave Warren
On Tue, Jan 17, 2017, at 12:51, Axb wrote:
> On 01/17/2017 09:14 PM, Dave Warren wrote:
> > On Sun, Jan 15, 2017, at 20:02, Kevin A. McGrail wrote:
> >> On 1/15/2017 9:21 PM, Chris wrote:
> >>> The last update of rules I've seen is 1/1/17. The attached cron output
> >>> seems to show no problems though. Doesn't seem right no updates for two
> >>> weeks but I guess it's possible.
> >>
> >> It's been noted and I think I have the root issue tracked down. Some of
> >> the checkers are running the wrong SVN checkout and I don't know why so
> >> they are skipped.  Then we miss the minimum number of masscheckers to
> >> publish.
> >
> > Have you reached out to any that weren't reporting correctly? Mine went
> > offline in November and just came back up 1-3 days ago, could you take a
> > quick look at "dwarren" to see if everything is okay with my
> > submissions?
> >
> 
> Dave,
> 
> If you look into
> http://ruleqa.spamassassin.org/
>   and unfold [+] the "green" (latest) you should find your "dwarren".
> If it's not there, could be your submission came in too late.

I'm in the list on the 15th and 16th, but not the 17th. Not sure what to
make of that; from what little I can tell, it completed around the same
time, but I don't have time to dig into it further right now. I was more
worried about whether I fell into the "wrong SVN checkout" group and was
ignored for that reason.



Re: No rule updates since 1/1/17

2017-01-17 Thread Dave Warren
On Sun, Jan 15, 2017, at 20:02, Kevin A. McGrail wrote:
> On 1/15/2017 9:21 PM, Chris wrote:
> > The last update of rules I've seen is 1/1/17. The attached cron output
> > seems to show no problems though. Doesn't seem right no updates for two
> > weeks but I guess it's possible.
> 
> It's been noted and I think I have the root issue tracked down. Some of
> the checkers are running the wrong SVN checkout and I don't know why so
> they are skipped.  Then we miss the minimum number of masscheckers to
> publish.

Have you reached out to any that weren't reporting correctly? Mine went
offline in November and just came back up 1-3 days ago, could you take a
quick look at "dwarren" to see if everything is okay with my
submissions?






Re: I have some bad news

2016-09-05 Thread Dave Warren
On Sun, Sep 4, 2016, at 18:11, @lbutlr wrote:
> On Sep 1, 2016, at 7:41 PM, David Niklas  wrote:
>>
>> Would you like to go out to lunch?
>
> Other than your message, that phrase does not appear in 7 years of
> my mail.

And? Replace the string with an example that does appear frequently in
ham. Or, a dozen examples that do, structured into a plausible
paragraph.


Re: DKIM domainkeys=fail (1024-bit key) reason="fail (message has been altered)"

2016-08-26 Thread Dave Warren
On Fri, Aug 26, 2016, at 08:39, Bowie Bailey wrote:
> On 8/26/2016 11:34 AM, widowsoft wrote:
> > I am sure this has been done to death but I would like to ban emails that
> > show
> > "domainkeys=fail (1024-bit key) reason="fail (message has been altered)""
> >
> > any ideas please I have tried regex but i admit i am a novice
> > i added
> > header DKIM_FAIL  ALL =~ /domainkeys=fail (1024-bit key)/i
> > score DKIM_FAIL 4
> >
> > can a guru help please
> 
> Might be an escaping issue.  Parentheses are special in Perl regexes.
> 
> Try this:
> header DKIM_FAIL  ALL =~ /domainkeys=fail \(1024-bit key\)/i

Note that taking negative action on a failed signature alone is
explicitly prohibited by DKIM, and as a result there are valid,
standards-compliant configurations that will return this DKIM report.

(I'm not saying don't do it, it's your server, you make the rules, but
this isn't as foolproof as one might naively expect).
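
The escaping problem Bowie points out is easy to reproduce. The sketch below uses Python's `re` module rather than Perl, purely for illustration; for this particular pattern the two regex dialects treat parentheses the same way, as a grouping construct unless escaped.

```python
import re

header = 'domainkeys=fail (1024-bit key) reason="fail (message has been altered)"'

# Unescaped: the parentheses form a group, so the pattern looks for
# "domainkeys=fail 1024-bit key" with no literal parentheses.
unescaped = re.compile(r"domainkeys=fail (1024-bit key)", re.IGNORECASE)

# Escaped: the parentheses are matched literally.
escaped = re.compile(r"domainkeys=fail \(1024-bit key\)", re.IGNORECASE)

print(unescaped.search(header))            # None
print(escaped.search(header) is not None)  # True
```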



Re: false possitive

2016-07-31 Thread Dave Warren

On 2016-07-31 13:27, Benny Pedersen wrote:

# rule:[h.reindl maillist]
if allof (header :contains "from" "h.rei...@thelounge.net", header 
:contains "to" "users@spamassassin.apache.org")

{
setflag "\\Seen";
stop;
} 


That seems poorly written as it relies upon the To field while the list 
may also receive CC/BCC'd messages, and doesn't handle all possible 
instances in the From field. Also, I'd remove "stop" as I still want 
this delivered into the spamassassin folder by a later rule. This seems 
to work better:


if allof (anyof (header :contains "from" "h.rei...@thelounge.net",
                 header :contains "from" "m...@junc.eu"),
          header :is "List-Id" "")
{
    setflag "\\Seen";
}

--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Abused accounts

2016-03-19 Thread Dave Warren

On 2016-03-15 14:15, Ted Mittelstaedt wrote:

> I agree with you on that one.  There's a big push among colleges to push
> students to use their "blessed" mailsystems.  They don't want students
> emailing instructors from the student's gmail account, they want the
> students emailing the instructors from the college-provided gmail account.


I would too. If nothing else, this prepares students for the real world, 
where you can't just use your own random @gmail.com account for business 
purposes either.


I've walked away from a university study after getting an email from and 
CC'd to some random @gmail.com/@hotmail.com addresses requesting further 
medical information to confirm placement in the study. I filed an 
official ethics complaint as the preliminary medical information I 
submitted was supposed to be held safely and all data is supposed to be 
protected, revealed only to me, my doctor, and otherwise anonymized 
before any dissemination, yet was CC'd to multiple providers and now is 
subject to their marketing department's whims, within the range of third 
party privacy policies in other countries.


Information security is hard, mostly because of users, and it takes 
practice. Accepting and trusting inbound email from random addresses is 
what brings us to 
https://krebsonsecurity.com/2016/03/thieves-phish-moneytree-employee-tax-data/


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




3.5 XPRIO Has X-Priority header

2016-03-10 Thread Dave Warren

Howdy!

We've had a rash of false positives in the last couple of weeks, and
one particular hit is almost exclusively what tips the scales:


3.5 XPRIO Has X-Priority header

This seems to be scored fairly high for what it is as some mobile 
devices are inserting this header on all of their messages, and 3.5 is a 
good chunk of the way to a hit.


Anyone else seeing issues, or should I just re-score it locally and call
it a day?


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Can your bayes do this?

2016-01-24 Thread Dave Warren

On 2016-01-20 22:21, Marc Perkel wrote:
> Here is a list of 3494938 words and phrases used in the subject line
> of SPAM and never seen in the subject line of HAM
>
> http://www.junkemailfilter.com/data/subject-spam.txt


I thought I'd take you up on this using a combination of my corpus, and 
the other mail I have indexed and trivially searchable which is not 
necessarily corpus quality, but which I can review casually, so I looked 
through your list of "words and phrases... never seen in the subject 
line of HAM" that I thought I might find in my collection of ham and 
here we go:


"alert you have"
"almost done!"
"almost go"
"application declined"
"application support"
"at any time dave" <-- Found one in my own mailbox! Woot!
"audible app" <-- Audible themselves used this in 2014.
"audio with" <-- Are you kidding? A bunch of hits from my mailbox, I see 
a bunch from OpenBSD's mailing lists, ffmpeg.org, and other places.


My ham indexes are tokenized with punctuation stripped, so I found over
a hundred hits for "almost done" and reviewed them manually; I found at
least two "almost done!" in the first dozen and got bored. A ton of mail
is already excluded for various reasons. For results with a small number
of hits, I manually reviewed each to rate spamminess; for larger numbers
of hits I got bored once I found a few strong hits.


I'm looking for substring matches, not necessarily anchored to the start
or end of the subject, but a good chunk of these comprise the entire
subject line ("almost done!", "application support", "application
declined"), so even if you're not looking at substrings, it's still a
sloppy mess.


This is only on a few million messages that comprise a very narrow slice 
of the mail flow on the internet, and only from those customers where I 
can query their mail trivially.



> Hope you understand it now. Not Bayesian


Perhaps not, but it seems like it's a natural precursor to a bayesian 
implementation. As RW said further down the thread:



> the only difference between
>
>    "ambulatory care" -> only in ham
>    "aall cards"      -> only in spam
>
> and
>
>    "ambulatory care"  occurs 16 times in ham and 0 times in spam
>    "aall cards"       occurs  0 times in ham and 3 times in spam
>
> is that you have discarded the count information.


And count information is important in determining the likely 
trustworthiness of a result. What would your system do with a phrase 
that appears in thousands of ham messages, and 2 spam messages? Ignore 
it completely?
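
To make the point about counts concrete, here is a small Python sketch of one common way Bayesian-style filters fold counts into a per-token score: a raw spam ratio pulled toward a neutral prior when there are few observations (often called Robinson smoothing). This is an illustration under assumed corpus sizes and parameters, not a description of how SpamAssassin or Marc's system actually computes anything.

```python
def token_spam_probability(spam_count, ham_count, total_spam, total_ham,
                           strength=1.0, prior=0.5):
    """Estimate how spammy a token is, discounting tokens seen only rarely.

    spam_count/ham_count: messages of each class containing the token.
    total_spam/total_ham: corpus sizes (assumed values in the examples).
    strength/prior: how strongly, and toward what value, rare tokens are pulled.
    """
    spam_freq = spam_count / total_spam
    ham_freq = ham_count / total_ham
    if spam_freq + ham_freq == 0:
        return prior  # never seen: no evidence either way
    raw = spam_freq / (spam_freq + ham_freq)
    observations = spam_count + ham_count
    return (strength * prior + observations * raw) / (strength + observations)


# "aall cards": 3 spam, 0 ham. Spammy, but only three observations.
print(token_spam_probability(3, 0, 1000, 1000))       # 0.875
# A phrase in thousands of ham messages and 2 spam messages stays hammy.
print(token_spam_probability(2, 5000, 10000, 10000))
```

A pure "only in spam" list treats the three-hit phrase as decisive and has no sensible answer for the second case; keeping the counts lets both be scored on the same scale.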


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Google redirects

2015-12-20 Thread Dave Warren

On 2015-12-20 03:22, Reindl Harald wrote:
> but usually there are daily score-updates, which haven't happened for
> more than two weeks now!


It happens, life goes on.

--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Is it worth transferring bayes data between different sites?

2015-12-02 Thread Dave Warren

On 2015-12-02 09:14, Sebastian Arcus wrote:
> Perfect - that's exactly the sort of real-life based advice I was
> looking for. Many thanks!


I run a small shared hosting environment, with a global bayes for all 
users as not enough users are ready/willing/able to take the time to 
sort ham (although more will press "this is spam") and in general, the 
results work out well enough.


Sharing bayes between servers or sites would not seem to be particularly 
different than a shared bayes between multiple customers in a shared 
hosting, as long as the "typical end user" is similar. If you have a 
viagra dealer or diet pill retailer as one of your customers, your 
mileage may vary and they may need more personalization, but in general, 
for typical SOHO and SMB customers, spammy spam is spammy spam and 
pretty widely distributed.


From what I see, it's ham that varies a lot per-user, and so while we
try to train bayes across a wide range of ham sets, we also do a lot of
automated whitelisting based on user behaviour (mail that users send,
or mail that users keep in their mailboxes) so that we can skip spam
filtering entirely for as much "wanted" mail as possible. We also try
to reduce filtering on replies when the "In-Reply-To:" header contains
values that match certain formats (such as what our webmail produces,
what we add to messages missing this header, and a few other formats),
so it's possible that someone else who borrowed our bayes database
might end up seeing a higher false-positive rate.


We avoid training big companies (Amazon, eBay, etc) as spam even when 
they spam, as long as it's clearly identified in a blockable way, 
instead providing users the ability to block senders outright when 
applicable.


Sure, there are errors and mistakes, but by and large bayes works out
the details in a shared environment; a multi-server environment
shouldn't be too different, as long as the customer base is similar.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Learning only on read emails?

2015-10-19 Thread Dave Warren

On 2015-10-19 14:41, Ryan Coleman wrote:

> Actually it makes absolute sense since I dump my spam into a folder to be
> scanned as spam and anything that is still in my inbox, and read, is indeed ham.
>
> I just have to re-investigate the ./new and ./cur folders to make sure they
> will operate how I want. But if the email was delivered to my phone and it
> moves (but not read) then it’s not an option.


I agree completely, this has proved to be quite useful here. In my case, 
I scavenge the "Archive" folder of various accounts for my users who use 
the Archive functionality of modern mail clients, in particular, mobile 
clients and Thunderbird.


The concept of "Copy" doesn't exist on most mobile clients, and even 
when it does, most users simply won't be bothered to copy non-spam with 
any regularity, so at least for me, I've had far better success dealing 
with messages on an automated basis than trying to influence user behaviour.


At this point I only implement this for specially selected users (and of 
course, only with a user's consent, but since I approach users when 
they've had a message get misclassified, they're usually happy to help). 
For users who don't use an "Archived" folder, capturing 
left-in-the-inbox, marked-as-read would be useful, but it hasn't been 
worth the time to implement (yet). Unfortunately I don't use maildir, so 
any tools I've created here aren't useful in the general case and are 
really platform and environment specific.


Also, if a user later takes a message from their archive and places it 
into the spam folder, I do have a tool that detects the duplicate and 
purges it from the corpus. Currently I think I just delete the whole 
message, although I did actually write code to detect which was newer 
and trust the most recent decision made by the user, but ultimately I 
decided it was safer to just delete it completely.
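
The "delete it from both sides" policy described above can be sketched in a few lines of Python. Keying messages by Message-ID is my assumption for illustration; the post does not say how duplicates are actually detected, and the names here are hypothetical.

```python
def resolve_conflicts(ham, spam):
    """Drop messages that appear in both corpora.

    ham and spam map a message key (assumed here: Message-ID) to a payload
    such as a file path. Returns (clean_ham, clean_spam, conflicting_keys).
    """
    conflicting = set(ham) & set(spam)
    clean_ham = {key: msg for key, msg in ham.items() if key not in conflicting}
    clean_spam = {key: msg for key, msg in spam.items() if key not in conflicting}
    return clean_ham, clean_spam, conflicting
```

Removing the message from both corpora gives up a training sample, but it avoids having to trust timestamps to decide which classification the user meant last.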



--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Add "may be forged" minor rule?

2015-09-28 Thread Dave Warren

On 2015-09-28 14:39, David Jones wrote:

> I thought the same thing so I eased it in with the sqlgrey discrimination
> option.  My users never even knew I implemented it since I went live on
> a Friday evening.  I filter for almost 100,000 mailboxes and zero complaints
> from users, just a lot less reports of spam to our support mailbox.


One other trick you can use, if you haven't already considered it and 
are starting a fresh implementation, see if you can "prime" the database 
by feeding recent history into the greylist "approved" list before you 
start. Or if not, start with a 1 second greylist period for the first 
couple weeks, such that any retry gets through, then raise it to a more 
reasonable number once your greylist database has a more useful picture 
of the type of mail you receive.



> You could use the discrimination to just start with those rare 4 letter
> and longer TLDs and ease into it.


I've had good success by only triggering greylisting if a session is 
already suspicious in some fashion. My current criteria are:


- EHLO or rDNS mismatch.
- SPF fail or softfail.
- DKIM headers exist, but don't validate.
- SpamAssassin score over 4.9.
- "Suspicious" TLD (a moving target, I admit).
- Any DNSBL hit.

Some of these are applied at RCPT TO time, some not until DATA, but none 
of these are candidates to reject mail outright, just to run it through 
greylisting. If greylisting decides to let it through, it will get 
accepted, but if not, maybe a 30 minute cool down period will be enough 
for other DNSBLs or bayes to be ready to reject it outright.
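
The selective-greylisting criteria above could be expressed roughly as follows. This is a sketch only: the `Session` fields, the placeholder TLD set, and the reading of "EHLO or rDNS mismatch" as a single boolean are assumptions made for illustration, and in a real MTA some of these values would only be known after DATA.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """What is known about an SMTP session (hypothetical field names)."""
    helo_matches_rdns: bool = True
    spf_result: str = "none"   # "pass", "fail", "softfail", "neutral", "none"
    dkim_signed: bool = False
    dkim_valid: bool = False
    spam_score: float = 0.0
    sender_tld: str = "com"
    dnsbl_hits: int = 0


SUSPICIOUS_TLDS = {"example-tld"}  # placeholder; the real list is a moving target


def greylist_reasons(session, suspicious_tlds=SUSPICIOUS_TLDS):
    """Return the reasons this session should be greylisted (empty list: don't)."""
    reasons = []
    if not session.helo_matches_rdns:
        reasons.append("EHLO/rDNS mismatch")
    if session.spf_result in ("fail", "softfail"):
        reasons.append("SPF fail or softfail")
    if session.dkim_signed and not session.dkim_valid:
        reasons.append("DKIM present but invalid")
    if session.spam_score > 4.9:
        reasons.append("SpamAssassin score over 4.9")
    if session.sender_tld.lower() in suspicious_tlds:
        reasons.append("suspicious TLD")
    if session.dnsbl_hits > 0:
        reasons.append("DNSBL hit")
    return reasons
```

None of these reasons rejects mail on its own; a non-empty list only means the session goes through greylisting instead of being accepted immediately.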


You might also want to see if you can avoid greylisting some big 
senders. There is zero advantage in greylisting Google, Outlook.com, 
Outlook 365, Yahoo, AOL, etc, as you know they're real mail servers and 
you know they will retry. For senders that send a large amount of good 
mail, content filtering is worthwhile, but greylisting won't do anything 
but potentially delay legitimate traffic.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: SPF confusion

2015-07-16 Thread Dave Warren

On 2015-07-15 23:49, Matus UHLAR - fantomas wrote:
>> On 2015-07-15 13:53, David Jones wrote:
>>> I have seen Microsoft Exchange servers use the header From: domain
>>> instead of the envelope-from but this does not follow RFC 4408 spec.
>
> On 15.07.15 15:06, Dave Warren wrote:
>> This is valid under Sender-ID, which was Microsoft's attempt at SPF
>> version 2. It has since died a (deserved) death, and such use should
>> be deprecated. However, if you don't have an explicit spf2.0/pra
>> record, Sender-ID suggested using the v=spf1 record instead (which
>> obviously causes issues).
>>
>> If you want to avoid these tests, add this as a TXT record:
>> spf2.0/pra or spf2.0/pra ?all
>
> Better not. Don't jump on dead horse. Microsoft SPF/2 is dead, let it die
> and don't even try to fix things by implementing it, since it may break
> things that are working properly.


From a receiving side, I agree completely. As a sender, I still like to
have a record telling receivers "Don't apply spf2.0/pra tests" because
otherwise there is (potential for) minor breakage if you happen to send
to a server that still performs these tests. If not, the records are
harmless TXT records.


With that being said, I only personally know of one small Exchange 
server which still implements Sender ID tests, it's been running some 
years without an administrator and they have no plans to hire one, so 
it's likely got other issues which need attention, but aren't critical 
enough to get it. And frankly, even when spf2.0/pra records don't exist 
and Sender ID falls back to v=spf1 records, things work often enough.


Each to their own, but if you encounter a server that is still using 
Sender-ID, it's generally so poorly maintained that it's easier to add 
the record yourself than to try and get them to fix it.
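
The fallback behaviour discussed in this message can be sketched as a record-selection function: a Sender-ID receiver checking the PRA (header From) identity uses an spf2.0 record that covers the `pra` scope if one exists, and otherwise falls back to the v=spf1 record. The Python below follows my reading of the Sender-ID record selection rules and should be taken as an approximation; the function name and sample records are hypothetical.

```python
def select_pra_record(txt_records):
    """Pick the policy a Sender-ID PRA check would evaluate, or None.

    Prefers an spf2.0 record whose scope list includes 'pra'; if there is
    none, falls back to the first v=spf1 record.
    """
    spf1 = None
    for record in txt_records:
        version = record.split(" ", 1)[0].lower()
        if version.startswith("spf2.0/"):
            scopes = version[len("spf2.0/"):].split(",")
            if "pra" in scopes:
                return record
        elif version == "v=spf1" and spf1 is None:
            spf1 = record
    return spf1


# Without an spf2.0/pra record, the envelope-oriented v=spf1 policy is
# applied to the header From domain.
print(select_pra_record(["v=spf1 mx -all"]))
# With one, the PRA check sees only the neutral Sender-ID policy.
print(select_pra_record(["v=spf1 mx -all", "spf2.0/pra ?all"]))
```

This is why publishing the extra record is harmless to SPF proper: a v=spf1 evaluator never looks at the spf2.0 record at all.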



--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Return Path (TM) whitelists

2015-07-15 Thread Dave Warren

On 2015-07-09 15:07, Dianne Skoll wrote:

> Just as SPF pass is a mild spam indicator nowadays


Huh? Last I looked, somewhere near 80% of my legitimate mail flow passes 
SPF. It wouldn't shock me if this has gone higher.


While a lot of spam does too, SPF:PASS alone doesn't really mean 
anything, but rather, it should be used as a way to indicate that the 
mail comes from an IP authorized to use the domain in question (or not). 
SPF FAIL/SOFTFAIL is often a bad sign (it either indicates forgery OR
misconfiguration, so you can treat it with suspicion), but SPF PASS is
meaningless on its own.


I'd suggest that SPF:PASS means you can rely on domain based logic 
(trusts/whitelists/reputation) rather than only IP based logic, allowing 
you to safely whitelist example.com without guessing what IPs 
example.com uses (and might use tomorrow.)
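
In code, the idea is simply that SPF pass acts as a gate in front of domain-based decisions rather than as a score in its own right. A minimal Python sketch, with hypothetical names and a hypothetical trusted-domain set:

```python
def domain_is_trusted(spf_result, envelope_domain, trusted_domains):
    """Apply a domain whitelist only when SPF has vouched for the sending IP."""
    if spf_result.lower() != "pass":
        return False  # without a pass, the domain name proves nothing
    domain = envelope_domain.lower().rstrip(".")
    return domain in trusted_domains


trusted = {"example.com"}
print(domain_is_trusted("pass", "Example.com", trusted))      # True
print(domain_is_trusted("softfail", "example.com", trusted))  # False
```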


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: SPF confusion

2015-07-15 Thread Dave Warren

On 2015-07-15 13:53, David Jones wrote:

> I have seen Microsoft Exchange servers use the header From: domain
> instead of the envelope-from but this does not follow RFC 4408 spec.


This is valid under Sender-ID, which was Microsoft's attempt at SPF
version 2. It has since died a (deserved) death, and such use should be
deprecated. However, if you don't have an explicit spf2.0/pra record,
Sender-ID suggested using the v=spf1 record instead (which obviously
causes issues).


If you want to avoid these tests, add this as a TXT record: spf2.0/pra 
or spf2.0/pra ?all


Note that while it was presented as a version 2, it's deprecated, and
v=spf1 records are still current and the only records that really
should be used in practice today.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Macs/Yosemite can no longer send abuse reports

2015-06-27 Thread Dave Warren

On 2015-06-27 15:00, Jo Rhett wrote:
> In the meantime, is there a mail client for Yosemite which does work?
> I tried Thunderbird, and while it is capable it’s more than 15 clicks
> and manual hand editing to send a report. The two key combinations was
> far easier to use.


I'm not sure if Thunderbird is otherwise hobbled on OSX, but on my OS, 
CTRL+U (View Source) will bring up the original message, and if you add 
the Forward button to your toolbar, you can Forward as attachment in 
two clicks (to create a message with the currently selected message(s) 
already attached)


What are you doing that takes 15 clicks?

You still need to address the report and add comments, but since this 
needs to be done regardless of client, I don't care to count these steps.


Also, you can set the default forward method if you only intend to 
forward email as an attachment and want the extra click for when you're 
forwarding inline.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Handling very large messages (was Re: Which milter do you prefer?)

2015-03-15 Thread Dave Warren

On 2015-03-15 15:01, Reindl Harald wrote:
> surely, only 5% of incoming spam attempts make it to spamassassin /
> clamav here, but you need to keep in mind the amount of your regular
> ham messages in your mailflow which unconditionally touch the content
> scanners


Why would it? I'd hazard a guess that, on a percentage basis, I run less
ham through SpamAssassin than spam.


Obviously comparing the raw numbers will give a different set of
results, due to the drastically different number of spam attempts vs ham
attempts.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Handling very large messages (was Re: Which milter do you prefer?)

2015-03-15 Thread Dave Warren

On 2015-03-15 17:26, Reindl Harald wrote:


> On 16.03.2015 at 01:23, Dave Warren wrote:
>> On 2015-03-15 15:01, Reindl Harald wrote:
>>> surely, only 5% of incoming spam attempts make it to spamassassin /
>>> clamav here, but you need to keep in mind the amount of your regular
>>> ham messages in your mailflow which unconditionally touch the content
>>> scanners
>>
>> Why would it? I'd hazard a guess that, on a percentage basis, I run less
>> ham through SpamAssassin than spam
>
> then your MTA filters *before* SA just don't work or you have very few
> legit mail at all




Not at all, I just have comprehensive, adaptive, user-learned
whitelisting that catches the vast majority of legitimate mail before it
hits SpamAssassin. By whitelisting known-good sources aggressively and
automatically, I can cut the false positive rate to near zero, allowing
me to filter more aggressively at later stages.


> 95% of any delivery attempt is blocked by a sensible
> DNSBL/DNSWL/PTR/HELO check on the MTA level and never makes it to
> milters at all


SpamAssassin need only be responsible for sorting through mail that 
isn't already known to be good or bad; putting known-good mail through 
SpamAssassin is wasteful.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Blocking .exe in zips (was Re: Lots of Polish spam)

2015-02-25 Thread Dave Warren

On 2015-02-25 12:18, David F. Skoll wrote:

On Tue, 24 Feb 2015 23:06:02 +0100
Yves Goergen nospam.l...@unclassified.de wrote:


If the mail server now blocks all .exe in .zip without
actually scanning the contents, they're going to complain.

...

So far, no major complaints.  The few who really need to send such files
rename them to .ex_ before zipping them up.  We have a fairly large
userbase (more than 140,000) so I think we would have heard lots of
complaints by now if people really couldn't live with the policy.


Seconded. I run a small hosting company with email for hundreds of 
clients, and I've had a grand total of 0 complaints about blocking EXE, 
SCR, COM and similar types. We maybe get one inquiry per year about it, 
but no one has ever had a problem with .ex_ solutions, and they 
generally understand and appreciate the approach.


It scales up to large installations as well: Google blocks executable 
files (even if zipped) too, and they seem to be doing alright in the 
email world: https://support.google.com/mail/answer/6590?hl=en
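
A minimal sketch of this kind of policy, assuming the check is based only 
on the file names stored in the archive and never on the file contents. 
The extension list and both function names are illustrative, not taken 
from any particular filter:

```python
import io
import zipfile

# Extensions treated as executable; an illustrative list, not a complete policy.
BLOCKED_EXTENSIONS = (".exe", ".scr", ".com")


def blocked_members(zip_bytes):
    """Return the names of archive members whose extension is on the block list."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return [name for name in archive.namelist()
                if name.lower().endswith(BLOCKED_EXTENSIONS)]


def build_zip(names):
    """Demo helper: build an in-memory zip containing empty files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in names:
            archive.writestr(name, b"")
    return buffer.getvalue()


if __name__ == "__main__":
    print(blocked_members(build_zip(["invoice.pdf", "Setup.EXE"])))
    print(blocked_members(build_zip(["setup.ex_"])))
```

Because only names are inspected, a file renamed to .ex_ before zipping 
passes, which is exactly the workaround described above.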


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Lots of Polish spam

2015-02-25 Thread Dave Warren

On 2015-02-25 11:42, Bill Cole wrote:

On 24 Feb 2015, at 17:06, Yves Goergen wrote:


I can't block all archives with executable files in them.


Then in all seriousness: why bother filtering email specifically for 
malware?


I second this. Either go all the way or don't do it; it's worse to 
leave users with a false sense of security. A mentality of "The virus 
scanner says it's safe, so it won't do any harm" is exceedingly dangerous.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Lots of Polish spam

2015-02-25 Thread Dave Warren

On 2015-02-25 14:23, Yves Goergen wrote:

Am 25.02.2015 um 23:04 schrieb Dave Warren:

I second this. Either go all the way, or don't do it, it's worse to
leave users with a false sense of security. A mentality of The virus
scanner says it's safe, so it won't do any harm is exceedingly 
dangerous.


The virus scanner doesn't say anything at all. It is just an 
additional effort to keep unwanted e-mails away, just like the spam 
filter. Nobody claimed that there is any guarantee associated with it, 
not even for false rejects. Considering what still passes the filters 
this should quickly become obvious.




You're thinking like a techie. Don't do that. When an end user becomes 
aware that there is a malware filter or antivirus, they will assume it 
works, and since malware and viruses are filtered, that which is not 
filtered must be safe.


Users are stupid; this is why we're employed. Understand them, and build 
systems that set appropriate expectations and encourage the correct 
behaviour. If you're handing people a dangerous weapon, don't tell them 
all the reasons it's safe, tell them all the reasons it's dangerous even 
if there are a few safeguards.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Quick question about training...

2015-02-20 Thread Dave Warren

On 2015-02-20 09:44, Bowie Bailey wrote:

On 2/20/2015 12:35 PM, Kevin Miller wrote:
When a fresh spam flood comes in, sometimes 50 or more of my users 
will get hit with the same message - just a different user in the To: 
line.  When one trains the bayes database, is there a significant 
difference between training on all 50+ or just grabbing a few of the 
messages and training on them?  Will bayes be more convinced of the 
spaminess of a particular message if it sees dozens rather than a 
couple?


Yes, there will be a difference.  Training the exact same message 
multiple times will not do anything, but if you have 50 copies of the 
message that are all slightly different, train them all.


In general, train as much as you can manage.  Ideally, you would train 
bayes on every message that passes through your server.  The more data 
bayes has, the better it works.


And I'd suggest the same for non-spam, train duplicative ham even if it 
happens to be similarly addressed to different users. More data is 
(nearly) always better for bayesian learning systems.
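
As a rough sketch of the distinction drawn above, the following drops 
only byte-identical copies before training and keeps every copy that 
differs at all (for example in the To: line). The function name is 
hypothetical, and hashing the raw message is an assumption made for 
illustration; it is not a description of how SpamAssassin tracks 
messages it has already learned:

```python
import hashlib


def unique_messages(messages):
    """Drop byte-identical copies, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for message in messages:
        digest = hashlib.sha256(message).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(message)
    return unique


if __name__ == "__main__":
    flood = [
        b"To: alice@example.org\n\nBuy now",
        b"To: bob@example.org\n\nBuy now",
        b"To: alice@example.org\n\nBuy now",  # exact duplicate of the first
    ]
    print(len(unique_messages(flood)))
```

The two copies addressed to different users both survive and would both 
be trained; only the exact duplicate is skipped.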


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: SPF rules do not look at spoofed From: address

2015-02-12 Thread Dave Warren

On 2015-02-12 08:17, francis picabia wrote:

Our spamassassin 3.3.1 is marking email with tags like and
SPF_SOFTFAIL and SPF_FAIL, as long as the sender info
is failing the SPF test.  But if the sender passes the test
and the From: address is from our domain, then there
are no SPF tags appearing.

The risk is that users don't look at the sender, only the From:
field of their email, and this can potentially allow phishing.

Has anyone encountered this issue and resolved it?


As others have said, this is by design. Sender-ID attempted to extend 
SPF records to the RFC5322.From header, and was not widely deployed 
because of the massive breakage. It's legacy at this point.


DMARC is a more modern solution, allowing senders to specify that mail 
from their domain must be identified and authenticated, including an 
alignment requirement between the RFC5321.Mail and RFC5322.From domains.


However, using a DMARC quarantine or reject policy causes breakage 
when users attempt to participate in discussion-based mailing lists, or 
other systems which modify messages (adding subject tags, adding 
footers, removing existing signatures), so DMARC quarantine or reject 
policies are only really useful for domains which send mail in 
predictable and largely automated ways, which are frequently forged, 
with live users living at another domain for their own mailboxes.


With that being said, there could be some room for ham-detection 
(negative scoring, from a SA perspective) when RFC5322.From headers pass 
parsing of SPF records, but you should not attempt to use any 
spam-detection when there is a mismatch as a mismatch is normal and 
expected behaviour.
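
The alignment requirement mentioned above can be sketched as follows. 
This is a simplification under stated assumptions: strict mode compares 
the two domains exactly, while the "relaxed" branch here merely accepts 
one domain being a subdomain of the other. Real relaxed alignment 
compares Organizational Domains, which needs the Public Suffix List and 
is not attempted here. Both function names are hypothetical:

```python
def domain_of(address):
    """Return the lower-cased domain part of an email address."""
    return address.rsplit("@", 1)[-1].lower().rstrip(".")


def is_aligned(mail_from, header_from, strict=False):
    """Compare the RFC5321.MailFrom domain with the RFC5322.From domain."""
    envelope = domain_of(mail_from)
    header = domain_of(header_from)
    if envelope == header:
        return True
    if strict:
        return False
    # Simplified relaxed check: parent/child domains count as aligned.
    return envelope.endswith("." + header) or header.endswith("." + envelope)


if __name__ == "__main__":
    # A message relayed by a mailing list: the envelope sender belongs to
    # the list, the From: header still belongs to the original author.
    print(is_aligned("list-bounces@lists.example.net", "user@example.com"))
```

The mailing-list case is the one where the two domains differ for 
entirely legitimate reasons, which is why a mismatch on its own says 
nothing about spam.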


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: SPF rules do not look at spoofed From: address

2015-02-12 Thread Dave Warren

On 2015-02-12 11:27, Martin Gregorie wrote:

On Thu, 2015-02-12 at 15:07 -0400, francis picabia wrote:

SPF works as designed.  Forget SPF.


Quite: the only real use for SPF is to prevent you inadvertently
spraying innocent people with backscatter. If the sender has been forged
by a spammer and your MTA can't deliver it (usually because the spammer
used an unrecognised recipient name) then an SPF check will show that
the sending IP is wrong and your MTA can drop the message in the bit
bucket rather than sending a reject message to the owner of the forged
sender address.


Not at all. SPF is very useful for whitelisting by domain, without 
having to guess at what IPs a sender uses today, might use tomorrow, and 
without having to trust every single thing coming from that IP space.


SPF based whitelisting trivially allows you to whitelist all mail from 
@example.com even if they use Google Apps and you don't want to blanket 
whitelist Google Apps. And it will still work when they transition to 
another provider and don't think to tell you.


It's not effective as a blacklist, nor as a spam filter. Nor should it 
be; that's not its design goal. SPF does a /great/ job at telling you 
when a message is directly from a legitimate sender, allowing you to act 
accordingly.


DKIM is similar: it excels at identifying legitimate messages, using 
cryptography that survives forwarders rather than using IPs. More 
complicated to implement, but ultimately, technically, a better solution.


In both cases, it helps you pick out legitimate mail from wanted senders, 
which can benefit spam filtering by allowing you to be just a little bit 
more aggressive against unknown senders without raising false positives 
too much in the process.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Heads Up: Yahoo! goof

2015-02-08 Thread Dave Warren

On 2015-02-05 05:02, Axb wrote:
Wonder what moron came up with the idea that Yahoo! should use 
addresses in the multicast range 224.0.0.0/4 in the webmail Received 
headers.




There are other places/reasons that this happens. For example, take a 
look at CloudFlare's Pseudo IPv4 implementation that kicks in when an 
IPv6 client is served content from their proxy from an IPv4-only host.


https://blog.cloudflare.com/eliminating-the-last-reasons-to-not-enable-ipv6/

Is it possible that Yahoo is doing similar (even if for different reasons)?
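
For anyone wanting to spot such headers, here is a small sketch that 
pulls bracketed IPv4 literals out of a Received header and reports those 
inside 224.0.0.0/4. The regular expression and the sample headers are 
assumptions for illustration; real Received headers vary a great deal:

```python
import ipaddress
import re

MULTICAST = ipaddress.ip_network("224.0.0.0/4")
IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def multicast_addresses(received_header):
    """Return bracketed IPv4 literals in the header that fall in 224.0.0.0/4."""
    found = []
    for text in IPV4_IN_BRACKETS.findall(received_header):
        try:
            address = ipaddress.ip_address(text)
        except ValueError:
            continue  # looked like an address but is not a valid one
        if address in MULTICAST:
            found.append(text)
    return found


if __name__ == "__main__":
    print(multicast_addresses("from [227.1.2.3] by web.mail.example.net via HTTP"))
    print(multicast_addresses("from mail.example.org ([192.0.2.10]) by mx.example.net"))
```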

--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: SPF_HELO_PASS,SPF_NONE

2015-01-05 Thread Dave Warren

On 2015-01-05 18:34, Reindl Harald wrote:


if the envelope-domain has no SPF published and want to verify 
anything in context of HELO then you can check:


* does the HELO hostname exist at all
* does the IP match in both directions

but you are far away from a SPF_HELO_PASS in context of the incoming 
mail, frankly it's wrong and unrelated until the envelope sender is 
not @helo-hostname


You might want to give the SPF spec another read; SPF can optionally 
apply to the HELO/EHLO identity. See 
https://tools.ietf.org/html/rfc7208#section-2.3, which reads in part:



It is RECOMMENDED that SPF verifiers not only check the "MAIL FROM"
identity but also separately check the "HELO" identity by applying
the check_host() function (Section 4,
https://tools.ietf.org/html/rfc7208#section-4) to the "HELO" identity
as the <sender>.


Since this applies to the HELO/EHLO field separately from the MAIL FROM 
based checks, it is perfectly valid to have an SPF_HELO_PASS even if the 
sending domain has no SPF policy. This is normal and expected behaviour.
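
To make the independence of the two checks concrete, here is a toy 
sketch. The toy_check_host() stand-in is entirely hypothetical: it only 
looks a domain up in a dictionary of permitted IPs and is nowhere near a 
real check_host() implementation. The point is only that each identity 
gets its own, separate evaluation:

```python
def toy_check_host(policies):
    """Build a stand-in for check_host(); policies maps domain -> permitted IPs."""
    def check_host(ip, domain, sender):
        if domain not in policies:
            return "none"  # no SPF policy published for this domain
        return "pass" if ip in policies[domain] else "fail"
    return check_host


def check_both_identities(check_host, client_ip, helo_name, mail_from):
    """Evaluate the HELO identity and the MAIL FROM identity separately."""
    mail_from_domain = mail_from.rsplit("@", 1)[-1]
    return {
        "helo": check_host(client_ip, helo_name, helo_name),
        "mail_from": check_host(client_ip, mail_from_domain, mail_from),
    }


if __name__ == "__main__":
    # Only the HELO hostname publishes a policy; the envelope domain does not.
    check = toy_check_host({"mail.example.net": {"192.0.2.25"}})
    print(check_both_identities(check, "192.0.2.25",
                                "mail.example.net", "user@example.org"))
```

With only the HELO hostname publishing a policy, the HELO identity 
passes while the MAIL FROM identity yields "none", which corresponds to 
the SPF_HELO_PASS,SPF_NONE combination in the subject line.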


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren



Re: Slightly OT- nolisting

2014-10-21 Thread Dave Warren

On 2014-10-21 01:49, Matus UHLAR - fantomas wrote:

we know about this... Marc Perkel (the owner of junkemailfilter.com) got
blamed here for repeated advertising of his services on this list.
Please do not make the same mistake 


I can't help you with that. I'm a satisfied user, have no affiliation 
with them, and have no other incentive to suggest them beyond personal 
experience; the suggestion is directly on-topic with regards to using 
additional MX servers for spam reduction purposes.


If you're not interested, or if the company or their representatives 
start advertising, take it up with them. I agree that that's likely 
inappropriate if it happens on an ongoing basis, when it's not directly 
being discussed, or after they're advised that they're not welcome. This 
is not the same situation.


Any list owner/moderator is welcome to contact me off-list to discuss 
further.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Slightly OT- nolisting

2014-10-20 Thread Dave Warren

On 2014-10-20 05:18, Robert Moskowitz wrote:
SInce this is about mail and spam, I thought this might be a good 
place to ask about nolisting:


http://en.wikipedia.org/wiki/Nolisting

I get ~ 7000 messages/day on my server, with ~70% getting tagged as spam.



I did some experimentation a few weeks ago and found that a nolisting 
style dead first MX didn't make anywhere near as much of an impact as I 
hoped, while in some cases it did cause delays (although only a few lost 
messages that we could find, and all from small home-grown systems that 
really ought to be feeding a proper mail relay).


What does seem to still work is having a secondary/last dummy MX that 
answers with 4xx, at least at this point. Based on my (definitely 
unscientific) testing, I believe that dumb ratware hits the 
lowest-priority (highest-numbered) MX, while smarter ratware either 
starts at the top or hits them all.


For this purpose, I'm currently using junkemailfilter.com's freebie:

MX 997 mxbackup1.junkemailfilter.com.
MX 998 mxbackup2.junkemailfilter.com.

mxbackup1 is a free backup-MX service; mxbackup2 is an "always fails" 
final MX. It's very clever: before accepting mail, it probes your 
server. If your server is up and returns a 2xx or 4xx, it'll return a 
4xx (so it won't accept mail if your server is working, thereby avoiding 
the situation where a backup mail provider opens a hole in your finely 
tuned filters), or if your server returns a 5xx, it will pass on the 5xx.


If your server doesn't respond, they'll return a 2xx and accept the 
mail, then forward it to your higher-priority MX when you return.


It's a really nice package, plus they use the data they collect to 
improve their service, so it's a win-win. Obviously read their policies 
and ensure you're okay with part of your mail stream passing through a 
third party.
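
The probing behaviour described above boils down to a small decision 
table. The sketch below is an illustration only: the function name is 
hypothetical, and the specific codes 250 and 451 are assumptions 
standing in for "some 2xx" and "some 4xx"; the service's real replies 
are not known here:

```python
def backup_mx_reply(primary_reply):
    """Choose the backup MX's reply from the result of probing the primary.

    primary_reply is the primary's SMTP reply code, or None if the
    primary did not respond at all.
    """
    if primary_reply is None:
        return 250  # primary is down: accept and queue for later delivery
    if 500 <= primary_reply <= 599:
        return primary_reply  # pass the permanent rejection through
    if 200 <= primary_reply <= 299 or 400 <= primary_reply <= 499:
        return 451  # primary is alive: make the sender retry there
    raise ValueError(f"unexpected SMTP reply code: {primary_reply}")


if __name__ == "__main__":
    for probe in (250, 450, 550, None):
        print(probe, "->", backup_mx_reply(probe))
```

The useful property is that the backup never accepts a message the 
primary was in a position to judge for itself.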


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Unsubscribe

2014-10-17 Thread Dave Warren

On 2014-10-16 19:14, Duane Hill wrote:

Are you sending the unsubscribe request from an address subscribed?


That generates a useful error message too:

Hi! This is the ezmlm program. I'm managing the
users@spamassassin.apache.org  mailing list.

Acknowledgment: The address

   li...@hireahit.com

was not on the users mailing list when I received
your request and is not a subscriber of this list.

If you unsubscribe, but continue to receive mail, you're subscribed
under a different address than you currently use. Please look at the
header for:
...

I also verified that the -unsubscribe address works, and got a reply in 
under a minute from there too. I'm obviously not completing the loop 
since I would prefer to stay subscribed.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Unsubscribe

2014-10-15 Thread Dave Warren

On 2014-10-15 10:16, Kevin A. McGrail wrote:

On 10/15/2014 1:07 PM, Derek Harding wrote:

On 9/25/14, 2:00 PM, Reindl Harald wrote:
irrelevant - every list has a welcome message and there is no logic 
in ask other members to unsubscribe yourself, one did also not ask 
them for subscribe 
https://www.google.at/search?q=spamassassin+mailing+list 
https://wiki.apache.org/spamassassin/MailingLists


I love it when people get all authoritarian on lists. I've used the 
list-unsubscribe address numerous times. Never once seen a response 
and the mail has never once stopped.


Process seems broken to me. 
I've looked into this and tested the process without issue.  The only 
issue I've seen to date is someone who joined who did not know what 
email address they joined as and does not know how to read the email 
source to determine the correct email address to unsubscribe.


I second this. I tested it a few months back when I unsubscribed and 
re-subscribed a new address from all of the mailing lists. This list 
worked fine. Most did.


There were a couple I encountered that didn't work, where the -request 
or -unsubscribe addresses weren't properly aliased, and I can confirm 
that all the ones I encountered are now fixed.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: recent channel update woes

2014-10-08 Thread Dave Warren

On 2014-10-07 16:58, Karsten Bräckelmann wrote:

I monitor positive and negative responses, for IP based DNS BLs, I use
the following by default:

127.0.0.1 should not be listed.
127.0.0.2 should be listed.

Depending on how the DNSBL implements such static test-points, they
might not be affected by the issue causing the false listings.
Similarly, domains likely to appear on exonerate lists (compare
uridnsbl_skip_domain e.g.) might also not be affected.

For paranoid monitoring, low-profile domains that definitely do not and
will not match the listing criteria might be better suited for the task.


I included $MYIP for that reason: if I'm listed, either the world is 
being listed, or I have a problem. Either way, I want to know about it, now.



$MYIP should not be listed.


In the event that I'm blocked from querying the DNSBL, that a DNSBL is 
offline, under attack or whatever, odds are that 127.0.0.2 (or whatever 
is applicable) will disappear.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: spamd does not start

2014-10-08 Thread Dave Warren

On 2014-10-08 15:23, Duane Hill wrote:

No. && is a way of chaining commands together. Your cron says run
sa-update and then restart spamd. In other words, when sa-update
finishes running, regardless if there was an update applied or not,
restart spamd.


I thought that ; would chain commands together and run both in sequence 
regardless of the results, whereas && is a conditional for if the 
previous command succeeded and || was a conditional for if the previous 
command failed?


At least in bash...
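
The three operators can be compared by handing short command lines to 
the system shell. This sketch assumes a POSIX-style shell with the 
standard true and false commands; "echo restarted" merely stands in for 
the restart step, and the helper name is made up for the example:

```python
import subprocess


def shell_output(command_line):
    """Run a command line through the system shell and return its stdout."""
    completed = subprocess.run(command_line, shell=True,
                               capture_output=True, text=True)
    return completed.stdout.strip()


if __name__ == "__main__":
    for command_line in (
        "false ; echo restarted",   # ; runs the second command regardless
        "false && echo restarted",  # && runs it only after success
        "false || echo restarted",  # || runs it only after failure
        "true && echo restarted",
    ):
        print(repr(command_line), "->", repr(shell_output(command_line)))
```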

--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: recent channel update woes

2014-10-07 Thread Dave Warren

On 2014-10-07 16:18, Reindl Harald wrote:
what happens here is unintentional and so you can't say if the 
response is wrong - if you would know the answer you would not ask the 
server 


If you're paranoid, you can monitor the DNSBLs that you use via script 
(externally from SpamAssassin) and generate something that reports to 
you when there's a possible issue. If you're really paranoid, you can 
have it write a .cf that would 0 out the scores, but I assure you that 
you'll spend more time building, testing and maintaining such a system 
than it's worth in the long run. In my experience it's better to just 
page an admin.


I monitor both positive and negative responses. For IP-based DNSBLs, I 
use the following by default:


127.0.0.1 should not be listed.
127.0.0.2 should be listed.
$MYIP should not be listed.

Obviously these need to be tweaked and configured per-list. Not all 
lists list 127.0.0.2, and some lists use status codes, so "should not be 
listed" and "should be listed" are really "match"/"do not match" some 
condition.


In the case of DNSWL, $MYIP should be listed, if I get de-listed, I want 
to know about that too.
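
A minimal sketch of such a monitor, assuming a plain IPv4 DNSBL where 
any successful A lookup means "listed". The zone name, the function 
names and the injectable resolve parameter are all made up for the 
example, and per-list status codes (or the inverted DNSWL case) would 
need their own expectations:

```python
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNSBL query name: reversed IPv4 octets followed by the zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone


def is_listed(ip, zone, resolve=socket.gethostbyname):
    """Treat a successful A lookup as 'listed' and any failure as 'not listed'."""
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:  # socket.gaierror is a subclass of OSError
        return False
    return True


def check_zone(zone, my_ip, resolve=socket.gethostbyname):
    """Return a list of problems; an empty list means the zone looks healthy."""
    expectations = [("127.0.0.1", False), ("127.0.0.2", True), (my_ip, False)]
    problems = []
    for ip, should_be_listed in expectations:
        if is_listed(ip, zone, resolve) != should_be_listed:
            wanted = "should be listed" if should_be_listed else "should not be listed"
            problems.append(f"{zone}: {ip} {wanted}")
    return problems


if __name__ == "__main__":
    def list_everything(name):
        """Fake resolver simulating a DNSBL that has started listing the world."""
        return "127.0.0.2"

    for problem in check_zone("dnsbl.example", "192.0.2.10", list_everything):
        print(problem)
```

Because lookup failures count as "not listed", a list that is offline or 
blocking queries shows up as the 127.0.0.2 test point disappearing, 
rather than being silently ignored.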


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: How to report spam to mailspike

2014-08-29 Thread Dave Warren

On 2014-08-29 02:38, Marcin Mirosław wrote:

So what should I do in your opinion? I'm getting spam to my private
spamtrap so I can't fill fields about company - it doesn't matter where
I'm hired for reporting spam. What if I would be unemployed? Then I
would have to lie about company? IMHO it is the way to hinder sending
complaints from users.


If you're not willing to provide the information they request, and they 
won't accept an inquiry without it, then you're left with a different 
choice: 1) Do nothing, or 2) Cease using the service.


From their perspective, either the policy will increase the quality of 
reports they get by reducing the noise, allowing them to focus on real 
queries, and ultimately increasing the quality of the list, or it will 
discourage people from reporting, decreasing the quality of the list, 
resulting in fewer users and less relevance.


They've made their choice, now you get to make yours. Personally, I'm 
quite pleased with their performance, and I have no problem identifying 
myself when I contact a company. If I'm acting on my own behalf, I'd put 
"Personal" or "None" or "N/A" into a form, and if it's not accepted, oh 
well.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Give a penalty to messages with non latin UTF-8 characters?

2014-08-29 Thread Dave Warren

On 2014-08-29 02:41, Michael Opdenacker wrote:

I find it hard to believe I'm the only one getting spam in Chinese
characters;)


I get a fair amount in my spamtraps, but only because my trap addresses 
are very permissive. None of it would have been accepted for normal 
delivery.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Rule to check return-path for To address

2014-08-23 Thread Dave Warren

On 2014-08-23 11:59, Jeff wrote:


I recently started getting hammered by spam and nearly all of the spam 
emails have one thing in common. The return-path header contains the 
email address that the spam is being sent to.


Below is a sample header:
...
Return-Path: amazon-voucher-myname=mydomain@indiarti.com
...

The green text above is the email address that the spam is being sent 
to (i.e., myn...@mydomain.com mailto:myn...@mydomain.com).


Is there a way to write a custom SpamAssassin rule that will mark any 
message as spam if the return-path contains the 'To' address, 
regardless of what it may be, and the equal sign (i.e., user=domain.tld)?





Are you aware that such a rule would hit on this mailing list, and tons 
of otherwise legitimate mail sent by mailing lists and bulk mailers alike?


The (slightly modified to neuter it) Return-Path for your message is:

Return-path: 
users-return-00-davew=hireahit@spamassassin.apache.org.example 
for my recipient address of da...@hireahit.com.
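
To see why such a rule over-matches, here is a sketch of the test being 
proposed: does the local part of the Return-Path contain the recipient 
encoded as user=domain? The function name and all addresses are made up 
for the example:

```python
def encodes_recipient(return_path, recipient):
    """True if the Return-Path local part contains recipient as user=domain."""
    local, _, domain = recipient.lower().partition("@")
    path_local = return_path.strip("<>").lower().rpartition("@")[0]
    return f"{local}={domain}" in path_local


if __name__ == "__main__":
    recipient = "alice@example.org"
    # The spam pattern described in the question:
    print(encodes_recipient(
        "<amazon-voucher-alice=example.org@spammer.example>", recipient))
    # An ordinary mailing-list bounce address using the same encoding:
    print(encodes_recipient(
        "<users-return-12345-alice=example.org@lists.example.net>", recipient))
```

Both calls match, so the pattern on its own cannot tell the spam apart 
from list traffic.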


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren



Re: Bogus SPF +all (was Re: dnssec / dane)

2014-08-15 Thread Dave Warren

On 2014-08-15 12:05, John Hardin wrote:
exists:? (looks up SPF syntax) (boggle) WTF is the sane use case for 
exists:??


Imagine something like:

exists:%{l}.%{o}.%{i}._spf.webhost.example

This might allow me to PASS only messages coming from addresses that 
actually exist, and are from the correct server. (Sure, the sending 
server really should enforce this itself, but not all do.)


Or I could get more complicated: PASS messages from addresses that exist 
from the correct server, NEUTRAL from addresses that exist when the 
message is from an incorrect server, and FAIL everything from invalid 
addresses no matter what:


exists:%{l}.%{o}.%{i}._spf.webhost.example 
?exists:%{l}.%{o}._any._spf.webhost.example -all


With other types of macro expansion, you could query a DNS backend that 
returns responses from a database or algorithmically rather than based 
on static SPF rules written in DNS as text.


Arguably most of it is needlessly complex in practice, but it's still a 
neat idea, or would be, if SPF FAIL were universally enforced.


Even without FAIL enforcement though, exists: can be used as a logging 
mechanism to track forgeries, similar to DMARC's feedback mechanism.
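
A rough sketch of how the first example above expands, handling only the 
three macro letters used there (l, o, i), only IPv4 client addresses, 
and none of the transformers or delimiters the SPF macro language also 
allows. The has_a_record callable is a hypothetical stand-in for the DNS 
lookup that the exists: mechanism performs:

```python
import re

MACRO = re.compile(r"%\{([loi])\}")


def expand(template, sender, client_ip):
    """Expand %{l}, %{o} and %{i} for a sender address and an IPv4 client."""
    local_part, _, domain = sender.partition("@")
    values = {"l": local_part, "o": domain, "i": client_ip}
    return MACRO.sub(lambda match: values[match.group(1)], template)


def exists_matches(template, sender, client_ip, has_a_record):
    """exists: matches when the expanded name resolves to an A record."""
    return has_a_record(expand(template, sender, client_ip))


if __name__ == "__main__":
    template = "%{l}.%{o}.%{i}._spf.webhost.example"
    print(expand(template, "dave@example.com", "192.0.2.25"))
```

The DNS server answering for _spf.webhost.example therefore sees the 
local part, the sender domain and the client IP in every query, which is 
what makes both the per-address policy and the logging idea possible.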


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: dnssec / dane

2014-08-15 Thread Dave Warren

On 2014-08-15 10:34, Robert Schetterer wrote:

yes this is what i awaited, any idea about DKIM ?


While spammers aren't doing it yet, DKIM can be done trivially easily as 
well for spammers that already register throwaway domains.


The private key can be shared the same way the list of throwaway domains 
is shared, or if you wanted to get creative, it could be stored in DNS in 
a way that the botnet knows how to look up the current DKIM private key 
to sign mail.


However, the DKIM world is a different place than when SPF was released, 
and I'm not sure that there's any push to whitelist DKIM signed messages 
(without further indicators, such as a domain-level reputation system) 
whereas there was a bit of a push from SPF-proponents to whitelist SPF 
approved messages.


DKIM seems to be much more closely aligned with reputation systems, 
which spammers are not currently able to game.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Opinions needed on what to consider spam

2014-08-13 Thread Dave Warren

On 2014-08-13 07:14, Matus UHLAR - fantomas wrote:


call an unsubscribe-hook _and_ train as spam.
Should be viable for both solicided an unsolicited mail.

Or, does anyone think that unsubscribing spam is counter-productive 
still?




In short, yes, it is unproductive. The quasi-legitimate stuff does go 
away, but the rest doesn't. This was confirmed just recently by Laura on 
Word To The Wise, who posted about this just 5 days ago:


https://wordtothewise.com/2014/08/unsubscribing-spam-part-3/

TL;DR: Spam load went up. Unsubscribing from each of 312 messages in one 
month resulted in 6 straight months of higher spam load.


I've had similar results on a Gmail spamtrap I've got (an address I've 
never used and don't use, but happens to be a common firstname.lastname 
combination, so it gets tons of typo'd mail seeding the trap)


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Opinions needed on what to consider spam

2014-08-13 Thread Dave Warren

On 2014-08-13 17:47, Steve Bergman wrote:


On 08/13/2014 01:06 PM, Dave Warren wrote:


In short, yes, it is unproductive. The quasi-legitimate stuff does go
away, but the rest doesn't. This was confirmed just recently by Laura on
Word To The Wise, who posted about this just 5 days ago:

https://wordtothewise.com/2014/08/unsubscribing-spam-part-3/



Quote from the linked material:

During the month of November, I unsubscribed from every commercial 
email that came into the account.


So mindlessly unsubscribing from viagra ads, with unsubscribe links, 
which have a load of random phrases at the bottom results in a a 
higher spam load later... if you are willing to accept data from an 
n=1 experiment with a low spam count.


What if you have a larger number of accounts, and direct intelligent 
users to unsubscribe from emails which seem reasonably legit to them?


I've performed similar experiments with my own spam-trap addresses over 
the years, with similar results. In my experience, it helps to keep a 
domain fresh in spammers' lists if they see periodic activity for 
domains that are entirely comprised of traps.


I seeded one trap from scratch simply by entering the address into the 
unsubscribe link/form of any "probably legitimate" spam that I received 
that had a form I could manipulate without revealing its true source. 
The address still receives a moderate volume of spam today, mostly from 
very disreputable sources that likely bought the list, but not 
exclusively. Again, an n=1 experiment, but again, it showed that even if 
you're selective, there's no such thing as limiting yourself to 
reputable spammers.


However, I don't find that it's the intelligent users who have massive 
spam problems to begin with. It's the ones who throw their email address 
into every field requesting it and pound "Next" like a monkey wanting a 
banana, ignoring pre-checked boxes along the way, that have the worst 
spam problem. In my experience, these are the types that don't do 
particularly well at knowing what to unsubscribe from, and what might be 
legitimate. You can explain the obvious viagra stuff, but their 
attention span is that of a gnat.


But as with all things, your mileage may vary.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Opinions needed on what to consider spam

2014-08-12 Thread Dave Warren

On 2014-08-12 15:11, Kris Deugau wrote:

So...  What do you do, when user A gets extremely mad to see
$legitimatenewsletter in their Inbox, and user B gets extremely mad to
see $legitimatenewsletter in their Spam folder?  If you only have a
global policy with no way to adjust on a per-user basis, you're going to
have someone mad at you either way.

Sooner or later, once you scale beyond a very small number of users, you
*will*  have a conflict between where any give pair of users expects to
see a particular message.

At that point you have to decide:  Is this something most people want in
their Inbox?  And then make exceptions on a per-user basis for those who
don't.


This is why god invented mailbox rules. Users can filter mail that isn't 
spam themselves as they see fit.


I won't create per-user rules at the spamfilter level, and have done 
very well with site-wide bayes (I don't find users are generally willing 
to train enough to make per-user bayes make sense).


However, I do expose whitelisting and blacklisting to users, as well as 
a range of filtering options that users can use at the server level for 
webmail and IMAP use. Plus, of course, users can create whatever disaster 
of client-side rules their client is capable of implementing (although 
we never recommend these, and do not support them, since users create a 
nightmare of crap that we aren't willing to invest the time into 
understanding and fixing).


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren



Re: Spam Assassin - does it work or not?

2014-08-07 Thread Dave Warren

On 2014-08-07 16:58, Andy wrote:

As for me going elsewhere, I think my best option at this point is to just
go back to the old way of using my own client software on my computer.
And find one with a strong filter of its own. Anyone have any good results
with Firefox Thunderbird?


Thunderbird has a decent bayesian implementation. I've heard good things 
about it, but I don't use it myself.


(I use Thunderbird, but not its spam filtering capabilities.)

--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Ready to throw in the towel on email providing...

2014-07-31 Thread Dave Warren

On 2014-07-31 07:39, David F. Skoll wrote:

Gmail's spam filtering is at least as good as stock SpamAssassin, and
honestly I think it's better.  You can achieve equal quality with SpamAssassin
if you're willing to work at it.  But it does take a lot of work.


This is the real difference with Gmail -- You don't have to work at it. 
Gmail controls the client and the server, their spam filtering learns 
based on how you interact with messages.


They also have some impressive bayes-type categorization which narrows 
messages into far more specific categories than just "spam" or "not 
spam", and it profiles what types of messages you are likely to *want* 
vs *not want* rather than what is technically spam.


Open messages frequently? Click on multiple links? Those messages, and 
messages similar to them, are less likely to be spam. Leave something in 
your mailbox for days/weeks and delete it without reading it? You might 
not miss it next time.


It's the level of personalization that makes Gmail appear to be so 
amazing to users, it has an understanding that one message might be spam 
to you, and not spam to someone else, and it uses your own history to 
make that decision on freshly received messages.


To me, it's not worth the price as a primary mailbox (privacy, security, 
control of data, terrible UI usability), but the filtering alone is 
impressive.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: colors TLDs in spam

2014-07-31 Thread Dave Warren

On 2014-07-31 16:34, Kevin A. McGrail wrote:
Theoretically, legitimate mail could use these TLDs?  is your theory 
that they are new enough that you are just blocking? 


They could, but like .info and .biz, there will be a handful of 
legitimate users and mostly just spam and junk.


To me, it's too soon to actually start blocking, but if certain TLDs 
have an uptick in spam use, it would be worth evaluating their 
usefulness in email in general, and potentially worth applying low-level 
scores.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: RBL effectiveness (was Re: Ready to throw in the towel on email providing...)

2014-07-30 Thread Dave Warren

On 2014-07-30 06:12, David F. Skoll wrote:

On Wed, 30 Jul 2014 09:34:30 +1000
Noel Butler noel.but...@ausics.net wrote:


This is the exact attitude as to why they wont get off their arses,
because people think they are too big to block. be damned if I care,
I have blocked yahoo and gmail before, and I dare say I'll have to
again sometime.

You don't have paying customers for whom you relay email, do you?


I know as a fact that I wouldn't have many left if I intentionally 
blocked mail they wanted, and the reality of it is that they want mail 
from users of freemail services.


The sheer number of complaints we get when mail to Yahoo is deferred is 
enough to give us a taste of what would happen if we did start 
interfering with the flow of legitimate mail between us and Yahoo, and 
Gmail is a much bigger player.


Luckily there are other tools available than blanket IP-level or 
provider-level blocks.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: RBL effectiveness (was Re: Ready to throw in the towel on email providing...)

2014-07-30 Thread Dave Warren

On 2014-07-30 16:06, Noel Butler wrote:
Certainly have done it on employers network before (a public ISP), and 
would have no problem doing it again if the need arose.
There is no such thing as 'too big' when it comes to handling the shit 
storm of spam that gets spewed out of some organisations, and I'll 
treat Gmail and the likes the same as a  ma 'n pa run outback country 
dialup ISP, there is no difference in my eyes, the fact that many see 
there is, is exactly why the likes of Gmail don't give a rats about 
spam complaints, if more operators started taking a stand, and 
directed their users bitching about blocked mail to Gmail etc, maybe 
Google etc, will pull their finger out of their ears (amongst other 
places) and not only listen, but act.


There is a difference: Gmail is a very major source of wanted, 
legitimate mail. Most ma 'n pa run outback country dialup ISPs are not.


A substantial percentage of our pre-sales inquiries come from Gmail 
addresses (even if the final purchase uses a legitimate corporate mailbox 
-- We're B2B, we don't sell to consumers), and a surprisingly large 
percentage of actual corporate addresses are hosted on Google Apps.


We literally can't afford to discard all mail from Gmail any more than 
we could afford to de-list ourselves from Google's search index, the hit 
to our business would be substantial.


If you don't care about interacting with prospective or current 
customers, you might be able to afford to block Gmail. At $DAYJOB, we can't.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Ready to throw in the towel on email providing...

2014-07-29 Thread Dave Warren

On 2014-07-29 12:20, Axb wrote:

On 07/29/2014 08:21 PM, Dave Pooser wrote:

On 7/29/14, 2:13 PM, Asai a...@globalchangemusic.org wrote:

My question regarding all of this interesting topic is, isn't there some
kind of RBL or something which can be subscribed to for a nominal fee
per year that can aid the small IT shop in maintaining spam filters?


We use the invaluement lists managed by Rob McEwen and have been very
happy with them-- been using them for 3-4 years. A lot of blocking that
doesn't overlap with Spamhaus, very few false positives, and those that do
occur are addressed quickly with a lot of transparency. Well worth the
cash, IMO.

(And no, I'm pretty sure I'm not getting a discount or anything for this.)

:-)


+1

I've also been using them for a few years and they do a good job



+1

The same. Happy user, no affiliation. Plus Rob is kinda awesome when you 
need something.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Ready to throw in the towel on email providing...

2014-07-29 Thread Dave Warren

On 2014-07-29 13:29, Ted Mittelstaedt wrote:



On 7/29/2014 12:44 PM, David F. Skoll wrote:

On Tue, 29 Jul 2014 12:37:00 -0700
Ted Mittelstaedtt...@ipinc.net  wrote:

Hotmail/MSN/Live/Microsoft/365/whatever-the-name-o-the-week-they-call-themselves 


all have SIGNIFICANTLY BETTER spam filtering than
Spamassassin+free/public RBLs+some judicious blacklists.


My experience is only with Gmail.  And I have to say: Gmail's spam
filtering is pretty darn good.  I almost never get spam on my gmail.com
account and I almost never get false-positives either.



Yet you don't use your gmail address to post here - so how is this a 
fair apples to apples comparison?  It isn't.  All you're saying is - an
email address at gmail that I hardly use, doesn't get a lot of spam - 
and an email address at roaringpenguin.com which I use all the time - 
gets more spam.


Therefore google's spam filter is better?


I own (but don't use) my firstname.lastname over there, and I get a 
metric boatload of misdirected junk. I've narrowed it down to a couple 
regular users who can't figure out their email address, one who was dumb 
enough to have my address printed on his business cards (I got a 
recipient of such a business card to send me a photo).


So while I don't personally use it everywhere, I have tons of people 
that do spread it far and wide. I get Amazon orders, RMA status from 
very legitimate companies, invitations to movie premieres, contact from 
wanna-be actors, restaurant reservations, etc. All legit, from companies 
that can't be bothered to verify user-supplied addresses. Plus I get the 
fallout as these companies sell their lists, subscribe me thinking I'm a 
customer, etc.


One day I got bored and started flagging this stuff as spam. Took just 
about a month to get it under control (read: routed to my spam folder)


If spam filtering were the only consideration, I'd switch to Gmail 
(well, Google Apps) in a heartbeat, and I'd figure out a way to make 
money putting my customers over on Google Apps too.


But it isn't the only consideration.

--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Ready to throw in the towel on email providing...

2014-07-28 Thread Dave Warren

On 2014-07-28 12:40, Daniel Reynolds wrote:
What you could do, is send a regular (weekly or monthly) spam report 
that tells your customers how many emails that were blocked vs the 
number of ham emails and other such statistics.


We quarantine mail that is between our target threshold and 10 points, 
above that we reject at the SMTP level. The quarantine report is sent 
daily.
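
As a rough sketch of that split (Python, not our actual implementation): the 10-point reject line is the one described above, while the 5.0 target threshold and the handling of exact boundary values are assumptions for illustration.

```python
def disposition(score: float,
                target_threshold: float = 5.0,   # assumed value, not stated above
                reject_threshold: float = 10.0) -> str:
    """Map a final spam score to deliver / quarantine / reject."""
    if score >= reject_threshold:
        return "reject"        # refused during the SMTP transaction
    if score >= target_threshold:
        return "quarantine"    # held and listed in the daily report
    return "deliver"
```

For example, `disposition(7.2)` lands in the quarantine band, while `disposition(12.0)` is rejected outright.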


This approach works well for two reasons: #1 is definitely marketing, and 
#2 is that it makes users feel like our spam filter isn't blocking 
anything they wanted.


Sure, if we did quarantine something a user wanted, they might want to 
release it. Last I looked, there's a single digit number of quarantine 
releases per month, despite the fact that it's a single un-authenticated 
click from the email in their mailbox.


I do really believe that it makes users feel happier about the handful 
of spam that does make it into their mailbox when they see even a 
percentage of the stuff that didn't make it -- And it's a small 
percentage, a vast majority is rejected outright.


(Also, take my numbers with a grain of salt: my spam filtering system 
consists of more than just SpamAssassin, and SA's score is added directly 
to various other rules for the final decision.)


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Ready to throw in the towel on email providing...

2014-07-28 Thread Dave Warren

On 2014-07-28 10:56, Mauricio Tavares wrote:

   I think there is also the tolerance level people have depending
on who they are dealing with. If they are dealing with a smaller/local
company, they expect 24/7 support and solutions for problems before
said problems are even conceived.


While that's sometimes true, as a very small service provider, a lot of 
my customers appreciate that they're speaking to a person and not a 
department, and it allows me to provide solutions to customers based 
on /their/ needs rather than their demographic's needs.


But as with so many other markets, most customers will opt for a bigger, 
generic level of solution rather than going for a small local business 
when it can save them a few dollars.


Google, Office 365 and Outlook.com are the Walmart of our industry, and 
that's okay, there's still room for competition, but you do have to work 
a lot harder at areas that the big guys can't compete with.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Alternate method to check for rule updates?

2014-07-24 Thread Dave Warren

On 2014-07-24 18:56, jdebert wrote:

On Fri, 25 Jul 2014 03:30:19 +0200
Karsten Bräckelmann guent...@rudersport.de wrote:


On Thu, 2014-07-24 at 17:32 -0700, jdebert wrote:

Sprint, which I use for net access is hijacking DNS.

What exactly do you mean hijacking? Routing NXDOMAIN to some sort of
advertising web-server? Or serious packet-sniffing tampering with
*any* DNS query crossing their hardware?

Yes. Also disabling dnssec, not responding to certain queries and
modifying responses and queries.

They like to call it transparent DNS proxying. But it's not
proxying and obviously not transparent.


If they're actually tampering with DNS requests made to other DNS 
servers, I'd give some serious thought to dropping them completely.


If that's not an option, perhaps get a $5 VPS at a network location 
that's reasonably near you, run a resolver there, and forward your own 
resolver to it over a port other than 53.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Dealing with a bad network device affecting DNS lookups

2014-07-15 Thread Dave Warren

On 2014-07-15 14:40, John Hardin wrote:

On Tue, 15 Jul 2014, Martin Hepworth wrote:


On Tuesday, 15 July 2014, Quanah Gibson-Mount qua...@zimbra.com wrote:


--On Wednesday, July 16, 2014 12:08 AM +0200 Axb axb.li...@gmail.com
wrote:

and what prevents you from running a recursor on those servers?

In a halfway well connected network, and Rackspace is VERY well connected,
DNS requests should take less than 1 sec.


The problem isn't the DNS requests.  The problem is the appliance that is
INTERCEPTING THE REQUESTS ON THE WAY OUT.


Run your own caching server on the sa box itself, makes a surprising
difference and something I always recommend


...which does not help if your upstream gateway is *intercepting and 
reprocessing* your DNS queries.




If that's the case, I'd get a new upstream, or log a ticket and get it 
resolved. DNS is a mission critical piece of infrastructure and not 
something to be tampered with lightly.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Dealing with a bad network device affecting DNS lookups

2014-07-15 Thread Dave Warren

On 2014-07-15 14:46, Quanah Gibson-Mount wrote:
I've been complaining about it since last October. Supposedly it will 
be fixed by the end of this month.  In the meantime, I still have 
floods of spam coming in that I'd like scored correctly. 


Are you saying that if you perform something like dig @8.8.8.8 
asdfalksdflk.example.com a, Rackspace intercepts the packet on port 53 
and does something with it?
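
If you want to script that check instead of eyeballing dig output, something like the following Python sketch would do. It assumes a random label under example.com does not exist, so an untouched reply should carry NXDOMAIN (RCODE 3). The function names are made up for illustration, and an unexpected RCODE is only a hint that something sits in the path, not proof.

```python
import secrets
import socket
import struct


def build_query(name: str, query_id: int) -> bytes:
    """Encode a minimal DNS query for an A record with recursion desired."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def response_code(packet: bytes) -> int:
    """Return the RCODE (low 4 bits of the flags word) of a DNS response."""
    (flags,) = struct.unpack("!H", packet[2:4])
    return flags & 0x000F


def unexpected_answer(server: str = "8.8.8.8", port: int = 53,
                      timeout: float = 3.0) -> bool:
    """True if a query for a random, presumably nonexistent name does not
    come back as NXDOMAIN. Raises socket.timeout if nothing answers."""
    name = f"{secrets.token_hex(8)}.example.com"
    query = build_query(name, secrets.randbelow(65536))
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(query, (server, port))
        reply, _ = sock.recvfrom(512)
    return response_code(reply) != 3  # 3 = NXDOMAIN
```

Running `unexpected_answer()` against a couple of different public resolvers and comparing the results gives a bit more confidence than a single query.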


And it's taken them since October to resolve it?

And you still pay for this service?

Or is there more going on than is immediately obvious here?

--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: remove

2014-07-03 Thread Dave Warren

On 2014-07-03 11:51, Brent Kennedy wrote:

remove


Try list-unsubscribe: mailto:users-unsubscr...@spamassassin.apache.org 
instead?


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: remove

2014-07-03 Thread Dave Warren

On 2014-07-03 17:54, Philip Prindeville wrote:


Which reminds me: is there a ‘reflector’ address anyone can mail to which will 
verify their SPF and DKIM records?

Or… no, I guess such a beast would be susceptible to reflector attacks from 
spoofed addresses… so it’s a dumb question.

Unless it cached a response and you had to click on a link to see the results 
instead…


http://www.mail-tester.com/ does it. It provides you an email address, 
you send an email and then click the button on the website to see the 
results.


It's aimed at bulk mail, but it works for individually sent messages as 
well.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: Why do I get both URIBL_DBL_SPAM and URIBL_BLOCKED?

2014-06-06 Thread Dave Warren

On 2014-06-05 21:48, zespri wrote:

As I read it, it means that non-forwarding dnsmasq is simply nonsensical.
What am I missing?


Yeah... I don't believe dnsmasq would be a good choice, unbound or BIND 
would be better choices.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: SPAM from a registrar

2014-05-19 Thread Dave Warren

On 2014-05-19 19:39, Ian Zimmerman wrote:

Ok, I installed a local bind instance on Saturday.  But it is not
helping: out of about 100 spams I got today (counting both those that
got flagged and those that didn't, but not counting the horrible spams
with score > 15 that go directly to /dev/null), _none_ scored on
URIBL_RHS_DOB.  And I know for a fact that most of them contain fresh
domains :-(  Btw, all those domains are registered with enom.  Wth?



Have you checked the domains to see if they're listed on DOB? Or can you 
at least verify that test domains can be queried on DOB?
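
A quick way to test by hand is to query the list the same way the rule does: the domain with the list's zone appended. Here is a small Python sketch. The zone is a parameter because I'm not going to quote the DOB zone name from memory (take it from your local rules files), and `dnsbl.example.net` in the usage note is a placeholder.

```python
import socket
from typing import Optional


def rhsbl_query_name(domain: str, zone: str) -> str:
    """Build the name to look up: <domain>.<zone>, normalized."""
    return f"{domain.strip('.').lower()}.{zone.strip('.')}"


def rhsbl_lookup(domain: str, zone: str) -> Optional[str]:
    """Return the A record the list answers with, or None if the lookup
    fails. None covers both "not listed" and "query broken"."""
    try:
        return socket.gethostbyname(rhsbl_query_name(domain, zone))
    except socket.gaierror:
        return None
```

Usage would look like `rhsbl_lookup("some-fresh-domain.example", "dnsbl.example.net")`. If the list's documented test entry also comes back as None, the problem is the lookup path rather than the listings.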


Did you leave your local BIND instance acting as a full resolver, or did 
you set forwarders? If you set forwarders, removing the forwarder 
configuration should help.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: RCVD_IN_IADB_VOUCHED pushed spam into false negatives

2014-04-16 Thread Dave Warren

  
  
On 2014-04-15 06:42, Kevin A. McGrail wrote:

On 4/14/2014 7:34 PM, Dave Warren wrote:

On 2014-04-13 12:22, Dave Pooser wrote:

And looking at the IADB web page, what I see is them bragging about how
little checking they do. What I don't see on their Web site is any way to
report spam to them.

I've gone ahead and set all IADB scores to 0 locally, but I'm curious if
this strikes anybody else as a questionable default for stock SA?


Are we talking about http://www.isipp.com/iadb.php or something else?

Based on
https://wiki.apache.org/spamassassin/DnsBlocklistsInclusionPolicy
I'm not sure that they qualify anyway, specifically:


"Must not have intent to profit, including optional or required
payments, in order to remove, add, expedite or otherwise
non-objectively handle entries to their lists."


http://www.isipp.com/suretymail-faq.php#pricing indicates that
pricing starts at $10.00 per month, and I cannot find any way to
add or remove an IP without paying a fee. Am I misreading the
rules, or are they out of compliance to be included at all?


No, I think you are reading it correctly. It was added prior to
the policy. If it has a lot of problems, we would have to
consider disabling it by default.


I just thought I'd mention the rules since it seemed to be outside
them, but if it's grandfathered in, that's not unfair.

I don't know if there are problems with it or not, I don't use
SpamAssassin's DNS lookups in my infrastructure, my mail server
implements its own DNSBL and URIBL lookups before SA is called.

--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren


  




Re: RCVD_IN_IADB_VOUCHED pushed spam into false negatives

2014-04-14 Thread Dave Warren

On 2014-04-13 12:22, Dave Pooser wrote:

And looking at the IADB web page, what I see is them bragging about how
little checking they do. What I don't see on their Web site is any way to
report spam to them.

I've gone ahead and set all IADB scores to 0 locally, but I'm curious if
this strikes anybody else as a questionable default for stock SA?


Are we talking about http://www.isipp.com/iadb.php or something else?

Based on 
https://wiki.apache.org/spamassassin/DnsBlocklistsInclusionPolicy I'm 
not sure that they qualify anyway, specifically:


Must not have intent to profit, including optional or required 
payments, in order to remove, add, expedite or otherwise non-objectively 
handle entries to their lists.


http://www.isipp.com/suretymail-faq.php#pricing indicates that pricing 
starts at $10.00 per month, and I cannot find any way to add or remove 
an IP without paying a fee. Am I misreading the rules, or are they out 
of compliance to be included at all?


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: sa-update (nightly mass-check)

2014-04-08 Thread Dave Warren

On 2014-04-08 03:56, Kevin A. McGrail wrote:

On 4/8/2014 1:16 AM, Dave Warren wrote:

On 2014-04-07 19:23, Thomas Harold wrote:
NOTE: New masscheck contributors are now being accepted since about 
2012-08-09.

Is that supposed to say "now being" or "not being"?


I'm assuming "now being" since there are regular mentions of a need 
for ham corpus. But that's just a hopeful guess, given that I've put 
some resources into setting up appropriate systems and preparing some 
messages to start the process.



Yes, we can make accounts again.  Did you send a request?


Indeed, I sent a message to private@ as described on the wiki.



However, the ham is not starved.  We have been publishing rules. Not 
sure where the disconnect on the firing of the script is coming from.


Understood. However, over the last couple years, there have been 
multiple times that this was mentioned (whether it was actually true or 
not), which is what motivated me to attempt to contribute.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: sa-update (nightly mass-check)

2014-04-08 Thread Dave Warren

On 2014-04-08 11:17, Kevin A. McGrail wrote:

On 4/8/2014 2:15 PM, Dave Warren wrote:

On 2014-04-08 03:56, Kevin A. McGrail wrote:

On 4/8/2014 1:16 AM, Dave Warren wrote:

On 2014-04-07 19:23, Thomas Harold wrote:
NOTE: New masscheck contributors are now being accepted since 
about 2012-08-09.

Is that supposed to say "now being" or "not being"?


I'm assuming "now being" since there are regular mentions of a need 
for ham corpus. But that's just a hopeful guess, given that I've 
put some resources into setting up appropriate systems and 
preparing some messages to start the process.



Yes, we can make accounts again.  Did you send a request?


Indeed, I sent a message to private@ as described on the wiki.
OK, cc me and send again please.  it might not have been moderated 
through.


Sent and CC'd, thanks!

--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: sa-update

2014-04-07 Thread Dave Warren

On 2014-04-06 17:21, John Hardin wrote:

On Sun, 6 Apr 2014, Dave Warren wrote:

Is older ham useful? It specifically mentions that older spam isn't 
useful, and why, but I'm thinking older ham is probably useful since 
old mail clients and legitimately sent mail never dies. But I could 
filter based on date.


There's some debate about that. :)

I personally agree with you. Others disagree.


I've been giving it some thought and I think that perhaps limiting it to 
the last few months will make it easier to get a sane set of 
TRUSTED_NETWORKS and INTERNAL_NETWORKS; I've got mail going back to 
~2002 but no real recollection of how things were set up or named prior 
to 2007 or so.


Initially I'll limit it to mail within the last couple of months, but 
perhaps expand that up to 24-36 months for non-spam and 6 months for 
spam, is that sane/reasonable?
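
For what it's worth, the date filter I have in mind is roughly the Python sketch below. The 36-month and 6-month windows are the upper figures mentioned above, a month is approximated as 30 days, and dating a message by its Date header is an assumption (the time of receipt might be a better key).

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime

# Windows from the figures above; a month is approximated as 30 days.
MAX_AGE = {
    "ham": timedelta(days=36 * 30),
    "spam": timedelta(days=6 * 30),
}


def within_window(date_header: str, kind: str, now: datetime) -> bool:
    """True if a message dated by its Date header is recent enough."""
    sent = parsedate_to_datetime(date_header)
    if sent.tzinfo is None:
        # Headers with a -0000 zone parse as naive; treat them as UTC.
        sent = sent.replace(tzinfo=timezone.utc)
    return now - sent <= MAX_AGE[kind]
```

Passing `now` in explicitly keeps a whole corpus run consistent and makes the filter easy to re-run later with the same cutoff.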




Yes, ham-only masscheck submissions would be very welcome.


Perfect, glad to hear it. At this point I've built a dedicated box to 
run the masscheck scripts, so now it's just a matter of putting together 
a corpus and doing some sanity checking and testing.


My current thought is to take user-fed spam and non-spam folders and 
place copies of messages into a staging path which will then be reviewed 
before being added to the corpus for learning. Hopefully I'll be ready 
to go live within a day or two.



--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: sa-update

2014-04-07 Thread Dave Warren

On 2014-04-06 20:25, jdebert wrote:

On Sat, 5 Apr 2014 09:14:56 -0700 (PDT)
John Hardin jhar...@impsec.org wrote:


On Sat, 5 Apr 2014, Amir Reza Rahbaran wrote:


I want to know how long it takes custom signatures updated by
sa-update.

Daily, if the corpora are sufficient for masscheck scoring to run.

At the moment the masscheck corpus is ham-starved. There's not quite
enough ham available for reliable scores to be generated and
published.

This explains why SA is not catching any spam here? After updating
to updates 1584283 and then 1585021, all spam is being passed. Nothing
else was done. No other changes made.


No -- This issue just means that rule updates may not get created, but 
the last valid set of rules will still be available to sa-update.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: sa-update (nightly mass-check)

2014-04-07 Thread Dave Warren

On 2014-04-07 19:23, Thomas Harold wrote:

NOTE: New masscheck contributors are now being accepted since about 2012-08-09.

Is that supposed to say "now being" or "not being"?


I'm assuming "now being" since there are regular mentions of a need for 
ham corpus. But that's just a hopeful guess, given that I've put some 
resources into setting up appropriate systems and preparing some 
messages to start the process.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: sa-update

2014-04-06 Thread Dave Warren

On 2014-04-05 09:14, John Hardin wrote:

On Sat, 5 Apr 2014, Amir Reza Rahbaran wrote:


I want to know how long it takes custom signatures updated by sa-update.


Daily, if the corpora are sufficient for masscheck scoring to run.

At the moment the masscheck corpus is ham-starved. There's not quite 
enough ham available for reliable scores to be generated and published.


Once again, participation as a mass-checker, especially if you can 
provide a non-English ham corpus, is solicited. If you have access to 
thousands of reliably-categorized messages and can set up a box to run 
SpamAssassin to scan them to test the performance of the base rules, 
please consider becoming a masscheck contributor. The content of 
private messages is not exposed by this process, only the rule hits 
are public.


If you can do this, see the wiki for the process and contact Kevin 
McGrail for upload credentials. Thanks!


I've been idly debating figuring out how to contribute, but having read 
the wiki articles, I have a few questions:


Is older ham useful? It specifically mentions that older spam isn't 
useful, and why, but I'm thinking older ham is probably useful since old 
mail clients and legitimately sent mail never dies. But I could filter 
based on date.


Is Sent folder mail of any use? I suspect not, since there's not 
necessarily a Received header yet (although there might be, it depends 
on how the user sent the message), so direct-to-MX and similar rules 
will skew.


Is a ham-only corpus submission useful? Our ham is well cleaned, but we 
don't archive spam on an ongoing basis, and users primarily just delete 
spam. But most of our users archive ham and retain it, so depending on 
what the results look like, it might be a useful data source.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: New expensive Regexps

2014-02-06 Thread Dave Warren

On 2014-02-06 17:17, John Hardin wrote:

On Thu, 6 Feb 2014, Kevin A. McGrail wrote:

I've discussed it with Alex a bit but one of my next ideas for the 
Rules QA process is the following:


- we measure and report on metrics for the rules that are promoted 
such as rank (existing), computational expense, time spent on rule.


I assume meta rules would combine the expense of their components?

Sounds interesting!



How about if one or more components were called by more than one 
meta-rule? It's perhaps not entirely fair to divide it evenly, since 
that might imply that removing the metarule would kill off that CPU usage.


Perhaps documenting the cost of the individual components, summing them, 
with a flag to indicate that some or all of the components are shared? 
That sounds overly complex, but it at least gives the enterprising rule 
author or server administrator the ability to understand what is happening.
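
Roughly what I'm picturing, as a Python sketch with made-up rule names and costs. This is not how the QA tooling works today; `meta_cost_report` is a hypothetical helper.

```python
def meta_cost_report(component_cost, metas):
    """For each meta rule, sum its component costs and list the components
    that are also used by at least one other meta rule."""
    usage = {}
    for parts in metas.values():
        for part in set(parts):
            usage[part] = usage.get(part, 0) + 1

    report = {}
    for name, parts in metas.items():
        unique_parts = set(parts)
        report[name] = {
            "total": sum(component_cost[p] for p in unique_parts),
            "shared": sorted(p for p in unique_parts if usage[p] > 1),
        }
    return report
```

With costs `{"__A": 4, "__B": 1, "__C": 2}` and two metas built from `["__A", "__B"]` and `["__A", "__C"]`, both metas would be flagged as sharing `__A`, which tells the reader not to take either total at face value.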


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: New expensive Regexps

2014-02-06 Thread Dave Warren


 On Feb 6, 2014, at 18:04, Kevin A. McGrail kmcgr...@pccc.com wrote:
 
 On 2/6/2014 8:32 PM, Dave Warren wrote:
 On 2014-02-06 17:17, John Hardin wrote:
 On Thu, 6 Feb 2014, Kevin A. McGrail wrote:
 
 I've discussed it with Alex a bit but one of my next ideas for the Rules 
 QA process is the following:
 
 - we measure and report on metrics for the rules that are promoted such as 
 rank (existing), computational expense, time spent on rule.
 
 I assume meta rules would combine the expense of their components?
 
 Sounds interesting!
 
 How about if one or more components were called by more than one 
 meta-rule? It's perhaps not entirely fair to divide it evenly, since that 
 might imply that removing the metarule would kill off that CPU usage.
 Without triple checking the code, my 99.9% belief is Rules are cached.  
 Calling them multiple times does not trigger a re-check.

I believe so too, which is why this matters. If they were re-evaluated, you 
could just sum up a meta rule and not care. 

Doing just a sum of a meta rule is misleading because the savings from 
disabling a meta rule may only be a fraction if all of the underlying component 
rules are being called anyway. 
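
To put numbers on that, here is a small Python sketch with hypothetical names and costs. It assumes each component is evaluated once and cached, and that components only run because some meta rule needs them.

```python
def savings_if_disabled(component_cost, metas, target):
    """Cost actually saved by disabling one meta rule: only components
    that no other meta rule still needs stop being evaluated."""
    still_needed = set()
    for name, parts in metas.items():
        if name != target:
            still_needed.update(parts)
    freed = set(metas[target]) - still_needed
    return sum(component_cost[p] for p in freed)
```

If `META_ONE` uses components costing 4 and 1, and the one costing 4 is also used by `META_TWO`, the naive sum for `META_ONE` is 5 but disabling it only saves 1.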



Re: Who wants to trade data?

2014-02-06 Thread Dave Warren

On 2014-02-06 19:30, Noel Butler wrote:
so, how about EVERYONE with list of IP's who try compromise or abuse 
systems, start offering them for sale on here, then lets see what you 
think.


Maybe you were reading a different mailing list than I am, but the 
message I received didn't have any commercial sales offer, it offered up 
the link freely (and indicated he might be interested in receiving 
similar data, hence, a trade)


Given that it's loosely on-topic (anti-email-abuse, anti-spam), 
SpamAssassin's mailing list doesn't seem to have a "Thou shall only 
speaketh regarding SpamAssassin" policy, and non-commercial (free access 
to the data, without any preconditions), I'm having trouble seeing the 
problem.


I'd also like to say that I think it's awesome when commercial vendors 
give back to the community, in large or small ways.


But that's just me.

--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren




Re: What is the view re- SPF_FAIL these days?

2014-01-24 Thread Dave Warren

On 2014-01-15 09:36, hospice admin wrote:

Hi Team,

I was wondering what folks were doing with SPF_FAIL ,   TO_EQ_FM_SPF_FAIL and   
TO_EQ_FM_DOM_SPF_FAIL   these days?

I personally have never seen an FP with any, but understand from the reading 
I've done that some people do.

My approach has always been to combine with DCC/Pyzor/Razor hits in a Meta 
rule, but we've recently started seeing   mail just squeak under the fence 
using this approach ... particularly some of the 'nicer' Bank Spam. The 
temptation is to add Bayes to the Meta. Is this a bad idea, or does anyone have 
any better suggestions?

We're running SA version 3.3.2. Sadly, upgrading to 3.4 isn't an option at this 
stage.


I forgot about this message; I had a partial response drafted, and 
Thomas's reply reminded me.


Some time ago I flipped SPF:FAIL to automatically quarantine rather than 
reject messages to allow me to perform more of a review of the rejected 
messages, and invariably they're either legitimate messages by someone 
who has an incomplete or out of date SPF record, or they're already 
scored as spam (I do apply a slight score to SPF failures, and a smaller 
one to soft failures)


Most of the failures were cases where a small company listed their 
primary SMTP, but had messages going out on their behalf from a third 
party or directly from their web server or similar, usually receipts, 
invoices, or other automation that didn't use their primary SMTP 
infrastructure.


When I initially performed this test and reviewed the results, I not 
only released the legitimate messages to users, but I also reached out 
to each and every sender; most failed to respond at all (probably 
80%-85%), of those that did, half had a "We sent the email, it's your 
server's fault if you didn't get it" attitude and the other half adjusted 
their records. One spotted us a free license of their software for our 
trouble, which was nice of them.


At this point, I apply a small score (and if I recall correctly, I kick 
off mandatory greylisting -- I don't greylist all mail, only mail with 
failing DNS, SPF, or where something is otherwise suspicious), and I 
wouldn't recommend blocking outright simply due to the fact that while 
SPF fails do add some value to spam blocking, it wasn't particularly 
significant.
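
In pseudo-real terms, the policy looks something like this Python sketch. The score values are invented for illustration, and having only a hard fail (not a softfail) trigger greylisting is an assumption, since I'm going from memory on my own setup.

```python
from typing import NamedTuple


class SpfAction(NamedTuple):
    score: float     # points added to the message's spam score
    greylist: bool   # whether to force greylisting for this message


# Hypothetical values: a slight score for fail, a smaller one for softfail.
SPF_SCORES = {"fail": 1.0, "softfail": 0.5}


def spf_action(spf_result: str, dns_ok: bool = True,
               suspicious: bool = False) -> SpfAction:
    """Score SPF failures lightly and greylist selectively, never reject."""
    score = SPF_SCORES.get(spf_result, 0.0)
    greylist = spf_result == "fail" or not dns_ok or suspicious
    return SpfAction(score, greylist)
```

The point of the shape is that SPF never produces a rejection by itself; it only nudges the score and decides whether the sender has to retry.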


All of this being said, my opinion when I started was confirmed by my 
testing, so there might be a bias involved. I've never been a fan of SPF 
for rejecting mail; to me, the power of SPF and DKIM is in accepting 
and whitelisting legitimate mail. It's a lot easier to whitelist 
"Anything from example.com where (SPF:PASS or DKIM:PASS)" than it is to 
figure out the IP ranges example.com uses today and tomorrow. At this 
point, I all but refuse to whitelist by IP, or by domain unless there is 
some authentication method.
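
The whitelist test itself is tiny. A Python sketch follows, with the simplifying assumption that a single sender domain is being checked (in practice SPF covers the envelope sender and DKIM covers the signing domain, which may differ from each other); `is_whitelisted` and its parameters are hypothetical names.

```python
def is_whitelisted(sender_domain: str, spf_result: str,
                   dkim_pass_domains, whitelist) -> bool:
    """Whitelist only when the domain is listed AND authenticated."""
    domain = sender_domain.lower()
    if domain not in whitelist:
        return False
    dkim_ok = domain in {d.lower() for d in dkim_pass_domains}
    return spf_result == "pass" or dkim_ok
```

So `is_whitelisted("example.com", "fail", ["example.com"], {"example.com"})` still passes on the strength of DKIM, while a listed domain with neither SPF nor DKIM passing gets no special treatment.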


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren

Are you tired of having your hands cut off by snowblowers?




Re: Do you want to buy this domain name spam

2014-01-15 Thread Dave Warren

On 2014-01-15 22:51, Marc Perkel wrote:
I'm seeing a lot of Do you want to buy this domain name spam lately. 
Is it just me or is anyone else seeing this?




It's not just you. Mostly to addresses harvested from WHOIS, at least 
that I've noticed.


--
Dave Warren
http://www.hireahit.com/
http://ca.linkedin.com/in/davejwarren

Oh well, I guess this is just going to be one of those lifetimes.



