Re: Clearly bogus false positives -- on abuse contact point, no less
From: Philip Prindeville [EMAIL PROTECTED]
Date: Sat, 16 Feb 2008 18:44:55 -0800
To: Spamassassin Mailing List users@spamassassin.apache.org
Subject: Clearly bogus false positives -- on abuse contact point, no less

Hmmm. I think we need a BL for reporting ISP's that are clueless as to run filtering on their abuse mailbox (or the mailbox that's listed for their ARIN/RIPE AbuseEmail attributes).

There is, but due to complaints, it was taken out of SA. (Some thought that just because you ignore abuse@ or postmaster@ email wasn't enough of a sign that you were a spammer. And it is 'Spam' assassin, not 'lame-admin' assassin ;-)

See www.rfc-ignorant.org. Abuse, postmaster, whois, dsn, bogusmx (every MessageLabs client is listed in bogusmx due to the phusked up way MessageLabs assigns MX records).

Several LARGE ISP's are listed in the abuse@ list due to their bouncing of abuse complaints, or requiring you to fill out forms on their web site.

Maybe you don't want to 100% block them, but you can look for the old RFCI rules and score them higher.

--
Michael Scheidell, CTO | SECNAP Network Security
Winner 2008 Network Products Guide Hot Companies
Re: Clearly bogus false positives -- on abuse contact point, no less
Philip Prindeville wrote:

Karsten Bräckelmann wrote:

Please, do not paste a gigantic blob of multipart MIME messages. Put it up somewhere, raw, and simply provide a link.

On Sat, 2008-02-16 at 18:44 -0800, Philip Prindeville wrote:

Anyway, I have no idea why I'm seeing some of these scores. URL matches when there aren't even URL's in my message?

There are. Self-inflicted. The ones in square brackets with the leading 550 code, which you seem to keep sending back and forth. :)

And just *mentioning* the domain name, without any sort of valid URL (ftp: or http: or anything of the sort) is going to match it as a URL? That's highly bogus. A domain name alone does not a URL make.

You tell that to most Windows-based clients, which will automatically make clickable URLs out of things like www.google.com in text sections.

snip

Oh, and DNS_FROM_OPENWHOIS probably is http://open-whois.org/, which gives you a hint about what it actually is. The hit itself pretty much mentions this...

Yeah, I read this. And I don't get that either. How does having your domain be anonymous (for whatever reason... maybe you're a small company operating below the radar) make your email any more likely to be spam?

Decidedly so. The people with the strongest reason to hide their contact information are the spammers, and other shady businesses. That's not to say there aren't some legitimate folks that use this kind of anonymization. However, the domains-by-proxy model is a questionable practice, as it violates the spirit of the whois requirements. Also, many of them violate the letter of the requirements, such as the phone issue noted on the open-whois main page. (i.e. anyone registered using securewhois is not correctly registered, per ICANN requirements for whois.)

TVD_STOCK1? There's no mention of stock anywhere in the message.

Not sure, you might want to try running it with debugging on.
The debug message from the code would be:

dbg("eval: stock info hit: $1");

That should tell you what exact substring matched the stock info code.

From a quick glimpse of the code, it appears to identify common words used in stock (as in stock exchange, pump-n-dump penny stocks) spam. It does not search for the word stock. Just as pretty much no rule in SA ever searches for single words only...

Again, I didn't see anything that should legitimately be causing this rule to fire, and certainly not with such a high score for such an unreliable rule. Why am I seeing all of these bogus matches?

From what I can tell, and what you sent us, they don't appear to be bogus.

Depends on whether you equate bare domains with URL's, I suppose.

If MUA's equate them with URLs, spammers will use this, and SpamAssassin will use it.

I looked on the wiki for some of these, but couldn't find descriptions. What should I do? Just block their domain? I don't want to deal with their misconfiguration issues.

Apparently you already exchanged messages? Try not sending the offensive mail in question. Put it up somewhere as reference, if need be. Hmm, sounds familiar... ;)

guenther

No, I sent them back the offending email, initially. Which they marked as spam (bloody brilliant, of course it's spam, otherwise I wouldn't be bothering to report it; what else do they expect to come to their Abuse mailbox, anyway???). So I sent the SA scores back to them, and that's the part that I pasted previously. How do you report Spam to such a site that's going to block your Spam reports for being... well, Spam!

Well, it's stupid, and probably an RFC violation, to perform such filtering on your abuse box. So, I'm not saying the domain in question isn't behaving foolishly. You might want to point this out to them, and suggest they whitelist their abuse address. At the very least, ask them if they have an alternate reporting address that isn't filtered.
Re: Clearly bogus false positives -- on abuse contact point, no less
Michael Scheidell wrote: From: Philip Prindeville [EMAIL PROTECTED] Date: Sat, 16 Feb 2008 18:44:55 -0800 To: Spamassassin Mailing List users@spamassassin.apache.org Subject: Clearly bogus false positives -- on abuse contact point, no less Hmmm. I think we need a BL for reporting ISP's that are clueless as to run filtering on their abuse mailbox (or the mailbox that's listed for their ARIN/RIPE AbuseEmail attributes). There is, but due to complaints, it was taken out of SA. (some thought that just because you ignore abuse@ or postmaster@ email wasn't enough of a sign that you were a spammer. And it is 'Spam' assassin, not 'lame-admin' assassin ;-) Actually, it was removed due to lack of accuracy. If a rule can't maintain a S/O of over 0.80, or under 0.20, it's removed.
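As a rough illustration of the S/O cutoff mentioned above: the sketch below assumes S/O is the share of a rule's hits that land on spam, with hit rates taken per corpus so that unequal corpus sizes do not skew the figure. The function names and the sample counts are made up for illustration; only the 0.80 and 0.20 thresholds come from the message.

```python
def s_over_o(spam_hits, spam_total, ham_hits, ham_total):
    """Share of a rule's hits that fall on spam (assumed definition of S/O).

    Hit counts are turned into per-corpus rates first, so a ham corpus that
    is much larger than the spam corpus does not drag the ratio down.
    """
    spam_rate = spam_hits / spam_total
    ham_rate = ham_hits / ham_total
    if spam_rate + ham_rate == 0:
        raise ValueError("rule never fired; S/O is undefined")
    return spam_rate / (spam_rate + ham_rate)


def worth_keeping(so, high=0.80, low=0.20):
    """A rule is useful if it clearly indicates spam (high) or ham (low)."""
    return so > high or so < low


if __name__ == "__main__":
    # Hypothetical counts: the rule hits 9% of spam and 4% of ham.
    so = s_over_o(900, 10_000, 400, 10_000)
    print(f"S/O = {so:.3f}, keep = {worth_keeping(so)}")
```

Under this reading, a rule that fires on mail from domains ignoring abuse@ would be dropped if a sizeable share of its hits were legitimate mail, whatever one thinks of the admins involved.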
SVN notifications killing spamassassin
I sometimes get SVN notifications that contain lists of files and their status. The filenames will often get picked up by the URI matching algorithm, each of which ends up being processed through numerous lookups (URICOUNTRY, my LDAP filter, etc). Sometimes I get very large messages with hundreds of file lists, which in turn causes spamassassin to go into never-never land while it thinks about the hundreds of URI matches.

For example,

Afpo/reports/perl/nagios_notifications1.pl.bak
Afoo/reports/perl/nagios_outages1.pl
Afoo/reports/perl/GWIR.pm

nagios_outages1.pl will be determined as a URI for .pl domain and GWIR.pm will be determined as a URI for .pm domain, and so forth.

The only way to get these messages through is to disable spamassassin... I've updated to 3.2.4 just now and it still has the same problem.

I'm guessing the URI analyzer needs to be smarter.

--
Eric A. Hall http://www.ehsco.com/
Internet Core Protocols http://www.oreilly.com/catalog/coreprot/
Re: Clearly bogus false positives -- on abuse contact point, no less
Matt Kettler wrote:

Philip Prindeville wrote:

Karsten Bräckelmann wrote:

Please, do not paste a gigantic blob of multipart MIME messages. Put it up somewhere, raw, and simply provide a link.

On Sat, 2008-02-16 at 18:44 -0800, Philip Prindeville wrote:

Anyway, I have no idea why I'm seeing some of these scores. URL matches when there aren't even URL's in my message?

There are. Self-inflicted. The ones in square brackets with the leading 550 code, which you seem to keep sending back and forth. :)

And just *mentioning* the domain name, without any sort of valid URL (ftp: or http: or anything of the sort) is going to match it as a URL? That's highly bogus. A domain name alone does not a URL make.

You tell that to most Windows-based clients, which will automatically make clickable URLs out of things like www.google.com in text sections.

snip

Oh, and DNS_FROM_OPENWHOIS probably is http://open-whois.org/, which gives you a hint about what it actually is. The hit itself pretty much mentions this...

Yeah, I read this. And I don't get that either. How does having your domain be anonymous (for whatever reason... maybe you're a small company operating below the radar) make your email any more likely to be spam?

Decidedly so. The people with the strongest reason to hide their contact information are the spammers, and other shady businesses. That's not to say there aren't some legitimate folks that use this kind of anonymization. However, the domains-by-proxy model is a questionable practice, as it violates the spirit of the whois requirements. Also, many of them violate the letter of the requirements, such as the phone issue noted on the open-whois main page. (i.e. anyone registered using securewhois is not correctly registered, per ICANN requirements for whois.)

Well, what's ironic here is this: I go to the open-whois web-site, and read their blurb:

What do you have against privacy? In a word: nothing. This is not about privacy, but about accountability.
The Internet is built upon cooperation and accountability, anything which undermines accountability is a bad thing. The usability of the WHOIS database is seriously undermined by anonymous domains.

Ah... But filtering your spam reports so no one can ever report spam to you... that's a lot more accountable, clearly. :-)

TVD_STOCK1? There's no mention of stock anywhere in the message.

Not sure, you might want to try running it with debugging on. The debug message from the code would be:

dbg("eval: stock info hit: $1");

That should tell you what exact substring matched the stock info code.

From a quick glimpse of the code, it appears to identify common words used in stock (as in stock exchange, pump-n-dump penny stocks) spam. It does not search for the word stock. Just as pretty much no rule in SA ever searches for single words only...

Again, I didn't see anything that should legitimately be causing this rule to fire, and certainly not with such a high score for such an unreliable rule. Why am I seeing all of these bogus matches?

From what I can tell, and what you sent us, they don't appear to be bogus.

Depends on whether you equate bare domains with URL's, I suppose.

If MUA's equate them with URLs, spammers will use this, and SpamAssassin will use it.

There is only so much braindeath in UA's that you can bend the rules for. Clearly, this involves breaking them.

I looked on the wiki for some of these, but couldn't find descriptions. What should I do? Just block their domain? I don't want to deal with their misconfiguration issues.

Apparently you already exchanged messages? Try not sending the offensive mail in question. Put it up somewhere as reference, if need be. Hmm, sounds familiar... ;)

guenther

No, I sent them back the offending email, initially. Which they marked as spam (bloody brilliant, of course it's spam, otherwise I wouldn't be bothering to report it; what else do they expect to come to their Abuse mailbox, anyway???).
So I sent the SA scores back to them, and that's the part that I pasted previously. How do you report Spam to such a site that's going to block your Spam reports for being... well, Spam!

Well, it's stupid, and probably an RFC violation, to perform such filtering on your abuse box. So, I'm not saying the domain in question isn't behaving foolishly. You might want to point this out to them, and suggest they whitelist their abuse address. At the very least, ask them if they have an alternate reporting address that isn't filtered.

I'll give it another try. If not, their CIDR range and domain name will go into my blacklist. I don't want to open myself up to them if I can't reasonably expect them to respond to spam issues when/if they occur (again).

-Philip
Re: SVN notifications killing spamassassin
Eric A. Hall wrote: I sometimes get SVN notifications that contain lists of files and their status. The filenames will often get picked up by the URI matching algorithm, each of which end up being processed through numerous lookups (URICOUNTRY, my LDAP filter, etc). Sometimes I get very large messages with hundreds of file lists, which in turn causes spamassassin to go into never-never land while it thinks about the hundreds of URI matches. For example, Afpo/reports/perl/nagios_notifications1.pl.bak Afoo/reports/perl/nagios_outages1.pl Afoo/reports/perl/GWIR.pm nagios_outages1.pl will be determined as a URI for .pl domain and GWIR.pm will be determined as a URI for .pm domain, and so forth. The only way to get these messages through is to disable spamassassin... I've updated to 3.2.4 just now and it still has the same problem I'm guessing the URI analyzer needs to be smarter. That's strangely appropriate to the issue I had with calthurs.com. It would be nice if this checker had an option to enforce checking only of well-formed URL's (i.e. not anything that might conceivably be munged into a URL by the most ignorant of UA's)... something requiring a protocol name (ftp:, http:, tftp:, etc.), a domain name, and a path name (even if it's just slash). Or at the very least, to score complete URL's higher than just domain names alone. -Philip
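To make the difference between the two approaches concrete, here is a small sketch. It is not SpamAssassin's actual URI parser: both patterns are hypothetical and only mimic the behaviour described in this thread. The loose one treats any dotted token ending in a two-letter suffix as a possible host name; the strict one follows the suggestion above and requires a scheme, a host, and a path.

```python
import re

# Hypothetical loose matcher: a dotted token whose last label is two letters
# (so it could be read as a country-code domain such as .pl or .pm).
LOOSE = re.compile(r"\b[\w-]+(?:\.[\w-]+)*\.[a-z]{2}\b", re.IGNORECASE)

# Hypothetical strict matcher: scheme, host name, and at least a "/" path.
STRICT = re.compile(r"\b(?:https?|ftp|tftp)://[\w.-]+/\S*", re.IGNORECASE)

# File list copied from the SVN notification example in this thread.
SVN_LINES = """\
Afpo/reports/perl/nagios_notifications1.pl.bak
Afoo/reports/perl/nagios_outages1.pl
Afoo/reports/perl/GWIR.pm
"""

loose_hits = LOOSE.findall(SVN_LINES)    # file names mistaken for hosts
strict_hits = STRICT.findall(SVN_LINES)  # nothing: no scheme anywhere

if __name__ == "__main__":
    print("loose: ", loose_hits)
    print("strict:", strict_hits)
```

The trade-off is the one argued over elsewhere in the thread: the strict form spares file lists like this one, but it would also miss bare domains that many mail clients turn into clickable links.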
RE: Rule for Russian character sets
-----Original Message-----

For the most part you can match any character by the appearance of the character. Any character with special meaning needs to be escaped in some way. The easiest way is usually with a backslash, but in some cases you can also do it by making it a member of a character class. So for your question mark case, you could do \? or [?], as most of the special characters lose their meaning in a character class. The exceptions are, obviously, the right bracket and the backslash; the dash also becomes special if it isn't the first character.

/\=\?koi8\-r\?/

This is what I'd set up originally, except when I ran it past an RE interpreter the results were just... wrong. I do think it would work, however, and will be testing it on a Virtual Machine today to be sure.

This should work. You don't need to escape the dash, and I'm pretty sure you don't need to escape the equal sign; just the question mark. Also, you may want to handle this in both uppercase and lowercase, so you could do

/=\?koi8-r\?/i

And you probably don't need the = sign to get reasonably reliable matching.

Ah, this is the bit I was unsure about, limiting how many characters are escaped. I would tend towards the fully escaped one myself; I just wouldn't trust non-escaped = and ? signs. But that's probably got to do with some bad history with Spamassassin :)

Thanks for reinforcing some points with RE that needed to be (:

Cheers,
Mike
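The escaping points above can be checked quickly outside SpamAssassin. The sketch below uses Python's re module rather than Perl, which behaves the same way for these particular patterns; the sample header is made up for illustration.

```python
import re

# The fully escaped form from the message: valid, but case-sensitive.
over_escaped = re.compile(r"\=\?koi8\-r\?")

# Only the question marks need escaping; IGNORECASE plays the role of /i.
minimal = re.compile(r"=\?koi8-r\?", re.IGNORECASE)

# Same thing using a character class instead of a backslash.
bracketed = re.compile(r"=[?]koi8-r[?]", re.IGNORECASE)

# Leaving "?" unescaped changes the meaning: "=" and "r" become optional,
# so this matches any text containing "koi8-".
unescaped = re.compile(r"=?koi8-r?")

# Made-up sample header with an uppercase charset label.
sample = "Subject: =?KOI8-R?B?8NLJ18XU?="

if __name__ == "__main__":
    for name, rx in [("over_escaped", over_escaped), ("minimal", minimal),
                     ("bracketed", bracketed), ("unescaped", unescaped)]:
        print(name, bool(rx.search(sample)))
```

Escaping = and - does no harm, so the fully escaped rule is not wrong; the practical difference in this sample comes from the missing case-insensitive flag.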
using sare rules
Hi all,

I have recently inherited the responsibility of looking after our spam machine; as such I'm having a few teething issues : )

I just followed the instructions in the sare-sa-update-howto.txt. I am just a bit confused as to whether I have done it correctly.

Originally, in the /var/lib/spamassassin/3.xxx folder there was the updates_spamassassin_org folder and cf file. Now there is also the key file, the sare-sa-update-channels.txt, as well as the full that I wanted to add (which has a file and a folder).

I would have thought that the rule file would have ended up inside the updates_spamassassin_org folder, as all the other .cf files seem to be inside there.

Can someone let me know if I have done something wrong?

Thanks
Kate
FW: Rule for Russian character sets (=?koi8-r? not quite a charset)
-----Original Message-----

snip

We don't want to only allow the English locale, because we (here at my work) do not want our international clients (non Russian) to be denied email service.

ok_locales en ja ko th zh

This will allow anything but Cyrillic char sets. Please note that en does *not* mean English locale despite its name. It applies to all Western charsets, including German Umlauts, Swedish, French, Turkish, etc. Basically everything that uses the characters in this post, plus language specific chars.

Ok now we're talking turkey. Thanks for providing the much needed clarity on ok_locales. I may just employ that technique yet, pending whether we get any more Russian spam through the gates.

Sorry, I did not mean to troll nor any kind of offense.

You have my apologies; it being a Friday afternoon, I was pretty sick of work and shouldn't have taken it out on you or the list. Sorry.

However, you missed my point. Getting detailed with REs is a good thing, sure. I was not about that -- but the RE in question does not properly handle charset encoding. See the Subject for an example which is not encoding, but will be matched by your rule.

My point was that the rule discussed aims at being something that it unfortunately is not, because charset encoding is slightly more complex and definitely requires a closing part. A Regular Expression that does this can be found in check_for_faraway_charset_in_headers() in HeaderEval.pm:

$hdr =~ /=\?(.+?)\?.\?.*?\?=/g

Hence my re-inventing-the-wheel analogy. And these wheels are quite flexible, too. ;-)

Also, your rule applies to the Subject only, whereas ok_locales does check all MIME parts and will trigger on Russian spam with a western Subject.

The RE in question (my one) was not just written for subject, but a separate rule was written for the raw From: line as well.
As we only score spam here and leave filing it to the MUA (unless a score of 25 is reached, where SA bins it), scoring against the Subject and From lines makes OK sense, because if you used simply (=?koi8-r?) in the subject it would not score high enough on its own to be filtered or blocked. (I'm trying to employ what I've learned from the SA webpage about writing multiple low-scoring rules, instead of a few big-scoring ones.) I can see it is flawed, but have to also admit that it is working rather well at the moment. Mind you, I have taken the time to translate some of the Russian Spam, work out spammy phrases, and then quote those phrases to be scored against by SA.

Hope this clarifies my previous posts and is appreciated again...

Your posts are appreciated, and sorry for the mean comment.

Cheers,
Mike
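The point about the closing part can be seen with the thread's own Subject line. The sketch below ports the expression quoted from HeaderEval.pm to Python and compares it with a prefix-only rule; the helper name and the encoded sample header are illustrative assumptions, not SpamAssassin code.

```python
import re

# Prefix-only rule of the kind discussed above.
NAIVE = re.compile(r"=\?koi8-r\?", re.IGNORECASE)

# Python port of the expression quoted from HeaderEval.pm: charset, a
# one-character encoding marker, the payload, and the closing "?=".
ENCODED_WORD = re.compile(r"=\?(.+?)\?.\?.*?\?=")


def declared_charsets(header_value):
    """Return the charsets of all complete encoded words in a header value."""
    return [m.group(1).lower() for m in ENCODED_WORD.finditer(header_value)]


# The Subject of this very thread: contains the prefix, but no encoded word.
thread_subject = "FW: Rule for Russian character sets (=?koi8-r? not quite a charset)"

# Made-up example of a complete encoded word.
encoded_subject = "=?koi8-r?B?8NLJ18XU?="

if __name__ == "__main__":
    print(bool(NAIVE.search(thread_subject)), declared_charsets(thread_subject))
    print(bool(NAIVE.search(encoded_subject)), declared_charsets(encoded_subject))
```

The prefix-only rule fires on both subjects, while the full pattern only reports a charset for the second one, which is the false positive being pointed out.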
Re: Clearly bogus false positives -- on abuse contact point, no less
Philip Prindeville wrote: Matt Kettler wrote: Philip Prindeville wrote: Depends on whether you equate bare domains with URL's, I suppose. If MUA's equate them with URLs, spammers will use this, and SpamAssassin will use it. There is only so much braindeath in UA's that you can bend the rules for. Clearly, this involves breaking them. Erm.. What rule does this actually break? Is there a rule in an RFC somewhere specifying you MUST not interpret bare domains as URIs in text emails? Besides, when this braindeath is more the norm than the exception, it's a de facto standard. Particularly in the absence of any rules against it. *EVERY* graphical MUA I've used in the past 10 years does this. Thunderbird, Outlook, Groupwise, Eudora, they all do it. I'm sure there are MUAs that don't, but there's an awful lot that do. Most webmails seem to do it too. Outlook web access, Comcast and Yahoo all do, but I'll concede that Verizon's webmail doesn't.
Re: SVN notifications killing spamassassin
Philip Prindeville wrote: Eric A. Hall wrote: I'm guessing the URI analyzer needs to be smarter. That's strangely appropriate to the issue I had with calthurs.com. It would be nice if this checker had an option to enforce checking only of well-formed URL's (i.e. not anything that might conceivably be munged into a URL by the most ignorant of UA's)... something requiring a protocol name (ftp:, http:, tftp:, etc.), a domain name, and a path name (even if it's just slash). See the discussion at: http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5780
Re: Clearly bogus false positives -- on abuse contact point, no less
Matt Kettler wrote: Philip Prindeville wrote: Matt Kettler wrote: Philip Prindeville wrote: Depends on whether you equate bare domains with URL's, I suppose. If MUA's equate them with URLs, spammers will use this, and SpamAssassin will use it. There is only so much braindeath in UA's that you can bend the rules for. Clearly, this involves breaking them. Erm.. What rule does this actually break? Is there a rule in an RFC somewhere specifying you MUST not interpret bare domains as URIs in text emails? There is an RFC that defines what a URL looks like. A bare domain doesn't cut it. You want to forbid bare domains in email? Go ahead. You can forbid anything you like. But don't call it a test for URL's, since it's clearly not. Besides, when this braindeath is more the norm than the exception, it's a de facto standard. Particularly in the absence of any rules against it. Yeah, I'll talk to the Outlook folks, and file a bug against Thunderbird... (I think the latter only does it to be compatible with the former...) *EVERY* graphical MUA I've used in the past 10 years does this. Thunderbird, Outlook, Groupwise, Eudora, they all do it. I'm sure there are MUAs that don't, but there's an awful lot that do. Most webmails seem to do it too. Outlook web access, Comcast and Yahoo all do, but I'll concede that Verizon's webmail doesn't.
Re: using sare rules
On Mon, Feb 18, 2008 at 10:06:32AM +1300, Kathryn Allan wrote: I would have though that the rule file would have ended up inside the updates_spamassassin_org folder as all the other .cf files seem to be inside there. updates_spamassassin_org is for update files from updates.spamassassin.org. Files from other update channels go in their own directory. -- Randomly Selected Tagline: MSDOS didn't get as bad as it is overnight -- it took over ten years of careful development. - [EMAIL PROTECTED]
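A minimal sketch of the layout described above, assuming the per-channel directory name is simply the channel name with dots replaced by underscores, as the one example in this thread (updates.spamassassin.org and updates_spamassassin_org) suggests. The helper name and the second channel name are hypothetical, and the 3.xxx version directory is kept exactly as elided in the original question.

```python
import posixpath


def channel_dir(state_dir, channel):
    """Guess the directory an update channel's rule files are unpacked into.

    Assumption: the directory is the channel name with "." turned into "_".
    """
    return posixpath.join(state_dir, channel.replace(".", "_"))


if __name__ == "__main__":
    state_dir = "/var/lib/spamassassin/3.xxx"  # version elided in the thread
    for channel in ("updates.spamassassin.org", "rules.example.net"):
        print(channel, "->", channel_dir(state_dir, channel))
```

So a new folder next to updates_spamassassin_org, rather than new files inside it, is what one would expect after adding another channel.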
Re: Clearly bogus false positives -- on abuse contact point, no less
Philip Prindeville wrote: There is an RFC that defines what a URL looks like. A bare domain doesn't cut it. You want to forbid bare domains in email? Go ahead. You can forbid anything you like. I don't, and I doubt Matt wants to either. But don't call it a test for URL's, since it's clearly not. FWIW, you're the only one who's been calling it a URL. The SA headers say it's a URI, which isn't accurate either, unless of course you consider SURBL to be a Schemeless URI Realtime Blocklist. Besides, when this braindeath is more the norm than the exception, it's a de facto standard. Particularly in the absence of any rules against it. Yeah, I'll talk to the Outlook folks, and file a bug against Thunderbird... (I think the latter only does it to be compatible with the former...) Yeah, good luck with that. Do you really have an issue with SA, or is it just that you're pissed off that somebody rejected spam sent to their abuse account and you're taking your frustration out on how SA detected that spam? Daryl
Re: Clearly bogus false positives -- on abuse contact point, no less
Philip Prindeville wrote: Matt Kettler wrote: Philip Prindeville wrote: Matt Kettler wrote: Philip Prindeville wrote: Depends on whether you equate bare domains with URL's, I suppose. If MUA's equate them with URLs, spammers will use this, and SpamAssassin will use it. There is only so much braindeath in UA's that you can bend the rules for. Clearly, this involves breaking them. Erm.. What rule does this actually break? Is there a rule in an RFC somewhere specifying you MUST not interpret bare domains as URIs in text emails? There is an RFC that defines what a URL looks like. A bare domain doesn't cut it. Yes, but there's nowhere that says you can't interpret any text you want as a URL. RFCs in general are interpreted with be strict about what you generate, and liberal with what you accept. URLizing text segments fits with that spirit, and it does not violate the letter of any RFC I'm aware of. If you can prove otherwise, please do so. You want to forbid bare domains in email? Go ahead. You can forbid anything you like. But don't call it a test for URL's, since it's clearly not. Well, they don't.. they call it a test for URIs, which is actually slightly different, but not really to the point here. However, in general, it is intended to be a test for anything most MUA's will interpret as a URI. Besides, when this braindeath is more the norm than the exception, it's a de facto standard. Particularly in the absence of any rules against it. Yeah, I'll talk to the Outlook folks, and file a bug against Thunderbird... (I think the latter only does it to be compatible with the former...) I'd venture to guess neither started it. Eudora predates both products by quite an extensive period of time. It could have originated there, or in Netscape mail. Sorry, but I highly doubt you can blame this on microsoftism, nor do I think it's any kind of wild incorrectness as you so strongly postulate. This has been a very standard feature in email for a very long time. 
It's not a recent development. It's also a feature that is quite important to accuracy in spamassassin. Spammers regularly take advantage of MUA's urlizing text. Regularly.. Every day. Adding the ability to detect those domains increases SA's hit rate for spam, and that's a good thing. Yes, it causes SA to trigger on spam reports, but it generally will do that for other parts of spam messages anyway. Let's face it, your problem isn't with SA detecting a spam domain, it's with some idiot filter/rejecting their abuse box.
Re: What setup do I need?
Thanks a lot, will do :)

tmasboa wrote:

Hello, I am new to SA and here is the situation: a normal mail server from my hosting company (POP3), and basically I have a computer on which I want to check the emails, run them through SA, and then deliver them to a local mail server just in our network. Any free suggestions? I tried installing a POP3 server like Dovecot but I had a problem and couldn't get it to authenticate.

Thanks

--
View this message in context: http://www.nabble.com/What-setup-do-I-need--tp15518685p15536332.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
Re: What setup do I need?
Hello, I need a little bit more help please. I am using Webmin and got Fetchmail working partially... It checks and downloads the remote messages but doesn't deliver them to the Unix account. Sendmail is installed and I noticed it adds @localhost; I don't know if that is the problem though.

Thanks

--
View this message in context: http://www.nabble.com/What-setup-do-I-need--tp15518685p15537609.html