RE: sa-learn help
The problem is that when they forward the email you will lose the headers, and it will think they are the spammers/hammers.

-----Original Message-----
From: Matt [mailto:[EMAIL PROTECTED]
Sent: Thursday, March 17, 2005 3:05 PM
To: [EMAIL PROTECTED]
Subject: sa-learn help

I am running a DirectAdmin server that uses Exim and the SpamAssassin 3.0.2 release. I would like to create two email addresses such as [EMAIL PROTECTED] and [EMAIL PROTECTED]. Then I would like to ask all our users to forward their ham or spam to these addresses as an attachment, and have a cron job run sa-learn on them every 5 minutes or so. Has anyone done something like this? If so, how? Most of our users use Outlook Express for email. There are nearly 1000 email accounts.

Also, SpamAssassin seems to create a separate Bayes file for each user. I would like these addresses to cover all domains and users on the server. Is that possible?

Thanks
Matt
bayes test
Hi,

I wonder how I have to train SpamAssassin to get the BAYES_XX tests to start working. I have a rule that trains the Bayesian filter with each email I receive, using the sa-learn tool. After some months of training (I thought I needed 200 spam and 200 ham) I haven't seen a BAYES_XX test yet. The last spam my SpamAssassin caught had these tests:

Return-Path: [EMAIL PROTECTED]
X-Original-To: aktor{@|aktornet.ath.cx
Delivered-To: aktor{@|aktornet.ath.cx
Received: from 203.90.52.8 (unknown [203.90.52.8]) by aktornet.ath.cx (Postfix) with SMTP id 375F6BB49 for aktor{@|aktornet.ath.cx; Thu, 17 Mar 2005 06:19:52 +0100 (CET)
From: ydlBobby [EMAIL PROTECTED]
To: aktor{@|aktornet.ath.cx
Subject: Better than Vìagra and cheaper, too! npdu
Sender: ydlBobby [EMAIL PROTECTED]
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Date: Wed, 16 Mar 2005 22:25:20 -0600
X-Mailer: Microsoft Outlook Express 5.00.2615.200
Message-Id: [EMAIL PROTECTED]
X-Virus-Scanned: por AMAVIS + CLAMAV en aktornet.ath.cx
X-Amavis-Alert: BAD HEADER Non-encoded 8-bit data (char EC hex) in message header 'Subject'
  Subject: Better than V\354agra and cheape...
  ^
X-Spam-Status: Yes, hits=11.1 tagged_above=0.0 required=4.0 tests=DRUGS_ERECTILE, DRUGS_ERECTILE_OBFU, FORGED_HOTMAIL_RCVD2, FORGED_MUA_OUTLOOK, INFO_TLD, MSGID_FROM_MTA_ID, RCVD_NUMERIC_HELO
X-Spam-Level: ***
X-Spam-Flag: YES

No BAYES_XX test. I use SpamAssassin through amavisd-new, with the Mail::SpamAssassin Perl module and default options.

[EMAIL PROTECTED] aktor $ sa-learn --dump magic
0.000          0          3          0  non-token data: bayes db version
0.000          0        568          0  non-token data: nspam
0.000          0       1996          0  non-token data: nham
0.000          0     203190          0  non-token data: ntokens
0.000          0 1086896787          0  non-token data: oldest atime
0.000          0     102059          0  non-token data: newest atime
0.000          0          0          0  non-token data: last journal sync atime
0.000          0     102285          0  non-token data: last expiry atime
0.000          0   29436939          0  non-token data: last expire atime delta
0.000          0          0          0  non-token data: last expire reduction count

Do I have to do something else? What am I doing wrong?

Thank you,
aktor

--
Blessed are the pessimists, for they shall make backups. -- Www.frases.com.
This mail is copyleft-ed to aktor under the terms of the CC License (Creative Commons).
Re: URI Tests and Japanese Chars (solved)
...
To: Daryl C. W. O'Shea [EMAIL PROTECTED]
Cc: List Mail User [EMAIL PROTECTED], [EMAIL PROTECTED], users@spamassassin.apache.org
Subject: Re: URI Tests and Japanese Chars (solved)
In-Reply-To: [EMAIL PROTECTED]
From: [EMAIL PROTECTED] (Justin Mason)

Justin,

Daryl C. W. O'Shea writes:

List Mail User wrote:

Jeff,

RFC 1630 makes pretty clear that an email address in either a "mailto:" or "cid:" clause *is* a URI. It does not address whether a bare email address would count (it seems that it doesn't fit the RFC definition, but does fit some others I found by Google).

I could be convinced either way for a bare address (as it stands now, maybe someone else has something to add). But a "mailto:", "mail:" or "cid:" clause should (in my opinion) be looked up by the URI rules - they are URI, not URL rules (though URLs are clearly the most common form of URIs).

I was surprised to see from the RFC that even "Msg-Id:" clauses are URIs.

Paul Shupak
[EMAIL PROTECTED]

I'd agree with Paul. What's the difference between doing the lookup of the domain listed in a mailto: link and a http: link -- both of which are often found in someone's signature?

Eliminating the mailto: domain lookup could lead to spam such as "email us at [EMAIL PROTECTED] for all the junk you don't really want".

However, it's an impedance mismatch between what's going into the backends (the SBL and SURBL uribls) and what we're matching on the other end.

At least for SBL, it's definitely problematic, since a SBL escalation (of mail relays) will blocklist mail that *mentions* that domain!

That's not true in general. Since the SBL is an IP-based list, a mail server escalation would have no effect on any other domain, only on messages relayed through the servers.

The more common case where a SBL escalation will affect other domains (the typical kind I've noticed) is when they list all corporate servers and some otherwise innocent domains use name servers within that space (this was the Russian government/Rostelecom earlier this week).

Still, you are correct, there is a big difference between the SURBL policy of zero FPs and the SBL policy, which I can best state as "kill the spammers". SURBLs rarely have `collateral' damage and their default scores reflect that. The URIBL_SBL is only assigned scores of "0 0.629 0 0.996" in 3.0.2 - only URIBL_AB_SURBL with set 3 and URIBL_WS_SURBL with set 1 are ever assigned lower scores than the URIBL_SBL. All the other SURBLs have significantly higher scores - URIBL_SC_SURBL is many times what URIBL_SBL is. (You may not know, but I even proposed adding back the SPEWS lists, though with low scores, and I do use all the rfci lists with relatively low scores except for bogusmx, which may be the best single indicator I have ever found, and I still assign it fewer points than URIBL_SC_SURBL.)

- --j.
[snipped PGP SIGNATURE]

Paul Shupak
[EMAIL PROTECTED]

P.S. I understand the political problems with the particular FPs that SPEWS generates, but I do hope the rfci lists make it to the URIBL rulesets.
RE: URI Tests and Japanese Chars (solved)
...
Subject: RE: URI Tests and Japanese Chars (solved)
Date: Thu, 17 Mar 2005 17:41:03 -0500
...
From: Rose, Bobby [EMAIL PROTECTED]
To: [EMAIL PROTECTED], Daryl C. W. O'Shea [EMAIL PROTECTED]
Cc: List Mail User [EMAIL PROTECTED], [EMAIL PROTECTED], users@spamassassin.apache.org
...

But in my test messages the email address wasn't in the form of a URI. It was just the email address. I even used pine for a test to make sure it wasn't a GUI client doing some reformatting business.

Do we know if it's possible to tell whether the results from SBL are for the domain of the URI being queried, or whether they are due to some association with the domain being queried? If so, then we could ignore any results other than for the domain being queried, or weigh the results differently, so long as they aren't cumulative points for each occurrence. Otherwise, the points would add up the more that person's email address appears in the email.

It has been suggested before that the indirect name server lookups be a different class of rules and/or scored differently than the direct lookups - by default the SBL is the only list used for name servers, but on my servers I use several other lists (and then there is Bugzilla #4106).

[-----Original Message----- all snipped]

Paul Shupak
[EMAIL PROTECTED]

P.S. Extra points for anyone who actually knows why Bugzilla (or Mozilla) have zilla in their name (or knows who Tom Paquin is).
Re: URI Tests and Japanese Chars (solved)
List Mail User wrote:

...
To: "Daryl C. W. O'Shea" [EMAIL PROTECTED]
Cc: List Mail User [EMAIL PROTECTED], [EMAIL PROTECTED], users@spamassassin.apache.org
Subject: Re: URI Tests and Japanese Chars (solved)
In-Reply-To: [EMAIL PROTECTED]
From: [EMAIL PROTECTED] (Justin Mason)

Justin,

Daryl C. W. O'Shea writes:

List Mail User wrote:

Jeff,

RFC 1630 makes pretty clear that an email address in either a "mailto:" or "cid:" clause *is* a URI. It does not address whether a bare email address would count (it seems that it doesn't fit the RFC definition, but does fit some others I found by Google).

I could be convinced either way for a bare address (as it stands now, maybe someone else has something to add). But a "mailto:", "mail:" or "cid:" clause should (in my opinion) be looked up by the URI rules - they are URI, not URL rules (though URLs are clearly the most common form of URIs).

I was surprised to see from the RFC that even "Msg-Id:" clauses are URIs.

Paul Shupak
[EMAIL PROTECTED]

I'd agree with Paul. What's the difference between doing the lookup of the domain listed in a mailto: link and a http: link -- both of which are often found in someone's signature?

Eliminating the mailto: domain lookup could lead to spam such as "email us at [EMAIL PROTECTED] for all the junk you don't really want".

However, it's an impedance mismatch between what's going into the backends (the SBL and SURBL uribls) and what we're matching on the other end.

At least for SBL, it's definitely problematic, since a SBL escalation (of mail relays) will blocklist mail that *mentions* that domain!

That's not true in general. Since the SBL is an IP-based list, a mail server escalation would have no effect on any other domain, only on messages relayed through the servers.

The more common case where a SBL escalation will affect other domains (the typical kind I've noticed) is when they list all corporate servers and some otherwise innocent domains use name servers within that space (this was the Russian government/Rostelecom earlier this week).

Still, you are correct, there is a big difference between the SURBL policy of zero FPs and the SBL policy, which I can best state as "kill the spammers". SURBLs rarely have `collateral' damage and their default scores reflect that. The URIBL_SBL is only assigned scores of "0 0.629 0 0.996" in 3.0.2 - only URIBL_AB_SURBL with set 3 and URIBL_WS_SURBL with set 1 are ever assigned lower scores than the URIBL_SBL. All the other SURBLs have significantly higher scores - URIBL_SC_SURBL is many times what URIBL_SBL is. (You may not know, but I even proposed adding back the SPEWS lists, though with low scores, and I do use all the rfci lists with relatively low scores except for bogusmx, which may be the best single indicator I have ever found, and I still assign it fewer points than URIBL_SC_SURBL.)

- --j.
[snipped PGP SIGNATURE]

Paul Shupak
[EMAIL PROTECTED]

P.S. I understand the political problems with the particular FPs that SPEWS generates, but I do hope the rfci lists make it to the URIBL rulesets.

Since you mentioned the scores, please note that Bobby Rose, the original poster of this issue, had modified the score for URIBL_SBL from its defaults to 10 ...

I had suggested that he reduce the score (possibly setting it back to the defaults).

While it doesn't negate the issues surrounding the way the URI lookups work (or should possibly work) ... it's obvious that there is enough FP potential to warrant not scoring it so high.

alan
Re: URI Tests and Japanese Chars (solved)
[all snipped]

Since you mentioned the scores, please note that Bobby Rose, the original poster of this issue, had modified the score for URIBL_SBL from its defaults to 10 ...

I had suggested that he reduce the score (possibly setting it back to the defaults).

While it doesn't negate the issues surrounding the way the URI lookups work (or should possibly work) ... it's obvious that there is enough FP potential to warrant not scoring it so high.

alan

I think you are quite correct. If you want to have a high weight on the SBL, use it as an RBL at the SMTP level (I do). I think its score once a message hits SA is already correct, given the extreme overlap with other hit rules (I have lots of filtering before that - SA is my last line of defense and seems almost impenetrable). Even my own local rules generally have very low scores - only two score above 1.5 and only 5 score above .6, out of about 25 local rules. As best I can tell, the default scoring is very well adjusted already.

Paul Shupak
[EMAIL PROTECTED]
Re: Is this Received header correctly formatted?
mouss wrote:

Eric A. Hall wrote:

Huh? The helo= stuff is inside the parentheses. Perhaps I am missing something, but your point 3 seems to conflict with your point 2.

comments are only allowed where whitespace occurs

can you give me the line num in the rfc?

It's actually somewhat stricter than that, and actually says that comments can only be used where folding would occur (that's a hyper-technical but accurate reading; see the robustness principle). Here is what rfc2822 says:

3.2.3. Folding white space and comments

[...]

There are several places in this standard where comments and FWS may be freely inserted. To accommodate that syntax, an additional token for "CFWS" is defined for places where comments and/or FWS can occur. However, where CFWS occurs in this standard, it MUST NOT be inserted in such a way that any line of a folded header field is made up entirely of WSP characters and nothing else.

FWS             =       ([*WSP CRLF] 1*WSP) /   ; Folding white space
                        obs-FWS

ctext           =       NO-WS-CTL /     ; Non white space controls
                        %d33-39 /       ; The rest of the US-ASCII
                        %d42-91 /       ;  characters not including "(",
                        %d93-126        ;  ")", or "\"

ccontent        =       ctext / quoted-pair / comment

comment         =       "(" *([FWS] ccontent) [FWS] ")"

CFWS            =       *([FWS] comment) (([FWS] comment) / FWS)

Throughout this standard, where FWS (the folding white space token) appears, it indicates a place where header folding, as discussed in section 2.2.3, may take place. Wherever header folding appears in a message (that is, a header field body containing a CRLF followed by any WSP), header unfolding (removal of the CRLF) is performed before any further lexical analysis is performed on that header field according to this standard. That is to say, any CRLF that appears in FWS is semantically "invisible."

A comment is normally used in a structured field body to provide some human readable informational text. Since a comment is allowed to contain FWS, folding is permitted within the comment. Also note that since quoted-pair is allowed in a comment, the parentheses and backslash characters may appear in a comment so long as they appear as a quoted-pair. Semantically, the enclosing parentheses are not part of the comment; the comment is what is contained between the two parentheses. As stated earlier, the "\" in any quoted-pair and the CRLF in any FWS that appears within the comment are semantically "invisible" and therefore not part of the comment either.

Runs of FWS, comment or CFWS that occur between lexical tokens in a structured field header are semantically interpreted as a single space character.

RFC 2822 is slightly stricter than RFC 822 in this regard. And while it's not a full standard like 822, it is a standards-track update to 822 and was sanctioned by the IESG as such, and was developed after years of debate over good and bad behavior.

and even then, the original thing was:

Received: from ar39.lsanca2-4.16.241.28.lsanca2.elnk.dsl.genuity.net ([4.16.241.28] helo=watson1)

and here helo=watson1 is inside parens, and with whitespace (before and after the parens). or am I missing something?

Check the BNF again.

--
Eric A. Hall                                        http://www.ehsco.com/
Internet Core Protocols          http://www.oreilly.com/catalog/coreprot/
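To make the quoted comment rules concrete, here is a minimal Python sketch of how a parser might pull comments out of an already-unfolded header body, following the nesting and quoted-pair behavior described in section 3.2.3. The function name `split_comments` is hypothetical, and the sketch simplifies by treating a backslash as a quoted-pair anywhere in the value. It says nothing about whether `helo=watson1` is valid under the SMTP Received grammar; it only shows that the parenthesized text reads as one comment.

```python
def split_comments(value):
    """Return (text_without_comments, list_of_top_level_comments).

    Hypothetical helper: assumes the header body is already unfolded.
    """
    out, comments, current = [], [], []
    depth = 0
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # quoted-pair: the backslash is invisible, the next char is literal
            (current if depth else out).append(value[i + 1])
            i += 2
            continue
        if ch == "(":
            if depth:
                current.append(ch)  # nested comment stays inside the outer one
            depth += 1
        elif ch == ")" and depth:
            depth -= 1
            if depth:
                current.append(ch)
            else:
                comments.append("".join(current))
                current = []
                out.append(" ")  # a comment run reads as a single space
        elif depth:
            current.append(ch)
        else:
            out.append(ch)
        i += 1
    return " ".join("".join(out).split()), comments


header = ("from ar39.lsanca2-4.16.241.28.lsanca2.elnk.dsl.genuity.net "
          "([4.16.241.28] helo=watson1)")
text, found = split_comments(header)
print(text)
print(found)
```

Under these assumptions the whole `[4.16.241.28] helo=watson1` run comes back as a single comment, and the enclosing parentheses are not part of it.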
OT: SURBL usage for content-filters like SquidGuard?
Hi there

I was wondering if anyone has written a Squid/proxy redirector filter that uses SURBL? It would seem to me the URLs referenced by SURBL are Web sites I'd never want to go to? :-) Maybe it would only be usable via an rsync feed (i.e. a text file), but the data quality should be pretty good...

--
Cheers
Jason Haar
Information Security Manager, Trimble Navigation Ltd.
Phone: +64 3 9635 377 Fax: +64 3 9635 417
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1
Re: sa-learn help
The problem is that when they forward the email you will lose the headers, and it will think they are the spammers/hammers.

No, he said they're forwarding them *as attachments*. All you need to do is take the attachments out of the email, and voila, you have the email as the receiver received it.

Mike Jackson
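Taking the attachments out can be sketched with Python's standard `email` package. This is only an illustration under assumptions: the function names and the demo addresses are made up, and it assumes the mail client attaches the original as a `message/rfc822` part, which is worth verifying for Outlook Express before relying on it.

```python
import email
from email import policy
from email.message import EmailMessage


def extract_forwarded(raw_bytes):
    """Return the raw bytes of every message/rfc822 attachment in a forward."""
    wrapper = email.message_from_bytes(raw_bytes, policy=policy.default)
    originals = []
    for part in wrapper.walk():
        if part.get_content_type() == "message/rfc822":
            # The payload of a message/rfc822 part is a one-element list
            # holding the embedded message, original headers included.
            inner = part.get_payload(0)
            originals.append(inner.as_bytes())
    return originals


def build_demo_forward():
    """Build a made-up 'forward as attachment' message for the demo."""
    original = EmailMessage()
    original["Received"] = "from 203.0.113.8 by mx.example.org"
    original["From"] = "spammer@example.com"
    original["Subject"] = "Cheap pills"
    original.set_content("Buy now")

    wrapper = EmailMessage()
    wrapper["From"] = "user@example.org"
    wrapper["To"] = "spam@example.org"
    wrapper["Subject"] = "Fwd: Cheap pills"
    wrapper.set_content("Forwarded as attachment.")
    wrapper.add_attachment(original)  # stored as a message/rfc822 part
    return wrapper.as_bytes()


messages = extract_forwarded(build_demo_forward())
print(len(messages))
print(messages[0].decode("ascii", "replace"))
```

Each extracted message could then be written to a file and handed to `sa-learn --spam` or `sa-learn --ham` from a cron job; that wiring is left out here because it depends on the local mail setup.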
Please help with subject rule
Dear all,

Could you please help me with one SA subject rule that sometimes works and sometimes doesn't. SpamAssassin 3.0.2 with qmail-scanner 1.25st. Everything works like a charm, but we receive a lot of spam messages from a yahoo.com group with [expoforum_kg] in the subject. I created a rule in 20_head_tests.cf to score all messages containing [expoforum_kg] in the subject. I know I shouldn't use the global cf rules, but I was just testing.

20_head_tests.cf:

header EXPO_SUCKERS Subject =~ /\b(?:[a-z]([-_. =~\/:,[EMAIL PROTECTED]+;\\'\\])\1{0,2}){4,}/i
describe EXPO_SUCKERS Subject: contains [expoforum_kg]

50_scores.cf:

score EXPO_SUCKERS 10 10.05 10.07 10.09

Now the problem is that sometimes this rule works but sometimes it is being ignored.

This is an example of successful detection:

Mon, 14 Mar 2005 18:11:21 KGT:40007: from='Neomarketing [EMAIL PROTECTED]', subj='[expoforum_kg] A D V E R T I S E - TO - M I L L I O N S', via SMTP from 66.94.237.16
Mon, 14 Mar 2005 18:11:23 KGT:40007: uvscan: finished scan in 1.860183 secs
Mon, 14 Mar 2005 18:11:41 KGT:40007: SA: REPORT hits = 10.6/3.5
  1.3 GAPPY_SUBJECT Subject: contains G.a.p.p.y-T.e.x.t
  10 EXPO_SUCKERS Subject: contains [expoforum_kg]
  1.3 DATE_IN_FUTURE_06_12 Date: is 6 to 12 hours after Received: date
  0.5 TARGETED BODY: Targeted Traffic / Email Addresses
Mon, 14 Mar 2005 18:11:41 KGT:40007: SA: yup, this smells like SPAM - hits=10.6 - rejecting message...
Mon, 14 Mar 2005 18:11:41 KGT:40007: SA: finished scan in 17.88551 secs - hits=10.6
Mon, 14 Mar 2005 18:11:41 KGT:40007: r_e: X-Qmail-Scanner-1.25st: We have reasons to believe this mail is SPAM

This is an example of unsuccessful detection:

Tue, 15 Mar 2005 18:28:48 KGT:17412: from='Jodi Chu [EMAIL PROTECTED]', subj='[expoforum_kg] Paid ontime 50% profit', via SMTP from 66.94.237.41
Tue, 15 Mar 2005 18:28:50 KGT:17412: uvscan: finished scan in 1.859957 secs
Tue, 15 Mar 2005 18:29:06 KGT:17412: SA: REPORT hits = 0.4/3.5
  1.0 RATWARE_HASH_2_V2 Bulk email fingerprint (hash 2 v2) found
  0.1 TO_EMPTY To: is empty
  0.0 RATWARE_HASH_2 Bulk email fingerprint (hash 2) found
  0.1 EXCUSE_3 BODY: Claims you can be removed from the list
  0.0 EXCUSE_7 BODY: Claims you can be removed from the list
  0.3 EXCUSE_REMOVE BODY: Talks about how to be removed from mailings
  1.5 URIBL_WS_SURBL Contains an URL listed in the WS SURBL blocklist [URIs: idv.st]
  0.0 MISSING_MIMEOLE Message has X-MSMail-Priority, but no X-MimeOLE
Tue, 15 Mar 2005 18:29:06 KGT:17412: SA: required_hits 3.5 / sa_quarantine +2.1 / sa_delete +4.2
Tue, 15 Mar 2005 18:29:06 KGT:17412: SA: finished scan in 16.069264 secs - hits=0.4

Any ideas would be greatly appreciated. Thank you.

Roman
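One possible reading of the two logs: the EXPO_SUCKERS pattern as posted looks like a gappy-text pattern rather than a literal match on the list tag, which would be consistent with it firing only on the spaced-out subject. The Python sketch below is an illustration, not the poster's rule. The names `TAG` and `LOOSE` are hypothetical, and Python's `re` is standing in for the Perl regex engine that SpamAssassin uses. It shows a literal match with escaped brackets, and why unescaped brackets would be too loose.

```python
import re

# Hypothetical replacement pattern: brackets escaped so they match literally.
TAG = re.compile(r"\[expoforum_kg\]", re.IGNORECASE)

# Unescaped, the same text is a character class matching ONE of those letters.
LOOSE = re.compile(r"[expoforum_kg]", re.IGNORECASE)

subjects = [
    "[expoforum_kg] A D V E R T I S E - TO - M I L L I O N S",
    "[expoforum_kg] Paid ontime 50% profit",
]

tag_hits = [bool(TAG.search(s)) for s in subjects]
loose_hit_on_unrelated = bool(LOOSE.search("Lunch menu"))
tag_hit_on_unrelated = bool(TAG.search("Lunch menu"))

print(tag_hits)
print(loose_hit_on_unrelated, tag_hit_on_unrelated)
```

In SpamAssassin rule syntax the equivalent would be along the lines of `header EXPO_SUCKERS Subject =~ /\[expoforum_kg\]/i`, placed in local.cf rather than the stock rule files and checked with `spamassassin --lint`.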
Re: bayes test
I wonder how I have to train spamassassin to get the BAYES_XX tests to start working. I have a rule that trains the Bayesian filter with each email I receive, using the sa-learn tool.

You have not mentioned that rule or the file in which you have written it. If you can tell us, it will help others to reply better. Anyway, let me try to explain.

BAYES_XX works purely on the basis of probability. It tries to find tokens in the mail which match earlier learned tokens. It is always better to let the Bayes rules learn by themselves, but we can always create rules to improve the chances of that rule appearing alongside other tests. The stock rules have default scores, which you can check in the files under the /usr/share/spamassassin/ directory. User-created rules can be written either in /etc/mail/spamassassin/local.cf or in the user-specific user_prefs file in the user's home directory.

Whenever you write a rule or change scores, do not forget to run the command spamassassin --lint, and for debugging you can add the -D option.

After some months of training (I thought I needed 200 of spam and 200 of ham) I haven't seen it yet. The last spam my spamassassin caught had these tests:

Yes, that is mentioned in the SpamAssassin wiki documentation, but in reality there is much more to it than this. Read man sa-learn; that will help you understand the process better. For further queries, mail the list.

--
Crisppy Fernandes
Has SpamAssassin 3.0.2 worked for anyone right after install or upgrade?
Dev community,

I would like to know from the developer community whether SpamAssassin has worked for anyone right after an upgrade or install. Every day one new user or another complains that SpamAssassin does not seem to work after an upgrade to a 3.0.x version. After checking the man pages or the wiki, we learn that we have to make it learn 200 spam and 200 ham, and then it will work. But even then it does not actually work. According to the sa-learn documentation, corpora are not an exact way to check validity.

Is there any other, easier way by which a novice can work with SpamAssassin without having to bother about learning and all that? After going through the documentation I understand that it learns automatically on the basis of its different rules. But what about users who don't have a big load of spam on their servers?

Simply put, what I want to point out is that spamassassin.org should provide some procedure which makes users' work easy, so that they feel happy using this software.

-/Crisppyf
Re: OT: SURBL usage for content-filters like SquidGuard?
On Thursday, March 17, 2005, 7:13:32 PM, Jason Haar wrote:

I was wondering if anyone has written a Squid/proxy redirector filter that uses SURBL?

Bill Stearns has some instructions for using Squid, Privoxy and other programs with sa-blacklist, which is the data source that goes into ws.surbl.org, at:

http://www.stearns.org/sa-blacklist/README.howtouse.html

It would seem to me the URLs referenced by SURBL are Web sites I'd never want to go to? :-)

Perhaps, though we would probably not want to make that decision in a shared or public environment. Bear in mind that the SURBL data is strongly biased towards URIs that appear in spam. While it's true that most people would probably not want to visit spam sites, they could be useful for spam research, etc.

Maybe it would be only usable via an rsync feed (i.e. text file), but the data quality should be pretty good...

Bill allows web grabs of sa-blacklist, but SURBLs are usually used through DNS queries, with rsync only for high-volume mail servers. You may want to discuss this further on the SURBL discussion list:

http://lists.surbl.org/

Cheers,
Jeff C.

--
Jeff Chan
mailto:[EMAIL PROTECTED]
http://www.surbl.org/
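For readers wondering what "used through DNS queries" could look like inside a redirector, here is a rough Python sketch of the usual DNS blocklist convention: prepend the domain to the list's zone and treat an answer in 127.0.0.0/8 as "listed". The function names are hypothetical, the zone `ws.surbl.org` is simply the one named in this message and may not be the right zone to query today, and real use should follow SURBL's own guidance on reducing hostnames to registered domains and on query volume.

```python
import socket


def surbl_query_name(domain, zone="ws.surbl.org"):
    """Build the DNS name to look up for a domain (zone is an assumption)."""
    return f"{domain.strip('.').lower()}.{zone}"


def is_listed(domain, zone="ws.surbl.org", resolver=socket.gethostbyname):
    """Return True if the lookup answers with a 127.x.x.x address.

    Any resolution failure, including a network outage, counts as
    "not listed" in this sketch.
    """
    try:
        address = resolver(surbl_query_name(domain, zone))
    except socket.gaierror:
        return False
    return address.startswith("127.")


def fake_resolver(name):
    """Stand-in resolver so the example runs without network access."""
    if name.startswith("bad."):
        return "127.0.0.2"
    raise socket.gaierror("no such name")


print(surbl_query_name("Example.COM."))
print(is_listed("bad.example", resolver=fake_resolver))
print(is_listed("good.example", resolver=fake_resolver))
```

The `resolver` parameter exists only so the logic can be exercised offline; a redirector would use the default and would also want caching and timeouts.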
Re: Bayes DB does not grow anymore
Thanks for the offer. You can send it to the email address I use for this list, or you could just send me an FTP URL for retrieval.

Sorry, I did not find the time to do this, but I will try to send it during the weekend.

Oh, yes. You need to have SURBL switched on via init.pre (I think it's off by default) and you should use custom rules. I use a set of carefully chosen rulesets, mostly from SARE and updated via rulesdujour, and some more rules of my own accumulated over time.

It seems SURBL is now enabled by default. It has also changed its name to URIDNSBL :-) I do not use SARE rules (although I am trying to find time to look at them, as I am aware of their credibility). I use Gray's rules (http://files.grayonline.id.au); they seem quite efficient.

I think on a heavy-traffic machine it's preferable to have it off, especially when using MailScanner. Otherwise the expiry can kick in at random times every few hours (you can set a minimum time, though, for instance one day).

Some people run a scheduled expiry three times a day.

That's advice which often comes up on the MailScanner list (which is a very helpful list, btw). It depends on how often you need it (whether it reaches the limit you want to hold more often or not). Starting with one expiry per night should be fine, but you should occasionally expire manually and look at the output, in case there are problems.

No. One should get rid of really old tokens; they are only ballast in the db. I don't know how a big db behaves on a busy site. Ours contain 1 million tokens and have a size of 40 MB. They work very well with no resource hogging. But I have only a few thousand messages running through each of our servers; there's probably none which gets more than 10,000 a day. If you get 100,000 it may be different.

I understand what you say. The point is, what should the criteria be for deciding that the time for an expiration has come? I mean, taking only the size into consideration could be a problem. What if some old tokens are still common nowadays in spam mail? You could say it doesn't matter, it will be started again and recognize all the bad stuff. In that sense, we could just stop maintaining Bayes completely.

That's what we do. I only learn messages which were categorized wrongly. Not by Bayes, but by SA. Most messages which get a score lower than 5 still get a BAYES_99, which means that Bayes identifies them all. Nevertheless, I learn these messages because they are spam and it reassures Bayes that they are spam. BTW: I have set BAYES_99 to 3.0, because it's so accurate for us.

As I told you, since my last post I have reset everything. It seems to me it works fine, and it learns rapidly. It gives me no reason not to trust it, to the degree that I have set my SA score to be more or less equal to the BAYES_99 score (around 8). Of course I keep doing mistake-based learning, but most of the time I feed it with 'subjective' spam mail (i.e. mail that my users don't want to receive, but is definitely not spam). I monitor it constantly and I am happy about it.

No problem :-) I tend to be a bit snappy on first messages which look to me like the author could have done a bit more research, but once we are over that stage I hope I can give some good advice based on my experience.

I have to admit that our communication was valuable to me; I learned so much about how the whole thing works. Once again, I appreciate it.

Greg
Re: bayes test
Hi,

On Fri, 18 Mar 2005 10:54:08 +0530 crisppy fernandes wrote:

You have not mentioned that rule and file in which you have written that rule. if you can tell then it will help others to reply better. anyway let me try to explain

I haven't written any rule myself. I thought it should start learning by itself. Neither /etc/spamassassin/local.cf nor ~/.spamassassin has any directive, as I use the amavisd-new default settings.

After some months of training (I thought I needed 200 of spam and 200 of ham) I haven't seen it yet. The last spam my spamassassin caught it had these tests:

yes its mentioned in spamassassin wiki documentation but reality is much more than this.

OK. That's going to be the problem. What is the real number of emails needed for the Bayesian filter to start working?

Thx,
aktor

--
Buy a MODEM, surf the Internet: gain friends and lose your wife. -- Www.frases.com.
This mail is copyleft-ed to aktor under the terms of the CC License (Creative Commons).
Re: Is this Received header correctly formatted?
Eric A. Hall wrote:

Huh? The helo= stuff is inside the parentheses. Perhaps I am missing something, but your point 3 seems to conflict with your point 2.

comments are only allowed where whitespace occurs

can you give me the line num in the rfc?

and even then, the original thing was:

Received: from ar39.lsanca2-4.16.241.28.lsanca2.elnk.dsl.genuity.net ([4.16.241.28] helo=watson1)

and here helo=watson1 is inside parens, and with whitespace (before and after the parens). or am I missing something?

regards,
mouss
Re: Is this Received header correctly formatted?
List Mail User wrote:

...
Date: Thu, 17 Mar 2005 00:29:43 +0100
From: mouss [EMAIL PROTECTED]
...
To: List Mail User [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: Is this Received header correctly formatted?
...

List Mail User wrote:

In other words, lowercase is conformant, and your first point is not correct (though all the examples do show uppercase). However, you are completely correct that the helo= is flat out wrong,

why? it's inside a comment, no?

but with a slight variation, and it becomes something like (watson1 [4.16.241.28]) which is not only conformant, but is the typical behavior of both sendmail and postfix.

except that here the situation is reversed. while postfix and sendmail use "from heloname (client_name [client_ip])", others such as qmail prefer "from client_name ([client_ip]) (helo heloname)" or other variants.

Mouss,

You're correct about the reversal; I realized that *after* I sent the message. Also, technically the area after the [client_ip] is not white space. Eric properly pointed out that the entire header field already has an assigned use, and the comment in the definition states specifically not to use information from the HELO. To requote:

TCP-info = Address-literal / ( Domain FWS Address-literal )
           ; Information derived by server from TCP connection
           ; not client EHLO.

that says what should be inside, not in comments. or are you meaning that qmail's:

Received: (from the network ...); ...

is illegal? you might, but you'd better come with real arguments.

Notice the definition does not use any specification for white space after the address literal. The single space character does not count - the notation uses that to delineate between atoms and/or tokens. There would have to be a reference to either FWS, WSP or maybe even LWSP might qualify. But since none of those atoms are part of the definition, the area after the literal and before the ')' does not qualify as white space. So the clause ([4.16.241.28] helo=watson1) seems to be clearly non-conformant. Also, the inclusion of the parentheses seems to be incorrect for a bare literal; they are only specified for the second alternative with both the Domain and Address-literals.

I do agree that it is not enough of an error that mail should be refused on that basis alone, but if a server were to do so, it would be within its prerogative (and seemingly legal to do so).

Paul Shupak
[EMAIL PROTECTED]
Re: bayes test
Hi again,

On Fri, 18 Mar 2005 10:54:08 +0530 crisppy fernandes wrote:
> Any rule you write or scores you change, do not forget to run the command spamassassin --lint, and for debugging you can add the -D option.

AsteriX root # amavisd-new debug-sa
[..]
debug: bayes: 20621 tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_toks
debug: bayes: 20621 tie-ing to DB file R/O /var/lib/amavis/.spamassassin/bayes_seen
debug: bayes: found bayes db version 3
debug: bayes: Not available for scanning, only 49 spam(s) in Bayes DB < 200
debug: bayes: 20621 untie-ing
debug: bayes: 20621 untie-ing db_toks
debug: bayes: 20621 untie-ing db_seen
debug: Score set 0 chosen.

I've got this architecture:

postfix - amavisd-new - postfix - maildrop - sa-learn - mailbox
             |    |
             V    V
         clamav  spamassassin

So I would like to load per-user bayes_toks and bayes_seen files. I think my problem is that the only files used by spamassassin are /var/lib/amavis/.spamassassin/bayes_*, and not the per-user ones.

AsteriX root # sa-learn --dump magic --dbpath /var/lib/amavis/.spamassassin/
0.0000 3 0 non-token data: bayes db version
0.0000 49 0 non-token data: nspam
       ^^
0.0000 5240 0 non-token data: nham
0.0000 164819 0 non-token data: ntokens
0.0000 1106523114 0 non-token data: oldest atime
0.0000 139568 0 non-token data: newest atime
0.0000 1106526477 0 non-token data: last journal sync atime
0.0000 123833 0 non-token data: last expiry atime
0.0000 0 0 non-token data: last expire atime delta
0.0000 0 0 non-token data: last expire reduction count

[EMAIL PROTECTED] aktor $ sa-learn --dump magic
0.0000 3 0 non-token data: bayes db version
0.0000 572 0 non-token data: nspam
       ^^^
0.0000 1996 0 non-token data: nham
0.0000 203323 0 non-token data: ntokens
0.0000 1086896787 0 non-token data: oldest atime
0.0000 127201 0 non-token data: newest atime
0.0000 0 0 non-token data: last journal sync atime
0.0000 102285 0 non-token data: last expiry atime
0.0000 29436939 0 non-token data: last expire atime delta
0.0000 0 0 non-token data: last expire reduction count

Is there any way to solve this?

Thx, aktor
--
Man can still switch off the computer. However, we will have to work hard to keep this privilege. -- J. Weizembaum, American sociologist and computer expert.
This mail is copyleft-ed to aktor under the terms of the CC License (Creative Commons).
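The debug output above suggests why no BAYES_XX test fires: SpamAssassin only consults a Bayes database once it holds enough learned spam and ham. Here is a minimal Python sketch of that check, assuming the default minimum of 200 for both counts (the bayes_min_spam_num and bayes_min_ham_num settings) and using the nspam/nham values from the two dumps in this message. The function name is made up for illustration and is not part of SpamAssassin.

```python
# Assumed default for bayes_min_spam_num / bayes_min_ham_num.
DEFAULT_MIN = 200


def bayes_usable(nspam, nham, min_spam=DEFAULT_MIN, min_ham=DEFAULT_MIN):
    """Return True when both learned counts reach their minimums."""
    return nspam >= min_spam and nham >= min_ham


# Counts taken from the two `sa-learn --dump magic` outputs in this message.
amavis_db = {"nspam": 49, "nham": 5240}   # /var/lib/amavis/.spamassassin
user_db = {"nspam": 572, "nham": 1996}    # aktor's own database

print(bayes_usable(**amavis_db))  # False: only 49 spam learned
print(bayes_usable(**user_db))    # True
```

Under these assumptions the database that amavisd-new actually reads is the one that is still short of spam, which matches the "only 49 spam(s)" debug line.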
Re: Is this Received header correctly formatted?
From: mouss [EMAIL PROTECTED]

Eric A. Hall wrote: Huh? The helo= stuff is inside the parentheses. Perhaps I am missing something, but your point 3 seems to conflict with your point 2.

comments are only allowed where whitespace occurs

can you give me the line number in the RFC? and even then, the original thing was: Received: from ar39.lsanca2-4.16.241.28.lsanca2.elnk.dsl.genuity.net ([4.16.241.28] helo=watson1) and here helo=watson1 is inside parens, and with whitespace (before and after the parens). or am I missing something?

It IS Microsoft. I know that for certain. That machine is sitting about 10' to the East of me at this moment. My Received: header will be in a similar format, with kittycat as the helo. These are the computer names on the local network, isolated from the outside network by a Linux firewall. I am *NOT* about to rename these machines to the incomprehensible, impossible to type from memory, and changeable name assigned to the firewall interface. I do NOT run a mail server for sending mail to the Internet on the firewall machine. I do not, at this time, intend to. If we get a static IP I might consider firing up a suitably screwed down Postfix for direct incoming and outgoing email rather than the fetchmail configuration in use at the moment.

I fully realize that Microsoft is well known to "embrace and extend", otherwise known as "screw up", common standards for their own incomprehensible reasons. (Most often it's probably some jerk genius programming it who might declare, "Gee, I didn't think of that!" An example of that is the means by which I, were I a malware author, could render your machine mysteriously unbootable in anything but safe mode, simply because Microsoft did not think of the consequences of a change they put into SP2. A product I make happened to trigger this defect. I had to find a way around it.)

Anyway, the point of this is that denying that format will deny a very large proportion of mail that is from Outlook Express users.
Personally, I don't give a fleeking furglemonk whether you do or not. I'm simply telling you what the facts of the situation are so that you can make your own determination whether you want to block email from a VERY large segment of the legitimate email crossing the net today. Then you can take responsibility for lost or rejected email for yourself. (If you have customers involved be aware this may constitute a liability situation for you personally and your company.) {^_^} Joanne PS: The actual firewall machine is imaginatively named it. If you dig in the headers enough maybe you can even figure out the internal network particulars. It is NOT going to change because somebody is needlessly particular about header formats.
Re: Is this Received header correctly formatted?
List Mail User wrote: ... Date: Thu, 17 Mar 2005 00:29:43 +0100 From: mouss [EMAIL PROTECTED] ... To: List Mail User [EMAIL PROTECTED] Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] Subject: Re: Is this Received header correctly formatted? ... List Mail User wrote: In other words, lowercase is conformant. and your first point is not correct (though all the examples do show uppercase). However, you are completely correct that the helo= is flat out wrong, why? it's inside a comment, no? but with a slight variation, and it becomes something like (watson1 [4.16.241.28]) which is not only conformant, but is the the typical behavior or both sendmail and postfix. except that here the situation is reversed. while postfix and sendmail use from heloname (client_namer [client_ip]), others such as qmail prefer from client_name ([client_ip]) (helo heloname) or other variants. Mous, You're correct about the reversal, I realized that *after* I sent the message. Also technically the area after the [client_ip] is not white space. Eric properly pointed out that the entire header field already has an assigned use already, and the comment in the definition states specifically not to use information from the HELO. To requote: TCP-info = Address-literal / ( Domain FWS Address-literal ) ; Information derived by server from TCP connection ; not client EHLO. Notice the definition does not use any specification for white space after the address literal. The single space character does not count - The notation uses that to delineate between atoms and/or tokens; There would have to be a reference to either FWS, WSP or maybe even LWSP might qualify; But since none of those atoms are part of the definition, the area after the literal and before the ')' does not qualify as white space. So the clause ([4.16.241.28] helo=watson1) seems to be clearly non-conformant. ahem. the specs provide for comments, and don't restrict comments. so whatever is in between pars is ok. 
the specs even allow silly things linke Fr(foo)om. btw, unlike what a lot of people seem to think, rfc2821 is only a standard track'. Also, the inclusion of the parenthesis seems to be incorrect for a bare literal; as far as this is in comments, there is no issue. so Receieved: from foo (whatever is here) is ok. They are only specified for the second alternative with both the Domain and Address-literals. I do agree that is it not enough of an error that mail should be refused on that basis alone, but if a server were to do so, it would be within its prerogative (and seemingly legal to do so). as far as I can see, the std allows for a lot of received stuff. the std even manages to create a notion of domain that is not compatible with a dns domain. after all, smtp has apparently been defined by sendmail
Time in the log file is incorrect?
Hi all, I just read my spamd log file and found that the time in the log is incorrect. I just sent an email to myself and here is the log: @4000423abce8189d6a1c 2005-03-18 11:34:54 [21095] snip whereas right now the time is Fri Mar 18 22:36:42 EST 2005. I have ntp installed, so that should not be the problem. Do you guys know the reason why it is incorrect? Although it is not a big issue, it may cause problems with my log analyzer. Thanks David
Spammers Target Secondary MX hosts?
Hi all, I've noticed lately that almost 90% of the emails that come in through our secondary MX host are spam. I just want to know if there's an explanation for this. My guess is that the spammers target the secondary MX host intentionally for some reason I can't understand, maybe hoping the secondary host will be configured with less care? Many thanks, Yang
Re: Spammers Target Secondary MX hosts?
I think the reason is that they think we might trust the secondary MX more than anything else and therefore let it through without checks. -- Martin Hepworth Snr Systems Administrator Solid State Logic Tel: +44 (0)1865 842300 Yang Xiao wrote: Hi all, I've been noticing it lately that almost 90% of emails come in through our secondary MX host are spams, I just want to know if there's an explanation for this, my guess is that the spammers spam the secondary MX host intentionally for some reason I can't understand, maybe hoping the secondary host will configured with less care? Many thanks, Yang
Network Tests
Hi guys, I have SpamAssassin 2.63 with Amavis installed on my box and I am now trying to enable network tests with SpamcopURI. It's working, but message delivery is very slow when network tests are enabled, so I had to disable them. Any ideas for making message delivery faster with network tests enabled? Thanks a lot, Daniel.
Re: Spammers Target Secondary MX hosts?
On Fri, 18 Mar 2005 13:48:46 +, Duncan Hill [EMAIL PROTECTED] wrote:
> On Friday 18 March 2005 13:09, Yang Xiao typed:
> > Hi all, I've been noticing it lately that almost 90% of emails come in through our secondary MX host are spams, I just want to know if there's an explanation for this, my guess is that the spammers spam the secondary MX host intentionally for some reason I can't understand, maybe hoping the secondary host will configured with less care?
> In a large number of cases, the secondary MX is not configured to know the list of valid users etc, and may be configured to pass directly to the internal mail server, bypassing protections on the primary relay.

Hm... I'd be interested to know what the percentage is for this kind of setup, just to feed my curiosity, because it totally doesn't make sense to me. It's like setting up a secondary firewall with no blocking rules; what good is it?

Yang
Re: Is this Received header correctly formatted?
... Date: Fri, 18 Mar 2005 03:40:20 +0100 From: mouss [EMAIL PROTECTED] ... Subject: Re: Is this Received header correctly formatted? ... List Mail User wrote: ... Date: Thu, 17 Mar 2005 00:29:43 +0100 From: mouss [EMAIL PROTECTED] ... To: List Mail User [EMAIL PROTECTED] Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] Subject: Re: Is this Received header correctly formatted? ... List Mail User wrote: In other words, lowercase is conformant. and your first point is not correct (though all the examples do show uppercase). However, you are completely correct that the helo= is flat out wrong, why? it's inside a comment, no? but with a slight variation, and it becomes something like (watson1 [4.16.241.28]) which is not only conformant, but is the the typical behavior or both sendmail and postfix. except that here the situation is reversed. while postfix and sendmail use from heloname (client_namer [client_ip]), others such as qmail prefer from client_name ([client_ip]) (helo heloname) or other variants. Mous, You're correct about the reversal, I realized that *after* I sent the message. Also technically the area after the [client_ip] is not white space. Eric properly pointed out that the entire header field already has an assigned use already, and the comment in the definition states specifically not to use information from the HELO. To requote: TCP-info = Address-literal / ( Domain FWS Address-literal ) ; Information derived by server from TCP connection ; not client EHLO. that says what should be inside, not in comments. or are you meaning that qmail's: Received: (from the network ...); ... is illegal? you might, but you'd better come with real arguments. 
[end of history - start of actual response]

Actually, I have to admit that without checking I usually just assume qmail is wrong ;) But in this case the "Received: (from the network ...)" form is conformant (actually, all the examples from qmail that a quick check showed were of the form "(invoked ...", but the argument is the same), because the format for a Received: line is defined by:

RFC 2822 Section 3.6.7
...
received = "Received:" name-val-list ";" date-time CRLF
name-val-list = [CFWS] [name-val-pair *(CFWS name-val-pair)]
...

And the CFWS is exactly what Eric pointed to before as the case where comments are allowed. What you seem to be missing is that a space in the BNF is *not* white space, but just a delimiter. You need to check what is in RFC 2234, as specifically mentioned in RFC2822 Section 1.2.2. What is white space is always denoted as one of WSP or LWSP (RFC2234 Sections 4 and 6.1). RFC2822 Section 3.2.3 introduces FWS and CFWS for the purposes of that document. Comments are allowed in headers wherever CFWS is used in the BNF - the definition of a comment (for RFC2822) is given as:

RFC 2822 Section 3.2.3
...
There are several places in this standard where comments and FWS may be freely inserted. To accommodate that syntax, an additional token for "CFWS" is defined for places where comments and/or FWS can occur. However, where CFWS occurs in this standard, it MUST NOT be inserted in such a way that any line of a folded header field is made up entirely of WSP characters and nothing else.

FWS = ([*WSP CRLF] 1*WSP) / ; Folding white space
      obs-FWS
...
comment = "(" *([FWS] ccontent) [FWS] ")"
CFWS = *([FWS] comment) (([FWS] comment) / FWS)
...

Note that when a comment appears as part of CFWS it is required to have parentheses around it - so, again, the helo=watson1 clause which started all of this mess is not valid.
It does seem that a line containing (helo=watson1) [4.16.241.28] would be legal, but would seem to be violating the spirit of the law which says, (paraphrased) data not derived from EHLO. Note, the parenthesis are required around comments (the BNF specifies them as quoted literal characters as shown above). Anyway the qmail case would be parsed as a received line with a perfectly legal comment at the beginning of an otherwise empty name-val-list and a required date-time at the end. Certainly not optimal, but legal (I would want to see a name-val-list containing at least one name-val-pair, as it is of more interest than the comment of invoked ... in some fashion). [more thread history below] Notice the definition does not use any specification for white space after the address literal. The single space character does not count - The notation uses that to delineate between atoms and/or tokens; There would have to be a reference to either FWS, WSP or maybe even LWSP might qualify; But since none of those atoms are part of the definition, the area after the literal and before the ')' does not qualify as white space. So the clause ([4.16.241.28] helo=watson1) seems to be clearly non-conformant. Also,
Re: Spammers Target Secondary MX hosts?
Yang Xiao wrote on Fri, 18 Mar 2005 08:09:24 -0500: I've been noticing it lately that almost 90% of emails come in through our secondary MX host are spams, I just want to know if there's an explanation for this, my guess is that the spammers spam the secondary MX host intentionally for some reason I can't understand, maybe hoping the secondary host will configured with less care? Yes, that seems to be the idea. Kai -- Kai Schätzl, Berlin, Germany Get your web at Conactive Internet Services: http://www.conactive.com IE-Center: http://ie5.de http://msie.winware.org
RE: Please help with subject rule
From: Roman Serbski [mailto:[EMAIL PROTECTED] Dear all, Could you please help me with one SA subject rule that sometimes works and sometimes doesn't. SpamAssassin 3.0.2 with qmail-scanner 1.25st. Everything works like a charm but we receive a lot of spam messages from yahoo.com group with [expoforum_kg] subject. I created a rule in 20_head_tests.cf to score all messages containing [expoforum_kg] in a subject. I know I shouldn't use global cf rules but I was just testing. 20_head_tests.cf: header EXPO_SUCKERS Subject =~ /\b(?:[a-z]([-_. =~\/:,[EMAIL PROTECTED]+;\\'\\])\1{0,2}){4,}/i describe EXPO_SUCKERS Subject: contains [expoforum_kg] This is an example of successful detection: subj='[expoforum_kg] A D V E R T I S E - TO - M I L L I O N S' This is an example of unsuccessful detection: subj='[expoforum_kg] Paid ontime 50% profit' The problem is that your rule is matching the expanded text seen in the first subject rather than the '[expoforum_kg]' that you seem to expect. Try this rule instead: header EXPO_SUCKERS Subject =~ /\b\[expoforum_kg\]\b/i Bowie
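One caveat about the suggested rule: \b only matches between a word character and a non-word character, so putting it directly outside the square brackets may stop the rule from firing when the tag starts the subject or is followed by a space. A small Python sketch of the difference, using the two subjects from this thread; it assumes Python's re module treats \b the same way Perl does for these ASCII strings, and the variable names are just for illustration.

```python
import re

subjects = [
    "[expoforum_kg] A D V E R T I S E - TO - M I L L I O N S",
    "[expoforum_kg] Paid ontime 50% profit",
]

# Pattern as suggested above, with word boundaries around the brackets.
with_boundaries = re.compile(r"\b\[expoforum_kg\]\b", re.IGNORECASE)
# Same literal tag without the boundaries.
literal_only = re.compile(r"\[expoforum_kg\]", re.IGNORECASE)

for subject in subjects:
    print(
        bool(with_boundaries.search(subject)),
        bool(literal_only.search(subject)),
    )
```

If this holds for SpamAssassin's Perl regexes as well, a rule body of /\[expoforum_kg\]/i would be the safer form; running spamassassin --lint and testing against a sample message should confirm it.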
Re: Please help with subject rule
At 08:58 PM 3/17/2005, you wrote: Everything works like a charm but we receive a lot of spam messages from yahoo.com group with [expoforum_kg] subject. I created a rule in 20_head_tests.cf to score all messages containing [expoforum_kg] in a subject. I know I shouldn't use global cf rules but I was just testing. Unless I'm missing the point... [EMAIL PROTECTED] would be a much better solution. :) Evan
Gray's rules?
I just came across a mention of these rules in another post. I am already using quite a few of the SARE rules and am wondering whether it would be useful to add these to my server. Has anyone done any mass checks on these rules? If they will increase spam detection, I'd love to add them in, but I don't want to significantly increase my false positive rate (which is near zero at the moment). Gray's rules (http://files.grayonline.id.au) Thanks, Bowie
Re: Whitelist Question
I am not sure if there is a whitelist_subject, but an allow rule would accomplish the same thing:

header SUBJ_ALLOW_RULE_1 Subject =~ /words go here/i
describe SUBJ_ALLOW_RULE_1 Subject ALLOW Rule for words go here
score SUBJ_ALLOW_RULE_1 -15.0

--Bryan

Timothy Richter [EMAIL PROTECTED] 03/17/05 03:51PM
> Good Afternoon, I have made whitelist_from exceptions and whitelist_to exceptions. Is it possible to make an exception in the whitelist file by subject? I am guessing it would be whitelist_subject. Thanks, Tim
Re: Spammers Target Secondary MX hosts?
A secondary MX host will get mostly spam. Mailers that follow the rules will use the MX records as they were intended. Spammers scan all hosts for port 25 and send email through them any way they can. You can put a machine on the Internet without any MX records and spam will start flowing through it. It usually does not take them very long to discover a mail server. The upside is that the spam can be used for testing new versions of SpamAssassin. :) On Fri, 18 Mar 2005 08:09:24 -0500, Yang Xiao [EMAIL PROTECTED] wrote: Hi all, I've been noticing it lately that almost 90% of emails come in through our secondary MX host are spams, I just want to know if there's an explanation for this, my guess is that the spammers spam the secondary MX host intentionally for some reason I can't understand, maybe hoping the secondary host will configured with less care? Many thanks, Yang
Re: Is this Received header correctly formatted?
... Date: Thu, 17 Mar 2005 00:29:43 +0100 From: mouss [EMAIL PROTECTED] ... To: List Mail User [EMAIL PROTECTED] Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED] Subject: Re: Is this Received header correctly formatted? ... List Mail User wrote: In other words, lowercase is conformant. and your first point is not correct (though all the examples do show uppercase). However, you are completely correct that the helo= is flat out wrong, why? it's inside a comment, no? but with a slight variation, and it becomes something like (watson1 [4.16.241.28]) which is not only conformant, but is the the typical behavior or both sendmail and postfix. except that here the situation is reversed. while postfix and sendmail use from heloname (client_namer [client_ip]), others such as qmail prefer from client_name ([client_ip]) (helo heloname) or other variants. Mous, You're correct about the reversal, I realized that *after* I sent the message. Also technically the area after the [client_ip] is not white space. Eric properly pointed out that the entire header field already has an assigned use already, and the comment in the definition states specifically not to use information from the HELO. To requote: TCP-info = Address-literal / ( Domain FWS Address-literal ) ; Information derived by server from TCP connection ; not client EHLO. Notice the definition does not use any specification for white space after the address literal. The single space character does not count - The notation uses that to delineate between atoms and/or tokens; There would have to be a reference to either FWS, WSP or maybe even LWSP might qualify; But since none of those atoms are part of the definition, the area after the literal and before the ')' does not qualify as white space. So the clause ([4.16.241.28] helo=watson1) seems to be clearly non-conformant. ahem. the specs provide for comments, and don't restrict comments. so whatever is in between pars is ok. 
the specs even allow silly things linke Fr(foo)om. btw, unlike what a lot of people seem to think, rfc2821 is only a standard track'. I've made this argument myself, but it has been upgraded to Best Practices. Also, your Fr(foo)om case is not allowed, because as you can read below a comment is to be parsed as if it were a single space character, so your example would parse to Fr om which is meaningless. Anyway, let's go back to RFC822 which is a Standard and still stands depite the intentions for 2822 to replace it. To quote the `old' restriction on comments: RFC822 Section 3.4.3 3.4.3. COMMENTS A comment is a set of ASCII characters, which is enclosed in matching parentheses and which is not within a quoted-string The comment construct permits message originators to add text which will be useful for human readers, but which will be ignored by the formal semantics. Comments should be retained while the message is subject to interpretation according to this standard. However, comments must NOT be included in other cases, such as during protocol exchanges with mail servers. Comments nest, so that if an unquoted left parenthesis occurs in a comment string, there must also be a matching right parenthesis. When a comment acts as the delimiter between a sequence of two lexical symbols, such as two atoms, it is lex- ically equivalent with a single SPACE, for the purposes of regenerating the sequence, such as when passing the sequence onto a mail protocol server. Comments are detected as such only within field-bodies of structured fields. If a comment is to be folded onto multiple lines, then the syntax for folding must be adhered to. (See the Lexical Analysis of Messages section on Folding Long Header Fields above, and the section on Case Independence below.) Note that the official semantics therefore do not see any unquoted CRLFs that are in comments, although particular pars- ing programs may wish to note their presence. 
For these pro- grams, it would be reasonable to interpret a CRLF LWSP-char as being a CRLF that is part of the comment; i.e., the CRLF is kept and the LWSP-char is discarded. Quoted CRLFs (i.e., a backslash followed by a CR followed by a LF) still must be followed by at least one LWSP-char. and RFC822 Section 3.4.6 3.4.6. BRACKETING CHARACTERS There is one type of bracket which must occur in matched pairs and may have pairs nested within each other: o Parentheses (( and )) are used to indicate com- ments. ... So even in RFC822, comments require parenthesis. Also, the inclusion of the
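The comment rule quoted from RFC 822 section 3.4.3 above (parenthesised, nestable, not recognised inside a quoted-string, and lexically equivalent to a single SPACE) can be sketched in a few lines of Python. This is only an illustration of that one rule, not a full header parser; the function name is hypothetical.

```python
def strip_comments(field_body: str) -> str:
    """Replace each top-level (possibly nested) comment with one space."""
    out = []
    depth = 0          # current nesting depth of parentheses
    in_quotes = False  # inside a quoted-string, where "(" is literal
    i = 0
    while i < len(field_body):
        ch = field_body[i]
        if ch == "\\" and i + 1 < len(field_body):
            # quoted-pair: keep it outside comments, drop it inside them
            if depth == 0:
                out.append(field_body[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            out.append(ch)
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and ch == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                out.append(" ")  # the whole comment counts as one SPACE
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


print(repr(strip_comments("Fr(foo)om")))  # 'Fr om'
```

This is the point made about Fr(foo)om: once the comment is treated as a space, what remains is "Fr om", which is not a header field name.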
Re: Time in the log file is incorrect?
At 06:37 AM 3/18/2005, David Suen wrote: Hi all, I just read my spamd log file and I found that the time in the log is incorrect. I just sent an email to myself and here is the log: @4000423abce8189d6a1c 2005-03-18 11:34:54 [21095] snip whereas right now the time is Fri Mar 18 22:36:42 EST 2005. Given that your time zone is GMT +11, and the difference between those two times is 11 hours, I'd check the server in question and make sure that /etc/localtime is in fact the correct timezone, and not GMT.
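The 11-hour figure can be checked with a little arithmetic. A short Python sketch using only the two times quoted above; it simply subtracts the log timestamp from the wall-clock time and does not interpret the @4000... label at all.

```python
from datetime import datetime

log_time = datetime(2005, 3, 18, 11, 34, 54)    # time written in the spamd log
wall_clock = datetime(2005, 3, 18, 22, 36, 42)  # local time reported by David

offset = wall_clock - log_time
print(offset)  # 11:01:48

# Rounded to whole hours, this matches a GMT+11 zone offset; the extra
# minute or so is presumably just the delay before the clock was checked.
rounded_hours = round(offset.total_seconds() / 3600)
print(rounded_hours)  # 11
```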
DCC License Change
Has anyone been following the DCC license change thread on the DCC mailing list? Is anyone going to be negatively affected by it? I run a small mail server for my own small business, so I don't imagine that it will affect me. Does anyone have any opinions on the licensing change? Thomas
Re: SPAM/HAM folder
Norman Zhang wrote: On my SA Gateway, I have no local box except root. Should I forward HAM/SPAM to local box? Mail are not meant for local delivery here. I assume you mean for Bayesian training. In that case, you can't use forwarded mail for that, as Bayesian training depends on having the original message intact. If you try and train on forwarded messages, your Bayes database will get real ugly real quick. We use an Exchange public folder that gets messages dragged to it, and a Perl script on the Exim gateway box that grabs messages from the public folder via IMAP and trains on them. It's not a perfect system, as users have to figure out how to drag and drop the messages into the public folder, plus Exchange will strip some headers out and add some of its own when you access a message through IMAP, but it's better than nothing. Steven -- Steven Dickenson [EMAIL PROTECTED] http://www.mrchuckles.net
Re: Spammers Target Secondary MX hosts?
...on Fri, Mar 18, 2005 at 08:52:23AM -0500, Yang Xiao wrote:
> On Fri, 18 Mar 2005 13:48:46 +, Duncan Hill [EMAIL PROTECTED] wrote:
> > In a large number of cases, the secondary MX is not configured to know the list of valid users etc, and may be configured to pass directly to the internal mail server, bypassing protections on the primary relay.
> hm...I'd be interested to know what's the percentage is like for this kind of settings just to feed my curiousity, because it totally doesn't make sense to me , it's like settings up a secondary firewall with no blocking rules, what good is it?

It surely doesn't make sense if the secondary MX is under your control, but there are many setups where the ISP or someone else runs a backup MX for its customers' domains as a service. With this configuration, the secondary MX will usually not know about valid users in the destination domain. Therefore it makes sense for the spammers to deliver mail to the secondary MX, as they can always claim that 100% of the mails have been successfully delivered.

Alex.
Re: Spammers Target Secondary MX hosts?
On Friday 18 March 2005 08:17, Alexander Bochmann wrote: ...on Fri, Mar 18, 2005 at 08:52:23AM -0500, Yang Xiao wrote: On Fri, 18 Mar 2005 13:48:46 +, Duncan Hill [EMAIL PROTECTED] wrote: In a large number of cases, the secondary MX is not configured to know the list of valid users etc, and may be configured to pass directly to the internal mail server, bypassing protections on the primary relay. hm...I'd be interested to know what's the percentage is like for this kind of settings just to feed my curiousity, because it totally doesn't make sense to me , it's like settings up a secondary firewall with no blocking rules, what good is it? It shurely doesn't make sense if the secondary MX is under your control, but there are many setups where the ISP or someone else runs a backup MX for his customer's domains as a service. With this configuration, the secondary MX will usually not know about valid users in the destination domain. Therefore it makes sense for the spammers to deliver mail to the secondary MX, as they can always claim that 100% of the mails have been successfully delivered. Alex. That, in fact, is the setup that I am operating and, yes, most of what comes through my secondary MX, at my ISP, is SPAM. Some time ago I implemented a rule that adds a (small) spam score for mail received via my secondary MX. -- Larry G. Starr - [EMAIL PROTECTED] or [EMAIL PROTECTED] Software Engineer: Full Compass Systems LTD. Phone: 608-831-7330 x 1347 FAX: 608-831-6330 === There are only three sports: bullfighting, mountaineering and motor racing, all the rest are merely games! - Ernest Hemmingway
hits -,
Hello, My spam filter (postfix/amavisd/sa/clamav) has been working well for 3 + months now but several days ago I started getting reports of increased spam getting through. I checked the mail logs and see some messages are scored and others are listed with hits -,. It seems like the legitimate email is getting scored while the spam is not being scored. Does this sound familiar to anyone? Any help would be appreciated greatly. I can't figure out what is going on. Thanks, Andy
Re: Network Tests
Daniel A. de Araujo wrote:
> Hi guys, I have the Spam Assassin 2.63 with Amavis installed in my box and now I am trying to enable network tests with SpamcopURI. Its working but the delivery of the messages is very slow when network tests are enabled, so I´d to disable it. Any ideas to make the deliver of messages faster with network tests enabled ?

First a warning: DO NOT run SA 2.63 on a production server. Upgrade to 2.64 or 3.x, because 2.63 has a MIME parsing bug that can be used to DoS your server.

As for speed:
1) Run a caching nameserver on the same box as SA.
2) Run a local mirror of some of the RBLs that you can get rsync access to the zone files for.
3) Experiment to see which specific network tests are slow by setting their score to 0 one at a time.
4) If you use DCC, run a local server if you've got a high volume of messages.
Re: Spammers Target Secondary MX hosts?
--On Friday, March 18, 2005 3:17 PM +0100 Alexander Bochmann [EMAIL PROTECTED] wrote:
> It shurely doesn't make sense if the secondary MX is under your control, but there are many setups where the ISP or someone else runs a backup MX for his customer's domains as a service. With this configuration, the secondary MX will usually not know about valid users in the destination domain. Therefore it makes sense for the spammers to deliver mail to the secondary MX, as they can always claim that 100% of the mails have been successfully delivered.

One possibility is to list your primary again as the tertiary, possibly under a different name and/or IP address. Spammers that deliver in reverse MX order will still end up trying to deliver to your primary first.

You could also list a bogus server in IP dark space (i.e. an address known to have no listening server) so that the spammer must try the empty address first. Even better is when there's a host there that drops packets to port 25 (no TCP reset or ICMP port unreachable reply), so that the spammer must time out the TCP connection attempt.
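The "primary listed again as tertiary" idea can be illustrated with a few lines of Python. This is only a sketch of the ordering logic: the host names and preference values are hypothetical, and no DNS lookups are performed.

```python
# Hypothetical MX set: (preference, host). Lower preference is tried first
# by well-behaved mailers.
mx_records = [
    (10, "mx1.example.com"),        # primary
    (20, "backup.example.net"),     # secondary, e.g. run by an ISP
    (30, "mx1-alias.example.com"),  # the primary again, under another name
]


def delivery_order(records, reverse=False):
    """Return hosts in the order a sender would try them."""
    return [host for _pref, host in sorted(records, reverse=reverse)]


print(delivery_order(mx_records))
# ['mx1.example.com', 'backup.example.net', 'mx1-alias.example.com']

print(delivery_order(mx_records, reverse=True))
# ['mx1-alias.example.com', 'backup.example.net', 'mx1.example.com']
```

Either way the first host tried is the primary machine, so a sender working from the highest preference value downwards no longer reaches the less protected backup first.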
Re: DCC License Change
Thomas Cameron wrote: Has anyone been following the DCC license change thread on the DCC mailing list? Is anyone going to be negatively affected by it? I run a small mail server for my own small business, so I don't imagine that it will affect me. Does anyone have any opinions on the licensing change? Well, I can give you my opinions, but they mean nothing whatsoever. I think Vernon's opinions matter much more: http://www.rhyolite.com/pipermail/dcc/2005/002575.html Basically the primary target is those specifically selling managed services and appliances. In general the date-based archive is here, for reference: http://www.rhyolite.com/pipermail/dcc/2005/date.html
Re: DCC License Change
On Fri, Mar 18, 2005 at 11:54:26AM -0500, Matt Kettler wrote: imagine that it will affect me. Does anyone have any opinions on the licensing change? Basically the primary target is those specifically selling managed services and appliances. This was the first I've heard of a license change, but it means that DCC will have to be disabled by default in SA, for the same reason as Razor. -- Randomly Generated Tagline: Disk storage does not only come in 3.5-or-5.25-inch squares. A third type of storage medium-the CD-ROM-is spherical. - PC Novice
Re: Spammers Target Secondary MX hosts?
Larry Starr wrote:

> On Friday 18 March 2005 08:17, Alexander Bochmann wrote:
>> there are many setups where the ISP or someone else runs a backup MX for his customer's domains as a service. With this configuration, the secondary MX will usually not know about valid users in the destination domain.
>
> That, in fact, is the setup that I am operating and, yes, most of what comes through my secondary MX, at my ISP, is SPAM. Some time ago I implemented a rule that adds a (small) spam score for mail received via my secondary MX.

I'm on the flip side of that: we provide secondary MX services for some of our customers, and I've started adding a small bonus score for mail being sent *to* them through our server. I've also added meta-rules to treat certain rules more harshly.

The really annoying thing, from our standpoint, is the backscatter we have to process:

1. Spammer sends to secondary MX (us).
2. We filter out some of the more obvious spam (for the most part using our regular criteria).
3. We relay what's left to the primary MX.
4. Primary MX rejects mail to nonexistent users and mail that trips their own spam filters.
5. We generate DSNs that go to third parties or nonexistent hosts, contributing to backscatter and cluttering up our outbound queue.

The backscatter becomes a real problem in the legitimate relay situation, because it's basically unavoidable. If the spam is sent directly to you, you can accept it, discard it, or reject it, and it stops. But if you're relaying to someone, and *they* reject it, now you have to decide whether to generate a DSN or not. We've actually set up a separate queue for bounces that aren't delivered immediately, so that it won't bog down normal mail.

-- 
Kelson Vibber
SpeedGate Communications www.speed.net
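The "generate a DSN or not" decision in step 5 could be sketched roughly as below. This is not the setup described above; the score threshold, the field names, and the idea of using the relay's own spam score to suppress a bounce are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RelayedMessage:
    sender: str         # envelope sender; "" stands for the null sender of a bounce
    spam_score: float   # score our own filter gave the message (hypothetical field)
    primary_reply: int  # SMTP reply code returned by the primary MX


# Assumed threshold: at or above this score we treat the sender as likely forged.
SUPPRESS_DSN_SCORE = 3.0


def handle_primary_reply(msg: RelayedMessage) -> str:
    """Return one of 'delivered', 'retry', 'dsn' or 'discard'."""
    if 200 <= msg.primary_reply < 300:
        return "delivered"
    if 400 <= msg.primary_reply < 500:
        return "retry"    # temporary failure: keep the message queued
    # Anything else is treated as a permanent rejection by the primary.
    if not msg.sender:
        return "discard"  # never bounce a bounce
    if msg.spam_score >= SUPPRESS_DSN_SCORE:
        return "discard"  # a DSN here would most likely be backscatter
    return "dsn"
```

Messages that come back as "dsn" are the ones that could be handed to a separate bounce queue so they don't hold up normal mail.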
Re: Spammers Target Secondary MX hosts?
On Fri, Mar 18, 2005 at 10:24:25AM -0800, Kelson wrote:

> ... 5. We generate DSNs that go to third parties or nonexistent hosts, contributing to backscatter and cluttering up our outbound queue. ...

Even worse, the result of the bounces sent by _our_ MTA was getting Spamcop-RBLed for hitting spamtraps with those bounces! So being a secondary MX might even disrupt your (own) service, and only the second queue you mentioned might have helped against that! But we don't have THAT yet.

Stucki (bounce-annoyed postmaster)

-- 
Christoph von Stuckrad (nickname 'stucki', if online on IRCnet) [EMAIL PROTECTED]
Freie Universitaet Berlin, Fachbereich Mathematik, EDV
Arnimallee 2-6, 14195 Berlin
Tel(days): +49 30 838-75 459, Tel(else): +49 30 77 39 6600, Fax(alle): +49 30 838-75454
RE: Spammers Target Secondary MX hosts?
Kelson wrote:

> The backscatter becomes a real problem in the legitimate relay situation, because it's basically unavoidable. If the spam is sent directly to you, you can accept it, discard it, or reject it, and it stops. But if you're relaying to someone, and *they* reject it, now you have to decide whether to generate a DSN or not. We've actually set up a separate queue for bounces that aren't delivered immediately, so that it won't bog down normal mail.
Two solutions occur to me:

1) Allow a way for the secondary MX to tell whether the primary MX is up; if it is, don't accept any connections.
2) Allow a way for the secondary MX to tell what email addresses on the primary MX are valid (LDAP occurs to me).

Matthew.van.Eerde (at) hbinc.com 805.964.4554 x902
Hispanic Business Inc./HireDiversity.com
Software Engineer
perl -emap{y/a-z/l-za-k/;print}shift Jjhi pcdiwtg Ptga wprztg,
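Both ideas can be combined in a small policy function, sketched here under some loud assumptions: the "is the primary up" check is just a TCP connect to port 25, and a plain set of addresses stands in for whatever directory (LDAP or otherwise) the secondary would really query. The function names and return labels are invented for the example.

```python
import socket


def primary_is_up(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to the primary MX can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def secondary_decision(recipient: str, valid_recipients: set, primary_up: bool) -> str:
    """Decide what the secondary MX does with one recipient.

    valid_recipients is a set of lowercase addresses, a stand-in for a
    directory lookup. Returns 'refuse', 'reject' or 'accept'.
    """
    if primary_up:
        return "refuse"   # idea 1: primary is reachable, don't take the mail
    if recipient.lower() not in valid_recipients:
        return "reject"   # idea 2: unknown user, reject at SMTP time
    return "accept"       # queue the mail for the primary
```

Rejecting unknown recipients during the SMTP session means the sending side has to deal with the failure, so the secondary never needs to generate a DSN for them.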
Re: DCC License Change
Theo Van Dinter writes:

> On Fri, Mar 18, 2005 at 11:54:26AM -0500, Matt Kettler wrote:
>>> imagine that it will affect me. Does anyone have any opinions on the licensing change?
>> Basically the primary target is those specifically selling managed services and appliances.
>
> This was the first I've heard of a license change, but it means that DCC will have to be disabled by default in SA, for the same reason as Razor.

Well, I guess this gives us a good reason to finally get around to writing our own hashing subsystem...

--j.
Re: Spammers Target Secondary MX hosts?
...on Fri, Mar 18, 2005 at 10:24:25AM -0800, Kelson wrote:

> The backscatter becomes a real problem in the legitimate relay situation, because it's basically unavoidable. If the spam is sent directly to you, you can accept it, discard it, or reject it, and it stops. But if you're relaying to someone, and *they* reject it, now you have to decide whether to generate a DSN or not. We've actually set up

When I was in that situation, my solution turned out to be milter-ahead, http://www.milter.info/milter-ahead/index.shtml but that won't help you if you're not running sendmail :)

Alex.
Re: Spammers Target Secondary MX hosts?
--On Friday, March 18, 2005 10:24 AM -0800 Kelson [EMAIL PROTECTED] wrote:

> But if you're relaying to someone, and *they* reject it, now you have to decide whether to generate a DSN or not.

Using MIMEDefang I don't reject mail relayed from my secondary: http://www.mimedefang.org/kwiki/index.cgi?CheckForMX
Re: Spammers Target Secondary MX hosts?
...
| One possibility is to list your primary again as the tertiary, possibly
| under a different name and/or IP address. Spammers that deliver in reverse
| MX order will still end up trying to deliver to your primary first.

I tried this and it resulted in mail loops when one of the servers was down. I like the suggestion below better.

| You could also list a bogus server in IP dark space (ie. an address known
| to have no listening server) so that the spammer must first check the empty
| address first. Even better is when there's a host there that drops packets
| (no TCP reset or ICMP port unreachable reply) to port 25, so that the
| spammer must time out the TCP connection attempt.

Be very careful if the dark space is not under your control. Using a reserved address will get you an rfci listing; using somebody else's address in the US is fraud (of course IANAL). If you do have the space, the best thing is probably to set up a *very* slow server that always gives a 4xx at the end of the conversation and preferably does greylisting too (look at the program from OpenBSD or NetBSD, unfortunately also called spamd, which is part of pf).

Paul Shupak [EMAIL PROTECTED]
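For what "a very slow server that always gives a 4xx at the end of the conversation" might look like, here is a toy sketch of just the reply logic. It is not how the OpenBSD spamd works; the class name, the delay handling, the host name, and the reply texts are assumptions, and no network code or greylisting is included.

```python
import time


class Tarpit:
    """Toy SMTP reply logic: accept every command slowly, then
    temp-fail the message once the body has been sent."""

    def __init__(self, delay: float = 0.0):
        self.delay = delay    # seconds to stall before each reply (assumed knob)
        self.in_data = False  # True while the client is sending the body

    def reply(self, line: str):
        """Return the reply for one line from the client, or None if
        the line is part of the message body and gets no reply."""
        time.sleep(self.delay)
        if self.in_data:
            if line.rstrip("\r\n") == ".":
                self.in_data = False
                return "451 temporary failure, try again later"
            return None
        verb = line.strip().split(" ", 1)[0].upper()
        if verb in ("HELO", "EHLO"):
            return "250 tarpit.example.invalid"
        if verb in ("MAIL", "RCPT"):
            return "250 ok"
        if verb == "DATA":
            self.in_data = True
            return "354 end data with <CR><LF>.<CR><LF>"
        if verb == "QUIT":
            return "221 bye"
        return "500 command not recognized"
```

With a non-zero delay every step of the conversation costs the sender time, and the final 451 means nothing is ever accepted, so no DSN can result from it.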