Re: Clearly bogus false positives -- on abuse contact point, no less

2008-02-17 Thread Philip Prindeville

Matt Kettler wrote:

Philip Prindeville wrote:

Karsten Bräckelmann wrote:

Please, do not paste a gigantic blob of multipart MIME messages. Put it
up somewhere, raw, and simply provide a link.


On Sat, 2008-02-16 at 18:44 -0800, Philip Prindeville wrote:
 
Anyway, I have no idea why I'm seeing some of these scores.  URL 
matches when there aren't even URL's in my message?



There are. Self-inflicted. The ones in square brackets with the leading
550 code, which you seem to keep sending back and forth. :)
  


And just *mentioning* the domain name, without any sort of valid URL 
(ftp: or http: or anything of the sort) is going to match it as a 
URL?  That's highly bogus.


A domain name alone does not a URL make.
You tell that to most Windows-based clients, which will automatically 
make clickable URLs out of things like www.google.com in text sections.


snip



Oh, and DNS_FROM_OPENWHOIS probably is http://open-whois.org/, which
gives you a hint about what it actually is. The hit itself pretty much
mentions this...
  


Yeah, I read this.  And I don't get that either.

How does having your domain be anonymous (for whatever reason... 
maybe you're a small company operating below the radar) make your 
email any more likely to be spam?
Decidedly so. The people with the strongest reason to hide their 
contact information are the spammers, and other shady businesses.


That's not to say there aren't some legitimate folks that use this 
kind of anonymization.  However, the domains by proxy model is a 
questionable practice, as it violates the spirit of the whois 
requirements. Also, many of them violate the letter of the 
requirements, such as the phone issue noted on the open-whois main 
page. (i.e., anyone registered using securewhois is not correctly 
registered, per ICANN requirements for whois)


Well, what's ironic here is this:

I go to the open-whois web-site, and read their blurb:

What do you have against privacy?

In a word: nothing. This is not about privacy, but about 
accountability. The Internet is built upon cooperation and 
accountability, anything which undermines accountability is a bad thing. 
The usability of the WHOIS database is seriously undermined by anonymous 
domains.


Ah...  But filtering your spam reports so no one can ever report spam to 
you... that's a lot more accountable, clearly.  :-)







TVD_STOCK1?  There's no mention of stock anywhere in the message.




Not sure, you might want to try running it with debugging on.
The debug message from the code would be:

 dbg("eval: stock info hit: $1");

That should tell you what exact substring matched the stock info code.


From a quick glimpse of the code, it appears to identify common words
used in stock (as in stock exchange, pump-n-dump penny stocks) spam. It
does not search for the word stock. Just as pretty much no rule in SA
ever searches for single words only...
  


Again, I didn't see anything that should legitimately be causing this 
rule to fire, and certainly not with such a high score for such an 
unreliable rule.






Why am I seeing all of these bogus matches?



From what I can tell, and what you sent us, they don't appear to be
bogus.
  


Depends on whether you equate bare domains with URL's, I suppose.
If MUA's equate them with URLs, spammers will use this, and 
SpamAssassin will use it.


There is only so much braindeath in UA's that you can bend the rules 
for.  Clearly, this involves breaking them.







I looked on the wiki for some of these, but couldn't find 
descriptions.


What should I do?  Just block their domain?  I don't want to deal with
their misconfiguration issues.



Apparently you already exchanged messages? Try not sending the offensive
mail in question. Put it up somewhere as reference, if need be. Hmm,
sounds familiar... ;)

  guenther


  


No, I sent them back the offending email, initially.  Which they 
marked as spam (bloody brilliant, of course it's spam, otherwise I 
wouldn't be bothering to report it... what else do they expect to 
come to their Abuse mailbox, anyway???).


So I sent the SA scores back to them, and that's the part that I 
pasted previously.


How do you report Spam to such a site that's going to block your Spam 
reports for being... well, Spam!
Well, it's stupid, and probably an RFC violation to perform such 
filtering on your abuse box. So, I'm not saying the domain in question 
isn't behaving foolishly. You might want to point this out to them, 
and suggest they whitelist their abuse address. At the very least, ask 
them if they have an alternate reporting address that isn't filtered.




I'll give it another try.  If not, their CIDR range and domain name will 
go into my blacklist.  I don't want to open myself up to them if I can't 
reasonably expect them to respond to spam issues when/if they occur (again).


-Philip





Re: SVN notifications killing spamassassin

2008-02-17 Thread Philip Prindeville

Eric A. Hall wrote:

I sometimes get SVN notifications that contain lists of files and their
status. The filenames will often get picked up by the URI matching
algorithm, each of which end up being processed through numerous lookups
(URICOUNTRY, my LDAP filter, etc). Sometimes I get very large messages
with hundreds of file lists, which in turn causes spamassassin to go into
never-never land while it thinks about the hundreds of URI matches.

For example,

  A fpo/reports/perl/nagios_notifications1.pl.bak
  A foo/reports/perl/nagios_outages1.pl
  A foo/reports/perl/GWIR.pm

nagios_outages1.pl will be determined as a URI for .pl domain and GWIR.pm
will be determined as a URI for .pm domain, and so forth. The only way to
get these messages through is to disable spamassassin...

I've updated to 3.2.4 just now and it still has the same problem

I'm guessing the URI analyzer needs to be smarter.
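
To see why file names like these can look like hostnames, here is a minimal sketch in Python. It is not SpamAssassin's actual URI parser; the `TLDS` set, the `CANDIDATE` pattern, and the `naive_domains` helper are hypothetical. They only illustrate how a matcher that accepts scheme-less names ending in a known TLD would treat the last component of each path.

```python
import re

# Hypothetical, tiny TLD list; a real matcher would carry the full registry.
TLDS = {"com", "org", "net", "pl", "pm"}

# Naive pattern (an assumption for illustration): dot-separated labels,
# no scheme required, underscores tolerated.
CANDIDATE = re.compile(r"[A-Za-z0-9_-]+(?:\.[A-Za-z0-9_-]+)+")


def naive_domains(text):
    """Return tokens that end in a known TLD, as a lenient matcher might."""
    hits = []
    for token in CANDIDATE.findall(text):
        # Only the final label decides whether the token "looks like" a domain.
        if token.rsplit(".", 1)[-1].lower() in TLDS:
            hits.append(token)
    return hits


listing = """\
A fpo/reports/perl/nagios_notifications1.pl.bak
A foo/reports/perl/nagios_outages1.pl
A foo/reports/perl/GWIR.pm
"""

print(naive_domains(listing))
```

Under these assumptions the `.pl.bak` file is skipped (its last label is not a TLD), while the `.pl` and `.pm` files are each reported as a candidate domain, and every such candidate would then trigger its own lookups.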
  


That's strangely appropriate to the issue I had with calthurs.com.

It would be nice if this checker had an option to enforce checking only 
of well-formed URL's (i.e. not anything that might conceivably be 
munged into a URL by the most ignorant of UA's)... something requiring 
a protocol name (ftp:, http:, tftp:, etc.), a domain name, and a path 
name (even if it's just slash).


Or at the very least, to score complete URL's higher than just domain 
names alone.
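
A minimal sketch of that distinction, in Python: it accepts only tokens with a protocol name, a host, and a path of at least a slash, and weights anything else lower. The `SCHEMES` set, the function names, and the `full`/`bare` weights are all assumptions for illustration; this is not an existing SpamAssassin option.

```python
from urllib.parse import urlsplit

# Assumed set of accepted schemes; the message above names ftp:, http:, tftp:.
SCHEMES = {"http", "https", "ftp", "tftp"}


def is_well_formed_url(token):
    """True only for scheme://host/path, where the path is at least '/'."""
    try:
        parts = urlsplit(token)
    except ValueError:
        # urlsplit rejects some malformed inputs (e.g. bad IPv6 brackets).
        return False
    return (
        parts.scheme in SCHEMES
        and bool(parts.hostname)
        and parts.path.startswith("/")
    )


def uri_weight(token, full=1.0, bare=0.5):
    """Hypothetical weighting: complete URLs count more than bare domains."""
    return full if is_well_formed_url(token) else bare


for candidate in ("http://chalturs.com/", "chalturs.com", "www.google.com"):
    print(candidate, is_well_formed_url(candidate), uri_weight(candidate))
```

With this check a bare mention such as `chalturs.com` is not treated as a complete URL, though it could still contribute at the lower weight.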


-Philip




Re: Clearly bogus false positives -- on abuse contact point, no less

2008-02-17 Thread Philip Prindeville

Matt Kettler wrote:

Philip Prindeville wrote:

Matt Kettler wrote:

Philip Prindeville wrote:
 


Depends on whether you equate bare domains with URL's, I suppose.
If MUA's equate them with URLs, spammers will use this, and 
SpamAssassin will use it.


There is only so much braindeath in UA's that you can bend the rules 
for.  Clearly, this involves breaking them.
Erm.. What rule does this actually break? Is there a rule in an RFC 
somewhere specifying you MUST not interpret bare domains as URIs in 
text emails?


There is an RFC that defines what a URL looks like.  A bare domain 
doesn't cut it.


You want to forbid bare domains in email?  Go ahead.  You can forbid 
anything you like.


But don't call it a test for URL's, since it's clearly not.




Besides, when this braindeath is more the norm than the exception, 
it's a de facto standard. Particularly in the absence of any rules 
against it.


Yeah, I'll talk to the Outlook folks, and file a bug against 
Thunderbird... (I think the latter only does it to be compatible with 
the former...)




*EVERY* graphical MUA I've used in the past 10 years does this. 
Thunderbird, Outlook, Groupwise, Eudora, they all do it. I'm sure 
there are MUAs that don't, but there's an awful lot that do. Most 
webmails seem to do it too. Outlook web access, Comcast and Yahoo all 
do, but I'll concede that Verizon's webmail doesn't.






Clearly bogus false positives -- on abuse contact point, no less

2008-02-16 Thread Philip Prindeville
Hmmm.  I think we need a BL for reporting ISP's that are so clueless as to 
run filtering on their abuse mailbox (or the mailbox that's listed for 
their ARIN/RIPE AbuseEmail attributes).


Anyway, I have no idea why I'm seeing some of these scores.  URL matches 
when there aren't even URL's in my message?


A 2.6 score on BAYES_00?  URIBL_JP_SURBL and URIBL_OB_SURBL?  And what 
the heck is DNS_FROM_OPENWHOIS???


TVD_STOCK1?  There's no mention of stock anywhere in the message.  Why am I 
seeing all of these bogus matches?

I looked on the wiki for some of these, but couldn't find descriptions.

What should I do?  Just block their domain?  I don't want to deal with their 
misconfiguration issues.

-Philip





Received: from localhost (localhost)
by mail.redfish-solutions.com (8.14.1/8.14.1) id m1H2M5XP027602;
Sat, 16 Feb 2008 19:22:05 -0700
Date: Sat, 16 Feb 2008 19:22:05 -0700
From: Mail Delivery Subsystem [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status;
boundary=m1H2M5XP027602.1203214925/mail.redfish-solutions.com
Subject: Returned mail: see transcript for details
Auto-Submitted: auto-generated (failure)

This is a MIME-encapsulated message

--m1H2M5XP027602.1203214925/mail.redfish-solutions.com

The original message was received at Sat, 16 Feb 2008 19:22:01 -0700
from pool-71-112-32-245.sttlwa.dsl-w.verizon.net [71.112.32.245]

  - The following addresses had permanent fatal errors -
[EMAIL PROTECTED]
   (reason: 550-This email has been automatically tagged as spam)
[EMAIL PROTECTED]
   (reason: 550-This email has been automatically tagged as spam)

  - Transcript of session follows -
... while talking to alpha.inbound.mercury.spaceservers.net.:

DATA

 550-This email has been automatically tagged as spam
 550-Spam detection software, operated by UKDomains limited, has
 550-identified this incoming email as possible spam.
 550-contact [EMAIL PROTECTED] for details and error reports.
 550-pts rule name  description
 550- -- 
--
 550-1.1 DNS_FROM_OPENWHOIS RBL: Envelope sender listed in
 550-bl.open-whois.org.
 550--0.0 SPF_PASS   SPF: sender matches SPF record
 550--2.6 BAYES_00   BODY: Bayesian spam probability is 0 to 1%
 550-[score: 0.]
 550-1.5 URIBL_JP_SURBL Contains an URL listed in the JP SURBL
 550-blocklist
 550-[URIs: chalturs.com]
 550-1.5 URIBL_OB_SURBL Contains an URL listed in the OB SURBL
 550-blocklist
 550-[URIs: chalturs.com]
 550-0.5 WHOIS_DMNBYPROXY   Contains URL registered to Domains by Proxy
 550-[URIs: redfish-solutions.com]
 550 3.4 AWLAWL: From: address is in the auto white-list
554 5.0.0 Service unavailable

--m1H2M5XP027602.1203214925/mail.redfish-solutions.com
Content-Type: message/delivery-status

Reporting-MTA: dns; mail.redfish-solutions.com
Received-From-MTA: DNS; pool-71-112-32-245.sttlwa.dsl-w.verizon.net
Arrival-Date: Sat, 16 Feb 2008 19:22:01 -0700

Final-Recipient: RFC822; [EMAIL PROTECTED]
Action: failed
Status: 5.2.0
Remote-MTA: DNS; alpha.inbound.mercury.spaceservers.net
Diagnostic-Code: SMTP; 550-This email has been automatically tagged as spam
Last-Attempt-Date: Sat, 16 Feb 2008 19:22:05 -0700

Final-Recipient: RFC822; [EMAIL PROTECTED]
Action: failed
Status: 5.2.0
Remote-MTA: DNS; alpha.inbound.mercury.spaceservers.net
Diagnostic-Code: SMTP; 550-This email has been automatically tagged as spam
Last-Attempt-Date: Sat, 16 Feb 2008 19:22:05 -0700

--m1H2M5XP027602.1203214925/mail.redfish-solutions.com
Content-Type: message/rfc822

Return-Path: [EMAIL PROTECTED]
Received: from [192.168.10.120] (pool-71-112-32-245.sttlwa.dsl-w.verizon.net 
[71.112.32.245])
(authenticated bits=0)
by mail.redfish-solutions.com (8.14.1/8.14.1) with ESMTP id 
m1H2M0XQ027599
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
Sat, 16 Feb 2008 19:22:01 -0700
Message-ID: [EMAIL PROTECTED]
Date: Sat, 16 Feb 2008 18:21:27 -0800
From: Abuse Department [EMAIL PROTECTED]
User-Agent: Thunderbird 2.0.0.9 (Windows/20071031)
MIME-Version: 1.0
To: [EMAIL PROTECTED]
CC: [EMAIL PROTECTED]
Subject: Of course it's spam: it's an abuse mailbox
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.63 on 192.168.1.3

Of course it's spam.  It's a copy of an offending message (that 
originated from *your* site) being reported back to you, and to your 
abuse mailbox.


If it weren't spam, there'd hardly be a point in reporting it now, would 
there?


What other brilliant deductions are to follow?  That there are a lot of 
sick people in a hospital?


Get a clue.  Better yet, if you were as good at detecting *outbound* 
spam coming from your site as you are incoming spam, we wouldn't be 
having

Re: Clearly bogus false positives -- on abuse contact point, no less

2008-02-16 Thread Philip Prindeville

Karsten Bräckelmann wrote:

Please, do not paste a gigantic blob of multipart MIME messages. Put it
up somewhere, raw, and simply provide a link.


On Sat, 2008-02-16 at 18:44 -0800, Philip Prindeville wrote:
  
Anyway, I have no idea why I'm seeing some of these scores.  URL matches 
when there aren't even URL's in my message?



There are. Self-inflicted. The ones in square brackets with the leading
550 code, which you seem to keep sending back and forth. :)
  


And just *mentioning* the domain name, without any sort of valid URL 
(ftp: or http: or anything of the sort) is going to match it as a URL?  
That's highly bogus.


A domain name alone does not a URL make.

A 2.6 score on BAYES_00?  URIBL_JP_SURBL and URIBL_OB_SURBL?  And what 
the heck is DNS_FROM_OPENWHOIS???



Well, if you don't mind having a second look, that is MINUS 2.6 for
Bayes. What's wrong with that?
  


Oh, sorry, read over the scores too quickly.  Never mind the BAYES_00.
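
The misreading is easy to make because each report line starts with the SMTP continuation prefix `550-`, so a negative score shows up as `550--2.6`. As a rough illustration, the following Python sketch strips that prefix and reads the signed scores from the lines quoted in this thread. The `SCORE_LINE` pattern and `parse_scores` helper are assumptions, not part of SpamAssassin, and the rule name on the last line appears fused (`AWLAWL`) exactly as the archive rendered it.

```python
import re

# Report lines as they appear in the rejection quoted in this thread.
REPORT = """\
550-1.1 DNS_FROM_OPENWHOIS RBL: Envelope sender listed in
550-bl.open-whois.org.
550--0.0 SPF_PASS   SPF: sender matches SPF record
550--2.6 BAYES_00   BODY: Bayesian spam probability is 0 to 1%
550-1.5 URIBL_JP_SURBL Contains an URL listed in the JP SURBL
550-1.5 URIBL_OB_SURBL Contains an URL listed in the OB SURBL
550-0.5 WHOIS_DMNBYPROXY   Contains URL registered to Domains by Proxy
550 3.4 AWLAWL: From: address is in the auto white-list
"""

# "550-" or "550 " is the SMTP reply prefix; the score (with optional
# minus sign) and the rule name follow it.
SCORE_LINE = re.compile(r"^550[- ](-?\d+\.\d+)\s+([A-Z0-9_]+)", re.MULTILINE)


def parse_scores(report):
    """Return a list of (rule_name, score) pairs found in the report text."""
    return [(name, float(score)) for score, name in SCORE_LINE.findall(report)]


scores = parse_scores(REPORT)
for name, score in scores:
    print(f"{score:+5.1f}  {name}")
print("total:", round(sum(score for _, score in scores), 1))
```

Read this way, BAYES_00 lowers the total by 2.6 points rather than raising it.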



Regarding your SURBL questions... Yes.  Wait, you were hoping for more?
Without any actually asked question? OK, good then. The domain
chalturs.com is listed in these RBLs, as the results tell you. See
http://surbl.org/ for more.
  


I read the top-level page, but didn't see anything really pertinent.  I 
get the idea.  But naming the domain in a message, again, is not the 
same as embedding an entire URL containing the domain.  The two aren't 
equivalent.




Oh, and DNS_FROM_OPENWHOIS probably is http://open-whois.org/, which
gives you a hint about what it actually is. The hit itself pretty much
mentions this...
  


Yeah, I read this.  And I don't get that either.

How does having your domain be anonymous (for whatever reason... maybe 
you're a small company operating below the radar) make your email any 
more likely to be spam?



TVD_STOCK1?  There's no mention of stock anywhere in the message.



From a quick glimpse of the code, it appears to identify common words
used in stock (as in stock exchange, pump-n-dump penny stocks) spam. It
does not search for the word stock. Just as pretty much no rule in SA
ever searches for single words only...
  


Again, I didn't see anything that should legitimately be causing this 
rule to fire, and certainly not with such a high score for such an 
unreliable rule.




Why am I seeing all of these bogus matches?



From what I can tell, and what you sent us, they don't appear to be
bogus.
  


Depends on whether you equate bare domains with URL's, I suppose.


I looked on the wiki for some of these, but couldn't find descriptions.

What should I do?  Just block their domain?  I don't want to deal with
their misconfiguration issues.



Apparently you already exchanged messages? Try not sending the offensive
mail in question. Put it up somewhere as reference, if need be. Hmm,
sounds familiar... ;)

  guenther


  


No, I sent them back the offending email, initially.  Which they marked 
as spam (bloody brilliant, of course it's spam, otherwise I wouldn't be 
bothering to report it... what else do they expect to come to their 
Abuse mailbox, anyway???).


So I sent the SA scores back to them, and that's the part that I 
pasted previously.


How do you report Spam to such a site that's going to block your Spam 
reports for being... well, Spam!


(Yes, I'm shocked too to hear there's gambling going on in Casablanca...)



US Senate as bad internet citizens???

2007-11-13 Thread Philip Prindeville
Well, I recently called my Senator to ask him to support enhanced 
network neutrality legislation (since he worked on the 1998 
Telecommunications Bill).


I received his reply 2 days later by email.  Ok.  I found that there 
were some misconceptions he had about the topic on a purely technical 
basis, and decided to reply to him and set him straight (or more 
appropriately, set the staffer straight that had written the response on 
his behalf).


Well...

The original message was received at Tue, 13 Nov 2007 17:04:21 -0500 (EST)
from localhost [127.0.0.1]

  - The following addresses had permanent fatal errors -
[EMAIL PROTECTED]
   (reason: 550 5.1.1 User unknown)

  - Transcript of session follows -
... while talking to bridgeheadpsq.senate.gov.:


 DATA
  

 550 5.1.1 User unknown
550 5.1.1 [EMAIL PROTECTED]... User unknown
 503 5.5.2 Need Rcpt command.



So I'm wondering... if they send emails out that can't be replied to...  
doesn't that correspond to the very definition of a spammer?  Aren't 
they concealing their identity?


Sigh.

Oh, well.  I knew it was asking too much to have meaningful legislation 
on net neutrality (or digital rights, or copyright reform, etc) come 
from Washington D.C.  Perhaps in 50 years they'll finally have a handle 
on it.


But I dared to hope...

-Philip



Re: help

2007-11-13 Thread Philip Prindeville
As a heads up, more people will read your message if you make your 
Subject line more insightful.


Ironically, I contacted Kintera last Spring pointing out that I wasn't 
getting messages from one of their customers because they were sending 
malformed messages that pegged the spam-o-meter (in particular, they 
were sending broken Date: lines).


Didn't hear back.

Apparently, they've never heard of Spam, or have no interest in 
differentiating themselves from less-legitimate content.


Perhaps it's a marketing strategy to sell you more products and services 
to complement the ineffectual ones you're now using.  ;-)


-Philip



Kim Hurlbutt wrote:
Wondering if you can point me in the right direction on how to make 
our spam scores lower.  How can I get information on how to make edits 
to our pages to lower our scores?  We currently use Kintera to send 
our email newsletters.  Please help!!   Thanks
 
An example of our spam score:
 
Your spam score is: 2.9 points


Score Details:
pts rule name  description
 --
--
0.2 HTML_FONT_FACE_ODD BODY: HTML font face is not a commonly used
face
0.2 HTML_MESSAGE   BODY: HTML included in message
0.3 HTML_FONT_BIG  BODY: HTML has a big font
0.6 HTML_TABLE_THICK_BORD  BODY: HTML table has thick border
0.7 MIME_HTML_ONLY BODY: Message only has text/html MIME parts
0.7 HTML_50_60 BODY: Message is 50% to 60% HTML
0.4 FORGED_YAHOO_RCVD  'From' yahoo.com does not match 'Received'
headers


  Kim Hurlbutt
  Development
   Proctor Academy
  603.735.6218
www.proctoracademy.org




Re: It's a fine line...

2007-11-06 Thread Philip Prindeville

Olivier Nicole wrote:

It's not a matter of cultural imperialism, if that's what you're getting at.

It's an acknowledgment of the importance of the rule of law in cyberspace.



Except that I don't think it is anything close to a rule of law, but
rather a sign of short-sightedness.

As I said, I doubt you ever got any spam from my organisation (either
originated from, or relayed).
  


So, what are you saying?  One well behaved citizen obviates the need for 
laws for all others?


It doesn't work that way.


Some countries enforce anti-spam, anti-trespass laws.  Others lack them 
or don't enforce them.



The attitude goes by organisation, not by country.
  


Organizations don't make laws.  Countries do.


When these countries put some teeth into the enforcement of their laws, 
then they will stop being blacklisted.



Plus, if we were to ban the originating country for 50% of spam (not my
figure), the USA should be banned.
  


Do the math.  50% of the spam (if that is indeed the case) is very low, 
considering that the US generates a much larger percentage of the total 
Internet traffic than just half.


In any case, you might get spammed from the US, but I don't:  it would 
be too easy for me to make a complaint against the spammer and have them 
be charged, shut down, and fined.


That's effectively what laws, properly enforced, do.


But hey, that is too big a cut from the Internet, so in some way it is
cultural imperialism.

Bests,

Olivier

  


That's a fairly specious argument.

-Philip




Re: It's a fine line...

2007-11-06 Thread Philip Prindeville

Matus UHLAR - fantomas wrote:

The advice I've seen (iirc it was in the rfc-ignorant lists) was not to allow
mail to be sent to abuse and non-abuse mailboxes together, e.g. when it's sent
to the abuse mailbox, reject rcpt to: non-abuse mailboxes with a temporary error
and vice versa. The result should be that the mail is sent once to all
non-abuse mailboxes and once to the abuse mailboxes, so they can be filtered
with different rules.
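
As a rough sketch of that policy (Python, for illustration only): the first recipient fixes the class of the SMTP transaction, and any recipient of the other class gets a temporary error so that the sender retries it in a separate transaction. The `ABUSE_LOCALPARTS` set, the function names, and the reply texts are assumptions, not taken from any particular MTA.

```python
# Assumed local parts that count as "abuse" mailboxes.
ABUSE_LOCALPARTS = {"abuse"}


def rcpt_class(address):
    """Classify a recipient as 'abuse' or 'normal' by its local part."""
    local = address.split("@", 1)[0].lower()
    return "abuse" if local in ABUSE_LOCALPARTS else "normal"


def rcpt_responses(recipients):
    """Accept recipients of the first recipient's class; defer the others."""
    responses = []
    transaction_class = None
    for rcpt in recipients:
        cls = rcpt_class(rcpt)
        if transaction_class is None:
            # The first RCPT TO decides what kind of transaction this is.
            transaction_class = cls
        if cls == transaction_class:
            responses.append((rcpt, "250 OK"))
        else:
            # Temporary (4xx) failure: the sender should retry later,
            # which lands this recipient in its own transaction.
            responses.append((rcpt, "452 Please retry this recipient separately"))
    return responses


for rcpt, reply in rcpt_responses(["abuse@example.com", "noc@example.com"]):
    print(rcpt, "->", reply)
```

Each copy of the message then arrives in a transaction that contains only one class of recipient, so the abuse copy can bypass the spam filter while the other copy is filtered as usual.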

  


If only it were that easy.

The issue is that a lot of sites are ignorant and haven't filled out all 
of their ICANN required fields in their ARIN (or RIPE or APNIC or LACNIC 
or AfriNIC) registrations.  So there might be an OrgTech contact as 
[EMAIL PROTECTED]  who you Bcc: on the message, but you guess that 
there's also an abuse mailbox, and they just forgot to register it.


However, you don't want to mail to the abuse mailbox to see if it gets 
delivered, and then if it bounced, mail to the OrgTech mailbox 
instead... because that's too much wasted time...  So you To: the abuse 
mailbox on the odd chance that it exists, and you Bcc: the noc mailbox 
(or the hostmaster or whatever) as a fallback address.


-Philip



Re: How to filter messages from this list?

2007-11-06 Thread Philip Prindeville

mouss wrote:

Marcin Praczko wrote:
  

Is it possible to add some text to the Subject: line (for example [SPLIST]) to 
make it easier to set up a filter for these emails?
  



How about having the logo in png format on the subject line :)

List managers (and other software) should not alter email unless
absolutely necessary. This includes subject tagging, reply-to munging,
removal of trace headers, format conversion, ... etc. The people who
compose messages know better how their messages should look. Local
policies may override this, because local users have a chance to hang
their sysadmin ;-p

  


If they're lucky they can.  If they work for Uncle Sam, and their 
sysadmin trots out security requirements as their lame excuse for 
breaking things they don't understand, then they're screwed.  As in:


Received: from gate3-sandiego.nmci.navy.mil (gate3-sandiego.nmci.navy.mil 
[138.163.0.43])
by mail.redfish-solutions.com (8.13.8/8.13.8) with ESMTP id 
l8AGAjaQ028222
for XYZZY; Mon, 10 Sep 2007 10:10:50 -0600
Received: from nawesdnims03.nmci.navy.mil by gate3-sandiego.nmci.navy.mil
 via smtpd (for mail.redfish-solutions.com [66.232.79.143]) with ESMTP; 
Mon, 10 Sep 2007 16:00:18 +
Received: (private information removed)
Received: (private information removed)
Received: (private information removed)
Received: (private information removed)
Received: (private information removed)
X-MimeOLE: Produced By Microsoft Exchange V6.5
Content-class: urn:content-classes:message
MIME-Version: 1.0
Content-Type: text/plain;
charset=us-ascii
Content-Transfer-Encoding: quoted-printable
Subject: RE: Chuckle
Date: Mon, 10 Sep 2007 09:10:39 -0700
Message-ID: [EMAIL PROTECTED]
In-Reply-To: [EMAIL PROTECTED]
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
Thread-Topic: Chuckle

Thread-Index: AcfzukOHakkCi8HDRJ2nEhvQOY8RZgACopXw
References: [EMAIL PROTECTED]
From: John Doe [EMAIL PROTECTED]
To: Philip Prindeville [EMAIL PROTECTED]
X-OriginalArrivalTime: 10 Sep 2007 16:10:40.0158 (UTC) 
FILETIME=[219FDBE0:01C7F3C5]


Could they have just *deleted* the Received: lines they didn't want to show?  No, 
of course not.  That would be too easy.  Let's mangle them into something that doesn't 
conform to RFC-822 instead.

As it is, they were leaking hostnames through the References: and 
Message-Id: fields anyway...  but we won't talk about that.

They couldn't even leave the id and timestamp fields in the Received: lines 
because that would be revealing... ummm... revealing...  uhh...  how many licks it takes to get to 
the center of a tootsie pop... or some such nonsense.





It's a fine line...

2007-11-05 Thread Philip Prindeville
Between the truly clueless administrator, and those that feign ignorance 
to cover up their implicit approval of spammers...


What do you do in the case where someone is filtering deliveries to 
their abuse mailbox?  (Like 99% of mail sent there isn't going to 
score positively...)


Sigh.



Return-Path: 
Received: from localhost (localhost)
by mail.redfish-solutions.com (8.14.1/8.14.1) id lA5HEMTM017203;
Mon, 5 Nov 2007 10:14:22 -0700
Date: Mon, 5 Nov 2007 10:14:22 -0700
From: Mail Delivery Subsystem [EMAIL PROTECTED]
Message-Id: [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
MIME-Version: 1.0
Content-Type: multipart/report; report-type=delivery-status;
boundary=lA5HEMTM017203.1194282862/mail.redfish-solutions.com
Subject: Returned mail: see transcript for details
Auto-Submitted: auto-generated (failure)

This is a MIME-encapsulated message

--lA5HEMTM017203.1194282862/mail.redfish-solutions.com

The original message was received at Mon, 5 Nov 2007 10:14:14 -0700
from pool-71-112-36-94.sttlwa.dsl-w.verizon.net [71.112.36.94]

  - The following addresses had permanent fatal errors -
[EMAIL PROTECTED]
   (reason: 550 Rejecting message scored for more than 8.0 (9.0) SPAM points.)

  - Transcript of session follows -
... while talking to arminco.com.:

DATA

 550 Rejecting message scored for more than 8.0 (9.0) SPAM points.
554 5.0.0 Service unavailable

--lA5HEMTM017203.1194282862/mail.redfish-solutions.com
Content-Type: message/delivery-status

Reporting-MTA: dns; mail.redfish-solutions.com
Received-From-MTA: DNS; pool-71-112-36-94.sttlwa.dsl-w.verizon.net
Arrival-Date: Mon, 5 Nov 2007 10:14:14 -0700

Final-Recipient: RFC822; [EMAIL PROTECTED]
Action: failed
Status: 5.2.0
Remote-MTA: DNS; arminco.com
Diagnostic-Code: SMTP; 550 Rejecting message scored for more than 8.0 (9.0) 
SPAM points.
Last-Attempt-Date: Mon, 5 Nov 2007 10:14:22 -0700

--lA5HEMTM017203.1194282862/mail.redfish-solutions.com
Content-Type: message/rfc822

Return-Path: [EMAIL PROTECTED]
Received: from [192.168.10.148] (pool-71-112-36-94.sttlwa.dsl-w.verizon.net 
[71.112.36.94])
(authenticated bits=0)
by mail.redfish-solutions.com (8.14.1/8.14.1) with ESMTP id 
lA5HECTN017198
(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO)
for [EMAIL PROTECTED]; Mon, 5 Nov 2007 10:14:14 -0700
Message-ID: [EMAIL PROTECTED]
Date: Mon, 05 Nov 2007 09:14:05 -0800
From: Abuse Department [EMAIL PROTECTED]
User-Agent: Thunderbird 2.0.0.6 (Windows/20070728)
MIME-Version: 1.0
To:  [EMAIL PROTECTED]
Subject: Filtering abuse reports
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Scanned-By: MIMEDefang 2.63 on 192.168.1.3

Of course submitted mail to the Abuse mailbox is going to score as 
spam.  It is spam.  Why else would anyone be reporting it?


Please get a clue and turn off filtering on your abuse mailbox:

The original message was received at Mon, 5 Nov 2007 10:10:58 -0700
from pool-71-112-36-94.sttlwa.dsl-w.verizon.net [71.112.36.94]

  - The following addresses had permanent fatal errors -
[EMAIL PROTECTED]
   (reason: 550 Rejecting message scored for more than 8.0 (20.6) SPAM points.)

  - Transcript of session follows -
... while talking to styx.aic.net.:


 DATA
  

 550 Rejecting message scored for more than 8.0 (15.1) SPAM points.
554 5.0.0 Service unavailable
... while talking to arminco.com.:


 DATA
  

 550 Rejecting message scored for more than 8.0 (20.6) SPAM points.
554 5.0.0 Service unavailable


--lA5HEMTM017203.1194282862/mail.redfish-solutions.com--




Re: It's a fine line...

2007-11-05 Thread Philip Prindeville

Steven Kurylo wrote:

Philip Prindeville wrote:
Between the truly clueless administrator, and those that feign 
ignorance to cover up their implicit approval of spammers...


What do you do in the case where someone is filtering deliveries to 
their abuse mailbox?  (Like 99% of mail sent there isn't going to 
score positively...) 
I filter my abuse address.  Otherwise it would get so many spam 
messages, the ham would get lost in the noise.


Only send the headers.  If the body is actually needed post it on some 
webpage.


A lot of sites won't accept just header lines.  They need both (to 
confirm that it's software piracy, or pornography, or phishing... and 
with phishing, you need the 4th party:  the link that is being used to 
spoof the legitimate organization).  And who bothers to keep track of 
who wants what?


I send everyone a complete copy of the message inline, because some 
braindead sites don't accept attachments, etc.


-Philip



Re: It's a fine line...

2007-11-05 Thread Philip Prindeville

John D. Hardin wrote:

On Mon, 5 Nov 2007, Steven Kurylo wrote:

  

Philip Prindeville wrote:

Between the truly clueless administrator, and those that feign 
ignorance to cover up their implicit approval of spammers...


What do you do in the case where someone is filtering deliveries to 
their abuse mailbox?  (Like 99% of mail sent there isn't going to 
score positively...) 
  


I have a form note that I send to the postmaster address whenever a 
report to the abuse address is bounced. It says (1) you need a working 
abuse address and (2) you shouldn't filter it.


  

I filter my abuse address.  Otherwise it would get so many spam
messages, the ham would get lost in the noise.

Only send the headers.  If the body is actually needed post it on
some webpage.



To heck with that. If I have to jump through that many hoops to report
abuse in *your* network, I'm just going to roundfile it. It's enough
work to pick out all of the relevant abuse addresses to forward the
message to, and note the type of abuse (lottery, 419, money
laundering, etc.).

I almost don't report abuse to Yahoo because they refuse to deal with
RFC-822 attachments and want the entire original message in the body,
and that makes reporting abuse containing a Yahoo.* contact address
two separate operations - forward as attachment to the relay owner,
and forward in the body to Yahoo.
  


Well, Yahoo is a waste of time for other reasons, right?  They tell you 
that it doesn't come from their site...  but to use the top-most 
Received: line's IP address, then to look that up on ARIN  which... 
surprise! ... typically points to Yahoo! (or one of their surrogates, 
like Inktomi...  do their tier-1 people not *know* that Yahoo owns 
Inktomi?  or are they just playing dumb?).


-Philip



Re: It's a fine line...

2007-11-05 Thread Philip Prindeville

Olivier Nicole wrote:

And not to point fingers, but how should one react to a narrow-minded sysadmin
who bans per IP?

From my legitimate mail server in Thailand, that has never been
blacklisted as far as I know:

mailon45: telnet mail.redfish-solutions.com 25
Trying 66.232.79.143...
Connected to mail.redfish-solutions.com (66.232.79.143).
Escape character is '^]'.
554 mail.redfish-solutions.com ESMTP not accepting messages

From another mailserver I administer, but located in Germany:

sinoon72: telnet mail.redfish-solutions.com 25
Trying 66.232.79.143...
Connected to mail.redfish-solutions.com.
Escape character is '^]'.
220 mail.redfish-solutions.com ESMTP Sendmail 8.14.1/8.14.1; Mon, 5 Nov 
2007 19:10:02 -0700

No need to remind anyone that any person seriously looking at the spam problem
knows that spam mainly originates from the USA, even if relayed through
other, possibly Asian, countries.

Yes, I am quite pissed by such an attitude.

Olivier
  


It's not a matter of cultural imperialism, if that's what you're getting at.

It's an acknowledgment of the importance of the rule of law in cyberspace.

Some countries enforce anti-spam, anti-trespass laws.  Others lack them 
or don't enforce them.


When these countries put some teeth into the enforcement of their laws, 
then they will stop being blacklisted.


-Philip



OT: Motivating good behavior from negligent ISP's

2007-07-11 Thread Philip Prindeville
We're seeing a lot of unwanted attempts to relay traffic through our 
site by Orange.fr, and we've reported this to their Abuse contact as 
well as their upstream provider (rain.fr):


Jul 11 11:30:37 mail mimedefang.pl[31610]: relay: bad tld orange.fr
Jul 11 11:30:37 mail mimedefang.pl[31610]: filter_relay rejected host 194.250.131.236 (smtp-wifi.orange.fr)
Jul 11 11:30:37 mail sendmail[32044]: l6BHUb3j032044: Milter: connect: host=smtp-wifi.orange.fr, addr=194.250.131.236, rejecting commands


No joy.

We'd like to take escalatory measures now.  What is a good RBL site (or 
sites, as appropriate) to get them listed on until they start playing well 
with others?


Would the FAQ's Reporting Spam section be a good place to mention the 
various sites where you can rat out offenders?


Thanks,

-Philip



Re: OT: Motivating good behavior from negligent ISP's

2007-07-11 Thread Philip Prindeville

Michele Neylon :: Blacknight wrote:

Philip Prindeville wrote:
We're seeing a lot of unwanted attempts to relay traffic through our 
site by Orange.fr, and we've reported this to their Abuse contact as 
well as their upstream provider (rain.fr):


Jul 11 11:30:37 mail mimedefang.pl[31610]: relay: bad tld orange.fr
Jul 11 11:30:37 mail mimedefang.pl[31610]: filter_relay rejected host 194.250.131.236 (smtp-wifi.orange.fr)
Jul 11 11:30:37 mail sendmail[32044]: l6BHUb3j032044: Milter: connect: host=smtp-wifi.orange.fr, addr=194.250.131.236, rejecting commands



No joy.


How long ago did you report it?



Which time?  It happens regularly, and it's been going on over a month.

-Philip



Re: OT: Motivating good behavior from negligent ISP's

2007-07-11 Thread Philip Prindeville

Phil Barnett wrote:

On Wednesday 11 July 2007, Philip Prindeville wrote:
  

Michele Neylon :: Blacknight wrote:


Philip Prindeville wrote:
  

No joy.


How long ago did you report it?
  

Which time?  It happens regularly, and it's been going on over a month.


Ok. That changes things, but you didn't say anything in your post
about it going on for a month 
  

I note also that they aren't using exponential back-off with a 2 hour
maximum retry interval as suggested by the RFC's:

Jul 11 00:08:19 mail mimedefang.pl[26738]: filter_relay rejected host
194.250.131.236 (smtp-wifi.orange.fr) 
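
For readers unfamiliar with the retry behaviour being referred to, here is a minimal sketch of an exponential back-off schedule with a ceiling. The 15-minute starting delay is a made-up value for illustration; only the 2-hour maximum comes from the message above, and none of this is taken from any particular MTA.

```python
def retry_schedule(base_seconds: int, cap_seconds: int, attempts: int) -> list[int]:
    """Return the delay before each retry: doubling from base, never above cap."""
    return [min(base_seconds * 2 ** n, cap_seconds) for n in range(attempts)]


# Hypothetical values: start at 15 minutes, cap at the 2 hours mentioned above.
delays = retry_schedule(base_seconds=15 * 60, cap_seconds=2 * 60 * 60, attempts=6)
print([d // 60 for d in delays])  # delays in minutes
```

A sender that hammers the same host every few seconds, as in the log above, is not following any schedule of this shape.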



(snip)

  

We've started to take defensive measures...



That would earn them a rule in my firewall.

  


But back to my original question:

What are the websites to get them RBL blacklisted?

How does one nominate them to a place of infamy?

-Philip



Re: ldap: failed to load user scores from LDAP server

2007-07-06 Thread Philip S. Hempel
 I have been getting this error for some time now and have been trying to
 find the root cause of it.


 spamd[2681]: ldap: failed to load user scores from LDAP server, ignored
 (Can't locate object method "schema" via package "URI::ldap" at
 /usr/share/perl5/Mail/SpamAssassin/Conf/LDAP.pm line 133, <GEN13> line 2.


 I did an upgrade about 2 weeks ago to perl and a few modules; I really am
 not sure what part of it caused this.

 If anyone has a clue, please help me out on this. I have looked through the
 list and on the Net trying to find something close and I have come up with
 very little. Really nothing related to SpamAssassin and LDAP.


OK, after having looked around a bit I have found that, going from SA
3.1.7 to 3.2, the marked line below has been added to the LDAP
lookups.


  my $port   = $uri->port;
  my $base   = $uri->dn;
  my @attr   = $uri->attributes;
  my $scope  = $uri->scope;
  my $filter = $uri->filter;
  my $schema = $uri->schema;     # <-- the marked line
  my %extn   = $uri->extensions; # unused


Now, for some reason, the Perl URI documentation does not list schema among
the methods available on URI::ldap:


=head1 NAME

URI::ldap - LDAP Uniform Resource Locators

=head1 SYNOPSIS

  use URI;

  $uri = URI->new("ldap:$uri_string");
  $dn     = $uri->dn;
  $filter = $uri->filter;
  @attr   = $uri->attributes;
  $scope  = $uri->scope;
  %extn   = $uri->extensions;

  $uri = URI->new("ldap:");  # start empty
  $uri->host("ldap.itd.umich.edu");
  $uri->dn("o=University of Michigan,c=US");
  $uri->attributes(qw(postalAddress));
  $uri->scope('sub');
  $uri->filter('(cn=Babs Jensen)');
  print $uri->as_string,"\n";

=head1 DESCRIPTION


Notice that schema is not mentioned.

I looked and found that yes, schema is used in Perl's Net::LDAP, but I would
assume that if you're going to use URI::ldap to parse the URI, it needs to
follow URI::ldap's requirements.

I could be wrong, but it seems that the change made from 3.1 to 3.2 has
been implemented wrong.

Is Net::LDAP::Schema what should be used for getting the schema instead?
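
To make the structure of an LDAP URL concrete, here is a rough sketch in Python (not Perl, and not SpamAssassin code) that splits one into the same parts the URI::ldap synopsis lists. The function name and the example URL are illustrative; the defaults for scope and filter follow the LDAP URL specification. Note that the URL has a scheme ("ldap") but no "schema" component, so one possible reading, not confirmed anywhere in this thread, is that the scheme was what the new line meant to fetch.

```python
from urllib.parse import unquote, urlsplit


def parse_ldap_url(url: str) -> dict:
    """Split an LDAP URL into the parts named in the URI::ldap synopsis."""
    parts = urlsplit(url)
    # Everything after the first "?" is attributes?scope?filter?extensions.
    fields = parts.query.split("?") + [""] * 4
    attrs, scope, filt, extn = fields[:4]
    return {
        "scheme": parts.scheme,              # "ldap": a scheme, not a schema
        "host": parts.hostname,
        "port": parts.port or 389,           # 389 is the standard LDAP port
        "dn": unquote(parts.path.lstrip("/")),
        "attributes": [unquote(a) for a in attrs.split(",") if a],
        "scope": scope or "base",            # default when omitted
        "filter": unquote(filt) or "(objectClass=*)",  # default when omitted
        "extensions": extn,
    }


example = ("ldap://ldap.itd.umich.edu/o=University%20of%20Michigan,c=US"
           "?postalAddress?sub?(cn=Babs%20Jensen)")
for key, value in parse_ldap_url(example).items():
    print(f"{key}: {value}")
```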

Thanks

Philip S. Hempel



ldap: failed to load user scores from LDAP server

2007-06-20 Thread Philip S. Hempel
I have been getting this error for some time now and have been trying to
find the root cause of it.


spamd[2681]: ldap: failed to load user scores from LDAP server, ignored
(Can't locate object method "schema" via package "URI::ldap" at
/usr/share/perl5/Mail/SpamAssassin/Conf/LDAP.pm line 133, <GEN13> line 2.


I did an upgrade about 2 weeks ago to perl and a few modules; I really am
not sure what part of it caused this.

If anyone has a clue, please help me out on this. I have looked through the
list and on the Net trying to find something close and I have come up with
very little. Really nothing related to SpamAssassin and LDAP.

Thanks.

Philip S. Hempel


Re: KANA damage and locales

2007-05-07 Thread Philip Prindeville
Philip Prindeville wrote:
 I'm looking at the headers I just got from a Canadian
 ISP's autoresponder  I guess the software is called
 KANA.  Anyone know who owns this?  (Yes, someone
 not very clueful, I know... let's be more specific than
 that...)

   
 Date: sam., 05 mai 2007 18:46:43 -0400
 To: Abuse Department [EMAIL PROTECTED]
 Subject: Bell Nexxia Inc. - Internet Abuse Centre  (KMM15550329V5186L0KM)
 From: Nexxia Abuse Bell Nexxia [EMAIL PROTECTED]
 Reply-To: Nexxia Abuse Bell Nexxia [EMAIL PROTECTED]
 MIME-Version: 1.0
 Content-Type: text/plain; charset = iso-8859-1
 Content-Transfer-Encoding: 8bit
 X-Mailer: KANA Response 7.1.0.9
 


 The Date: line is damaged, apparently.  KANA should be
 clobbering the LOCALE, LC_ALL and LC_TIME variables when it
 starts up, but clearly isn't doing this.

 Ironically, KANA is supposed to track Spam (and other service)
 tickets... but it just ends up muddying the waters by potentially
 creating more incidents of Spam.

 Quelle folie.

 -Philip


   

Spoke to someone at tech support for Kana.  It seems they
immediately closed the initial ticket.

They said it didn't come from a customer, it wasn't clear
what the problem was, and it sounded like hate mail...  so
I should take it up with the ISP.

I told them that (a) I had taken it up with the ISP, but it seems
that on the first day of training for ISP helpdesk staff, they
indoctrinate you with the idea that no customer will ever know
more than you do, so resist any temptation to believe that
a customer might know something you don't or actually know
what he's talking about... so the complaint with BellNexxia
went nowhere after 4 months... and (b) if their product does
allow customer configuration to break standards-conformance...
well, then they're allowing customers to change things that
shouldn't be changeable.

So once they got their hackles down, we actually had a
productive conversation where I explained to them that:

* I'm a software developer, and I've identified and diagnosed
   issues with their product;
* The customer, the public at large, and Kana are not
   well-served by Kana having software that breaks standards;
* RFC-2822, whose chapter and verse I pointed them at,
   makes it quite clear what the format of date-specs is;
* Having tickets generated in response to Spam complaints
   which in turn trigger false positives as Spam just makes
   matters worse...  not better.
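
As an illustration of the date-spec point, here is a small sketch using Python's standard mail-date parser as a stand-in for a receiving filter. It is not what SpamAssassin or KANA run; it only shows that the localized header from the autoresponder is not parseable as a mail date, while a date built from fixed English names is.

```python
import calendar
from email.utils import formatdate, parsedate_tz


def is_parseable_date(value: str) -> bool:
    """True if the stdlib mail-date parser accepts the value."""
    return parsedate_tz(value) is not None


# The header as sent (French day and month names) versus the English form.
print(is_parseable_date("sam., 05 mai 2007 18:46:43 -0400"))
print(is_parseable_date("Sat, 05 May 2007 18:46:43 -0400"))

# formatdate() builds the header from fixed English name tables, so the
# process locale (LC_ALL, LC_TIME) does not leak into the output.
timestamp = calendar.timegm((2007, 5, 5, 22, 46, 43, 0, 0, 0))
print(formatdate(timestamp))
```

The design point is the same one made to Kana: the Date: header should be generated from fixed tables rather than from locale-aware formatting.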

Of course, the issue now is that Kana is shipping version
9-dot-something of Response, and Bellnexxia is running 7.0.

Oh, well.

You can lead a horse to water...  Oh, reminds me!  The
Lipizzaners were just in town 2 days ago... ;-)

Anyone else know of any auto responders that seem to be
broken in terms of standards-conformance?

It would be cool for someone to set up a mailbox that you
could have your software send an email to... and it would send
back a conformance report stating what looked right and
what didn't.

Of course, that opens a whole can of potential joe-job attacks...

-Philip



KANA damage and locales

2007-05-05 Thread Philip Prindeville
I'm looking at the headers I just got from a Canadian
ISP's autoresponder  I guess the software is called
KANA.  Anyone know who owns this?  (Yes, someone
not very clueful, I know... let's be more specific than
that...)

 Date: sam., 05 mai 2007 18:46:43 -0400
 To: Abuse Department [EMAIL PROTECTED]
 Subject: Bell Nexxia Inc. - Internet Abuse Centre  (KMM15550329V5186L0KM)
 From: Nexxia Abuse Bell Nexxia [EMAIL PROTECTED]
 Reply-To: Nexxia Abuse Bell Nexxia [EMAIL PROTECTED]
 MIME-Version: 1.0
 Content-Type: text/plain; charset = iso-8859-1
 Content-Transfer-Encoding: 8bit
 X-Mailer: KANA Response 7.1.0.9


The Date: line is damaged, apparently.  KANA should be
clobbering the LOCALE, LC_ALL and LC_TIME variables when it
starts up, but clearly isn't doing this.

Ironically, KANA is supposed to track Spam (and other service)
tickets... but it just ends up muddying the waters by potentially
creating more incidents of Spam.

Quelle folie.

-Philip





Re: whitelist_from ip_range

2007-04-19 Thread Philip Prindeville
Benny Pedersen wrote:
 On Tue, April 17, 2007 01:57, Duane Hill wrote:

   
 http://wiki.apache.org/spamassassin/TrustPath
 

 To me it's a bit hardcore to read, but I have all IPs that are known to
 forward mail to me listed as trusted_networks, even if they are not my
 servers, and I have put the complete RFC 1918 ranges in trusted_networks and
 internal_networks. In addition to this, I have my own WAN IPs in both.

 should be it :-)

 trusted_networks 10.0.0.0/8
 trusted_networks 172.16.0.0/12
 trusted_networks 192.168.0.0/16
 trusted_networks 127.0.0.0/8

 internal_networks 10.0.0.0/8
 internal_networks 172.16.0.0/12
 internal_networks 192.168.0.0/16
 internal_networks 127.0.0.0/8

 and last my wan ips as trusted_networks and internal_networks

 after this all known forward ips as trusted_networks
   

Given the number of ISP's that don't have rDNS configured,
whitelist_from_rcvd should probably be extended to support
IP/CIDR addresses as well...

Let's not overload the meanings of trusted_networks and
internal_networks.  These latter two are already confusing
enough for most newbies without having them take on
additional unintended meanings.
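
To show what an IP/CIDR match would amount to, here is a minimal sketch. It is not a SpamAssassin feature or API; the function name is made up, and the network list simply reuses the ranges quoted above.

```python
import ipaddress

# The ranges from the quoted configuration above.
NETWORKS = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16", "127.0.0.0/8"]


def ip_in_networks(ip: str, cidrs: list[str]) -> bool:
    """True if the address falls inside any of the CIDR blocks."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)


print(ip_in_networks("192.168.1.5", NETWORKS))    # private address
print(ip_in_networks("66.232.79.143", NETWORKS))  # public address
```

A check like this needs no reverse DNS at all, which is the point of the suggestion for relays without rDNS.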

-Philip



OT: Dealing w/ poor network citizens like Yahoo!

2007-04-19 Thread Philip Prindeville
Hi.

This isn't so much a technical question as a philosophical one.
We're tired of dealing with Yahoo!, which seems either to
(a) have the most poorly trained abuse staff of any large email
service provider on the planet, or (b) have a malicious
corporate culture of flat-out denying that any email originated
from their networks, no matter how compelling the evidence.

I'm out of ideas, so I thought I'd turn to the group.

I recently had a message with the following headers:

Return-Path: [EMAIL PROTECTED]
Received: from smtp105.biz.mail.mud.yahoo.com (smtp105.biz.mail.mud.yahoo.com [68.142.200.253])
  by mail.redfish-solutions.com (8.13.8/8.13.8) with SMTP id l2V1kkqG009611
  for [EMAIL PROTECTED]; Fri, 30 Mar 2007 19:46:52 -0600
Received: (qmail 65061 invoked from network); 31 Mar 2007 01:46:46 -
Received: from unknown (HELO localhost) ([EMAIL PROTECTED]@4.79.181.240 with plain)
  by smtp105.biz.mail.mud.yahoo.com with SMTP; 31 Mar 2007 01:46:46 -
X-YMail-OSG: 
4SuIk60VM1mrOGBAKk3UQSIGXvsb4QmL0rwvi97gE9mIpViIsNyNpLnGy2BQbmYSCoUdeywpxW25RWzcK6ECZbX37ayshFDwIXNvRKxXqW3hqhkRMIw-
Date: Sat, 31 Mar 2007 01:46:45 -0400
From: Monster.com [EMAIL PROTECTED]
X-Mailer: Microsoft Outlook, Build 10.0.2627
Reply-To: Monster.com [EMAIL PROTECTED]
X-Priority: 3 (Normal)
Message-ID: [EMAIL PROTECTED]
To: Philip Prindeville [EMAIL PROTECTED]
Subject: Money-Investment
Mime-Version: 1.0
Content-Type: multipart/mixed;boundary=--



If someone can prove to me that this message didn't come from
Yahoo!, I will eat my shorts.
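
For anyone who wants to pull the handing-off relay out of a message mechanically, here is a rough sketch. The sample message is an abbreviated copy of the headers above (IDs and timestamps dropped), and the function name and regular expression are illustrative only. The topmost Received: line is the one written by the receiving server, so its bracketed address is the host that actually connected.

```python
import re
from email import message_from_string

# Abbreviated from the headers quoted above.
RAW = (
    "Received: from smtp105.biz.mail.mud.yahoo.com"
    " (smtp105.biz.mail.mud.yahoo.com [68.142.200.253])\n"
    "\tby mail.redfish-solutions.com (8.13.8/8.13.8) with SMTP\n"
    "Received: from unknown (HELO localhost)\n"
    "\tby smtp105.biz.mail.mud.yahoo.com with SMTP\n"
    "Subject: Money-Investment\n"
    "\n"
    "body\n"
)

BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def topmost_relay_ip(raw_message: str):
    """Return the bracketed IPv4 address in the first Received: header, if any."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received") or []
    if not received:
        return None
    match = BRACKETED_IPV4.search(received[0])
    return match.group(1) if match else None


print(topmost_relay_ip(RAW))
```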

But until then, my next course of action seems to be blacklisting
Yahoo!, because I'm tired of their not investigating messages that
pretty obviously seem to be coming from them.

Any collective wisdom?

-Philip




[Mimedefang] Adding new From: variable

2007-04-11 Thread Philip Prindeville
Is it worth adding From:fullname and From:comment to
handle the Full Name and (comment) fields of the From:
line?

I'm thinking something like:

header L_LUX_CAPITAL    From:fullname =~ /^The Lux Capital$/

...


might be handy, in addition to From:raw and From:addr.
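
To illustrate what such a selector would be matching against, here is a sketch in Python. The From:fullname selector is only the proposal above, not an existing feature as far as this message says; the helper name and the example address are made up.

```python
import re
from email.utils import parseaddr

# Same pattern as in the proposed rule above.
LUX = re.compile(r"^The Lux Capital$")


def from_fullname(header_value: str) -> str:
    """Return only the display-name part of a From: header value."""
    name, _address = parseaddr(header_value)
    return name


header = '"The Lux Capital" <someone@example.com>'  # hypothetical sender
print(from_fullname(header))
print(bool(LUX.match(from_fullname(header))))
```

Matching on the display name alone avoids writing a pattern that has to skip over the quoting and the address part of the raw header.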

File a bug for the enhancement?

-Philip


Tutorial for setting up a spamassassin mailfilter? (No local mailboxes except spamtrap)

2007-04-09 Thread Philip Seccombe
Hey everyone,

 

I've found various opinions on this when I search so thought I would ask
here and see what people say.

 

Currently I have a debian box with spamassassin being called by qmail on
incoming emails, emails are then forwarded back out to the external mail
servers.

 

I want to set this up again, and to do so on a virtual machine
using VMware.

 

Does anyone know of or have a tutorial or can advise about setting this
up? It does not need to be the same programs eg qmail, but must be
spamassassin and debian.

 

Basically, we change customer ABC Ltd's MX record so mail.abcltd.com goes
to our SpamAssassin filter's IP address.

Email to @abcltd.com goes to our spam filter, which checks it: if it's spam,
it saves it in a local mailbox; if it's ham, it forwards it to ABC Ltd's
server.
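
The decision described above can be sketched as follows. This is only an outline of the routing step, assuming the message has already been through SpamAssassin and that the X-Spam-Flag header it adds to spam by default has not been disabled; the function name and the two return values are made up, and actual delivery is left out.

```python
from email import message_from_string


def route(raw_message: str) -> str:
    """Decide where an already-scanned message should go."""
    msg = message_from_string(raw_message)
    flag = (msg.get("X-Spam-Flag") or "").strip().upper()
    # Spam stays in the local spamtrap mailbox; everything else is relayed on.
    return "quarantine" if flag == "YES" else "forward"


spam = "X-Spam-Flag: YES\nSubject: cheap pills\n\nbody\n"
ham = "Subject: meeting notes\n\nbody\n"
print(route(spam), route(ham))
```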

 

 

Kind Regards,

Philip Seccombe

Turnstone Technologies NZ Limited

 

Phone: +64 9 970 5550

Fax: +64 9 970 5559

DDI: +64 9 970 5552

Email: [EMAIL PROTECTED] 

Web: www.turnstone.co.nz 

 



RE: spam mails bypassing spamassassin?

2007-02-23 Thread Philip Seccombe
I take it you're saving your email on the same server that does the spam 
filtering?
The only other thing I could think of, if this is not the case, is email being sent 
directly to your mail server via secondary MX records or something.
I run a server which filters mail for clients, which is what made me think of 
it; not sure if this is going to affect you, though.

Cheers
Phil


-Original Message-
From: Mathias Homann [mailto:[EMAIL PROTECTED]
Sent: Fri 2/23/2007 9:56 PM
To: users@spamassassin.apache.org
Subject: spam mails bypassing spamassassin?
 
Hi,


I'm running the following mail chain:
fetchmail - postfix - clamsmtpd - postfix - spamassassin 3.1.7 (as 
local_transport via the spamdeliver python script that came with the 
spamassassin sources) - cyrus imapd (where spam gets sorted out based on its 
score).

now, since a few days, i keep getting the same spam mail several times a day, 
which has _no_ spamassassin headers at all, as if it has found a way _around_ 
my spamassassin.

Anyone got any ideas?

...where can i put the mail for general inspection? I guess if I attached it 
to a mail to this list, it would get filtered, right?


bye,
MH



RE: FuzzyOcr - no image files found in samples?

2007-02-13 Thread Philip Seccombe
What if you point directly to the .eml, e.g.
spamassassin -tD < /this/is/the/directory/samples/ocr-animated.eml

Just to be absolutely sure it is finding the correct place?
Check permissions on the .eml, view it, and see if it seems to have an
image inside

Just the usual I can suggest sorry


Kind Regards,
Philip Seccombe
Turnstone Technologies NZ Limited

Phone: +64 9 970 5550
Fax: +64 9 970 5559
DDI: +64 9 970 5552
Email: [EMAIL PROTECTED] 
Web: www.turnstone.co.nz 

-Original Message-
From: Steve Pfister [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, 14 February 2007 10:51 a.m.
To: 'Matt Kettler'
Cc: users@spamassassin.apache.org
Subject: RE: FuzzyOcr - no image files found in samples?

Sorry... I guess I wasn't clear. I'm running:

spamassassin -tD < ocr-animated.eml

In the samples directory of FuzzyOcr-3.5.1.

It's saying there's no image files found.

-Original Message-
From: Matt Kettler [mailto:[EMAIL PROTECTED]
Sent: Tuesday, February 13, 2007 4:41 PM
To: Steve Pfister
Cc: users@spamassassin.apache.org
Subject: Re: FuzzyOcr - no image files found in samples?

Steve Pfister wrote:

 I'm trying to install FuzzyOcr 3.51 (with patches for  10.34 netpbm)
 on RedHat Linux 9 with Spamassassin 3.1.7. I'm trying to test it out
 with the samples images, but I keep getting:



 [25404] dbg: FuzzyOcr: Starting FuzzyOcr...

 [25404] info: FuzzyOcr: Processing Message with ID no messageid
 (no sender - no receipients)

 [25404] dbg: FuzzyOcr: Skipping OCR, no image files found...

 [25404] dbg: FuzzyOcr: Processed in 0.001779 sec.



 And the log file just says:



 2007-02-13 13:38:56 [26451] Processing Message with ID no
 messageid (no sender - no receipients)



 What might I be missing?

Sounds like you're missing an email that the images are attached to.





SpamAssassin using spamc but not using rules correctly? Is my time being wasted changing local.cf etc?

2007-02-12 Thread Philip Seccombe
/$file_id - $!);

  open(SOUT,"|$spamc_binary $spamc_options > $scandir/$wmaildir/new/$file_id.spamc")||error_condition("cannot open for write $scandir/$wmaildir/new/$file_id.spamc - $!");

  print SOUT "X-Envelope-From: $headers{'MAILFROM'}\n";
  while (<SIN>) {
    print SOUT;
  }
  close(SIN)||error_condition("cannot close $scandir/$wmaildir/new/$file_id - $!");
  close SOUT;
  $spamassassin_status=($? >> 8);
  $sa_status=$spamassassin_status if ($sa_fast);

  open(SA,"<$scandir/$wmaildir/new/$file_id.spamc")||error_condition("cannot open for read $scandir/$wmaildir/new/$file_id.spamc - $!");
  while (<SA>) {
    if ($sa_fast) {
      chomp;
      ($sa_score,$sa_max)=split(/\//,$_,2);
      $sa_tag++;
      last;
    } else {
      #X-Spam-Status: No, hits=2.8 required=5.0
      if (/^X-Spam-Status: (Yes|No), (hits|score)=(-?[\d\.]*) required=([\d\.]*)/) {
        $sa_tag++;
        $sa_status=1 if ($1 eq "Yes");
        $sa_score=$3;$sa_max=$4;
      }
    }
  }
  close SA ;

  $sa_score='?' if (!$sa_score);
  $sa_max='?' if (!$sa_max);

  if (!$sa_fast && -s "$scandir/$wmaildir/new/$file_id.spamc" && $spamassassin_status == 0) {
    debug("SA: overwriting $scandir/$wmaildir/new/$file_id with $scandir/$wmaildir/new/$file_id.spamc");
    rename("$scandir/$wmaildir/new/$file_id.spamc","$scandir/$wmaildir/new/$file_id");
  } else {
    unlink("$scandir/$wmaildir/new/$file_id.spamc");
  }
  if ($sa_max > $sa_score || ($sa_score == 0)) {
    $tag_score .= "SA:0($sa_score/$sa_max):";
    $sa_comment = "No, hits=$sa_score required=$sa_max" if ($sa_fast);
  } else {
    $tag_score .= "SA:1($sa_score/$sa_max):";
    $sa_comment = "Yes, hits=$sa_score required=$sa_max" if ($sa_fast);
    debug("SA: yup, this smells like SPAM");
  }
  if ($sa_score > 0) {
    $sa_score=int($sa_score);
    #Keep it RFC compliant
    $sa_score=100 if ($sa_score > 100);
    my $si=0;
    if ($sa_fast) {
      while ($si < $sa_score) {
        $si++;
        $sa_level .= $sa_symbol;
      }
    }
  }
  $stop_spamassassin_time=[gettimeofday];
  $spamassassin_time = tv_interval ($start_spamassassin_time, $stop_spamassassin_time);
  debug("spamassassin: finished scan of dir \"$ENV{'TMPDIR'}\" in $spamassassin_time secs");
}

 

Does anyone have any idea what on earth is going on here?

I'm not a huge Linux guru, so I'm a little confused. qmail appears to
download the message, check whether it is a virus, then call spamc to check
whether it is spam. If it is, it puts it in a POP mailbox on the server;
otherwise it forwards the message on to the customer's mail server.
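
The part of the script that decides spam versus ham reads the X-Spam-Status header. Here is a small sketch of the same parsing step in Python, for reference only: it is not the script above, the function name is made up, and the pattern is tightened slightly to require at least one digit so the numbers can be converted safely.

```python
import re

# Accepts both the older "hits=" and the newer "score=" spelling.
STATUS = re.compile(
    r"^X-Spam-Status: (Yes|No), (?:hits|score)=(-?[\d.]+) required=([\d.]+)"
)


def parse_status(line: str):
    """Return (is_spam, score, required) or None if the line does not match."""
    match = STATUS.match(line)
    if match is None:
        return None
    verdict, score, required = match.groups()
    return verdict == "Yes", float(score), float(required)


print(parse_status("X-Spam-Status: No, hits=2.8 required=5.0"))
print(parse_status("X-Spam-Status: Yes, score=12.2 required=4.5 tests=BAYES_99"))
```

Comparing the required value reported here with the one in local.cf is a quick way to see which configuration the running daemon actually loaded.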

 

Apologies for the huge email; I wanted to give as much detail as I could.

 

Kind Regards,

Philip Seccombe

Turnstone Technologies NZ Limited

 

Phone: +64 9 970 5550

Fax: +64 9 970 5559

DDI: +64 9 970 5552

Email: [EMAIL PROTECTED] 

Web: www.turnstone.co.nz 

 



RE: Blocking MMS messages?

2007-02-12 Thread Philip Seccombe
Whitelisting @mms1.telstra.com would be best wouldn't it?
Rather than change rules and end up letting through spam with numbers in
the email address etc
Big things there seem to be all numbers in email address, missing
subject etc

I'm not sure but can you whitelist */[EMAIL PROTECTED] or
something?


Kind Regards,
Philip Seccombe
Turnstone Technologies NZ Limited

Phone: +64 9 970 5550
Fax: +64 9 970 5559
DDI: +64 9 970 5552
Email: [EMAIL PROTECTED] 
Web: www.turnstone.co.nz 

-Original Message-
From: Steve Monkhouse [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, 13 February 2007 12:13 p.m.
To: users@spamassassin.apache.org
Subject: Blocking MMS messages?


Hi All.. 

I'm getting a lot of complaints at the moment about SpamAssassin blocking
legitimate MMS messages sent from mobiles direct to email addresses.. 

How is everyone combating this ?

SPAM, 61413444333/[EMAIL PROTECTED] - [EMAIL PROTECTED],
Yes,
score=12.207 tag=2 tag2=6.31 kill=6.31 tests=[AWL=0.301,
BAYES_00=-2.312,
EXTRA_MPART_TYPE=0.733, FROM_ALL_NUMS=2.312, FROM_LOCAL_HEX=2.24,
FROM_STARTS_WITH_NUMS=1.829, FRONTPAGE=1.459, HTML_70_80=0.144,
HTML_IMAGE_ONLY_16=0.338, HTML_MESSAGE=0.001,
HTML_SHORT_LINK_IMG_2=2.739,
MIME_HTML_ONLY=0.389, MISSING_SUBJECT=2.035], autolearn=no, quarantine
2hLLdfl23DgS (spam-quarantine)

The email was a simple photo attached and emailed.. that was it.. 
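
To see which rules are doing the damage, the scores in the report above can simply be added up. The sketch below copies the test list from that report; the helper name is made up, and the grouping of the three FROM_* rules is an observation for illustration, not a recommendation to disable them.

```python
# Copied from the tests=[...] list in the report above.
REPORT = (
    "AWL=0.301, BAYES_00=-2.312, EXTRA_MPART_TYPE=0.733, FROM_ALL_NUMS=2.312, "
    "FROM_LOCAL_HEX=2.24, FROM_STARTS_WITH_NUMS=1.829, FRONTPAGE=1.459, "
    "HTML_70_80=0.144, HTML_IMAGE_ONLY_16=0.338, HTML_MESSAGE=0.001, "
    "HTML_SHORT_LINK_IMG_2=2.739, MIME_HTML_ONLY=0.389, MISSING_SUBJECT=2.035"
)
KILL_LEVEL = 6.31  # the kill= value from the same report


def parse_tests(report: str) -> dict:
    """Turn 'NAME=score, NAME=score' into a dict of floats."""
    scores = {}
    for item in report.split(","):
        name, _, value = item.strip().partition("=")
        scores[name] = float(value)
    return scores


scores = parse_tests(REPORT)
total = sum(scores.values())

# The three rules that fire only because the sender address is a phone number.
FROM_RULES = ("FROM_ALL_NUMS", "FROM_LOCAL_HEX", "FROM_STARTS_WITH_NUMS")
without_from = total - sum(scores[name] for name in FROM_RULES)

print(f"total={total:.3f} without numeric-sender rules={without_from:.3f}")
```

The individual scores add up to within rounding of the reported 12.207, and without the three numeric-sender rules the message would land below the 6.31 kill level, which is why whitelisting the MMS gateway's sender domain looks like a narrower fix than rescoring those rules for all mail.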

Thoughts ?

Steve




RE: SpamAssassin using spamc but not using rules correctly? Is my time being wasted changing local.cf etc?

2007-02-12 Thread Philip Seccombe
Nope, haven't restarted anything recently, I have since changing rules though

qmail just runs spamc as a command line to the email message and gets its 
response afaik


-Original Message-
From: Bob McClure Jr [mailto:[EMAIL PROTECTED]
Sent: Tue 2/13/2007 12:29 PM
To: users@spamassassin.apache.org
Subject: Re: SpamAssassin using spamc but not using rules correctly? Is my time 
being wasted changing local.cf etc?
 
On Tue, Feb 13, 2007 at 11:42:22AM +1300, Philip Seccombe wrote:
 Hi everyone,
 
  
 
 I've taken over a mail server from a previous technician and he's
 modified qmail to call spamassassin and the problem is I make changes to
 local.cf but I don't think they get used.
 
  
 
 Reasoning is that mail.info shows it saying that the required score is 5.0,
 but I've changed this to 4.5
 
 spamassassin --lint -D will say that 4.5 is required:
 
  
 
 [21280] dbg: rules: running full-text regexp tests; score so far=1.046
 
 [21280] dbg: check: is spam? score=1.046 required=4.5
 
 [21280] dbg: check:
 tests=BAYES_05,MISSING_SUBJECT,NO_RECEIVED,NO_RELAYS,TO_CC_NONE
 
 [21280] dbg: check:
 subtests=__HAS_MSGID,__MSGID_OK_DIGITS,__MSGID_OK_HOST,__NONEMPTY_BODY,__SANE_MSGID,__UNUSABLE_MSGID
 
  
 
 /var/log/mail.info shows the following:
 
  
 
 Feb 13 11:26:53 nibbler spamd[14048]: spamd: connection from localhost
 [127.0.0.1] at port 44594 

Umm, did you restart spamd?

 
 remainder snipped
  
 
 Does anyone have any idea what on earth is going on here?
 
 I'm not a huge linux guru so I'm a little confused, qmail appears to
 download the message, check if it is a virus, then call spamc and check
 if it is spam, if it is then it puts it on a pop mailbox on the server
 else it forwards the message onto the customers mail server
 
  
 
 Apologies for the huge email; I wanted to give as much detail as I could.
 
  
 
 Kind Regards,
 
 Philip Seccombe
 
 Turnstone Technologies NZ Limited
 
  
 
 Phone: +64 9 970 5550
 
 Fax: +64 9 970 5559
 
 DDI: +64 9 970 5552
 
 Email: [EMAIL PROTECTED] 
 
 Web: www.turnstone.co.nz 

Cheers,
-- 
Bob McClure, Jr. Bobcat Open Systems, Inc.
[EMAIL PROTECTED] http://www.bobcatos.com
This day I call heaven and earth as witnesses against you that I have
set before you life and death, blessings and curses. Now choose life,
so that you and your children may live.  Deuteronomy 30:19 (NIV)



RE: A New Approach: Find the Ham

2007-02-11 Thread Philip Seccombe
Apologies if this has been answered before or anything, but where/how
are you generating those stats?
I'm not using SA with SQL so I'm not sure if it will work for me, but
those I like!

Stats in question: http://www.blue-canoe.com/stats/index.php?D1=11 


Kind Regards,
Philip Seccombe
Turnstone Technologies NZ Limited

Phone: +64 9 970 5550
Fax: +64 9 970 5559
DDI: +64 9 970 5552
Email: [EMAIL PROTECTED] 
Web: www.turnstone.co.nz 

-Original Message-
From: Nigel Frankcom [mailto:[EMAIL PROTECTED] 
Sent: Sunday, 11 February 2007 9:23 a.m.
To: Miles Fidelman
Cc: SpamAssassin Users
Subject: Re: A New Approach: Find the Ham

On Sat, 10 Feb 2007 15:14:56 -0500, Miles Fidelman
[EMAIL PROTECTED] wrote:

Dan wrote:
 I've developed a new approach to scoring that I want to 1) share with

 everyone and 2) make into a working system thats as accurate as what 
 I've already built, but easier to use.  First, the theory:

 NEW ASSUMPTION
 All messages are spam unless x,y,z score says they're ham.

 NEW APPROACH
 Block everything, then create rules to not catch what you do want.  
 ie, build tests that target the spam (keeping all the tests you've 
 already built), then score the thousands of ways ham triggers on
those 
 tests.
It strikes me that the hardest part of this approach is filtering out 
too much ham.  At least for me, it's more important to make sure that 
people reach me, than to filter out all spam.  If we take the approach 
that everything is to be filtered out, except x,y,z - then the risk of 
filtering out too much seems pretty high.

These are my local stats... I'd far rather those numbers were the
other way round.

Even if Dan is wrong, at least he's thinking.

http://www.blue-canoe.com/stats/index.php?D1=11

What do Theo, Matt & Co have to say? They've been doing this a lot
longer than us.

Kind regards


RE: How to block yahoogroups?

2007-02-11 Thread Philip Seccombe
Can you blacklist @ returns.groups.yahoo.com and then whitelist
[EMAIL PROTECTED] or something?

 

I'm not sure how the yahoo groups work, but is the reply address
specific to each group or does it get sent from the person to the group
address like this list?

 

Kind Regards,

Philip Seccombe

Turnstone Technologies NZ Limited

 

Phone: +64 9 970 5550

Fax: +64 9 970 5559

DDI: +64 9 970 5552

Email: [EMAIL PROTECTED] 

Web: www.turnstone.co.nz 

 

From: Firdaus Tjahyadi [mailto:[EMAIL PROTECTED] 
Sent: Monday, 12 February 2007 3:53 p.m.
To: users@spamassassin.apache.org
Subject: How to block yahoogroups?

 

Dear All

I'm having trouble blocking a few yahoogroups mailing lists.
I want to block these lists:

[EMAIL PROTECTED] 
[EMAIL PROTECTED]
[EMAIL PROTECTED]
[EMAIL PROTECTED]

but I don't want to block this list:

[EMAIL PROTECTED]

How do I set that rule?

I've tried setting it in badmailfrom, but that didn't work because
yahoogroups mail is sent by @returns.groups.yahoo.com.
I've also tried setting this in my SpamAssassin local.cf:
blacklist_from [EMAIL PROTECTED]
but it didn't work either.
thanks for any help



sa-update gives error message Insecure dependency in open while running with -T switch

2007-02-08 Thread Philip Seccombe
Hi everyone,

 

Tried Googling this but no success

 

Any advice would be greatly appreciated

 

Is it updating, or does that error mean it is stopping at the end and not
updating?

 

When I run sa-update -D I get the following:

 

nibbler:/etc/spamassassin# sa-update -D

[9013] dbg: logger: adding facilities: all

[9013] dbg: logger: logging level is DBG

[9013] dbg: generic: SpamAssassin version 3.1.0

[9013] dbg: config: score set 0 chosen.

[9013] dbg: dns: is Net::DNS::Resolver available? yes

[9013] dbg: dns: Net::DNS version: 0.48

[9013] dbg: dns: name server: 202.162.177.194, family: 2, ipv6: 0

[9013] dbg: generic: sa-update version svn231362

[9013] dbg: generic: using update directory: /etc/mail/spamassassin

[9013] dbg: diag: perl platform: 5.008004 linux

[9013] dbg: diag: module installed: Digest::SHA1, version 2.10

[9013] dbg: diag: module installed: Net::SMTP, version 2.26

[9013] dbg: diag: module installed: Mail::SPF::Query, version 1.997

[9013] dbg: diag: module not installed: IP::Country::Fast ('require'
failed)

[9013] dbg: diag: module installed: Razor2::Client::Agent, version 2.67

[9013] dbg: diag: module installed: Net::Ident, version 1.20

[9013] dbg: diag: module installed: IO::Socket::INET6, version 2.51

[9013] dbg: diag: module not installed: IO::Socket::SSL ('require'
failed)

[9013] dbg: diag: module installed: Time::HiRes, version 1.59

[9013] dbg: diag: module installed: DBI, version 1.46

[9013] dbg: diag: module installed: Getopt::Long, version 2.34

[9013] dbg: diag: module installed: LWP::UserAgent, version 2.033

[9013] dbg: diag: module installed: HTTP::Date, version 1.46

[9013] dbg: diag: module installed: Archive::Tar, version 1.23

[9013] dbg: diag: module installed: IO::Zlib, version 1.04

[9013] dbg: diag: module installed: DB_File, version 1.808

[9013] dbg: diag: module installed: HTML::Parser, version 3.45

[9013] dbg: diag: module installed: MIME::Base64, version 3.07

[9013] dbg: diag: module installed: Net::DNS, version 0.48

[9013] dbg: channel: attempting channel updates.spamassassin.org

[9013] dbg: channel: update directory
/etc/mail/spamassassin/updates_spamassassin_org

[9013] dbg: channel: channel cf file
/etc/mail/spamassassin/updates_spamassassin_org.cf

[9013] dbg: dns: 0.1.3.updates.spamassassin.org = 503923, parsed as
503923

[9013] dbg: channel: reading MIRRORED.BY file

[9013] dbg: channel: skipping non-HTTP mirror: # test mirror: zone,
cached via Coral

[9013] dbg: channel: skipping non-HTTP mirror:
#http://buildbot.spamassassin.org.nyud.net:8090/updatestage/

[9013] dbg: channel: found mirror http://spamassassin.kluge.net/updates/

[9013] dbg: channel: selected mirror
http://spamassassin.kluge.net/updates

[9013] dbg: http: GET request,
http://spamassassin.kluge.net/updates/503923.tar.gz

[9013] dbg: http: GET request,
http://spamassassin.kluge.net/updates/503923.tar.gz.sha1

[9013] dbg: http: IMS GET request,
http://spamassassin.kluge.net/updates/MIRRORED.BY, Fri, 02 Feb 2007
04:46:19 GMT

[9013] dbg: sha1: verification expected:
bc55e350cb7a31aa24b4aadcf46bfb9e8d4104ce

[9013] dbg: sha1: verification got :
bc55e350cb7a31aa24b4aadcf46bfb9e8d4104ce

[9013] dbg: channel: populating temp content file

[9013] dbg: channel: file verification passed, installing update

[9013] dbg: channel: updating MIRRORED.BY contents

[9013] dbg: channel: cleaning out update directory

[9013] dbg: channel: extracting archive

Insecure dependency in open while running with -T switch at
/usr/lib/perl/5.8/IO/File.pm line 70.

 

 

 

Kind Regards,

Philip Seccombe

Turnstone Technologies NZ Limited

 

Phone: +64 9 970 5550

Fax: +64 9 970 5559

DDI: +64 9 970 5552

Email: [EMAIL PROTECTED] 

Web: www.turnstone.co.nz 

 



RE: sa-update gives error message Insecure dependency in open while running with -T switch

2007-02-08 Thread Philip Seccombe
This is what happens:

commit: wrote /etc/perl/CPAN/Config.pm
CPAN: Storable loaded ok
CPAN: LWP::UserAgent loaded ok
Fetching with LWP:
  ftp://ftp.perl.org/pub/CPAN/authors/01mailrc.txt.gz
LWP failed with code[500] message[LWP::Protocol::MyFTP: Bad hostname
'ftp.perl.org']
Fetching with Net::FTP:
  ftp://ftp.perl.org/pub/CPAN/authors/01mailrc.txt.gz
Going to read /root/.cpan/sources/authors/01mailrc.txt.gz
CPAN: Compress::Zlib loaded ok
Fetching with LWP:
  ftp://ftp.perl.org/pub/CPAN/modules/02packages.details.txt.gz
Going to read /root/.cpan/sources/modules/02packages.details.txt.gz
  Database was generated on Wed, 07 Feb 2007 23:09:31 GMT

  There's a new CPAN.pm version (v1.8802) available!
  [Current version is v1.7601]
  You might want to try
install Bundle::CPAN
reload cpan
  without quitting the current session. It should be a seamless upgrade
  while we are running...

Fetching with LWP:
  ftp://ftp.perl.org/pub/CPAN/modules/03modlist.data.gz
Going to read /root/.cpan/sources/modules/03modlist.data.gz
Going to write /root/.cpan/Metadata
Warning: Cannot install File::IO, don't know what it is.
Try the command

i /File::IO/

to find objects with matching identifiers.
nibbler:~#


Kind Regards,
Philip Seccombe
Turnstone Technologies NZ Limited

Phone: +64 9 970 5550
Fax: +64 9 970 5559
DDI: +64 9 970 5552
Email: [EMAIL PROTECTED] 
Web: www.turnstone.co.nz 


-Original Message-
From: Doc Schneider [mailto:[EMAIL PROTECTED] 
Sent: Friday, 9 February 2007 11:53 a.m.
To: Philip Seccombe
Cc: users@spamassassin.apache.org
Subject: Re: sa-update gives error message Insecure dependency in open
while running with -T switch

Philip Seccombe wrote:
 Hi everyone,
 
  
 
 Tried Googling this but no success
 
  
 
 Any advice would be greatly appreciated
 
 Is it updating, or does that error mean it is stopping at the end and not
 updating?
 
  
 
 When I run sa-update -D I get the following:

 [9013] dbg: channel: extracting archive
 
 Insecure dependency in open while running with -T switch at 
 /usr/lib/perl/5.8/IO/File.pm line 70.

You can more than likely re-install File::IO which is part of the perl 
base but seems to me to be borked.

#perl -MCPAN -e 'install File::IO'

Should work. From your directory it appears you're using perl 5.8.??. Do
a perl -v, and if that install fails, send along the version info.

-- 

  -Doc

  SA/SARE/URIBL/SURBL -- Ninja
4:48pm  up 5 days,  8:14, 17 users,  load average: 0.40, 0.67, 0.66

  SARE HQ  http://www.rulesemporium.com/


RE: sa-update gives error message Insecure dependency in open while running with -T switch

2007-02-08 Thread Philip Seccombe
I ran perl -MCPAN -e 'install Bundle:CPAN' and went through all the
updates using defaults

Now it says:

nibbler:~# perl -MCPAN -e 'install File::IO'
CPAN: File::HomeDir loaded ok
Sorry, we have to rerun the configuration dialog for CPAN.pm due to
the following indispensable but missing parameters:

mbuild_arg, mbuild_install_arg, mbuild_install_build_command,
mbuildpl_arg


The next questions deal with Module::Build support.

A Build.PL is run by perl in a separate process. Likewise we run
'./Build' and './Build install' in separate processes. If you have any
parameters you want to pass to the calls, please specify them here.

Parameters for the 'perl Build.PL' command?
Typical frequently used settings:

--install_base /home/xxx # different installation
directory

Your choice:  []


Oops :s


Kind Regards,
Philip Seccombe
Turnstone Technologies NZ Limited

Phone: +64 9 970 5550
Fax: +64 9 970 5559
DDI: +64 9 970 5552
Email: [EMAIL PROTECTED] 
Web: www.turnstone.co.nz 


-Original Message-
From: Bob McClure Jr [mailto:[EMAIL PROTECTED] 
Sent: Friday, 9 February 2007 12:12 p.m.
To: users@spamassassin.apache.org
Subject: Re: sa-update gives error message Insecure dependency in open
while running with -T switch

On Fri, Feb 09, 2007 at 12:02:52PM +1300, Philip Seccombe wrote:
 This is what happens:
 
 commit: wrote /etc/perl/CPAN/Config.pm
 CPAN: Storable loaded ok
 CPAN: LWP::UserAgent loaded ok
 Fetching with LWP:
   ftp://ftp.perl.org/pub/CPAN/authors/01mailrc.txt.gz
 LWP failed with code[500] message[LWP::Protocol::MyFTP: Bad hostname
 'ftp.perl.org']
 Fetching with Net::FTP:
   ftp://ftp.perl.org/pub/CPAN/authors/01mailrc.txt.gz
 Going to read /root/.cpan/sources/authors/01mailrc.txt.gz
 CPAN: Compress::Zlib loaded ok
 Fetching with LWP:
   ftp://ftp.perl.org/pub/CPAN/modules/02packages.details.txt.gz
 Going to read /root/.cpan/sources/modules/02packages.details.txt.gz
   Database was generated on Wed, 07 Feb 2007 23:09:31 GMT
 
   There's a new CPAN.pm version (v1.8802) available!
   [Current version is v1.7601]
   You might want to try
 install Bundle::CPAN
 reload cpan
   without quitting the current session. It should be a seamless
upgrade
   while we are running...
 
 Fetching with LWP:
   ftp://ftp.perl.org/pub/CPAN/modules/03modlist.data.gz
 Going to read /root/.cpan/sources/modules/03modlist.data.gz
 Going to write /root/.cpan/Metadata
 Warning: Cannot install File::IO, don't know what it is.
 Try the command
 
 i /File::IO/

That should be IO::File.

 to find objects with matching identifiers.
 nibbler:~#
 
 
 Kind Regards,
 Philip Seccombe
 Turnstone Technologies NZ Limited
 
 Phone: +64 9 970 5550
 Fax: +64 9 970 5559
 DDI: +64 9 970 5552
 Email: [EMAIL PROTECTED] 
 Web: www.turnstone.co.nz 
 
 
 -Original Message-
 From: Doc Schneider [mailto:[EMAIL PROTECTED] 
 Sent: Friday, 9 February 2007 11:53 a.m.
 To: Philip Seccombe
 Cc: users@spamassassin.apache.org
 Subject: Re: sa-update gives error message Insecure dependency in open
 while running with -T switch
 
 Philip Seccombe wrote:
  Hi everyone,
  
   
  
  Tried Googling this but no success
  
   
  
  Any advice would be greatly appreciated
  
  Is it updating, or does that error mean it is stopping at the end and
  not updating?
  
   
  
  When I run sa-update -D I get the following:
 
  [9013] dbg: channel: extracting archive
  
  Insecure dependency in open while running with -T switch at 
  /usr/lib/perl/5.8/IO/File.pm line 70.
 
 You can more than likely re-install File::IO which is part of the perl

 base but seems to me to be borked.
 
 #perl -MCPAN -e 'install File::IO'
 
 Should work. From your directory it appears you're using perl 5.8.??.
 Do a perl -v, and if that install fails, send along the version info.
 
 -- 
 
   -Doc
 
   SA/SARE/URIBL/SURBL -- Ninja
 4:48pm  up 5 days,  8:14, 17 users,  load average: 0.40, 0.67,
0.66
 
   SARE HQ  http://www.rulesemporium.com/

Cheers,
-- 
Bob McClure, Jr. Bobcat Open Systems, Inc.
[EMAIL PROTECTED] http://www.bobcatos.com
Ah, Sovereign LORD, you have made the heavens and the earth by your
great power and outstretched arm. Nothing is too hard for you.
Jeremiah 32:17 (NIV)


RE: sa-update gives error message Insecure dependency in open while running with -T switch

2007-02-08 Thread Philip Seccombe
Running through that gets me to this:

Typical frequently used setting:

--uninst 1   # uninstall conflicting files

Your choice:  [] --uninst 1


Please remember to call 'o conf commit' to make the config permanent!

CPAN: Storable loaded ok
Going to read /root/.cpan/Metadata
  Database was generated on Wed, 07 Feb 2007 23:09:31 GMT
Test::Harness is up to date (2.64).
ExtUtils::CBuilder is up to date (0.18).
Module::Build is up to date (0.2806).
File::Spec is up to date (3.24).
File::Temp is up to date (0.18).
Scalar::Util is up to date (1.19).
Test::More is up to date (0.67).
Data::Dumper is up to date (2.121).
Digest::SHA is up to date (5.44).
File::HomeDir is up to date (0.63).
Compress::Zlib is up to date (2.003).
Archive::Tar is up to date (1.30).
Archive::Zip is up to date (1.18).
Net::Cmd is up to date (2.27).
Net::FTP is up to date (2.77).
Term::ReadKey is up to date (2.30).
Term::ReadLine::Perl is up to date (1.0302).
YAML is up to date (0.62).
Text::Glob is up to date (0.07).
CPAN is up to date (1.8802).
File::Which is up to date (0.05).
nibbler:~#

And there's just nothing happening


Kind Regards,
Philip Seccombe
Turnstone Technologies NZ Limited

Phone: +64 9 970 5550
Fax: +64 9 970 5559
DDI: +64 9 970 5552
Email: [EMAIL PROTECTED] 
Web: www.turnstone.co.nz 


-Original Message-
From: Bob McClure Jr [mailto:[EMAIL PROTECTED] 
Sent: Friday, 9 February 2007 12:41 p.m.
To: users@spamassassin.apache.org
Subject: Re: sa-update gives error message Insecure dependency in open
while running with -T switch

On Fri, Feb 09, 2007 at 12:26:31PM +1300, Philip Seccombe wrote:
 I ran perl -MCPAN -e 'install Bundle:CPAN' and went through all the
 updates using defaults
 
 Now it says:
 
 nibbler:~# perl -MCPAN -e 'install File::IO'

Don't forget that should be IO::File.

 CPAN: File::HomeDir loaded ok
 Sorry, we have to rerun the configuration dialog for CPAN.pm due to
 the following indispensable but missing parameters:
 
 mbuild_arg, mbuild_install_arg, mbuild_install_build_command,
 mbuildpl_arg
 
 
 The next questions deal with Module::Build support.
 
 A Build.PL is run by perl in a separate process. Likewise we run
 './Build' and './Build install' in separate processes. If you have any
 parameters you want to pass to the calls, please specify them here.
 
 Parameters for the 'perl Build.PL' command?
 Typical frequently used settings:
 
 --install_base /home/xxx # different installation
 directory
 
 Your choice:  []
 
 
 Oops :s

Okay, you're just running the setup for CPAN.  Take most of the
defaults, but I recommend you specify UNINST=1 for the install option,
as suggested, and then select the CPAN server(s) you want.

Then it will proceed with the install of IO::File.

 Kind Regards,
 Philip Seccombe
 Turnstone Technologies NZ Limited
 
 Phone: +64 9 970 5550
 Fax: +64 9 970 5559
 DDI: +64 9 970 5552
 Email: [EMAIL PROTECTED] 
 Web: www.turnstone.co.nz 
 
 major snippage

Cheers,
-- 
Bob McClure, Jr. Bobcat Open Systems, Inc.
[EMAIL PROTECTED] http://www.bobcatos.com
Ah, Sovereign LORD, you have made the heavens and the earth by your
great power and outstretched arm. Nothing is too hard for you.
Jeremiah 32:17 (NIV)


RE: sa-update gives error message Insecure dependency in open while running with -T switch

2007-02-08 Thread Philip Seccombe
I really am getting confused here

nibbler:/etc/init.d# spamassassin -V
SpamAssassin version 3.0.3
  running on Perl version 5.8.4
nibbler:/etc/init.d#

nibbler:/etc/init.d# apt-get install spamassassin
Reading Package Lists... Done
Building Dependency Tree... Done
spamassassin is already the newest version.
0 upgraded, 0 newly installed, 0 to remove and 38 not upgraded.
nibbler:/etc/init.d#

If apt-get will not install it, how do I upgrade it properly?



Kind Regards,
Philip Seccombe
Turnstone Technologies NZ Limited

Phone: +64 9 970 5550
Fax: +64 9 970 5559
DDI: +64 9 970 5552
Email: [EMAIL PROTECTED] 
Web: www.turnstone.co.nz 


-Original Message-
From: Daryl C. W. O'Shea [mailto:[EMAIL PROTECTED] 
Sent: Friday, 9 February 2007 11:59 a.m.
To: Philip Seccombe
Cc: users@spamassassin.apache.org
Subject: Re: sa-update gives error message Insecure dependency in open
while running with -T switch

Philip Seccombe wrote:
 [9013] dbg: generic: SpamAssassin version 3.1.0

Upgrade SA to anything newer than 3.1.0.


Training Bayes ham messages when they are sent out of the server

2007-02-06 Thread Philip Seccombe
Hi there,

 

Apologies if this has already been answered; I cannot find anything on
the web about this, and not being a Linux guru I'm a little bewildered.

Basically, a previous technician set up SpamAssassin on a server for us
and has since left the company on bad terms, so is no longer available
to support it.

 

The server is set up in the following manner:

qmail with vpopmail, SpamAssassin and Sophie antivirus.

It's set up as a spam-filtering server for our customers. We point their
MX records at the server, then in qmail's smtproutes we direct emails to
their actual mail servers.

If an email comes in and is spam, it is put into
[EMAIL PROTECTED], which are mailboxes set up on the
server and accessed via SquirrelMail (using courier_imap) to release emails
(I've changed the forward button to say "release").

Ham emails are forwarded on to the customer's mail server as set up in
the smtproutes file.

 

The problem is that spam has increased and I want to turn Bayes
filtering on. I've trained it on spam emails, as they are saved on the
server, but all the ham emails are sent away and not kept.

Does anyone have any ideas how I can get the emails back on the server,
or keep a copy on the server to build a Bayes database from?

I thought of forwarding emails back, but then it's a forwarded email and
not the actual one, which will mess up the database.
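
One way to keep a copy on the server, sketched under assumptions: if the delivery path can pipe each ham message through a small script before it is forwarded, the untouched message can be stored in a local Maildir and later fed to sa-learn --ham. The function name and the Maildir location are hypothetical, and how to hook this into qmail is left open.

```python
import mailbox


def archive_copy(raw_message: bytes, maildir_path: str) -> str:
    """Store an unmodified copy of raw_message in a Maildir; return its key."""
    # create=True builds the Maildir (tmp/, new/, cur/) if it is missing.
    box = mailbox.Maildir(maildir_path, create=True)
    try:
        return box.add(raw_message)
    finally:
        box.close()
```

Messages stored this way land in the Maildir's new/ subdirectory as plain copies of what was received, so they do not pick up the forwarding headers that a forwarded-back message would carry.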

 

The other question I had was about setting up SquirrelMail for
releasing emails. I've just butchered the SquirrelMail template to
look like a spam-filter release page, but it's far from ideal. Does anyone
know of any suitable SquirrelMail templates, or has anyone developed one?

 

 

Kind Regards,

Philip Seccombe

Turnstone Technologies NZ Limited

 

Phone: +64 9 970 5550

Fax: +64 9 970 5559

DDI: +64 9 970 5552

Email: [EMAIL PROTECTED] 

Web: www.turnstone.co.nz 

 



Using whitelist_from_rcvd when there's no rDNS

2006-12-11 Thread Philip Prindeville
I was wondering if SA could be modified to take an IP address
for the second argument to whitelist_from_rcvd as well as a
domain/host name string.

Lately I seem to be dealing with a lot of small businesses with
poorly set-up mail servers, and no rDNS.  Sigh.

It would be useful to not bounce their email.
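
To make the request concrete, here is a rough Python sketch of the matching behaviour being asked for: the second argument is treated as an IP address or CIDR block when it parses as one, and as a domain otherwise. This illustrates the proposal only; it is not how SpamAssassin implements whitelist_from_rcvd, and relay_matches is a made-up name.

```python
import ipaddress
from typing import Optional


def relay_matches(spec: str, relay_ip: str, relay_rdns: Optional[str]) -> bool:
    """Match a relay against spec, which is an IP/CIDR or a domain name."""
    try:
        network = ipaddress.ip_network(spec, strict=False)
    except ValueError:
        network = None  # not an IP literal, so treat spec as a domain

    if network is not None:
        # IP form: usable even when the relay has no reverse DNS at all.
        return ipaddress.ip_address(relay_ip) in network

    if not relay_rdns:
        return False  # the domain form cannot match without rDNS

    host = relay_rdns.lower().rstrip(".")
    domain = spec.lower().rstrip(".")
    return host == domain or host.endswith("." + domain)
```

With the IP form, a server lacking rDNS can still be whitelisted by its address, which is exactly the case the domain form cannot handle.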

Thanks,

-Philip



This seen on Dice

2006-12-08 Thread Philip Prindeville
Any takers?  ;-)

http://seeker.dice.com/seeker.epl?rel_code=1102op=5type=14dockey=xml/7/a/[EMAIL
 PROTECTED]bb=0source=15




Re: Braindeath in the Navy

2006-11-23 Thread Philip Prindeville
Jonas Eckerman wrote:

Philip Prindeville wrote:

  

Received: (private information removed)



  

It just boggles my mind why anyone would go through that much trouble
to deliberately damage a header line, rather than just delete it.



The only reason I can think of for that (in this case) is that they want to 
keep those headers in order for server hop counting to work.

Of course, it would be much better (and more useful) if they kept the time 
stamp and obfuscated the headers without breaking the format. :-/

/Jonas
  


Sure.

They could replace it with either:

Received: (deleted); timestamp-here


and still be in compliance, or rewrite it:

X-Header-Rewrite: Received-obfuscated-for-no-obvious-purpose=3


Or any other number of imaginative yet-ultimately-misguided but
still-in-compliance-with-applicable-standards ways.

-Philip




List weirdness

2006-11-23 Thread Philip Prindeville
I'm seeing the following (attached).

I went back and looked at the message that seems to have
provoked it, and there was nothing odd about the message:
no attachments, nothing but text/plain 7-bit, in the body
(though it's weird that it's a 7-bit body, but charset=iso-8859-1).

Is this a lurking ratware writer?  Who on this list runs Exchange?

Why is this bouncing back to me, and not the envelope sender,
which was:

Return-Path: [EMAIL PROTECTED]


-Philip


---BeginMessage---
Subject of the message: Redundant QP encoding of Subject/From fields...
Recipient of the message: SpamAssassin Users
---End Message---


Re: Interesting text content in the new spams

2006-11-23 Thread Philip Prindeville
Charlie Clark wrote:

Looks like there are some pretty impressive self-learning systems out  
there. I'm enclosing the content of the text part of a new spam. I  
think it's quite an interesting vocabulary that they are using,  
presumably from their own trained ham database. This spam got through  
four different checks (postfix + blacklisting, spamassassin,  
spambayes and Opera's own spam system)! Given them a couple of years  
and we can finally close slashdot et al. and actually start reading  
this stuff! ;-)

Charlie

Raquo Areas Bugs. Open total a bug Tracking Support or Requests in  
Tech Patches.
Release archive is raquo of Areas?
Framework gd Engine Details Developers Beta Intended Audience. In  
Create Newscreate Farm Mapcreate or Projectnew am Wantedmy?  
Statistics currently Browse Most!
Of feeds available for this About by or the from. Activity Percentile  
last week View list of feeds available is.
Language a License gnu of. Patches Patch Feature a Request. Details  
Developers Beta Intended Audience Education Technology.
Education Technology or Other Topic English Unix name Registered.  
Language License gnu?
Va Software Ostg Source Group all Rights Reserved or Find.
Projectnew Wantedmy Statussite is.
Areas in Bugs open total bug Tracking Support. Va Software Ostg  
Source Group all Rights Reserved or Find.
Bug or Tracking Support Requests or Tech Patches am Patch in.
Audience or Education Technology Other Topic English Unix.
Support in Requests Tech Patches Patch Feature Request. Kolmafia sw  
Test Automation Framework gd. System of os Written an language of  
License gnu General Public.
License gnu General Public gpl. Create Newscreate is Farm of  
Mapcreate Projectnew am Wantedmy Statussite Status web!
Sprites a Release archive raquo of Areas Bugs?
Open total a bug Tracking Support or Requests in Tech Patches. Book  
Search is Advanced log in Create is. Va Software Ostg Source Group in  
all Rights.
Latest a News new or Graphics and Sprites Release archive. Va  
Software Ostg Source Group in all Rights.
Intended Audience Education.

--
  


I hear the New York Times isn't too picky about who they hire.

Someone could create an army of ghost writers and sit back and
collect the paychecks.

-Philip



Re: Interesting text content in the new spams

2006-11-23 Thread Philip Prindeville
Given that spammers read this list to figure out how to defeat us...
Why don't we just secure a copy of ratware and engineer a retro-virus
for it?

-Philip


Justin Mason wrote:

there was a very interesting project described in CEAS which did
just this -- engaged 419ers and other spammers in negotiation,
to waste their time.  It's a great idea!

--j.

[EMAIL PROTECTED] writes:
  

Hi,

anybody recall that ELIZA program from ages ago? It would be interesting to
see her response to those utterances :)

Wolfgang Hamann



Looks like there are some pretty impressive self-learning systems out
there. I'm enclosing the content of the text part of a new spam. I
think it's quite an interesting vocabulary that they are using,
presumably from their own trained ham database. This spam got through
four different checks (postfix + blacklisting, spamassassin,
spambayes and Opera's own spam system)! Given them a couple of years
and we can finally close slashdot et al. and actually start reading
this stuff! ;-)

Charlie

Raquo Areas Bugs. Open total a bug Tracking Support or Requests in
Tech Patches.
Release archive is raquo of Areas?
Framework gd Engine Details Developers Beta Intended Audience. In
Create Newscreate Farm Mapcreate or Projectnew am Wantedmy?
Statistics currently Browse Most!
Of feeds available for this About by or the from. Activity Percentile
last week View list of feeds available is.
Language a License gnu of. Patches Patch Feature a Request. Details
Developers Beta Intended Audience Education Technology.
Education Technology or Other Topic English Unix name Registered.
Language License gnu?
Va Software Ostg Source Group all Rights Reserved or Find.
Projectnew Wantedmy Statussite is.
Areas in Bugs open total bug Tracking Support. Va Software Ostg
Source Group all Rights Reserved or Find.
Bug or Tracking Support Requests or Tech Patches am Patch in.
Audience or Education Technology Other Topic English Unix.
Support in Requests Tech Patches Patch Feature Request. Kolmafia sw
Test Automation Framework gd. System of os Written an language of
License gnu General Public.
License gnu General Public gpl. Create Newscreate is Farm of
Mapcreate Projectnew am Wantedmy Statussite Status web!
Sprites a Release archive raquo of Areas Bugs?
Open total a bug Tracking Support or Requests in Tech Patches. Book
Search is Advanced log in Create is. Va Software Ostg Source Group in
all Rights.
Latest a News new or Graphics and Sprites Release archive. Va
Software Ostg Source Group in all Rights.
Intended Audience Education.

--
Charlie Clark
Helmholtzstr. 20
Düsseldorf
D- 40215
Tel: +49-211-938-5360
GSM: +49-178-782-6226








Re: Interesting text content in the new spams

2006-11-23 Thread Philip Prindeville
Poor choice of words.

Not a virus.  A vaccine.  ;-)

-Philip


Justin Mason wrote:

er, it's illegal, and we're not criminals like they are? ;)

--j.

Philip Prindeville writes:
  

Given that spammers read this list to figure out how to defeat us...
Why don't we just secure a copy of ratware and engineer a retro-virus
for it?

-Philip


Justin Mason wrote:



there was a very interesting project described in CEAS which did
just this -- engaged 419ers and other spammers in negotiation,
to waste their time.  It's a great idea!

--j.

[EMAIL PROTECTED] writes:
 

  

Hi,

anybody recall that ELIZA program from ages ago? It would be interesting to
see her response to those utterances :)

Wolfgang Hamann

   



Looks like there are some pretty impressive self-learning systems out
there. I'm enclosing the content of the text part of a new spam. I
think it's quite an interesting vocabulary that they are using,
presumably from their own trained ham database. This spam got through
four different checks (postfix + blacklisting, spamassassin,
spambayes and Opera's own spam system)! Given them a couple of years
and we can finally close slashdot et al. and actually start reading
this stuff! ;-)

Charlie

Raquo Areas Bugs. Open total a bug Tracking Support or Requests in
Tech Patches.
Release archive is raquo of Areas?
Framework gd Engine Details Developers Beta Intended Audience. In
Create Newscreate Farm Mapcreate or Projectnew am Wantedmy?
Statistics currently Browse Most!
Of feeds available for this About by or the from. Activity Percentile
last week View list of feeds available is.
Language a License gnu of. Patches Patch Feature a Request. Details
Developers Beta Intended Audience Education Technology.
Education Technology or Other Topic English Unix name Registered.
Language License gnu?
Va Software Ostg Source Group all Rights Reserved or Find.
Projectnew Wantedmy Statussite is.
Areas in Bugs open total bug Tracking Support. Va Software Ostg
Source Group all Rights Reserved or Find.
Bug or Tracking Support Requests or Tech Patches am Patch in.
Audience or Education Technology Other Topic English Unix.
Support in Requests Tech Patches Patch Feature Request. Kolmafia sw
Test Automation Framework gd. System of os Written an language of
License gnu General Public.
License gnu General Public gpl. Create Newscreate is Farm of
Mapcreate Projectnew am Wantedmy Statussite Status web!
Sprites a Release archive raquo of Areas Bugs?
Open total a bug Tracking Support or Requests in Tech Patches. Book
Search is Advanced log in Create is. Va Software Ostg Source Group in
all Rights.
Latest a News new or Graphics and Sprites Release archive. Va
Software Ostg Source Group in all Rights.
Intended Audience Education.

--
Charlie Clark
Helmholtzstr. 20
Düsseldorf
D- 40215
Tel: +49-211-938-5360
GSM: +49-178-782-6226




   






Re: Greylisting

2006-11-22 Thread Philip Prindeville
Don't they?  I thought the recommended retry time was 2 minutes,
doubling on each failure, and maxing out at 2 hours.

That's what sendmail does (unless its retry time has been explicitly
set to more than 2 hours, of course).
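
For illustration, the schedule as described here (first retry after 2 minutes, doubling on each failure, capped at 2 hours) can be worked out with a few lines of Python. The numbers come from this message, not from any MTA's documentation, and both function names are invented for the sketch.

```python
def retry_schedule(first=120, cap=7200, attempts=8):
    """Seconds after the first attempt at which each retry would happen."""
    times = []
    delay = first
    elapsed = 0
    for _ in range(attempts):
        elapsed += delay
        times.append(elapsed)
        delay = min(delay * 2, cap)  # double the wait, but never past the cap
    return times


def first_retry_after(window, schedule):
    """First retry time that falls at or beyond a greylisting window."""
    for t in schedule:
        if t >= window:
            return t
    return None


print(retry_schedule(attempts=4))
print(first_retry_after(900, retry_schedule()))
```

Under those assumptions the retries fall at 2, 6, 14 and 30 minutes, so a sender facing a 15-minute greylist window would get through on the fourth retry, half an hour after the first attempt.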

-Philip


Richard Frovarp wrote:

I don't think the RFCs specify any time limit. Most timeout after 5 days 
of trying. We run 3 equivalent scanning machines, which requires us to 
run a greylisting that will sync between them. That could cause a large 
delay, if the sending machine tries to send to a different host that 
isn't synced. Messages that aren't sent from the same machine (SMTP 
farms like at GMail) can cause trouble as well, since the IP will 
change. The whitelist usually will timeout after a period of time, so 
there is a delay that may be induced again in the future, but that 
depends on setup.

If a sensitive piece of mail needs to get through, it may be possible 
for the user to send the message again after the delay period has 
elapsed. This would be a new message, but if it leaves the same IP, with 
the same from and to pair (or however your greylisting works), it would 
fire right on through the greylist no problem. Not a perfect solution, 
but should work for rare occasions.
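
The behaviour described in the two paragraphs above can be sketched as a table keyed on the (IP, sender, recipient) triplet. This is a simplified in-memory Python illustration with invented names; real greylisting implementations add expiry, persistence and (as mentioned) syncing between hosts.

```python
class Greylist:
    """Toy greylist: defer a triplet until `delay` seconds after first sight."""

    def __init__(self, delay=900):
        self.delay = delay
        self.first_seen = {}  # (ip, sender, recipient) -> time first seen

    def check(self, ip, sender, recipient, now):
        """Return True to accept the message, False to temp-fail it."""
        key = (ip, sender.lower(), recipient.lower())
        # Record the first sighting; later calls reuse the stored time.
        seen = self.first_seen.setdefault(key, now)
        return now - seen >= self.delay
```

A resend from the same IP with the same from/to pair after the delay passes straight through, while the same message arriving from a different farm IP starts a fresh delay, which is the trouble noted for large SMTP farms.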

One probably can whitelist recipients or recipient domains so that they 
are not affected by greylisting.

Last week greylisting stopped 1.3 million messages, which is after the 
blacklists and greet pause did their significant work.

Richard
  




Re: ??

2006-11-21 Thread Philip Prindeville
John D. Hardin wrote:

On Mon, 20 Nov 2006, twofers wrote:

  

I would like to know what local rule I could invoke to tag email that the 
subject is not in english.
   
  header   NOT_IN_ENGLISH Subject !~ /English/i
  describe NOT_IN_ENGLISH Subject Contains Non English Characters
  score NOT_IN_ENGLISH 3.5
   
  What regexp could I use?



I haven't tested this, but it may work:

header   NOT_IN_ENGLISH Subject =~ /[\x80-\xFF]{3}/

That should hit on a string of at least three characters with the high
bit set.

You may need to drop it down to {2} to get good detection.

Don't score it very high.
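
To see what that pattern does and does not catch, here is the same byte-level test expressed in Python. It only illustrates the regular expression on raw bytes; it makes no claim about how SpamAssassin decodes the Subject header before matching, and the function name is invented.

```python
import re

# Three consecutive bytes with the high bit set, as in the rule above.
HIGH_BIT_RUN = re.compile(rb"[\x80-\xff]{3}")


def has_high_bit_run(subject_bytes: bytes) -> bool:
    """True when the raw subject contains a run of three 8-bit bytes."""
    return HIGH_BIT_RUN.search(subject_bytes) is not None


print(has_high_bit_run("Привет".encode("koi8-r")))      # Cyrillic text
print(has_high_bit_run("résumé".encode("iso-8859-1")))  # isolated accents
```

Text in a single-byte non-Latin charset produces long runs of high-bit bytes, whereas isolated accented letters in Latin-1 do not reach three in a row.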
  


Of course, that would exclude messages with ISO Latin 1 (8859-1)
characters like Yen, Pound Sterling, Trademark, etc. Plus, there are
words in English that when properly written do contain accents,
such as résumé, daïs, cliché, coöperation, etc.

Excluding words with pounds and yen in the Subject line might be
a good thing, however...

-Philip



Redundant QP encoding of Subject/From fields...

2006-11-21 Thread Philip Prindeville
I got the following spam.  I've included the header:

Return-Path: [EMAIL PROTECTED]
Received: from mail.libertysurf.net (webmail-out.libertysurf.net [213.36.80.105])
    by mail.redfish-solutions.com (8.13.8/8.13.7) with ESMTP id kAM1ckKs008704
    for [EMAIL PROTECTED]; Tue, 21 Nov 2006 18:38:52 -0700
Received: from aliceadsl.fr (192.168.10.57) by mail.libertysurf.net (7.1.026)
    id 43F3DDC5003935BF; Wed, 22 Nov 2006 02:22:49 +0100
Date: Wed, 22 Nov 2006 02:22:49 +0100
Message-Id: [EMAIL PROTECTED]
Subject: =?iso-8859-1?Q?Representative_Needed.?=
MIME-Version: 1.0
X-Sensitivity: 3
Content-Type: multipart/alternative;
    boundary=_=__=_XaM3_.1164158569.2A.498089.42.6019.52.42.007.3770
From: [EMAIL PROTECTED] [EMAIL PROTECTED]

My question is this.  The encoding of the Subject: and From: lines
is redundant.  There are no non-USASCII characters in either field.
Hence, specifying =?iso-8859-1?Q? is not necessary.
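
The notion of "redundant" encoding used here can be checked mechanically: decode each RFC 2047 encoded-word and see whether the result is pure US-ASCII. A small Python sketch using the standard email.header module follows; the function name is invented, and this is not the logic of the SUBJECT_EXCESS_QP rule itself.

```python
from email.header import decode_header


def has_redundant_encoding(raw_header: str) -> bool:
    """True if any encoded-word in the header decodes to pure US-ASCII."""
    for payload, charset in decode_header(raw_header):
        if charset is None:
            continue  # an ordinary, unencoded run of text
        try:
            text = payload.decode(charset)
        except (LookupError, UnicodeDecodeError):
            continue  # unknown charset or bad bytes: cannot judge
        if text.isascii():
            return True
    return False


print(has_redundant_encoding("=?iso-8859-1?Q?Representative_Needed.?="))
print(has_redundant_encoding("=?iso-8859-1?Q?R=E9sum=E9?="))
```

Decoding with the declared charset before testing matters for 7-bit encodings such as ISO-2022-JP, where the raw bytes are all below 0x80 but the text is not ASCII.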

The test SUBJECT_EXCESS_QP seems to handle this (at least the Subject:
part).  I'd like to crank it up to 3.5 or higher.

Any intuitive reasons why this wouldn't work?  Are there any
valid mailers that are braindead?

Thanks,

-Philip




Re: Greylisting

2006-11-21 Thread Philip Prindeville
John Andersen wrote:

On Monday 20 November 2006 15:08, Rick Macdougall wrote:
  

It's possible that they could send it all twice but I've never seen it.
  Remember that some unbelievable number of infected Windows clients are
the main source of spam and it would just be too much trouble for the
spammer to try every address twice after a 15 minute interval.



Oh come on!  It costs the spammer NOTHING to make that adjustment
to his bot net.  Its someone else's bandwidth, and someone else's
cpu cycles.

They are reading this list and planning the changes already.

  


If the graylist time is 15 minutes (for instance), and someone
reports them fairly soon after they start up... and their ISP is
quick to shut them down (cough, cough) then we've managed
to severely limit how many sites they hit before they get
shut down.

Of course, graylisting a larger value (2 hours) for totally
unknown correspondents would be more effective.

-Philip



Braindeath in the Navy

2006-11-21 Thread Philip Prindeville
Well, I tried to tell some of the people responsible for
the servers below that what they were doing was broken,
including citing chapter and verse where in RFC 2822 the
syntax of the Received: lines is spec'd out:

Received: from Gate2-sandiego.nmci.navy.mil (gate2-sandiego.nmci.navy.mil [138.163.0.42])
    by mail.redfish-solutions.com (8.13.8/8.13.7) with ESMTP id kAGNLZHp020689
    for [EMAIL PROTECTED]; Thu, 16 Nov 2006 16:21:40 -0700
Received: from nawesdnims03.nmci.navy.mil by Gate2-sandiego.nmci.navy.mil
    via smtpd (for mail.redfish-solutions.com [71.36.29.88]) with ESMTP;
    Thu, 16 Nov 2006 23:21:40 +
Received: (private information removed)
Received: (private information removed)
Received: (private information removed)
Received: (private information removed)
Received: (private information removed)

and which fields it requires (like the semi-colon followed by the
timestamp coming after a comment field) [cf: RFC 2822, section 3.6.7:

received        =   "Received:" name-val-list ";" date-time CRLF

name-val-list   =   [CFWS] [name-val-pair *(CFWS name-val-pair)]

including the definition of CFWS in 3.2.3.]
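
A rough way to test a Received: field body against the part of that grammar at issue here (a ";" followed by a parseable date-time) is sketched below in Python. It is a deliberate simplification: it does not parse name-val-list or comments, so a ";" inside a comment would confuse it, and the function name is invented.

```python
from email.utils import parsedate_to_datetime


def received_has_timestamp(value: str) -> bool:
    """Check that a Received: body ends in ';' plus a parseable date-time."""
    _head, sep, tail = value.rpartition(";")
    if not sep:
        return False  # no semicolon at all
    try:
        parsedate_to_datetime(tail.strip())
    except (TypeError, ValueError):
        return False  # text after the semicolon is not a valid date
    return True


print(received_has_timestamp("(private information removed)"))
print(received_has_timestamp("(deleted); Thu, 16 Nov 2006 16:21:40 -0700"))
```

The mangled lines fail this check because they have no semicolon and no timestamp, while a comment-only body followed by "; date-time" passes.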

It just boggles my mind why anyone would go through that much trouble
to deliberately damage a header line, rather than just delete it.

Well, maybe they'll get a whiff of the error of their ways in the
Hall of Spam Shame...

-Philip




Accurately deprecating charsets

2006-11-17 Thread Philip Prindeville
I'll ask again...  Can someone who handles a fair mix of
email content (i.e. not just western European languages)
do a triage (individually) of the rules below for ham versus
spam?

I'd suspect that very little genuine ham contains IBM852
or Unicode or CP125[0-8] these days.

Thanks,

-Philip



Robert Nicholson wrote:

 so what is the conclusion to this issue?

 why when I set ok_locales to it th en does it allow any Charset with
 Windows in the name
 to bypass that setting?

 Why is it that is_charset_ok_for_locales written to give exceptions

 sub is_charset_ok_for_locales {
   my ($cs, @locales) = @_;

   $cs = uc $cs; $cs =~ s/[^A-Z0-9]//g;
   $cs =~ s/^3D//gs;   # broken by quoted-printable
   $cs =~ s/:.*$//gs;  # trim off multiple charsets, just use 1st

   study $cs;
   #warn "JMD $cs";

   # always OK (the net speaks mostly roman charsets)
   return 1 if ($cs eq 'USASCII');
   return 1 if ($cs =~ /^ISO8859/);
   return 1 if ($cs =~ /^ISO10646/);
   return 1 if ($cs =~ /^UTF/);
   return 1 if ($cs =~ /^UCS/);
   return 1 if ($cs =~ /^CP125/);
   return 1 if ($cs =~ /^WINDOWS/);  # argh, Windows
   return 1 if ($cs eq 'IBM852');
   return 1 if ($cs =~ /^UNICODE11UTF[78]/); # wtf? never heard of it
   return 1 if ($cs eq 'XUNKNOWN');  # added by sendmail when converting to 8bit
   return 1 if ($cs eq 'ISO');       # Magellan, sending as 'charset=iso 8859-15'. grr

   foreach my $locale (@locales) {
     if (!defined($locale) || $locale eq 'C') { $locale = 'en'; }
     $locale =~ s/^([a-z][a-z]).*$/$1/;  # zh_TW... => zh

     my $ok_for_loc = $charsets_for_locale{$locale};
     next if (!defined $ok_for_loc);

     if ($ok_for_loc =~ /(?:^| )\Q${cs}\E(?:$| )/) {
       return 1;
     }
   }

   return 0;
 }




Re: ????? ??? ??????

2006-11-16 Thread Philip Prindeville
I would say that this issue in general (and this file in particular) is
more than overdue for a revisiting.

I haven't seen UCS, CP125?, or IBM852 for a long time.  Likewise
for UNICODE or XUNKNOWN.

As for ISO (tout court) from Magellan... that's broken, and if it
hasn't been fixed by now, then it's their problem, not ours.  Easier to
whitelist the few users still clinging to broken mailers than to
continue to compromise spamproofness.

As for Windows...  I would change the test from:

$cs =~ /^WINDOWS/

to:

$cs eq 'WINDOWS1252'

instead.  There is no reason to use any of the other
Windows character sets:  they offer nothing that UTF doesn't
already have.
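
To make the difference concrete, here is a minimal Python sketch (not SpamAssassin code) of the normalization step quoted in this thread and of the two Windows tests. The function names are invented for illustration; only the uppercase / strip-punctuation / drop-leading-3D behavior is taken from the quoted Perl. Note that after normalization the hyphen is gone, so the exact test has to compare against WINDOWS1252.

```python
import re


def normalize_charset(cs: str) -> str:
    """Mimic the quoted Perl: uppercase, keep only A-Z and 0-9,
    then drop a leading '3D' left behind by broken quoted-printable."""
    cs = re.sub(r"[^A-Z0-9]", "", cs.upper())
    return re.sub(r"^3D", "", cs)


def windows_ok_prefix(cs: str) -> bool:
    """Current behavior: any charset whose name starts with WINDOWS passes."""
    return normalize_charset(cs).startswith("WINDOWS")


def windows_ok_exact(cs: str) -> bool:
    """Proposed behavior: only Windows-1252 passes unconditionally."""
    return normalize_charset(cs) == "WINDOWS1252"


if __name__ == "__main__":
    for name in ("windows-1252", "Windows-1251", "windows-1256"):
        print(name, windows_ok_prefix(name), windows_ok_exact(name))
```

With the prefix test all three names pass; with the exact test only windows-1252 does, and the others would have to be allowed per locale.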

Being liberal in what you accept is good if interoperability is
your goal.  If security and integrity, however, are paramount, then
being paranoid in what you accept might actually be more
appropriate.

Is there anyone out there (preferably in Central/Eastern Europe)
that handles a high volume of traffic that can tell us if
any of these encodings are still in legitimate use?  Like ISO10646
or UCS or ISO-8859-8 or CP125?, etc.

The alternative is to add checks per language for each of the
Windows-125[0-8] types.  Yes, you can encode English in
Windows-1256... but a sane mailer would detect that a message
fits entirely into 7 bits and use US-ASCII instead.

If it doesn't, then it's broken and needs to be fixed.
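
As a rough illustration of what such a check amounts to on the sending side, here is a small Python sketch. The function name and the UTF-8 fallback are assumptions made for the example, not the behavior of any particular mailer.

```python
def pick_charset(text: str, fallback: str = "utf-8") -> str:
    """Label pure 7-bit text as us-ascii; otherwise use the fallback charset."""
    return "us-ascii" if text.isascii() else fallback


if __name__ == "__main__":
    print(pick_charset("Plain English text"))
    print(pick_charset("caf\u00e9"))
```

A mailer that did this would never announce Windows-1256 for a message that contains nothing but ASCII.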

I'm not against reinventing the wheel when a new design is
offered that's better.  But I'm not convinced that Windows-1252
is an improvement over Latin-1.  For instance, the glyphs oe
and OE aren't a unique letter:  they are a presentation (i.e.
ligature) that renders (displays) differently from writing o and
e separately... but it is in fact just the two letters o and e
that are being represented (similarly for ij in Dutch, etc)
without kerning between them.

The bottom line is you don't need specific characters for
oe and ij, etc.  You just need a rendering engine that
understands when using a ligature is appropriate (same
as with ss in German, or ff, fl, etc. in English).

Making these distinct characters was folly.

But I digress.

Just out of curiosity, what are the charsets_for_locale{'en'}
anyway?  If it were up to me, I'd limit it to USASCII,
ISO-8859-1, and UTF-8.  Period.

Likewise, for Japanese, how many UA's use anything other
than ISO2022JP?  This is the blessed standard.  Anything else
is out-of-date and requires a fix.

-Philip


Robert Nicholson wrote:

 so what is the conclusion to this issue?

 Why, when I set ok_locales to "it", does it still allow any charset with
 Windows in the name to bypass that setting?

 Why is is_charset_ok_for_locales written to give these exceptions?

 On Nov 13, 2006, at 8:30 PM, Giampaolo Tomassoni wrote:

 # don't allow windows-1252 text attachments...
 mimeheader __CTYPE_MH_WIN1252   Content-Type =~ /charset=(\"windows-125[0-8]\"|windows-125[0-8])/i
 meta WIN_CHARSET                ((__CTYPE_MH_HTML || __CTYPE_MH_TEXT_PLAIN) && __CTYPE_MH_WIN1252)
 describe WIN_CHARSET            Content-Type is Windows-specific text
 score WIN_CHARSET               0.01





Re: ????? ??? ??????

2006-11-16 Thread Philip Prindeville
You'd think, wouldn't you

-Philip


Robert Nicholson wrote:

 This is Japanese

 # Japanese: Peter Evans writes: iso-2022-jp = rfc approved, rfc 1468, created
 # by Jun Murai in 1993 back when he didnt have white hair!  rfc approved.
 # (rfc 2237) -- by M$.
 'ja' => 'EUCJP JISX020119760 JISX020819830 JISX020819900 JISX020819970 '.
         'JISX021219900 JISX021320001 JISX021320002 SHIFT_JIS SHIFTJIS '.
         'ISO2022JP SJIS JIS7 JISX0201 JISX0208 JISX0212',

 Surely the MUA only changes the charset to Windows-1255 once it sees
 there are glyphs in which case you'd expect seldom to see Windows-1255
 when there are no glyphs present?

 On Nov 16, 2006, at 4:24 PM, Philip Prindeville wrote:

 Windows-1256... but a sane mailer would detect that a message

 all fits into 7-bits and use USASCII instead.





Re: ????? ??? ??????

2006-11-16 Thread Philip Prindeville
[EMAIL PROTECTED] wrote:

The bottom line is you don't need specific characters for
oe and ij, etc.  You just need a rendering engine that
understands when using a ligature is appropriate (same
as with ss in German, or ff, fl, etc. in English).

Making these distinct characters was folly.

But I digress.

  


Hi,

Typography considers it a gross error to use ligature characters (fl) if they
occur at the boundary between word compounds. So either a rendering system has
to be pretty smart, or the transmitted text needs to be able to represent the
ligature as well as the separate characters.
This slightly resembles Arabic, where different glyphs are used for the same
character at the beginning of a word, in the middle, or at the end.
Of course, most email writers are not concerned about these fine details, and
the company behind the windows- charsets does not seem to understand kerning
at all.

Wolfgang Hamann
  


You're right!  The rendering system does need to be pretty smart.

Unfortunately, few of them are.

But that's still no excuse to lobotomize character encodings.

The least offensive of all solutions would have been to create a
throw-away non-rendering character, like the non-break space,
that says, glue these two together as a ligature.  It would waste
a lot less of an already limited encoding space, too.

-Philip





Re: Microsoft blacklisted?

2006-11-14 Thread Philip Prindeville
SM wrote:

At 18:56 13-11-2006, Philip Prindeville wrote:
  

I recently saw an email get bounced that was legitimately coming


from Microsoft:

[snip]


  

I've put into my spamassassin/sa-mimedefang.cf file:

whitelist_from_rcvd [EMAIL PROTECTED] smtp.microsoft.com


What am I missing at this point?

Does the 2nd arg to the whitelist_from_rcvd need to be
maila.microsoft.com instead?



Yes.

Regards,
-sm 

  


The problem with this is that the DNS returns the response (of the multiple
PTR records) in no particular order, so looking up the rDNS can return
one of three different names...

# nslookup
> set type=any
> server ns4.msft.net.
Default server: ns4.msft.net.
Address: 207.46.66.126#53
> 212.115.107.131.in-addr.arpa
Server:         ns4.msft.net.
Address:        207.46.66.126#53

212.115.107.131.in-addr.arpa    name = mail1.microsoft.com.
212.115.107.131.in-addr.arpa    name = smtp.microsoft.com.
212.115.107.131.in-addr.arpa    name = maila.microsoft.com.
 


So, if I put:


whitelist_from_rcvd [EMAIL PROTECTED] mail1.microsoft.com
whitelist_from_rcvd [EMAIL PROTECTED] smtp.microsoft.com
whitelist_from_rcvd [EMAIL PROTECTED] maila.microsoft.com


will that work?  Or will each command clobber the previous one?
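
For reference, here is a simplified Python model of the question, assuming that entries accumulate rather than replace each other and that the second argument is matched against the relay's rDNS either exactly or as a domain suffix. This is a sketch of the idea, not SpamAssassin's implementation; the sender address below is a stand-in, since the real one is redacted in the archive.

```python
from fnmatch import fnmatchcase


def rdns_matches(rdns: str, pattern: str) -> bool:
    """True if rdns equals pattern or is a host inside the pattern domain."""
    rdns = rdns.lower().rstrip(".")
    pattern = pattern.lower().rstrip(".")
    return rdns == pattern or rdns.endswith("." + pattern)


def is_whitelisted(sender: str, rdns: str, entries) -> bool:
    """entries is a list of (address_glob, rdns_pattern) pairs; any hit wins."""
    return any(
        fnmatchcase(sender.lower(), addr.lower()) and rdns_matches(rdns, host)
        for addr, host in entries
    )


if __name__ == "__main__":
    # Hypothetical address standing in for the redacted one.
    entries = [
        ("someone@microsoft.com", "mail1.microsoft.com"),
        ("someone@microsoft.com", "smtp.microsoft.com"),
        ("someone@microsoft.com", "maila.microsoft.com"),
    ]
    print(is_whitelisted("someone@microsoft.com", "maila.microsoft.com.", entries))
    print(is_whitelisted("someone@microsoft.com", "other.microsoft.com", entries))
```

Under this model, whichever of the three PTR names the resolver returns, one of the three entries matches, while other Microsoft hosts stay outside the whitelist.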

-Philip




Re: Microsoft blacklisted?

2006-11-14 Thread Philip Prindeville
SM wrote:

At 11:49 14-11-2006, Philip Prindeville wrote:
  

The problem with this is that the DNS returns the response (of the multiple
PTR records) in no particular order, so looking up the rDNS can return
one of three different names...

[snip]


So, if I put:

whitelist_from_rcvd [EMAIL PROTECTED] mail1.microsoft.com



Then use:

whitelist_from_rcvd [EMAIL PROTECTED] microsoft.com

Regards,
-sm 
  


Yeah, in an earlier message, I considered that, but didn't want to
leave myself wide open to every misbehaving host at Microsoft.

So I take it the short answer is that you can't have three entries for
the same mail address, and can't have multiple hostname args (which
you really should be able to do... or maybe even take an IP address
directly!).

-Philip



Re: Microsoft blacklisted?

2006-11-14 Thread Philip Prindeville
John D. Hardin wrote:

On Tue, 14 Nov 2006, Daryl C. W. O'Shea wrote:

  

Philip Prindeville wrote:



whitelist_from_rcvd [EMAIL PROTECTED] mail1.microsoft.com
whitelist_from_rcvd [EMAIL PROTECTED] smtp.microsoft.com
whitelist_from_rcvd [EMAIL PROTECTED] maila.microsoft.com

will that work?
  

It should.



A microsoft whitelist does appear in 70_sare_whitelist, though it does
trust all microsoft hosts rather than just the three listed above...

You might consider adding that ruleset.
  


Can't do that. Matter of principle: I'm tired of tacitly admitting that
they're the 800lb gorilla and they get to do whatever they please.

When '95 came out, I was willing to cut them some slack since this
whole Internetworking thing was new to them. That was 10 years
ago. Why they're still struggling to comply with standards I don't
know. It's not for lack of engineers.

-Philip



Microsoft blacklisted?

2006-11-13 Thread Philip Prindeville
I recently saw an email get bounced that was legitimately coming
from Microsoft:

Nov 13 14:59:26 mail mimedefang.pl[19053]: helo: maila.microsoft.com (131.107.115.212) said helo smtp.microsoft.com
Nov 13 14:59:26 mail sendmail[21067]: kADLxLLR021067: from=[EMAIL PROTECTED], size=1207, class=0, nrcpts=1, msgid=[EMAIL PROTECTED], bodytype=7BIT, proto=ESMTP, daemon=MTA-v4, relay=maila.microsoft.com [131.107.115.212]
Nov 13 14:59:29 mail mimedefang.pl[20521]: kADLxLLR021067: hits=6.909, req=5, names=DNS_FROM_RFC_ABUSE,DNS_FROM_RFC_POST,L_WIN_CHARSET
Nov 13 14:59:29 mail mimedefang.pl[20521]: MDLOG,kADLxLLR021067,spam,6.909,131.107.115.212,[EMAIL PROTECTED],[EMAIL PROTECTED],Out of Office: Software Development with Microsoft
Nov 13 14:59:29 mail mimedefang.pl[20521]: filter: kADLxLLR021067:  bounce=1 discard=1
Nov 13 14:59:29 mail mimedefang[5737]: kADLxLLR021067: Bouncing because filter instructed us to
Nov 13 14:59:29 mail sendmail[21067]: kADLxLLR021067: Milter: data, reject=554 5.7.1 Message rejected; scored too high on the Spam test.
Nov 13 14:59:29 mail sendmail[21067]: kADLxLLR021067: to=[EMAIL PROTECTED], delay=00:00:03, pri=31207, stat=Message rejected; scored too high on the Spam test.

I've put into my spamassassin/sa-mimedefang.cf file:

whitelist_from_rcvd [EMAIL PROTECTED] smtp.microsoft.com


What am I missing at this point?

Does the 2nd arg to the whitelist_from_rcvd need to be
maila.microsoft.com instead?

And what do DNS_FROM_RFC_ABUSE and DNS_FROM_RFC_POST correspond to?
Where do I get the descriptions of these tests, why some sites get
tagged with them, etc?

-Philip






Re: Microsoft blacklisted?

2006-11-13 Thread Philip Prindeville
Matt Kettler wrote:

Philip Prindeville wrote:
  

I recently saw an email get bounced that was legitimately coming
from Microsoft:

[snip]

I've put into my spamassassin/sa-mimedefang.cf file:

whitelist_from_rcvd [EMAIL PROTECTED] smtp.microsoft.com


What am I missing at this point?

Does the 2nd arg to the whitelist_from_rcvd need to be
maila.microsoft.com instead?

And what do DNS_FROM_RFC_ABUSE and DNS_FROM_RFC_POST correspond to?
  


postmaster and abuse lists at rfc-ignorant.org. Both are wildly prone to
false positives and have been removed from the 3.2 devel branch. They
effectively list sites that violate the RFCs for mail hosts and refuse
mail sent to postmaster or abuse.

That said, neither scores very high.. Assuming set3 (bayes and network)
the combined score in SA 3.1.x is only 1.908 points..

What's L_WIN_CHARSET.. that's not a stock rule I'm aware of. Looks like
an add-on to me, and probably the real culprit here. I found some
references to it from list conversations, and looks like it's trying to
match email with a windows-specific character set (windows-1252). But
it's not in any ruleset I can find anywhere.
  

Actually, it looks like a rule you yourself were developing back in
April.. What did you set the score to?
http://www.gossamer-threads.com/lists/spamassassin/users/72328

  



Yes, it's local.

I set it to 4.85.  Or maybe 4.99.

But why isn't the whitelisting kicking in?

Could it be because:

# nslookup 131.107.115.212
Server:         205.171.3.65
Address:        205.171.3.65#53

Non-authoritative answer:
212.115.107.131.in-addr.arpa    name = maila.microsoft.com.
212.115.107.131.in-addr.arpa    name = smtp.microsoft.com.
212.115.107.131.in-addr.arpa    name = mail1.microsoft.com.

Authoritative answers can be found from:
107.131.in-addr.arpa    nameserver = ns5.msft.net.
107.131.in-addr.arpa    nameserver = ns1.msft.net.
107.131.in-addr.arpa    nameserver = ns2.msft.net.
107.131.in-addr.arpa    nameserver = ns3.msft.net.
107.131.in-addr.arpa    nameserver = ns4.msft.net.
ns1.msft.net    internet address = 207.68.160.190
ns2.msft.net    internet address = 65.54.240.126
ns3.msft.net    internet address = 213.199.144.151
ns4.msft.net    internet address = 207.46.66.126
ns5.msft.net    internet address = 65.55.238.126


# 

(how hard can it be to follow $%^* RFC directions saying
only one PTR record per address)

What's the fix here?  Set the 2nd argument to the IP
address instead?  The man page doesn't suggest you can do that.

And I don't want to wildcard it as microsoft.com -- that's
way too many potential hosts.

-Philip




  

Where do I get the descriptions of these tests, why some sites get
tagged with them, etc?



  




Can't upgrade w/ RPM

2006-11-02 Thread Philip Prindeville
Hi.

I'm running FC3 on an AMD64 platform for my mail server,
and I had last installed SpamAssassin 3.1.5.  Well, I grabbed the
tarball for 3.1.7, and did a rpmbuild -tb ... of the tarball.

Worked fine.

Then I tried to upgrade via RPM:

# rpm -v -U /home/src/redhat/RPMS/x86_64/perl-Mail-SpamAssassin-3.1.7-1.x86_64.rpm
error: Failed dependencies:
        perl-Mail-SpamAssassin = 3.1.5-1 is needed by (installed) spamassassin-3.1.5-1.x86_64


any ideas why this is happening and what the fix is?

-Philip




Re: Can't upgrade w/ RPM

2006-11-02 Thread Philip Prindeville
Jim Maul wrote:

Philip Prindeville wrote:
  

Hi.

I'm running FC3 on an AMD64 platform for my mail server,
and I had last installed SpamAssassin 3.1.5.  Well, I grabbed the
tarball for 3.1.7, and did a rpmbuild -tb ... of the tarball.

Worked fine.

Then I tried to upgrade via RPM:

# rpm -v -U 
/home/src/redhat/RPMS/x86_64/perl-Mail-SpamAssassin-3.1.7-1.x86_64.rpm
error: Failed dependencies:
perl-Mail-SpamAssassin = 3.1.5-1 is needed by (installed) 
 spamassassin-3.1.5-1.x86_64


any ideas why this is happening and what the fix is?

-Philip
 



You can't just upgrade one of the RPMs; you need to do them all at once.

spamassassin-3.1.5-1.x86_64 is using
perl-Mail-SpamAssassin-3.1.5-1.x86_64.rpm, so you can't upgrade one
without the other.

-Jim
  


You're right.  Sorry, I spaced.  I figured that the RPM container
actually contained several modules, like zaptel does (it also contains
zaptel-devices, zaptel-libs, etc).

Is there any reason to not have a single container contain multiple
packages?  Since they do both need to be installed simultaneously?

-Philip



Re: Image spams getting thru

2006-10-30 Thread Philip Prindeville
Logan Shaw wrote:

[snip]
And there's also an easy way around it.  Simply add noise to
the image.  There are a number of techniques, but an obvious
one to use with GIF is to assign two palette entries to
two nearly (but not quite) identical colors.  For example,
put 0xffffff and 0xfffeff in your palette.  Then, for every
white pixel in the original image, choose at random whether it
gets represented by a 0xffffff or 0xfffeff pixel.  There will
be virtually no discernible difference to the eye, but the
files will be completely different, especially since GIF uses
LZW compression on the pixel data.

There are similar methods for other formats:  with JPEG, you
can just change the quality settings, causing the JPEG decoder
itself to add noise to your image.  (And perfectly legit noise,
too, since the quality parameters vary on legit images.)

And of course you can just add noise to the least significant
bit in any generic format as well.
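
The palette trick is easy to demonstrate on already-decoded pixel indices. The Python sketch below is illustrative only: the function names, the palette index assignment (0 for white, 1 for the near-white twin) and the use of SHA-256 as a stand-in for a naive fingerprint are all assumptions, and no GIF encoding or LZW compression is performed.

```python
import hashlib
import random


def add_palette_noise(pixels, white=0, near_white=1, seed=None):
    """Replace each white pixel index by a random choice between two
    palette entries that look the same to the eye."""
    rng = random.Random(seed)
    return [rng.choice((white, near_white)) if p == white else p for p in pixels]


def fingerprint(pixels):
    """Naive fingerprint: a hash over the raw palette indices (0-255)."""
    return hashlib.sha256(bytes(pixels)).hexdigest()


if __name__ == "__main__":
    original = [0] * 64 + [2] * 8          # mostly white, a little "text"
    noisy = add_palette_noise(original, seed=1)
    print(fingerprint(original) == fingerprint(noisy))
```

Any checksum taken over the pixel data (or over the compressed file) changes from copy to copy, even though the rendered image looks identical.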

   - Logan
  


If I could revisit this issue and be less sinister in doing so, I'm
trying to look at ways to generate a fingerprint from GIF stock
spams that could be used to filter them.

I'll need to reduce a large number of spam (no, I don't need any
extra, so don't bother forwarding them ;-)... and then do a stochastic
analysis of those parameters.

In the meantime, a couple of questions and observations...

First, CPAN seems to come up short on modules to parse and
decompose (and render!) GIF or PNG file formats. Most
disappointing. I finally decided on the now stagnant and
unsupported Image::Info module (sigh), but it doesn't
decompress that data once it deconstructs the GIF data stream
into its component parts.

I tried to use Compress::LZW to decompress the stream, but
that only seems to support 12- or 16-bit code sizes,
whereas GIF streams routinely start at 4, 6, or 8 bits.

Does anyone have a handle on what Perl modules to use for
dissecting GIF objects?
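
Pending a Perl answer, the fixed part of the format is simple enough to pick apart by hand. The Python sketch below reads only the GIF signature, the logical screen descriptor and the global color table, following the published GIF87a/GIF89a layout; it does not touch the image descriptors or the LZW-compressed data, and the function name is made up for the example.

```python
import struct


def parse_gif_header(data: bytes) -> dict:
    """Return width, height, background index and global palette of a GIF."""
    if len(data) < 13 or data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF stream")
    # Logical screen descriptor: width, height (little-endian 16-bit),
    # packed flags, background color index, pixel aspect ratio.
    width, height, packed, bg_index, _aspect = struct.unpack("<HHBBB", data[6:13])
    palette = []
    if packed & 0x80:                          # global color table present
        entries = 2 ** ((packed & 0x07) + 1)   # table size is 2^(N+1)
        table = data[13:13 + 3 * entries]
        if len(table) < 3 * entries:
            raise ValueError("truncated global color table")
        palette = [tuple(table[i:i + 3]) for i in range(0, 3 * entries, 3)]
    return {
        "width": width,
        "height": height,
        "background_index": bg_index,
        "palette": palette,
    }


if __name__ == "__main__":
    sample = b"GIF89a\x02\x00\x01\x00\x80\x00\x00" + b"\x00\x00\x00\xff\xff\xff"
    print(parse_gif_header(sample))
```

The palette alone already says a lot about a two-color stock spam; the per-pixel counts still need the LZW stage.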

Thanks,

-Philip



Re: How to whitelist_from ?

2006-10-19 Thread Philip Prindeville
Matt Kettler wrote:

Philip Prindeville wrote:
  

There's no way to whitelist just the empty address then?  Rather than
everything?

-Philip

  


Not given the simple file-glob format of the whitelist commands. You'd
need a regular expression and negation.

You could do it with a rule...

header __NULL_RETURN   From !~ /./i
header __RCVD_MYHOST   Received =~ /insert Received header regex matching your servers exchanging../
meta MY_NULL_RETURN    (__NULL_RETURN && __RCVD_MYHOST)


  


It's not the From, but rather the EnvelopeFrom.

--- Mail/SpamAssassin/Conf/Parser.pm.bak        2006-08-29 09:16:46.0 -0600
+++ Mail/SpamAssassin/Conf/Parser.pm    2006-10-19 20:44:18.0 -0600
@@ -631,6 +631,10 @@
   unless (defined $value && $value !~ /^$/) {
     return $Mail::SpamAssassin::Conf::MISSING_REQUIRED_VALUE;
   }
+  # email from postmaster, abuse autoresponders, etc.
+  if ($value eq '') {
+    return $conf->{parser}->add_to_addrlist ($key, '');
+  }
   $conf->{parser}->add_to_addrlist ($key, split (' ', $value));
 }


I tried the above fix, but it didn't work.

Not sure why...

-Philip





Re: How to whitelist_from ?

2006-10-19 Thread Philip Prindeville
Matt Kettler wrote:

Philip Prindeville wrote:
  

Matt Kettler wrote:

  


Philip Prindeville wrote:
 


  

There's no way to whitelist just the empty address then?  Rather than
everything?

-Philip

 
   

  


Not given the simple file-glob format of the whitelist commands. You'd
need a regular expression and negation.

You could do it with a rule...

header __NULL_RETURN   From !~ /./i
header __RCVD_MYHOST   Received =~ /insert Received header regex matching your servers exchanging../
meta MY_NULL_RETURN    (__NULL_RETURN && __RCVD_MYHOST)


 


  

It's not the From, but rather the EnvelopeFrom.
  


A rule matching header From should match any from like header,
including Return-Path.
  


Not sure I follow.

The From: header will be [EMAIL PROTECTED] or [EMAIL PROTECTED]
or something similar (depending on the agent).

The Sender (EnvelopeFrom) will be empty, however.  I believe that MdF
sticks that into the Return-Path: header.

-Philip

Unless you're calling SA before the return-path header is created, in
which case you can't match it with SA at all.
  

  



  




Re: How to whitelist_from ?

2006-08-25 Thread Philip Prindeville
Matt Kettler wrote:

Philip Prindeville wrote:
  

There's no way to whitelist just the empty address then?  Rather than
everything?

-Philip

  


Not given the simple file-glob format of the whitelist commands. You'd
need a regular expression and negation.

You could do it with a rule...

header __NULL_RETURN   From !~ /./i
header __RCVD_MYHOST   Received =~ /insert Received header regex matching your servers exchanging../
meta MY_NULL_RETURN    (__NULL_RETURN && __RCVD_MYHOST)
  


How about modifying the source to accept some sort of notation for an
empty address in whitelist_from_rcvd?

-Philip



Re: How to whitelist_from ?

2006-08-24 Thread Philip Prindeville
Matt Kettler wrote:

Philip Prindeville wrote:
  


  

Well, yes, especially since the IP address of the sender is reserved for
a machine that does ticketing and auto-replies exclusively (I was going
to use whitelist_from_rcvd and not just whitelist_from).



At that point, you should be able to use:

 whitelist_from_rcvd * rdns.host.name

Which will effectively white-list the host.
  


There's no way to whitelist just the empty address then?  Rather than
everything?

-Philip



How to whitelist_from ?

2006-08-23 Thread Philip Prindeville
Hmm...  Maybe if I post with a more obvious subject line...

What is the notation for writing a whitelist_from or whitelist_from_rcvd
when the sender is <> ?  (As in MAIL FROM:<>)

Thanks,

-Philip


Philip Prindeville wrote:

Well, I have the following issue.  When I report abuse to [EMAIL PROTECTED],
they send me back an auto-generated email ticket with a broken Date: on
it (honestly, people, how hard is it to correctly format the date???).

They do this with <> as the sending address.

How does one go about writing a whitelist_from_rcvd line for the empty
address?

Aug 22 07:49:28 mail mimedefang.pl[458]: helo: dns-mx.noc.verio.net 
(129.250.49.11) said helo dns-mx.noc.verio.net
Aug 22 07:49:28 mail mimedefang.pl[458]: helo: whitelist dns-mx.noc.verio.net 
(129.250.49.11)
Aug 22 07:49:33 mail sendmail[472]: k7MDnN3u000472: from=, size=2062, 
class=0, nrcpts=1, msgid=[EMAIL PROTECTED], proto=ESMTP, daemon=MTA-v4, 
relay=dns-mx.noc.verio.net [129.250.49.11]
Aug 22 07:49:34 mail mimedefang.pl[458]: k7MDnN3u000472: hits=5.164, req=5, 
names=AWL,INVALID_DATE,NO_REAL_NAME
Aug 22 07:49:34 mail mimedefang.pl[458]: 
MDLOG,k7MDnN3u000472,spam,5.164,129.250.49.11,,[EMAIL PROTECTED],Re: 
[NTT-C2755649Z] Phishing from 161.58.27.23
Aug 22 07:49:34 mail mimedefang.pl[458]: filter: k7MDnN3u000472:  bounce=1 
discard=1
Aug 22 07:49:34 mail mimedefang[4220]: k7MDnN3u000472: Bouncing because filter 
instructed us to
Aug 22 07:49:34 mail sendmail[472]: k7MDnN3u000472: Milter: data, reject=554 
5.7.1 Message rejected; scored too high on the Spam test.
Aug 22 07:49:34 mail sendmail[472]: k7MDnN3u000472: to=[EMAIL PROTECTED], 
delay=00:00:05, pri=32062, stat=Message rejected; scored too high on the Spam 
test.


  




Re: How to whitelist_from ?

2006-08-23 Thread Philip Prindeville
John D. Hardin wrote:

On Wed, 23 Aug 2006, Philip Prindeville wrote:

  

Hmm...  Maybe if I post with a more obvious subject line...

What is the notation for writing a whitelist_from or
whitelist_from_rcvd when the sender is <> ?  (As in MAIL FROM:<>)



Are you sure you want to use that broad a brush? There is a *lot* of
garbage that is sent as faked mailer daemon bounces.
  


Well, yes, especially since the IP address of the sender is reserved for
a machine that does ticketing and auto-replies exclusively (I was going
to use whitelist_from_rcvd and not just whitelist_from).

When dealing with a known correspondent's brokenness, it's safer to
focus your permissiveness rather tightly. Try a meta rule that matches
a Received: line on a bounce from them, add a rule that ANDs that meta
with the rule that fires on their malformed date, and score it to
cancel out the malformed date score.
  


I'm not ready to work that hard...

I'd rather catch the broken email, point it out to them, have them fix it,
and then remove the whitelisting when they've fixed their agent.

-Philip




Broken abuse auto-responders

2006-08-22 Thread Philip Prindeville
Well, I have the following issue.  When I report abuse to [EMAIL PROTECTED],
they send me back an auto-generated email ticket with a broken Date: on
it (honestly, people, how hard is it to correctly format the date???).

They do this with <> as the sending address.

How does one go about writing a whitelist_from_rcvd line for the empty
address?

Aug 22 07:49:28 mail mimedefang.pl[458]: helo: dns-mx.noc.verio.net 
(129.250.49.11) said helo dns-mx.noc.verio.net
Aug 22 07:49:28 mail mimedefang.pl[458]: helo: whitelist dns-mx.noc.verio.net 
(129.250.49.11)
Aug 22 07:49:33 mail sendmail[472]: k7MDnN3u000472: from=, size=2062, 
class=0, nrcpts=1, msgid=[EMAIL PROTECTED], proto=ESMTP, daemon=MTA-v4, 
relay=dns-mx.noc.verio.net [129.250.49.11]
Aug 22 07:49:34 mail mimedefang.pl[458]: k7MDnN3u000472: hits=5.164, req=5, 
names=AWL,INVALID_DATE,NO_REAL_NAME
Aug 22 07:49:34 mail mimedefang.pl[458]: 
MDLOG,k7MDnN3u000472,spam,5.164,129.250.49.11,,[EMAIL PROTECTED],Re: 
[NTT-C2755649Z] Phishing from 161.58.27.23
Aug 22 07:49:34 mail mimedefang.pl[458]: filter: k7MDnN3u000472:  bounce=1 
discard=1
Aug 22 07:49:34 mail mimedefang[4220]: k7MDnN3u000472: Bouncing because filter 
instructed us to
Aug 22 07:49:34 mail sendmail[472]: k7MDnN3u000472: Milter: data, reject=554 
5.7.1 Message rejected; scored too high on the Spam test.
Aug 22 07:49:34 mail sendmail[472]: k7MDnN3u000472: to=[EMAIL PROTECTED], 
delay=00:00:05, pri=32062, stat=Message rejected; scored too high on the Spam 
test.




Whitelisting abuse and

2006-07-19 Thread Philip Prindeville
What are the steps to whitelist email sent from <> (i.e. Postmaster
when bouncing mail) or [EMAIL PROTECTED]?

Thanks,

-Philip



Re: Rejection text

2006-07-19 Thread Philip Prindeville
Will Nordmeyer wrote:

  

On Wed, 12 Jul 2006, Paul Dudley wrote:



If we decide to reject low grade spam messages rather than
quarantine them, is it possible to add text to the body of the
rejection message?
  

Rejecting (bouncing) spam is utterly pointless, as 99% of it will have
forged sender information. You will either be sending your notice to a
nonexistent address, in which case you get yet more useless traffic
back to your server in the form of a bounce of your bounce, or your
notice will go to some innocent third party, possibly contributing to
an effective DDoS against their email account.

--


I thought this was about having the MTA say "555 we don't want that
spam" at the end of the data phase.
Whether it can be done at all, and whether the message can be changed,
depends on the MTA rather than SA.

Wolfgang Hamann



Since MOST (if not all, these days) SPAM comes from invalid/forged
addresses, doesn't that just bog down the email system with bounces of
your SPAM rejections, reporting that the address you were notifying is
invalid?

(I had a user who used a 3rd party program to do that.  I asked him to
stop, because whenever he did it my email got bogged down with "invalid
recipient" messages, since the address he was notifying was invalid.)
  


Thankfully there are fewer open relays each day, and hence if you
reject the message as it's being sent, then the sender is the spammer,
and he will know he is failing.

With any luck, he might even remove you from the list of addresses
that he will try to spam in the future.

-Philip



Re: On bichromatic GIF stock spam

2006-07-01 Thread Philip Prindeville
Loren Wilton wrote:

No, I was thinking of multipart/alternative where one of the
alternative streams is nothing but images. That doesn't strike me as
legitimate. Can anyone think of a scenario where images *are* a
legitimate alternative representation of text?



Doesn't really help.  The actual mails have a tiny gibberish text part, and
a tiny to medium html part that has a few words of gibberish (usually the
same as the text part) and the rest is calls to images.  So there really is
an html part.

I did a trivial test for alternative and gif, and it didn't pan out very
well.  Will need some additional conditions to make it more usable.

Loren

  


What Perl modules are there that can process (decode, perform certain
inspections and histogram analysis, etc) of GIF files?

I'd like to throw something together...

-Philip



Does SpamAssassin support SPF?

2006-07-01 Thread Philip Mak
Does SpamAssassin support SPF record checking?

Or is this something I have to patch into my incoming SMTP server?


Re: On bichromatic GIF stock spam

2006-06-25 Thread Philip Prindeville
John D. Hardin wrote:

On Sat, 24 Jun 2006, Philip Prindeville wrote:

  

the text and the images.  The spammers send multipart/alternative
because they want the text/plain section to confuse the Bayes
filters, since they know it won't be rendered...



It seems to me that right there is the spam sign you should be looking
for, then, and save all the heavy-duty mathematical analysis of the
images themselves.
  


A lot of mailers generate multipart/alternative legitimately, though if you
ask me sending both text/plain and text/html is bogus and no one should
configure their mailer to do that.

-Philip



Re: Adding Phishing Link rule

2006-06-24 Thread Philip Prindeville
What about combining this with a whitelist?

I.e. I regularly get emails from target.bifn0.com that contain links that
point to themselves, but say they are target.com.  And in fact, this is
a 3rd party that Target has contracted to do outsourced mailings for them,
so in that respect they are legitimate.  So I could easily whitelist
them and continue to reject everyone else...

The other approach would be to push for an advisory standard (RFC)
that explains how to encode URL's so that they aren't flagged as
phishing.  (No flames from pissy people please... you know who you
are... ;-)  I.e. that at a minimum the host portions of the URL and the
label for the link would have to match...

If the sender REALLY needs to have the link reside somewhere else,
they could always have the published address send a Location: response
that redirects you to the eventual resting place.
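
The "host portions must match" test could look roughly like the Python sketch below. It is a minimal illustration under stated assumptions, not an existing SpamAssassin rule: the helper names are invented, and a label that does not look like a hostname or URL is simply treated as plain text with nothing to compare.

```python
import re
from urllib.parse import urlsplit

# Something that looks like a DNS name: at least two dot-separated labels.
_HOSTNAME = re.compile(r"^[a-z0-9-]+(\.[a-z0-9-]+)+$")


def host_of(text: str):
    """Return the lowercased host in a URL or bare hostname, else None."""
    text = text.strip()
    if "://" not in text:
        text = "//" + text              # let urlsplit treat it as a netloc
    try:
        host = urlsplit(text).hostname  # already lowercased by urlsplit
    except ValueError:
        return None
    if host and _HOSTNAME.match(host):
        return host
    return None


def label_matches_target(href: str, label: str) -> bool:
    """False only when the label names a host that differs from the href's."""
    label_host = host_of(label)
    if label_host is None:
        return True                     # plain-text label, nothing to compare
    return label_host == host_of(href)


if __name__ == "__main__":
    print(label_matches_target("http://target.bifn0.com/offer", "www.target.com"))
    print(label_matches_target("http://target.com/offer", "target.com"))
```

A whitelist of known outsourcers would then be consulted only for the mismatches.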

-Philip


Loren Wilton wrote:

The rule you suggest isn't particularly good.  There are far too many legit
mails (mostly mailing list type of things) that do exactly what you want to
check for.  So the FP rate is higher than most people would like.  This has
been discussed many times in the past.

That said, I believe there is at least one SARE rule that checks for exactly
what you want to look for.

Loren

  




On bichromatic GIF stock spam

2006-06-24 Thread Philip Prindeville
I get a lot of spam that looks like:

http://pastebin.com/729105

on the alsa-devel mailing list, amongst others...  And noticed the
following.

If you decompress the GIF file and decode it into a pixmap image, then
do a color histogram of the image, you notice two things immediately.

There are two colors, black (the text), and the colored background.

Further, these spams seem to always use one of 6 common colors...

It should be trivial to write a filter that does exactly this decompression
and returns a list of what colors (maybe as a #281000 RGB encoding)
occur and what percentage of the total color map they occupy.

If, after excluding black, we find that 100% of the color map is that
nasty pastel pink or pastel lime green (etc) then it's a spam and we
toss it.

Sound reasonable?
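
A minimal sketch of the histogram step, in Python. It assumes the GIF has already been decoded into a list of (r, g, b) tuples by some imaging library; the function names and the set of "bad" colors are placeholders, not anything SpamAssassin ships.

```python
from collections import Counter

BLACK = (0, 0, 0)


def color_shares(pixels, ignore=(BLACK,)):
    """Map '#rrggbb' to its percentage of the pixels left after dropping `ignore`."""
    counts = Counter(p for p in pixels if p not in ignore)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {"#%02x%02x%02x" % rgb: 100.0 * n / total for rgb, n in counts.items()}


def is_bichromatic_spam(pixels, bad_colors):
    """True when, black excluded, a single known-bad color fills the image."""
    shares = color_shares(pixels)
    return len(shares) == 1 and next(iter(shares)) in bad_colors
```

Something like is_bichromatic_spam(pixels, {"#ffc0cb"}) would then be the whole test, with the color set filled in from whatever the spams actually use.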

There might be other tests that cover this, but they aren't used by
SourceForge unfortunately, which hosts a lot of the lists that I read...

-Philip



Re: On bichromatic GIF stock spam

2006-06-24 Thread Philip Prindeville
Michael Scheidell wrote:

-Original Message-
From: Philip Prindeville [mailto:[EMAIL PROTECTED] 
Sent: Saturday, June 24, 2006 2:10 PM
To: users@spamassassin.apache.org
Subject: On bichromatic GIF stock spam


I get a lot of spam that looks like:

http://pastebin.com/729105

on the alsa-devel mailing list, amongst others...  And 
noticed the following.

If you decompress the GIF file and decode it into a pixmap 
image, then do a color histogram of the image, you notice two 
things immediately.



Or feed it through character recognition software and then replace the
gif attachment with a plain text attachment and reinject it back into
SA.
  


Well, yeah, and that's already been discussed...  I wanted an alternative
to that that might be less CPU intensive.

-Philip



Re: On bichromatic GIF stock spam

2006-06-24 Thread Philip Prindeville
Loren Wilton wrote:

If, after excluding black, we find that 100% of the color map is that
nasty pastel pink or pastel lime green (etc) then it's a spam and we
toss it.

Sound reasonable?



I was thinking about this the other day.  I think the concept is reasonable,
but as stated doesn't go far enough, and would be trivial to bypass.

I think that someone first needs to come up with either a formula or a list
of RGB triples that are visually indistinguishable or some such.  (I
suspect this has been done several times now and the research should exist
in the wild.)

This can then be used as a fuzz to group colors that are very close down
into a common bucket.  As it is, trivial 1-bit variations on colors would
defeat the simple scheme.
  


Shh they might be listening... ;-)

Seriously, though, how many people send out 2-color GIFs (besides
BW scans of Dilbert and faxes) as email?

The formula is:

sqrt((r1 - r2)^2 + (g1 - g2)^2 + (b1 - b2)^2)

to generate the RGB vector distance between two pixels.
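
As a sketch of how that distance could serve as the "fuzz" for grouping near-identical colors: the Python below counts pixels against a list of anchor colors. The tolerance of 16 and the function names are assumptions picked for illustration, not measured values.

```python
import math


def rgb_distance(c1, c2):
    """Euclidean distance between two (r, g, b) triples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def bucket(pixels, anchors, tolerance=16.0):
    """Count pixels per anchor color; anything within `tolerance` counts as that anchor."""
    counts = {a: 0 for a in anchors}
    other = 0
    for p in pixels:
        nearest = min(anchors, key=lambda a: rgb_distance(p, a))
        if rgb_distance(p, nearest) <= tolerance:
            counts[nearest] += 1
        else:
            other += 1
    return counts, other
```

A 1-bit variation on the pastel pink then lands in the same bucket as the pink itself, which is the point of the fuzz.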


It might also be interesting to accumulate a) total area of each color and
b) largest rectangle (or other easily detected shape) of each color.  The
first case we would have from the pixel counts.  The second case could be
used to detect large areas of fill color.  This might help classify a text
message vs a map of the world or a picture of downtown Camaroon.
  


Why?  What does downtown Cameroon look like?  ;-)

It also might be interesting to accumulate statistics on the common color
distributions for 10K or so legit images sent through email, possibly along
with some sort of indication of purpose: picture of me, picture of my
dog, billboard I saw, kids at Christmas, Hallmark greeting card, etc.
  


But those aren't sent as multipart/alternative... because you want to
see both the text and the images.  The spammers send multipart/alternative
because they want the text/plain section to confuse the Bayes filters,
since they know it won't be rendered...

With that info the color distribution might be able to help classify the
image fairly cheaply.

I don't know how much of the above would be absolutely necessary, but I
suspect at least some of it is.  Still, this is a fairly trivial sort of
thing to have to accumulate.  Especially since all spam (at least currently)
uses gifs, which a blind man can decode with a hair comb - no fancy software
required.

Loren
  



Yup.  Exactly.

-Philip




Bad quoting

2006-06-08 Thread Philip Prindeville
I noticed the following message (well, I'll just put a fragment):

!DOCTYPE HTML PUBLIC -//W3C//DTD HTML 4.0 Transitional//EN
HTMLHEAD
META http-equiv=3DContent-Type content=3Dtext/html; =
charset=3Dwindows-1252
META content=3DMSHTML 6.00.2900.2670 name=3DGENERATOR
STYLE/STYLE
/HEAD
BODY bgColor=3D#ff
DIVFONT face=3DArial size=3D2IMG alt=3D hspace=3D0=20
src=3Dcid:000e01c68b04$73437a90$41e45853@qop align=3Dbaseline=20
border=3D0IMG alt=3D hspace=3D0=20
src=3Dcid:000f01c68b04$73437aaa$41e45853@qop align=3Dbaseline=20
border=3D0IMG alt=3D hspace=3D0=20
src=3Dcid:001001c68b04$73437ac4$41e45853@qop align=3Dbaseline=20
border=3D0IMG alt=3D hspace=3D0=20
src=3Dcid:001101c68b04$73437ade$41e45853@qop align=3Dbaseline=20
border=3D0IMG alt=3D hspace=3D0=20
src=3Dcid:001201c68b04$73437af8$41e45853@qop align=3Dbaseline=20
border=3D0/FONT/DIV



Note that the '=' got escaped as '=3D'; they probably entered
the text and their HTML editor escaped it, not figuring it was
raw HTML being entered directly...
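
For reference, '=3D' is also exactly what MIME quoted-printable produces for a literal '=', and a trailing '=' at the end of a line is a soft line break. A small Python sketch using the standard quopri module on one line pair from the fragment above (copied as it appears here, with the angle brackets and quotes already missing):

```python
import quopri

# Two lines from the fragment; the trailing "=" is a quoted-printable soft line break.
fragment = b"META http-equiv=3DContent-Type content=3Dtext/html; =\ncharset=3Dwindows-1252"

decoded = quopri.decodestring(fragment).decode("ascii")
print(decoded)
```

If the text comes out clean after decoding, the '=3D' was just transfer encoding rather than something the sender's editor mangled.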

-Philip




Re: how do reject email with ....

2006-06-08 Thread Philip Prindeville
Call SA from Mimedefang.  And see the sample config I put up:

http://www.mimedefang.org/kwiki/index.cgi?PhilipsWorkingFilter

See the last test in filter_relay().

Note that there are two blocks that need to be downloaded and
put into the mimedefang-filter file.  I broke them up to be able to
document them.

-Philip


Screaming Eagle wrote:

I'm getting this type of spam:

  Return-Path: [EMAIL PROTECTED]
 X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on
 X-Spam-Virus: No
 X-Spam-Status: No, score=1.4 required=8.0 tests=BAYES_50,HTML_30_40,
 HTML_MESSAGE autolearn=no version=3.1.0
 X-Spam-Level: *
 Received: from 1802EC8 ([59.95.26.84]) by .
 (8.11.6/8.11.6) with SMTP id k58CtsN23285; Thu, 8 Jun 2006
08:55:55 -0400
 Received: from echoes (unknown [59.95.26.84]) by WXMVW (LBYSys) with ESMTP

The IP 59.95.26.84 is not resolvable. How can I refuse email
from sources which do not have a proper reverse lookup or name
lookup?

Thanks.
  




Blacklist of phone numbers?

2006-06-03 Thread Philip Mak
Is there a blacklist of phone numbers?

A lot of diploma spam I get has totally different message bodies,
except they list the same phone number to call.


Clarifying internal_networks

2006-05-31 Thread Philip Prindeville
I was rereading the sections on trusted_networks and internal_networks
in Mail::SpamAssassin::Conf, but something wasn't clear to me.

It talks about MXes and relays, but...  not about client workstations
that might originate email locally and submit it via port 25 or port 465
(and not the typical usage of submitting messages via a pipe into an
exec'd sendmail process on the same machine, etc).

If I have a network 192.168.1.0/24, and I have workstations at 10-25 that
submit email, should I just have:

internal_networks 192.168.1.0/24
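
Just to pin down what that CIDR covers, here is a quick Python check with the standard ipaddress module. It only shows which addresses fall inside the /24, not whether listing workstations under internal_networks is the right policy; the helper name is made up.

```python
import ipaddress

INTERNAL = ipaddress.ip_network("192.168.1.0/24")

# The workstations at .10 through .25 mentioned above.
workstations = [ipaddress.ip_address("192.168.1.%d" % n) for n in range(10, 26)]


def is_internal(addr, networks=(INTERNAL,)):
    """True when `addr` (a string) falls inside any of the given networks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in networks)
```

Every workstation address is inside the /24, so a single line would cover them as far as address matching goes.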

Thanks,

-Philip



Lots of this kind of spam getting through

2006-05-27 Thread Philip Mak
I'm getting about 50+ per day of these spams not being caught by
SpamAssassin (SpamAssassin version 3.1.1 running on Perl version
5.8.4). There's two types:

1. Lose weight type spam, uses bad English e.g. "yrs" instead of
"years", "u" instead of "you", "ur" instead of "your", talks about not
having talked to the recipient in years

http://www.aaanime.net/pmak/spam/2006-05-27/1.txt
http://www.aaanime.net/pmak/spam/2006-05-27/2.txt
http://www.aaanime.net/pmak/spam/2006-05-27/3.txt

These spams all have different URLs, but if you visit them they're
exactly the same site. The first two resolve to the same IP address
too, though the third doesn't despite having the same content.

2. Homeowner credit, or something

http://www.aaanime.net/pmak/spam/2006-05-27/a.txt
http://www.aaanime.net/pmak/spam/2006-05-27/b.txt

These spams keep slipping through SpamAssassin consistently. Most of
my false negatives are variants of the messages I posted above. Any
suggestions on how to block them?

P.S. Looks like this mailing list's spam filter can block them! The
first time I tried to send this message, I had the spams included in
the body of my message and they got blocked.

users@spamassassin.apache.org:
140.211.166.49 failed after I sent the message.
Remote host said: 552 spam score (19.3) exceeded threshold


Re: Crosspost: [Mimedefang] Using per-list SA policies

2006-05-26 Thread Philip Prindeville

Loren Wilton wrote:

Well, I didn't get any responses on the MDF mailing list,
so I was wondering if SA was the better angle to be coming
at this with.



I think we can help you, but it will depend on exactly what you want to do.
SA normally is used to filter mail for recipients.  You seem to be
talking about mail FROM a list, but then seem to be talking about filtering
mail TO the list.  So I'm a little confused on just what you want to do.
  


No, it's to a list.  At the list exploder, we want to be able to apply
certain per-list policies.  For instance, for most lists (but not all),
the following would be applicable:

ok_languages en

score SUBJ_FARAWAY 6.0
score SUBJ_ILLEGAL_CHARS 6.0
score UNWANTED_LANGUAGE_BODY 6.0

etc.

For some lists, we might have:

score MIME_HTML_ONLY 6.0

and for everyone, we would want:

score INVALID_DATE 6.0
score DATE_IN_FUTURE_96_XX 6.0
score DATE_IN_FUTURE_48_96 6.0
score DATE_IN_PAST_96_XX 6.0
score DATE_IN_PAST_48_96 6.0

as examples.
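
To make the idea concrete, here is a sketch of the lookup I have in mind, in Python: a site-wide table plus per-list overrides, rendered as "score" lines. The list names and the grouping are hypothetical, and this says nothing about how SA or MIMEDefang would actually load the result.

```python
# Scores that apply to every list.
SITE_WIDE = {
    "DATE_IN_FUTURE_96_XX": 6.0,
    "DATE_IN_FUTURE_48_96": 6.0,
    "DATE_IN_PAST_96_XX": 6.0,
    "DATE_IN_PAST_48_96": 6.0,
}

# Scores for English-only lists.
ENGLISH_ONLY = {
    "SUBJ_FARAWAY": 6.0,
    "SUBJ_ILLEGAL_CHARS": 6.0,
    "UNWANTED_LANGUAGE_BODY": 6.0,
}

# Hypothetical per-list policies, keyed by the local part of the list address.
PER_LIST = {
    "alsa-devel": {**ENGLISH_ONLY, "MIME_HTML_ONLY": 6.0},
    "mplayer-translations": {},
}


def scores_for(list_name):
    """Site-wide scores overlaid with whatever the list adds or overrides."""
    merged = dict(SITE_WIDE)
    merged.update(PER_LIST.get(list_name, {}))
    return merged


def render(scores):
    """Format a score table as SpamAssassin 'score' lines."""
    return ["score %s %.1f" % (name, value) for name, value in sorted(scores.items())]
```

The filter would then pick the table by the single local recipient and hand the rendered lines to whatever per-message configuration hook is available.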



I was wondering... Since MdF can be used to invoke SA, and it can
extract information from the headers such as the envelope recipient
information...  I was wondering about a lot of the ML's on
lists.sourceforge.net.
They get a lot of spam.  Especially open forums like alsa-devel that you
don't have to be subscribed to in order to post to.



Of course, this is something that the owner of the ML should fix in their
configuration.  That sounds like a free spammer tool to me, configured as it
is.
  


Well, it's a common thing.  Many devel lists require open postings so
that users can tell developers about bugs without having to join the list.



So I was wondering if MdF could be used to have a clever hack
where one could see if the message was going to a single recipient
(in this case, the local recipient would be a list name) and try to have
SA apply additional rules for that list.



Sure.  Write a rule that checks for a specific To or envelope sender or
List-Id:

header __ML_LIST_1 List-Id =~ /ALSA Devel/
  


Unfortunately, the List-Id isn't yet present at this point.  Sendmail
receives the message from the original poster, invokes SA and/or
MDF, and then when everything checks out, uses mailman as
the delivery agent (who then inserts the List-Id).

You might have multiple recipients, but only one of them should
be local, unless the message is being cross-posted to several lists
and more than one of them are on this host.



I.e. you might have a site-wide policy, that says you can't post if:



  

INVALID_DATE
DATE_IN_PAST_96_XX
DATE_IN_FUTURE_96_XX



meta BOGUS_SENDER __ML_LIST_1 && INVALID_DATE && (__DATE_IN_PAST || __DATE_IN_FUTURE)
score BOGUS_SENDER 10

(You will have to build the metas for __DATE_IN_PAST, etc)

  

are fired...  And you might have a specific set of rules for a list like
alsa-devel (the 'L' in ALSA is for Linux, so it might be reasonable
to assume that no one will be posting with charset='windows-1252'...
it's also an English language list, so having 'ok_languages en' would
be reasonable as well).



This would be a pretty bad idea.  I develop Linux stuff and on linux, but my
mail system is either OE or Outlook on Windows boxen.  I can't be the only
one.

Loren

  


Well, I don't know.

The RFC's are pretty clear that western European languages are encoded
as US-ASCII, then ISO-8859-1, then UTF-8, in that order, no exceptions.

Any UA breaking this (even, or perhaps especially if it's MS, since they're
big enough to have adequate personnel and resources to know better) should
be spanked.

Otherwise, it won't get fixed.

As I remember, setting the default codepage in Windows to be ISO-8859-1
system-wide isn't that hard.

-Philip




Re: Crosspost: [Mimedefang] Using per-list SA policies

2006-05-26 Thread Philip Prindeville

Kai Schaetzl wrote:

Philip Prindeville wrote on Fri, 26 May 2006 11:26:33 -0600:

  
No, it's to a list.  At the list exploder, we want to be able to apply 
certain per-list policies.  For instance, for most lists (but not all), 
the following would be applicable: 



I don't use MimeDefang, but I guess it should be possible to tell MD to use a 
specific configuration file when calling SA (depends on how it calls SA).


  
Well, it's a common thing.  Many devel lists require open postings so 
that users can tell developers about bugs without having to join the list.



Have two addresses, one for subscribers, one for non-subscribers?

Kai

  


Except that developers aren't vetted in any particular way.  The spammer
with time on his hands could subscribe himself...  plus there's always the
chance of a legitimate user's machine becoming infected with an email
worm...

-Philip




Re: Crosspost: [Mimedefang] Using per-list SA policies

2006-05-26 Thread Philip Prindeville

jdow wrote:

From: Philip Prindeville [EMAIL PROTECTED]

are fired...  And you might have a specific set of rules for a list like
alsa-devel (the 'L' in ALSA is for Linux, so it might be reasonable
to assume that no one will be posting with charset='windows-1252'...
it's also an English language list, so having 'ok_languages en' would
be reasonable as well).



This would be a pretty bad idea.  I develop Linux stuff and on linux, but my
mail system is either OE or Outlook on Windows boxen.  I can't be the only
one.

Loren

  


Well, I don't know.

The RFC's are pretty clear that western European languages are encoded
as US-ASCII, then ISO-8859-1, then UTF-8, in that order, no exceptions.

Any UA breaking this (even, or perhaps especially if it's MS, since they're
big enough to have adequate personnel and resources to know better) should
be spanked.

Otherwise, it won't get fixed.

As I remember, setting the default codepage in Windows to be ISO-8859-1
system-wide isn't that hard.

-Philip


Loren has a good point, Phillip. I happen to use Outlook Express because
I make my income, you know - what you use to eat and keep a roof over
your head, off software developed for Windows that for one reason or
another cannot be done on Linux. I telecommute. I also, if you dig deep
enough, have a Linux Kernel contribution to the 2.6 kernel tree.

It happens that I am an anti-HTML and anti-base64 bigot so I don't have
that charset issue to deal with. But if I did switch over to base 64
with character level formatting (for any reason other than posting an
accurate but unreadable white on white response to an HTML posting) I
might face that charset issue.

Please note something particularly important here, Philip. I have an
issue with RFC bigots. RFCs are *NOT* standards. They are Requests For
Comments, nothing more. When they become STANDARDS it is FAR more
critical to deal with them correctly. (If you think this particular
issue is all that critical then get in there and help make the RFC
into a standard.)


Actually, they are standards.  Some are mandatory, some are elective,
some are experimental:

Network Working Group                                    P. Prindeville
Request for Comments:  1051                           McGill University
                                                             March 1988


           A Standard for the Transmission of IP Datagrams
                 and ARP Packets over ARCNET Networks


Status of this Memo

  This RFC specifies a standard protocol for the Internet community.
  Distribution of this memo is unlimited.



You're confusing RFC's with IDEA's, which are IETF Draft standards.  We
talked about changing the names of RFC's 15 years ago to something else
just because the name is misleading...  But tradition trumped rationale.

See section 2 (and sections 4.1 and 4.2 as well) of RFC-1140 if you want to
twist your head around the history:

2.  The Request for Comments Documents

  The documents called Request for Comments (or RFCs) are the working
  notes of the Network Working Group, that is the Internet research
  and development community.  A document in this series may be on
  essentially any topic related to computer communication, and may be
  anything from a meeting report to the specification of a standard.

  Notice:

 All standards are published as RFCs, but not all RFCs specify
 standards.

  Anyone can submit a document for publication as an RFC.  Submissions
  must be made via electronic mail to the RFC Editor (see the contact
  information at the end of this memo).

  While RFCs are not refereed publications, they do receive technical
  review from the task forces, individual technical experts, or the RFC
  Editor, as appropriate.

  The RFC series comprises a wide range of documents such as
  informational documents of general interests to specifications of
  standard Internet protocols.  In cases where submission is intended
  to document a proposed standard, draft standard, or standard
  protocol, the RFC Editor will publish the document only with the
  approval of both the IESG and the IAB.  For documents describing
  experimental work, the RFC Editor will typically request review
  comments from the relevant IETF working group or IRTF research group
  and provide those comments to the author prior to committing to
  publication.  See Section 5.1 for more detail.

  Once a document is assigned an RFC number and published, that RFC is
  never revised or re-issued with the same number.  There is never a
  question of having the most recent version of a particular RFC.
  However, a protocol (such as File Transfer Protocol (FTP)) may be
  improved and re-documented many times in several different RFCs.  It
  is important to verify that you have the most recent RFC on a
  particular protocol.  This IAB Official Protocol Standards memo is
  the reference for determining the correct RFC to refer to for the
  current specification of each protocol.



Re: Crosspost: [Mimedefang] Using per-list SA policies

2006-05-26 Thread Philip Prindeville

Kai Schaetzl wrote:

Philip Prindeville wrote on Fri, 26 May 2006 13:32:10 -0600:

  
Except that developers aren't vetted in any particular way. 



vetted?
  


You can sign yourself up for most lists, if you have a valid address
and a web browser.

You have to try to get yourself blacklisted (manually) otherwise.

 The spammer 
  
with time on his hands could subscribe himself...  plus there's always the 
chance of a legitimate user's machine becoming infected with an email 
worm...



yes and yes. But how do you want to know this? My impression was that you 
want to score non-subscriber mail to the list differently than subscriber 
mail. If that wasn't the case I misunderstood you.
  


No, same scoring for subscribers/non-subscribers (not all lists on
SourceForge are open lists).  Just that each list has its own personal
scoring template and rulesets...  I.e. on a list for a software product
that runs cross-platform (such as Thunderbird, if Thunderbird were
hosted on SourceForge--it's not) then you might want to allow
charset=windows-1252 or MIME_HTML_ONLY.  On a list that relates to...
ummm... adding translations to Mplayer documentation and front-ends,
you wouldn't want to have ok_languages en, for example, or
SUBJ_FARAWAY scoring.

-Philip




Kai

  




Crosspost: [Mimedefang] Using per-list SA policies

2006-05-25 Thread Philip Prindeville

Well, I didn't get any responses on the MDF mailing list,
so I was wondering if SA was the better angle to be coming
at this with.

Thanks,

-Philip

---BeginMessage---

I was wondering... Since MdF can be used to invoke SA, and it can
extract information from the headers such as the envelope recipient
information...  I was wondering about a lot of the ML's on
lists.sourceforge.net.


They get a lot of spam.  Especially open forums like alsa-devel that you
don't have to be subscribed to in order to post to.

So I was wondering if MdF could be used to have a clever hack
where one could see if the message was going to a single recipient
(in this case, the local recipient would be a list name) and try to have
SA apply additional rules for that list.

I.e. you might have a site-wide policy, that says you can't post if:

INVALID_DATE
DATE_IN_PAST_96_XX
DATE_IN_FUTURE_96_XX

are fired...  And you might have a specific set of rules for a list like
alsa-devel (the 'L' in ALSA is for Linux, so it might be reasonable
to assume that no one will be posting with charset='windows-1252'...
it's also an English language list, so having 'ok_languages en' would
be reasonable as well).

Is there a straightforward way to implement this?  Because having
a single policy work effectively across all lists on SF isn't going to
happen...

-Philip


---End Message---


Re: Filtering windows-1252 charset

2006-05-22 Thread Philip Prindeville

Kai Schaetzl wrote:

Philip Prindeville wrote on Thu, 18 May 2006 08:47:48 -0600:

  
How legitimate is email sent as 
windows-1252?



Very, because broken Windows clients use it.

Kai
  


Ah, the Strong Arm school of standards enforcement.  ;-)

-Philip



Re: Filtering windows-1252 charset

2006-05-18 Thread Philip Prindeville
Jonathan Armitage wrote:

I see some spam with windows-1252 or other unwanted character sets at 
the start of the subject. I reject them via an Exim ACL, so SA doesn't 
even have to scan them.
  


Which brings up the subject...  How legitimate is email sent as
windows-1252?

I see absolutely no reason to send it, since it offers no advantage over
iso-8859-1 or utf-8, and the RFC's are pretty clear about using the
smallest encoding that will fit a message, i.e. us-ascii, then iso-8859-1,
then utf-8 (in that order).
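
The "smallest encoding that fits" rule is easy to sketch. The Python below tries each charset in preference order and returns the first that can represent the text; the function name is made up, and the three-entry list is just the order given above.

```python
PREFERENCE = ("us-ascii", "iso-8859-1", "utf-8")


def smallest_charset(text, candidates=PREFERENCE):
    """Return the first charset in `candidates` that can encode `text`."""
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue
        return name
    raise ValueError("no candidate charset can represent the text")
```

Plain English text comes back as us-ascii, accented western European text as iso-8859-1, and anything outside Latin-1 falls through to utf-8.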

Further, if you're in the Unix world (or more broadly, not in the
Windows world), why would you want to use vendor-specific encodings for
no reason other than they're the broken defaults Microsoft chose to use?

-Philip



ALL_TRUSTED causing false negatives?

2006-05-10 Thread Philip Mak
I've been getting a lot of spam lately ever since I moved my mail
server to a new system. Here's one of the false negatives that slipped
through, for example:

X-Spam-Status: No, score=-2.1 required=5.0 tests=ALL_TRUSTED,BAYES_50,
    NO_REAL_NAME,RCVD_BY_IP,YOUR_INCOME autolearn=ham version=3.0.3
X-Spam-Summary:
     0.0 NO_REAL_NAME   From: does not include a real name
     0.1 RCVD_BY_IP     Received by mail server with no name
    -3.3 ALL_TRUSTED    Did not pass through any untrusted hosts
     1.1 YOUR_INCOME    BODY: Doing something with my income
     0.0 BAYES_50       BODY: Bayesian spam probability is 40 to 60%
                        [score: 0.5000]

Why does ALL_TRUSTED have a score of -3.3? Doesn't this mean that any
spammer who connects directly to my mail server has a good chance of
getting past SpamAssassin?
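
To show how much of the total comes from that one rule, here is the arithmetic from the summary above as a few lines of Python (only the numbers in the header are used, nothing else is assumed):

```python
# Per-rule scores copied from the X-Spam-Summary header.
hits = {
    "NO_REAL_NAME": 0.0,
    "RCVD_BY_IP": 0.1,
    "ALL_TRUSTED": -3.3,
    "YOUR_INCOME": 1.1,
    "BAYES_50": 0.0,
}

total = round(sum(hits.values()), 1)
without_all_trusted = round(
    sum(score for rule, score in hits.items() if rule != "ALL_TRUSTED"), 1
)
print(total, without_all_trusted)
```

The total matches the reported score=-2.1. Without ALL_TRUSTED the message would sit at 1.2, still under the 5.0 threshold, but 3.3 points closer to it.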

I did not define any trusted/internal networks when I installed
SpamAssassin.

SpamAssassin version 3.0.3
  running on Perl version 5.8.4

Linux naga.aaanime.net 2.6.8-11-amd64-k8 #1 Sun Oct 2 21:26:54 UTC 2005 x86_64 
GNU/Linux

Running Debian Sarge


Re: Non-English languages

2006-04-14 Thread Philip Prindeville
Kenneth Porter wrote:

Same here. I took a couple years of high school Spanish in California and
the classes dragged so incredibly slowly that I learned just a little
vocabulary and the most basic of grammar, and still led the class. I
usually finished my physics homework in that class while waiting for
everyone to catch up.

As a programmer I envy my professional peers who can speak Japanese and
other non-European languages. My interest in programming languages extends
to natural languages, and I find their differences fascinating.

To those of you who've successfully learned 2nd and 3rd languages as an
adult, what do you recommend for accomplishing that?



Comic books. Or bande dessinee as it's called in French.

The story lines are often simple, and the pictures give a lot of context
to what is being talked about.

-Philip



Re: xxxl spam

2006-04-14 Thread Philip Prindeville
mouss wrote:

and I've got plenty of users that speak multiple languages, not all of
which use plain-ascii.




I guess so. now I'm not sure our situation isn't worse because people
tried to find non standard solutions that are still used. I still 
remember the days when some customers were asking us to fix our 
software because it broke their accents... hopefully these times are 
gone, but I still see broken mail (much more than I should). actually, 
I also see mail that doesn't get rendered correctly on thunderbird. so 
I'll admit that the issue isn't really about accented chars...
  


This is a real sore point for me.  I worked on the MIME quoted-printable
encoding 14 years ago, and in some ways we haven't come nearly as far as
we should have (see my posts as [EMAIL PROTECTED] when I was at France
Telecom).

A lot of it has to do with idiots like Microsoft pushing competing
standards (like Windows-1252) that offer no advantage whatsoever over
the established standards (like ISO Latin-1) and serve only to increase
the exponential problem of interoperability matrices... the number of ways
each agent must be tested against other agents, etc...  thereby
guaranteeing that complete testing of all possible permutations becomes an
unattainable goal receding ever more quickly towards the horizon.

Where we could have been smart and limited ourselves to a manageable and
very finite set of permutations instead...

This is why our site has the following rule:

# don't allow windows-125x text attachments...
mimeheader __CTYPE_MH_WIN1252   Content-Type =~ /charset=\"windows-125[0-8]\"/i
meta L_WIN_CHARSET  ((__CTYPE_MH_HTML || __CTYPE_MH_TEXT_PLAIN) && __CTYPE_MH_WIN1252)
describe L_WIN_CHARSET  Content-Type is Windows-specific text
score L_WIN_CHARSET 0.1
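
If you want to try the pattern outside SA first, here is a rough Python equivalent of what the rule is meant to catch. Making the quotes around the charset optional is my assumption, and the function name is made up.

```python
import re

# Matches charset=windows-1250 through windows-1258, quoted or not.
WIN125X = re.compile(r'charset="?windows-125[0-8]"?', re.I)


def is_windows_text(content_type):
    """True for text/plain or text/html parts declared with a windows-125x charset."""
    ct = content_type.strip().lower()
    is_text = ct.startswith("text/plain") or ct.startswith("text/html")
    return is_text and bool(WIN125X.search(content_type))
```

Feeding it the Content-Type values from a sample of real mail is a cheap way to see how many legitimate senders the rule would hit before raising the score above 0.1.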


should probably do the same for non-MIME content, but it's not as much of a
problem since Outlook prefers MIME content.

If anyone wants to talk to us, they can stick with ISO Latin-1.  We
don't need no stinkin' Windows-125x...  (or -839 for that matter).

-Philip



Re: Russian Spam

2006-04-14 Thread Philip Prindeville
Are you running Mimedefang?

It might be a start.

We block email from subscriber addresses at networks that are known to be
large sources of spam.

See:

http://www.mimedefang.org/kwiki/index.cgi?PhilipsWorkingFilter

in particular, how %bad_tld's is used.

-Philip


Kristopher Austin wrote:

I have received several copies of a spam message that is in Russian (I think 
it's Russian).  I get maybe 1 or 2 a week.  I wish I could block all Russian 
messages, but we are a University and could easily have Russian students.  I 
am unable to read this message and therefore have no ideas on how to block 
this.  Can anyone help me out with suggestions?

I apologize if this has been discussed in the last week.  I haven't had time 
to catch up on list messages over the last couple of days and didn't see 
anything skimming the subjects of recent threads.

Thanks,
Kris

Message with full headers below:

Microsoft Mail Internet Headers Version 2.0
Received: from gateway3.oc.edu ([205.143.222.12]) by fsmail.oc.edu with 
Microsoft SMTPSVC(6.0.3790.211);
Thu, 13 Apr 2006 08:50:17 -0500
Received: from ip-189.net-82-216-33.toulouse.rev.numericable.fr 
([82.216.33.189])(helo=ip-189.net-82-216-33.toulouse.rev.numericable.fr)
   by gateway3.oc.edu with smtp (Exim 4.54)
   id 1FU2CH-0008JS-AY
   for [EMAIL PROTECTED]; Thu, 13 Apr 2006 08:49:43 -0500
From: Litvinova Elena [EMAIL PROTECTED]
To: Samusenko Tat'jana [EMAIL PROTECTED]
Date: Thu, 13 Apr 2006 13:50:06 +
Message-ID: [EMAIL PROTECTED]
MIME-Version: 1.0
Content-Type: text/plain;
   format=flowed;
   charset=koi8-r;
   reply-type=original
Content-Transfer-Encoding: 8bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2800.1441
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1441
X-SA-Exim-Connect-IP: 82.216.33.189
X-SA-Exim-Rcpt-To: [EMAIL PROTECTED]
X-SA-Exim-Mail-From: [EMAIL PROTECTED]
X-Spam-Checker-Version: SpamAssassin 3.1.0 (2005-09-13) on gateway3.oc.edu
X-Spam-Level: 
X-Spam-Status: No, score=0.3 required=5.0 tests=DNS_FROM_AHBL_RHSBL,RELAY_FR 
   autolearn=disabled version=3.1.0
Subject: Re[6]: =?koi8-r?B?9Nkgzc7Px88gxMzRIM3FztEg2s7B3snb2A==?= davavsheju
X-SA-Exim-Version: 4.2 (built Thu, 03 Mar 2005 10:44:12 +0100)
X-SA-Exim-Scanned: Yes (on gateway3.oc.edu)
Return-Path: [EMAIL PROTECTED]
X-OriginalArrivalTime: 13 Apr 2006 13:50:17.0572 (UTC) 
FILETIME=[32A1FA40:01C65F01]

Рад Вас снова видеть!

Вы собираетесь в США? Хотите свободно работать
с технической документацией? Расширить свой кругозор?

Центр Американского Английского
приглашает выучить английский язык!!!
Все стадии обучения - от нуля до высшего. Ассоциативно-
образная методика. Преподаватели из США.

Без больших скидок не уйдёте! :)

Наши телефоны в Москве:
105 пять-один-восемь-шесть
два-три-восемь-три-три-восемь-шесть


Не хотите получать информацию от Центра? Отправьте свой адрес нам:
[EMAIL PROTECTED]



сил. Но он не мог понять того, -- вдруг как бы вырвавшимся тонким голосом
закричал князь Андрей, -- но он не мог понять, что мы в первый раз дрались
там за русскую землю, что в войсках был такой дух, какого никогда я не
видал, что мы два дня сряду отбивали французов и что этот успех удесятерял
наши силы. Он велел отступать, и все усилия и потери пропали даром. Он не
думал об измене, он старался все сделать как можно лучше, он все обдум
от этого-то он и не годится. Он не годится теперь именно потому, что он все
обдумывает очень основательно и аккуратно, как и следует всякому немцу. Как
бы тебе сказать... Ну, у отца твоего немец-лакей, и он прекрасный лакей и
удовлетворит всем его нуждам лучше тебя, и пускай он служит; но ежели отец
при смерти болен, ты прогонишь лакея и своими непривычными, неловкими 
станешь ходить за отцом и лучше успокоишь его, чем искусный, но чужой
человек. Так и сделали с Барклаем. Пока Россия была здорова, ей мог служить

  




Haven't seen this one before... Premature padding of base64 data

2006-04-13 Thread Philip Prindeville
This appeared in my logs.  Running 3.1.1 on Linux FC3 (x86_64) with
Sendmail 8.13.1 and Mimedefang 2.56:

Apr 13 16:57:05 mail sendmail[23371]: NOQUEUE: connect from lists-outbound.sourceforge.net [66.35.250.225]
Apr 13 16:57:05 mail sendmail[23371]: k3DMv5s4023371: Milter (mimdefang): init success to negotiate
Apr 13 16:57:05 mail sendmail[23371]: k3DMv5s4023371: Milter: connect to filters
Apr 13 16:57:05 mail mimedefang.pl[22325]: helo: lists-outbound.sourceforge.net (66.35.250.225) said helo lists-outbound.sourceforge.net
Apr 13 16:57:05 mail sendmail[23371]: k3DMv5s4023371: from=[EMAIL PROTECTED], size=15309, class=-60, nrcpts=1, msgid=[EMAIL PROTECTED], proto=ESMTP, daemon=MTA-v4, relay=lists-outbound.sourceforge.net [66.35.250.225]
Apr 13 16:57:06 mail mimedefang-multiplexor[11341]: Slave 8 stderr: Premature padding of base64 data at /usr/lib/perl5/vendor_perl/5.8.5/MIME/Decoder/Base64.pm line 109.
Apr 13 16:57:07 mail mimedefang.pl[22325]: k3DMv5s4023371: hits=18.463, req=5, names=DATE_IN_PAST_96_XX,FORGED_MSGID_MSN,HTML_IMAGE_ONLY_12,HTML_MESSAGE,HTML_SHORT_LINK_IMG_1,L_ALSA_DEVEL,MIME_HTML_ONLY,MSGID_SHORT,SPF_PASS,URIBL_SBL,URIBL_WS_SURBL
Apr 13 16:57:07 mail mimedefang.pl[22325]: MDLOG,k3DMv5s4023371,spam,18.463,66.35.250.225,[EMAIL PROTECTED],[EMAIL PROTECTED],[Alsa-devel] Your mortagee approval
Apr 13 16:57:07 mail mimedefang.pl[22325]: filter: k3DMv5s4023371: bounce=1 discard=1
Apr 13 16:57:07 mail mimedefang[11357]: k3DMv5s4023371: Bouncing because filter instructed us to
Apr 13 16:57:07 mail sendmail[23371]: k3DMv5s4023371: Milter: data, reject=554 5.7.1 Message rejected; scored too high on the Spam test.


Any ideas?  Didn't see any mention of it in previous postings...

Interesting msg-id.  Hmmm.  Already a rule for that.  Good...

-Philip
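
For anyone else who finds this warning in their logs: as far as I understand the MIME::Base64 documentation, "premature padding" means a '=' padding character turned up as the first or second character of a four-character base64 group, where padding is never valid. The following is only a rough Python sketch of that check, not the Perl module's actual code; the function name and the sample strings are made up for illustration.

```python
import string

# The 64 characters that carry data in standard base64.
B64_ALPHABET = set(string.ascii_letters + string.digits + "+/")


def has_premature_padding(data: str) -> bool:
    """Return True if the first '=' sits in slot 1 or 2 of a 4-character group.

    Hypothetical helper: it only mirrors the condition described above and
    does not decode anything.
    """
    pos = 0  # position inside the current group of four (0..3)
    for ch in data:
        if ch == "=":
            # Padding is only legal in the 3rd or 4th slot of a group.
            return pos < 2
        if ch in B64_ALPHABET:
            pos = (pos + 1) % 4
        # Any other character (whitespace, line breaks) is skipped.
    return False
```

A well-formed body such as "aGVsbG8=" passes, while something like "aGVs=bG8" (padding in the middle of the data) trips the check. Spam generators that emit sloppy MIME are a plausible source of the latter.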





Re: Internal email marked as spam...

2006-04-10 Thread Philip Prindeville
Daryl C. W. O'Shea wrote:

Screaming Eagle wrote:
  

All,
Email sent with Outlook from the internal network is marked as spam:
 pts rule name              description
---- ---------------------- --------------------------------------------------
-1.8 ALL_TRUSTED            Passed through trusted hosts only via SMTP
 1.1 MIME_HTML_MOSTLY       BODY: Multipart message mostly text/html MIME
 1.0 HTML_MESSAGE           BODY: HTML included in message
 0.1 HTML_90_100            BODY: Message is 90% to 100% HTML
 3.0 BAYES_00               BODY: Bayesian spam probability is 0 to 1%
                            [score: 0.0002]
 2.8 RATWARE_OUTLOOK_NONAME Bulk email fingerprint (Outlook no name) found
 1.9 RATWARE_MS_HASH        Bulk email fingerprint (msgid ms hash) found
 1.7 MSGID_DOLLARS          Message-Id has pattern used in spam
-0.8 AWL                    AWL: From: address is in the auto white-list

I think RATWARE_OUTLOOK_NONAME, RATWARE_MS_HASH, and MSGID_DOLLARS
are skewing the score.  I have only seen this score when the mail is sent
from MS Outlook. Any idea why, and is there a workaround for this?  Thanks.




   3.0 BAYES_00   BODY: Bayesian spam probability is 0 to 1%
  [score: 0.0002]

Whoever set the score for BAYES_00 to 3.0 must have been high!

Daryl
  


That's true, but you'd still be over 5.0 even without it.

-Philip
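
To spell out the arithmetic behind that remark, here is a quick sketch that adds up the scores from the report quoted above. The 5.0 threshold is the figure mentioned in the reply; the variable names are only for illustration, and the sketch ignores the fact that the AWL adjustment itself would shift if the other scores changed.

```python
# Scores copied from the quoted SpamAssassin report.
hits = {
    "ALL_TRUSTED": -1.8,
    "MIME_HTML_MOSTLY": 1.1,
    "HTML_MESSAGE": 1.0,
    "HTML_90_100": 0.1,
    "BAYES_00": 3.0,
    "RATWARE_OUTLOOK_NONAME": 2.8,
    "RATWARE_MS_HASH": 1.9,
    "MSGID_DOLLARS": 1.7,
    "AWL": -0.8,
}

REQUIRED = 5.0  # threshold referred to in the reply above

# Round to one decimal to hide floating-point noise in the sums.
total = round(sum(hits.values()), 1)
without_bayes = round(total - hits["BAYES_00"], 1)

print(f"total={total}  without BAYES_00={without_bayes}  required={REQUIRED}")
```

The total comes to 9.0, and dropping the 3.0 for BAYES_00 still leaves 6.0, which is above 5.0. The three Outlook-related rules alone contribute 6.4 points.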



Auto-whitelist format

2006-04-06 Thread Philip Prindeville
I tried to do a makedb -u on the .spamassassin/auto-whitelist file, but
it failed with:

makedb: cannot open database file `/root/.spamassassin/auto-whitelist':
Invalid argument

Is there a handy way to manipulate this db manually (no pun intended)?

Thanks,

-Philip



Re: Filtering based on the recipients

2006-04-05 Thread Philip Prindeville
Magnus Holmgren wrote:

onsdag 05 april 2006 06:43 skrev Philip Prindeville:
  

I was looking on the FAQ and the Wiki, but couldn't find this...

How do I filter based on the recipient mailbox address?  For instance, I'm
running Linux, so if I get email sent to [EMAIL PROTECTED] or
[EMAIL PROTECTED], then I know they're bogus...

And can probably block it, even if some of the recipients are valid email
addresses.



You might want to make sure that your MTA adds headers for envelope 
recipients, for example Envelope-To:.

Then you can use 

 blacklist_to [EMAIL PROTECTED]
 blacklist_to [EMAIL PROTECTED]

but I would be careful because one *could* legitimately guess addresses, 
especially if it's difficult to find detailed contact information on the 
corresponding website. (Generally speaking; I don't know anything about your 
website or to which addresses it would be plausible to send mail at your 
domains.)
  


Actually, a lot of these boilerplate email addresses like daemon, uucp, etc.
all get aliased to root, and root goes to me.

The legitimate addresses are few and go to different mailboxes.


Your MTA does reject nonexistent recipients, and you just want to block mail 
that is sent to certain nonexistent recipients *and* one or more existent 
ones, I presume?

  


Yes.

-Philip
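
The check being discussed here can be sketched roughly as follows: look at the full list of envelope recipients and flag the message if any of them is one of the system mailboxes that should never receive outside mail. This is only the logic, written in Python for brevity; a real filter would live in the MTA or in mimedefang-filter. The set of local parts and the function name are hypothetical, and the sketch assumes the complete recipient list is available at the point where it runs.

```python
# Hypothetical list of system mailboxes that never get legitimate mail here.
BOGUS_LOCAL_PARTS = {"daemon", "uucp"}


def has_bogus_recipient(recipients):
    """Return True if any envelope recipient targets a known-bogus mailbox."""
    for rcpt in recipients:
        # Envelope addresses are often wrapped in angle brackets.
        address = rcpt.strip().strip("<>")
        local_part = address.split("@", 1)[0].lower()
        if local_part in BOGUS_LOCAL_PARTS:
            return True
    return False
```

With this, a message addressed to both a valid user and uucp at the same domain would be flagged, while one addressed only to valid users would not.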




Re: Filtering based on the recipients

2006-04-05 Thread Philip Prindeville
Matt Kettler wrote:

[EMAIL PROTECTED] wrote:
  

Matt Kettler wrote:
  


[It] has no access to the message envelope, only the headers and
body, so this information isn't accessible to SA.

  

Well, unless you add an Apparently-To header in the MTA prior to calling 
SpamAssassin.  MIMEDefang has an $AddApparentlyToForSpamAssassin variable you 
can set to 1 in mimedefang-filter for this.

I assume SpamAssassin uses this header?

  


Yes, but I've never seen an Apparently-To implementation that listed
all the recipients of a multi-recipient message...

All the implementations I've seen add this after the message has been
split up and only the current recipient is added, which doesn't help. We
are trying to detect one which has a BCC to another user.
  


Exactly.

I'm using Sendmail and Mimedefang 2.56 if that helps any.  Looking at
spam_assassin_mail() in Mimedefang, I see:

if ($AddApparentlyToForSpamAssassin and
    ($#Recipients >= 0)) {
    push(@sahdrs, "Apparently-To: " .
         join(", ", @Recipients) . "\n");
}

Are you sure the value of @Recipients is fragmented at this point?

-Philip
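
As I read the Perl excerpt above, the header simply joins whatever is in the recipient list at that moment, so whether SpamAssassin sees every recipient depends entirely on whether that list is still complete. A small Python rendering of the same logic, with made-up addresses and a hypothetical function name, shows the difference:

```python
def apparently_to_header(recipients, add_header=True):
    """Build an Apparently-To header line from a recipient list.

    Mirrors the quoted Perl logic: no header when the option is off or the
    list is empty, otherwise one header naming every recipient in the list.
    """
    if not add_header or not recipients:
        return None
    return "Apparently-To: " + ", ".join(recipients) + "\n"


# Full envelope list: both recipients end up in the header.
full = apparently_to_header(["<a@example.com>", "<b@example.com>"])

# List already split per recipient: only one address is visible.
split = apparently_to_header(["<a@example.com>"])
```

If the list is complete when the header is built, a rule matching on Apparently-To could see the bogus address alongside the valid one. If the message has already been split per recipient, it could not.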



Re: Filtering based on the recipients

2006-04-05 Thread Philip Prindeville
Philip Prindeville wrote:

Matt Kettler wrote:

  

[EMAIL PROTECTED] wrote:
 



Matt Kettler wrote:
 
   

  

[It] has no access to the message envelope, only the headers and
body, so this information isn't accessible to SA.
   
 



Well, unless you add an Apparently-To header in the MTA prior to calling 
SpamAssassin.  MIMEDefang has an $AddApparentlyToForSpamAssassin variable 
you can set to 1 in mimedefang-filter for this.

I assume SpamAssassin uses this header?

 
   

  

Yes, but I've never seen an Apparently-To implementation that listed
all the recipients of a multi-recipient message...

All the implementations I've seen add this after the message has been
split up and only the current recipient is added, which doesn't help. We
are trying to detect one which has a BCC to another user.
 




Exactly.

I'm using Sendmail and Mimedefang 2.56 if that helps any.  Looking at
spam_assassin_mail() in Mimedefang, I see:

if ($AddApparentlyToForSpamAssassin and
    ($#Recipients >= 0)) {
    push(@sahdrs, "Apparently-To: " .
         join(", ", @Recipients) . "\n");
}

Are you sure the value of @Recipients is fragmented at this point?

-Philip

  


Oh, never mind.  Hadn't yet caught up with all of the comments.

Regarding using the /etc/mail/access file... yeah, I could do that, but I can
get much more powerful filtering in mimedefang or spamassassin, so I'm
gradually going through the process of moving all of that functionality
out of sendmail and into either SA or MdF.

-Philip


