Re: URI with spaces are not recognized

2009-02-14 Thread Franz Schwartau
Hi John!

John Hardin wrote:
 On Fri, 13 Feb 2009, Benny Pedersen wrote:
 
 On Fri, February 13, 2009 18:12, John Hardin wrote:
 If a URI rule works, what's wrong with a body rule?

 nothing wrong with making bad rules either; the point is that if bad rules
 are needed, one also has a badly behaving browser problem
 
 Why should the fact that a mail client won't render that URI as a
 clickable link mean there shouldn't be a rule for it? Spammers have been
 obfuscating URIs in this manner for a long time. There's nothing wrong
 with rules for obfuscated URIs.

Thanks for pointing that out! :-) Our primary goal is to identify spam, not
to prevent people from typing these obfuscated URLs into their browser...

 OT: Benny, could you refrain from setting your Reply-To to the email
 address of the original poster? Setting it to the mailing list address
 is fine, but setting it to the original poster is just
 passive-aggressive rudeness.
 
 On Fri, 13 Feb 2009, Franz Schwartau wrote:
 
 So, does anyone know a more general solution for this kind of spam
 instead of individual body rules?
 
 You might try a rule like:
 
  body URI_SPC_OBFU_SPC /\bwww\s{1,20}\.\s{1,20}\w{5,20}\s{1,20}\.\s{1,20}net\b/i
 
 I think it would be risky to make the URI parser attempt too much
 deobfuscation; however, accepting \s+\.\s+ as \. might be justified.
 Perhaps \s+dot\s+ as well.
 
 If the spammer uses something more complex they're reducing the
 likelihood the recipient will bother to deobfuscate the URI, and it's
 more likely to be caught by bayes, so I'd suggest the ROI to SA for
 making it more aggressive isn't large enough.

I thought about this generic body rule, too. Unfortunately, this rule
also catches legitimate mistyped URLs containing spaces. Think of users
typing URLs fast and hitting the space bar accidentally while typing. ;-)
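
The trade-off described here can be checked with a quick sketch. It transcribes the body rule's pattern into Python (an illustration only, SpamAssassin itself evaluates the Perl regex); the sample lines and the helper name `hits_rule` are made up for this example.

```python
import re

# Python transcription of the body rule quoted above (re.I stands in for /i).
URI_SPC_OBFU_SPC = re.compile(
    r"\bwww\s{1,20}\.\s{1,20}\w{5,20}\s{1,20}\.\s{1,20}net\b", re.I
)


def hits_rule(line: str) -> bool:
    """Return True if the line would trigger the body rule."""
    return URI_SPC_OBFU_SPC.search(line) is not None


# Hypothetical sample lines, not taken from real mail.
spam_line = "Visit www . cheappills . net today"
typo_line = "our site is www . example . net"  # honest typing slip, still matches
clean_line = "our site is www.example.net"     # no spaces, so the rule stays quiet

for line in (spam_line, typo_line, clean_line):
    print(hits_rule(line), line)
```

The rule cannot tell the spammer's obfuscation from the honest typo, which is the false-positive risk raised above.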

After reading PerMsgStatus.pm again another idea came up. Instead of
modifying $schemelessRE (which wouldn't help anyway) the URLs containing
spaces are replaced by URLs without spaces before spamassassin gathers
URIs. Thus all URI specific rules can be applied (e. g. uri directive
and URI blacklists).

The regexp is kept simple intentionally and matches legitimate (without
spaces) URLs as well but this doesn't hurt much.

This patch works for me and perhaps someone else finds it useful.
Comments are welcome, too. :-)

Best regards
Franz
--- PerMsgStatus.pm.new.orig 2009-02-14 11:21:20.0 +0100
+++ PerMsgStatus.pm.new 2009-02-14 11:20:54.0 +0100
@@ -1417,7 +1417,13 @@
 =cut
 
 sub get_decoded_stripped_body_text_array {
-  return $_[0]->{msg}->get_rendered_body_text_array();
+  my $textary = $_[0]->{msg}->get_rendered_body_text_array();
+
+  for (@$textary) {
+    s/(www)\s{0,2}\.\s{0,2}([a-z\d._-]{10,32})\s{0,2}\.\s{0,2}((net|org))/$1.$2.$3/i;
+  }
+
+  return $textary;
 }
 
 ###
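
For readers who want to try the substitution outside SpamAssassin, here is a minimal Python sketch of what the patched loop does to each body line. It is not part of the patch: the function name `collapse_spaced_urls` and the sample lines are assumptions for illustration, and unlike the Perl code it returns a new list instead of editing the array in place.

```python
import re

# Same pattern as the s/// in the patch; Perl's $1.$2.$3 becomes \1.\2.\3.
SPACED_URL = re.compile(
    r"(www)\s{0,2}\.\s{0,2}([a-z\d._-]{10,32})\s{0,2}\.\s{0,2}((net|org))", re.I
)


def collapse_spaced_urls(lines):
    """Rewrite the first spaced-out URL on each line, like s/// without /g."""
    return [SPACED_URL.sub(r"\1.\2.\3", line, count=1) for line in lines]


# Hypothetical body lines for illustration.
body = [
    "buy at www . spamdomain123 . net now",
    "see www . short . net",  # middle label is under 10 characters
]
print(collapse_spaced_urls(body))
```

Note that the `{10,32}` bound means short domain labels are left alone, so the second sample line passes through unchanged.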


Re: URI with spaces are not recognized

2009-02-14 Thread mouss
Wolfgang Zeikat a écrit :
 I think the discussion is getting carried in a direction where we are
 missing a point: spam detection.
 

exactly.

otherwise, there's no point to waste resources running SA. after all,
nobody would die for visiting a porn/casino/pharma/... site ;-p

and there's also another case: in a school or at home, you don't want
these messages to reach children's mailboxes, and if an adult
(moderator) checks the delivered messages, you want to reduce his work
by filtering out as much junk as you can.



Re: Spam or Not Spam :)

2009-02-14 Thread Karsten Bräckelmann
On Fri, 2009-02-13 at 19:24 -0800, an anonymous Nabble user wrote:
 I have some mails that I know are spam, but spamassassin gives scores
 below 5.0 (generally zero) for some of them. I updated the rules and changed the
 score threshold, but spamassassin still sees them as normal emails. Am I
 missing something or is this normal?

Depends on the amount -- but yes, generally sounds just about right.

There *is* spam out there that basically does not hit any rules other
than Bayes and some URI and DNS BLs.

Now, as you are post-processing (old?) messages for some stats, it is
entirely possible the blacklist listings have expired, as someone
explained before. Given the previous discussions and this description, I
can only assume you are not using Bayes -- so that won't trigger either.


 I have like 1800 emails but it sees only 5 of them as spam.

Mixed up these numbers, eh? ;)


-- 
char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4;
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1:
(c=*++x); c128  (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}



Re: Last-5-percent tuning

2009-02-14 Thread Ricardo Kleemann

Hi,


 Do you use any MTA-level DNSBLs?


No.


If you have ample resources you can do this. If you are getting
tens of thousands of mails you can't (or won't). We reject about 90% of the
spam at the MTA. That's mostly bot spam. Why should we burn good resources
for that stuff? Interestingly, that also kills almost all of the fierce
spam that might slip through SA. So SA then does a very good job on the rest,
letting only a few slip by. With SA only we would have much more slip
by. And we don't need any extra rules (like SARE, KAM) anymore. I'm using
sought, but it doesn't appear to be too effective.


Which SA plugin performs this? Is it Mail::SpamAssassin::Plugin::URIDNSBL? 



Re: Two servers, one database. A question

2009-02-14 Thread Bob Proulx
Kris Deugau wrote:
 John Hardin wrote:
 The question is which is better, sending the message body (spamc <->
 spamd traffic) or database queries (spamd <-> mysql traffic) over the
 expensive link?

 I would bet on Bayes/userpref queries being more efficient than the  
 spamc/spamd traffic.

I like that you are asking the question.  But I hate to guess at which
is better though.  The weakest benchmark data point is better than the
strongest guess.  Too often I have taken my best guess and been wrong.
In this case I would guess the opposite would be more efficient, that
the one spamc-spamd connection per message would be more efficient
than the many mysql queries per message, which is why I bring this up.

Bob


Re: Two servers, one database. A question

2009-02-14 Thread Lindsay Haisley
On Sat, 2009-02-14 at 15:04 -0600, Bob Proulx wrote:
  I would bet on Bayes/userpref queries being more efficient than the
  spamc/spamd traffic.
 
 I like that you are asking the question.  But I hate to guess at which
 is better though.  The weakest benchmark data point is better than the
 strongest guess.  Too often I have taken my best guess and been wrong.
 In this case I would guess the opposite would be more efficient, that
 the one spamc-spamd connection per message would be more efficient
 than the many mysql queries per message, which is why I bring this up.

Well that's something to consider.  I had hoped when I subscribed to
this list to ask this question that I'd find people, possibly SA
developers on it, who had benchmarked the options I presented for
decision and could give me some definitive answers based on this, but it
appears that this isn't the case.  Instead I've found several people of
good will who don't seem to know a whole lot more about SA than I do,
but have given me some good points to think about.

Do you have any idea where I might inquire to get advice from people
with more precise knowledge?

-- 
Lindsay Haisley   | Everything works|Accredited
FMP Computer Services |   if you let it |  by the
512-259-1190  |(The Roadie)  |   Austin Better
http://www.fmp.com|  |  Business Bureau



Re: Two servers, one database. A question

2009-02-14 Thread Michael Parker


On Feb 14, 2009, at 3:47 PM, Lindsay Haisley wrote:


On Sat, 2009-02-14 at 15:04 -0600, Bob Proulx wrote:

I would bet on Bayes/userpref queries being more efficient than the
spamc/spamd traffic.

I like that you are asking the question.  But I hate to guess at which
is better though.  The weakest benchmark data point is better than the
strongest guess.  Too often I have taken my best guess and been wrong.
In this case I would guess the opposite would be more efficient, that
the one spamc-spamd connection per message would be more efficient
than the many mysql queries per message, which is why I bring this up.

Well that's something to consider.  I had hoped when I subscribed to
this list to ask this question that I'd find people, possibly SA
developers on it, who had benchmarked the options I presented for
decision and could give me some definitive answers based on this, but it
appears that this isn't the case.  Instead I've found several people of
good will who don't seem to know a whole lot more about SA than I do,
but have given me some good points to think about.

Do you have any idea where I might inquire to get advice from people
with more precise knowledge?



This is the best place.  It's not a common setup, so I doubt that
anyone really knows the correct answer.


One data point I'll add is that spamc has a compress mode that might
be useful (spamc -z).  Also, it would take a little work on your end
but you can also pass in --headers to further reduce the spamc/spamd
traffic.  Check out the spamc man page for more info.
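
As a rough illustration of the two options mentioned here, the sketch below assembles a spamc command line with `-z` and `--headers` and pipes one message through it. The host name and both function names are placeholders, and `-z` only works when zlib support is available on both ends, so treat this as a starting point for experiments and check the spamc man page before relying on it.

```python
import subprocess


def build_spamc_command(host, compress=True, headers_only=True):
    """Assemble a spamc command line for scanning against a remote spamd."""
    cmd = ["spamc", "-d", host]  # -d selects the spamd host to contact
    if compress:
        cmd.append("-z")         # compress the message sent to spamd
    if headers_only:
        cmd.append("--headers")  # spamd sends back only the rewritten headers
    return cmd


def scan(message_bytes, host):
    """Pipe one message through spamc and return what it writes to stdout."""
    result = subprocess.run(
        build_spamc_command(host),
        input=message_bytes,
        stdout=subprocess.PIPE,
        check=False,
    )
    return result.stdout


# "spamd.example.com" is a placeholder host name.
print(build_spamc_command("spamd.example.com"))
```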


One other thing related to MySQL.  I've never personally done it but  
I'm certain there are ways you could use MySQL proxy or perhaps even  
federated tables to manage this sort of thing.  MySQL proxy has lots  
of different functions, I'm sure compression is either one of them or  
at least something that can be easily bolted on.


Michael










Re: Two servers, one database. A question

2009-02-14 Thread Karsten Bräckelmann
On Sat, 2009-02-14 at 17:07 -0600, Michael Parker wrote:
 On Feb 14, 2009, at 3:47 PM, Lindsay Haisley wrote:

  Well that's something to consider.  I had hoped when I subscribed to
  this list to ask this question that I'd find people, possibly SA
  developers on it, who had benchmarked the options I presented for
  decision and could give me some definitive answers based on this, but it
  appears that this isn't the case.  Instead I've found several people of
  good will who don't seem to know a whole lot more about SA than I do,
  but have given me some good points to think about.

Being a SA dev doesn't necessarily imply any need to use SQL based
storage. Let alone scanning on an off-site server. :)  I, for one,
don't. So take it with a grain of salt.

  Do you have any idea where I might inquire to get advice from people
  with more precise knowledge?
 
 This is the best place.  It's not a common setup, so I doubt that
 anyone really knows the correct answer.
 
 One data point I'll add is that spamc has a compress mode that might
 be useful (spamc -z).  Also, it would take a little work on your end
 but you can also pass in --headers to further reduce the spamc/spamd
 traffic.  Check out the spamc man page for more info.

Ah, good one -- I forgot about the -z option, otherwise I would have
chipped in before. The headers option is something I was thinking about
already. This basically reduces the traffic from two times the mail stream
(as mentioned) to one.

Regarding SQL traffic and Bayes -- tokenizing a message into unique
tokens, then adding the SQL overhead. Would that really be less than the
raw average message? Another thing to keep in mind is latency, iff there
are multiple queries involved. Versus the single round-trip of spamc.
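
The latency argument can be put into a back-of-the-envelope model. This is only a sketch under loud assumptions: every number is invented, SQL queries are assumed to run strictly one after another, connection setup is ignored, and the headers returned with `--headers` are rounded down to zero bytes. It does not replace an actual benchmark.

```python
def spamc_link_cost(msg_bytes, rtt_s, headers_only=False):
    """Bytes and latency for one spamc <-> spamd exchange over the link."""
    # Without --headers the whole message comes back; with it, roughly only
    # the headers do, which this sketch rounds down to zero bytes.
    returned = 0 if headers_only else msg_bytes
    return {"bytes": msg_bytes + returned, "latency_s": rtt_s}


def sql_link_cost(queries, bytes_per_query, rtt_s):
    """Bytes and latency if spamd is local and only SQL crosses the link."""
    # Assumes sequential queries, so each one pays a full round trip.
    return {"bytes": queries * bytes_per_query, "latency_s": queries * rtt_s}


# Invented example figures: 10 kB message, 50 ms round trip, 20 queries.
print(spamc_link_cost(10_000, 0.05))
print(spamc_link_cost(10_000, 0.05, headers_only=True))
print(sql_link_cost(20, 500, 0.05))
```

With these made-up figures the byte counts are comparable, but the SQL variant pays the round-trip time once per query, which is the latency concern raised above.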

On the other hand, there is manageability. A single spamd is easier than
keeping two in sync. Probably not too challenging, though. ;)

To throw in another crack idea: What about consolidating the MXs? And
then internally forwarding the already processed messages?


Lindsay, if you end up doing some benchmarking, please let us know. I
wouldn't be surprised if you're actually the first one to do this across
the Internet. :)

  guenther





Re: Last-5-percent tuning

2009-02-14 Thread Karsten Bräckelmann
On Sat, 2009-02-14 at 10:42 -0800, Ricardo Kleemann wrote:

   Do you use any MTA-level DNSBLs?
 
  No.
 
  If you have ample resources you can do this. If you are getting
  tens of thousands of mails you can't (or won't). We reject about 90% of the
  spam at the MTA. That's mostly bot spam. Why should we burn good resources
  for that stuff? Interestingly, that also kills almost all of the fierce
  spam that might slip through SA. So SA then does a very good job on the rest,
  letting only a few slip by. With SA only we would have much more slip
  by. And we don't need any extra rules (like SARE, KAM) anymore. I'm using
  sought, but it doesn't appear to be too effective.
 
 Which SA plugin performs this? Is it Mail::SpamAssassin::Plugin::URIDNSBL? 
   
Err, what exactly do you mean by this?  Sought?





DNS MX Question [OT]

2009-02-14 Thread Marc Perkel

Hi,

I have a quick bind question. I want to set the MX records on a domain 
to something normal but I want to set the MX for all subdomains to 
something else.


example.com mail.example.com
xxx.example.com blackhole.example.com

Thanks in advance



Re: DNS MX Question [OT]

2009-02-14 Thread Duane Hill

On Sat, 14 Feb 2009, Marc Perkel wrote:


Hi,

I have a quick bind question. I want to set the MX records on a domain to 
something normal but I want to set the MX for all subdomains to something 
else.


example.com mail.example.com
xxx.example.com blackhole.example.com


So do just that:

example.com.       IN   MX   10 mail.example.com.
xxx.example.com.   IN   MX   10 blackhole.example.com.

A simple Google search for 'subdomain mx record' explains the usage of
'$ORIGIN'; taking the first link off the top:


http://zytrax.com/books/dns/ch8/mx.html


Re: DNS MX Question [OT]

2009-02-14 Thread John Lundin
On Sat, Feb 14, 2009 at 06:37:14PM -0800, Marc Perkel wrote:
 I have a quick bind question. I want to set the MX records on a domain 
 to something normal but I want to set the MX for all subdomains to 
 something else.
 
 example.com mail.example.com
 xxx.example.com blackhole.example.com

See http://www.ietf.org/rfc/rfc1035.txt etc

Briefly, in the zone file for example.com:

@ MX 10   mail.example.com.
xxx   MX 10   blackhole.example.com.

-- 
  lun...@fini.net
Please phrase your question in the form of a question.


Re: DNS MX Question [OT]

2009-02-14 Thread Marc Perkel



Marc Perkel wrote:

Hi,

I have a quick bind question. I want to set the MX records on a domain 
to something normal but I want to set the MX for all subdomains to 
something else.


example.com mail.example.com
xxx.example.com blackhole.example.com

Thanks in advance




I should be more specific. I asked the question wrong.

*.example.com blackhole.example.com

What I need is that any subdomain point to blackhole.



Re: DNS MX Question [OT]

2009-02-14 Thread Dave Funk

On Sat, 14 Feb 2009, Marc Perkel wrote:




Marc Perkel wrote:

Hi,

I have a quick bind question. I want to set the MX records on a domain to 
something normal but I want to set the MX for all subdomains to something 
else.


example.com mail.example.com
xxx.example.com blackhole.example.com

Thanks in advance




I should be more specific. I asked the question wrong.

*.example.com blackhole.example.com

What I need is that any subdomain point to blackhole.



Then replace 'xxx' with '*'. EG:

Briefly, in the zone file for example.com:

@   MX  10   mail.example.com.
*   MX  10   blackhole.example.com.

Yes, it -is- that simple. ;)
Not recommended for normal use but if you understand the risks involved,
it does work that way.


--
Dave Funk  University of Iowa
dbfunk (at) engineering.uiowa.eduCollege of Engineering
319/335-5751   FAX: 319/384-0549   1256 Seamans Center
Sys_admin/Postmaster/cell_adminIowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{


Re: DNS MX Question [OT]

2009-02-14 Thread Marc Perkel



Dave Funk wrote:

On Sat, 14 Feb 2009, Marc Perkel wrote:




Marc Perkel wrote:

Hi,

I have a quick bind question. I want to set the MX records on a 
domain to something normal but I want to set the MX for all 
subdomains to something else.


example.com mail.example.com
xxx.example.com blackhole.example.com

Thanks in advance




I should be more specific. I asked the question wrong.

*.example.com blackhole.example.com

What I need is that any subdomain point to blackhole.



Then replace 'xxx' with '*'. EG:

Briefly, in the zone file for example.com:

@   MX  10   mail.example.com.
*   MX  10   blackhole.example.com.

Yes, it -is- that simple. ;)
Not recommended for normal use but if you understand the risks involved,
it does work that way.




Thanks Dave, but I already tried that and it didn't work.

dig @localhost churchofreality.com mx

; <<>> DiG 9.5.1-P1-RedHat-9.5.1-1.P1.fc10 <<>> @localhost churchofreality.com mx
; (1 server found)
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 48505
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;churchofreality.com.   IN  MX


Thanks for the attempt though.



Re: DNS MX Question [OT]

2009-02-14 Thread Marc Perkel



Dave Funk wrote:

On Sat, 14 Feb 2009, Marc Perkel wrote:




Marc Perkel wrote:

Hi,

I have a quick bind question. I want to set the MX records on a 
domain to something normal but I want to set the MX for all 
subdomains to something else.


example.com mail.example.com
xxx.example.com blackhole.example.com

Thanks in advance




I should be more specific. I asked the question wrong.

*.example.com blackhole.example.com

What I need is that any subdomain point to blackhole.



Then replace 'xxx' with '*'. EG:

Briefly, in the zone file for example.com:

@   MX  10   mail.example.com.
*   MX  10   blackhole.example.com.

Yes, it -is- that simple. ;)
Not recommended for normal use but if you understand the risks involved,
it does work that way.




It didn't work, but this might be related. I have this in there as
well so that all A record subdomains resolve to the same IP.


mail            IN  CNAME   mail.ctyme.com.
mailman         IN  CNAME   mailman.ctyme.com.
mailman.mailman IN  CNAME   mailman.ctyme.com.
ssh             IN  A       65.49.42.101
ftp             IN  A       65.49.42.101
www             IN  A       65.49.42.100
*               IN  CNAME   @
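
One possible cause, offered as an assumption since the full zone file is not shown: DNS does not allow a CNAME to share an owner name with other record types, so a `*` CNAME next to a `*` MX may make the zone fail to load, which would fit the SERVFAIL. The Python sketch below only flags such conflicts in a simplified list of (owner, type) pairs; the function name is hypothetical, and it ignores DNSSEC record types and is no substitute for a real zone checker.

```python
from collections import defaultdict


def cname_conflicts(records):
    """Return owner names that hold a CNAME together with another record type."""
    types_by_owner = defaultdict(set)
    for owner, rtype in records:
        types_by_owner[owner].add(rtype.upper())
    return sorted(
        owner
        for owner, types in types_by_owner.items()
        if "CNAME" in types and len(types) > 1
    )


# Owner/type pairs assumed from the records quoted in this thread.
zone = [
    ("@", "MX"), ("*", "MX"),
    ("mail", "CNAME"), ("mailman", "CNAME"),
    ("ssh", "A"), ("ftp", "A"), ("www", "A"),
    ("*", "CNAME"),
]
print(cname_conflicts(zone))
```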







Re: DNS MX Question [OT]

2009-02-14 Thread Lindsay Haisley
On Sat, 2009-02-14 at 22:06 -0800, Marc Perkel wrote:
 
 Dave Funk wrote:
  Yes, it -is- that simple. ;)
  Not recommended for normal use but if you understand the risks involved,
  it does work that way.
 
 
 
 Thanks Dave, but I already tried that and it didn't work.

See http://en.wikipedia.org/wiki/Wildcard_DNS_record and in particular
the quote from RFC 1912.
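
The gist of that RFC 1912 passage is that a wildcard MX applies only to names that are not listed in the zone at all. A simplified Python sketch of that behaviour follows; it handles single-label names in one zone only, and the `www` record and its address are invented for illustration.

```python
def lookup(zone, name, rtype):
    """Simplified lookup: the wildcard only covers names absent from the zone."""
    if name in zone:
        # The name exists, so the wildcard is never consulted, even when the
        # name has no record of the requested type.
        return zone[name].get(rtype, [])
    return zone.get("*", {}).get(rtype, [])


zone = {
    "@": {"MX": ["10 mail.example.com."]},
    "*": {"MX": ["10 blackhole.example.com."]},
    "www": {"A": ["192.0.2.10"]},  # hypothetical host record
}

print(lookup(zone, "foo", "MX"))  # unlisted name, wildcard answers
print(lookup(zone, "www", "MX"))  # listed name, wildcard does not apply
```

So any host that already has an A or CNAME record needs its own MX line if it should also point at the blackhole.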

-- 
Lindsay Haisley   | Everything works|Accredited
FMP Computer Services |   if you let it |  by the
512-259-1190  |(The Roadie)  |   Austin Better
http://www.fmp.com|  |  Business Bureau



Re: DNS MX Question [OT]

2009-02-14 Thread Marc Perkel



Lindsay Haisley wrote:

On Sat, 2009-02-14 at 22:06 -0800, Marc Perkel wrote:
  

Dave Funk wrote:


Yes, it -is- that simple. ;)
Not recommended for normal use but if you understand the risks involved,
it does work that way.


  

Thanks Dave, but I already tried that and it didn't work.



See http://en.wikipedia.org/wiki/Wildcard_DNS_record and in particular
the quote from RFC 1912.

  


Is that going to tell me what I need to know to do what I asked to do?