Re: malformed To: header blocks further parsing

2013-06-06 Thread Matteo Dessalvi
Hi Fabio.

Have you also tried SpamAssassin's 'Language options', like the ones
described here: 
http://spamassassin.apache.org/full/3.2.x/doc/Mail_SpamAssassin_Conf.html#language_options

Matteo


- Original message -
From: Fabio Sangiovanni sangiova...@nweb.it
To: users@spamassassin.apache.org
Cc: 
Sent: Wednesday, 5 June 2013 12:26
Subject: malformed To: header blocks further parsing

Hi everybody,

I'm using spamassassin 3.3.2, along with postfix 2.6.6 and amavisd-new 
2.8.0.
The system spamassassin is running on is used primarily for URIDNSBL checks.
Recently I had some messages classified as spam because of these rules:

X-Spam-Report:
  *  1.2 TO_MALFORMED To: has a malformed address
  *  0.5 NULL_IN_BODY FULL: Message has NUL (ASCII 0) byte in message
  *  0.1 MISSING_MID Missing Message-Id: header
  *  1.8 MISSING_SUBJECT Missing Subject: header
  *  1.4 MISSING_DATE Missing Date: header

The problem is that the body is not null at all, and headers aren't 
missing: what happens here is that the To: header contains Chinese 
characters that are not in encoded-word format and that interfere with 
spamassassin's parsing.
This is a problem because the mail body can't be checked against other 
rules.
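
For comparison, here is a minimal Python sketch of what an encoded-word (RFC 2047) To: header looks like. The display name and address are made-up placeholders, and this only illustrates the encoding a well-behaved MUA would apply; it does not change how SpamAssassin parses the message.

```python
from email.header import Header, decode_header, make_header

# Hypothetical display name and address, for illustration only.
display_name = "张伟"
address = "user@example.com"

# Encode only the display name as an RFC 2047 encoded-word.
encoded_name = Header(display_name, charset="utf-8").encode()
to_header = f"{encoded_name} <{address}>"

# A correctly encoded header value is pure ASCII; this raises otherwise.
to_header.encode("ascii")

# Decoding the header restores the original characters.
decoded = str(make_header(decode_header(to_header)))

print("To:", to_header)
print("Decoded:", decoded)
```

A raw To: header carrying the same characters as unencoded 8-bit bytes is the case described above.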

Is there a way to fix this, other than changing the MUA's behaviour to 
encode the message properly? How can I get spamassassin to parse the 
remaining part of the message in such conditions?

Configuration follows (please let me know if you need further information):

loaded plugins:

loadplugin Mail::SpamAssassin::Plugin::URIDNSBL
loadplugin Mail::SpamAssassin::Plugin::Check
loadplugin Mail::SpamAssassin::Plugin::Rule2XSBody


local.cf:

trusted_networks 192.168/16
internal_networks 192.168/16
skip_rbl_checks 1
use_learner 0
use_bayes 0
use_bayes_rules 0
bayes_auto_learn 0
score        URIBL_SBL    5
score        URIBL_DBL_SPAM    5
score        URIBL_DBL_REDIR    5
score        URIBL_DBL_ERROR    5
score        URIBL_SC_SURBL    5
score        URIBL_WS_SURBL    5
score        URIBL_PH_SURBL    5
score        URIBL_MW_SURBL    5
score        URIBL_AB_SURBL    5
score        URIBL_JP_SURBL    5
score        URIBL_BLACK    0
score        URIBL_RED    0
score        URIBL_GREY    0
score        URIBL_BLOCKED    0



Re: Question about T_KHOP_FOREIGN_CLICK

2013-06-06 Thread Bowie Bailey

On 6/5/2013 10:30 PM, Adam Katz wrote:

On 05/31/2013 06:51 AM, Bowie Bailey wrote:


On 5/31/2013 8:30 AM, Matteo Vannucchi - TeamEnterprise wrote:

Hello, my name is Matteo.

I do not manage a spamassassin installation, but I would like to ask 
this simple question, because I saw it is a rule which is used to 
evaluate spam score.
I tried searching Google, the users forum, the Wiki and the Docs 
page in the site, but did not find any information. The simple 
question is: how does T_KHOP_FOREIGN_CLICK rule work?


Hope the answer is as simple.


It's a fairly complex regex rule.  Without spending too much time 
analyzing it, I think it is looking for a link that says "click here" 
in a language other than English.


You are correct, though it also matches English.  I've placed a 
syntactical explanation of this regex at http://regex101.com/r/qS8nF4


Ah... That makes it perfectly clear!   ;)

Nice site though...  I'll have to bookmark that one for the next time 
one of my regexes isn't doing what I expect.  I can never remember those 
sites when I need them.




A related question is why is this rule name duplicated?  My guess is 
that it was changed at some point from a rawbody rule to a uri_detail 
rule and the old one was left in there.  One of them should be 
removed to avoid confusion.


from 72_active.cf:

rawbody T_KHOP_FOREIGN_CLICK 
m{\bhref=[^]{9,199}[^]{0,80}(?:(?!/a\b)[^]{0,299}[^]{0,80}){0,9}[^]{0,80}\b(?:cli(?:quez\W|ck\Wa)ici\b|cli(?:cca\W|c\Wa|que\Wa)qu[^.,a 
]|klie?k(?:\Whi?er|ni(?:j|nite)\Wtu[tk]aj)\b)}si


uri_detail T_KHOP_FOREIGN_CLICK text =~ 
/\b(?:cli(?:quez\W|ck\Wa)ici\b|cli(?:cca\W|c\Wa|que\Wa)qu[^.,a 
]|klie?k(?:\Whi?er|ni(?:j|nite)\Wtu[tk]aj)\b)/i


The sandbox promotion system does make this a bit more confusing than 
it should be (using a double negative), but it is assembling the two 
versions of the rule correctly:


##{ T_KHOP_FOREIGN_CLICK if ! plugin (Mail::SpamAssassin::Plugin::URIDetail)

if ! plugin (Mail::SpamAssassin::Plugin::URIDetail)
   rawbody T_KHOP_FOREIGN_CLICK   
m{\bhref=[^]{9,199}[^]{0,80}(?:(?!/a\b)[^]{0,299}[^]{0,80}){0,9}[^]{0,80}\b(?:cli(?:quez\W|ck\Wa)ici\b|cli(?:cca\W|c\Wa|que\Wa)qu[^.,a
 ]|klie?k(?:\Whi?er|ni(?:j|nite)\Wtu[tk]aj)\b)}si
endif
##} T_KHOP_FOREIGN_CLICK if ! plugin (Mail::SpamAssassin::Plugin::URIDetail)

##{ if !(! plugin (Mail::SpamAssassin::Plugin::URIDetail))_sandbox

if !(! plugin (Mail::SpamAssassin::Plugin::URIDetail))
   uri_detail T_KHOP_FOREIGN_CLICK   text =~ 
/\b(?:cli(?:quez\W|ck\Wa)ici\b|cli(?:cca\W|c\Wa|que\Wa)qu[^.,a 
]|klie?k(?:\Whi?er|ni(?:j|nite)\Wtu[tk]aj)\b)/i
endif
##} if !(! plugin (Mail::SpamAssassin::Plugin::URIDetail))_sandbox
This means that the rawbody version is used if URIDetail isn't loaded 
and the uri_detail version is used if the URIDetail plugin is loaded.


That explains it.  I was grepping the file and didn't think to look for 
conditionals around the rules.


--
Bowie


MariaDB replacing MySQL for Bayes

2013-06-06 Thread Marc Perkel
So - after a couple of weeks it just works. I recommend getting rid of 
MySQL in favor of MariaDB. Besides bayes I'm using it on my web server 
and it just works and it's a lot more solid.


My 2 centz

--
Marc Perkel - Sales/Support
supp...@junkemailfilter.com
http://www.junkemailfilter.com
Junk Email Filter dot com
415-992-3400



Re: MariaDB replacing MySQL for Bayes

2013-06-06 Thread Antony Stone
On Thursday 06 Jun 2013 at 21:03:50, Marc Perkel wrote:

 So - after a couple of weeks it just works. I recommend getting rid of
 MySQL in favor of MariaDB. Besides bayes I'm using it on my web server
 and it just works and it's a lot more solid.

Just out of interest, how do you define solid?

For example, what problems did you previously see with MySQL which you no 
longer have with MariaDB?


Thanks,


Antony.

-- 
I love deadlines.   I love the whooshing noise they make as they go by.

 - Douglas Noel Adams

 Please reply to the list;
   please don't CC me.


FP on SPOOF_COM2OTH (and potentially SPOOF_COM2COM)

2013-06-06 Thread Daniel McDonald
I had a recent FP message that hit both the SPOOF_COM2OTH and SPOOF_COM2COM
rules.  I don't think COM2OTH is appropriate:
Jun  6 13:55:49.469 [26386] dbg: rules: ran uri rule SPOOF_COM2OTH ==
got hit: http://wwwDOTMUNGEDDOTcomDOTtemp.DOTlivebooks.
Jun  6 13:55:49.469 [26386] dbg: rules: ran uri rule SPOOF_COM2COM ==
got hit: http://wwwDOTMUNGEDDOTcomDOTtempDOTlivebooksDOTcom

A scan of the message shows that these two rules are hitting the same line.

A quick check of my logs shows 100% overlap in one direction:

[mcdonalddj@sa ~]$ sudo grep SPOOF_COM2OTH /var/log/mail/info.log | grep -vc
SPOOF_COM2COM
0
[mcdonalddj@sa ~]$ sudo grep SPOOF_COM2OTH /var/log/mail/info.log | grep -c
SPOOF_COM2COM
26
[mcdonalddj@sa ~]$ sudo grep SPOOF_COM2COM /var/log/mail/info.log | grep -vc
SPOOF_COM2OTH
13

I'll be disabling SPOOF_COM2OTH for now, but thought someone might want to
look into it.  I also see that the rule has a single exception, for
s3.amazonaws.com.  I might add livebooks to that list locally.
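
The same overlap check can be sketched in Python rather than with chained greps. This is only an illustration: the function name and the sample log lines are made up, and it uses plain substring matching, just as grep does above.

```python
def overlap_counts(lines, rule_a, rule_b):
    """Return (only_a, both, only_b) counts of lines mentioning each rule."""
    only_a = both = only_b = 0
    for line in lines:
        has_a = rule_a in line
        has_b = rule_b in line
        if has_a and has_b:
            both += 1
        elif has_a:
            only_a += 1
        elif has_b:
            only_b += 1
    return only_a, both, only_b


# Made-up sample lines standing in for /var/log/mail/info.log entries.
sample_log = [
    "msg1 tests=SPOOF_COM2COM,SPOOF_COM2OTH",
    "msg2 tests=SPOOF_COM2COM",
    "msg3 tests=URIBL_SBL",
]

only_oth, both, only_com = overlap_counts(
    sample_log, "SPOOF_COM2OTH", "SPOOF_COM2COM"
)
print(only_oth, both, only_com)
```

If only_oth stays at zero over a real log, every SPOOF_COM2OTH hit also hit SPOOF_COM2COM, which is the one-directional overlap reported above.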


-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281



Spam rule

2013-06-06 Thread Rejaine Monteiro

Hi list,

How can I make a rule to do something like this:  block messages with 
body or subject contains  'lalalalala'   AND url  with PDF  NOT contains 
'trusted.net'




Re: Spam rule

2013-06-06 Thread Daniel McDonald
On 6/6/13 4:23 PM, Rejaine Monteiro reja...@bhz.jamef.com.br wrote:

Hi list, 
  
  How can I make a rule to do something like this:  block messages

For the pedantic, SpamAssassin doesn't block mail.  It marks it.  Whether
you block mail that has been marked with some other process is up to you...

 with body or 
 subject contains  'lalalalala'   AND url  with PDF  NOT contains 'trusted.net'

body    __LALA_B  /la{5}/
header  __LALA_H Subject =~ /la{5}/
header  __LALA_TRUST Received =~ /192\.162\.101\.\d{1,3}/
meta    MY_LALA  (__LALA_B || __LALA_H) && __HAS_ANY_URI && __PDF_ATTACH && !__LALA_TRUST
score   MY_LALA 5.0


-- 
Daniel J McDonald, CCIE # 2495, CISSP # 78281




Re: Spam rule

2013-06-06 Thread Wolfgang Zeikat

Hi,

In an older episode, on 2013-06-06 23:54, Daniel McDonald wrote:

with body or 
subject contains  'lalalalala'   AND url  with PDF  NOT contains 'trusted.net'


body    __LALA_B  /la{5}/
header  __LALA_H Subject =~ /la{5}/


shouldn't that be
/(la){5}/
???

I think /la{5}/ would match
la instead of lalalalala ...
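
To be precise, a quantifier binds only to the atom directly before it, so /la{5}/ needs an l followed by five a characters, while /(?:la){5}/ needs la repeated five times. A quick sketch using Python's re module, which handles these two simple patterns the same way Perl does (the variable names are only for illustration):

```python
import re

# {5} applies to the single preceding character 'a'.
char_quantified = re.compile(r"la{5}")
# {5} applies to the whole non-capturing group 'la'.
group_quantified = re.compile(r"(?:la){5}")

for text in ("la", "laaaaa", "lalalalala"):
    print(
        text,
        bool(char_quantified.search(text)),
        bool(group_quantified.search(text)),
    )
```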

Cheers,

wolfgang




Re: Spam rule

2013-06-06 Thread Rejaine Monteiro


lala was only an example, thanks for the tip, I will test here

On 06-06-2013 19:14, Wolfgang Zeikat wrote:

Hi,

In an older episode, on 2013-06-06 23:54, Daniel McDonald wrote:

with body or subject contains 'lalalalala'   AND url  with PDF  NOT 
contains 'trusted.net'


body    __LALA_B  /la{5}/
header  __LALA_H Subject =~ /la{5}/


shouldn't that be
/(la){5}/
???

I think /la{5}/ would match
la instead of lalalalala ...

Cheers,

wolfgang





--
Rejaine da Silveira Monteiro
Suporte-TI
Jamef Encomendas Urgentes
Matriz - Contagem/MG
Tel: (31) 2102-8854
www.jamef.com.br



Re: Spam rule

2013-06-06 Thread Daniel McDonald



On 6/6/13 5:14 PM, Wolfgang Zeikat wolfgang.zei...@desy.de wrote:

 Hi,
 
 In an older episode, on 2013-06-06 23:54, Daniel McDonald wrote:
 
 with body or 
 subject contains  'lalalalala'   AND url  with PDF  NOT contains
 'trusted.net'
 
 body    __LALA_B  /la{5}/
 header  __LALA_H Subject =~ /la{5}/
 
 shouldn't that be
 /(la){5}/

Well, more properly /(?:la){5}/

 
 I think /la{5}/ would match
 la instead of lalalalala ...

Quite right...




Re: Spam rule

2013-06-06 Thread Wolfgang Zeikat

In an older episode, on 2013-06-07 00:17, Rejaine Monteiro wrote:


lala was only an example, thanks for the tip, I will test here


For basics of writing SA rules, maybe look at
http://wiki.apache.org/spamassassin/WritingRules

Hope this helps,

wolfgang




Re: Spam rule

2013-06-06 Thread Martin Gregorie
On Thu, 2013-06-06 at 16:54 -0500, Daniel McDonald wrote:
 On 6/6/13 4:23 PM, Rejaine Monteiro reja...@bhz.jamef.com.br wrote:
 
 Hi list, 
   
   How can I make a rule to do something like this:  block messages
 
 For the pedantic, SpamAssassin doesn't block mail.  It marks it.  Whether
 you block mail that has been marked with some other process is up to you...
 
  with body or 
  subject contains  'lalalalala'   AND url  with PDF  NOT contains 
  'trusted.net'
 
 body    __LALA_B  /la{5}/

That will only match laaaaa, because {5} applies just to the a - you'll
need to use /lalalalala/

 header  __LALA_H Subject =~ /la{5}/

IIRC this isn't needed because the subject is treated as part of the
body.

 
 header  __LALA_TRUST Received =~ /192\.162\.101\.\d{1,3}/

I'm uncertain what the OP means he wants done with trusted.net, except
that it doesn't look as though he thinks it's the sender. Assuming he
means it should be in the body, try something like this:

body       __LA5B /lalalalala/
uri        __LA5T /trusted\.net/
mimeheader __LA5P Content-type =~ /application\/pdf/
meta       LA5  (__LA5B && !__LA5T && __LA5P)
score      LA5  5.0

...of course you'll need to have the MIMEHeader plugin loaded if
__LA5P is to work. 

Disclaimer: although I've tested __LA5B and written __LA5P 
against a real PDF attachment, this set of rules and subrules is
untested.
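
The way the meta rule is meant to combine the three subrules can be sketched in Python. This is only a truth-table illustration under the assumption that the message has already been split into body text, a list of URIs and a list of MIME Content-Type values; the function and parameter names are made up, and it is not how SpamAssassin evaluates rules internally.

```python
import re

BODY_RE = re.compile(r"lalalalala")           # mirrors __LA5B
TRUSTED_URI_RE = re.compile(r"trusted\.net")  # mirrors __LA5T
PDF_TYPE_RE = re.compile(r"application/pdf")  # mirrors __LA5P


def la5_hits(body_text, uris, content_types):
    """Return True when the body matches, no URI is trusted, and a PDF part exists."""
    la5b = bool(BODY_RE.search(body_text))
    la5t = any(TRUSTED_URI_RE.search(uri) for uri in uris)
    la5p = any(PDF_TYPE_RE.search(ctype) for ctype in content_types)
    return la5b and not la5t and la5p


# Made-up example message parts.
print(la5_hits("buy lalalalala now", ["http://evil.example/x"], ["application/pdf"]))
print(la5_hits("buy lalalalala now", ["http://trusted.net/x"], ["application/pdf"]))
```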


Martin





Re: Spam rule

2013-06-06 Thread John Hardin

On Thu, 6 Jun 2013, Daniel McDonald wrote:


On 6/6/13 4:23 PM, Rejaine Monteiro reja...@bhz.jamef.com.br wrote:


   Hi list,

 How can I make a rule to do something like this:  block messages


For the pedantic, SpamAssassin doesn't block mail.  It marks it.  Whether
you block mail that has been marked with some other process is up to you...


with body or
subject contains  'lalalalala'   AND url  with PDF  NOT contains 'trusted.net'


body    __LALA_B  /la{5}/
header  __LALA_H Subject =~ /la{5}/


Not needed, the subject is automatically included in the body.

Agree it should be /(?:la){5}/, but as that's just a placeholder, that 
optimization shouldn't have been offered. It's a confusing optimization 
for someone asking about RE basics.



header  __LALA_TRUST Received =~ /192\.162\.101\.\d{1,3}/
meta    MY_LALA  (__LALA_B || __LALA_H) && __HAS_ANY_URI && __PDF_ATTACH && !__LALA_TRUST
score   MY_LALA 5.0


--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
---
  The fetters imposed on liberty at home have ever been forged out
  of the weapons provided for defense against real, pretended, or
  imaginary dangers from abroad.   -- James Madison, 1799
---
 Today: the 69th anniversary of D-Day


Re: Spam rule

2013-06-06 Thread Benny Pedersen

Daniel McDonald wrote on 2013-06-06 23:54:


body    __LALA_B  /la{5}/
header  __LALA_H Subject =~ /la{5}/
header  __LALA_TRUST Received =~ /192\.162\.101\.\d{1,3}/
meta    MY_LALA  (__LALA_B || __LALA_H) && __HAS_ANY_URI && __PDF_ATTACH && !__LALA_TRUST
score   MY_LALA 5.0



good example, but since it contains rfc1918 ips it can be abused

--
senders that put my email into body content will deliver it to my own 
trashcan, so if you like to get reply, dont do it


Re: Spam rule

2013-06-06 Thread staticsafe
On Fri, Jun 07, 2013 at 01:54:37AM +0200, Benny Pedersen wrote:
 Daniel McDonald wrote on 2013-06-06 23:54:
 
 body    __LALA_B  /la{5}/
 header  __LALA_H Subject =~ /la{5}/
 header  __LALA_TRUST Received =~ /192\.162\.101\.\d{1,3}/
 meta    MY_LALA  (__LALA_B || __LALA_H) && __HAS_ANY_URI && __PDF_ATTACH && !__LALA_TRUST
 score   MY_LALA 5.0
 
 
 good example, but since it contains rfc1918 ips it can be abused
 
 -- 
 senders that put my email into body content will deliver it to my
 own trashcan, so if you like to get reply, dont do it

Not quite:
 10.0.0.0    -   10.255.255.255  (10/8 prefix)
 172.16.0.0  -   172.31.255.255  (172.16/12 prefix)
 192.168.0.0 -   192.168.255.255 (192.168/16 prefix)

192.162.*.* doesn't fall within any of those ranges.
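
A quick way to double-check an address against the three RFC 1918 blocks listed above is Python's standard ipaddress module. The helper name is made up for this sketch:

```python
import ipaddress

# The three private blocks from RFC 1918, as listed above.
RFC1918_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_rfc1918(address):
    """Return True if the IPv4 address lies inside one of the RFC 1918 blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RFC1918_NETWORKS)


print(is_rfc1918("192.162.101.1"))
print(is_rfc1918("192.168.101.1"))
```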
-- 
staticsafe
O ascii ribbon campaign - stop html mail - www.asciiribbon.org
Please don't top post - http://goo.gl/YrmAb
Don't CC me! I'm subscribed to whatever list I just posted on.


Re: Spam rule

2013-06-06 Thread Benny Pedersen

staticsafe wrote on 2013-06-07 02:02:


192.162.*.* doesn't fall within any of those ranges.


Close, but not close enough; it's late here in Denmark :)

--
senders that put my email into body content will deliver it to my own 
trashcan, so if you like to get reply, dont do it