Re: max length of pcre rule?

2010-03-29 Thread Wietse Venema
Louis-David Mitterrand:
> Hi,
> 
> I am using an (insanely) long pcre (see below) to reject
> African/Chinese/etc. spam that relays through large ISPs. And now it
> seems I have reached a limit. When trying to add one more
> expression with a set of () parens I get this error:
> 
>   postmap: warning: pcre map /etc/postfix/header_access_local, line 2: too many (...)

Postfix logs this when pcre_exec() returns a match count of zero.
According to documentation:

   If the vector is too small to hold all the captured substring  offsets,
   it is used as far as possible (up to two-thirds of its length), and the
   function returns a value of zero. 

About 10 years ago, someone decided that Postfix needs no more than
99 () in a PCRE regular expression. I suppose this is one of many
things in Postfix that should eventually be made configurable.

For now, you would have to edit dict_pcre.c, and update the
PCRE_MAX_CAPTURE constant.
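
One way to see how close a pattern is to that limit is to count its capturing groups. The sketch below uses Python's `re` module as a stand-in for PCRE, on the assumption that both count `(...)` groups the same way for this syntax. The value 99 is taken from the description above, and `count_groups` is a hypothetical helper name.

```python
import re

# Limit described above for Postfix PCRE maps (PCRE_MAX_CAPTURE in dict_pcre.c).
MAX_CAPTURE = 99

def count_groups(pattern: str) -> int:
    """Return the number of capturing groups in a regular expression."""
    return re.compile(pattern).groups

# The shared header-name prefix from the original question.
prefix = (r"Received|X-((Origin(ating)?|Client|MDRemote|Sender)-?IP"
          r"|(Client|Remote_)Addr|PHP-Script)")

used = count_groups(prefix)
print(used, "of", MAX_CAPTURE, "capturing groups are used by the prefix alone")
```

Every additional `(...)` in the long rule adds to this count, which is how a single extra network can push the expression over the limit.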

Wietse


Re: max length of pcre rule?

2010-03-29 Thread Steve

 Original Message 
> Date: Mon, 29 Mar 2010 14:54:47 +0200
> From: Louis-David Mitterrand 
> To: postfix-users@postfix.org
> Subject: max length of pcre rule?

> Hi,
> 
Hello,


> I am using an (insanely) long pcre (see below) to reject
> African/Chinese/etc. spam that relays through large ISPs. And now it
> seems I have reached a limit. When trying to add one more
> expression with a set of () parens I get this error:
> 
>   postmap: warning: pcre map /etc/postfix/header_access_local, line 2: too many (...)
> 
> Is this a limitation (or sanity check) of the pcre engine?
> 
I think it is a hardcoded limit in Postfix.


> My rule is long because I need to share the prefix:
> 
>   Received|X-((Origin(ating)?|Client|MDRemote|Sender)-?IP|(Client|Remote_)Addr|PHP-Script)
> 
> and would rather not edit it more than once each time a new variation of
> 'X-Originating-IP' appears.
> 
> Any suggestion on improving the following rule is welcome.
> 
You can wrap the regexp into an if statement:
--
if /^Received|X-((Origin(ating)?|Client|MDRemote|Sender)-?IP|(Client|Remote_)Addr|PHP-Script):/
/[^:]*.+\b(41\.245.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(60\.1(6[6-9]|7[0-5])\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(41\.184\.(3[2-9]|4[0-7])\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(112\.110\.(46|61|9[6-9]|1([01]\d|2[0-7]))\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(41\.214\.(3[2-9]|4[0-7]|9[6-9])\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(211\.144\.(6[4-9]|[78]\d|9[0-5])\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(192\.83\.191\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(41\.2[78]\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(121\.148\.199\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(112\.20[0-7]\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(119\.(9[6-9]|10[0-3])\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(117\.(2[4-9]|3[01])\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(123\.16[0-3]\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(222\.8[89]\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(41\.138\.1([678]\d|9[01])\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(195\.78\.11[23]\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(61\.54\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(80\.255\.61\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(213\.255\.1(2[89]|[3-5]\d)\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(86\.62\.([0-5]?\d|6[0-3])\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(41.221.194\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(41\.29\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(218\.1[3-8].\d+\.\d+)\b/ REJECT aviso.ci junk 2
/[^:]*.+\b(213\.136\.(9[6-9]|1([01]\d|2[0-7]))\.\d+)\b/ REJECT aviso.ci junk 2
/

Re: max length of pcre rule?

2010-03-29 Thread Steve

 Original Message 
> Date: Mon, 29 Mar 2010 16:35:49 +0200
> From: "Steve" 
> To: postfix-users@postfix.org
> Subject: Re: max length of pcre rule?

> 
>  Original Message 
> > Date: Mon, 29 Mar 2010 14:54:47 +0200
> > From: Louis-David Mitterrand 
> > To: postfix-users@postfix.org
> > Subject: max length of pcre rule?
> 
> > Hi,
> > 
> Hello,
> 
> 
> > I am using an (insanely) long pcre (see below) to reject
> > African/Chinese/etc. spam that relays through large ISPs. And now it
> > seems I have reached a limit. When trying to add one more
> > expression with a set of () parens I get this error:
> > 
> > postmap: warning: pcre map /etc/postfix/header_access_local, line 2: too many (...)
> > 
> > Is this a limitation (or sanity check) of the pcre engine?
> > 
> I think it is a hardcoded limit in Postfix.
> 
> 
> > My rule is long because I need to share the prefix:
> > 
> >   Received|X-((Origin(ating)?|Client|MDRemote|Sender)-?IP|(Client|Remote_)Addr|PHP-Script)
> > 
> > and would rather not edit it more than once each time a new variation of
> > 'X-Originating-IP' appears.
> > 
> > Any suggestion on improving the following rule is welcome.
> > 
> You can wrap the regexp into an if statement:
> --
> if /^Received|X-((Origin(ating)?|Client|MDRemote|Sender)-?IP|(Client|Remote_)Addr|PHP-Script):/
> /[^:]*.+\b(41\.245.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(60\.1(6[6-9]|7[0-5])\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(41\.184\.(3[2-9]|4[0-7])\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(112\.110\.(46|61|9[6-9]|1([01]\d|2[0-7]))\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(41\.214\.(3[2-9]|4[0-7]|9[6-9])\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(211\.144\.(6[4-9]|[78]\d|9[0-5])\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(192\.83\.191\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(41\.2[78]\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(121\.148\.199\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(112\.20[0-7]\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(119\.(9[6-9]|10[0-3])\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(117\.(2[4-9]|3[01])\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(123\.16[0-3]\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(222\.8[89]\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(41\.138\.1([678]\d|9[01])\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(195\.78\.11[23]\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(61\.54\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(80\.255\.61\.\d+)\b/ REJECT aviso.ci junk 2
> /[^:]*.+\b(213\.255\.1(2[89]|[3-5]\d)\.\d+)\b/
>  

Re: max length of pcre rule?

2010-03-29 Thread Louis-David Mitterrand
On Mon, Mar 29, 2010 at 04:38:17PM +0200, Steve wrote:
> 
> > 
> Ohhh boy. Now looking at the regexp I see an error. Every line
> starting with "/[^:]*.+" should be replaced by "/[^:]*:.+". Sorry for
> that.
> 

Hi Steve,

Your if/endif suggestion for the prefix is interesting.

For added safety, the individual rules should be anchored with ^ and the
bracketed atom plussed, no?

/^[^:]+:.+

Thanks,


Re: max length of pcre rule?

2010-03-29 Thread Steve

 Original Message 
> Date: Mon, 29 Mar 2010 16:44:58 +0200
> From: Louis-David Mitterrand 
> To: postfix-users@postfix.org
> Subject: Re: max length of pcre rule?

> On Mon, Mar 29, 2010 at 04:38:17PM +0200, Steve wrote:
> > 
> > > 
> > Ohhh boy. Now looking at the regexp I see an error. Every line
> > starting with "/[^:]*.+" should be replaced by "/[^:]*:.+". Sorry for
> > that.
> > 
> 
> Hi Steve,
> 
Hello Louis-David,


> You if/endif suggestion for the prefix is interesting.
> 
> For added safety, the individual rules should be anchored with ^ and the
> bracketed atom plussed, no?
> 
> /^[^:]+:.+
> 
Yes, you are right. But to be honest, this should be enough (just an example):
001) if /^Received|X\-((Origin(ating)?|Client|MDRemote|Sender)\-?IP|(Client|Remote_)Addr|PHP\-Script):/
002) /\b(127\.0\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
003) endif


* Rule 001 matches a specific header.
* Rule 002 matches 127.0.xxx.xxx.
* 127.0.xxx.xxx could be anchored with ^, but the if-condition in 001 already
  ensures that 127.0.xxx.xxx is not part of the header name. So you can shorten
  the regexp to just "/\b(<check/rule>)\b/ REJECT blah-blah-blah".
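
The effect of the if/endif block can be imitated outside Postfix to test rules before deploying them. This is only a rough sketch in Python, not how Postfix evaluates the table internally; `check_header` and the `RULES` list are hypothetical names, and Python's `re` is assumed to behave like PCRE for these patterns.

```python
import re

# Outer "if" condition: only headers with a matching name are examined further.
# Postfix pcre tables match case-insensitively by default, hence IGNORECASE.
HEADER_IF = re.compile(
    r"^Received|X-((Origin(ating)?|Client|MDRemote|Sender)-?IP"
    r"|(Client|Remote_)Addr|PHP-Script):",
    re.IGNORECASE,
)

# Inner rules: one short pattern per network, each with its own action.
RULES = [
    (re.compile(r"\b(127\.0\.\d+\.\d+)\b"), "REJECT aviso.ci junk 2"),
]

def check_header(line: str):
    """Return the action of the first matching inner rule, or None."""
    if not HEADER_IF.search(line):
        return None  # "if" condition failed, so the inner rules are skipped
    for pattern, action in RULES:
        if pattern.search(line):
            return action
    return None

print(check_header("X-Originating-IP: [127.0.0.1]"))
print(check_header("Subject: ping 127.0.0.1"))
```

The second call returns nothing because the header name never satisfies the outer condition, which is why the inner rules do not need their own anchoring.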



> Thanks,
>
No problem.


// Steve


Re: max length of pcre rule?

2010-03-29 Thread Henrik K
On Mon, Mar 29, 2010 at 09:13:31AM -0400, Wietse Venema wrote:
> Louis-David Mitterrand:
> > Hi,
> > 
> > I am using an (insanely) long pcre (see below) to reject
> > African/Chinese/etc. spam that relays through large ISPs. And now it
> > seems I have reached a limit. When trying to add one more
> > expression with a set of () parens I get this error:
> > 
> > postmap: warning: pcre map /etc/postfix/header_access_local, line 2: too many (...)
> 
> Postfix logs this when pcre_exec() returns a match count of zero.
> According to documentation:
> 
>    If the vector is too small to hold all the captured substring offsets,
>    it is used as far as possible (up to two-thirds of its length), and the
>    function returns a value of zero.
> 
> About 10 years ago, someone decided that Postfix needs no more than
> 99 () in a PCRE regular expression. I suppose this is one of many
> things in Postfix that should eventually be made configurable.
> 
> For now, you would have to edit dict_pcre.c, and update the
> PCRE_MAX_CAPTURE constant.

I guess this means the rule can simply be fixed by not capturing anything,
always using non-capturing groups: (?:foobar)
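
As an illustration, the shared prefix from the original question can be rewritten with `(?:...)` throughout. The sketch below uses Python's `re` as a stand-in for PCRE and assumes the rule's action text does not rely on `$1`-style substitutions, since those need real capture groups.

```python
import re

capturing = (r"X-((Origin(ating)?|Client|MDRemote|Sender)-?IP"
             r"|(Client|Remote_)Addr|PHP-Script)")
non_capturing = (r"X-(?:(?:Origin(?:ating)?|Client|MDRemote|Sender)-?IP"
                 r"|(?:Client|Remote_)Addr|PHP-Script)")

# Number of capturing groups in each variant.
print(re.compile(capturing).groups)
print(re.compile(non_capturing).groups)

# Both variants should accept and reject the same header names.
samples = ("X-Originating-IP", "X-Client-IP", "X-Remote_Addr",
           "X-PHP-Script", "X-Mailer")
for header in samples:
    same = bool(re.fullmatch(capturing, header)) == bool(re.fullmatch(non_capturing, header))
    print(header, "same result" if same else "DIFFERENT result")
```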



Re: max length of pcre rule?

2010-03-29 Thread Steve

 Original Message 
> Date: Mon, 29 Mar 2010 16:44:58 +0200
> From: Louis-David Mitterrand 
> To: postfix-users@postfix.org
> Subject: Re: max length of pcre rule?

> On Mon, Mar 29, 2010 at 04:38:17PM +0200, Steve wrote:
> > 
> > > 
> > Ohhh boy. Now looking at the regexp I see an error. Every line
> > starting with "/[^:]*.+" should be replaced by "/[^:]*:.+". Sorry for
> > that.
> > 
> 
> Hi Steve,
> 
Hello Louis-David,


> You if/endif suggestion for the prefix is interesting.
> 
> For added safety, the individual rules should be anchored with ^ and the
> bracketed atom plussed, no?
> 
> /^[^:]+:.+
> 
I have fixed some issues in your regexp and sorted the rules:
if /^Received|^X\-((Origin(ating)?|Client|MDRemote|Sender)\-?IP|(Client|Remote_)Addr|PHP\-Script):/
/\b(41\.1(6\d|7[0-5])\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.138\.1([678]\d|9[01])\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.184\.(1[6-9]|2\d|3[01])\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.184\.(3[2-9]|4[0-7])\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.189\.([1-3]?\d|4[0237]|5[0-6]|9[6-9]|1([01]\d|2[0-7]))\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.191\.(6[89]|7[01]|8[4-7])\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.191\.1(0[89]|1[01])\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.19\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.202\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.203\.(6[4-9]|[78]\d|9[0-5]|2(2[4-9]|3[0-9]))\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.204\.2(2[4-7]|[34]\d|5[0-5])\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.205\.1(6[57]|72)\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.207\.([0-9]|1[5-9]|2[0-9]|3[01]|1([6-9]\d)|2([01]\d|2[0-3]))\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.208\.(1(2[89]|[3-9]\d)|2(0[0-7]))\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.211\.([0-3]|19[2-9]|2([0-4]\d|5[0-5]))\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.213\.(\d?\d|1([01]\d|2[0-7]))\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.214\.(3[2-9]|4[0-7]|9[6-9])\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.215\.1(6\d|7[0-5])\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.216\.(3[2-9]|[45]\d|6[0-3])\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.217\.([1-9]?\d|1([01]\d|2[0-7]))\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.218\.(19[2-9]|2([0-4]\d|5[0-5]))\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.218\.(19[2-9]|2([01]\d|2[0-3]))\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.219\.(1(2[89]|[3-8]\d|9[016])|24[24])\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.220\.(75|1(7[6-9]|8\d|9[01]))\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.221\.194\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.222\.19[2-5]\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.223\.2(48|51)\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.232\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.245\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.25[2-5]\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.26\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.29\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.2[78]\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.30\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(41\.31\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(58\.(4[89]|5[0-5])\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(58\.2(0[89]|1\d|2[0-3])\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(60\.1(6[6-9]|7[0-5])\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(60\.2(0[89]|1[0-7])\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
/\b(61\.134\.0\.\d+)\b/ 

Re: max length of pcre rule?

2010-03-29 Thread Steve

 Original Message 
> Date: Mon, 29 Mar 2010 17:12:58 +0200
> From: "Steve" 
> To: postfix-users@postfix.org
> Subject: Re: max length of pcre rule?

> 
>  Original Message 
> > Date: Mon, 29 Mar 2010 16:44:58 +0200
> > From: Louis-David Mitterrand 
> > To: postfix-users@postfix.org
> > Subject: Re: max length of pcre rule?
> 
> > On Mon, Mar 29, 2010 at 04:38:17PM +0200, Steve wrote:
> > > 
> > > > 
> > > Ohhh boy. Now looking at the regexp I see an error. Every line
> > > starting with "/[^:]*.+" should be replaced by "/[^:]*:.+". Sorry for
> > > that.
> > > 
> > 
> > Hi Steve,
> > 
> Hello Louis-David,
> 
> 
> > You if/endif suggestion for the prefix is interesting.
> > 
> > For added safety, the individual rules should be anchored with ^ and the
> > bracketed atom plussed, no?
> > 
> > /^[^:]+:.+
> > 
> I have fixed some issues in your regexp and sorted the rules:
> if /^Received|^X\-((Origin(ating)?|Client|MDRemote|Sender)\-?IP|(Client|Remote_)Addr|PHP\-Script):/
> /\b(41\.1(6\d|7[0-5])\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.138\.1([678]\d|9[01])\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.184\.(1[6-9]|2\d|3[01])\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.184\.(3[2-9]|4[0-7])\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.189\.([1-3]?\d|4[0237]|5[0-6]|9[6-9]|1([01]\d|2[0-7]))\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.191\.(6[89]|7[01]|8[4-7])\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.191\.1(0[89]|1[01])\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.19\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.202\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.203\.(6[4-9]|[78]\d|9[0-5]|2(2[4-9]|3[0-9]))\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.204\.2(2[4-7]|[34]\d|5[0-5])\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.205\.1(6[57]|72)\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.207\.([0-9]|1[5-9]|2[0-9]|3[01]|1([6-9]\d)|2([01]\d|2[0-3]))\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.208\.(1(2[89]|[3-9]\d)|2(0[0-7]))\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.211\.([0-3]|19[2-9]|2([0-4]\d|5[0-5]))\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.213\.(\d?\d|1([01]\d|2[0-7]))\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.214\.(3[2-9]|4[0-7]|9[6-9])\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.215\.1(6\d|7[0-5])\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.216\.(3[2-9]|[45]\d|6[0-3])\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.217\.([1-9]?\d|1([01]\d|2[0-7]))\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.218\.(19[2-9]|2([0-4]\d|5[0-5]))\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.218\.(19[2-9]|2([01]\d|2[0-3]))\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.219\.(1(2[89]|[3-8]\d|9[016])|24[24])\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.220\.(75|1(7[6-9]|8\d|9[01]))\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.221\.194\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.222\.19[2-5]\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.223\.2(48|51)\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.232\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.245\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.25[2-5]\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.26\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.29\.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> /\b(41\.2[78]\.\d+\.\d+)\b/ 

Re: max length of pcre rule?

2010-03-29 Thread Louis-David Mitterrand
On Mon, Mar 29, 2010 at 04:55:19PM +0200, Steve wrote:
> > You if/endif suggestion for the prefix is interesting.
> > 
> > For added safety, the individual rules should be anchored with ^ and the
> > bracketed atom plussed, no?
> > 
> > /^[^:]+:.+
> > 
> Yes. You are right. But to be honest this should be enough (just an example):
> 001) if /^Received|X\-((Origin(ating)?|Client|MDRemote|Sender)\-?IP|(Client|Remote_)Addr|PHP\-Script):/
> 002) /\b(127\.0.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> 003) endif
> 
> 
> * Rule 001 will match a specific header.
> * Rule 002 will match 127.0.xxx.xxx
> * 127.0.xxx.xxx could be anchored with ^ but the rule/if-condition in
> 001 is already taking care of that 127.0.xxx.xxx is not part of the
> header name. So you can shorten the regexp to just
> "/\b(<check/rule>)\b/ REJECT blah-blah-blah"

Indeed, on second thought the anchoring is useless in individual rules;
dropping it makes them much more readable/manageable.

Thanks for taking the time to de-parse my giga-rule into its component
parts!


Re: max length of pcre rule?

2010-03-29 Thread Louis-David Mitterrand
On Mon, Mar 29, 2010 at 05:16:39PM +0200, Steve wrote:
> > 
> Ach. Again. I made errors. Sorry. It's hard to write here in such a
> small edit box in a web interface. The above is not 100% correct. What
> I wanted to write is:

You need the 'itsalltext' firefox extension to edit any web textarea
with your $EDITOR.

https://addons.mozilla.org/en-US/firefox/addon/4125

> --
> /\b(41\.26\.\d+\.\d+)\b/  REJECT aviso.ci junk 2
> /\b(41\.29\.\d+\.\d+)\b/  REJECT aviso.ci junk 2
> /\b(41\.2[78]\.\d+\.\d+)\b/   REJECT aviso.ci junk 2
> /\b(41\.30\.\d+\.\d+)\b/  REJECT aviso.ci junk 2
> /\b(41\.31\.\d+\.\d+)\b/  REJECT aviso.ci junk 2
> 
> Could be shortened to:
> /\b(41\.(2[6-9]|3[01])\.\d+\.\d+)\b/  REJECT aviso.ci junk 2
> --

And of course your line-by-line rules make it so much easier to sort
and regroup them.
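
A merge like the one quoted above can be checked mechanically before it goes into the table. The following Python sketch compares the five separate rules with the shortened one for every possible second octet; Python's `re` is used as a stand-in for PCRE, and the sample address format is an assumption.

```python
import re

separate = [
    r"\b(41\.26\.\d+\.\d+)\b",
    r"\b(41\.29\.\d+\.\d+)\b",
    r"\b(41\.2[78]\.\d+\.\d+)\b",
    r"\b(41\.30\.\d+\.\d+)\b",
    r"\b(41\.31\.\d+\.\d+)\b",
]
merged = r"\b(41\.(2[6-9]|3[01])\.\d+\.\d+)\b"

def any_match(patterns, text):
    """True if at least one of the patterns matches somewhere in text."""
    return any(re.search(p, text) for p in patterns)

# Compare both forms for every possible second octet.
for octet in range(256):
    sample = f"[41.{octet}.1.2]"
    assert any_match(separate, sample) == bool(re.search(merged, sample)), octet

print("merged rule agrees with the five separate rules")
```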

Cheers,


Re: max length of pcre rule?

2010-03-29 Thread Henrik K
On Mon, Mar 29, 2010 at 05:17:22PM +0200, Louis-David Mitterrand wrote:
> On Mon, Mar 29, 2010 at 04:55:19PM +0200, Steve wrote:
> > > You if/endif suggestion for the prefix is interesting.
> > > 
> > > For added safety, the individual rules should be anchored with ^ and the
> > > bracketed atom plussed, no?
> > > 
> > > /^[^:]+:.+
> > > 
> > Yes. You are right. But to be honest this should be enough (just an example):
> > 001) if /^Received|X\-((Origin(ating)?|Client|MDRemote|Sender)\-?IP|(Client|Remote_)Addr|PHP\-Script):/
> > 002) /\b(127\.0.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> > 003) endif
> > 
> > 
> > * Rule 001 will match a specific header.
> > * Rule 002 will match 127.0.xxx.xxx
> > * 127.0.xxx.xxx could be anchored with ^ but the rule/if-condition in
> > 001 is already taking care of that 127.0.xxx.xxx is not part of the
> > header name. So you can shorten the regexp to just
> > "/\b(<check/rule>)\b/ REJECT blah-blah-blah"
> 
> Indeed, on second thought the anchoring is useless in individual rules,
> making it much more readable/managable.
> 
> Thanks for taking to time to de-parse my giga-rule into its component
> parts!

In theory that's quite inefficient. Given your traffic it might not make a
difference.

A better approach would be keeping all the IPs etc in a file and generating
the rule using for example perl + Regexp::Assemble.
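
The same idea can be sketched in Python. This is not a port of Regexp::Assemble, which also merges common prefixes into an optimized expression; the sketch only joins fragments into one alternation. `build_rule` is a hypothetical helper, and the fragment list stands in for a file with one fragment per line.

```python
import re

def build_rule(fragments, action):
    """Join per-network regex fragments into a single pcre table line."""
    # Non-capturing groups keep the capture count of the result at zero.
    alternation = "|".join(f"(?:{frag})" for frag in fragments)
    return f"/\\b(?:{alternation})\\b/ {action}"

# In practice these would be read from a file, one fragment per line.
fragments = [
    r"41\.245\.\d+\.\d+",
    r"60\.1(?:6[6-9]|7[0-5])\.\d+\.\d+",
    r"192\.83\.191\.\d+",
]

rule = build_rule(fragments, "REJECT aviso.ci junk 2")
print(rule)

# Sanity check: the generated pattern compiles and has no capturing groups.
pattern = rule.split("/")[1]
print(re.compile(pattern).groups)
```

Keeping the fragments in a plain file and regenerating the table line means the shared prefix and the action text are edited in exactly one place.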



Re: max length of pcre rule?

2010-03-29 Thread Steve

 Original Message 
> Date: Mon, 29 Mar 2010 19:00:36 +0300
> From: Henrik K 
> To: postfix-users@postfix.org
> Subject: Re: max length of pcre rule?

> On Mon, Mar 29, 2010 at 05:17:22PM +0200, Louis-David Mitterrand wrote:
> > On Mon, Mar 29, 2010 at 04:55:19PM +0200, Steve wrote:
> > > > You if/endif suggestion for the prefix is interesting.
> > > > 
> > > > For added safety, the individual rules should be anchored with ^ and the
> > > > bracketed atom plussed, no?
> > > > 
> > > > /^[^:]+:.+
> > > > 
> > > Yes. You are right. But to be honest this should be enough (just an example):
> > > 001) if /^Received|X\-((Origin(ating)?|Client|MDRemote|Sender)\-?IP|(Client|Remote_)Addr|PHP\-Script):/
> > > 002) /\b(127\.0.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> > > 003) endif
> > > 
> > > 
> > > * Rule 001 will match a specific header.
> > > * Rule 002 will match 127.0.xxx.xxx
> > > * 127.0.xxx.xxx could be anchored with ^ but the rule/if-condition in
> > > 001 is already taking care of that 127.0.xxx.xxx is not part of the
> > > header name. So you can shorten the regexp to just
> > > "/\b(<check/rule>)\b/ REJECT blah-blah-blah"
> > 
> > Indeed, on second thought the anchoring is useless in individual rules,
> > making it much more readable/managable.
> > 
> > Thanks for taking to time to de-parse my giga-rule into its component
> > parts!
> 
> In theory that's quite inefficient.
>
What is inefficient? The combining of rules or the splitting?


> Given your traffic it might not make a
> difference.
> 
> A better approach would be keeping all the IPs etc in a file and
> generating
> the rule using for example perl + Regexp::Assemble.



Re: max length of pcre rule?

2010-03-29 Thread Henrik K
On Mon, Mar 29, 2010 at 06:13:15PM +0200, Steve wrote:
> 
>  Original Message 
> > Date: Mon, 29 Mar 2010 19:00:36 +0300
> > From: Henrik K 
> > To: postfix-users@postfix.org
> > Subject: Re: max length of pcre rule?
> 
> > On Mon, Mar 29, 2010 at 05:17:22PM +0200, Louis-David Mitterrand wrote:
> > > On Mon, Mar 29, 2010 at 04:55:19PM +0200, Steve wrote:
> > > > > You if/endif suggestion for the prefix is interesting.
> > > > > 
> > > > > For added safety, the individual rules should be anchored with ^ and the
> > > > > bracketed atom plussed, no?
> > > > > 
> > > > > /^[^:]+:.+
> > > > > 
> > > > Yes. You are right. But to be honest this should be enough (just an example):
> > > > 001) if /^Received|X\-((Origin(ating)?|Client|MDRemote|Sender)\-?IP|(Client|Remote_)Addr|PHP\-Script):/
> > > > 002) /\b(127\.0.\d+\.\d+)\b/ REJECT aviso.ci junk 2
> > > > 003) endif
> > > > 
> > > > 
> > > > * Rule 001 will match a specific header.
> > > > * Rule 002 will match 127.0.xxx.xxx
> > > > * 127.0.xxx.xxx could be anchored with ^ but the rule/if-condition in
> > > > 001 is already taking care of that 127.0.xxx.xxx is not part of the
> > > > header name. So you can shorten the regexp to just
> > > > "/\b(<check/rule>)\b/ REJECT blah-blah-blah"
> > > 
> > > Indeed, on second thought the anchoring is useless in individual rules,
> > > making it much more readable/managable.
> > > 
> > > Thanks for taking to time to de-parse my giga-rule into its component
> > > parts!
> > 
> > In theory that's quite inefficient.
> >
> What is inefficient? The combining of rules or the splitting?

Executing a bunch of expressions instead of one.



Re: max length of pcre rule?

2010-03-31 Thread mouss
Steve wrote:
>  Original Message 
>> Date: Mon, 29 Mar 2010 16:44:58 +0200
>> From: Louis-David Mitterrand 
>> To: postfix-users@postfix.org
>> Subject: Re: max length of pcre rule?
> 
>> On Mon, Mar 29, 2010 at 04:38:17PM +0200, Steve wrote:
>>> Ohhh boy. Now looking at the regexp I see an error. Every line
>>> starting with "/[^:]*.+" should be replaced by "/[^:]*:.+". Sorry for
>>> that.
>>>
>> Hi Steve,
>>
> Hello Louis-David,
> 
> 
>> You if/endif suggestion for the prefix is interesting.
>>
>> For added safety, the individual rules should be anchored with ^ and the
>> bracketed atom plussed, no?
>>
>> /^[^:]+:.+
>>
> I have fixed some issues in your regexp and sorted the rules:
> if 
> /^Received|^X\-((Origin(ating)?|Client|MDRemote|Sender)\-?IP|(Client|Remote_)Addr|PHP\-Script):/
> /\b(41\.1(6\d|7[0-5])\.\d+\.\d+)\b/   
> REJECT aviso.ci junk 2
> [snip]

you're not trying to implement an IP BL using string matches in
header_checks, are you? This is inefficient.

if you want to do that, write a content_filter/proxy_filter/milter that
extracts the string, converts it to an IP and checks that in a cidr map.
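
The lookup step of such a filter might look roughly like the sketch below. It is not a milter or content filter, only the "extract, convert, check" part, written in Python with the standard `ipaddress` module. The `BLOCKED` networks are hypothetical examples derived from ranges mentioned in this thread, and `blocked_ips` is a made-up helper name.

```python
import ipaddress
import re

# Hypothetical blocklist; a real setup would load it from a cidr map file.
BLOCKED = [ipaddress.ip_network(n) for n in (
    "41.245.0.0/16",
    "41.184.32.0/20",
    "192.83.191.0/24",
)]

# Candidate dotted quads; validation is left to the ipaddress module.
IPV4 = re.compile(r"\b(\d{1,3}(?:\.\d{1,3}){3})\b")

def blocked_ips(header_value: str):
    """Return the addresses in a header value that fall inside a blocked network."""
    hits = []
    for text in IPV4.findall(header_value):
        try:
            ip = ipaddress.ip_address(text)
        except ValueError:
            continue  # looks like an address but is not one, e.g. 999.1.1.1
        if any(ip in net for net in BLOCKED):
            hits.append(text)
    return hits

print(blocked_ips("from mail.example (unknown [41.184.40.9]) by mx.example"))
```

Comparing parsed addresses against networks avoids encoding numeric ranges as digit patterns, which is where most of the regexp mistakes earlier in the thread came from.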

of course, this is already implemented in spamassassin... if you're
avoiding SA because of performance, we're sure you'll get back to
country after some travel:)


Re: max length of pcre rule?

2010-04-01 Thread Henrik K
On Thu, Apr 01, 2010 at 12:04:59AM +0200, mouss wrote:
> Steve wrote:
> >  Original Message 
> >> Date: Mon, 29 Mar 2010 16:44:58 +0200
> >> From: Louis-David Mitterrand 
> >> To: postfix-users@postfix.org
> >> Subject: Re: max length of pcre rule?
> > 
> >> On Mon, Mar 29, 2010 at 04:38:17PM +0200, Steve wrote:
> >>> Ohhh boy. Now looking at the regexp I see an error. Every line
> >>> starting with "/[^:]*.+" should be replaced by "/[^:]*:.+". Sorry for
> >>> that.
> >>>
> >> Hi Steve,
> >>
> > Hello Louis-David,
> > 
> > 
> >> You if/endif suggestion for the prefix is interesting.
> >>
> >> For added safety, the individual rules should be anchored with ^ and the
> >> bracketed atom plussed, no?
> >>
> >> /^[^:]+:.+
> >>
> > I have fixed some issues in your regexp and sorted the rules:
> > if 
> > /^Received|^X\-((Origin(ating)?|Client|MDRemote|Sender)\-?IP|(Client|Remote_)Addr|PHP\-Script):/
> > /\b(41\.1(6\d|7[0-5])\.\d+\.\d+)\b/ 
> > REJECT aviso.ci junk 2
> > [snip]
> 
> you're not trying to implement an IP BL using string matches in
> header_checks, are you? This is inefficient.

So what is the "mouss" limit? Checking 1 IP? 10? 100? 1000? 1?

You are underestimating big optimized PCREs. I just tried the one from
original post and got 15000 mails grepped per second. I didn't look if the
expression could be optimized more.

> if you want to do that, write a content_filter/proxy_filter/milter that
> extracts the string, converts it to an IP and checks that in a cidr map.
>
> of course, this is already implemented in spamassassin... if you're
> avoiding SA because of performances, we're sure you'll get back to
> country after some travel:)

Your suggestion has no merit if someone really really wants to directly
block some IPs by header. There is no need to have big filter overhead if
they aren't used otherwise. Can SA handle 15000 mails/s?

The only thing I'd be a little careful about is not falsely hitting anything
in Received, since there could be exotic version strings etc.