Re: inspecting incoming tcp content

Willy Tarreau Tue, 04 Mar 2014 22:27:26 -0800

On Wed, Mar 05, 2014 at 12:55:47AM +0100, PiBa-NL wrote:
> Ok, this seems to work now that I know this. Though it has some side effects.
> 
> i could now match "param=TEST" using the following acl:
> acl PAYLOADcheck req.payload(0,0) -m reg -i 706172616d3D54455354
> 
> Case insensitive matching works 'perfectly', but only for the hex code (see 
> the D and d above); it doesn't match different cases of letters, which 
> one would probably expect. So even though i use -i, if i use the word 
> TEST in lower case it doesn't match anymore.


Indeed, you'd have to match it this way in order to match the
input bytes, not the hex string :

 acl PAYLOADcheck req.payload(0,0) -m reg -i [57]0[46]1[57]2[46]1[46]d3D[57]4[46]5[57]3[57]4
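
For longer strings, writing such a pattern by hand is error-prone. Here is a
minimal Python sketch (not part of HAProxy; the helper name hex_ci_pattern is
made up for illustration) that derives the pattern from an ASCII string. It
relies on the fact that upper and lower case ASCII letters differ only in the
high hex digit (0x54 'T' vs 0x74 't'):

```python
import re


def hex_ci_pattern(text: str) -> str:
    """Build a regex over a hex dump that matches `text` in any letter case.

    Hypothetical helper for illustration; only ASCII input is supported.
    """
    if not text.isascii():
        raise ValueError("only ASCII input is supported in this sketch")
    parts = []
    for ch in text:
        lower, upper = ch.lower(), ch.upper()
        if lower == upper:
            # Not a letter: emit the plain two-digit hex code.
            parts.append(format(ord(ch), "02x"))
        else:
            lo, up = format(ord(lower), "02x"), format(ord(upper), "02x")
            # Letters share the low hex digit; accept either high digit.
            parts.append(f"[{up[0]}{lo[0]}]{lo[1]}")
    return "".join(parts)


pattern = hex_ci_pattern("param=TEST")
# IGNORECASE here plays the role of "-i": it only affects hex digits a-f.
regex = re.compile(pattern, re.IGNORECASE)
print(pattern)
print(bool(regex.search(b"Param=Test".hex())))
```

The generated pattern differs from the one above only in the case of the hex
digit "d", which is exactly what -i absorbs.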

> There might be a workaround for that with the ",lower" option (i didn't 
> confirm whether that is applied before the hex conversion.)

Yes, that would be much easier. The match is performed in this order :

  1) sample fetch function. Here, it is req.payload().
  2) converters. Here none, unless you add ",lower"
  3) cast to the input type of the ACL match (here, "reg" takes a string
     so it remains the same)
  4) execution of the match function (here "reg") for all patterns.
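
As a rough illustration only, the four steps can be modelled in Python like
this. It is a toy model, not HAProxy's code: evaluate_acl and its arguments
are hypothetical names, the hex cast mirrors the behaviour discussed in this
thread, and it assumes the converter runs on the raw bytes before the hex
conversion:

```python
import re
from typing import Callable, Iterable, Sequence


def evaluate_acl(
    fetch: Callable[[], bytes],
    converters: Sequence[Callable[[bytes], bytes]],
    cast: Callable[[bytes], str],
    patterns: Iterable[str],
) -> bool:
    """Toy model of the four evaluation steps (hypothetical, for illustration)."""
    sample = fetch()                  # 1) sample fetch function
    for convert in converters:        # 2) converters, applied in order
        sample = convert(sample)
    value = cast(sample)              # 3) cast to the match's input type
    # 4) run the match function against every pattern
    return any(re.search(p, value) for p in patterns)


payload = b"PARAM=Test"
lowercase_hex = b"param=test".hex()   # hex form of the lower-cased text

# With a lower-casing converter, a single plain pattern is enough.
print(evaluate_acl(lambda: payload, [bytes.lower], bytes.hex, [lowercase_hex]))
# Without it, the same pattern misses mixed-case input.
print(evaluate_acl(lambda: payload, [], bytes.hex, [lowercase_hex]))
```

Under those assumptions the first call matches and the second does not, which
is why lower-casing in step 2 is simpler than spelling out both cases in the
pattern.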

> Also the current documentation gives several examples which indicate a 
> different working:
> "
> On systems where the regex library is much slower when using "-i", it is 
> possible to convert the sample to lowercase before matching, like this : 
> acl script_tag payload(0,500),lower -m reg <script>
> "
> This doesn't work for detecting the text "<script>", as its hex 
> equivalent would have to be there instead. Also, if fewer than 500 bytes 
> are sent in the initial request, it doesn't match at all.

You're absolutely right. We really need to change this confusing behaviour
before the release. I'm sure we'll break one or two setups, but we're
still in the development phase until we release, and the fix will be
trivial.

> So seems like this part of the manual could use a little more 
> clarification. (Praise though for the overall completeness/clarity of 
> the manual!)

I tend to consider that the doc is the reference which people use to
write their confs. So when something has never worked properly,
I prefer to make the code work as documented rather than fix the doc.

> Though if the implementation now changes to match the manual, possibly 
> with an additional tohex option, that would be great.

Yes it will be necessary so that the very few users (if any) who rely on
the current behaviour can fix their configs.

> As it's used in mode tcp, 
> the option should certainly exist to match binary/hex values that cannot 
> be easily expressed with normal text. So the original design 
> implementation does make sense, just not for 'textual' protocols.

I agree.

Thanks,
Willy
