On Wed, 17 Feb 2010, Karsten Bräckelmann wrote:

rawbody STYLE_GIBBERISH /<style[^>]{0,30}>(?:\s{1,20}|[^\s:;<]){175}/im

The problem is nested quantifiers with an alternation.
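To put a rough number on that ambiguity: a single run of whitespace can be divided among repeated `\s{1,20}` iterations in many different ways, and a backtracking engine may have to try a large share of them before it can report a failure. The sketch below is Python rather than Perl, and `count_splits` is a made-up helper name. It only counts the possible divisions; it is an illustration of the combinatorics, not a measurement of what the regex engine actually does.

```python
def count_splits(run_length, max_chunk=20):
    """Count the ways a run of whitespace can be cut into ordered
    chunks of 1..max_chunk characters, i.e. the ways repeated
    \\s{1,20} iterations could consume it."""
    ways = [0] * (run_length + 1)
    ways[0] = 1  # the empty run can be consumed in exactly one way
    for length in range(1, run_length + 1):
        # The last chunk takes k characters; the rest is a shorter run.
        for k in range(1, min(length, max_chunk) + 1):
            ways[length] += ways[length - k]
    return ways[run_length]


for n in (5, 10, 20, 21):
    print(n, count_splits(n))
```

The count roughly doubles with each extra whitespace character, which is why a long run of blanks followed by a non-matching character can get expensive.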

An alternative approach that should match the desired text would look like
this -- eliminating the alternation with quantifiers inside.

 / (?: \s{1,20} [^\s:;<]{1,80} ){80} /x    # spaces for readability

Since there is no alternation and the two char classes are distinct, this RE can be simply expanded and matched from left to right, without any ambiguity.

John, does the above example help? :)

Not enough. What if there are more than 20 spaces? Or no spaces in a block of more than 80 non-punctuation characters?
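Those two cases can be checked quickly. The sketch below uses Python's `re` module as a stand-in for Perl, drops the `<style ...>` prefix, and anchors both patterns at the start of the text, which is an assumption standing in for "right after the style tag". The sample strings are made up for the illustration.

```python
import re

# The repeated groups from the two rules, without the <style ...> prefix.
ORIGINAL = re.compile(r"(?:\s{1,20}|[^\s:;<]){175}")
REWRITTEN = re.compile(r"(?:\s{1,20}[^\s:;<]{1,80}){80}")

# Hypothetical style-tag contents (invented for this comparison).
samples = {
    "single spaces": " abc" * 80,
    "21-space gaps": (" " * 21 + "abc") * 80,
    "no spaces": "x" * 200,
}


def compare(text):
    """Return (original_matches, rewritten_matches), anchored at the start."""
    return (ORIGINAL.match(text) is not None,
            REWRITTEN.match(text) is not None)


results = {name: compare(text) for name, text in samples.items()}
for name, (orig, rewritten) in results.items():
    print(f"{name:15} original={orig} rewritten={rewritten}")
```

The original group accepts all three samples, while the rewritten one only accepts the first, so the two are not equivalent for the text I want to catch.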

I don't think there's any really _good_ way to do what I'm trying to do in a rawbody rule. I'm now thinking a plugin that pulls out specified HTML tags and their contents and allows rules on them is the best way to approach this, for example:

  tagbody  STYLE_GIBBERISH  style =~ /^[^:;]{200}/

This would be more generally useful, too.
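To make the idea concrete, here is a rough sketch of what such a rule would have to do, written in Python with the standard `html.parser` module rather than as a real SpamAssassin plugin. No `tagbody` rule type exists today; `TagBodyExtractor` and `tagbody_hits` are hypothetical names, and the sample HTML is invented.

```python
import re
from html.parser import HTMLParser


class TagBodyExtractor(HTMLParser):
    """Collect the text found inside every occurrence of one tag."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag.lower()
        self.inside = False
        self.bodies = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.inside = True
            self.bodies.append("")  # start a new body for this tag

    def handle_endtag(self, tag):
        if tag == self.tag:
            self.inside = False

    def handle_data(self, data):
        if self.inside:
            self.bodies[-1] += data


def tagbody_hits(html, tag, pattern):
    """True if the pattern matches the contents of any <tag> element."""
    parser = TagBodyExtractor(tag)
    parser.feed(html)
    parser.close()
    return any(pattern.search(body) for body in parser.bodies)


# Same expression as the proposed rule: 200 characters with no ':' or ';'.
STYLE_GIBBERISH = re.compile(r"^[^:;]{200}")

spam = "<html><style type='text/css'>" + "lorem ipsum " * 20 + "</style></html>"
ham = "<html><style>body { color: red; }</style></html>"
print(tagbody_hits(spam, "style", STYLE_GIBBERISH))
print(tagbody_hits(ham, "style", STYLE_GIBBERISH))
```

Because the tag contents are pulled out first, the rule itself needs no nested quantifiers and no guessing about how whitespace is distributed.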

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 [email protected]    FALaholic #11174     pgpk -a [email protected]
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Our government should bear in mind the fact that the American
  Revolution was touched off by the then-current government
  attempting to confiscate firearms from the people.
-----------------------------------------------------------------------
 5 days until George Washington's 278th Birthday
