On 02/12/2014 01:46 PM, John Hardin wrote:
> On Wed, 12 Feb 2014, Axb wrote:
>> On 02/12/2014 10:06 PM, John Hardin wrote:
>>>  Perhaps something like this:
>>>
>>>  body      __HEXHASHWORD   /\b[0-9a-f]{30,}\s[a-z]{1,10}\b/
>>>  tflags    __HEXHASHWORD   multiple maxhits=5
>>>  meta      HEXHASH_WORD    __HEXHASHWORD > 4
>>>  describe  HEXHASH_WORD    Hexadecimal hash followed by a word
>>>
>>>  Added to my sandbox, just in case.
>>
>> John,
>>
>> Isn't {30,} (without a limit) dangerously expensive?
>
> Potentially expensive; the character class and the fact that the
> following atom is not in that class limits the risk - backtracking
> isn't a possibility. However, point taken - recommend {30,64} instead.

Given the nature of the content, I'd go the other direction: drop the
trailing word boundary and match a fixed run of 30 hex characters.  This
removes the open-ended quantifier, though it doesn't short-circuit as
quickly, so one could debate which version is more efficient.

body      __HEXHASHWORD   /\b[a-z]{1,10}\s[0-9a-f]{30}/
tflags    __HEXHASHWORD   multiple maxhits=5
meta      HEXHASH_WORD    __HEXHASHWORD > 4
describe  HEXHASH_WORD    Five hexadecimal hashes, each following a word
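
For anyone who wants to compare the two patterns outside of SpamAssassin,
here is a rough sketch in Python. It is only an approximation: Python's re
module is not Perl's regex engine, the count_hits helper is a made-up
stand-in for "tflags multiple maxhits=5", and the sample text is synthetic
(an invented 32-character hex string, not taken from real spam).

```python
import re

# The two patterns from this thread, using the bounded {30,64} form for the first.
HASH_THEN_WORD = re.compile(r"\b[0-9a-f]{30,64}\s[a-z]{1,10}\b")
WORD_THEN_HASH = re.compile(r"\b[a-z]{1,10}\s[0-9a-f]{30}")


def count_hits(pattern, text, maxhits=5):
    """Count non-overlapping matches, stopping once maxhits is reached.

    Hypothetical helper that imitates the cap imposed by
    'tflags multiple maxhits=N'; it is not SpamAssassin code.
    """
    hits = 0
    for _ in pattern.finditer(text):
        hits += 1
        if hits >= maxhits:
            break
    return hits


# Synthetic sample: six "word hash" pairs separated by single spaces.
FAKE_HASH = "0123456789abcdef0123456789abcdef"
WORDS = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]
sample = " ".join(f"{word} {FAKE_HASH}" for word in WORDS)

for name, pattern in (("hash-then-word", HASH_THEN_WORD),
                      ("word-then-hash", WORD_THEN_HASH)):
    hits = count_hits(pattern, sample)
    # The meta rule fires when the sub-rule count is greater than 4.
    print(name, hits, "meta fires" if hits > 4 else "meta does not fire")
```

Wrapping count_hits in timeit against a larger corpus would be one way to
settle the efficiency question for a given message set.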

I'm curious if the hex string is always so similar; it may be enough to
use  \bb8b177bf24975  and not need the tflags multiple piece.
