On 02/12/2014 01:46 PM, John Hardin wrote:
> On Wed, 12 Feb 2014, Axb wrote:
>> On 02/12/2014 10:06 PM, John Hardin wrote:
>>> Perhaps something like this:
>>>
>>> body __HEXHASHWORD /\b[0-9a-f]{30,}\s[a-z]{1,10}\b/
>>> tflags __HEXHASHWORD multiple maxhits=5
>>> meta HEXHASH_WORD __HEXHASHWORD > 4
>>> describe HEXHASH_WORD Hexadecimal hash followed by a word
>>>
>>> Added to my sandbox, just in case.
>>
>> John,
>>
>> Isn't {30,} (without a limit) dangerously expensive?
>
> Potentially expensive; the character class and the fact that the
> following atom is not in that class limits the risk - backtracking
> isn't a possibility. However, point taken - recommend {30,64} instead.

Given the nature of the content, I'd go the other direction and not
require the word boundary. This removes the wildcard, though it doesn't
short circuit as quickly, so one could debate which version is more
efficient.

body __HEXHASHWORD /\b[a-z]{1,10}\s[0-9a-f]{30}/
tflags __HEXHASHWORD multiple maxhits=5
meta HEXHASH_WORD __HEXHASHWORD > 4
describe HEXHASH_WORD Five hexadecimal hashes, each following a word

I'm curious if the hex string is always so similar; it may be enough to
use \bb8b177bf24975 and not need the tflags multiple piece.
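
For anyone wanting to compare the variants outside SpamAssassin, here is a rough Python sketch. It is only an approximation: the rules run under Perl's regex engine, not Python's `re`, and the sample body is made up (the 32-character hex string just extends the prefix quoted above with arbitrary digits). The helper names `make_body`, `count_hits` and `meta_fires` are hypothetical, and `count_hits` merely imitates the `tflags multiple maxhits=5` cap rather than reproducing SpamAssassin's matching.

```python
import re

# Patterns copied from the rule variants discussed in this thread.
HEX_THEN_WORD = re.compile(r"\b[0-9a-f]{30,64}\s[a-z]{1,10}\b")
WORD_THEN_HEX = re.compile(r"\b[a-z]{1,10}\s[0-9a-f]{30}")
LITERAL_PREFIX = re.compile(r"\bb8b177bf24975")

# Made-up sample data (assumption): the prefix from the thread padded
# with arbitrary hex digits to 32 characters.
FAKE_HASH = "b8b177bf24975" + "0123456789abcdef012"
WORDS = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]


def make_body(pairs):
    """Build 'word hash word hash ...' with the given number of pairs."""
    return " ".join(f"{w} {FAKE_HASH}" for w in WORDS[:pairs])


def count_hits(pattern, text, maxhits=5):
    """Count non-overlapping matches, stopping at maxhits."""
    hits = 0
    for _ in pattern.finditer(text):
        hits += 1
        if hits >= maxhits:
            break
    return hits


def meta_fires(pattern, text):
    """Imitate 'meta HEXHASH_WORD __HEXHASHWORD > 4'."""
    return count_hits(pattern, text) > 4


if __name__ == "__main__":
    for pairs in (4, 6):
        body = make_body(pairs)
        print(pairs, "pairs:",
              "hex-then-word fires:", meta_fires(HEX_THEN_WORD, body),
              "| word-then-hex fires:", meta_fires(WORD_THEN_HEX, body),
              "| literal prefix found:", bool(LITERAL_PREFIX.search(body)))
```

Timing the patterns against real spam samples with `timeit` would be the way to settle the efficiency question; this sketch only checks that the variants agree on when the meta rule would fire.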