On Thu, 7 May 2020, Brent Clark wrote:
> Good day Guys
> Our good friends are at it again.
> https://pastebin.com/raw/vjFcPzLE
> I haven't written anything yet.
> Thought I would share in the meantime.
100% 4-byte UTF8? That should be trivially easy to detect.
Comments solicited.
body __4BYTE_UTF8_WORD /(?:\xf0\x9d[\x9a-\x9f][\x80-\xff]){3,10}/
tflags __4BYTE_UTF8_WORD multiple, maxhits=10
meta SUSP_UTF8_WORD_MANY __4BYTE_UTF8_WORD > 9
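To see what the rule above counts, here is a rough Python sketch that applies the same byte pattern to UTF-8 encoded text. The helper names (`monospace`, `count_hits`) and the sample sentence are made up for illustration, and the `min()` cap only approximates what `maxhits=10` does; this is not how SpamAssassin evaluates rules internally.

```python
import re

# Same byte pattern as __4BYTE_UTF8_WORD, applied to raw UTF-8 bytes.
WORD = re.compile(rb"(?:\xf0\x9d[\x9a-\x9f][\x80-\xff]){3,10}")


def monospace(text: str) -> str:
    """Hypothetical helper: map ASCII a-z to MATHEMATICAL MONOSPACE
    SMALL A..Z (U+1D68A..U+1D6A3), leaving other characters alone."""
    return "".join(
        chr(0x1D68A + ord(c) - ord("a")) if "a" <= c <= "z" else c
        for c in text
    )


def count_hits(body: str, maxhits: int = 10) -> int:
    """Count non-overlapping matches, capped roughly like 'maxhits=10'."""
    hits = len(WORD.findall(body.encode("utf-8")))
    return min(hits, maxhits)


sample = monospace(
    "buy cheap pills now and save big money today only from our shop"
)
print(count_hits(sample))        # 13 obfuscated words, capped at 10
print(count_hits(sample) > 9)    # the condition the meta rule tests
```

Every monospace small letter encodes as F0 9D 9A xx, so each word of three or more glyphs is one hit. Glyphs whose third byte falls outside 0x9A-0x9F, such as the bold letters starting at U+1D400 (third byte 0x90), would not be counted by this pattern.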
Potential FP for some languages because it's rather broad. It might be
possible to narrow it to just the 4-byte math glyphs that render as
readable English text.
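One way the narrowing might look, sketched in Python so it can be tried against sample bytes. The assumption here (mine, not taken from the spam sample) is that "readable English" means the styled Latin letters in the Mathematical Alphanumeric Symbols block, U+1D400 through U+1D6A3, which encode as F0 9D 90..99 80..BF plus F0 9D 9A 80..A3. That range is not the same as the one in the rule above, so it would need checking against real samples before being turned into a rule.

```python
import re

# Hypothetical narrowed pattern: styled Latin letters only
# (U+1D400..U+1D6A3, bold through monospace). The Greek letters and
# digits later in the block (third byte 0x9A above 0xA3, or 0x9B..0x9F)
# are deliberately excluded.
LATIN_MATH_WORD = re.compile(
    rb"(?:\xf0\x9d(?:[\x90-\x99][\x80-\xbf]|\x9a[\x80-\xa3])){3,10}"
)


def has_latin_math_word(text: str) -> bool:
    """True if the UTF-8 bytes contain a run of 3+ styled Latin letters."""
    return LATIN_MATH_WORD.search(text.encode("utf-8")) is not None


def styled(text: str, base: int) -> str:
    """Hypothetical helper: shift ASCII a-z onto a styled alphabet
    whose small letter 'a' sits at code point `base`."""
    return "".join(chr(base + ord(c) - ord("a")) for c in text)


bold_word = styled("free", 0x1D41A)   # MATHEMATICAL BOLD SMALL A..
mono_word = styled("free", 0x1D68A)   # MATHEMATICAL MONOSPACE SMALL A..
greek_run = "\U0001D6C2" * 4          # MATHEMATICAL BOLD SMALL ALPHA
emoji_run = "\U0001F600" * 4          # 4-byte UTF-8, different block

for label, text in [("bold", bold_word), ("mono", mono_word),
                    ("greek", greek_run), ("emoji", emoji_run)]:
    print(label, has_latin_math_word(text))
```

The same alternation could be dropped into a body rule in place of the byte classes above. Whether the digits at U+1D7CE and up should also count is a judgment call left out of this sketch.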
--
John Hardin KA7OHZ http://www.impsec.org/~jhardin/
jhar...@impsec.org FALaholic #11174 pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
If you ask amateurs to act as front-line security personnel,
you shouldn't be surprised when you get amateur security.
-- Bruce Schneier
-----------------------------------------------------------------------
Tomorrow: the 75th anniversary of VE day