On Mon, 12 Jan 2004, Larry Starr wrote:

> Just noticed a message with an encoded URL, that misses, the "BIZ_TLD" rule,
> etc.
>
> The message body contains:
> <a href=3d"http://gf=2eclearmath=2ebiz/jsimp/index=2ehtml"><font
> face=3d"arial">scored </font>this way=2e
> <br><img src=3d"http://K=2eclearmath=2ebiz/images/js02=2ejpg" border=3d=
> "0">
> </a>
>
> I know this wraps a bit ugly, when pasted into my mailer but, as you can see,
> the punctuation, in the URI, is all hex encoded. "=2e", instead of ".".
>
> I have a local rule, in the form of bigevil.cf, with the following
> sub-expression, that catches the above, but there has got to be a simpler way
> to do this.
>
> uri uri MyEvilList_001 ( /\b(?:=2e){0,1}clearmath(?:\.|=2e)biz)\b\i
>
> Does anyone know of a ruleset that handles this sort of thing, perhaps code
> that decodes the "=xx" expressions prior to the "URI" matches?

Actually, that is a bastardized "quoted-printable" (QP) encoding of a URL. In QP the character sequence '=2E' is an encoded period; the spam tool is generating '=2e' and intending it to be interpreted as a period.

SA is supposed to decode QP before running the various 'body' and 'uri' rules, but there's a limitation in its decoding engine: if the QP encoding uses lowercase hex digits instead of uppercase hex digits, it does not recognize them as QP and fails to decode them.

Strictly speaking, RFC 2045 requires uppercase hex digits in QP (see section 6.7), and the lowercase form should be considered illegal. However, many popular mail clients will decode the bastardized lowercase version and display the message to the user as the spammer intends (section 6.7, note (1) permits this).

I can see two different ways to handle this: either make SA more flexible so that it decodes the bastardized QP and the normal rules hit, or write a rule that hits such bastardized QP coding as a spam-tool signature.

Does anybody know if there are real (albeit brain-damaged) mail clients that generate such bastardized QP encoding?

-- 
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{

_______________________________________________
Spamassassin-talk mailing list
https://lists.sourceforge.net/lists/listinfo/spamassassin-talk
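
To make the two options above concrete, here is a rough Python sketch (not SA's actual Perl code, and not a proposed patch). The names lenient_qp_decode and has_lowercase_qp are made up for illustration. The first function decodes "=xx" escapes regardless of hex case, as note (1) of RFC 2045 section 6.7 allows; the second only flags escapes that contain a lowercase hex letter, which could serve as the spam-tool signature.

```python
import re

# "=" followed by two hex digits in either case.
# RFC 2045 section 6.7 formally allows only 0-9 and A-F.
QP_ESCAPE = re.compile(r"=([0-9A-Fa-f]{2})")

# An escape where at least one of the two hex digits is a lowercase letter.
QP_LOWERCASE = re.compile(r"=(?:[0-9A-Fa-f][a-f]|[a-f][0-9A-Fa-f])")


def lenient_qp_decode(text: str) -> str:
    """Decode QP escapes, accepting lowercase hex digits too.

    Hypothetical helper: each escape is mapped to a single character,
    so it is only meant for ASCII-range escapes such as =2e or =3d.
    """
    # A trailing "=" at end of line is a soft line break; drop it.
    text = re.sub(r"=\r?\n", "", text)
    return QP_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


def has_lowercase_qp(text: str) -> bool:
    """Return True if the text contains a lowercase-hex QP escape.

    Hypothetical helper: it should only be applied to parts that are
    declared quoted-printable, otherwise strings like "x=abc" match.
    """
    return QP_LOWERCASE.search(text) is not None


sample = 'href=3d"http://gf=2eclearmath=2ebiz/jsimp/index=2ehtml"'
decoded = lenient_qp_decode(sample)
flagged = has_lowercase_qp(sample)
```

With the sample taken from the quoted message, decoded contains the plain host name gf.clearmath.biz, so an ordinary 'uri' rule would have something to match, and flagged is true. Whether the signature approach is safe depends on the open question above, namely whether any legitimate mail client emits lowercase QP.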