On Jul 23, 2005, at 5:49 PM, mouss wrote:
Lately I have received a lot of spam containing URLs with an
ampersand, like the following:
http://mwbmphqks.com&uylnzptov306e74lz4hltp4l.wafddiwafd8.com.DEMUNGED/
http://wuqvqspsa.com&gwvjb5hnn3f2f1zk4j.impynjimpy9.com.DEMUNGED/
http://danwwzbmys.com&sxlxcemf2hnv6lky3ykao3k.telluristmj.net.DEMUNGED/
http://ezgezdmw.com&znxrazblhr3fl31vivhf0kh.wafddiwafd8.com.DEMUNGED/
http://rizssxavpbb.org&ktpvffvsy6hedrerd3zwd.choanosomeab.com.DEMUNGED/
So spammers are trying to evade filters that treat '&' as a
terminator: such a filter extracts rizssxavpbb.org, which is a random
"domain" and won't be listed.
The domains are now caught by various lists, but I think they can be
caught independently. One way I see is to add a score if '&' is found
in a URL. Something like:
# ampersand in domain
rawbody FOO_URI_AMPERSAND m{http://[\w\d\.\%\#]*\&}i
describe FOO_URI_AMPERSAND URL contains ampersand
score FOO_URI_AMPERSAND 1
Would this cause false positives? How can this rule be improved? (We
could also look for other suspicious characters.)
Maybe add a similar rule to increase the score if the ampersand
immediately follows a well-known TLD (.org, .com, ... at least)?
The only problem I can think of is that an ampersand in a _URL_ is
legal (IIRC, in CGI form URLs, the ampersand delimits the different
variables, so if the URL in question carries some form of context, like
acknowledging a sign-up, it might legitimately contain an &). So you
need to distinguish between "& before the third /" (i.e. in the host
part) and "& after the third / and probably after a ?". The former is
bad. The latter should be OK. But I could be misremembering.
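
An untested sketch of that distinction, in case it helps. It assumes a
"uri" rule (which, as far as I know, is matched against the URIs
SpamAssassin extracts from the message rather than against raw body
lines) and stops scanning at the first '/', '?' or '#', so an & in a
query string is not hit. The rule names and scores are placeholders,
and whether SpamAssassin's URI extraction passes such hosts through
intact is something I have not checked:

```
# '&' anywhere in the host part, i.e. before the third '/'
# (placeholder name and score; '#' must be escaped in SA config)
uri      FOO_URI_AMP_IN_HOST  m{^https?://[^/?\#]*&}i
describe FOO_URI_AMP_IN_HOST  Host part of URL contains ampersand
score    FOO_URI_AMP_IN_HOST  1

# '&' immediately after a well-known TLD, still inside the host part
uri      FOO_URI_TLD_AMP      m{^https?://[^/?\#]*\.(?:com|net|org)&}i
describe FOO_URI_TLD_AMP      Ampersand directly after TLD in URL host
score    FOO_URI_TLD_AMP      1
```

Note that the character class in the original rule has no '-', so it
would not match when a hyphen appears before the ampersand; the negated
class above avoids having to list the allowed characters.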