At 10:41 AM -0700 08/09/2013, John Hardin wrote:
> Can you provide a spample or two?
Sure.
http://pastebin.com/VfSCB7fw
http://pastebin.com/VCtvzjzV
Note the "outl" and "outi" links near the very bottom. The actual
domains used in these URIs vary... they used to be .pw, but recently
most have been .biz (though I've also seen some .mobi and I think
some .tv and even some .us).
Note that both of these hit BAYES_50, which is pretty common for
these spams. I don't know why, but they almost always hit BAYES_50
and very rarely score higher (occasionally they score lower, too).
Perhaps it's because most of the spam content is in the embedded
image rather than in rendered text.
These are also great examples of the "HTML comment gibberish" that
pervades all of these spams. If you have time, it would be great if
you could adapt your STYLE_GIBBERISH rules to catch HTML comment
gibberish. (Presumably, you'd want to make sure the gibberish is
sufficiently long, too.)
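
To illustrate the kind of check meant here, a rough sketch in Python
rather than as an actual SpamAssassin rule. It is not how
STYLE_GIBBERISH works; the function names, the length threshold, and
the "gibberish" heuristic (few vowels or very long unbroken tokens)
are all assumptions made up for the example.

```python
import re

# Matches HTML comments, including ones that span several lines.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)

# Hypothetical threshold: shorter comments are ignored so that ordinary
# markers like "<!-- nav -->" never count as gibberish.
MIN_COMMENT_LEN = 40


def looks_like_gibberish(text):
    """Crude, assumed heuristic: very few vowels, or a very long token."""
    letters = [c for c in text.lower() if c.isalpha()]
    if not letters:
        return False
    vowel_ratio = sum(c in "aeiou" for c in letters) / len(letters)
    longest_token = max((len(t) for t in text.split()), default=0)
    return vowel_ratio < 0.25 or longest_token > 30


def gibberish_comments(html):
    """Return the bodies of HTML comments that are long and look random."""
    hits = []
    for match in COMMENT_RE.finditer(html):
        body = match.group(1).strip()
        if len(body) >= MIN_COMMENT_LEN and looks_like_gibberish(body):
            hits.append(body)
    return hits
```

A real rule would need to be tuned against ham as well, since
legitimate mail can carry long machine-generated comments too.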
> They can be added but unless such spams appear in the masscheck
> corpora the rules won't be scored and distributed.
No idea if they're in the masscheck corpora... but I and my users
have been getting them for months. I imagine they're relatively
widespread...
Thanks.
--- Amir