On 1/29/2014 11:53 AM, Andy Jezierski wrote:
I've been noticing a lot of spam getting through with the same trait: a bunch of random words, each wrapped in angle brackets like an HTML tag. They all seem to come after the </body> or the </html> tag. Would anyone more knowledgeable than me care to help with a rule to detect them?

Thanks
Andy


Example:

</html>

</body>
<style>
<geehrter>
<convaincre>
<eingerichtet>
<piuttosto>
<meny>
<Aufl>
<quilting>


<surveymonkey>
<update>
<Benoit>
<problemi>
<ese>
<telstra>
<checking>
<aglow>
<insegna>
<doorgeven>

I've been seeing that as well. They also all seem to begin with <style>, presumably so that mail clients parse the junk as a stylesheet and never display it.

You can probably exploit the fact that practically nobody writes a real style block that doesn't match /[{}]/, whereas these junk blocks never contain a brace. I haven't been able to experiment with any rules yet, though. I wouldn't recommend the more general route of counting invalid HTML tags, simply because of the effort it would take to maintain such a rule over time.
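
To make the brace idea concrete, here is a minimal, untested-in-production sketch in Python rather than an actual SpamAssassin rule. The function name has_braceless_style_block and the pattern STYLE_BLOCK are made up for illustration; the only thing taken from the discussion above is the /[{}]/ check. It assumes you already have the raw HTML part of the message as a string, and it treats an unclosed <style> as running to the end of the message, as in the example spam.

```python
import re

# Hypothetical pattern: an opening <style ...> tag, then everything up to the
# closing </style> tag, or to the end of the text if the block is never closed.
STYLE_BLOCK = re.compile(
    r"<style\b[^>]*>(.*?)(?:</style\s*>|\Z)",
    re.IGNORECASE | re.DOTALL,
)


def has_braceless_style_block(html: str) -> bool:
    """Return True if any non-empty <style> block contains no '{' or '}'."""
    for match in STYLE_BLOCK.finditer(html):
        content = match.group(1)
        # Skip empty blocks; flag blocks that have text but no CSS braces.
        if content.strip() and not re.search(r"[{}]", content):
            return True
    return False


if __name__ == "__main__":
    spam_tail = "</html>\n</body>\n<style>\n<geehrter>\n<convaincre>\n<quilting>\n"
    legit = "<html><head><style>p { color: red; }</style></head></html>"
    print(has_braceless_style_block(spam_tail))
    print(has_braceless_style_block(legit))
```

One thing worth checking against real ham before turning this into a scored rule: a style block that holds nothing but an @import line has no braces either, so it would be flagged by this check.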
