On Mon, 13 Jul 2009, McDonald, Dan wrote:

On Mon, 2009-07-13 at 16:03 +0100, rich...@buzzhost.co.uk wrote:
On Mon, 2009-07-13 at 10:46 -0400, Charles Gregory wrote:
(?!www\.[a-z]{2,6}[0-9]{2,6}\.(com|net|org))
www[^a-z0-9]+[a-z]{2,6}[0-9]{2,6}[^a-z0-9]+(com|net|org)

Does not seem to work with:

www. meds .com

It shouldn't.  The spammers have been using domains with 2-4 alpha
characters and 2 digits.
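
For anyone following along, here is a rough Python sketch of why the sample above slips past that rule. It assumes the two quoted fragments are meant to be read as one pattern (lookahead first, then the body), which the thread does not state outright; the helper name hits_original is made up for illustration. SpamAssassin rules are Perl regexes, but this pattern uses nothing that differs in Python's re module.

```python
import re

# Assumption: the two quoted fragments are concatenated into a single pattern.
ORIGINAL = re.compile(
    r"(?!www\.[a-z]{2,6}[0-9]{2,6}\.(com|net|org))"
    r"www[^a-z0-9]+[a-z]{2,6}[0-9]{2,6}[^a-z0-9]+(com|net|org)",
    re.IGNORECASE,
)

def hits_original(text):
    """Return True if the letters-plus-digits rule fires on the text."""
    return ORIGINAL.search(text) is not None

# "meds" has no trailing digits, so [0-9]{2,6} cannot match.
print(hits_original("www. meds .com"))
# Letters followed by digits, with obfuscated separators: the intended target.
print(hits_original("www. ab12 .com"))
# A cleanly written URL is excluded by the negative lookahead.
print(hits_original("www.ab12.com"))
```

So the rule only fires when the obfuscated name is letters followed by digits, which is the restriction questioned below.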

Why be restrictive on the domain name?

\b(?!www\.\w{2,20}\.(?:com|net|org))www[^a-z0-9]+\w{2,20}[^a-z0-9]+(?:com|net|org)\b

The + signs are a little risky; it might be better to use {1,3} instead.
And the older rule allowed for spaces in the TLD. I don't recall if anybody
provided more than one spample with that, though.
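
A quick way to compare the two quantifier choices, sketched in Python rather than as an actual SpamAssassin rule. The names TLDS, GUARD, LOOSE, TIGHT and check are invented for this illustration, and the third test string is a made-up example, not a real spample.

```python
import re

TLDS = r"(?:com|net|org)"
# Negative lookahead: skip URLs that are written normally (www.name.tld).
GUARD = r"\b(?!www\.\w{2,20}\." + TLDS + r")"

# Rule as posted, with unbounded '+' separator runs.
LOOSE = re.compile(
    GUARD + r"www[^a-z0-9]+\w{2,20}[^a-z0-9]+" + TLDS + r"\b",
    re.IGNORECASE,
)

# Variant with each separator run bounded to {1,3} characters.
TIGHT = re.compile(
    GUARD + r"www[^a-z0-9]{1,3}\w{2,20}[^a-z0-9]{1,3}" + TLDS + r"\b",
    re.IGNORECASE,
)

def check(text):
    """Return (loose_hit, tight_hit) for one line of text."""
    return (LOOSE.search(text) is not None, TIGHT.search(text) is not None)

for sample in ("www. meds .com", "www.meds.com", "www -- . -- meds .com"):
    print(sample, check(sample))
```

Both versions fire on "www. meds .com" and both leave "www.meds.com" alone; they differ only when the separator run is longer than three characters. Neither version handles spaces inside the TLD itself.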

--
 John Hardin KA7OHZ                    http://www.impsec.org/~jhardin/
 jhar...@impsec.org    FALaholic #11174     pgpk -a jhar...@impsec.org
 key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C  AF76 D822 E6E6 B873 2E79
-----------------------------------------------------------------------
  Users mistake widespread adoption of Microsoft Office for the
  development of a document format standard.
-----------------------------------------------------------------------
 3 days until the 64th anniversary of the dawn of the Atomic Age
