If I recall correctly (and it's been a while), I was seeing false
positives where t.co was matching t.com (or something like that), so I
was only paying attention to the need to not allow an alphanumeric
character right after the domain. Short-sighted, I know (and I might
have forgotten that \b is a zero-width assertion, not a character
match).
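
For anyone who wants to see that false positive outside SpamAssassin,
here is a minimal sketch in Python. It assumes Python's re module treats
\b the same way Perl does for these plain-ASCII strings, and the two
patterns are cut-down, hypothetical stand-ins (just the t\.co
alternative), not the real rule.

```python
import re

# Reduced, hypothetical patterns: only the t\.co alternative from the
# shortener rule quoted further down, with and without a trailing \b.
unanchored = re.compile(r"t\.co", re.I)
with_boundary = re.compile(r"t\.co\b", re.I)

for host in ("t.co", "t.com", "t.co.example.com"):
    print(f"{host:18} unanchored={bool(unanchored.search(host))} "
          f"with_boundary={bool(with_boundary.search(host))}")
```

The \b stops the t.com hit because "o" followed by "m" is not a word
boundary, but it still accepts t.co.example.com, since "o" followed by
"." is one.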

The regex I use to anchor tlds these days (and please tell me if this
doesn't work the way I intend) looks like:

uri  NEWTLD_URI  /\.(accountant|beer|bid|......|win|work|xyz)\b[^\.-]/i

I have slightly different regexes to match email addresses or server
names in headers, but they all basically express the rule "I need to
see a word boundary here, but certain non-word characters don't count
because they imply the domain name may continue in the given context."
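
A rough way to sanity-check that rule is to run it over a few sample
strings in Python. This is only a sketch: the alternation below contains
just the TLDs visible in the rule (the elided part of the list is left
out), the sample URIs are made up, and the lookahead variant is a
hypothetical alternative for comparison, not something from this thread.

```python
import re

# Subset of the alternation: only the TLDs visible in the rule above;
# the elided middle of the list is left out here.
TLDS = "accountant|beer|bid|win|work|xyz"

# The rule as posted: word boundary, then one character that is
# neither "." nor "-".
posted = re.compile(r"\.(" + TLDS + r")\b[^\.-]", re.I)

# Hypothetical variant (an assumption, not from the thread): a negative
# lookahead consumes nothing, so it can also succeed at end of string.
lookahead = re.compile(r"\.(" + TLDS + r")(?![\w.-])", re.I)

samples = [
    "http://example.xyz/path",
    "http://example.xyz",
    "http://example.xyz.example.com/",
    "http://example.xyz-foo.com/",
    "http://example.xyzzy.com/",
]

for uri in samples:
    print(f"{uri:34} posted={bool(posted.search(uri))} "
          f"lookahead={bool(lookahead.search(uri))}")
```

On these samples the two patterns disagree only for the URI that ends
immediately after the TLD: the posted form needs one more character to
consume there, while the lookahead form does not. Whether a uri rule
ever sees such a string in practice is not something this sketch models.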

On Fri, 8 Sep 2017, RW wrote:

On Fri, 8 Sep 2017 13:03:57 -0400
Kevin A. McGrail wrote:

On 9/8/2017 12:24 PM, Robert Boyl wrote:
Hello, everyone!

Is there a way to create a Spamassassin rule that checks for a
certain URL suffix such as .ru but makes sure it has to be at the
end of the URI? Ends with string.

Thanks!
Rob

Yes, it's called an anchor, and a long time ago Shane Williams gave me
some advice on that, which I used in this rule:

uri  __KAM_SHORT  /(\/|^|\b)(?:j\.mp|bit\.ly|goo\.gl|x\.co|t\.co|t\.cn|tinyurl\.com|hop\.kz|urla\.ru|fw\.to)(\/|$|\b)/i

That doesn't look right, at least not in the context of the OP's
question.

In  (\/|$|\b)  the \b seems superfluous as it will match a boundary
between a letter and a '.' so the rule will for example match

goo.gl.example.com
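
The match described above can be reproduced with a short Python sketch.
It assumes Python's re module agrees with Perl on \b for these ASCII
strings; the sample URIs are invented, and the second pattern is a
hypothetical tightening for comparison only, not a rule anyone in the
thread proposed.

```python
import re

ALTS = (r"j\.mp|bit\.ly|goo\.gl|x\.co|t\.co|t\.cn|tinyurl\.com"
        r"|hop\.kz|urla\.ru|fw\.to")

# The rule as quoted above.
quoted = re.compile(r"(\/|^|\b)(?:" + ALTS + r")(\/|$|\b)", re.I)

# Hypothetical tightening (an assumption): drop the trailing \b so that
# only "/" or end of string may follow the domain.
no_trailing_b = re.compile(r"(\/|^|\b)(?:" + ALTS + r")(\/|$)", re.I)

for uri in ("http://goo.gl/abc", "http://goo.gl",
            "http://goo.gl.example.com/"):
    print(f"{uri:28} quoted={bool(quoted.search(uri))} "
          f"no_trailing_b={bool(no_trailing_b.search(uri))}")
```

Dropping the trailing \b removes the goo.gl.example.com hit here, but it
would also skip a shortener followed by something like ":80" or "?", so
it is probably too strict to use as-is.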


--
Public key #7BBC68D9 at            |                 Shane Williams
http://pgp.mit.edu/                |      System Admin - UT CompSci
=----------------------------------+-------------------------------
All syllogisms contain three lines |              sha...@shanew.net
Therefore this is not a syllogism  | www.ischool.utexas.edu/~shanew
