Re: Spamassassin default SHORT_URI list obsolete/outdated
On 07/01/2016 10:13 AM, Groach wrote:
> On 01/07/2016 09:56, Axb wrote:
>>> I then informed him that SA already has a URL_SHORTENER checking rule found in 72_ACTIVE.CF. I was currently using this as a META rule thus:
>>>
>>> meta MY_URI_URLSHORT __URL_SHORTENER # defined in 72_active.cf
>>
>> ATM it seems there is no such rule - pls verify the name after running sa-update
>
> As quoted, it is "__URL_SHORTENER". The entry reads as follows:
>
> uri __URL_SHORTENER /^http:\/\/(?:bit\.ly|tinyurl\.com|ow\.ly|is\.gd|tumblr\.com|formspring\.me|ff\.im|youtu\.be|tl\.gd|plurk\.com|migre\.me|j\.mp|cli\.gs|goo\.gl|yfrog\.com|lnk\.ms|su\.pr|fb\.me|alturl\.com|wp\.me|ping\.fm|chatter\.com|post\.ly|twurl\.nl|tiny\.cc|4sq\.com|ustre\.am|short\.to|u\.nu|flic\.kr|budurl\.com|digg\.com|twitvid\.com|gowal\.la|om\.ly|justin\.tv|icio\.us|p\.gs|loopt\.us|tcrn\.ch|xrl\.us|wpo\.st|bkite\.com)\/[^\/]{3}\/?/

ok - found it... and must say this rule is pretty sloppy and should probably be deprecated. I hope whoever compiled this list takes a look into this. It includes domains which are clearly not URI shorteners, or were never used in spam, etc.

Imo, this rule can probably be deprecated in favour of network lookups.

> and is used in other META rules such as MONEY_FRAUD_5 (you see it is preceded with "__")
>
>> URL shorteners aren't bad per se, so it makes little sense to waste cycles processing a long list which may or may not be abused. Many of these sites won't be around in 6 months, some have zero abuse, some may even be NXDOMAIN.
>
> You can see from 72_ACTIVE that the idea of using a URL shortener isn't bad by itself and that SA rules do use it in conjunction with other 'more likely' positive matching (such as MONEY_FRAUD_5).
>
>> Such rules are best maintained/provided by interested third parties which may or may not commit to keep them up to date. SA devs don't really have the time to chase sites/domains, and to load the default rule set with extra bloat doesn't sound very wise. Why not make this YOUR project?
> Ok, well, I will leave it as HIS project ;-) (the guy who has already applied his research to provide this surbl lookup). He also has stated that many of these sites come and go (as you imply).

His project is to maintain a domain list, similar to Spamhaus DBL's section "127.0.1.103 abused spammed redirector domain".

To maintain an SA rule with that data seems like a redundant effort, but if someone needs this it would be wiser to tackle it at source to avoid stale data.
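For readers unfamiliar with the kind of network lookup mentioned above, here is a minimal sketch of how a DBL-style check is usually expressed: the domain is prepended to the list's DNS zone and the returned A record is compared against a return code. The zone name `dbl.spamhaus.org` and the helper names are assumptions made for illustration; only the code 127.0.1.103 comes from the message itself, and this is not how SpamAssassin implements its own lookups.

```python
import socket

DBL_ZONE = "dbl.spamhaus.org"      # assumption: zone name, not given in the thread
ABUSED_REDIRECTOR = "127.0.1.103"  # return code quoted in the message above


def dbl_query_name(domain: str, zone: str = DBL_ZONE) -> str:
    """Build the DNS name to query for a domain (hypothetical helper)."""
    return f"{domain.strip('.').lower()}.{zone}"


def is_abused_redirector(answers) -> bool:
    """True if any returned A record carries the redirector return code."""
    return ABUSED_REDIRECTOR in answers


def lookup(domain: str) -> list:
    """Resolve the query name; an empty list means 'not listed' or lookup failure."""
    try:
        _, _, addresses = socket.gethostbyname_ex(dbl_query_name(domain))
        return addresses
    except OSError:
        return []
```

A caller would use something like `is_abused_redirector(lookup("example.com"))`. Because the data lives in DNS, the list can change without anyone editing a rule file, which is the "tackle it at source" argument made above.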
Re: Spamassassin default SHORT_URI list obsolete/outdated
On 01/07/2016 09:56, Axb wrote:
>> I then informed him that SA already has a URL_SHORTENER checking rule found in 72_ACTIVE.CF. I was currently using this as a META rule thus:
>>
>> meta MY_URI_URLSHORT __URL_SHORTENER # defined in 72_active.cf
>
> ATM it seems there is no such rule - pls verify the name after running sa-update

As quoted, it is "__URL_SHORTENER". The entry reads as follows:

uri __URL_SHORTENER /^http:\/\/(?:bit\.ly|tinyurl\.com|ow\.ly|is\.gd|tumblr\.com|formspring\.me|ff\.im|youtu\.be|tl\.gd|plurk\.com|migre\.me|j\.mp|cli\.gs|goo\.gl|yfrog\.com|lnk\.ms|su\.pr|fb\.me|alturl\.com|wp\.me|ping\.fm|chatter\.com|post\.ly|twurl\.nl|tiny\.cc|4sq\.com|ustre\.am|short\.to|u\.nu|flic\.kr|budurl\.com|digg\.com|twitvid\.com|gowal\.la|om\.ly|justin\.tv|icio\.us|p\.gs|loopt\.us|tcrn\.ch|xrl\.us|wpo\.st|bkite\.com)\/[^\/]{3}\/?/

and is used in other META rules such as MONEY_FRAUD_5 (you see it is preceded with "__").

> URL shorteners aren't bad per se, so it makes little sense to waste cycles processing a long list which may or may not be abused. Many of these sites won't be around in 6 months, some have zero abuse, some may even be NXDOMAIN.

You can see from 72_ACTIVE that the idea of using a URL shortener isn't bad by itself and that SA rules do use it in conjunction with other 'more likely' positive matching (such as MONEY_FRAUD_5).

> Such rules are best maintained/provided by interested third parties which may or may not commit to keep them up to date. SA devs don't really have the time to chase sites/domains, and to load the default rule set with extra bloat doesn't sound very wise. Why not make this YOUR project?

Ok, well, I will leave it as HIS project ;-) (the guy who has already applied his research to provide this surbl lookup). He also has stated that many of these sites come and go (as you imply).

Thanks
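To see what the quoted pattern actually accepts, here is a rough sketch that rebuilds it in Python from the same 43 domains. It only approximates the rule: SpamAssassin applies `uri` rules to the URIs it extracts from a message, whereas this sketch tests a single string, and `looks_like_short_url` is a hypothetical helper name.

```python
import re

# Domains copied from the __URL_SHORTENER rule as quoted in this thread.
DOMAINS = """
bit.ly tinyurl.com ow.ly is.gd tumblr.com formspring.me ff.im youtu.be
tl.gd plurk.com migre.me j.mp cli.gs goo.gl yfrog.com lnk.ms su.pr fb.me
alturl.com wp.me ping.fm chatter.com post.ly twurl.nl tiny.cc 4sq.com
ustre.am short.to u.nu flic.kr budurl.com digg.com twitvid.com gowal.la
om.ly justin.tv icio.us p.gs loopt.us tcrn.ch xrl.us wpo.st bkite.com
""".split()

# Same shape as the quoted Perl pattern: "http://", one listed domain, "/",
# three non-slash characters, then an optional "/".
URL_SHORTENER = re.compile(
    r"^http://(?:" + "|".join(re.escape(d) for d in DOMAINS) + r")/[^/]{3}/?"
)


def looks_like_short_url(url: str) -> bool:
    """Return True if the URL matches the rebuilt pattern (hypothetical helper)."""
    return URL_SHORTENER.match(url) is not None


if __name__ == "__main__":
    for url in (
        "http://bit.ly/abc123",         # listed domain, plain http
        "https://bit.ly/abc123",        # https is not covered by the pattern
        "http://bit.ly/ab",             # fewer than three path characters
        "http://digg.com/news/story",   # ordinary page on a listed domain
    ):
        print(url, looks_like_short_url(url))
```

Under these assumptions the pattern ignores `https://` links, and it matches any sufficiently long path on a listed domain, whether or not that path is a shortened link.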
Re: Spamassassin default SHORT_URI list obsolete/outdated
On 07/01/2016 09:35 AM, jimimaseye wrote:
> Recently I was in discussion with the maintainer of a URI_SHORTENER blacklist, who has compiled a list of domains handling short URLs. (You can find his full rule and details here: http://snork.ca/posts/2016-06-24-surbl-of-url-shorteners-for-spamassassin/). He has identified over 200 CURRENT URL shorteners and maintains them accordingly (viewable here: http://snork.ca/posts/2016-06-24-surbl-of-url-shorteners-for-spamassassin/url_shorteners.txt).
>
> I then informed him that SA already has a URL_SHORTENER checking rule found in 72_ACTIVE.CF. I was currently using this as a META rule thus:
>
> meta MY_URI_URLSHORT __URL_SHORTENER # defined in 72_active.cf

ATM it seems there is no such rule - pls verify the name after running sa-update

> He quite rightly pointed out that the list of 43 shortener domains that SA checks for in the default rule is drastically short and outdated (some don't even exist anymore) compared to his more current, recently researched list of 200.

URL shorteners aren't bad per se, so it makes little sense to waste cycles processing a long list which may or may not be abused. Many of these sites won't be around in 6 months, some have zero abuse, some may even be NXDOMAIN.

Such rules are best maintained/provided by interested third parties which may or may not commit to keep them up to date. SA devs don't really have the time to chase sites/domains, and to load the default rule set with extra bloat doesn't sound very wise. Why not make this YOUR project?

> Is there any way that maybe the default list that SA checks for in 72_ACTIVE can be updated, and how is this request made or implemented? (Forgive me, I don't know how these things work.)

See above..
Spamassassin default SHORT_URI list obsolete/outdated
Recently I was in discussion with the maintainer of a URI_SHORTENER blacklist, who has compiled a list of domains handling short URLs. (You can find his full rule and details here: http://snork.ca/posts/2016-06-24-surbl-of-url-shorteners-for-spamassassin/). He has identified over 200 CURRENT URL shorteners and maintains them accordingly (viewable here: http://snork.ca/posts/2016-06-24-surbl-of-url-shorteners-for-spamassassin/url_shorteners.txt).

I then informed him that SA already has a URL_SHORTENER checking rule found in 72_ACTIVE.CF. I was currently using this as a META rule thus:

meta MY_URI_URLSHORT __URL_SHORTENER # defined in 72_active.cf

He quite rightly pointed out that the list of 43 shortener domains that SA checks for in the default rule is drastically short and outdated (some don't even exist anymore) compared to his more current, recently researched list of 200.

Is there any way that maybe the default list that SA checks for in 72_ACTIVE can be updated, and how is this request made or implemented? (Forgive me, I don't know how these things work.)

--
View this message in context: http://spamassassin.1065346.n5.nabble.com/Spamassassin-default-SHORT-URI-list-obsolete-outdated-tp121584.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.