List of urls
Hello,

Does anyone know if it's possible to have a list of URLs and define a score for all of them in one line? At the moment I do it like this:

uri url_1 /www.domain1.com/
uri url_2 /www.domain2.com/
uri url_3 /www.domain3.com/
uri url_4 /www.domain4.com/
score url_1 10
score url_2 10
score url_3 10
score url_4 10

But I want just one line to define the score. Are there other ways to do this?

Greetings,
Richard
Re: List of urls
On Tue, 2010-10-26 at 08:07 +0200, Richard Smits wrote:
> Hello, Does anyone know if it's possible to have a list of URLs and
> define a score for all of them in one line?

I developed a similar system for my own purposes that you might want to look at. The idea is that you define this type of rule in an easily edited file which contains header lines that set the rule name, score, description, whether it ignores case, etc. These are followed by one or more sections, each consisting of a line saying which part of the message it applies to (body, uri, etc.) and a list of match terms.

A shell script, which uses gawk for the heavy lifting, converts one or more definition files into rules (one rule per definition) and outputs a single .cf file containing them all. There's even a man page. It's all available in a GPLed tarball:

http://www.libelle-systems.com/free/portmanteau/portmanteau.tgz

Martin
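
As a rough illustration of the generator idea described above (this is not the portmanteau tool, whose definition file format is not shown in this thread), a short Python sketch can turn a plain list of domains into one uri rule with a single score line. The helper name `build_uri_rule`, the rule name `MY_URL_LIST` and the example domains are assumptions made for illustration only.

```python
import re


def build_uri_rule(name, domains, score):
    """Return SpamAssassin rule text: one uri rule plus one score line.

    Hypothetical helper, not part of SpamAssassin or portmanteau.
    """
    # re.escape turns each "." into "\." so it only matches a literal dot.
    alternation = "|".join(re.escape(d) for d in domains)
    # m~...~ uses "~" as the delimiter so slashes need no escaping.
    pattern = "m~^https?://(www\\.)?(%s)(/|$)~" % alternation
    return "\n".join([
        "uri   %s  %s" % (name, pattern),
        "score %s  %s" % (name, score),
    ])


rules = build_uri_rule("MY_URL_LIST", ["domain1.com", "domain2.com"], 10)
print(rules)
```

The output is meant to be written to a local .cf file. Because all domains share one rule, a message scores the given value once if any of them matches.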
Re: List of urls
Hi!

> At the moment I do it like this:
>
> uri url_1 /www.domain1.com/
> uri url_2 /www.domain2.com/
> uri url_3 /www.domain3.com/
> uri url_4 /www.domain4.com/
> score url_1 10
> score url_2 10
> score url_3 10
> score url_4 10

Isn't this a bit expensive? Report them to SURBL or something and you get them added ;) (send a mail to raym...@surbl.org)

As for your question, why don't you use a regexp?

uri url_1 /www.domain(1|2|3|4).com/

The exact regexp naturally depends on the domains, but you don't need a separate check for each one.

The best way to handle domains is to put them in a small RBL, or to get them added to an existing RBL.

Bye,
Raymond.
Re: List of urls
On Tue, 2010-10-26 at 10:53 +0200, Raymond Dijkxhoorn wrote:
> As for your question, why don't you use a regexp?
>
> uri url_1 /www.domain(1|2|3|4).com/
>
> The exact regexp naturally depends on the domains, but you don't need
> a separate check for each one.

That is one way to consolidate them, yes. Depending on the nature of the strings to match, it can be very intuitive and natural.

The other technique you can use is meta rules, together with non-scoring sub-rules (names starting with a double underscore) to prevent the individual parts from scoring (default of 1, if not set explicitly).

uri __MY_BL_001 /example.(com|net)/
uri __MY_BL_002 /example.org/
meta MY_BL __MY_BL_001 || __MY_BL_002
score MY_BL 10.0

Note, though, that the above uri matches are not sufficiently strict (similar to the OP's example) and might result in false positives (FPs). The dot in an RE matches any character and must be escaped to match a literal dot. Also, the REs should be anchored, at either the left or the right end, to avoid matching innocent bystanders.

Since parsed URIs are guaranteed to have a protocol (prepended by SA, if none), this would be much safer than the simple example above.

uri __MY_BL_000 m~^https?://(www\.)?example\.org(/|$)~

It is anchored at the beginning of the URI, allows an optional www host name, and is anchored at the end to further prevent FPs. Oh, and it also uses m// with an alternative delimiter, so I don't have to escape the slash in the RE.

How strict you want your uri rule REs to be depends on your level of paranoia and the domains to match.

> The best way to handle domains is to put them in a small RBL, or to
> get them added to an existing RBL.

Well, it certainly depends on the number of URIs, and how frequently the list may change. SA config is not suitable for frequent changes, but would be way easier to set up than a local RBL, if the list isn't too large and is mostly static.

Adding to existing URI DNSBLs isn't always an option, btw. URL shorteners may have a place in severely size-constrained messages of sorts, but have no business in mail. They won't be blacklisted by the major players out there, though. ;)

--
char *t=\10pse\0r\0dtu...@ghno\x4e\xc8\x79\xf4\xab\x51\x8a\x10\xf4\xf4\xc4;
main(){ char h,m=h=*t++,*x=t+2*h,c,i,l=*x,s=0; for (i=0;il;i++){ i%8? c=1:
(c=*++x); c128 (s+=h); if (!(h=1)||!t[s+h]){ putchar(t[s]);h=m;s=0; }}}
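
To make the strictness point concrete, here is a small Python sketch comparing the loose pattern with the anchored, escaped one. It uses Python's `re` module rather than the Perl regexes SpamAssassin runs, on the assumption that both behave the same for these two simple patterns. The test URIs are made up for illustration.

```python
import re

# Loose: unescaped dot, no anchors (as in the simple examples).
loose = re.compile(r"example.org")
# Strict: anchored at both ends, literal dots, optional "www." host.
strict = re.compile(r"^https?://(www\.)?example\.org(/|$)")

uris = [
    "http://example.org/",
    "http://www.example.org/path",
    "http://examplezorg.test/",           # "." in the loose RE matches "z"
    "http://example.org.innocent.test/",  # target is only a prefix of the host
]

loose_hits = [u for u in uris if loose.search(u)]
strict_hits = [u for u in uris if strict.search(u)]

print(len(loose_hits), "loose hits;", len(strict_hits), "strict hits")
```

The loose pattern hits all four URIs, including the two innocent bystanders, while the strict one hits only the first two.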
Re: List of urls
On Tue, 26 Oct 2010, Karsten Bräckelmann wrote:
> On Tue, 2010-10-26 at 10:53 +0200, Raymond Dijkxhoorn wrote:
>> As for your question, why don't you use a regexp?
>>
>> uri url_1 /www.domain(1|2|3|4).com/
>
> The other technique you can use is meta rules
>
> uri __MY_BL_001 /example.(com|net)/
> uri __MY_BL_002 /example.org/
> meta MY_BL __MY_BL_001 || __MY_BL_002
> score MY_BL 10.0

The OP wasn't clear whether he wanted ten points _per URI hit_. If that's the case, the regex alternatives and meta solutions aren't appropriate and there's no way to avoid one score line per URI rule.

--
John Hardin KA7OHZ  http://www.impsec.org/~jhardin/
jhar...@impsec.org  FALaholic #11174  pgpk -a jhar...@impsec.org
key: 0xB8732E79 -- 2D8C 34F4 6411 F507 136C AF76 D822 E6E6 B873 2E79
---
...the Fates notice those who buy chainsaws...
    -- www.darwinawards.com
---
5 days until Halloween
Re: List of urls
On Tue, 26 Oct 2010, Richard Smits wrote:
> Does anyone know if it's possible to have a list of URLs and define a
> score for all of them in one line? At the moment I do it like this:
>
> uri url_1 /www.domain1.com/
> uri url_2 /www.domain2.com/
> uri url_3 /www.domain3.com/
> uri url_4 /www.domain4.com/
> score url_1 10
> score url_2 10
> score url_3 10
> score url_4 10
>
> But I want just one line to define the score. Are there other ways to
> do this?

Do you want ten points total if _any_ targeted URI hits, or ten points for each targeted URI that hits, regardless of how many hit? The latter is what you are doing above.

--
John Hardin
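
The difference between the two readings can be sketched with a toy scoring model in Python. This only models the arithmetic and is not how SpamAssassin evaluates rules internally. The dictionary `rules`, the two helper functions and the sample URIs are assumptions for illustration.

```python
import re

# Toy stand-ins for the four separate uri rules (dots escaped here).
rules = {
    "url_1": re.compile(r"www\.domain1\.com"),
    "url_2": re.compile(r"www\.domain2\.com"),
    "url_3": re.compile(r"www\.domain3\.com"),
    "url_4": re.compile(r"www\.domain4\.com"),
}


def per_rule_total(uris, score=10):
    """Separate rules: each rule that fires adds its own score once."""
    return sum(score for rx in rules.values()
               if any(rx.search(u) for u in uris))


def any_hit_total(uris, score=10):
    """One combined rule (alternation or meta): score once if anything hits."""
    return score if any(rx.search(u)
                        for rx in rules.values() for u in uris) else 0


message_uris = ["http://www.domain1.com/a", "http://www.domain3.com/b"]
print(per_rule_total(message_uris))  # two rules fire
print(any_hit_total(message_uris))   # one combined hit
```

With two of the four domains present, the separate rules add up to 20 points, while a combined rule gives 10.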
Re: List of urls
On Tue, 2010-10-26 at 10:37 -0700, John Hardin wrote:
> On Tue, 26 Oct 2010, Karsten Bräckelmann wrote:
>> On Tue, 2010-10-26 at 10:53 +0200, Raymond Dijkxhoorn wrote:
>>> As for your question, why don't you use a regexp?
>>>
>>> uri url_1 /www.domain(1|2|3|4).com/
>>
>> The other technique you can use is meta rules
>>
>> uri __MY_BL_001 /example.(com|net)/
>> uri __MY_BL_002 /example.org/
>> meta MY_BL __MY_BL_001 || __MY_BL_002
>> score MY_BL 10.0
>
> The OP wasn't clear whether he wanted ten points _per URI hit_. If
> that's the case, the regex alternatives and meta solutions aren't
> appropriate and there's no way to avoid one score line per URI rule.

What about 'tflags multiple', as in:

uri RULE /(example.(com|net)|example.org|...)/
tflags RULE multiple
score RULE 10

The only (minor) drawback I've found is that the list of firing rules can be filled with RULE, RULE, RULE by the type of spam that contains nothing but tens of lines pushing variations on a theme, such as:

Buy FAMOUS SHOE basketMax
Buy FAMOUS SHOE basketSuper
Buy FAMOUS SHOE basketWimp
Buy FAMOUS SHOE runningMax

Martin
Re: List of urls
On Tue, 26 Oct 2010, Martin Gregorie wrote:
> On Tue, 2010-10-26 at 10:37 -0700, John Hardin wrote:
>> The OP wasn't clear whether he wanted ten points _per URI hit_. If
>> that's the case, the regex alternatives and meta solutions aren't
>> appropriate and there's no way to avoid one score line per URI rule.
>
> What about 'tflags multiple', as in:
>
> uri RULE /(example.(com|net)|example.org|...)/
> tflags RULE multiple
> score RULE 10

You're right. I didn't think of that.

--
John Hardin
Re: List of urls
On Tue, 2010-10-26 at 20:10 +0100, Martin Gregorie wrote:
> On Tue, 2010-10-26 at 10:37 -0700, John Hardin wrote:
>> The OP wasn't clear whether he wanted ten points _per URI hit_. If
>> that's the case, the regex alternatives and meta solutions aren't
>> appropriate and there's no way to avoid one score line per URI rule.
>
> What about 'tflags multiple', as in:
>
> uri RULE /(example.(com|net)|example.org|...)/
> tflags RULE multiple
> score RULE 10
>
> The only (minor) drawback I've found is that the list of firing rules
> can be filled with RULE, RULE, RULE by the type of spam that contains
> nothing but tens of lines pushing variations on a theme, such as:

tflags multiple can be quite dangerous, though, if it directly results in a hit, as in your example. Besides possibly flooding the report, it can also seriously bias the overall score. URI DNSBL hits, for example, do not count how often a domain appears in the spam, but hit only once.

The safest approach for tflags multiple rules is to trigger other rules based on the number of hits. Meta rules explicitly support this.

meta FOO_4 __TFLAGS_MULTIPLE_SUB >= 4

--
Karsten
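
The bias described above can be shown with a few lines of Python arithmetic. This is a toy model of the scoring as described in this thread, not SpamAssassin code, and it assumes the threshold comparison in the meta rule is "at least 4 hits". The function names, the hit count of 30 and the meta score of 2.0 are made up for illustration.

```python
def direct_score(hit_count, per_hit_score):
    """A 'tflags multiple' rule that scores directly: every hit adds up."""
    return hit_count * per_hit_score


def threshold_score(hit_count, threshold, score):
    """A meta rule that fires once when the sub-rule hit count
    reaches the threshold (assumed here to mean 'at least')."""
    return score if hit_count >= threshold else 0.0


# Hypothetical spam with 30 matching lines.
hits = 30
print(direct_score(hits, 10))          # grows with every repetition
print(threshold_score(hits, 4, 2.0))   # fixed, however many repetitions
```

In this model the direct rule contributes 300 points for a single message, whereas the threshold meta contributes a fixed 2.0 and nothing at all below four hits.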
Re: List of urls
On Tue, 2010-10-26 at 23:59 +0200, Karsten Bräckelmann wrote:
> The safest approach for tflags multiple rules is to trigger other
> rules based on the number of hits. Meta rules explicitly support this.
>
> meta FOO_4 __TFLAGS_MULTIPLE_SUB >= 4

Yes, I agree. Equally important is to avoid giant-killing scores. I'd think 1.0 per hit would be as high as you'd ever want to use.

FWIW I have only two multiples: one scores 0.1 per hit and the other uses 1.0. The second one scans for relatively complex phrases that are unlikely to be seen outside advertising blurb or the speech of a sales-droid, and as a consequence multiple hits are fairly rare. It's only multiple to punish outbreaks of salesorrhea and is only used in metas (often with the other multiple, which tags product names and descriptions of stuff I'd never buy).

I'm a private user, not an ISP: can you tell? :-)

Martin