On Sat, 15 Jul 2017, Antony Stone wrote:

On Saturday 15 July 2017 at 11:19:54, mastered wrote:

Hi Nicola,

I'm not good at shell scripting, but this might be fine:

1 - Save the file as lista.txt

2 - transform lista.txt into SpamAssassin rules:

# strip scheme and path, escape dots, wrap in /.../i, number lines, emit rules
sed -e 's|^https\{0,1\}://||' -e 's|/.*||' -e 's/\./\\./g' \
    -e 's|^|/|' -e 's|$|\\b/i|' lista.txt \
| nl \
| awk '{
    print "uri      RULE_NR_" $1 " " $2
    print "describe RULE_NR_" $1 " URL present in the ransomware blacklist"
    print "score    RULE_NR_" $1 " 5.0"
  }' > blacklist.cf
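
For illustration only (example.com is a placeholder, not an entry from the real list), a list line like http://example.com/payload.exe would come out of the pipeline above as three lines in blacklist.cf:

uri      RULE_NR_1 /example\.com\b/i
describe RULE_NR_1 URL present in the ransomware blacklist
score    RULE_NR_1 5.0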
[snip..]

One observation: that list has over 10,000 entries, which means you're going to be adding thousands of additional rules to SA on an automated basis.

Some time in the past, other people worked up automated mechanisms to add large numbers of rules derived from example spam messages (Hi Chris!), and there were performance issues: a significant increase in SA load time, memory usage, etc.
Be aware that you may run into the same situation. Using a URI-dnsbl avoids that risk.
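
As a rough sketch of that alternative, assuming you loaded the list into a local DNS zone served by something like rbldnsd (the zone name rw.uribl.local below is hypothetical), a single lookup rule replaces all 10,000+ pattern rules:

urirhssub   RW_URIBL  rw.uribl.local.  A  2
body        RW_URIBL  eval:check_uridnsbl('RW_URIBL')
describe    RW_URIBL  Contains a URL listed in the local ransomware URIBL
tflags      RW_URIBL  net
score       RW_URIBL  5.0

This is the same mechanism the stock URIBL_* rules use, so per-message cost is one DNS lookup per unique domain rather than thousands of regex evaluations.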

I see that list gets updated frequently. How quickly do stale entries get removed from it? I couldn't find a policy statement about that other than the note about the 30-day retention for the RW_IPBL list. Checking a random sample of the URLs on that list, the majority of them hit 404 errors. If that list grows without bound and isn't periodically pruned of stale entries, it will become problematic for automated rule generation.
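
For anyone who wants to reproduce that spot check, a quick sketch (the sample size of 20 is arbitrary):

# sample 20 URLs from the list and report the HTTP status code for each
shuf -n 20 lista.txt | while read -r url; do
    code=$(curl -s -o /dev/null --max-time 10 -w '%{http_code}' "$url")
    echo "$code  $url"
done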

I'm not saying that this isn't an idea worth pursuing, just be aware there may be issues.

--
Dave Funk                                  University of Iowa
<dbfunk (at) engineering.uiowa.edu>        College of Engineering
319/335-5751   FAX: 319/384-0549           1256 Seamans Center
Sys_admin/Postmaster/cell_admin            Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{
