On Sat, 15 Jul 2017, Antony Stone wrote:
On Saturday 15 July 2017 at 11:19:54, mastered wrote:
Hi Nicola,
I'm not good at shell scripting, but this might work:
1 - Save file into lista.txt
2 - transform lista.txt into SpamAssassin rules:
sed -e 's|^http://||' -e 's|/.*||' -e 's/\./\\./g' lista.txt |
nl | while read nr host ; do
  printf 'uri RULE_NR_%s /%s\\b/i\n' "$nr" "$host"
  printf 'describe RULE_NR_%s URL present in the ransomware blacklist\n' "$nr"
  printf 'score RULE_NR_%s 5.0\n' "$nr"
done > blacklist.cf
[snip..]
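For reference, here is what a single entry looks like after the URL-to-regex transformation quoted above. The sed chain below is a simplified equivalent of that pipeline, and the URL is a made-up sample:

```shell
#!/bin/sh
# One hypothetical blacklist entry run through an equivalent sed chain:
# strip the scheme, drop the path, escape dots, wrap in /.../i with \b.
printf 'http://evil.example.com/payload.exe\n' |
  sed -e 's|^http://||' -e 's|/.*||' -e 's/\./\\./g' \
      -e 's|^|/|' -e 's|$|\\b/i|'
# prints: /evil\.example\.com\b/i
```

That regex is then pasted into a generated "uri RULE_NR_n ..." line for each blacklist entry.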
One observation: that list has over 10,000 entries, which means you're going
to be adding thousands of additional rules to SA on an automated basis.
In the past, other people worked up automated mechanisms to add large
numbers of rules derived from example spam messages (Hi Chris ;) and there
were performance issues (a significant increase in SA load time, memory
usage, etc).
Be aware that you may run into the same situation. Using a URI-DNSBL avoids that risk.
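To illustrate the URI-DNSBL alternative: if the list were published as a DNS zone, SpamAssassin's URIDNSBL plugin (loaded by default via init.pre) could query it with a handful of rules instead of thousands. The zone name below is hypothetical; substitute the blacklist's real RHSBL zone if one exists:

```
# Hypothetical RHSBL zone name; the lookup happens at scan time via DNS,
# so no rules need to be regenerated when the list changes.
urirhssub  RW_URIBL  rw-blacklist.example.net.  A  2
body       RW_URIBL  eval:check_uridnsbl('RW_URIBL')
describe   RW_URIBL  Contains a URL listed in the ransomware blacklist
score      RW_URIBL  5.0
```

This keeps the rule set constant in size and pushes the freshness problem to the zone operator.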
I see that list gets updated frequently. How quickly do stale entries get
removed from it?
I couldn't find a policy statement about that, other than the note about the
30-day retention for the RW_IPBL list.
Checking a random sample of the URLs on that list, the majority of them hit
404 errors.
If that list grows without bound and isn't periodically pruned of stale entries,
it will become problematic for automated rule generation.
I'm not saying that this isn't an idea worth pursuing, just be aware there may
be issues.
--
Dave Funk University of Iowa
<dbfunk (at) engineering.uiowa.edu> College of Engineering
319/335-5751 FAX: 319/384-0549 1256 Seamans Center
Sys_admin/Postmaster/cell_admin Iowa City, IA 52242-1527
#include <std_disclaimer.h>
Better is not better, 'standard' is better. B{