> Can I see your script ?
Sure.
I've attached the script as update_blacklist.txt (I added the .txt to
make it a little easier to manage as an attachment). I'll resend it as
in-line text if needed; just let me know.
The script has a good bit of in-line documentation; here are some
additional comments:
## The script uses wget to download the blacklists. It's an excellent
program and very easy to install; the link to download wget is in the
script's comments.
## It should run for you as is if perl is available on your path;
otherwise, put the full path to perl in the command. You can also
accomplish the same thing using wc instead of perl (I used perl for the
learning experience). Let me know if you want the wc commands to swap
in.
## The diff file processing is only changed in the porn directory. To
prepare your porn blacklists for this script:
a) Make backup copies of /porn/domains, /porn/urls, /porn/domains.diff,
and /porn/urls.diff.
b) Rename /porn/domains.diff to /porn/domains_diff.local.
c) Rename /porn/urls.diff to /porn/urls_diff.local.
d) Create the /porn/archive directory and chown it to squid.squid.
e) Create the /porn/stats directory and chown it to squid.squid.
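Steps a) through e) can be sketched in Python, in case that is easier to follow than doing them by hand. This is not the attached script, just a minimal illustration: the function name prepare_porn_dir, the ".bak" backup suffix, and the optional owner argument are all assumptions of mine.

```python
import shutil
from pathlib import Path


def prepare_porn_dir(porn_dir, owner=None):
    """Prepare a porn blacklist directory (steps a-e above).

    porn_dir: path to the porn blacklist directory.
    owner:    hypothetical option; if given (e.g. "squid"), the new
              directories are chowned to owner.owner (needs root).
    """
    porn = Path(porn_dir)

    # a) Back up the four files; the ".bak" suffix is an assumption.
    for name in ("domains", "urls", "domains.diff", "urls.diff"):
        src = porn / name
        if src.exists():
            shutil.copy2(src, porn / (name + ".bak"))

    # b) and c) Rename the .diff files to *_diff.local.
    for stem in ("domains", "urls"):
        old = porn / (stem + ".diff")
        if old.exists():
            old.rename(porn / (stem + "_diff.local"))

    # d) and e) Create the archive and stats directories.
    for sub in ("archive", "stats"):
        target = porn / sub
        target.mkdir(exist_ok=True)
        if owner:
            shutil.chown(target, user=owner, group=owner)
```

Calling prepare_porn_dir("/blacklists/porn", owner="squid") as root would cover all five steps; leave owner out to try it on a scratch copy first.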
## I hope the comments make the program flow understandable. If not, let
me know.
## Each time the script is run it will produce a stats file in the
/porn/stats directory. Here's an example:
cat 2001-11-28_050500_stats

Blacklist Line Counts for Wed 28 Nov 2001 05:05:00 CST

Squidguard blacklist files as downloaded
----------------------------------------
81607 /blacklists/porn/domains
27750 /blacklists/porn/urls

University Toulouse blacklist files as downloaded
-------------------------------------------------
133308 /adult/domains
7399 /adult/urls

Local _diff.local files
-----------------------
912 /blacklists/porn/domains_diff.local
19 /blacklists/porn/urls_diff.local

Local to_add and to_delete files
--------------------------------
880 /blacklists/porn/domains.to_add
32 /blacklists/porn/domains.to_delete
2 /blacklists/porn/urls.to_add
17 /blacklists/porn/urls.to_delete

Combined adult, blacklist and to_add files, deduped
---------------------------------------------------
170843 /blacklists/porn/domains.merged
34878 /blacklists/porn/urls.merged

After removing the contents of the to_delete files
--------------------------------------------------
170828 /blacklists/porn/domains.adjusted
34870 /blacklists/porn/urls.adjusted

Final production files
----------------------
170828 /blacklists/porn/domains
34870 /blacklists/porn/urls

The stats files are obviously not a requirement; feel free to comment
out those sections as well as the archive sections.
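If you would rather build a stats section yourself, here is a small Python sketch of the idea: count the lines in each file and print them under an underlined title, in the same layout as the example above. It is not taken from the attached script (which uses perl for the counting); count_lines and stats_section are names I made up.

```python
def count_lines(path):
    """Count the lines in a file.

    Unlike `wc -l`, this also counts a final line that has no
    trailing newline.
    """
    with open(path, "rb") as fh:
        return sum(1 for _ in fh)


def stats_section(title, paths):
    """Return one stats section: title, dashed underline, then
    one "<count> <path>" line per file."""
    lines = [title, "-" * len(title)]
    for path in paths:
        lines.append(f"{count_lines(path)} {path}")
    return "\n".join(lines) + "\n"
```

For example, stats_section("Final production files", ["/blacklists/porn/domains", "/blacklists/porn/urls"]) returns the last block of the stats file as a string, ready to append to a file in the stats directory.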
## Put your +additions and -deletions in the domains_diff.local and
urls_diff.local files. There is no need for additional housekeeping to
remove old entries. It doesn't matter if nastysite.com is already in the
domains file, or how many times +nastysite.com appears in the
domains_diff.local file; it will appear as one entry in the final
domains file. Likewise, it doesn't matter how many times -yahoo.com
appears in the domains_diff.local file, or whether it appears in the
domains file at all; the script will take the expected action.
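To make the add/delete behavior concrete, here is a minimal Python sketch of the same flow shown in the stats file: split the _diff.local entries into to_add and to_delete, merge and dedupe the downloaded lists with to_add, then remove everything in to_delete. This is an illustration, not the attached script; split_diff and apply_diff are hypothetical names, and the sorted output is my choice. Because removal comes last, an entry listed with both "+" and "-" ends up deleted in this sketch.

```python
def split_diff(lines):
    """Split _diff.local lines into (to_add, to_delete) sets.

    Lines starting with "+" are additions, lines starting with "-"
    are deletions; anything else is ignored.
    """
    to_add, to_delete = set(), set()
    for raw in lines:
        entry = raw.strip()
        if entry.startswith("+") and len(entry) > 1:
            to_add.add(entry[1:])
        elif entry.startswith("-") and len(entry) > 1:
            to_delete.add(entry[1:])
    return to_add, to_delete


def apply_diff(downloaded_lists, diff_lines):
    """Merge the downloaded lists with the additions (deduped),
    then drop the deletions. Returns a sorted list."""
    to_add, to_delete = split_diff(diff_lines)
    merged = set(to_add)
    for entries in downloaded_lists:
        merged.update(e.strip() for e in entries if e.strip())
    adjusted = merged - to_delete
    return sorted(adjusted)


downloaded = [["nastysite.com", "yahoo.com"], ["example.org"]]
diff_local = ["+nastysite.com", "+nastysite.com",
              "-yahoo.com", "-yahoo.com", "-notlisted.com"]
print(apply_diff(downloaded, diff_local))
# prints ['example.org', 'nastysite.com']
```

Using sets is what makes repeated "+" and "-" entries harmless: duplicates collapse on insert, and removing a name that was never in the list is a no-op.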
## I also use another version of this script with the blacklist
downloads commented out to process additions/deletions without taking
the time to download the files again.
I hope you find this helpful.
Rick
-----Original Message-----
From: Sean O'Neill [mailto:[EMAIL PROTECTED]]
Sent: Thursday, November 29, 2001 11:18 AM
To: Rick Matthews; Squidguard Mailing List
Subject: RE: SquidGuard Newbie questions
At 10:29 AM 11/29/2001 -0600, Rick Matthews wrote:
>Well, I guess that would confuse you! The entries should have "+" in
>front of them, not "-". They are, of course, in my urls.diff file with
>"+". I apologize for adding to the confusion!
No prob :) Confusion gone.
> > I'm new to squidGuard so I'm learning. Right now in my
> > newbie-ness, I figure by deleting and then adding my
> > entries back I save myself the headache of trying to remember
> > what entries I added in the past without having to use a
> > TON of .diff versioned files.
>FWIW, I have not experienced any problems by letting them get added
>multiple times.
I'll change it to just add then. Makes updating the .diff much easier.
>The vast majority of my changes are made in the porn database, and I
>have written a script to process the changes, so I don't need to worry
>about the issue that you mentioned. I make my changes to
>domains_diff.local and urls_diff.local, and the script takes it from
>there. It would be fairly easy to extend the processing to the other
>blacklist categories, I just haven't done it.
Can I see your script ?
> > +nytimes.com/realmedia/ads
>
>FWIW, in *nix, "/realmedia" is not the same thing as "/RealMedia".
Yeah, I'm a UNIX guy ... no need for explanation.
>I agree; I can't explain it. (But that doesn't stop me from blocking it
>using the other method! ;-))
You would agree this is a pain in the arss.
Sean
-
........................................................
......... ..- -. .. -..- .-. ..- .-.. . ... ............
.-- .. -. -... .-.. --- .-- ... -.. .-. --- --- .-.. ...
Sean O'Neill
update_blacklists.txt
Description: Binary data
