I'm having some odd issues with squidGuard.  First off, I have a decent
sized blacklist... about half a million domains, and a few tens of
thousands of URLs, in two files... realdomains and realurls.

squidGuard was taking 20-25 minutes to start up with the domains
unsorted and in separate files.  I wrote a quick C++ job that trims the
subdomains, going from a.b.c.d.e.f.iloveadsandporn.com to
f.iloveadsandporn.com, then sorted the resulting list and ran it
through uniq.  1.37 million entries were pared down to about half a
million.  Not bad, I thought... this should speed things up, right?
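
For anyone wanting to reproduce the trimming step, here is a rough Python sketch of the idea (not the original C++ job).  It assumes "cutting down" means keeping only the rightmost three labels, as in the example above, and that IP addresses pass through unchanged; the names trim_domain, reduce_list and keep_labels are made up for illustration.

```python
import ipaddress


def trim_domain(entry, keep_labels=3):
    """Keep only the rightmost `keep_labels` labels of a domain name.

    Assumption: three labels matches the example in the post
    (a.b.c.d.e.f.iloveadsandporn.com -> f.iloveadsandporn.com).
    """
    entry = entry.strip()
    try:
        # IP addresses are not domains; leave them untouched.
        ipaddress.ip_address(entry)
        return entry
    except ValueError:
        pass
    labels = entry.split(".")
    return ".".join(labels[-keep_labels:])


def reduce_list(entries, keep_labels=3):
    """Trim every entry, drop blanks and duplicates, and return a sorted list."""
    trimmed = {trim_domain(e, keep_labels) for e in entries if e.strip()}
    return sorted(trimmed)


if __name__ == "__main__":
    sample = [
        "a.b.c.d.e.f.iloveadsandporn.com",
        "x.f.iloveadsandporn.com",
        "10.1.2.3",
    ]
    for line in reduce_list(sample):
        print(line)
```

The set plus sorted() stands in for the sort-and-uniq pass; the ordering is by code point, which may differ from what a locale-aware sort produces.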

Wrong.  Instead of taking 20-25 minutes, I let the sucker run for over
six hours before finally deciding there *had* to be something wrong
here.  Looking at my logs from this morning, with five child processes,
it started at 03:45 and was ready for requests at 06:55.

After googling about for a while, I found information about pre-building
the B-trees.  I decided to try building them on my much faster system, a
1.7GHz P4.  It took but a few seconds.  I moved over the resulting .db
files and... no go.

2003-12-13 16:01:07 [401] loading dbfile /usr/local/squidGuard/db/realdomains.db
2003-12-13 16:01:07 [401] Error db_open: Invalid argument

After checking, I found the faster box is using Berkeley DB 4.0.14
whereas the actual squid server is using Berkeley DB 2.7.7... so it
makes sense that it wouldn't work for me.

So, I decided to build them on the squid server itself.  That seemed to
work; it took several minutes (not hours, though!)... I fired up squid,
and saw this in the squidGuard log:

2003-12-13 16:27:50 [822] init domainlist /usr/local/squidGuard/db/realdomains
2003-12-13 16:27:50 [822] loading dbfile /usr/local/squidGuard/db/realdomains.db
2003-12-13 16:27:50 [822] domainlist empty, removed from memory
2003-12-13 16:27:50 [822] init urllist /usr/local/squidGuard/db/realurls
2003-12-13 16:27:50 [822] loading dbfile /usr/local/squidGuard/db/realurls.db
2003-12-13 16:27:50 [822] urllist empty, removed from memory
2003-12-13 16:27:50 [822] squidGuard 1.2.0 started (1071361670.060)
2003-12-13 16:27:50 [822] squidGuard ready for requests (1071361670.094)

Well, that's no good.  The realdomains.db file is 21MB in size, the
realurls.db file 2MB in size.

Any ideas what I'm doing wrong here?  The realdomains blacklist has been
fairly sanitized, I think... there shouldn't be anything in there to
cause problems anywhere.  There are no URLs in there, only domains and
IPs....
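
In case it helps anyone double-check a list the same way, below is a small Python sketch of the kind of sanity pass meant by "sanitized".  The checks are only guesses at what might be suspect in a domain list (blank lines, a "/" that suggests a URL, embedded whitespace, stray carriage returns from DOS line endings); nothing here is known to be what squidGuard itself rejects, and find_suspect_entries is a made-up name.

```python
def find_suspect_entries(lines):
    """Return (line_number, text) pairs that do not look like a bare domain or IP.

    Assumed heuristics only: blank lines, "/" (looks like a URL),
    spaces or tabs, and carriage returns left over from DOS line endings.
    """
    suspects = []
    for number, raw in enumerate(lines, start=1):
        text = raw.rstrip("\n")
        if not text or any(ch in text for ch in "/ \t\r"):
            suspects.append((number, text))
    return suspects


if __name__ == "__main__":
    sample = ["example.com\n", "example.com/ads\n", "10.0.0.1\n"]
    for number, text in find_suspect_entries(sample):
        print(number, repr(text))
```

To run it against the real file, pass an open file object instead of the sample list, for example open("realdomains").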

I'd appreciate any help or pointers!  I can provide any info that would
help.  The squidGuard -v line is here:

SquidGuard: 1.2.0 Sleepycat Software: Berkeley DB 2.7.7: (08/20/99)

Do I perhaps need to recompile against a newer Berkeley DB release?  I
thought there was no 3.0 or 4.0 support, although Gentoo built it with
4.0.14 somehow...

Rob
