Re: [squid-users] problem with squidGuard redirect page after upgrading squid

2016-01-08 Thread Marcus Kool



On 01/07/2016 06:48 PM, Jason Haar wrote:

On 08/01/16 01:56, Marcus Kool wrote:

Can you explain what the huge number of regexes is used for?

malware URLs. I'm scraping them from publicly available sources like
phishtank and malwaredomains.com. Ironically, they don't need to be regexes
- but squid only has a "url_regex" acl type - so regex it is (can't use
dstdomain because we want to block "http://good.site/bad.url" - not all
of "good.site").



ufdbGuard always blocks "longer URLs", so if the database contains
www.example.com/foo, ufdbGuard also blocks www.example.com/foo?bar=1
and www.example.com/foobar.html; no regexes are required.
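
To illustrate the matching idea (a toy Python sketch, not ufdbGuard's
actual database code; the blocklist entry is hypothetical): any URL that
starts with a listed entry is blocked, so suffix variants need no regexes.

    # Toy prefix-match blocker; not ufdbGuard's real lookup structure.
    # Requires Python 3.9+ for str.removeprefix.
    BLOCKED = ("www.example.com/foo",)  # hypothetical database entry

    def is_blocked(url: str) -> bool:
        bare = url.removeprefix("http://").removeprefix("https://")
        return any(bare.startswith(entry) for entry in BLOCKED)

    assert is_blocked("http://www.example.com/foo?bar=1")
    assert is_blocked("http://www.example.com/foobar.html")
    assert not is_blocked("http://www.example.com/other")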

Marcus


Re: [squid-users] problem with squidGuard redirect page after upgrading squid

2016-01-07 Thread Marcus Kool



On 01/07/2016 12:31 AM, Jason Haar wrote:

On 06/01/16 00:04, Amos Jeffries wrote:

Yes. Squid always has been able to, given enough RAM. Squid stores most
ACLs in memory as splay trees, so entries are sorted by frequency of use,
which is dynamically adapted over time. Regexes are pre-parsed and
aggregated together for reduced matching instead of being re-interpreted
and parsed per-request.

Great to hear. I've got some 600,000+ entry domain lists (i.e. dstdomain) and
60,000+ entry url lists (i.e. url_regex) acls, and there are a couple of
"gotchas" I've picked up during testing.


Squid has regex optimisation that was donated by me and is essentially a
copy of what had already been working for a long time in ufdbGuard.
The POSIX standard does not limit the number of regexes, so you can have an
"unlimited" (limited only by hardware resources) number of regexes.


1. at startup squid reports "WARNING: there are more than 100 regular
expressions. Consider using less REs". Is that now legacy and ignorable?
(should that be removed?). Obviously I have over 60,000 REs.
2. making any change to squid and restarting/reconfiguring it now means
I'm seeing a 12-second outage as squid reads those acls off SSD
drives, parses them, etc. With squidGuard that outage is hidden because
squidGuard uses indexed files instead of the raw files, and that
parsing etc. can be done offline. That behavioral change is pretty
dramatic: making a minor, unrelated change to squid now involves a
10+sec outage (instead of <1sec). I'd say "outsourcing" this kind of
function to another process (such as url_rewriter or ICAP) still has
its advantages ;-)


ufdbGuard is 98% compatible with squidGuard and is free open source
software with regular updates.
ufdbGuard is also very fast due to a new database format optimised
for URLs.

As with squidGuard, when a new config is loaded by ufdbGuard, the web proxy
keeps on working without any interruption for the end user.

Can you explain what the huge number of regexes is used for?

Marcus


Re: [squid-users] problem with squidGuard redirect page after upgrading squid

2016-01-07 Thread Amos Jeffries
On 8/01/2016 9:48 a.m., Jason Haar wrote:
> On 08/01/16 01:56, Marcus Kool wrote:
>> Can you explain what the huge number of regexes is used for?
> malware URLs. I'm scraping them from publicly available sources like
> phishtank and malwaredomains.com. Ironically, they don't need to be regexes
> - but squid only has a "url_regex" acl type - so regex it is (can't use
> dstdomain because we want to block "http://good.site/bad.url" - not all
> of "good.site").
> 

But you do want to block all of http://good.site/bad\.url.*, right?

Otherwise the malware can get around the protection trivially just by
adding a meaningless suffix to it.

With all the scraping, are you also filtering for duplicates and reducing
multiple URLs in one domain down to fewer entries?
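
For concreteness, both steps - anchoring each scraped URL so a trivial
suffix is still caught, and collapsing duplicates and URL-heavy hosts -
might look like this hypothetical Python cleanup pass (not anyone's actual
tooling; the collapse threshold is made up):

    import re
    from collections import defaultdict

    def normalize(url):
        # Strip the scheme so http/https variants de-duplicate.
        return re.sub(r"^https?://", "", url.strip())

    def build_patterns(feed, collapse_threshold=5):
        by_host = defaultdict(set)
        for url in feed:
            bare = normalize(url)
            by_host[bare.split("/", 1)[0]].add(bare)
        patterns = []
        for host, urls in sorted(by_host.items()):
            if len(urls) >= collapse_threshold:
                # Many bad URLs on one host: one host-wide entry is cheaper.
                patterns.append(r"^https?://" + re.escape(host) + "/")
            else:
                # Escape dots and anchor as a prefix: bad.url?x=1 still matches.
                patterns.extend(r"^https?://" + re.escape(u) for u in sorted(urls))
        return patterns

    print(build_patterns(["http://good.site/bad.url", "https://good.site/bad.url"]))
    # ['^https?://good\\.site/bad\\.url']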

Amos



Re: [squid-users] problem with squidGuard redirect page after upgrading squid

2016-01-07 Thread Jason Haar
On 08/01/16 18:36, Amos Jeffries wrote:
> But you do want to block all of http://good.site/bad\.url.*, right?
>
> Otherwise the malware can get around the protection trivially just by
> adding a meaningless suffix to it.

You are totally right - good catch :-)

>
> With all the scraping, are you also filtering for duplicates and reducing
> multiple URLs in one domain down to fewer entries?

Yeah - no dupes - but no manual reading to figure out patterns
either. That would take a human eye - and I want set-and-forget automation.

-- 
Cheers

Jason Haar
Corporate Information Security Manager, Trimble Navigation Ltd.
Phone: +1 408 481 8171
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1



Re: [squid-users] problem with squidGuard redirect page after upgrading squid

2016-01-07 Thread Jason Haar
On 08/01/16 01:56, Marcus Kool wrote:
> Can you explain what the huge number of regexes is used for?
malware URLs. I'm scraping them from publicly available sources like
phishtank and malwaredomains.com. Ironically, they don't need to be regexes
- but squid only has a "url_regex" acl type - so regex it is (can't use
dstdomain because we want to block "http://good.site/bad.url" - not all
of "good.site").

-- 
Cheers

Jason Haar
Corporate Information Security Manager, Trimble Navigation Ltd.
Phone: +1 408 481 8171
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1



Re: [squid-users] problem with squidGuard redirect page after upgrading squid

2016-01-06 Thread Eliezer Croitoru

On 07/01/2016 04:31, Jason Haar wrote:

On 06/01/16 00:04, Amos Jeffries wrote:

Yes. Squid always has been able to, given enough RAM. Squid stores most
ACLs in memory as splay trees, so entries are sorted by frequency of use,
which is dynamically adapted over time. Regexes are pre-parsed and
aggregated together for reduced matching instead of being re-interpreted
and parsed per-request.

Great to hear. I've got some 600,000+ entry domain lists (i.e. dstdomain) and
60,000+ entry url lists (i.e. url_regex) acls, and there are a couple of
"gotchas" I've picked up during testing.


Commercial paid lists?


1. at startup squid reports "WARNING: there are more than 100 regular
expressions. Consider using less REs". Is that now legacy and ignorable?
(should that be removed?). Obviously I have over 60,000 REs.



2. making any change to squid and restarting/reconfiguring it now means
I'm seeing a 12-second outage as squid reads those acls off SSD
drives, parses them, etc. With squidGuard that outage is hidden because
squidGuard uses indexed files instead of the raw files, and that
parsing etc. can be done offline. That behavioral change is pretty
dramatic: making a minor, unrelated change to squid now involves a
10+sec outage (instead of <1sec). I'd say "outsourcing" this kind of
function to another process (such as url_rewriter or ICAP) still has
its advantages ;-)


I have been working for a while on SquidBlocker, which is a filtering
engine/DB that has a built-in ICAP service.
My plan is to publish the stable version 1.0 at the end of the
month (31/01/2016) after a couple of long months of testing in production.

It's not open source but parts of it are.

One of the main points with it was blazing-fast online updates with
almost zero downtime, if any at all; i.e. a restart doesn't really exist (the
only reason I have restarted the service was for an update).


Eliezer


Re: [squid-users] problem with squidGuard redirect page after upgrading squid

2016-01-06 Thread Jason Haar
On 06/01/16 00:04, Amos Jeffries wrote:
> Yes. Squid always has been able to, given enough RAM. Squid stores most
> ACLs in memory as splay trees, so entries are sorted by frequency of use,
> which is dynamically adapted over time. Regexes are pre-parsed and
> aggregated together for reduced matching instead of being re-interpreted
> and parsed per-request.
Great to hear. I've got some 600,000+ entry domain lists (i.e. dstdomain) and
60,000+ entry url lists (i.e. url_regex) acls, and there are a couple of
"gotchas" I've picked up during testing.

1. at startup squid reports "WARNING: there are more than 100 regular
expressions. Consider using less REs". Is that now legacy and ignorable?
(should that be removed?). Obviously I have over 60,000 REs.
2. making any change to squid and restarting/reconfiguring it now means
I'm seeing a 12-second outage as squid reads those acls off SSD
drives, parses them, etc. With squidGuard that outage is hidden because
squidGuard uses indexed files instead of the raw files, and that
parsing etc. can be done offline. That behavioral change is pretty
dramatic: making a minor, unrelated change to squid now involves a
10+sec outage (instead of <1sec). I'd say "outsourcing" this kind of
function to another process (such as url_rewriter or ICAP, as sketched
below) still has its advantages ;-)
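
A minimal sketch of that outsourcing in squid.conf (the helper path is
hypothetical; the directives are standard). The helper owns the blocklists,
so a "squid -k reconfigure" no longer re-parses them, and the helper can
rebuild its indexes offline:

    # Hand URL filtering to an external helper process.
    url_rewrite_program /usr/local/bin/url-filter-helper
    # Pre-start helpers and allow pipelined lookups to keep latency low.
    url_rewrite_children 20 startup=5 idle=2 concurrency=10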

-- 
Cheers

Jason Haar
Corporate Information Security Manager, Trimble Navigation Ltd.
Phone: +1 408 481 8171
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1



Re: [squid-users] problem with squidGuard redirect page after upgrading squid

2016-01-05 Thread Jason Haar
On 31/12/15 23:43, Amos Jeffries wrote:
>  But that said, everything SG provides a current Squid can also do
> (maybe better) by itself.
Hi Amos

Are you saying the squid acl model can support (say) 100M-entry acl lists?
The main feature of the squidGuard redirector was that it had indexed files
that allowed rapid searching for matches - is this done within squid
now? (presumably it wasn't some time ago?) If so, is it done in
memory or via the acl files (a la SG)? The former would mean a much slower
squid startup.

Thanks

-- 
Cheers

Jason Haar
Corporate Information Security Manager, Trimble Navigation Ltd.
Phone: +1 408 481 8171
PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1



Re: [squid-users] problem with squidGuard redirect page after upgrading squid

2016-01-05 Thread Amos Jeffries
On 5/01/2016 10:39 p.m., Jason Haar wrote:
> On 31/12/15 23:43, Amos Jeffries wrote:
>>  But that said, everything SG provides a current Squid can also do
>> (maybe better) by itself.
> Hi Amos
> 
> Are you saying the squid acl model can support (say) 100M-entry acl lists?
> The main feature of the squidGuard redirector was that it had indexed files
> that allowed rapid searching for matches - is this done within squid
> now? (presumably it wasn't some time ago?) If so, is it done in
> memory or via the acl files (a la SG)? The former would mean a much slower
> squid startup.
> 

Yes. Squid always has been able to, given enough RAM. Squid stores most
ACLs in memory as splay trees, so entries are sorted by frequency of use,
which is dynamically adapted over time. Regexes are pre-parsed and
aggregated together for reduced matching instead of being re-interpreted
and parsed per-request.
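
As a toy illustration of that self-adjusting behaviour (a generic splay
tree in Python; this is not Squid's code), each lookup rotates the accessed
key toward the root, so frequently used entries stay cheap to reach:

    class Node:
        def __init__(self, key):
            self.key, self.left, self.right = key, None, None

    def _rot_right(x):          # left child becomes the new subtree root
        y = x.left
        x.left, y.right = y.right, x
        return y

    def _rot_left(x):           # right child becomes the new subtree root
        y = x.right
        x.right, y.left = y.left, x
        return y

    def splay(root, key):
        # Standard recursive splay: bring key (or the last node on its
        # search path) to the root.
        if root is None or root.key == key:
            return root
        if key < root.key:
            if root.left is None:
                return root
            if key < root.left.key:                    # zig-zig
                root.left.left = splay(root.left.left, key)
                root = _rot_right(root)
            elif key > root.left.key:                  # zig-zag
                root.left.right = splay(root.left.right, key)
                if root.left.right:
                    root.left = _rot_left(root.left)
            return _rot_right(root) if root.left else root
        else:
            if root.right is None:
                return root
            if key > root.right.key:                   # zig-zig
                root.right.right = splay(root.right.right, key)
                root = _rot_left(root)
            elif key < root.right.key:                 # zig-zag
                root.right.left = splay(root.right.left, key)
                if root.right.left:
                    root.right = _rot_right(root.right)
            return _rot_left(root) if root.right else root

    root = Node(8)
    root.left, root.right = Node(4), Node(12)
    root = splay(root, 12)
    assert root.key == 12       # the hot key is now the root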

SquidGuard is from the era when servers only had hundreds of MB of RAM, not
tens of GB, so storing things on disk in files made sense. With OS-level
file caching in memory that can look like fast ACLs - but in reality it
is still slower than directly accessing the listed value in RAM, where
the entries are stored in a format that can be quickly tested against
the on-wire protocol data, not to mention the Squid<->helper protocol
overheads.

Amos



Re: [squid-users] problem with squidGuard redirect page after upgrading squid

2015-12-31 Thread Amos Jeffries

On 2015-12-29 11:46, George Hollingshead wrote:

I've had squid 3.0 running with squidGuard on my old Ubuntu 10.04
system with no problems for a few months now.

I was recently enlightened by Yuri on how to compile using a local
copy of openssl so I could upgrade to the latest squid. This was a
success. Thanx again Yuri :)

The only problem now is that when squidGuard goes to redirect a blocked
page it comes up with something like "URL /block.html192.168.2.20
192.168.2.20/GET page not found".

The local net address there is the computer that is being blocked.

Like I said earlier, this all worked before, redirecting to
http://localhost/block.html when needed, but since I upgraded squid
from Ubuntu's 3.0 to a compiled 3.5 I get this response.

Any ideas? I'm not sure squidGuard has been updated in 6 years.



SG is no longer maintained software. As Marcus already mentioned,
ufdbGuard can be used instead, if you really need the helper at all.


The distro providers which still ship SG are patching their packages
to cope with the newer Squid helper protocol. If you cannot get away
from SG, then you also need to rebuild SG from that newer repository
to maintain the lock-step dependency between them.
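
For concreteness, the protocol change looks roughly like this (a sketch
based on Squid's documented helper protocols; the block URL is the one from
the original post). An unpatched SG still writes the old form, which a
newer Squid may mis-parse:

    # Pre-3.4 redirector reply: just the replacement URL on one line
    # (or an empty line for "no change"):
    http://localhost/block.html

    # 3.4+ url_rewrite helper reply: a result code plus key=value pairs:
    OK status=302 url="http://localhost/block.html"
    # ...or, for "no change":
    ERR
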
But that said, everything SG provides a current Squid can also do
(maybe better) by itself.


Amos



Re: [squid-users] problem with squidGuard redirect page after upgrading squid

2015-12-28 Thread Marcus Kool



On 12/28/2015 08:46 PM, George Hollingshead wrote:

I've had squid 3.0 running with squidGuard on my old Ubuntu 10.04 system with
no problems for a few months now.

I was recently enlightened by Yuri on how to compile using a local copy of
openssl so I could upgrade to the latest squid. This was a success. Thanx
again Yuri :)

The only problem now is that when squidGuard goes to redirect a blocked page it
comes up with something like "URL /block.html192.168.2.20 192.168.2.20/GET
page not found".

The local net address there is the computer that is being blocked.

Like I said earlier, this all worked before, redirecting to
http://localhost/block.html when needed, but since I upgraded squid from
Ubuntu's 3.0 to a compiled 3.5 I get this response.

Any ideas? I'm not sure squidGuard has been updated in 6 years. I realize
this is a squid mailing list, but I could use some info if anyone has
experienced this or has insight!

Thanx guys


You can use ufdbGuard instead of squidGuard.
ufdbGuard is Open Source and updated regularly.
If you do not use ssl-bump, ufdbGuard 1.31 works fine.
If you need ssl-bump, you have to wait for ufdbGuard 1.32, which is expected
to be released in February 2016.

Marcus



[squid-users] problem with squidGuard redirect page after upgrading squid

2015-12-28 Thread George Hollingshead
I've had squid 3.0 running with squidGuard on my old Ubuntu 10.04 system
with no problems for a few months now.

I was recently enlightened by Yuri on how to compile using a local copy
of openssl so I could upgrade to the latest squid. This was a success. Thanx
again Yuri :)

The only problem now is that when squidGuard goes to redirect a blocked page
it comes up with something like "URL /block.html192.168.2.20 192.168.2.20/GET
page not found".

The local net address there is the computer that is being blocked.

Like I said earlier, this all worked before, redirecting to
http://localhost/block.html when needed, but since I upgraded squid from
Ubuntu's 3.0 to a compiled 3.5 I get this response.

Any ideas? I'm not sure squidGuard has been updated in 6 years. I
realize this is a squid mailing list, but I could use some info if anyone
has experienced this or has insight!

Thanx guys