Re: [squid-users] reverse proxy filtering?

2009-04-19 Thread Jeff Sadowski
On Sat, Apr 18, 2009 at 10:24 PM, Amos Jeffries squ...@treenet.co.nz wrote:
 Jeff Sadowski wrote:

 On Sat, Apr 18, 2009 at 5:18 PM, Amos Jeffries squ...@treenet.co.nz
 wrote:

 Jeff Sadowski wrote:

 I'm new to trying to use squid as a reverse proxy.

 I would like to filter out certain pages and if possible certain words.
 I installed perl so that I can use it to rebuild pages if that is
 possible?

 My squid.conf looks like so
  start
 acl all src all
 http_port 80 accel defaultsite=outside.com
 cache_peer inside parent 80 0 no-query originserver name=myAccel
 acl our_sites dstdomain outside.com

 aha, aha, ..

 http_access allow all

 eeek!!

 I want everyone on the outside to see the inside server minus one or
 two pages. Is that not what I set up?

 By lucky chance of some background defaults only, and assuming that the web
 server is highly secure on its own.

 If you have a small set of sites, such as those listed in our_sites, then
 it's best to be certain and use that ACL for the allow as well.

  http_access allow our_sites
  http_access deny all

 ... same on the cache_peer_access below.


 cache_peer_access myAccel allow all
  end
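
Pulling those suggestions together, a tightened squid.conf would look roughly
like this. It only reuses the names already shown above (outside.com, inside,
myAccel, our_sites) and is a sketch, not a tested config:

  http_port 80 accel defaultsite=outside.com
  cache_peer inside parent 80 0 no-query originserver name=myAccel
  acl our_sites dstdomain outside.com
  http_access allow our_sites
  http_access deny all
  cache_peer_access myAccel allow our_sites
  cache_peer_access myAccel deny all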

 how would I add it so that for example

 http://inside/protect.html

 is blocked?

 http://wiki.squid-cache.org/SquidFaq/SquidAcl
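
As a sketch of the kind of rule that FAQ describes (assuming the page really
is served as /protect.html through the accelerated site; untested here):

  acl protected_page urlpath_regex ^/protect\.html$
  http_access deny protected_page

The deny line has to appear before the allow rule for our_sites, because
http_access rules are evaluated in order and the first match wins.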

 so I want redirector_access?
 Is there an example line of this in a file

 I tried using

 url_rewrite_program c:\perl\bin\perl.exe c:\replace.pl

 but I guess that requires more to use it? an acl?
 should acl all src all be acl all redirect all ?

 No to all three. The line you mention trying is all that's needed.

  url_rewrite_access allow all

 but the above should be the default when a url_rewrite_program  is set.
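
If the rewriter ever needed to run for only some requests rather than all of
them, url_rewrite_access can be scoped with an ACL. A rough sketch, with the
ACL name invented here:

  acl rewrite_these dstdomain outside.com
  url_rewrite_access allow rewrite_these
  url_rewrite_access deny all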

so how do you tell it to use the url_rewrite_program only with the inside site?
Or does it use the script on all pages passing through the proxy?

Is this only a rewrite on the requested URL from the web browser?
Ah, that might answer some of my earlier questions. I never tried
clicking on it after implementing the rewrite script. I was only
hovering over the URL and seeing that it was still the same.


 What is making you think it's not working? And what do the logs say about it?
 Also, what is the c:/replace.pl code?


=== start
#!c:\perl\bin\perl.exe
$| = 1;
$replace='<a href="http://inside/login.html">.*?</a>';
$with='no login';
while ($INPUT=<>) {
    $INPUT=~s/$replace/$with/gi;
    print $INPUT;
}
=== end

I think I see the problem now; I guess I am looking for something else
besides url_rewrite, maybe full-text replacement :-/
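
For contrast, a url_rewrite helper never sees page bodies at all: Squid hands
it one request per line on stdin (the URL first, followed by client and method
fields) and reads back the URL to use instead. A minimal sketch of that kind
of helper, with the substitute target made up here:

#!c:\perl\bin\perl.exe
# Sketch of a Squid url_rewrite (redirector) helper: it rewrites request
# URLs only; it cannot edit HTML in the response.
use strict;
use warnings;
$| = 1;                              # answer line-by-line, unbuffered

while (my $line = <STDIN>) {
    chomp $line;
    my ($url) = split /\s+/, $line;  # first field on each line is the URL
    if ($url =~ m{^http://inside/login\.html}i) {
        # send login requests somewhere harmless instead (made-up target)
        print "http://inside/index.html\n";
    } else {
        print "$url\n";              # leave everything else unchanged
    }
}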



 and is it possible to filter/replace certain words on the site

 like replace Albuquerque with Duke City for an example on all pages?

 No. no. no. Welcome to copyright violation hell.

 This was an example. I have full permission to do the real translations.
 I am told to remove certain links/buttons to login pages, thus I
 replace <a href=inside>button</a> with "" (nothing). Currently I have a
 pathetic perl script that doesn't support cookies and is going through
 each set of previous pages to bring up the content. I was hoping squid
 would greatly simplify this.
 I was using WWW::Mechanize; I know this isn't the best way, but they
 just need a quick and dirty way.

 Ah, okay. Well, the only ways Squid has for doing content alteration are far
 too heavyweight for that use as well (coding up an ICAP server and processing
 rules, or a full eCAP adaptor plugin).
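
For the record, the ICAP route in Squid 3.0 is wired up with a handful of
directives along these lines (the service name, class name and port below are
only placeholders, and the actual content editing would happen in a separate
ICAP server you would have to write):

  icap_enable on
  icap_service page_edit respmod_precache 0 icap://127.0.0.1:1344/respmod
  icap_class edit_class page_edit
  icap_access edit_class allow all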

 IMO you need to kick the webapp developers to make their app do the removal
 under the right conditions. It would solve many more problems than having
 different copies of a page available with identical identifiers.

 Amos
 --
 Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
  Current Beta Squid 3.1.0.7



Re: [squid-users] reverse proxy filtering?

2009-04-19 Thread Jeff Sadowski
On Sun, Apr 19, 2009 at 12:29 AM, Jeff Sadowski jeff.sadow...@gmail.com wrote:
 On Sat, Apr 18, 2009 at 10:24 PM, Amos Jeffries squ...@treenet.co.nz wrote:
 Jeff Sadowski wrote:

 On Sat, Apr 18, 2009 at 5:18 PM, Amos Jeffries squ...@treenet.co.nz
 wrote:

 Jeff Sadowski wrote:

 I'm new to trying to use squid as a reverse proxy.

 I would like to filter out certain pages and if possible certain words.
 I installed perl so that I can use it to rebuild pages if that is
 possible?

 My squid.conf looks like so
  start
 acl all src all
 http_port 80 accel defaultsite=outside.com
 cache_peer inside parent 80 0 no-query originserver name=myAccel
 acl our_sites dstdomain outside.com

 aha, aha, ..

 http_access allow all

 eeek!!

 I want everyone on the outside to see the inside server minus one or
 two pages. Is that not what I set up?

 By lucky chance of some background defaults only, and assuming that the web
 server is highly secure on its own.

 If you have a small set of sites, such as those listed in our_sites, then
 it's best to be certain and use that ACL for the allow as well.

  http_access allow our_sites
  http_access deny all

 ... same on the cache_peer_access below.


 cache_peer_access myAccel allow all
  end

 how would I add it so that for example

 http://inside/protect.html

 is blocked?

 http://wiki.squid-cache.org/SquidFaq/SquidAcl

 so I want redirector_access?
 Is there an example line of this in a file

 I tried using

 url_rewrite_program c:\perl\bin\perl.exe c:\replace.pl

 but I guess that requires more to use it? an acl?
 should acl all src all be acl all redirect all ?

 No to all three. The line you mention trying is all that's needed.

  url_rewrite_access allow all

 but the above should be the default when a url_rewrite_program  is set.

 so how do you tell it to use the url_rewrite_program only with the inside site?
 Or does it use the script on all pages passing through the proxy?

 Is this only a rewrite on the requested URL from the web browser?
 Ah, that might answer some of my earlier questions. I never tried
 clicking on it after implementing the rewrite script. I was only
 hovering over the URL and seeing that it was still the same.


 What is making you think it's not working? And what do the logs say about it?
 Also, what is the c:/replace.pl code?


 === start
 #!c:\perl\bin\perl.exe
 $| = 1;
 $replace='<a href="http://inside/login.html">.*?</a>';
 $with='no login';
 while ($INPUT=<>) {
     $INPUT=~s/$replace/$with/gi;
     print $INPUT;
 }
 === end

 I think I see the problem now; I guess I am looking for something else
 besides url_rewrite, maybe full-text replacement :-/



 and is it possible to filter/replace certain words on the site

 like replace Albuquerque with Duke City for an example on all pages?

 No. no. no. Welcome to copyright violation hell.

 This was an example. I have full permission to do the real translations.
 I am told to remove certain links/buttons to login pages, thus I
 replace <a href=inside>button</a> with "" (nothing). Currently I have a
 pathetic perl script that doesn't support cookies and is going through
 each set of previous pages to bring up the content. I was hoping squid
 would greatly simplify this.
 I was using WWW::Mechanize; I know this isn't the best way, but they
 just need a quick and dirty way.

 Ah, okay. Well, the only ways Squid has for doing content alteration are far
 too heavyweight for that use as well (coding up an ICAP server and processing
 rules, or a full eCAP adaptor plugin).


One more thing for the night: if squid is written in C, I think I can
easily modify it to do what I want. The problem then becomes compiling
it for Windows. Can I just use Cygwin?
I'm thinking I can have an external program run on the page before
handing it off to the web client, no?

 IMO you need to kick the webapp developers to make their app do the removal
 under the right conditions. It would solve many more problems than having
 different copies of a page available with identical identifiers.

 Amos
 --
 Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
  Current Beta Squid 3.1.0.7




Re: [squid-users] reverse proxy filtering?

2009-04-19 Thread Amos Jeffries

Jeff Sadowski wrote:

On Sat, Apr 18, 2009 at 10:24 PM, Amos Jeffries squ...@treenet.co.nz wrote:

Jeff Sadowski wrote:

On Sat, Apr 18, 2009 at 5:18 PM, Amos Jeffries squ...@treenet.co.nz
wrote:

Jeff Sadowski wrote:

I'm new to trying to use squid as a reverse proxy.

I would like to filter out certain pages and if possible certain words.
I installed perl so that I can use it to rebuild pages if that is
possible?

My squid.conf looks like so
 start
acl all src all
http_port 80 accel defaultsite=outside.com
cache_peer inside parent 80 0 no-query originserver name=myAccel
acl our_sites dstdomain outside.com

aha, aha, ..


http_access allow all

eeek!!

I want everyone on the outside to see the inside server minus one or
two pages. Is that not what I set up?

By lucky chance of some background defaults only, and assuming that the web
server is highly secure on its own.

If you have a small set of sites, such as those listed in our_sites, then
it's best to be certain and use that ACL for the allow as well.

 http_access allow our_sites
 http_access deny all

... same on the cache_peer_access below.


cache_peer_access myAccel allow all
 end

how would I add it so that for example

http://inside/protect.html

is blocked?

http://wiki.squid-cache.org/SquidFaq/SquidAcl

so I want redirector_access?
Is there an example line of this in a file

I tried using

url_rewrite_program c:\perl\bin\perl.exe c:\replace.pl

but I guess that requires more to use it? an acl?
should acl all src all be acl all redirect all ?

No to all three. The line you mention trying is all that's needed.

 url_rewrite_access allow all

but the above should be the default when a url_rewrite_program  is set.


so how do you tell it to use the url_rewrite_program only with the inside site?
Or does it use the script on all pages passing through the proxy?


It changes the request as passed on to the web server in transit. So the
client still sees what they clicked on, but gets content from the other
site.  It does not affect links or suchlike in the page content.




Is this only a rewrite on the requested URL from the web browser?
Ah, that might answer some of my earlier questions. I never tried
clicking on it after implementing the rewrite script. I was only
hovering over the URL and seeing that it was still the same.


What is making you think it's not working? And what do the logs say about it?


If you only checked the page's links, they may not change. Logs should
show where the client went and the IP/name of the server fetched from, which
would be the name of the redirected server.



Also what is the c:/replace.pl code?



=== start
#!c:\perl\bin\perl.exe
$| = 1;
$replace='<a href="http://inside/login.html">.*?</a>';
$with='no login';
while ($INPUT=<>) {
    $INPUT=~s/$replace/$with/gi;
    print $INPUT;
}
=== end

I think I see the problem now; I guess I am looking for something else
besides url_rewrite, maybe full-text replacement :-/


That's what your code wants, not what I pointed you to using.

You know, I'm thinking you could get away without altering those pages,
but just blocking external clients from visiting those URLs.





and is it possible to filter/replace certain words on the site

like replace Albuquerque with Duke City for an example on all pages?

No. no. no. Welcome to copyright violation hell.

This was an example. I have full permission to do the real translations.
I am told to remove certain links/buttons to login pages, thus I
replace <a href=inside>button</a> with "" (nothing). Currently I have a
pathetic perl script that doesn't support cookies and is going through
each set of previous pages to bring up the content. I was hoping squid
would greatly simplify this.
I was using WWW::Mechanize; I know this isn't the best way, but they
just need a quick and dirty way.

Ah, okay. Well, the only ways Squid has for doing content alteration are far
too heavyweight for that use as well (coding up an ICAP server and processing
rules, or a full eCAP adaptor plugin).

IMO you need to kick the webapp developers to make their app do the removal
under the right conditions. It would solve many more problems than having
different copies of a page available with identical identifiers.

Amos
--
Please be using
 Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
 Current Beta Squid 3.1.0.7




--
Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
  Current Beta Squid 3.1.0.7


Re: [squid-users] reverse proxy filtering?

2009-04-19 Thread Jeff Sadowski
On Sat, Apr 18, 2009 at 10:24 PM, Amos Jeffries squ...@treenet.co.nz wrote:
 Jeff Sadowski wrote:

 On Sat, Apr 18, 2009 at 5:18 PM, Amos Jeffries squ...@treenet.co.nz
 wrote:

 Jeff Sadowski wrote:

 I'm new to trying to use squid as a reverse proxy.

 I would like to filter out certain pages and if possible certain words.
 I installed perl so that I can use it to rebuild pages if that is
 possible?

 My squid.conf looks like so
  start
 acl all src all
 http_port 80 accel defaultsite=outside.com
 cache_peer inside parent 80 0 no-query originserver name=myAccel
 acl our_sites dstdomain outside.com

 aha, aha, ..

 http_access allow all

 eeek!!

 I want everyone on the outside to see the inside server minus one or
 two pages. Is that not what I set up?

 By lucky chance of some background defaults only, and assuming that the web
 server is highly secure on its own.

 If you have a small set of sites, such as those listed in our_sites, then
 it's best to be certain and use that ACL for the allow as well.

  http_access allow our_sites
  http_access deny all

 ... same on the cache_peer_access below.


 cache_peer_access myAccel allow all
  end

 how would I add it so that for example

 http://inside/protect.html

 is blocked?

 http://wiki.squid-cache.org/SquidFaq/SquidAcl

 so I want redirector_access?
 Is there an example line of this in a file

 I tried using

 url_rewrite_program c:\perl\bin\perl.exe c:\replace.pl

 but I guess that requires more to use it? an acl?
 should acl all src all be acl all redirect all ?

 No to all three. The line you mention trying is all that's needed.

  url_rewrite_access allow all

 but the above should be the default when a url_rewrite_program  is set.

 What is making you think it's not working? And what do the logs say about it?
 Also, what is the c:/replace.pl code?



 and is it possible to filter/replace certain words on the site

 like replace Albuquerque with Duke City for an example on all pages?

 No. no. no. Welcome to copyright violation hell.

 This was an example. I have full permission to do the real translations.
 I am told to remove certain links/buttons to login pages, thus I
 replace <a href=inside>button</a> with "" (nothing). Currently I have a
 pathetic perl script that doesn't support cookies and is going through
 each set of previous pages to bring up the content. I was hoping squid
 would greatly simplify this.
 I was using WWW::Mechanize; I know this isn't the best way, but they
 just need a quick and dirty way.

 Ah, okay. Well, the only ways Squid has for doing content alteration are far
 too heavyweight for that use as well (coding up an ICAP server and processing
 rules, or a full eCAP adaptor plugin).

 IMO you need to kick the webapp developers to make their app do the removal
 under the right conditions. It would solve many more problems than having
 different copies of a page available with identical identifiers.

 Amos
 --
 Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
  Current Beta Squid 3.1.0.7


To ease your mind about what it is for:
I am helping a library to set up a way to display available books to the outside.
The internal website allows you to log in and check out books, which
they want blocked to the outside. They do not want to modify the web
developers' code to fit their special needs, since it is a program commonly
used among libraries. They just want me to stop people from
logging in and checking out books, and they don't need it to be
absolute, just difficult, since people should only be allowed to check
books out from inside. The people asking me to help don't get to
choose the software either; it is higher up where those decisions are
made. Don't you love government :-D


[squid-users] Tproxy + wccp + tcp_outgoing_address

2009-04-19 Thread Vivek

Hi All,



I have configured two squid servers in tproxy+wccp mode and it's working
fine. I am using squid 2.7 (ctt proxy) and a gre tunnel. Browsing is very
slow compared to normal tproxy+bridge mode. I assume the problem is that
both incoming and outgoing traffic pass via eth0 (Gigabit Ethernet).


I have an idea to use the eth1 interface and change
tcp_outgoing_address from the eth0 IP to the eth1 IP.


Is it possible? Or is there any other way to avoid this bottleneck?
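
For reference, that change would be a single squid.conf directive, with the
address below standing in as a placeholder for the eth1 IP:

  tcp_outgoing_address 192.168.1.2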

Thanks in advance.



Regards

VIvek






Re: [squid-users] reverse proxy filtering?

2009-04-19 Thread Jeff Sadowski
On Sun, Apr 19, 2009 at 1:18 AM, Jeff Sadowski jeff.sadow...@gmail.com wrote:
 On Sat, Apr 18, 2009 at 10:24 PM, Amos Jeffries squ...@treenet.co.nz wrote:
 Jeff Sadowski wrote:

 On Sat, Apr 18, 2009 at 5:18 PM, Amos Jeffries squ...@treenet.co.nz
 wrote:

 Jeff Sadowski wrote:

 I'm new to trying to use squid as a reverse proxy.

 I would like to filter out certain pages and if possible certain words.
 I installed perl so that I can use it to rebuild pages if that is
 possible?

 My squid.conf looks like so
  start
 acl all src all
 http_port 80 accel defaultsite=outside.com
 cache_peer inside parent 80 0 no-query originserver name=myAccel
 acl our_sites dstdomain outside.com

 aha, aha, ..

 http_access allow all

 eeek!!

 I want everyone on the outside to see the inside server minus one or
 two pages. Is that not what I set up?

 By lucky chance of some background defaults only, and assuming that the web
 server is highly secure on its own.

 If you have a small set of sites, such as those listed in our_sites, then
 it's best to be certain and use that ACL for the allow as well.

  http_access allow our_sites
  http_access deny all

 ... same on the cache_peer_access below.


 cache_peer_access myAccel allow all
  end

 how would I add it so that for example

 http://inside/protect.html

 is blocked?

 http://wiki.squid-cache.org/SquidFaq/SquidAcl

 so I want redirector_access?
 Is there an example line of this in a file

 I tried using

 url_rewrite_program c:\perl\bin\perl.exe c:\replace.pl

 but I guess that requires more to use it? an acl?
 should acl all src all be acl all redirect all ?

 No to all three. The line you mention trying is all that's needed.

  url_rewrite_access allow all

 but the above should be the default when a url_rewrite_program  is set.

 What is making you think it's not working? And what do the logs say about it?
 Also, what is the c:/replace.pl code?



 and is it possible to filter/replace certain words on the site

 like replace Albuquerque with Duke City for an example on all pages?

 No. no. no. Welcome to copyright violation hell.

 This was an example. I have full permission to do the real translations.
 I am told to remove certain links/buttons to login pages, thus I
 replace <a href=inside>button</a> with "" (nothing). Currently I have a
 pathetic perl script that doesn't support cookies and is going through
 each set of previous pages to bring up the content. I was hoping squid
 would greatly simplify this.
 I was using WWW::Mechanize; I know this isn't the best way, but they
 just need a quick and dirty way.

 Ah, okay. Well, the only ways Squid has for doing content alteration are far
 too heavyweight for that use as well (coding up an ICAP server and processing
 rules, or a full eCAP adaptor plugin).

 IMO you need to kick the webapp developers to make their app do the removal
 under the right conditions. It would solve many more problems than having
 different copies of a page available with identical identifiers.

 Amos
 --
 Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
  Current Beta Squid 3.1.0.7


 To ease your mind about what it is for:
 I am helping a library to set up a way to display available books to the
 outside.
 The internal website allows you to log in and check out books, which
 they want blocked to the outside. They do not want to modify the web
 developers' code to fit their special needs, since it is a program commonly
 used among libraries. They just want me to stop people from
 logging in and checking out books, and they don't need it to be
 absolute, just difficult, since people should only be allowed to check
 books out from inside. The people asking me to help don't get to
 choose the software either; it is higher up where those decisions are
 made. Don't you love government :-D

I think maybe this project better fits my needs :-D
http://www.privoxy.org (I don't need to cache, although it would have been nice)
Ugh, I would have thought squid would have had text replacement.
I'll see if privoxy works as well at replicating pages and using cookies.
Hopefully it will work better than my own home-brew written in Perl using Apache.


[squid-users] Re: Tproxy + wccp + tcp_outgoing_address

2009-04-19 Thread Henrik Nordstrom
On Sun, 2009-04-19 at 03:52 -0400, Vivek wrote:

 I have configured two squid servers in tproxy+wccp mode and it's working
 fine. I am using squid 2.7 (ctt proxy) and a gre tunnel. Browsing is very
 slow compared to normal tproxy+bridge mode. I assume the problem is that
 both incoming and outgoing traffic pass via eth0 (Gigabit Ethernet).

I kind of doubt you have more than 900Mbps of traffic.

 I have an idea to use the eth1 interface and change
 tcp_outgoing_address from the eth0 IP to the eth1 IP.

Won't help. The problem is something else.

 Is it possible?

Of course, but it's not as simple as tcp_outgoing_address.

 Or is there any other way to avoid this bottleneck?

The first step is to identify the cause of the bottleneck.

1. How is the performance if you configure the browser to use the proxy?

2. Have you verified cabling, switch negotiation etc?

Regards
Henrik



Re: [squid-users] reverse proxy filtering?

2009-04-19 Thread Gavin McCullagh
Hi,

On Sun, 19 Apr 2009, Jeff Sadowski wrote:

 I am helping a library to set up a way to display available books to the
 outside.
 The internal website allows you to log in and check out books, which
 they want blocked to the outside. They do not want to modify the web
 developers' code to fit their special needs, since it is a program commonly
 used among libraries. They just want me to stop people from
 logging in and checking out books, and they don't need it to be
 absolute, just difficult, since people should only be allowed to check
 books out from inside.

I presume the login is required to do any task.

It might be simplest to just block access to any URLs which process a
check-out and any other disallowed tasks?  You could give a custom error
page which says this task is not allowed for external users. I suppose it's
better for users not to be shown buttons which they can't use, but this would
be simple to implement, perform well, and wouldn't require modifying HTML.
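
A rough sketch of that approach, with made-up URL paths standing in for
whatever the catalogue software actually uses for login and check-out:

  acl opac_login urlpath_regex -i /login
  acl opac_checkout urlpath_regex -i /checkout
  http_access deny opac_login
  http_access deny opac_checkout
  deny_info ERR_NO_EXTERNAL_LOGIN opac_login
  deny_info ERR_NO_EXTERNAL_LOGIN opac_checkout

ERR_NO_EXTERNAL_LOGIN would be a custom page dropped into Squid's errors
directory alongside the stock templates.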

Some people do modify content indirectly using squid's url_rewrite,
including this amusing one:
http://www.ex-parrot.com/~pete/upside-down-ternet.html 

which involves running a webserver on the squid box.  The perl script downloads the
page to squid's web directory, translates it and rewrites the URL to the
localhost location of the translated page.  It's a bit of a hack, but it
would probably work.
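
Very roughly, that trick looks like this as a redirector. Everything here is
illustrative: it assumes LWP::Simple is installed and that a local web server
is serving c:/www as http://127.0.0.1/:

#!c:\perl\bin\perl.exe
# Sketch of the "fetch, edit, serve locally" redirector hack described above.
use strict;
use warnings;
use LWP::Simple qw(get);
use Digest::MD5 qw(md5_hex);
$| = 1;

my $docroot = 'c:/www/rewritten';     # served by the local web server (assumption)

while (my $line = <STDIN>) {
    my ($url) = split /\s+/, $line;
    my $page = get($url);
    if (defined $page) {
        # strip the login link, as in the earlier replace.pl
        $page =~ s{<a href="http://inside/login\.html">.*?</a>}{no login}gis;
        my $name = md5_hex($url) . '.html';
        open my $fh, '>', "$docroot/$name" or do { print "$url\n"; next };
        print {$fh} $page;
        close $fh;
        print "http://127.0.0.1/rewritten/$name\n";   # hand back the local copy
    } else {
        print "$url\n";               # on failure, pass the request through untouched
    }
}
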

Gavin



[squid-users] Re: Tproxy + wccp + tcp_outgoing_address

2009-04-19 Thread Vivek

Henrik, Thanks for your reply.

I will check all the things you mentioned, and will get back to you if I
need to.
Thanks again for your reply.

Regards
Vivek

-Original Message-
From: Henrik Nordstrom hen...@henriknordstrom.net
To: Vivek vivek...@aol.in
Cc: squid-users@squid-cache.org
Sent: Sun, 19 Apr 2009 1:42 pm
Subject: Re: Tproxy + wccp + tcp_outgoing_address



On Sun, 2009-04-19 at 03:52 -0400, Vivek wrote:


I have configured two squid servers in tproxy+wccp mode and it's working
fine. I am using squid 2.7 (ctt proxy) and a gre tunnel. Browsing is very
slow compared to normal tproxy+bridge mode. I assume the problem is that
both incoming and outgoing traffic pass via eth0 (Gigabit Ethernet).

I kind of doubt you have more than 900Mbps of traffic.


I have an idea to use the eth1 interface and change
tcp_outgoing_address from the eth0 IP to the eth1 IP.


Won't help. The problem is something else.


Is it possible?


Of course, but it's not as simple as tcp_outgoing_address.


Or is there any other way to avoid this bottleneck?


The first step is to identify the cause of the bottleneck.

1. How is the performance if you configure the browser to use the proxy?

2. Have you verified cabling, switch negotiation etc?

Regards
Henrik





Re: [squid-users] Allow Single IP to bypass Squid Proxy

2009-04-19 Thread robp2175

Would you be so kind as to give me an example of how to do this? I am very new
to this. Thank you very much.



Amos Jeffries-2 wrote:
 
 robp2175 wrote:
 
 
 robp2175 wrote:
 I want one ip on my network to bypass the squid proxy. How do I go about
 accomplishing this. Any help is greatly appreciated.

 Transparent Proxy with dansguardian and wpad
 
 Then configure your firewall to omit that IP from the interception.
 Configure your PAC script to send that machine DIRECT
 
 Amos
 -- 
 Please be using
Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
Current Beta Squid 3.1.0.7
 
 




Re: [squid-users] reverse proxy filtering?

2009-04-19 Thread Amos Jeffries

Gavin McCullagh wrote:

Hi,

On Sun, 19 Apr 2009, Jeff Sadowski wrote:


I am helping a library to set up a way to display available books to the outside.
The internal website allows you to log in and check out books, which
they want blocked to the outside. They do not want to modify the web
developers' code to fit their special needs, since it is a program commonly
used among libraries. They just want me to stop people from
logging in and checking out books, and they don't need it to be
absolute, just difficult, since people should only be allowed to check
books out from inside.


I presume the login is required to do any task.

It might be simplest to just block access to any URLs which process a
check-out and any other disallowed tasks?  You could give a custom error
page which says this task is not allowed for external users. I suppose it's
better for users not to be shown buttons which they can't use, but this would
be simple to implement, perform well, and wouldn't require modifying HTML.

Some people do modify content indirectly using squid's url_rewrite,
including this amusing one:
	http://www.ex-parrot.com/~pete/upside-down-ternet.html 


which involves running a webserver on the squid box.  The perl script downloads the
page to squid's web directory, translates it and rewrites the URL to the
localhost location of the translated page.  It's a bit of a hack, but it
would probably work.

Gavin



Yes, that's another approach.

Though the more I hear about the problem, the more I think your best 
solution would be to forget changing the HTML and simply block access to 
pages and form processors that are not publicly allowed.


This is how most of the world does it, with no problems or complications
either. Usually the back-end software can be the one saying access
denied, but a Squid access rule works just as well.


Amos
--
Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
  Current Beta Squid 3.1.0.7


Re: [squid-users] Allow Single IP to bypass Squid Proxy

2009-04-19 Thread Amos Jeffries

robp2175 wrote:

Would you be so kind as to give me an example of how to do this? I am very new
to this. Thank you very much.



Can't say what to set your firewall to.
Take a gander at http://wiki.squid-cache.org/ConfigExamples/Intercept
for several types of transparent setup and the related firewall settings.


As for the PAC file, it looks something like this:

function FindProxyForURL(url, host)
{
  if (host == "192.68.0.1") return "DIRECT";

  return "PROXY fu";
}





Amos Jeffries-2 wrote:

robp2175 wrote:


robp2175 wrote:

I want one ip on my network to bypass the squid proxy. How do I go about
accomplishing this. Any help is greatly appreciated.


Transparent Proxy with dansguardian and wpad

Then configure your firewall to omit that IP from the interception.
Configure your PAC script to send that machine DIRECT

Amos
--
Please be using
   Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
   Current Beta Squid 3.1.0.7







--
Please be using
  Current Stable Squid 2.7.STABLE6 or 3.0.STABLE14
  Current Beta Squid 3.1.0.7


Re: [squid-users] reverse proxy filtering?

2009-04-19 Thread Jeff Sadowski
On Sun, Apr 19, 2009 at 3:09 AM, Gavin McCullagh gavin.mccull...@gcd.ie wrote:
 Hi,

 On Sun, 19 Apr 2009, Jeff Sadowski wrote:

 I am helping a library to set up a way to display available books to the
 outside.
 The internal website allows you to log in and check out books, which
 they want blocked to the outside. They do not want to modify the web
 developers' code to fit their special needs, since it is a program commonly
 used among libraries. They just want me to stop people from
 logging in and checking out books, and they don't need it to be
 absolute, just difficult, since people should only be allowed to check
 books out from inside.

 I presume the login is required to do any task.

Actually no, you can browse books without logging in.

 It might be simplest to just block access to any URLs which process a
 check-out and any other disallowed tasks?  You could give a custom error
 page which says this task is not allowed for external users. I suppose it's
 better for users not to be shown buttons which they can't use, but this would
 be simple to implement, perform well, and wouldn't require modifying HTML.

 Some people do modify content indirectly using squid's url_rewrite,
 including this amusing one:
        http://www.ex-parrot.com/~pete/upside-down-ternet.html

 which involves running a webserver on the squid box.  The perl script downloads the
 page to squid's web directory, translates it and rewrites the URL to the
 localhost location of the translated page.  It's a bit of a hack, but it
 would probably work.


Cool, thanks, but I'm seriously looking at using Privoxy, and maybe even
Privoxy and Squid together, because it appears Privoxy makes a terrible
reverse proxy and would leave my proxy box open for others to download
illegal content. So my current plan is to run Privoxy on some random port
and point the reverse proxy at that port, and voilà: both inline editing
via Privoxy with a simple search-and-replace string, and no sites other
than the one specified for the reverse proxy via Squid.

 Gavin




Re: [squid-users] reverse proxy filtering?

2009-04-19 Thread Gavin McCullagh
Hi,

On Sun, 19 Apr 2009, Jeff Sadowski wrote:

 Actually no, you can browse books without logging in.

Why not just prevent logins then by having squid block the login processing
page with a custom error page stating no logins from outside?

 Cool, thanks, but I'm seriously looking at using Privoxy, and maybe even
 Privoxy and Squid together, because it appears Privoxy makes a terrible
 reverse proxy and would leave my proxy box open for others to download
 illegal content. So my current plan is to run Privoxy on some random port
 and point the reverse proxy at that port, and voilà: both inline editing
 via Privoxy with a simple search-and-replace string, and no sites other
 than the one specified for the reverse proxy via Squid.

Your call of course, but it seems like you're over-complicating life.  The
more links you have in the chain (squid, privoxy, ...) and the more complex
your setup, the more things can go wrong over the lifetime of the system.
For sure modifying the page content will be slower, but if you don't have
lots of users that may not matter.

Another thing to bear in mind is that upgrades to the web-based system may
well break either setup -- the URLs might change so your URL blocking might
fail, or the page content might change, breaking your regular expressions.
In principle, a system which only _allowed_ certain URLs and blocked all
others would be more robust than blocking certain URLs, failing closed
rather than open.

Gavin



[squid-users] is there a squid cache rank value available for statistics?

2009-04-19 Thread Gavin McCullagh
Hi,

I'm wondering about ways to measure the optimum size for a cache, in terms
of the value you gain from each GB of cache space.  If you've got a 400GB
cache and only 99% of your hits come from the first 350GB, there's probably
no point looking for a larger cache.  If only 80% come from the first
350GB, then a bigger cache might well be useful.

I realise there are rules of thumb for cache size, it would be interesting
to be able to analyse a particular squid installation.

Squid obviously removes objects from its cache based on the chosen
cache_replacement_policy.  It appears from the comments in squid.conf that
in the case of the LRU policy, this is implemented as a list, presumably a
queue of pointers to objects in the cache.  Objects which come to the head
of the queue are presumably next for removal.  I guess if an object in the
cache gets used it goes back to the tail of the queue.   I suppose this
process must involve linearly traversing the queue to find the object and
remove it, which is presumably why heap-based policies are available.

I wonder if it would be feasible to calculate a cache rank, which
indicates the position an object was within the queue at the time of the
hit.  So, perhaps 0 means at the tail of the queue, 1 means at the head.
If this could be reported alongside each hit in the access.log, one could
draw stats on the amount of hits served by each portion of the queue and
therefore determine the value of expanding or contracting your cache.

In the case of simple LRU, if the queue must be traversed to find each
element and requeue it (perhaps this isn't the case?), I suppose one could
count the position in the queue and divide by the total length.  
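
As a toy illustration of the rank being proposed (this is just a little LRU
simulation, not Squid code; the object names are arbitrary):

#!/usr/bin/perl
# Toy simulation of the proposed "cache rank": on each hit, report how far
# from the hot end of the LRU list the object sat, scaled to 0..1.
use strict;
use warnings;

my @lru;            # index 0 = most recently used, end = next to be evicted
my %pos;            # object id -> current index in @lru

sub access {
    my ($id) = @_;
    if (exists $pos{$id}) {
        my $rank = @lru > 1 ? $pos{$id} / (@lru - 1) : 0;   # 0 = hot end, 1 = eviction end
        printf "HIT  %-4s rank=%.2f\n", $id, $rank;
        splice @lru, $pos{$id}, 1;                          # unlink from its old slot
    } else {
        printf "MISS %s\n", $id;
    }
    unshift @lru, $id;                                      # (re)insert at the hot end
    %pos = map { $lru[$_] => $_ } 0 .. $#lru;               # rebuild positions (fine for a toy)
}

access($_) for qw(a b c d a e b f c a);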

With a heap, things are more complex.  I guess you could give an indication
of the depth in the heap but there would be so many objects on the lowest
levels, I don't suppose this would be a great guide.  Is there some better
value available, such as the key used in the heap maybe?

Or perhaps the whole idea is flawed somehow?

Comments, criticisms, explanations, rebukes all welcome.
Gavin




Re: [squid-users] is there a squid cache rank value available for statistics?

2009-04-19 Thread Gavin McCullagh
On Sun, 19 Apr 2009, Gavin McCullagh wrote:

 In the case of simple LRU, if the queue must be traversed to find each
 element and requeue it (perhaps this isn't the case?), 

On reflection, I presume this is not the case.  I imagine the struct in ram
for each cache object must include a pointer to prev and next.  That makes
me wonder how heap lru improves matters.  I guess I need to go and read the
paper referenced in the notes :-)

Gavin



Re: [squid-users] is there a squid cache rank value available for statistics?

2009-04-19 Thread Amos Jeffries
 Hi,

 I'm wondering about ways to measure the optimum size for a cache, in terms
 of the value you gain from each GB of cache space.  If you've got a
 400GB
 cache and only 99% of your hits come from the first 350GB, there's
 probably
 no point looking for a larger cache.  If only 80% come from the first
 350GB, then a bigger cache might well be useful.


Squid suffers from a little bit of an anachronism in the way it stores
objects. The classic ufs systems essentially use round-robin and a hash to
determine the storage location for each object separately. This works wonders
for ensuring no clashes, but is not so good for retrieval optimization.
Adrian Chadd has done a lot of study and some work in this area,
particularly for Squid-2.6/2.7. His paper for the NYC BSDCon conference is a
good read on how disk storage relates to Squid:
http://www.squid-cache.org/~adrian/talks/20081007%20-%20NYCBSDCON%20-%20Disk%20IO.pdf

 I realise there are rules of thumb for cache size, it would be interesting
 to be able to analyse a particular squid installation.


Feel free. We would be interested in any improvements you can come up with.


 Squid obviously removes objects from its cache based on the chosen
 cache_replacement_policy.  It appears from the comments in squid.conf that
 in the case of the LRU policy, this is implemented as a list, presumably a
 queue of pointers to objects in the cache.  Objects which come to the head
 of the queue are presumably next for removal.  I guess if an object in the
 cache gets used it goes back to the tail of the queue.   I suppose this
 process must involve linearly traversing the queue to find the object and
 remove it, which is presumably why heap-based policies are available.

IIRC there is a doubly-linked list with tail pointer for LRU.


 I wonder if it would be feasible to calculate a cache rank, which
 indicates the position an object was within the queue at the time of the
 hit.  So, perhaps 0 means at the tail of the queue, 1 means at the head.
 If this could be reported alongside each hit in the access.log, one could
 draw stats on the amount of hits served by each portion of the queue and
 therefore determine the value of expanding or contracting your cache.

 In the case of simple LRU, if the queue must be traversed to find each
 element and requeue it (perhaps this isn't the case?), I suppose one could
 count the position in the queue and divide by the total length.

Yes, the same big problems with that in LRU as with displaying all objects in the
cache (>1 million is not an uncommon cache size) and regex purges.


 With a heap, things are more complex.  I guess you could give an
 indication
 of the depth in the heap but there would be so many objects on the lowest
 levels, I don't suppose this would be a great guide.  Is there some better
 value available, such as the key used in the heap maybe?

There is the fileno or a hashed value rather than the URL. You still have the
same issues of traversal though.


 Or perhaps the whole idea is flawed somehow?

 Comments, criticisms, explanations, rebukes all welcome.
 Gavin


If you want to investigate, I'll gently nudge you towards Squid-3, where
the rest of the development is going on and improvements have the best
chance of survival.

For further discussion you may want to bring this up in squid-dev where
the developers hang out.

Amos




[squid-users] squid 3.0STABLE14: Detected DEAD Parent [proxy]

2009-04-19 Thread vollkommen
Since v3.0 doesn't have IPv6 support, I set up the Polipo proxy 
(http://www.pps.jussieu.fr/~jch/software/polipo/) as a 6to4 relay in front of 
squid, by adding cache_peer 127.0.0.1 parent 8123 3130 to squid.conf. I was 
able to browse multiple IPv6 sites such as ipv6.google.com with 
3.0.STABLE14-20090416 as the proxy in Firefox, for quite a while, before 
squid's cache.log spit out:

2009/04/19 20:10:20| Ready to serve requests.
2009/04/19 20:12:09| WARNING: Probable misconfigured neighbor at 127.0.0.1
2009/04/19 20:12:09| WARNING: 143 of the last 150 ICP replies are DENIED
2009/04/19 20:12:09| WARNING: No replies will be sent for the next 3600 seconds
2009/04/19 20:12:31| Detected DEAD Parent: 127.0.0.1
2009/04/19 20:12:52| ipcacheParse: No Address records in response to 
'www.ipv6.sixxs.net'

After building 3.0.STABLE14-20090419, I find the above issue surfaced much 
faster than with STABLE14-20090416. Since 20090416 is no longer downloadable, I 
built STABLE14 dated 20090411. It lasted about as long as 20090416, but spit 
out slightly different logs:

2009/04/19 20:57:36| Ready to serve requests.
2009/04/19 20:59:01| ipcacheParse: No Address records in response to 
'ipv6.google.com'
2009/04/19 21:00:01| ipcacheParse: No Address records in response to 
'ipv6.google.com'
2009/04/19 21:00:30| ipcacheParse: No Address records in response to 
'www.ipv6.sixxs.net'
2009/04/19 21:00:39| 95%% of replies from '127.0.0.1' are UDP_DENIED


Could this issue be mitigated from the squid end, so it keeps talking to Polipo 
(which appears to remain functional)?



[squid-users] Re: squid 3.0STABLE14: Detected DEAD Parent [proxy]

2009-04-19 Thread vollkommen
I've made one change in squid.conf that seems to cure most of the problem, with 
STABLE14-20090411:

cache_peer 127.0.0.1 parent 8123 0 no-query no-digest no-netdb-exchange default

Interestingly, I was searching about "tunnelReadServer: FD 70: read failure: 
(0) Unknown error: 0" when I found the above options on a Russian forum. 

However, it appears URLs with a question mark (e.g. 
http://www.ipv6.sixxs.net/forum/?msg=general or 
http://ipv6.google.com/advanced_search?hl=en) still cause the same DNS lookup 
failure: squid either spits out "ipcacheParse: No Address records in response 
to 'www.ipv6.sixxs.net'" in cache.log, or the browser gets "Unable to determine 
IP address from host name www.ipv6.sixxs.net", but rarely both. Is this 
because of "hierarchy_stoplist cgi-bin ?"? If so, how do I work around it, 
given my setup?
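
One thing worth trying (untested here, and assuming the Polipo parent really is
the only usable route out) is to stop Squid from selecting DIRECT for those
"non-hierarchical" query URLs, e.g.:

  # force everything through the parent peer
  never_direct allow all
  # and/or drop the query-URL special case, either by removing the default
  # "hierarchy_stoplist cgi-bin ?" line or with:
  nonhierarchical_direct off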

 Original-Nachricht 
Since v3.0 doesn't have IPv6 support, I set up the Polipo proxy 
(http://www.pps.jussieu.fr/~jch/software/polipo/) as a 6to4 relay in front of 
squid, by adding cache_peer 127.0.0.1 parent 8123 3130 to squid.conf. I was 
able to browse multiple IPv6 sites such as ipv6.google.com with 
3.0.STABLE14-20090416 as the proxy in Firefox, for quite a while, before 
squid's cache.log spit out:

2009/04/19 20:10:20| Ready to serve requests.
2009/04/19 20:12:09| WARNING: Probable misconfigured neighbor at 127.0.0.1
2009/04/19 20:12:09| WARNING: 143 of the last 150 ICP replies are DENIED
2009/04/19 20:12:09| WARNING: No replies will be sent for the next 3600 seconds
2009/04/19 20:12:31| Detected DEAD Parent: 127.0.0.1
2009/04/19 20:12:52| ipcacheParse: No Address records in response to 
'www.ipv6.sixxs.net'

After building 3.0.STABLE14-20090419, I find the above issue surfaced much 
faster than with STABLE14-20090416. Since 20090416 is no longer downloadable, I 
built STABLE14 dated 20090411. It lasted about as long as 20090416, but spit 
out slightly different logs:

2009/04/19 20:57:36| Ready to serve requests.
2009/04/19 20:59:01| ipcacheParse: No Address records in response to 
'ipv6.google.com'
2009/04/19 21:00:01| ipcacheParse: No Address records in response to 
'ipv6.google.com'
2009/04/19 21:00:30| ipcacheParse: No Address records in response to 
'www.ipv6.sixxs.net'
2009/04/19 21:00:39| 95%% of replies from '127.0.0.1' are UDP_DENIED


Could this issue be mitigated from the squid end, so it keeps talking to Polipo 
(which appears to remain functional)?




[squid-users] Headers control in Squid 3.0

2009-04-19 Thread Oleg

Hi.

Before Squid 3.0 I could change the Proxy-Authenticate header through this pair of directives:

header_access Proxy-Authenticate deny browserFirefox osLinux
header_replace Proxy-Authenticate Negotiate

That is because the first authentication method offered is NTLM, for IE.

After upgrading to Squid 3.0, the header_access directive forked into
request_header_access and reply_header_access. That implies that in my case I
change the header_access directive to reply_header_access. BUT! Now the
header_replace directive works only with request_header_access and doesn't
change the Proxy-Authenticate header.

How can I resolve this problem without downgrading to Squid 2.7.6? Or can it
maybe be bypassed another way?

Oleg.