Re: [squid-users] Zero sized reply and other recent access problems

2005-03-07 Thread Henrik Nordstrom
On Tue, 8 Mar 2005, Reuben Farrelly wrote:
This means that as long as you have relaxed_header_parser set to on or warn,
or simply not defined, the behaviour will be the same as in older squid.
Not quite.. the old parser accepted a lot of crap which was ambiguous, such as
conflicting Content-Length headers and a few other really bad things.
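
For illustration, a reply of the kind the old parser let through might have
looked like this (a reconstructed example, not a capture from any real server):

HTTP/1.1 200 OK
Content-Length: 1024
Content-Length: 4096
Content-Type: text/html

Two conflicting Content-Length values leave the message boundary ambiguous,
which is exactly what response-smuggling attacks exploit, so the new parser
refuses such replies.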

Personally I recommend at least "warn", as it has allowed me to see some of 
the broken sites and inform relevant people of their broken behaviour, but I 
understand not everyone can be bothered..
Same here, but I chose to distribute Squid with the silent default "on"
setting, so as not to confuse people too much about why Squid now complains
about malformed HTTP responses.

Duplicate "Connection" headers on line 5 and 6, and whitespace on line 4 
between "Content" and "Location".  No wonder it does not work properly.
All of this is silently accepted by Squid-2.5.STABLE9 in its default
"relaxed_header_parser on" setting. Squid-2.5.STABLE8 rejects the
response due to the whitespace.
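
Reconstructed from that description, the response headers would have looked
roughly like this (hypothetical values; only the two defects are taken from
the description above):

HTTP/1.1 200 OK
Date: Mon, 07 Mar 2005 10:00:00 GMT
Content-Type: text/html
Content Location: http://www.example.com/
Connection: Keep-Alive
Connection: close

The stray space inside the header name on line 4 and the contradictory
Connection headers on lines 5 and 6 are what the STABLE8 parser rejected and
the relaxed STABLE9 parser now accepts.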

Regards
Henrik


Re: [squid-users] Zero sized reply and other recent access problems

2005-03-07 Thread Reuben Farrelly
Hi again Hans,
At 08:52 a.m. 7/03/2005, H Matik wrote:
On Saturday 05 March 2005 23:41, Reuben Farrelly wrote:
> I think you've misunderstood something quite fundamental about how squid
> works:
>
Maybe I did not use the exact expressions you would like to see, but as you
wrote, you did get it. Anyway, as I said in my mail, my intention was not to
attack anybody.
I know; I am just asking you to be specific about the errors you are 
reporting.  None of the developers would complain in the slightest if you 
could provide good evidence of a bug, believe me ;-)


> * Strict HTTP header parsing - implemented in the most recent STABLE
> releases of squid, you can turn this off via a squid.conf directive
> anyway (but it is useful to have it set to log bad pages).
>
What do you mean? relaxed_header_parser? I think this is on by default, not
off; turning it off makes it parse strictly, or am I wrong here?
Yes, it is on by default. In other words (from squid.conf), with this 
default setting "Squid accepts certain forms of non-compliant HTTP 
messages where it is unambiguous what the sending application intended even 
if the message is not correctly formatted."

This means that as long as you have relaxed_header_parser set to on or 
warn, or simply not defined, the behaviour will be the same as in older 
squid.
Personally I recommend at least "warn", as it has allowed me to see some of 
the broken sites and inform relevant people of their broken behaviour, but 
I understand not everyone can be bothered..
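
For reference, the directive takes three values; a minimal squid.conf
fragment (the comments are mine, summarising the 2.5 documentation):

# relaxed_header_parser (Squid 2.5.STABLE9):
#   on   - silently accept sloppy but unambiguous headers (the default)
#   warn - accept them, but log a warning to cache.log
#   off  - parse strictly and reject malformed headers
relaxed_header_parser warn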

> * ECN on with Linux can cause 'zero sized reply' responses, although
> usually you'll get a timeout.  I have ECN enabled on my system and very few
> sites fail because of this, but there are a small number.  Read the
> squid FAQ for information about how to turn this off if it is a problem.
>
FYI, it does not happen only on Linux. Again, the problem and a possible
solution are not the point here; the point is that for the end user the site
opens when using "the other ISP", so for him it is an ISP problem. He doesn't
care whether it is squid, the remote site, network congestion, or something else.
Yep, I understand.
Anyway, IMO the error message is obscure for the user; it starts by saying
the URL:  (blank)
Do the users have "Show friendly HTTP error messages" ticked in their 
Internet Explorer options?  If they do, they will usually not see the squid 
error which explains what the problem is, and will instead see a generic 
"the page could not be displayed" message.  Unfortunately, IE hides these 
useful squid messages behind its own garbage, which is often even less 
useful to the end user than squid's messages.

If it's not that, then you should have something useful to look at either in 
the user's browser or in your cache.log.


The user obviously complains that he typed the URL correctly and yet on the
error message it is blank, and this causes misunderstandings between the
support staff and the user.
So it does not help to point us at FAQs, because I am speaking about the
user, not the administrator. The user does not need to learn squid, but what
he gets should be understandable enough, and most importantly he should get
the page whenever he would get it without squid.
Yes, of course.

I mean that a site should be accessible behind squid whenever it opens normally
in a browser without squid. Whether there is a wrong header or whatever is
not interesting here.
> * NTLM authentication, some uninformed site admins require or request
>
NO, I was not speaking about any authentication at all
>
> Can you give some examples of specific sites which you need to bypass
> squid for that you cannot get to display using the items I mentioned above?
>
First, some banking and other secure sites which need the GRE protocol, for
example, but I was not speaking about those.
GRE should be unaffected.  Squid does not process or handle GRE, only 
TCP/IP.
Are you using your squid as a firewall/router box, and not allowing GRE 
through?
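
If that is the case, a rule along these lines is what is missing (an ipfw
sketch, guessing from your mention of fwd rules; the rule number is arbitrary):

# let GRE (IP protocol 47, used e.g. by PPTP tunnels) pass through untouched
ipfw add 90 allow gre from any to any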

Lots of Blogger sites are giving errors. Sure, there are a lot of underscore
and whitespace problems, but the latter often cannot be resolved by squid
settings. On the other hand, they open normally with MSIE.
I haven't seen any before..
At work I can check for more; one specific example follows.
Other errors are like this one, even though this specific site is now working
after we contacted them. The site gave problems with squid > 2.5-S4, if I am
not wrong here.
GET / HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg,
application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint,
application/x-shockwave-flash, */*
Accept-Language: pt-br
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)
Host: www.redecard.com.br
Connection: Keep-Alive
That one is among the more broken ones I have seen yet:
[EMAIL PROTECTED] ~]# wget -S www.redecard.com.br
--00:38:38--  http://www.redecard.com.br/
   => `index.html.1'
Resolving www.redecard.com.br... 200.185.9.46
Connecting to www.redecard.com.br[200.

Re: [squid-users] Zero sized reply and other recent access problems

2005-03-06 Thread Henrik Nordstrom
On Sun, 6 Mar 2005, H Matik wrote:
anyway, IMO the error message is obscure for the user, it starts saying
the URL:  (blank)
Not in the problems I know of, with the odd exception where Squid cannot 
parse the request sent by the client.

If you gave a few examples of the sites you have problems with, helping you 
would be a lot easier, preferably with the relevant cache.log messages about 
the request/response.

As you do not provide any information on which sites are failing for you 
or how they fail (cache.log messages, error messages, etc.), we have to make 
wild guesses about the details of the problems you refer to, and this 
discussion is very unlikely to get anywhere meaningful.

Regards
Henrik


Re: [squid-users] Zero sized reply and other recent access problems

2005-03-06 Thread Henrik Nordstrom
On Sat, 5 Mar 2005, H Matik wrote:
Recently all of us are having problems with squid not serving certain
pages/objects anymore.
Examples please.
We do know that squid most probably does detect correct or incorrect HTML
code and tells us so via its error messages.
But I am not so sure this should be a squid task.
It isn't, and Squid does nothing of the kind. Squid could not care less about 
what is HTML. To Squid, an HTML page is just a sequence of characters with no 
meaning.

As Reuben said, Squid only cares about the validity of the HTTP protocol, 
and the things it cares about are there for good reasons (mostly security). 
It is known that there are several quite broken web sites out there which 
will not work via 2.5.STABLE9, and due to the nature of the bugs on these 
sites it is unlikely they will work with any later Squid release until the 
site administrators fix their critical server bugs.

Squid IMO should cache and serve what it gets from the server.
And this is what Squid does. The server must, however, speak the HTTP 
protocol in a somewhat meaningful dialect for Squid to understand what the 
server says and not reject it as a hacking attempt or other malicious 
intent.

The code check should be done by the browser - meaning incorrect code is a
browser problem or a web server problem, so it should be reported by the
browser, not by anything in the middle.
And this is exactly how it is.
We use transparent squid at lots of sites, and as soon as someone complains
about this kind of problem we rewrite our fwd rules so that the site does not
go through squid anymore.
You complain about all this, about what a proxy should or should not do, and 
still you intentionally and forcibly violate the fundamentals of TCP/IP by 
hijacking your users' requests? Transparent interception violates Internet 
Standard #3, "Requirements for Internet Hosts", and also the general spirit 
of the design of TCP/IP.

IMO it might be better for squid not to check code.
There are certain things Squid must check in the HTTP protocol used for 
transferring the HTML code. But Squid absolutely does NOT care about the 
HTML or other content of the requested site.

Customers say: "Without your cache I can access the site; with your cache I
cannot. I do not want to know why, and if you do not resolve this problem for
me I will not use your service anymore, but another where it works."
Unfortunately the world is not so unambiguous.
It may be worth mentioning that many of the sites failing with Squid 
2.5.STABLE9 are likely to start failing with newer browsers as well, for the 
same reasons Squid pukes on these sites.

So even if "I" loose first my customer second they do not use squid anymore. I
believe it could be considered to think about this.
I believe the 2.5.STABLE9 release has a very good balance in this.
Sure, there may still be a few buggy web servers out there where Squid 
could safely work around the server bugs, but each of these has to be 
analyzed very carefully and individually.

In addition, the only way of getting this done is to spend some time 
identifying why Squid rejects the responses from a certain site, and then 
open a discussion here on squid-users about how Squid could perhaps work 
around that broken web server.

If you cannot or will not investigate why problems arise but still expect 
everything to work, then you should have a support contract, either for 
Squid from one of the Squid support providers, or for a commercial 
proxy/cache if you prefer.

Just complaining without any information won't get you anywhere, except 
perhaps blacklisted by some of the subscribers here.


I would like to add that we have been using squid since 97/98, and what I
wrote here is not in any way meant as offensive criticism of the developers,
but a point to think about. So what do you think about this?
And believe me, we think very carefully about these things.
If we did not, then Squid-2.5.STABLE8 would have been released with the 
HTTP parser in its very strictest setting, i.e. the equivalent of 
2.5.STABLE9 configured with "relaxed_header_parser off", and in addition 
yelling a screenful of complaints per request into cache.log for each 
malfunctioning web server seen.

Regards
Henrik


Re: [squid-users] Zero sized reply and other recent access problems

2005-03-06 Thread H Matik
On Saturday 05 March 2005 23:41, Reuben Farrelly wrote:

> I think you've misunderstood something quite fundamental about how squid
> works:
>
Maybe I did not use the exact expressions you would like to see, but as you 
wrote, you did get it. Anyway, as I said in my mail, my intention was not to 
attack anybody.

>
> * Strict HTTP header parsing - implemented in the most recent STABLE
> releases of squid, you can turn this off via a squid.conf directive
> anyway (but it is useful to have it set to log bad pages).
>
What do you mean? relaxed_header_parser? I think this is on by default, not 
off; turning it off makes it parse strictly, or am I wrong here?

> * ECN on with Linux can cause 'zero sized reply' responses, although
> usually you'll get a timeout.  I have ECN enabled on my system and very few
> sites fail because of this, but there are a small number.  Read the
> squid FAQ for information about how to turn this off if it is a problem.
>

FYI, it does not happen only on Linux. Again, the problem and a possible 
solution are not the point here; the point is that for the end user the site 
opens when using "the other ISP", so for him it is an ISP problem. He doesn't 
care whether it is squid, the remote site, network congestion, or something else.

Anyway, IMO the error message is obscure for the user; it starts by saying

the URL:  (blank)

The user obviously complains that he typed the URL correctly and yet on the 
error message it is blank, and this causes misunderstandings between the 
support staff and the user.

So it does not help to point us at FAQs, because I am speaking about the 
user, not the administrator. The user does not need to learn squid, but what 
he gets should be understandable enough, and most importantly he should get 
the page whenever he would get it without squid.

I mean that a site should be accessible behind squid whenever it opens normally 
in a browser without squid. Whether there is a wrong header or whatever is 
not interesting here.


> * NTLM authentication, some uninformed site admins require or request
>
NO, I was not speaking about any authentication at all


>
> Can you give some examples of specific sites which you need to bypass
> squid for that you cannot get to display using the items I mentioned above?
>

First, some banking and other secure sites which need the GRE protocol, for 
example, but I was not speaking about those.

Lots of Blogger sites are giving errors. Sure, there are a lot of underscore 
and whitespace problems, but the latter often cannot be resolved by squid 
settings. On the other hand, they open normally with MSIE.

At work I can check for more; one specific example follows.

Other errors are like this one, even though this specific site is now working 
after we contacted them. The site gave problems with squid > 2.5-S4, if I am 
not wrong here.

GET / HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, 
application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, 
application/x-shockwave-flash, */*
Accept-Language: pt-br
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)
Host: www.redecard.com.br
Connection: Keep-Alive


Hans




> Reuben

-- 
___
Infomatik
(18)8112.7007
http://info.matik.com.br
Mensagens não assinadas com GPG não são minhas.
Messages without GPG signature are not from me.
___




RE: [squid-users] Zero sized reply and other recent access problems

2005-03-06 Thread Lucia Di Occhi
I must say that I totally agree with your comment, but I'd like to hear from 
some of the developers to better understand the whole picture as far as code 
checking goes.  We have some of the same issues you are experiencing, and we 
are using transparent proxying as well.  Lately it has become more and more 
of an admin nightmare responding to users who cannot access certain sites 
while on our network when they can access them just fine from home.


From: H Matik <[EMAIL PROTECTED]>
To: squid-users@squid-cache.org
Subject: [squid-users] Zero sized reply and other recent access problems
Date: Sat, 5 Mar 2005 18:08:44 -0300
Recently all of us are having problems with squid not serving certain
pages/objects anymore.
We do know that squid most probably does detect correct or incorrect HTML
code and tells us so via its error messages.
But I am not so sure this should be a squid task.
Squid IMO should cache and serve what it gets from the server.
The code check should be done by the browser - meaning incorrect code is a
browser problem or a web server problem, so it should be reported by the
browser, not by anything in the middle.
Even if the page code is buggy, the page could contain objects to be cached,
and that is what squid should do.

I say so because whoever uses squid is an ISP or a system admin of some kind
of network. So it should not become this person's problem if somebody is
coding his server's HTML pages incorrectly. He and his squid only serve his
customers or the people on his network.

IMO this strict HTML code checking is complicating network support for end
customers, which already was not so easy at times.
We use transparent squid at lots of sites, and as soon as someone complains
about this kind of problem we rewrite our fwd rules so that the site does not
go through squid anymore.

Even if we know that the remote site owner has no interest in somebody being
unable to access his site, we do not have the time to talk to him. Indeed,
it is not our problem, and we are not an HTML coding school teaching how to
correct errors. So we simply give up and bypass squid for such sites.

IMO it might be better for squid not to check code.
Customers say: "Without your cache I can access the site; with your cache I
cannot. I do not want to know why, and if you do not resolve this problem for
me I will not use your service anymore, but another where it works."

So even if "I" loose first my customer second they do not use squid 
anymore. I
believe it could be considered to think about this.

I would like to add that we have been using squid since 97/98, and what I
wrote here is not in any way meant as offensive criticism of the developers,
but a point to think about. So what do you think about this?

Hans



--
___
Infomatik
(18)8112.7007
http://info.matik.com.br
Mensagens não assinadas com GPG não são minhas.
Messages without GPG signature are not from me.
___



Re: [squid-users] Zero sized reply and other recent access problems

2005-03-05 Thread Reuben Farrelly
Hi,
H Matik wrote:
Recently all of us are having problems with squid not serving certain 
pages/objects anymore. 

We do know that squid most probably does detect correct or incorrect HTML 
code and tells us so via its error messages.

But I am not so sure this should be a squid task.
Squid IMO should cache and serve what it gets from the server.
The code check should be done by the browser - meaning incorrect code is a 
browser problem or a web server problem, so it should be reported by the 
browser, not by anything in the middle.

Even if the page code is buggy, the page could contain objects to be cached, 
and that is what squid should do.

I say so because whoever uses squid is an ISP or a system admin of some kind 
of network. So it should not become this person's problem if somebody is 
coding his server's HTML pages incorrectly. He and his squid only serve his 
customers or the people on his network.

IMO this strict HTML code checking is complicating network support for end 
customers, which already was not so easy at times.

We use transparent squid at lots of sites, and as soon as someone complains 
about this kind of problem we rewrite our fwd rules so that the site does not 
go through squid anymore.

Even if we know that the remote site owner has no interest in somebody being 
unable to access his site, we do not have the time to talk to him. Indeed, 
it is not our problem, and we are not an HTML coding school teaching how to 
correct errors. So we simply give up and bypass squid for such sites.

IMO it might be better for squid not to check code.

Customers say: "Without your cache I can access the site; with your cache I 
cannot. I do not want to know why, and if you do not resolve this problem for 
me I will not use your service anymore, but another where it works."

So even if "I" loose first my customer second they do not use squid anymore. I 
believe it could be considered to think about this.

I would like to add that we have been using squid since 97/98, and what I 
wrote here is not in any way meant as offensive criticism of the developers, 
but a point to think about. So what do you think about this?
I think you've misunderstood something quite fundamental about how squid 
works:

  Squid does not read, complain about, or validate HTML
In other words, it does not check it or care whether it is even HTML or a 
binary file.  Squid only cares about the HTTP _headers_ that the remote 
server issues when squid requests a document.  HTTP headers have nothing to 
do with HTML; they are generated by the HTTP server and administered by the 
server administrator, and they have nothing to do with the web pages on the 
server itself.

I suspect you mean to complain about a number of different things 
at once:

* Strict HTTP header parsing - implemented in the most recent STABLE 
releases of squid, you can turn this off via a squid.conf directive 
anyway (but it is useful to have it set to log bad pages).

* ECN on with Linux can cause 'zero sized reply' responses, although 
usually you'll get a timeout.  I have ECN enabled on my system and very few 
sites fail because of this, but there are a small number.  Read the 
squid FAQ for information about how to turn this off if it is a problem 
(see the note after this list).

* NTLM authentication - some uninformed site admins require or request 
NTLM authentication; this is not supported, is not recommended by Microsoft 
for use on the internet, and will not work (you'll get an error message).  
Squid should not support things which are known to be broken and not 
supposed to work!
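
A side note on the ECN item above: on Linux the FAQ fix boils down to a
single sysctl (the FAQ's exact wording may differ, but this is the knob it
refers to):

# disable TCP ECN negotiation at run time
echo 0 > /proc/sys/net/ipv4/tcp_ecn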

Can you give some examples of specific sites that you need to bypass 
squid for, which you cannot get to display using the items I mentioned above?

Reuben


[squid-users] Zero sized reply and other recent access problems

2005-03-05 Thread H Matik
Recently all of us are having problems with squid not serving certain 
pages/objects anymore. 

We do know that squid most probably does detect correct or incorrect HTML 
code and tells us so via its error messages.


But I am not so sure this should be a squid task.


Squid IMO should cache and serve what it gets from the server.

The code check should be done by the browser - meaning incorrect code is a 
browser problem or a web server problem, so it should be reported by the 
browser, not by anything in the middle.

Even if the page code is buggy, the page could contain objects to be cached, 
and that is what squid should do.

I say so because whoever uses squid is an ISP or a system admin of some kind 
of network. So it should not become this person's problem if somebody is 
coding his server's HTML pages incorrectly. He and his squid only serve his 
customers or the people on his network.

IMO this strict HTML code checking is complicating network support for end 
customers, which already was not so easy at times.

We use transparent squid at lots of sites, and as soon as someone complains 
about this kind of problem we rewrite our fwd rules so that the site does not 
go through squid anymore.
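
To illustrate what rewriting the fwd rules amounts to (an ipfw sketch; the
address and rule numbers are invented for the example):

# the problem site goes straight out, bypassing the cache
ipfw add 100 allow tcp from any to 200.185.9.46 80
# everything else on port 80 is still intercepted by squid
ipfw add 200 fwd 127.0.0.1,3128 tcp from any to any 80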

Even if we know that the remote site owner has no interest in somebody being 
unable to access his site, we do not have the time to talk to him. Indeed, 
it is not our problem, and we are not an HTML coding school teaching how to 
correct errors. So we simply give up and bypass squid for such sites.

IMO it might be better for squid not to check code.

Customers say: "Without your cache I can access the site; with your cache I 
cannot. I do not want to know why, and if you do not resolve this problem for 
me I will not use your service anymore, but another where it works."

So even if "I" loose first my customer second they do not use squid anymore. I 
believe it could be considered to think about this.

I would like to add that we have been using squid since 97/98, and what I 
wrote here is not in any way meant as offensive criticism of the developers, 
but a point to think about. So what do you think about this?

Hans







-- 
___
Infomatik
(18)8112.7007
http://info.matik.com.br
Mensagens não assinadas com GPG não são minhas.
Messages without GPG signature are not from me.
___

