Re: [squid-users] Zero sized reply and other recent access problems
On Tue, 8 Mar 2005, Reuben Farrelly wrote:

> This means that as long as you have relaxed_header_parser set to on or warn, or simply not defined, the old behaviour will still be the same as older squid.

Not quite.. the old parser accepted a lot of crap which was ambiguous, such as different Content-Length headers and a few other really bad things.

> Personally I recommend at least "warn", as it has allowed me to see some of the broken sites and inform the relevant people of their broken behaviour, but I understand not everyone can be bothered..

Same here, but I chose the silent default of "on" for the distributed Squid, so as not to confuse people too much about why Squid now complains about malformed HTTP responses.

Duplicate "Connection" headers on lines 5 and 6, and whitespace on line 4 between "Content" and "Location". No wonder it does not work properly. All of this is silently accepted by Squid-2.5.STABLE9 in its default "relaxed_header_parser on" setting. Squid-2.5.STABLE8 rejects the response due to the whitespace.

Regards Henrik
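Henrik's two examples of ambiguous "crap" (conflicting duplicate headers, and whitespace inside a header field name) can be illustrated with a short Python sketch. This is not Squid's actual parser; `find_header_problems` is a made-up helper for illustration only.

```python
# Sketch of the checks a strict HTTP header parser performs.
# Illustrative only -- not Squid's real parser.

def find_header_problems(raw_headers: str) -> list[str]:
    """Return the problems a strict parser would reject in a header block."""
    problems = []
    seen = {}
    for line in raw_headers.splitlines():
        if not line.strip():
            continue
        name, sep, value = line.partition(":")
        if not sep:
            problems.append(f"no colon in header line: {line!r}")
            continue
        # RFC 2616: no whitespace is allowed inside a field-name.
        if name != name.strip() or " " in name.strip():
            problems.append(f"whitespace in field name: {name!r}")
        key = name.strip().lower()
        if key in seen and seen[key] != value.strip():
            # Conflicting duplicates (e.g. two differing Content-Length
            # values) are ambiguous and a known smuggling vector.
            problems.append(f"conflicting duplicate header: {key}")
        seen[key] = value.strip()
    return problems

broken = (
    "Content-Type: text/html\n"
    "Content Location: /index.html\n"  # space inside the field name
    "Connection: Keep-Alive\n"
    "Connection: close\n"              # conflicting duplicate
)
print(find_header_problems(broken))
```

A relaxed parser would silently repair the first problem and pick one of the duplicates; the point of "warn" is that you at least get a log line telling you the site is broken.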
Re: [squid-users] Zero sized reply and other recent access problems
Hi again Hans,

At 08:52 a.m. 7/03/2005, H Matik wrote:

> On Saturday 05 March 2005 23:41, Reuben Farrelly wrote:
> > I think you've misunderstood something quite fundamental about how squid
> > works:
>
> maybe I did not use the exact expressions you would like to see, but as you wrote, you did get my point. Anyway, as I said in my mail, my intention was not to attack anybody.

I know, I just am asking you to be specific with the errors you are reporting. None of the developers would complain in the slightest if you could provide good evidence of a bug, believe me ;-)

> > * Strict HTTP header parsing - implemented in the most recent STABLE
> > releases of squid, you can turn this off via a squid.conf directive
> > anyway (but it is useful to have it set to log bad pages).
>
> what do you mean? relaxed_header_parser? I think this is on by default, not off; turning it off makes it parse strictly, or am I wrong here?

Yes, it is on by default. In other words (from squid.conf), with this default setting "Squid accepts certain forms of non-compliant HTTP messages where it is unambiguous what the sending application intended even if the message is not correctly formatted." This means that as long as you have relaxed_header_parser set to on or warn, or simply not defined, the old behaviour will still be the same as older squid. Personally I recommend at least "warn", as it has allowed me to see some of the broken sites and inform the relevant people of their broken behaviour, but I understand not everyone can be bothered..

> > * ECN on with Linux can cause 'zero sized reply' responses, although
> > usually you'll get a timeout. I have ECN on on my system and very few
> > sites fail because of this, but there are a small number. Read the
> > squid FAQ for information about how to turn this off if it is a problem.
> FYI it does not happen only on Linux. Again, the problem and a possible solution here is not the point; the point is that for the end-user the site opens using "the other ISP", so for him it is an ISP problem. He doesn't care if it is squid or the remote site, network congestion or other.

Yep, I understand.

> anyway, IMO the error message is obscure for the user, it starts by saying URL: (blank)

Do the users have "Show friendly HTTP error messages" ticked in their Internet Explorer options? If they do, they will usually not see the squid error which explains what the problem is, and will instead see a generic "the page could not be displayed" message. Unfortunately, IE hides these useful squid messages with its own garbage, which is often more useless to the end user than squid's messages. If it's not that, then you should have something useful to look at either in the user's browser or in your cache.log.

> the user obviously complains that he typed the URL correctly while on the error msg it is blank, so this causes understanding problems between the support staff and the user. Then it does not help to send them to read FAQs, because what I am speaking about is the user, not the administrator. The user does not need to learn squid, but what he gets should be understandable enough, and most importantly he should get the page just as he gets it without squid.

Yes, of course.

> I mean that a site should be accessible behind squid when it opens normally with a browser without squid. It is not interesting here if there is a wrong header or whatever.
>
> > * NTLM authentication, some uninformed site admins require or request
>
> NO, I was not speaking about any authentication at all
>
> > Can you give some examples of specific sites which you need to bypass
> > squid for that you cannot get to display using the items I mentioned above?
>
> First some banking and other secure sites which need the GRE protocol for example, but I was not speaking about those.

GRE should be unaffected.
Squid does not process or handle GRE, only TCP/IP. Are you using your squid as a firewall/router box, and not allowing GRE through?

> Lots of Blogger sites are giving errors. Sure, there are a lot of underscore and whitespace problems, but the latter often are not resolvable by squid settings. On the other side they open normally with MSIE.

I haven't seen any before..

> At work I can check for more; one specific example follows. Other errors are like this, even if this specific site is now working after I contacted them. The site gave problems with squid > 2.5-S4 if I am not wrong here.
>
> GET / HTTP/1.1
> Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, application/x-shockwave-flash, */*
> Accept-Language: pt-br
> Accept-Encoding: gzip, deflate
> User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)
> Host: www.redecard.com.br
> Connection: Keep-Alive

That one is one of the more broken ones I have seen yet:

[EMAIL PROTECTED] ~]# wget -S www.redecard.com.br
--00:38:38-- http://www.redecard.com.br/
           => `index.html.1'
Resolving www.redecard.com.br... 200.185.9.46
Connecting to www.redecard.com.br[200.
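The `wget -S` transcript above is truncated before the interesting headers arrive, but the inspection it performs is easy to sketch: take the raw response and split its head into a status line and header pairs. `parse_response_head` below is a hypothetical helper, demonstrated on canned bytes rather than a live fetch.

```python
# Sketch of header inspection, similar in spirit to `wget -S`.
# parse_response_head() is a made-up helper shown on canned bytes;
# to probe a live server you would read the same bytes off a socket.

def parse_response_head(raw: bytes) -> tuple[str, list[tuple[str, str]]]:
    """Split a raw HTTP response head into (status_line, header pairs)."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    lines = head.split("\r\n")
    headers = []
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return lines[0], headers

canned = (b"HTTP/1.1 200 OK\r\n"
          b"Content-Type: text/html\r\n"
          b"Connection: Keep-Alive\r\n"
          b"Connection: close\r\n"
          b"\r\n"
          b"<html>...</html>")
status, headers = parse_response_head(canned)
print(status)  # HTTP/1.1 200 OK
for name, value in headers:
    print(f"  {name}: {value}")
```

Dumping the headers like this (or simply re-running `wget -S` / checking cache.log) is usually enough to see why a strict parser rejects a site.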
Re: [squid-users] Zero sized reply and other recent access problems
On Sun, 6 Mar 2005, H Matik wrote:

> anyway, IMO the error message is obscure for the user, it starts by saying URL: (blank)

Not in the problems I know of, with the odd exception of where Squid cannot parse the request sent by the client. If you gave a few examples of the sites you have problems with, helping you would be a lot easier, preferably with the relevant cache.log messages about the request/response. As you do not provide any information on which sites are failing for you or how they fail (cache.log messages, error messages etc.), we have to make wild guesses about the details of the problems you refer to, and this discussion is very unlikely to get anywhere meaningful.

Regards Henrik
Re: [squid-users] Zero sized reply and other recent access problems
On Sat, 5 Mar 2005, H Matik wrote:

> Recently all of us are having problems with squid not serving certain pages/objects anymore.

Examples please.

> We do know that squid most probably does detect correct or incorrect html codes and tells it via its error messages. But I am not so sure if this should be a squid task.

It isn't, and Squid does nothing of the kind. Squid could not care less about what is HTML. To Squid a HTML page is just a sequence of characters of no meaning to Squid. As Reuben said, Squid only cares about the validity of the HTTP protocol, and the things it cares about are for good reasons (mostly security). It is known that there are several quite broken web sites out there which will not work via 2.5.STABLE9, and due to the nature of the bugs in these sites it is unlikely they will work with any later Squid releases until the site administrators fix their critical server bugs.

> Squid IMO should cache and serve what it gets from the server.

And this is what Squid does. The server must however speak the HTTP protocol in a somewhat meaningful dialect for Squid to understand what the server says and not reject it as a hacker attempt or other malicious intent.

> The code check should be done by the browser - meaning incorrect code is a browser problem or a web server problem, so it should be advised by the browser, not by anything in the middle.

And this is exactly how it is.

> We here do use transparent squid on lots of sites, and as soon as someone complains about this kind of problem we rewrite our fwd rules so that it does not go through squid anymore.

You complain all this about what a proxy should or should not do, and still you intentionally and forcibly violate the fundamentals of TCP/IP by hijacking your users' requests? Transparent interception violates Internet Standard #3 "Requirements for Internet hosts" and also the general spirit of the design of TCP/IP.

> IMO I think it might be better for squid not to check code.
There are certain things Squid must check in the HTTP protocol used for transferring the HTML code. But Squid absolutely does NOT care about the HTML or other contents of the requested site.

> Customers say: "Without your cache I can access the site, with your cache not. I do not want to know about it, and if you do not resolve this problem for me I will not use your service anymore but another where it works."

Unfortunately the world is not so unambiguous. It may be worth mentioning that many of the sites failing with Squid 2.5.STABLE9 are likely to start failing with newer browsers as well, for the same reasons Squid pukes on these sites.

> So even if "I" lose first my customer, second they do not use squid anymore. I believe it could be considered to think about this.

I believe the 2.5.STABLE9 release has a very good balance in this. Sure, there may still be a few buggy web servers out there where Squid could safely work around the server bugs, but each of these has to be analyzed very carefully individually. In addition the only way of getting this done is to spend some time on identifying why Squid rejects the responses from a certain site, and then open a discussion here on squid-users on how Squid maybe could work around that broken web server. If you cannot or will not investigate why problems arise but still expect everything to work, then you should have a support contract, either for Squid from one of the Squid support providers, or for a commercial proxy/cache if you prefer. Just complaining without any information won't get you anywhere, except perhaps blacklisted by some of the subscribers here.

> I would like to add that we have been using squid since 97/98, and what I wrote here is not in any way meant as offending criticism of the developers, but a point to think about. So what do you think about this?

And believe me, we think very carefully about these things. If we did not, then Squid-2.5.STABLE8 would have been released with the HTTP parser in its very strictest setting, i.e. the equivalent of 2.5.STABLE9 configured with "relaxed_header_parser off", and in addition yelling a screenful of complaints per request in cache.log for each malfunctioning web server seen.

Regards Henrik
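For reference, the directive Henrik describes lives in squid.conf; a sketch of its three values as documented around 2.5.STABLE9 (consult your release's default squid.conf for the authoritative wording):

```
# squid.conf fragment (Squid 2.5.STABLE9 and later).
# on   - default; silently accept unambiguous non-compliant messages
# warn - accept them, but log a complaint to cache.log
# off  - strict parsing; reject malformed HTTP messages outright
relaxed_header_parser warn
```

"warn" is the setting recommended earlier in the thread: behaviour stays compatible, but cache.log tells you which sites are broken.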
Re: [squid-users] Zero sized reply and other recent access problems
On Saturday 05 March 2005 23:41, Reuben Farrelly wrote:
> I think you've misunderstood something quite fundamental about how squid
> works:

maybe I did not use the exact expressions you would like to see, but as you wrote, you did get my point. Anyway, as I said in my mail, my intention was not to attack anybody.

> * Strict HTTP header parsing - implemented in the most recent STABLE
> releases of squid, you can turn this off via a squid.conf directive
> anyway (but it is useful to have it set to log bad pages).

what do you mean? relaxed_header_parser? I think this is on by default, not off; turning it off makes it parse strictly, or am I wrong here?

> * ECN on with Linux can cause 'zero sized reply' responses, although
> usually you'll get a timeout. I have ECN on on my system and very few
> sites fail because of this, but there are a small number. Read the
> squid FAQ for information about how to turn this off if it is a problem.

FYI it does not happen only on Linux. Again, the problem and a possible solution here is not the point; the point is that for the end-user the site opens using "the other ISP", so for him it is an ISP problem. He doesn't care if it is squid or the remote site, network congestion or other.

anyway, IMO the error message is obscure for the user, it starts by saying URL: (blank)

the user obviously complains that he typed the URL correctly while on the error msg it is blank, so this causes understanding problems between the support staff and the user. Then it does not help to send them to read FAQs, because what I am speaking about is the user, not the administrator. The user does not need to learn squid, but what he gets should be understandable enough, and most importantly he should get the page just as he gets it without squid.

I mean that a site should be accessible behind squid when it opens normally with a browser without squid. It is not interesting here if there is a wrong header or whatever.
> * NTLM authentication, some uninformed site admins require or request

NO, I was not speaking about any authentication at all

> Can you give some examples of specific sites which you need to bypass
> squid for that you cannot get to display using the items I mentioned above?

First some banking and other secure sites which need the GRE protocol for example, but I was not speaking about those. Lots of Blogger sites are giving errors. Sure, there are a lot of underscore and whitespace problems, but the latter often are not resolvable by squid settings. On the other side they open normally with MSIE.

At work I can check for more; one specific example follows. Other errors are like this, even if this specific site is now working after I contacted them. The site gave problems with squid > 2.5-S4 if I am not wrong here.

GET / HTTP/1.1
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/vnd.ms-excel, application/msword, application/vnd.ms-powerpoint, application/x-shockwave-flash, */*
Accept-Language: pt-br
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Windows 98; DigExt)
Host: www.redecard.com.br
Connection: Keep-Alive

Hans

> Reuben

--
___
Infomatik
(18)8112.7007
http://info.matik.com.br
Mensagens não assinadas com GPG não são minhas.
Messages without GPG signature are not from me.
___
RE: [squid-users] Zero sized reply and other recent access problems
I must say that I totally agree with your comment, but I'd like to hear from some of the developers to better understand the whole picture as far as code checking goes. We do have some of the same issues you are experiencing, and we are using transparent proxying as well. Lately it is becoming more and more of an admin nightmare responding to users who cannot access certain sites while on our network when they can access them just fine from home.

From: H Matik <[EMAIL PROTECTED]>
To: squid-users@squid-cache.org
Subject: [squid-users] Zero sized reply and other recent access problems
Date: Sat, 5 Mar 2005 18:08:44 -0300

Recently all of us are having problems with squid not serving certain pages/objects anymore. We do know that squid most probably does detect correct or incorrect html codes and tells it via its error messages. But I am not so sure if this should be a squid task. Squid IMO should cache and serve what it gets from the server. The code check should be done by the browser - meaning incorrect code is a browser problem or a web server problem, so it should be advised by the browser, not by anything in the middle. Even if the page code is buggy, the page could contain objects to be cached, and that is what squid should do.

I say so because whoever uses squid is an ISP or a system admin of some kind of network. So it should not turn into this man's problem if somebody is coding his server's html pages incorrectly. He with his squid only serves his customers or his people on his network. IMO this strict html code checking is complicating network support for end customers, which already was or is not so easy sometimes.

We here do use transparent squid on lots of sites, and as soon as someone complains about this kind of problem we rewrite our fwd rules so that it does not go through squid anymore. Even if we know that the remote site owner has no interest in somebody not being capable of accessing his site, we do not have the time to talk to him. Indeed it is not our problem, and we are not an html coding school teaching how to correct errors. So here we simply desist and bypass squid for such kinds of sites. IMO I think it might be better for squid not to check code.

Customers say: "Without your cache I can access the site, with your cache not. I do not want to know about it, and if you do not resolve this problem for me I will not use your service anymore but another where it works." So even if "I" lose first my customer, second they do not use squid anymore. I believe it could be considered to think about this.

I would like to add that we have been using squid since 97/98, and what I wrote here is not in any way meant as offending criticism of the developers, but a point to think about. So what do you think about this?

Hans
--
___
Infomatik
(18)8112.7007
http://info.matik.com.br
Mensagens não assinadas com GPG não são minhas.
Messages without GPG signature are not from me.
___
Re: [squid-users] Zero sized reply and other recent access problems
Hi,

H Matik wrote:

> Recently all of us are having problems with squid not serving certain pages/objects anymore. We do know that squid most probably does detect correct or incorrect html codes and tells it via its error messages. But I am not so sure if this should be a squid task. Squid IMO should cache and serve what it gets from the server. The code check should be done by the browser - meaning incorrect code is a browser problem or a web server problem, so it should be advised by the browser, not by anything in the middle. Even if the page code is buggy, the page could contain objects to be cached, and that is what squid should do.
>
> I say so because whoever uses squid is an ISP or a system admin of some kind of network. So it should not turn into this man's problem if somebody is coding his server's html pages incorrectly. He with his squid only serves his customers or his people on his network. IMO this strict html code checking is complicating network support for end customers, which already was or is not so easy sometimes.
>
> We here do use transparent squid on lots of sites, and as soon as someone complains about this kind of problem we rewrite our fwd rules so that it does not go through squid anymore. Even if we know that the remote site owner has no interest in somebody not being capable of accessing his site, we do not have the time to talk to him. Indeed it is not our problem, and we are not an html coding school teaching how to correct errors. So here we simply desist and bypass squid for such kinds of sites. IMO I think it might be better for squid not to check code.
>
> Customers say: "Without your cache I can access the site, with your cache not. I do not want to know about it, and if you do not resolve this problem for me I will not use your service anymore but another where it works." So even if "I" lose first my customer, second they do not use squid anymore. I believe it could be considered to think about this.
>
> I would like to add that we have been using squid since 97/98, and what I wrote here is not in any way meant as offending criticism of the developers, but a point to think about. So what do you think about this?

I think you've misunderstood something quite fundamental about how squid works: Squid does not read, complain about, or validate HTML. In other words, it does not check it or care if it is even HTML, or if it is a binary file. Squid only cares about the HTTP _headers_ that the remote server issues when squid requests a document. HTTP headers have nothing to do with HTML; HTTP headers are generated by the HTTP server and administered by the server administrator, and have nothing to do with the web pages on the server itself.

I suspect you are meaning to complain about a number of different things at once:

* Strict HTTP header parsing - implemented in the most recent STABLE releases of squid; you can turn this off via a squid.conf directive anyway (but it is useful to have it set to log bad pages).

* ECN on with Linux can cause 'zero sized reply' responses, although usually you'll get a timeout. I have ECN on on my system and very few sites fail because of this, but there are a small number. Read the squid FAQ for information about how to turn this off if it is a problem.

* NTLM authentication - some uninformed site admins require or request NTLM authentication; this is not supported, not recommended by Microsoft on the internet, and will not work (you'll get an error message). Squid should not support things which are known to be broken and not supposed to work!

Can you give some examples of specific sites which you need to bypass squid for that you cannot get to display using the items I mentioned above?

Reuben
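On the ECN point above: the workaround the squid FAQ describes for Linux of this era is a kernel sysctl. A sketch, assuming a kernel that exposes net.ipv4.tcp_ecn (root required for the runtime command):

```
# Disable ECN to avoid 'zero sized reply' against broken remote hosts.
# Runtime toggle (does not survive a reboot):
#   sysctl -w net.ipv4.tcp_ecn=0
# Persistent -- add this line to /etc/sysctl.conf and run `sysctl -p`:
net.ipv4.tcp_ecn = 0
```

Note this trades standards-compliant congestion signalling for compatibility with a small number of broken firewalls/servers, which is why Reuben leaves ECN on and accepts the few failures.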
[squid-users] Zero sized reply and other recent access problems
Recently all of us are having problems with squid not serving certain pages/objects anymore. We do know that squid most probably does detect correct or incorrect html codes and tells it via its error messages. But I am not so sure if this should be a squid task. Squid IMO should cache and serve what it gets from the server. The code check should be done by the browser - meaning incorrect code is a browser problem or a web server problem, so it should be advised by the browser, not by anything in the middle. Even if the page code is buggy, the page could contain objects to be cached, and that is what squid should do.

I say so because whoever uses squid is an ISP or a system admin of some kind of network. So it should not turn into this man's problem if somebody is coding his server's html pages incorrectly. He with his squid only serves his customers or his people on his network. IMO this strict html code checking is complicating network support for end customers, which already was or is not so easy sometimes.

We here do use transparent squid on lots of sites, and as soon as someone complains about this kind of problem we rewrite our fwd rules so that it does not go through squid anymore. Even if we know that the remote site owner has no interest in somebody not being capable of accessing his site, we do not have the time to talk to him. Indeed it is not our problem, and we are not an html coding school teaching how to correct errors. So here we simply desist and bypass squid for such kinds of sites. IMO I think it might be better for squid not to check code.

Customers say: "Without your cache I can access the site, with your cache not. I do not want to know about it, and if you do not resolve this problem for me I will not use your service anymore but another where it works." So even if "I" lose first my customer, second they do not use squid anymore. I believe it could be considered to think about this.

I would like to add that we have been using squid since 97/98, and what I wrote here is not in any way meant as offending criticism of the developers, but a point to think about. So what do you think about this?

Hans
--
___
Infomatik
(18)8112.7007
http://info.matik.com.br
Mensagens não assinadas com GPG não são minhas.
Messages without GPG signature are not from me.
___