Re: [squid-dev] [PATCH] Temporary fix to restore compatibility with Amazon
On 06/26/2015 06:44 AM, Amos Jeffries wrote:
> +    // invalid character somewhere in the line.
> +    // As long as we can find the LF, accept the characters
> +    // which we know are invalid in any URI, but actively used
> +    LfDelim.add('\0'); // Java
> +    LfDelim.add(' ');  // IIS
> +    LfDelim.add('\"'); // Bing
> +    LfDelim.add('\\'); // MSIE, Firefox
> +    LfDelim.add('|');  // Amazon

On 06/26/2015 09:40 AM, Alex Rousskov wrote:
> In your patch, please add support for all URI characters that we can support (or at least all the "unwise" ones from RFC 2396), not just the characters that recent deployments have already confirmed as necessary to accommodate. We do not want to come back to this every time some app starts sending slightly malformed URIs.

Just got another bug report from the real world. This time it is about the '^' character used in URLs on a Microsoft news site (probably coming from some affiliated advertisement services).

I am posting this not just to emphasize that the list of added characters is too limited, but to re-emphasize that the whole "allow what we know is actively used" approach is unfortunate. In your long-term patch, please add support for all URI characters that we can support (or at least all the "unwise" ones from RFC 2396).

Thank you,

Alex.
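A minimal sketch of what covering "all the unwise ones from RFC 2396" at once could look like, assuming Squid's CharacterSet string constructor and operator += behave as in trunk at the time; the UnwiseChars name is illustrative, not the committed patch:

    // Sketch only: RFC 2396 declares these octets "unwise" in URIs:
    //   unwise = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"
    // Covering the whole set at once avoids growing the list one bug
    // report at a time ('|' for Amazon, '^' for the news site, ...).
    static const CharacterSet UnwiseChars("rfc2396-unwise", "{}|\\^[]`");
    LfDelim += UnwiseChars; // merge into the tolerated-characters set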
Re: [squid-dev] [PATCH] Temporary fix to restore compatibility with Amazon
On 06/26/2015 06:44 AM, Amos Jeffries wrote:
> On 26/06/2015 7:55 a.m., Alex Rousskov wrote:
>> Tokenizer cannot handle URIs with whitespaces directly, like your patch attempts to do: Tokenizer alone cannot handle ambiguous grammars. To handle such URIs well, you have two options IMO:
>>
>> A. The old "do it by hand" way:
>>   0. Trim request line to remove trailing whitespace and CR*LF.
>>   1. Find the *last* whitespace on the trimmed request line.
>>   2. To get the request method and URI, apply the tokenizer to the request line prefix before that last whitespace.
>>   3. To get the protocol parts, apply the tokenizer to the request line suffix after that last whitespace.
>>   4. Make the code even messier to handle HTTP/0.9 cases if needed.
>>
>> B. The new optimized and simplified "optimistic Tokenizer" way. Here is a sketch:
>>
>>     SBuf uri; // accumulates request URI characters
>>     if (!tok.prefix(uri, StrictUriChars))
>>         return false; // URI does not start with a valid character
>>
>>     // in the order of correctness (and popularity!):
>>     const bool parsedSuffix =
>>         // canonical HTTP/1 format
>>         parseCanonicalReqLineTail(uri, tok) ||
>>         // HTTP/1 format but with bad characters in URI
>>         parseBadUriReqLineTail(uri, tok) ||
>>         // HTTP/1 format but with bad characters and spaces in URI
>>         parseSpacedReqLineTail(uri, tok) ||
>>         // HTTP/0 format (without the protocol part at all)
>>         parseHttpZeroReqLineTail(uri, tok);
>>     Assert(parsedSuffix); // parseHttpZeroReqLineTail() cannot fail
>>     ...
>
> C. Locate the LF character and parse in reverse: HTTP-Version label, WSP, then the remainder is the URI.

Yes, this is a better variant of option A, using Tokenizer::suffix(); I missed that method. It is still "manual labor" to accommodate rare special cases while complicating and slowing down parsing of the common case.

> (B) is an interesting alternative to what we have now (C). Although you are missing the relaxed vs strict characters which the RFC mandates (for both URI and whitespace, separately permutating). That complicates things a bit.

I do not see where design B is missing something. Obviously(?), if there are more special cases than the sketch shows, they should be added. I was just illustrating the approach. The actual implementation should be different.

> If I can find some time I'll give this (B) style a bit of testing and see how it matches up against the current code. Could prove faster on very long URIs.

Given correct implementations of A and B, B should be a tiny bit faster for an average URI because it does less work in common cases.

In your patch, please add support for all URI characters that we can support (or at least all the "unwise" ones from RFC 2396), not just the characters that recent deployments have already confirmed as necessary to accommodate. We do not want to come back to this every time some app starts sending slightly malformed URIs.

I would also recommend accommodating whitespace after HTTP-version.

Ideally, parseRequestFirstLine(), or at least the new code, should be split into several methods. It is too long, convoluted, and is getting worse.

I remain opposed to downgrading HTTP/1 requests with slightly malformed URIs to HTTP/0.

HTH,

Alex.
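To make the Tokenizer::suffix() variant concrete, here is a rough, untested sketch. It assumes suffix() mirrors prefix() but consumes from the end of the buffer, and that remaining() returns the unconsumed middle; VersionChars and WspChars are hypothetical CharacterSet names used only for illustration:

    // Hypothetical sketch of "parse in reverse" via Tokenizer::suffix().
    // Assumes the request line was already trimmed of CR*LF and trailing
    // whitespace (step 0 of option A).
    static bool
    splitReqLineInReverse(const SBuf &trimmedLine, SBuf &methodAndUri, SBuf &proto)
    {
        Parser::Tokenizer tok(trimmedLine);
        SBuf wsp;
        if (tok.suffix(proto, VersionChars) && tok.suffix(wsp, WspChars)) {
            methodAndUri = tok.remaining(); // "method SP request-target"
            return true; // HTTP/1.x shape; URI may contain unwise octets
        }
        methodAndUri = trimmedLine; // no version at the end: HTTP/0.9 shape
        proto.clear();
        return false;
    }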
Re: [squid-dev] [PATCH] Temporary fix to restore compatibility with Amazon
On 26/06/2015 7:55 a.m., Alex Rousskov wrote:
> On 06/25/2015 08:13 AM, Amos Jeffries wrote:
>> Which is why I want to go the route of HTTP/0.9 handling. It's clear when products encounter it and cause themselves problems.
>
> Sigh. You are repeating essentially the same argument as before. Any "let's create problems for something that appears to work without Squid" approach often results in *us* wasting time on workarounds for those problems. You may be convinced that creating short-term problems for others will make the world a better place long-term, but there is no way to prove or disprove that assumption, and there is no objective way to even compare the possible long-term gain with the likely short-term pain (especially when it is somebody else who is hurting). The history is full of examples where the "future happiness justifies short-term suffering" approach either works great or fails miserably. I doubt we can come up with a formula that predicts the outcome.
>
>> Was the '|' character non-compliance bug reported against the product?
>
> Unknown to me, but I suspect folks suffering from this do not report bugs to companies they have no relationship with (and may not even know what the product is!).
>
>> Anyhow. I've thrown the URI you mentioned through my test copy of trunk and it's not hanging looking for the end of mime headers like you said. It's waiting for the end of the URI:
>>
>>   parseHttpRequest: Incomplete request, waiting for end of request line
>
> Please forgive me for misinterpreting the Squid log line that lies about an "Incomplete request" and "waiting for end of request line".

No lie there. It hasn't got near starting to search for the URL yet. When the relaxed parser is enabled (i.e. by default) it has to find the end of the request-line first and work backwards. It found characters that are invalid anywhere in the HTTP first-line / request-line before it found the LF. It just happens that they are in the URI portion of the request-line. It would report the same thing on receiving:

  GET / HTTP/1.0|\r\n

> If you prefer the "waiting for the end of the URI" description of the broken state (where the complete URI, the complete request line, and the complete request are all available), I am not going to argue that it is also inaccurate.
>
>> Attached is a patch that replaces the hang behaviour with an invalid-request error page on standards-compliant builds. But, when --enable-http-violations is used, it accepts the invalid characters we know are in use and passes the request through as HTTP/0.9.
>
> ... which does not work because the request does not actually use the HTTP/0.9 format. You end up with:
>
>   Forwarding client request ... url=http://localhost:8080/path|with|unwise|charactersHTTP/1.1
>   GET /path|with|unwise|charactersHTTP/1.1 HTTP/1.1
>   Via: 0.9 protofroot (squid/4.0.0-BZR)
>
> The trailing "HTTP/1.1" part of the request line is not a part of the URI, but your changes make it so.

Sorry. Attached is a patch which fixes that and resolves accel/intercept issues with the mime block as well.

> Tokenizer cannot handle URIs with whitespaces directly, like your patch attempts to do: Tokenizer alone cannot handle ambiguous grammars. To handle such URIs well, you have two options IMO:
>
> A. The old "do it by hand" way:
>   0. Trim request line to remove trailing whitespace and CR*LF.
>   1. Find the *last* whitespace on the trimmed request line.
>   2. To get the request method and URI, apply the tokenizer to the request line prefix before that last whitespace.
>   3. To get the protocol parts, apply the tokenizer to the request line suffix after that last whitespace.
>   4. Make the code even messier to handle HTTP/0.9 cases if needed.
>
> B. The new optimized and simplified "optimistic Tokenizer" way. Here is a sketch:
>
>     SBuf uri; // accumulates request URI characters
>     if (!tok.prefix(uri, StrictUriChars))
>         return false; // URI does not start with a valid character
>
>     // in the order of correctness (and popularity!):
>     const bool parsedSuffix =
>         // canonical HTTP/1 format
>         parseCanonicalReqLineTail(uri, tok) ||
>         // HTTP/1 format but with bad characters in URI
>         parseBadUriReqLineTail(uri, tok) ||
>         // HTTP/1 format but with bad characters and spaces in URI
>         parseSpacedReqLineTail(uri, tok) ||
>         // HTTP/0 format (without the protocol part at all)
>         parseHttpZeroReqLineTail(uri, tok);
>     Assert(parsedSuffix); // parseHttpZeroReqLineTail() cannot fail
>     ...

C. Locate the LF character and parse in reverse: HTTP-Version label, WSP, then the remainder is the URI.

(B) is an interesting alternative to what we have now (C). Although you are missing the relaxed vs strict characters which the RFC mandates (for both URI and whitespace, separately permutating). That complicates things a bit.

If I can find some time I'll give this (B) style a bit of testing and see how it matches up against the current code. Could prove faster on very long URIs.
Re: [squid-dev] [PATCH] Temporary fix to restore compatibility with Amazon
On 06/25/2015 08:13 AM, Amos Jeffries wrote:
> Which is why I want to go the route of HTTP/0.9 handling. It's clear when products encounter it and cause themselves problems.

Sigh. You are repeating essentially the same argument as before. Any "let's create problems for something that appears to work without Squid" approach often results in *us* wasting time on workarounds for those problems. You may be convinced that creating short-term problems for others will make the world a better place long-term, but there is no way to prove or disprove that assumption, and there is no objective way to even compare the possible long-term gain with the likely short-term pain (especially when it is somebody else who is hurting). The history is full of examples where the "future happiness justifies short-term suffering" approach either works great or fails miserably. I doubt we can come up with a formula that predicts the outcome.

> Was the '|' character non-compliance bug reported against the product?

Unknown to me, but I suspect folks suffering from this do not report bugs to companies they have no relationship with (and may not even know what the product is!).

> Anyhow. I've thrown the URI you mentioned through my test copy of trunk and it's not hanging looking for the end of mime headers like you said. It's waiting for the end of the URI:
>
>   parseHttpRequest: Incomplete request, waiting for end of request line

Please forgive me for misinterpreting the Squid log line that lies about an "Incomplete request" and "waiting for end of request line".

If you prefer the "waiting for the end of the URI" description of the broken state (where the complete URI, the complete request line, and the complete request are all available), I am not going to argue that it is also inaccurate.

> Attached is a patch that replaces the hang behaviour with an invalid-request error page on standards-compliant builds. But, when --enable-http-violations is used, it accepts the invalid characters we know are in use and passes the request through as HTTP/0.9.

... which does not work because the request does not actually use the HTTP/0.9 format. You end up with:

  Forwarding client request ... url=http://localhost:8080/path|with|unwise|charactersHTTP/1.1
  GET /path|with|unwise|charactersHTTP/1.1 HTTP/1.1
  Via: 0.9 protofroot (squid/4.0.0-BZR)

The trailing "HTTP/1.1" part of the request line is not a part of the URI, but your changes make it so.

Tokenizer cannot handle URIs with whitespaces directly, like your patch attempts to do: Tokenizer alone cannot handle ambiguous grammars. To handle such URIs well, you have two options IMO:

A. The old "do it by hand" way:
  0. Trim request line to remove trailing whitespace and CR*LF.
  1. Find the *last* whitespace on the trimmed request line.
  2. To get the request method and URI, apply the tokenizer to the request line prefix before that last whitespace.
  3. To get the protocol parts, apply the tokenizer to the request line suffix after that last whitespace.
  4. Make the code even messier to handle HTTP/0.9 cases if needed.

B. The new optimized and simplified "optimistic Tokenizer" way. Here is a sketch:

    SBuf uri; // accumulates request URI characters
    if (!tok.prefix(uri, StrictUriChars))
        return false; // URI does not start with a valid character

    // in the order of correctness (and popularity!):
    const bool parsedSuffix =
        // canonical HTTP/1 format
        parseCanonicalReqLineTail(uri, tok) ||
        // HTTP/1 format but with bad characters in URI
        parseBadUriReqLineTail(uri, tok) ||
        // HTTP/1 format but with bad characters and spaces in URI
        parseSpacedReqLineTail(uri, tok) ||
        // HTTP/0 format (without the protocol part at all)
        parseHttpZeroReqLineTail(uri, tok);
    Assert(parsedSuffix); // parseHttpZeroReqLineTail() cannot fail
    ...

where, for example, parseCanonicalReqLineTail() expects whitespace followed by protocol info, while parseBadUriReqLineTail() expects a URI suffix (using RelaxedUriChars, which do not include whitespace) followed by a call to parseCanonicalReqLineTail(). If any of the helper functions returns false, it should not change its uri and tok parameters, of course.

HTH,

Alex.
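To illustrate the "helpers must not change uri and tok on failure" contract, here is a purely hypothetical expansion of two of the design-B helpers. WspChars, VersionChars, RelaxedUriChars, and the checkpoint-and-rollback style are illustrative assumptions, not trunk code:

    // Hypothetical expansion of the design-B helpers (illustration only).

    // canonical HTTP/1 tail: whitespace, then the protocol token, then EOL
    static bool
    parseCanonicalReqLineTail(SBuf &uri, Parser::Tokenizer &tok)
    {
        const Parser::Tokenizer saved(tok); // checkpoint for rollback
        SBuf proto;
        if (tok.skipAll(WspChars) && tok.prefix(proto, VersionChars) && tok.atEnd())
            return true; // uri already holds the full URI; proto parsed
        tok = saved; // failure must leave tok (and uri) untouched
        return false;
    }

    // tolerate bad (but non-whitespace) URI octets, then require a
    // canonical tail after them
    static bool
    parseBadUriReqLineTail(SBuf &uri, Parser::Tokenizer &tok)
    {
        const Parser::Tokenizer saved(tok);
        const SBuf savedUri(uri);
        SBuf more;
        if (tok.prefix(more, RelaxedUriChars)) {
            uri.append(more);
            if (parseCanonicalReqLineTail(uri, tok))
                return true;
        }
        tok = saved;    // roll both parameters back on failure
        uri = savedUri;
        return false;
    }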
Re: [squid-dev] [PATCH] Temporary fix to restore compatibility with Amazon
On 25/06/2015 4:05 p.m., Alex Rousskov wrote:
> On 06/24/2015 02:24 PM, Kinkie wrote:
>> My 2c: I vote for reality; possibly with a shaming announce message;
>
> I would recommend against shaming any particular company unless somebody can personally vouch that that specific company is the one to blame. I cannot (without a lot more triage work which I have no time for). The benefits of shaming in this case are tiny, but the danger of hurting an innocent person (or even company) is always there.

Indeed. Which is why I want to go the route of HTTP/0.9 handling. It's clear when products encounter it and cause themselves problems. But we don't need to name and shame any particular source.

PS: in my testing, the URIs you presented as the reason for the patch were accepted by Squid and handled as HTTP/0.9.

>> I wouldn't even recommend logging the violation: there is nothing the average admin can do about it.
>
> An admin has reminded me privately that Squid already has a mechanism that can be used here if somebody wants to enforce strict parsing [and wants to see violations]:
>
>   NAME: relaxed_header_parser
>     In the default "on" setting Squid accepts certain forms of
>     non-compliant HTTP messages where it is unambiguous what the
>     sending application intended ...
>
>     If set to warn then a warning will be emitted in cache.log
>     each time such HTTP error is encountered.
>
>     If set to off then such HTTP errors will cause the request
>     or response to be rejected.
>
> The temporary fix can be enhanced to include all unwise URI characters if (and only if) relaxed_header_parser is on, and also to warn about those characters when relaxed_header_parser is warn.

That would be better than always doing it in the strict parsing pathway. And it is a useful temporary workaround for people having this issue you found, since the relaxed parser should be on by default IIRC.

On Wed, Jun 24, 2015 at 10:12 PM, Alex Rousskov wrote:
> On 06/24/2015 05:26 AM, Amos Jeffries wrote:
>> On 24/06/2015 5:55 p.m., Alex Rousskov wrote:
>>> This temporary trunk fix adds support for request URIs containing '|' characters. Such URIs are used by popular Amazon product (and probably other) sites:
>>>
>>>   /images/I/ID1._RC|ID2.js,ID3.js,ID4.js_.js
>>>
>>> Without this fix, all requests for affected URIs time out while Squid waits for the end of request headers it has already received(*).
>>
>> This is not right. Squid should be identifying the message as non-HTTP/1.x (which it isn't, due to the URI syntax violation) and treating it as such.
>
> I agree that Amazon violates URI syntax. On the other hand, the message can be interpreted as HTTP/1.x for all practical purposes AFAICT. If you want to implement a different fix, please do so. Meanwhile, folks suffering from this serious regression can try the temporary fix I posted.

SMTP messages can be interpreted as HTTP/1.1 for all intents and purposes as well. That does not make it a good idea to do so.

I don't care who it is sending the broken messages (AIUI, it's not the company anyway, it's the particular UA software you found). They need to stop getting away with it, and the best incentive is bad behaviour leading to bad performance. Losing the benefits of HTTP/1.1 does that cleanly.

NP: We have a track record of several vendors who fixed their products after this type of bug was reported to them. Was the '|' character non-compliance bug reported against the product? They are probably able to roll out the proper fix faster than we can anyway.

Anyhow. I've thrown the URI you mentioned through my test copy of trunk and it's not hanging looking for the end of mime headers like you said. It's waiting for the end of the URI:

  RequestParser.cc(390) parse: request-line: method: GET
  RequestParser.cc(391) parse: request-line: url:
  RequestParser.cc(392) parse: request-line: proto: NONE/0.0
  RequestParser.cc(393) parse: Parser: bytes processed=4
  SBuf.cc(149) assign: assigning SBuf1845 from SBuf1850
  parseHttpRequest: Incomplete request, waiting for end of request line

Attached is a patch that replaces the hang behaviour with an invalid-request error page on standards-compliant builds. But, when --enable-http-violations is used, it accepts the invalid characters we know are in use and passes the request through as HTTP/0.9.

>>> The proper long-term fix is to allow any character in URI as long as we can reliably parse the request line (and, later, URI components). There is no point in hurting users by rejecting requests while slowly accumulating the list of benign characters used by web sites but prohibited by some RFC.
>>
>> The *proper* long term fix is to obey the standards in regard to message syntax so applications stop using these invalid (when un-encoded) characters and claiming HTTP/1.1 support.
>
> We had "standards vs reality" and "policing traffic" discussions several times in the past, with no signs of convergence towards a single approach, so I am not going to revisit that discussion now. We continue to disagree [while Squid users continue to suffer].
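The relaxed_header_parser gating that Alex proposes and Amos endorses above might look roughly like this in the parser. The Config.onoff.relaxed_header_parser accessor and the UnwiseChars set are assumptions for illustration, not verified trunk code:

    // Sketch only: tolerate RFC 2396 "unwise" octets in URIs when the
    // admin runs the relaxed parser; keep strict parsing otherwise.
    if (Config.onoff.relaxed_header_parser)
        LfDelim += UnwiseChars;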
Re: [squid-dev] [PATCH] Temporary fix to restore compatibility with Amazon
About the shaming idea: I do not like writing bad things about anyone, at any time. To be sure there is a bug, and to be sure where the bug comes from, I would first try to understand the bug; maybe then someone would be able to make this world a better place for programs and programmers. I will not rush to triage this issue for now, but the last idea of Amos seems pretty reasonable to me (reading the description, not the code).

Eliezer

On 24/06/2015 08:55, Alex Rousskov wrote:
> Hello,
>
> This temporary trunk fix adds support for request URIs containing '|' characters. Such URIs are used by popular Amazon product (and probably other) sites:
>
>   /images/I/ID1._RC|ID2.js,ID3.js,ID4.js_.js
>
> Without this fix, all requests for affected URIs time out while Squid waits for the end of request headers it has already received(*).
>
> The proper long-term fix is to allow any character in URI as long as we can reliably parse the request line (and, later, URI components). There is no point in hurting users by rejecting requests while slowly accumulating the list of benign characters used by web sites but prohibited by some RFC.
>
> HTH,
>
> Alex.
>
> P.S. (*) which is probably another parsing regression bug (not addressed by this temporary patch or the long-term fix discussed above), but I have not tested whether the very latest trunk still suffers from this other bug.
Re: [squid-dev] [PATCH] Temporary fix to restore compatibility with Amazon
On 24/06/2015 5:55 p.m., Alex Rousskov wrote:
> Hello,
>
> This temporary trunk fix adds support for request URIs containing '|' characters. Such URIs are used by popular Amazon product (and probably other) sites:
>
>   /images/I/ID1._RC|ID2.js,ID3.js,ID4.js_.js
>
> Without this fix, all requests for affected URIs time out while Squid waits for the end of request headers it has already received(*).

This is not right. Squid should be identifying the message as non-HTTP/1.x (which it isn't, due to the URI syntax violation) and treating it as such.

There is no reason for HTTP/0.9 traffic to hang (thus the bug agreement). Squid should read in the first line, find a reply for the URI, and close the connection once that reply is delivered. Note how there are no mime headers involved to be waited on. I agree there is a bug there if it's hanging.

NP: it means broken syntax does work - but gets a slow response with no HTTP feature benefits. That slow vs fast is the designed incentive for such software's authors to fix things.

> The proper long-term fix is to allow any character in URI as long as we can reliably parse the request line (and, later, URI components). There is no point in hurting users by rejecting requests while slowly accumulating the list of benign characters used by web sites but prohibited by some RFC.

The *proper* long term fix is to obey the standards in regard to message syntax so applications stop using these invalid (when un-encoded) characters and claiming HTTP/1.1 support.

Amos
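To make the HTTP/0.9-vs-HTTP/1.x distinction concrete, here is a reconstructed illustration (the Host value is invented). A genuine HTTP/0.9 request is a bare line with no version token and no headers; the server replies with the body alone and closes the connection:

    GET /images/I/ID1._RC|ID2.js,ID3.js,ID4.js_.js

What the affected clients actually send is an HTTP/1.1 request whose URI merely contains unencoded "unwise" octets:

    GET /images/I/ID1._RC|ID2.js,ID3.js,ID4.js_.js HTTP/1.1
    Host: images.example.com

Treating the second form as HTTP/0.9 yields the slow, feature-less response described above; whether to reject or tolerate it is the crux of the thread.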
Re: [squid-dev] [PATCH] Temporary fix to restore compatibility with Amazon
My 2c: I vote for reality, possibly with a shaming announce message. I wouldn't even recommend logging the violation: there is nothing the average admin can do about it. I would consider adding a shaming comment in the release notes.

On Wed, Jun 24, 2015 at 10:12 PM, Alex Rousskov rouss...@measurement-factory.com wrote:
> On 06/24/2015 05:26 AM, Amos Jeffries wrote:
>> On 24/06/2015 5:55 p.m., Alex Rousskov wrote:
>>> This temporary trunk fix adds support for request URIs containing '|' characters. Such URIs are used by popular Amazon product (and probably other) sites:
>>>
>>>   /images/I/ID1._RC|ID2.js,ID3.js,ID4.js_.js
>>>
>>> Without this fix, all requests for affected URIs time out while Squid waits for the end of request headers it has already received(*).
>>
>> This is not right. Squid should be identifying the message as non-HTTP/1.x (which it isn't, due to the URI syntax violation) and treating it as such.
>
> I agree that Amazon violates URI syntax. On the other hand, the message can be interpreted as HTTP/1.x for all practical purposes AFAICT. If you want to implement a different fix, please do so. Meanwhile, folks suffering from this serious regression can try the temporary fix I posted.
>
>>> The proper long-term fix is to allow any character in URI as long as we can reliably parse the request line (and, later, URI components). There is no point in hurting users by rejecting requests while slowly accumulating the list of benign characters used by web sites but prohibited by some RFC.
>>
>> The *proper* long term fix is to obey the standards in regard to message syntax so applications stop using these invalid (when un-encoded) characters and claiming HTTP/1.1 support.
>
> We had "standards vs reality" and "policing traffic" discussions several times in the past, with no signs of convergence towards a single approach, so I am not going to revisit that discussion now. We continue to disagree [while Squid users continue to suffer].
>
> Thank you,
>
> Alex.

--
Francesco
Re: [squid-dev] [PATCH] Temporary fix to restore compatibility with Amazon
On 06/24/2015 05:24 PM, Kinkie wrote:
> My 2c: I vote for reality, possibly with a shaming announce message. I wouldn't even recommend logging the violation: there is nothing the average admin can do about it. I would consider adding a shaming comment in the release notes.

More cents: correct. A standard can be considered a strong guideline, but if important sites violate the standard (i.e. users/admins complain), then Squid should be able to cope with it, or it risks getting abandoned because Squid cannot cope with traffic of sites that otherwise work without Squid. For an admin, it is irrelevant whether the problem is caused by Squid or by a website. And the admin who dares to tell users "only visit sites that comply with the standards" probably gets fired.

On Wed, Jun 24, 2015 at 10:12 PM, Alex Rousskov rouss...@measurement-factory.com wrote:
> On 06/24/2015 05:26 AM, Amos Jeffries wrote:
>> On 24/06/2015 5:55 p.m., Alex Rousskov wrote:
>>> This temporary trunk fix adds support for request URIs containing '|' characters. Such URIs are used by popular Amazon product (and probably other) sites:
>>>
>>>   /images/I/ID1._RC|ID2.js,ID3.js,ID4.js_.js
>>>
>>> Without this fix, all requests for affected URIs time out while Squid waits for the end of request headers it has already received(*).
>>
>> This is not right. Squid should be identifying the message as non-HTTP/1.x (which it isn't, due to the URI syntax violation) and treating it as such.
>
> I agree that Amazon violates URI syntax. On the other hand, the message can be interpreted as HTTP/1.x for all practical purposes AFAICT. If you want to implement a different fix, please do so. Meanwhile, folks suffering from this serious regression can try the temporary fix I posted.
>
>>> The proper long-term fix is to allow any character in URI as long as we can reliably parse the request line (and, later, URI components). There is no point in hurting users by rejecting requests while slowly accumulating the list of benign characters used by web sites but prohibited by some RFC.
>>
>> The *proper* long term fix is to obey the standards in regard to message syntax so applications stop using these invalid (when un-encoded) characters and claiming HTTP/1.1 support.
>
> We had "standards vs reality" and "policing traffic" discussions several times in the past, with no signs of convergence towards a single approach, so I am not going to revisit that discussion now. We continue to disagree [while Squid users continue to suffer].
>
> Thank you,
>
> Alex.
Re: [squid-dev] [PATCH] Temporary fix to restore compatibility with Amazon
On 06/24/2015 02:24 PM, Kinkie wrote:
> My 2c: I vote for reality; possibly with a shaming announce message;

I would recommend against shaming any particular company unless somebody can personally vouch that that specific company is the one to blame. I cannot (without a lot more triage work which I have no time for). The benefits of shaming in this case are tiny, but the danger of hurting an innocent person (or even company) is always there.

> I wouldn't even recommend logging the violation: there is nothing the average admin can do about it.

An admin has reminded me privately that Squid already has a mechanism that can be used here if somebody wants to enforce strict parsing [and wants to see violations]:

  NAME: relaxed_header_parser
    In the default "on" setting Squid accepts certain forms of
    non-compliant HTTP messages where it is unambiguous what the
    sending application intended ...

    If set to warn then a warning will be emitted in cache.log
    each time such HTTP error is encountered.

    If set to off then such HTTP errors will cause the request
    or response to be rejected.

The temporary fix can be enhanced to include all unwise URI characters if (and only if) relaxed_header_parser is on, and also to warn about those characters when relaxed_header_parser is warn.

HTH,

Alex.

On Wed, Jun 24, 2015 at 10:12 PM, Alex Rousskov rouss...@measurement-factory.com wrote:
> On 06/24/2015 05:26 AM, Amos Jeffries wrote:
>> On 24/06/2015 5:55 p.m., Alex Rousskov wrote:
>>> This temporary trunk fix adds support for request URIs containing '|' characters. Such URIs are used by popular Amazon product (and probably other) sites:
>>>
>>>   /images/I/ID1._RC|ID2.js,ID3.js,ID4.js_.js
>>>
>>> Without this fix, all requests for affected URIs time out while Squid waits for the end of request headers it has already received(*).
>>
>> This is not right. Squid should be identifying the message as non-HTTP/1.x (which it isn't, due to the URI syntax violation) and treating it as such.
>
> I agree that Amazon violates URI syntax. On the other hand, the message can be interpreted as HTTP/1.x for all practical purposes AFAICT. If you want to implement a different fix, please do so. Meanwhile, folks suffering from this serious regression can try the temporary fix I posted.
>
>>> The proper long-term fix is to allow any character in URI as long as we can reliably parse the request line (and, later, URI components). There is no point in hurting users by rejecting requests while slowly accumulating the list of benign characters used by web sites but prohibited by some RFC.
>>
>> The *proper* long term fix is to obey the standards in regard to message syntax so applications stop using these invalid (when un-encoded) characters and claiming HTTP/1.1 support.
>
> We had "standards vs reality" and "policing traffic" discussions several times in the past, with no signs of convergence towards a single approach, so I am not going to revisit that discussion now. We continue to disagree [while Squid users continue to suffer].
>
> Thank you,
>
> Alex.
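For admins following along, the directive is set in squid.conf; the three values below are taken directly from the documentation quoted above:

    # accept unambiguous non-compliant messages (the default)
    relaxed_header_parser on

    # accept them, but note each violation in cache.log
    relaxed_header_parser warn

    # reject non-compliant requests and responses outright
    relaxed_header_parser off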
Re: [squid-dev] [PATCH] Temporary fix to restore compatibility with Amazon
On 06/24/2015 05:26 AM, Amos Jeffries wrote:
> On 24/06/2015 5:55 p.m., Alex Rousskov wrote:
>> This temporary trunk fix adds support for request URIs containing '|' characters. Such URIs are used by popular Amazon product (and probably other) sites:
>>
>>   /images/I/ID1._RC|ID2.js,ID3.js,ID4.js_.js
>>
>> Without this fix, all requests for affected URIs time out while Squid waits for the end of request headers it has already received(*).
>
> This is not right. Squid should be identifying the message as non-HTTP/1.x (which it isn't, due to the URI syntax violation) and treating it as such.

I agree that Amazon violates URI syntax. On the other hand, the message can be interpreted as HTTP/1.x for all practical purposes AFAICT. If you want to implement a different fix, please do so. Meanwhile, folks suffering from this serious regression can try the temporary fix I posted.

>> The proper long-term fix is to allow any character in URI as long as we can reliably parse the request line (and, later, URI components). There is no point in hurting users by rejecting requests while slowly accumulating the list of benign characters used by web sites but prohibited by some RFC.
>
> The *proper* long term fix is to obey the standards in regard to message syntax so applications stop using these invalid (when un-encoded) characters and claiming HTTP/1.1 support.

We had "standards vs reality" and "policing traffic" discussions several times in the past, with no signs of convergence towards a single approach, so I am not going to revisit that discussion now. We continue to disagree [while Squid users continue to suffer].

Thank you,

Alex.
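As a closing illustration of "allow any character in URI as long as we can reliably parse the request line", one hypothetical reading in Squid-style C++; the set name and the use of a (label, low, high) range constructor are assumptions, not trunk code:

    // Sketch only: accept every visible octet plus obs-text (0x80-0xFF)
    // in the URI field; SP, CR, and LF stay outside the set so they can
    // still delimit the request-line fields. Whitespace *inside* URIs
    // then falls to the last-whitespace / suffix logic discussed
    // earlier in the thread.
    static const CharacterSet PermissiveUriChars("permissive-URI", 0x21, 0xff);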