Re: Occasional Zero Sized Reply error with 2.6
I've noticed this on squid3 as well. While trying to hunt this, I've found this in HttpStateData::readReply: ... } else if (flag == COMM_OK len == 0 !flags.headers_parsed) { fwd-fail(errorCon(ERR_ZERO_SIZE_OBJECT, HTTP_BAD_GATEWAY)); eof = 1; flags.do_next_read = 0; comm_close(fd); } else if (flag == COMM_OK len == 0) { /* Connection closed; retrieval done. */ eof = 1; if (!flags.headers_parsed) /* * When we called processReplyHeader() before, we * didn't find the end of headers, but now we are * definately at EOF, so we want to process the reply * headers. */ processReplyHeader(); else Some thoughts: 0) The 'if (!flags.headers_parsed)' is dead code. Any ideas what was the original idea of this? 1) A 'zero size object' is a good reply in this case? if server closed the connection before we even got reply headers, this could mean that server actually closed connection before it got our last request. 2) In fact, sniffing with ethereal I've 'verified' this. Not a foolproff check, but I have about 220ms of RTT to osnews.com, and you can see that between packets 88 89 of attached pcap file there are about 50ms. This means that: server closed connection, my squid3 sends the request, my squid3 receives the FIN packet for the very same TCP connection. 3) How to reproduce: As Guido pointed out, this can be reproduced (with some effort) while navigating through osnews.com. Server-side connection keepalive timeout timing makes it difficult to reproduce. 4) How to react if server closes connection before we even get reply headers? I propose to retry the same request (if method is GET, no authentication data present in request, and possible many other conditions). I believe it would be great if we could check last TCP ACK value from server-side when read(2) returns 0. This way, we could know if server actually got the request and only retry if it didn't. Is there any way to read that? tcp(7) manpage does not show anything useful :( (at least on Linux). Regards, -- Gonzalo A. Arana zero_size_object.pcap Description: Binary data
Re: Occasional Zero Sized Reply error with 2.6
tor 2006-06-29 klockan 10:40 -0300 skrev Gonzalo Arana: ... } else if (flag == COMM_OK len == 0 !flags.headers_parsed) { fwd-fail(errorCon(ERR_ZERO_SIZE_OBJECT, HTTP_BAD_GATEWAY)); [..] } else if (flag == COMM_OK len == 0) { [..] if (!flags.headers_parsed) /* * When we called processReplyHeader() before, we * didn't find the end of headers, but now we are * definately at EOF, so we want to process the reply * headers. */ processReplyHeader(); 0) The 'if (!flags.headers_parsed)' is dead code. Any ideas what was the original idea of this? Yes.. to handle broken servers not sending complete headers. 1) A 'zero size object' is a good reply in this case? Only if no data at all has been received I think. Maybe not if partial headers have been received. But it's a border case. if server closed the connection before we even got reply headers, this could mean that server actually closed connection before it got our last request. Yes, but this is handled in the FwdState. Or at least should be... retrying the request on certain classes of errors. This is also why local errors is not pushed to the StoreEntry by the protocol handlers but merely registered with the fwdstate.. 2) In fact, sniffing with ethereal I've 'verified' this. Not a foolproff check, but I have about 220ms of RTT to osnews.com, and you can see that between packets 88 89 of attached pcap file there are about 50ms. This means that: server closed connection, my squid3 sends the request, my squid3 receives the FIN packet for the very same TCP connection. Ok. So the retry logic has gone defunct, pretty much like the abort logic had... 4) How to react if server closes connection before we even get reply headers? If it's the first request on a new connection a ERR_ZERO_SIZED_REPLY. Else FwdState should retry the request on a new connection (preferably a brand new connection to not run into the same issue again). All the logics for doing this is there already. The question is why it doesn't get triggered. See FwdState::serverClosed. I believe it would be great if we could check last TCP ACK value from server-side when read(2) returns 0. This way, we could know if server actually got the request and only retry if it didn't. Is there any way to read that? tcp(7) manpage does not show anything useful :( (at least on Linux). Unfortunately not that I know of. Regards Henrik signature.asc Description: Detta är en digitalt signerad meddelandedel
Re: Occasional Zero Sized Reply error with 2.6
Hi Henrik, At 13.49 27/06/2006, Henrik Nordstrom wrote: tis 2006-06-27 klockan 13:30 +0200 skrev Henrik Nordstrom: One question: Are you using NTLM/Negotiate proxy authentication? If you are then the problem should be fixed now. No more errors, so it seems that the problem is fixed. Regards Guido - Guido Serassio Acme Consulting S.r.l. - Microsoft Certified Partner Via Lucia Savarino, 1 10098 - Rivoli (TO) - ITALY Tel. : +39.011.9530135 Fax. : +39.011.9781115 Email: [EMAIL PROTECTED] WWW: http://www.acmeconsulting.it/
Re: Occasional Zero Sized Reply error with 2.6
ons 2006-06-28 klockan 08:32 +0200 skrev Guido Serassio: No more errors, so it seems that the problem is fixed. But spun off into more issues... (Bug #1638).. this connection pinning business has been a bit of a nightmare to get correct in all configurations.. hopefully no additional gremlins will pop up in there. should probably write a document on it while all detail requirements is fresh in memory.. Could be quite useful when implementing pinning support in Squid-3... Regards Henrik signature.asc Description: Detta är en digitalt signerad meddelandedel
Re: Occasional Zero Sized Reply error with 2.6
tis 2006-06-27 klockan 10:25 +0200 skrev Guido Serassio: Could you set a breakpoint on the ERR_ZERO_SIZE_OBJECT fwdFail call in http.c httpReadReply, and print read_sz from there.. (non-optimized build please..) Obviously during a failed request ? Yes.. this code is only reached on such requests... I can run Squid under gdb on the Linux machine, tell me what other info you need. backtrace print read_sz print len maybe more.. Regards Henrik signature.asc Description: Detta är en digitalt signerad meddelandedel
Re: Occasional Zero Sized Reply error with 2.6
Hi, At 10.40 27/06/2006, Henrik Nordstrom wrote: tis 2006-06-27 klockan 10:25 +0200 skrev Guido Serassio: Could you set a breakpoint on the ERR_ZERO_SIZE_OBJECT fwdFail call in http.c httpReadReply, and print read_sz from there.. (non-optimized build please..) Obviously during a failed request ? Yes.. this code is only reached on such requests... I can run Squid under gdb on the Linux machine, tell me what other info you need. backtrace print read_sz print len maybe more.. OK, I will run it tonight at home. Regards Guido - Guido Serassio Acme Consulting S.r.l. - Microsoft Certified Partner Via Lucia Savarino, 1 10098 - Rivoli (TO) - ITALY Tel. : +39.011.9530135 Fax. : +39.011.9781115 Email: [EMAIL PROTECTED] WWW: http://www.acmeconsulting.it/
Re: Occasional Zero Sized Reply error with 2.6
tis 2006-06-27 klockan 12:39 +0200 skrev Guido Serassio: Hi, At 10.40 27/06/2006, Henrik Nordstrom wrote: tis 2006-06-27 klockan 10:25 +0200 skrev Guido Serassio: Could you set a breakpoint on the ERR_ZERO_SIZE_OBJECT fwdFail call in http.c httpReadReply, and print read_sz from there.. (non-optimized build please..) Obviously during a failed request ? Yes.. this code is only reached on such requests... I can run Squid under gdb on the Linux machine, tell me what other info you need. backtrace print read_sz print len maybe more.. OK, I will run it tonight at home. Excellent.. One question: Are you using NTLM/Negotiate proxy authentication? Regards Henrik signature.asc Description: Detta är en digitalt signerad meddelandedel
Re: Occasional Zero Sized Reply error with 2.6
tis 2006-06-27 klockan 13:30 +0200 skrev Henrik Nordstrom: One question: Are you using NTLM/Negotiate proxy authentication? If you are then the problem should be fixed now. Regards Henrik signature.asc Description: Detta är en digitalt signerad meddelandedel
Re: Occasional Zero Sized Reply error with 2.6
lör 2006-06-24 klockan 08:48 +0800 skrev Adrian Chadd: I'm seeing it in my polygraph testing under 2.6 + COSS. It may be related. Most likely not... there was a nasty bug related to persistent connections and NTLM authentication.. Regards Henrik signature.asc Description: Detta är en digitalt signerad meddelandedel
Re: Occasional Zero Sized Reply error with 2.6
sön 2006-06-25 klockan 09:56 +0200 skrev Guido Serassio: Hi Henrik, At 23.39 24/06/2006, Henrik Nordstrom wrote: Could you try collecting a tcpdump -s 1600 of that traffic? I have done an ethereal capture saved in tcpdump format. Thanks, but didn't make me very much wiser. But maybe.. Is this with or without delay pools? Could you set a breakpoint on the ERR_ZERO_SIZE_OBJECT fwdFail call in http.c httpReadReply, and print read_sz from there.. (non-optimized build please..) Regards Henrik signature.asc Description: Detta är en digitalt signerad meddelandedel
Re: Occasional Zero Sized Reply error with 2.6
Hi Henrik, At 23.39 24/06/2006, Henrik Nordstrom wrote: Could you try collecting a tcpdump -s 1600 of that traffic? I have done an ethereal capture saved in tcpdump format. Regards Guido - Guido Serassio Acme Consulting S.r.l. - Microsoft Certified Partner Via Lucia Savarino, 1 10098 - Rivoli (TO) - ITALY Tel. : +39.011.9530135 Fax. : +39.011.9781115 Email: [EMAIL PROTECTED] WWW: http://www.acmeconsulting.it/ chunk.pcap.gz Description: Binary data
Re: Occasional Zero Sized Reply error with 2.6
Hi Henrik, At 01.10 24/06/2006, Henrik Nordstrom wrote: fre 2006-06-23 klockan 19:14 +0200 skrev Guido Serassio: This is the access.log entry for the failing request 1151081760.062 0 172.30.128.1 TCP_MISS/502 1489 GET http://www1.it.squid-cache.org/Versions/v2/2.6/ acmeconsulting\guido.serassio DIRECT/81.174.20.197 text/html Is it reproducible? It happens very occasionally without any apparent logical rule ... I have looked to the web server log (it's my squid-cache.org mirror ), and this is what I have found in apache 2 access.log: xx.xx.xx.xx - - [23/Jun/2006:18:55:27 +0200] GET / HTTP/1.0 200 6295 - Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4 xx.xx.xx.xx - - [23/Jun/2006:18:55:28 +0200] GET /Icons/squid-name4.gif HTTP/1.0 200 1358 http://www1.it.squid-cache.org/; Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4 xx.xx.xx.xx - - [23/Jun/2006:18:55:28 +0200] GET /Icons/cache_now.gif HTTP/1.0 200 493 http://www1.it.squid-cache.org/; Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4 xx.xx.xx.xx - - [23/Jun/2006:18:55:28 +0200] GET /Icons/squidnow.gif HTTP/1.0 200 1761 http://www1.it.squid-cache.org/; Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4 xx.xx.xx.xx - - [23/Jun/2006:18:55:28 +0200] GET /graphics/squid-shade.png HTTP/1.0 200 33613 http://www1.it.squid-cache.org/; Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4 xx.xx.xx.xx - - [23/Jun/2006:18:55:28 +0200] GET /favicon.ico HTTP/1.0 200 894 - Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4 xx.xx.xx.xx - - [23/Jun/2006:19:01:31 +0200] GET /Versions/v2/2.6/ HTTP/1.0 200 4926 http://www1.it.squid-cache.org/; Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4 xx.xx.xx.xx - - [23/Jun/2006:19:26:47 +0200] GET /Versions/v3/3.0/ HTTP/1.0 200 3408 http://www1.it.squid-cache.org/; Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4 The failing request was not logged by Apache, and there is nothing in Apache's error.log. In cache.log was logged this: 2006/06/23 18:55:30| storeAufsOpenDone: (13) Permission denied 2006/06/23 18:55:30|g:/cache/squid/00/00/0001 2006/06/23 18:55:30| storeSwapOutFileClosed: dirno 0, swapfile 0001, errflag=-1 (13) Permission denied Hmm.. swapout failures have never been tested much. Would not be surprised if this cascades into more issues.. Why are you getting permission denied on create in your cache dir? Yesterday night I got again 4 - 5 times the Zero Sized Reply error, but without any new cache.log error, so likely this could be another problem. I will try to switch to ufs, I have still some doubt about the correct Windows threads implementation of awin32 ... :-( Regards Guido - Guido Serassio Acme Consulting S.r.l. - Microsoft Certified Partner Via Lucia Savarino, 1 10098 - Rivoli (TO) - ITALY Tel. : +39.011.9530135 Fax. : +39.011.9781115 Email: [EMAIL PROTECTED] WWW: http://www.acmeconsulting.it/
Re: Occasional Zero Sized Reply error with 2.6
lör 2006-06-24 klockan 09:46 +0200 skrev Guido Serassio: Hi Henrik, At 01.10 24/06/2006, Henrik Nordstrom wrote: fre 2006-06-23 klockan 19:14 +0200 skrev Guido Serassio: This is the access.log entry for the failing request 1151081760.062 0 172.30.128.1 TCP_MISS/502 1489 GET http://www1.it.squid-cache.org/Versions/v2/2.6/ acmeconsulting\guido.serassio DIRECT/81.174.20.197 text/html Is it reproducible? Just found the way !!!: Is seems that my Squid 2.6 doesn't like www.osnews.com I was able to log something ... :-) Look to 502 errors in the attached access.log. Could you try collecting a tcpdump -s 1600 of that traffic? Regards Henrik signature.asc Description: Detta är en digitalt signerad meddelandedel
Re: Occasional Zero Sized Reply error with 2.6
fre 2006-06-23 klockan 19:14 +0200 skrev Guido Serassio: This is the access.log entry for the failing request 1151081760.062 0 172.30.128.1 TCP_MISS/502 1489 GET http://www1.it.squid-cache.org/Versions/v2/2.6/ acmeconsulting\guido.serassio DIRECT/81.174.20.197 text/html Is it reproducible? In cache.log was logged this: 2006/06/23 18:55:30| storeAufsOpenDone: (13) Permission denied 2006/06/23 18:55:30|g:/cache/squid/00/00/0001 2006/06/23 18:55:30| storeSwapOutFileClosed: dirno 0, swapfile 0001, errflag=-1 (13) Permission denied Hmm.. swapout failures have never been tested much. Would not be surprised if this cascades into more issues.. Why are you getting permission denied on create in your cache dir? Regards Henrik signature.asc Description: Detta är en digitalt signerad meddelandedel
Re: Occasional Zero Sized Reply error with 2.6
On Fri, Jun 23, 2006, Guido Serassio wrote: Hi Henrik, I'm observing some occasional Zero Sized Reply error using 2.6 on native Windows. Reloading the page, it works fine. I'm seeing it in my polygraph testing under 2.6 + COSS. It may be related. Adrian