Re: Occasional Zero Sized Reply error with 2.6

2006-06-29 Thread Gonzalo Arana

I've noticed this on squid3 as well. While trying to hunt this, I've
found this in
HttpStateData::readReply:
...
} else if (flag == COMM_OK  len == 0  !flags.headers_parsed) {
   fwd-fail(errorCon(ERR_ZERO_SIZE_OBJECT, HTTP_BAD_GATEWAY));
   eof = 1;
   flags.do_next_read = 0;
   comm_close(fd);
} else if (flag == COMM_OK  len == 0) {
   /* Connection closed; retrieval done. */
   eof = 1;

   if (!flags.headers_parsed)
   /*
   * When we called processReplyHeader() before, we
   * didn't find the end of headers, but now we are
   * definately at EOF, so we want to process the reply
   * headers.
*/
   processReplyHeader();
   else

Some thoughts:

0) The 'if (!flags.headers_parsed)' is dead code.  Any ideas what
was the original idea of this?

1) A 'zero size object' is a good reply in this case?  if server
closed the connection before we even got reply headers, this could
mean that server actually closed connection before it got our last
request.

2) In fact, sniffing with ethereal I've 'verified' this.  Not a
foolproff check, but I have about 220ms of RTT to osnews.com, and you
can see that between packets 88  89 of attached pcap file there are
about 50ms.  This means that: server closed connection, my squid3
sends the request, my squid3 receives the FIN packet for the very same
TCP connection.

3)  How to reproduce:  As Guido pointed out, this can be reproduced
(with some effort) while navigating through osnews.com.  Server-side
connection keepalive timeout timing makes it difficult to reproduce.

4) How to react if server closes connection before we even get reply
headers?  I propose to retry the same request (if method is GET, no
authentication data present in request, and possible many other
conditions).

I believe it would be great if we could check last TCP ACK value from
server-side when read(2) returns 0.  This way, we could know if server
actually got the request and only retry if it didn't.  Is there any
way to read that? tcp(7) manpage does not show anything useful :( (at
least on Linux).

Regards,

--
Gonzalo A. Arana


zero_size_object.pcap
Description: Binary data


Re: Occasional Zero Sized Reply error with 2.6

2006-06-29 Thread Henrik Nordstrom
tor 2006-06-29 klockan 10:40 -0300 skrev Gonzalo Arana:

 ...
 } else if (flag == COMM_OK  len == 0  !flags.headers_parsed) {
 fwd-fail(errorCon(ERR_ZERO_SIZE_OBJECT, HTTP_BAD_GATEWAY));
[..]
 } else if (flag == COMM_OK  len == 0) {
[..]
 if (!flags.headers_parsed)
 /*
 * When we called processReplyHeader() before, we
 * didn't find the end of headers, but now we are
 * definately at EOF, so we want to process the reply
 * headers.
  */
 processReplyHeader();


 0) The 'if (!flags.headers_parsed)' is dead code.  Any ideas what
 was the original idea of this?

Yes.. to handle broken servers not sending complete headers.

 1) A 'zero size object' is a good reply in this case?

Only if no data at all has been received I think. Maybe not if partial
headers have been received. But it's a border case.

 if server
 closed the connection before we even got reply headers, this could
 mean that server actually closed connection before it got our last
 request.

Yes, but this is handled in the FwdState. Or at least should be...
retrying the request on certain classes of errors.  This is also why
local errors is not pushed to the StoreEntry by the protocol handlers
but merely registered with the fwdstate..

 2) In fact, sniffing with ethereal I've 'verified' this.  Not a
 foolproff check, but I have about 220ms of RTT to osnews.com, and you
 can see that between packets 88  89 of attached pcap file there are
 about 50ms.  This means that: server closed connection, my squid3
 sends the request, my squid3 receives the FIN packet for the very same
 TCP connection.

Ok. So the retry logic has gone defunct, pretty much like the abort
logic had...

 4) How to react if server closes connection before we even get reply
 headers?

If it's the first request on a new connection a ERR_ZERO_SIZED_REPLY.
Else FwdState should retry the request on a new connection (preferably a
brand new connection to not run into the same issue again).

All the logics for doing this is there already. The question is why it
doesn't get triggered. See FwdState::serverClosed.

 I believe it would be great if we could check last TCP ACK value from
 server-side when read(2) returns 0.  This way, we could know if server
 actually got the request and only retry if it didn't.  Is there any
 way to read that? tcp(7) manpage does not show anything useful :( (at
 least on Linux).

Unfortunately not that I know of.

Regards
Henrik


signature.asc
Description: Detta är en digitalt signerad	meddelandedel


Re: Occasional Zero Sized Reply error with 2.6

2006-06-28 Thread Guido Serassio

Hi Henrik,

At 13.49 27/06/2006, Henrik Nordstrom wrote:


tis 2006-06-27 klockan 13:30 +0200 skrev Henrik Nordstrom:

 One question: Are you using NTLM/Negotiate proxy authentication?

If you are then the problem should be fixed now.


No more errors, so it seems that the problem is fixed.

Regards

Guido



-

Guido Serassio
Acme Consulting S.r.l. - Microsoft Certified Partner
Via Lucia Savarino, 1   10098 - Rivoli (TO) - ITALY
Tel. : +39.011.9530135  Fax. : +39.011.9781115
Email: [EMAIL PROTECTED]
WWW: http://www.acmeconsulting.it/



Re: Occasional Zero Sized Reply error with 2.6

2006-06-28 Thread Henrik Nordstrom
ons 2006-06-28 klockan 08:32 +0200 skrev Guido Serassio:

 No more errors, so it seems that the problem is fixed.

But spun off into more issues... (Bug #1638)..

this connection pinning business has been a bit of a nightmare to get
correct in all configurations.. hopefully no additional gremlins will
pop up in there.

should probably write a document on it while all detail requirements is
fresh in memory.. Could be quite useful when implementing pinning
support in Squid-3...

Regards
Henrik


signature.asc
Description: Detta är en digitalt signerad	meddelandedel


Re: Occasional Zero Sized Reply error with 2.6

2006-06-27 Thread Henrik Nordstrom
tis 2006-06-27 klockan 10:25 +0200 skrev Guido Serassio:

 Could you set a breakpoint on the ERR_ZERO_SIZE_OBJECT fwdFail call in
 http.c httpReadReply, and print read_sz from there.. (non-optimized
 build please..)
 
 Obviously during a failed request ?

Yes.. this code is only reached on such requests...

 I can run Squid under gdb on the Linux machine, 
 tell me what other info you need.

backtrace

print read_sz

print len

maybe more..

Regards
Henrik


signature.asc
Description: Detta är en digitalt signerad	meddelandedel


Re: Occasional Zero Sized Reply error with 2.6

2006-06-27 Thread Guido Serassio

Hi,

At 10.40 27/06/2006, Henrik Nordstrom wrote:


tis 2006-06-27 klockan 10:25 +0200 skrev Guido Serassio:

 Could you set a breakpoint on the ERR_ZERO_SIZE_OBJECT fwdFail call in
 http.c httpReadReply, and print read_sz from there.. (non-optimized
 build please..)

 Obviously during a failed request ?

Yes.. this code is only reached on such requests...

 I can run Squid under gdb on the Linux machine,
 tell me what other info you need.

backtrace

print read_sz

print len

maybe more..


OK, I will run it tonight at home.

Regards

Guido



-

Guido Serassio
Acme Consulting S.r.l. - Microsoft Certified Partner
Via Lucia Savarino, 1   10098 - Rivoli (TO) - ITALY
Tel. : +39.011.9530135  Fax. : +39.011.9781115
Email: [EMAIL PROTECTED]
WWW: http://www.acmeconsulting.it/



Re: Occasional Zero Sized Reply error with 2.6

2006-06-27 Thread Henrik Nordstrom
tis 2006-06-27 klockan 12:39 +0200 skrev Guido Serassio:
 Hi,
 
 At 10.40 27/06/2006, Henrik Nordstrom wrote:
 
 tis 2006-06-27 klockan 10:25 +0200 skrev Guido Serassio:
 
   Could you set a breakpoint on the ERR_ZERO_SIZE_OBJECT fwdFail call in
   http.c httpReadReply, and print read_sz from there.. (non-optimized
   build please..)
  
   Obviously during a failed request ?
 
 Yes.. this code is only reached on such requests...
 
   I can run Squid under gdb on the Linux machine,
   tell me what other info you need.
 
 backtrace
 
 print read_sz
 
 print len
 
 maybe more..
 
 OK, I will run it tonight at home.

Excellent..

One question: Are you using NTLM/Negotiate proxy authentication?

Regards
Henrik


signature.asc
Description: Detta är en digitalt signerad	meddelandedel


Re: Occasional Zero Sized Reply error with 2.6

2006-06-27 Thread Henrik Nordstrom
tis 2006-06-27 klockan 13:30 +0200 skrev Henrik Nordstrom:

 One question: Are you using NTLM/Negotiate proxy authentication?

If you are then the problem should be fixed now.

Regards
Henrik


signature.asc
Description: Detta är en digitalt signerad	meddelandedel


Re: Occasional Zero Sized Reply error with 2.6

2006-06-27 Thread Henrik Nordstrom
lör 2006-06-24 klockan 08:48 +0800 skrev Adrian Chadd:

 I'm seeing it in my polygraph testing under 2.6 + COSS. It may
 be related.

Most likely not... there was a nasty bug related to persistent
connections and NTLM authentication..

Regards
Henrik


signature.asc
Description: Detta är en digitalt signerad	meddelandedel


Re: Occasional Zero Sized Reply error with 2.6

2006-06-26 Thread Henrik Nordstrom
sön 2006-06-25 klockan 09:56 +0200 skrev Guido Serassio:
 Hi Henrik,
 
 At 23.39 24/06/2006, Henrik Nordstrom wrote:
 
 Could you try collecting a tcpdump -s 1600 of that traffic?
 
 I have done an ethereal capture saved in tcpdump format.

Thanks, but didn't make me very much wiser. But maybe..

Is this with or without delay pools?

Could you set a breakpoint on the ERR_ZERO_SIZE_OBJECT fwdFail call in
http.c httpReadReply, and print read_sz from there.. (non-optimized
build please..)

Regards
Henrik


signature.asc
Description: Detta är en digitalt signerad	meddelandedel


Re: Occasional Zero Sized Reply error with 2.6

2006-06-25 Thread Guido Serassio

Hi Henrik,

At 23.39 24/06/2006, Henrik Nordstrom wrote:


Could you try collecting a tcpdump -s 1600 of that traffic?


I have done an ethereal capture saved in tcpdump format.

Regards

Guido



-

Guido Serassio
Acme Consulting S.r.l. - Microsoft Certified Partner
Via Lucia Savarino, 1   10098 - Rivoli (TO) - ITALY
Tel. : +39.011.9530135  Fax. : +39.011.9781115
Email: [EMAIL PROTECTED]
WWW: http://www.acmeconsulting.it/


chunk.pcap.gz
Description: Binary data


Re: Occasional Zero Sized Reply error with 2.6

2006-06-24 Thread Guido Serassio

Hi Henrik,

At 01.10 24/06/2006, Henrik Nordstrom wrote:


fre 2006-06-23 klockan 19:14 +0200 skrev Guido Serassio:

 This is the access.log entry for the failing request

 1151081760.062  0 172.30.128.1 TCP_MISS/502 1489 GET
 http://www1.it.squid-cache.org/Versions/v2/2.6/
 acmeconsulting\guido.serassio DIRECT/81.174.20.197 text/html

Is it reproducible?


It happens very occasionally without any apparent logical rule ...

I have looked to the web server log (it's my squid-cache.org mirror 
), and this is what I have found in apache 2 access.log:


xx.xx.xx.xx - - [23/Jun/2006:18:55:27 +0200] GET / HTTP/1.0 200 
6295 - Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.0.4) 
Gecko/20060508 Firefox/1.5.0.4
xx.xx.xx.xx - - [23/Jun/2006:18:55:28 +0200] GET 
/Icons/squid-name4.gif HTTP/1.0 200 1358 
http://www1.it.squid-cache.org/; Mozilla/5.0 (Windows; U; Windows 
NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
xx.xx.xx.xx - - [23/Jun/2006:18:55:28 +0200] GET 
/Icons/cache_now.gif HTTP/1.0 200 493 
http://www1.it.squid-cache.org/; Mozilla/5.0 (Windows; U; Windows 
NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
xx.xx.xx.xx - - [23/Jun/2006:18:55:28 +0200] GET /Icons/squidnow.gif 
HTTP/1.0 200 1761 http://www1.it.squid-cache.org/; Mozilla/5.0 
(Windows; U; Windows NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
xx.xx.xx.xx - - [23/Jun/2006:18:55:28 +0200] GET 
/graphics/squid-shade.png HTTP/1.0 200 33613 
http://www1.it.squid-cache.org/; Mozilla/5.0 (Windows; U; Windows 
NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
xx.xx.xx.xx - - [23/Jun/2006:18:55:28 +0200] GET /favicon.ico 
HTTP/1.0 200 894 - Mozilla/5.0 (Windows; U; Windows NT 5.0; 
en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
xx.xx.xx.xx - - [23/Jun/2006:19:01:31 +0200] GET /Versions/v2/2.6/ 
HTTP/1.0 200 4926 http://www1.it.squid-cache.org/; Mozilla/5.0 
(Windows; U; Windows NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4
xx.xx.xx.xx - - [23/Jun/2006:19:26:47 +0200] GET /Versions/v3/3.0/ 
HTTP/1.0 200 3408 http://www1.it.squid-cache.org/; Mozilla/5.0 
(Windows; U; Windows NT 5.0; en-US; rv:1.8.0.4) Gecko/20060508 Firefox/1.5.0.4


The failing request was not logged by Apache, and there is nothing in 
Apache's error.log.



 In cache.log was logged this:

 2006/06/23 18:55:30| storeAufsOpenDone: (13) Permission denied
 2006/06/23 18:55:30|g:/cache/squid/00/00/0001
 2006/06/23 18:55:30| storeSwapOutFileClosed: dirno 0, swapfile
 0001, errflag=-1
  (13) Permission denied

Hmm.. swapout failures have never been tested much. Would not be
surprised if this cascades into more issues..

Why are you getting permission denied on create in your cache dir?


Yesterday night I got again 4 - 5 times the Zero Sized Reply error, 
but without any new cache.log error, so likely this could be another problem.


I will try to switch to ufs, I have still some doubt about the 
correct Windows threads implementation of awin32 ... :-(


Regards

Guido



-

Guido Serassio
Acme Consulting S.r.l. - Microsoft Certified Partner
Via Lucia Savarino, 1   10098 - Rivoli (TO) - ITALY
Tel. : +39.011.9530135  Fax. : +39.011.9781115
Email: [EMAIL PROTECTED]
WWW: http://www.acmeconsulting.it/



Re: Occasional Zero Sized Reply error with 2.6

2006-06-24 Thread Henrik Nordstrom
lör 2006-06-24 klockan 09:46 +0200 skrev Guido Serassio:
 Hi Henrik,
 
 At 01.10 24/06/2006, Henrik Nordstrom wrote:
 
 fre 2006-06-23 klockan 19:14 +0200 skrev Guido Serassio:
 
   This is the access.log entry for the failing request
  
   1151081760.062  0 172.30.128.1 TCP_MISS/502 1489 GET
   http://www1.it.squid-cache.org/Versions/v2/2.6/
   acmeconsulting\guido.serassio DIRECT/81.174.20.197 text/html
 
 Is it reproducible?
 
 Just found the way !!!:
 
 Is seems that my Squid 2.6 doesn't like www.osnews.com  
 
 I was able to log something ... :-)
 
 Look to 502 errors in the attached access.log.

Could you try collecting a tcpdump -s 1600 of that traffic?

Regards
Henrik


signature.asc
Description: Detta är en digitalt signerad	meddelandedel


Re: Occasional Zero Sized Reply error with 2.6

2006-06-23 Thread Henrik Nordstrom
fre 2006-06-23 klockan 19:14 +0200 skrev Guido Serassio:

 This is the access.log entry for the failing request
 
 1151081760.062  0 172.30.128.1 TCP_MISS/502 1489 GET 
 http://www1.it.squid-cache.org/Versions/v2/2.6/ 
 acmeconsulting\guido.serassio DIRECT/81.174.20.197 text/html

Is it reproducible?

 In cache.log was logged this:
 
 2006/06/23 18:55:30| storeAufsOpenDone: (13) Permission denied
 2006/06/23 18:55:30|g:/cache/squid/00/00/0001
 2006/06/23 18:55:30| storeSwapOutFileClosed: dirno 0, swapfile 
 0001, errflag=-1
  (13) Permission denied

Hmm.. swapout failures have never been tested much. Would not be
surprised if this cascades into more issues..

Why are you getting permission denied on create in your cache dir?

Regards
Henrik


signature.asc
Description: Detta är en digitalt signerad	meddelandedel


Re: Occasional Zero Sized Reply error with 2.6

2006-06-23 Thread Adrian Chadd
On Fri, Jun 23, 2006, Guido Serassio wrote:
 Hi Henrik,
 
 I'm observing some occasional Zero Sized Reply error using 2.6 on 
 native Windows.
 Reloading the page, it works fine.

I'm seeing it in my polygraph testing under 2.6 + COSS. It may
be related.




Adrian