Re: keep alive connections
Hello Hrvoje,

On Friday, November 7, 2003 at 11:50:53 PM +0100, Hrvoje Niksic wrote:

> Wget uses the `Keep-Alive' request header to request persistent
> connections, and understands both the HTTP/1.0 `Keep-Alive' and the
> HTTP/1.1 `Connection: keep-alive' response header.

This doesn't seem to work together with --timestamping: Each HEAD and each possible GET uses a new connection. The server keeps responding:

| HTTP/1.0 200 OK
| [...]
| Connection: Keep-Alive
| Keep-Alive: timeout=15, max=5

But Wget 1.9 does each time:

| Created socket 3.
| [snip request/response]
| Registered fd 3 for persistent reuse.
| Closing fd 3
| Invalidating fd 3 from further reuse.
| Remote file is newer, retrieving.
| Created socket 3.
| [and so on]

Tcpdump confirms the TCP session is FIN closed by Wget.

Without --timestamping Wget keeps

| Reusing fd 3.

and closing it only once every 6 files (first + 5 more). At this moment the FIN would in any case be initiated by the server if not by Wget.

Test made on an old Apache 1.1.3, but it seems the same with other servers.

BTW, it's nice to see you back and active, Hrvoje! :-)

Bye! Alain.
--
Mutt 1.5.5.1 is released.
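The `Keep-Alive: timeout=15, max=5` header quoted in the message carries the server's limits for the persistent connection. As a minimal sketch (not Wget's code; the function name `parse_keep_alive` is made up for illustration), the value can be split into its numeric parameters like this:

```python
def parse_keep_alive(value):
    """Parse a Keep-Alive header value such as 'timeout=15, max=5'.

    Returns a dict mapping lower-cased parameter names to integers.
    Parameters without a numeric value are skipped.
    """
    params = {}
    for part in value.split(","):
        name, sep, raw = part.strip().partition("=")
        raw = raw.strip()
        if sep and raw.isdigit():
            params[name.strip().lower()] = int(raw)
    return params


params = parse_keep_alive("timeout=15, max=5")
print(params)  # {'timeout': 15, 'max': 5}
```

Assuming `max` counts the further requests the server will accept on the connection, `max=5` on the first response is consistent with the "first + 5 more" pattern reported above.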
Re: keep alive connections
Alain Bench [EMAIL PROTECTED] writes:

> Hello Hrvoje,
>
> On Friday, November 7, 2003 at 11:50:53 PM +0100, Hrvoje Niksic wrote:
>
>> Wget uses the `Keep-Alive' request header to request persistent
>> connections, and understands both the HTTP/1.0 `Keep-Alive' and the
>> HTTP/1.1 `Connection: keep-alive' response header.
>
> This doesn't seem to work together with --timestamping: Each HEAD and
> each possible GET uses a new connection.

I think the difference is that Wget closes the connection when it decides not to read the response body. For example, it closes on redirections because it (intentionally) ignores the body.

With the HEAD method you never know when you'll stumble upon a CGI that doesn't understand it and that will send the body anyway. But maybe it would actually be a better idea to read (and discard) the body than to close the connection and reopen it.

> Without --timestamping Wget keeps Reusing fd 3. and closing it only
> once every 6 files (first + 5 more).

This might be due to redirections. Look out for the exact circumstances when Wget closes (or doesn't reuse) connections and you'll probably notice it.
Re: keep alive connections
On Tue, 11 Nov 2003, Hrvoje Niksic wrote:

> I think the difference is that Wget closes the connection when it
> decides not to read the response body. For example, it closes on
> redirections because it (intentionally) ignores the body.

Another approach could be to read and just ignore the body of redirect pages. You'd save a close/connect but pay the transfer time.

> With the HEAD method you never know when you'll stumble upon a CGI
> that doesn't understand it and that will send the body anyway. But
> maybe it would actually be a better idea to read (and discard) the
> body than to close the connection and reopen it.

That approach is just as hard, only depending on different things to work correctly. Since we're talking about silly servers, they could just as well return a body to the HEAD request while the response is said to be persistent and Content-Length: is set. The Content-Length in a HEAD response is the size of the body that would be returned for a GET request, so you'd have no idea how much data to read.

Been there. Seen it happen.

There's just no good way to deal with a HEAD request that gets a body sent back. I mean besides yelling at the author of the server side.

--
 -=- Daniel Stenberg -=- http://daniel.haxx.se -=-
 ech`echo xiun|tr nu oc|sed 'sx\([sx]\)\([xoi]\)xo un\2\1 is xg'`ol
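Daniel's argument can be written down as a small decision rule. The sketch below is an illustration, not how Wget or curl decide; the function name `can_drain_for_reuse` and the 4096-byte `max_discard` threshold are assumptions, and chunked transfer encoding is ignored for brevity.

```python
def can_drain_for_reuse(method, content_length, max_discard=4096):
    """Return True if an unwanted response body can be reliably read and
    discarded so that the connection may be kept for the next request.

    method         -- request method, e.g. "GET" or "HEAD"
    content_length -- value of Content-Length as an int, or None if absent
    max_discard    -- hypothetical cap on how many bytes are worth skipping
    """
    if method.upper() == "HEAD":
        # Content-Length on a HEAD response describes what GET would
        # return. Whether a misbehaving server actually sends those bytes
        # is unknown, so there is no reliable amount to drain.
        return False
    if content_length is None:
        # Without a length, the body ends only when the server closes.
        return False
    # A small, known-length body (a typical redirect page) is cheap to skip.
    return 0 <= content_length <= max_discard


print(can_drain_for_reuse("GET", 512))      # True
print(can_drain_for_reuse("HEAD", 120000))  # False
```

For HEAD the only remaining choices are to trust the server and reuse the connection, or to close it as Wget does.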
RE: keep alive connections
From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]

> With the HEAD method you never know when you'll stumble upon a CGI
> that doesn't understand it and that will send the body anyway. But
> maybe it would actually be a better idea to read (and discard) the
> body than to close the connection and reopen it.

Wouldn't that be suboptimal in case that page is huge (and/or the connection slow)?

Heiko

--
-- PREVINET S.p.A. www.previnet.it
-- Heiko Herold [EMAIL PROTECTED]
-- +39-041-5907073 ph
-- +39-041-5907472 fax
Re: keep alive connections
Herold Heiko [EMAIL PROTECTED] writes:

> From: Hrvoje Niksic [mailto:[EMAIL PROTECTED]
>
>> With the HEAD method you never know when you'll stumble upon a CGI
>> that doesn't understand it and that will send the body anyway. But
>> maybe it would actually be a better idea to read (and discard) the
>> body than to close the connection and reopen it.
>
> Wouldn't that be suboptimal in case that page is huge (and/or the
> connection slow) ?

You are right, it would. But it might make good sense for redirections, which typically have very small bodies.
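The compromise reached here (discard small bodies, close on large ones) could look roughly like the following. This is a sketch under stated assumptions, not Wget's implementation: `discard_body`, the 4096-byte cap, and the file-like `stream` standing in for the socket are all hypothetical.

```python
import io


def discard_body(stream, content_length, max_discard=4096, chunk_size=1024):
    """Read and throw away exactly content_length bytes from stream.

    Returns True if the whole body was skipped, meaning the stream is now
    positioned at the start of the next response and can be reused.
    Returns False if the caller should close the connection instead.
    """
    if content_length is None or content_length > max_discard:
        # Unknown or large body: reading it would cost more than reconnecting.
        return False
    remaining = content_length
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            # Peer closed before sending the announced number of bytes.
            return False
        remaining -= len(chunk)
    return True


# A 100-byte redirect body followed by the start of the next response.
conn = io.BytesIO(b"x" * 100 + b"HTTP/1.0 200 OK")
print(discard_body(conn, 100))  # True
print(conn.read())              # b'HTTP/1.0 200 OK'
```

The cap is the knob that answers Heiko's objection: a huge page or a slow link falls above it and the connection is simply closed.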
AI_ADDRCONFIG
I noticed this with the latest CVS Wget:

  $ wget ftp://ftp.deepspace6.net/pub/ds6/sources/nc6/nc6-0.5.tar.bz2
  --23:45:52--  ftp://ftp.deepspace6.net/pub/ds6/sources/nc6/nc6-0.5.tar.bz2
             => `nc6-0.5.tar.bz2'
  Resolving ftp.deepspace6.net... 2001:760:204:10:10:a7ff:fe16:27f4, 3ffe:8300:0:1:10:a7ff:fe16:27f4, 192.167.215.13
  Connecting to ftp.deepspace6.net|2001:760:204:10:10:a7ff:fe16:27f4|:21... failed: Address family not supported by protocol.
  Connecting to ftp.deepspace6.net|3ffe:8300:0:1:10:a7ff:fe16:27f4|:21... failed: Address family not supported by protocol.
  Connecting to ftp.deepspace6.net|192.167.215.13|:21... connected.
  [...]

Wget works well, but it looks ugly because my machine is not configured for IPv6. According to OpenGroup's web site, AI_ADDRCONFIG flag should be of use here.

Should I be worried that the getaddrinfo man page on my (RHL 9) system doesn't mention AI_ADDRCONFIG?
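A minimal sketch of the idea, written in Python rather than Wget's C: pass AI_ADDRCONFIG to getaddrinfo when the platform defines it, and fall back to no flag otherwise. The helper names `addrconfig_flag` and `resolve` are made up for illustration; in C the equivalent guard would presumably be an `#ifdef AI_ADDRCONFIG` around the hints setup.

```python
import socket


def addrconfig_flag():
    """Return AI_ADDRCONFIG if this platform's socket module defines it,
    otherwise 0 so the lookup behaves as if the flag were not requested."""
    return getattr(socket, "AI_ADDRCONFIG", 0)


def resolve(host, port):
    """Resolve host for a TCP connection, asking the resolver to return
    only address families that have a configured local address."""
    return socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM, flags=addrconfig_flag()
    )


# Usage (needs a working resolver, so no output is shown here):
#   for family, _, _, _, sockaddr in resolve("ftp.deepspace6.net", 21):
#       print(family, sockaddr)
```

On a host with no IPv6 address configured, a resolver that honours the flag should leave the IPv6 results out, so the two failed connection attempts in the transcript would not be tried. Whether a given system's resolver implements the flag is a separate question from whether its man page mentions it.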