Yazeed Hamid wrote:
I'm using wget version 1.10.2 on cygwin running on Windows Vista (v
6.0.6000 Build 6000).
When there is a proxy to go through, the corresponding proxy
address:port is exported to the environment (export http_proxy="$proxy").
The problem is, when wget is working through a proxy, it doesn't seem to
reuse the existing HTTP connection like it does when --proxy=off is set.
When there is no proxy, all objects referenced in an HTML file are fetched
over the same connection. On the other hand, when I am going through a
proxy, each object is fetched over a new connection, although the
Connection: Keep-Alive header is in both the HTTP request and response
messages.
As a result, the measured response time through a proxy is very
much greater than that through a direct connection (no proxy).
----
I just tried both of your examples against mcafee.com (the ones that took
you 3.56 and 20.42 seconds). My proxy server is on a linux machine, so I
had to run my tests from linux. I couldn't replicate your problem. My output:
time wget -pdEk --delete-after --proxy=off -o log-no-proxy-w-debug www.mcafee.com
Setting --html-extension (htmlextension) to 1
Setting --convert-links (convertlinks) to 1
Setting --delete-after (deleteafter) to 1
Setting --proxy (useproxy) to off
Setting --output-file (logfile) to log-no-proxy-w-debug
2.62sec 0.01usr 0.02sys (1.44% cpu)
time wget -pdEk --delete-after --proxy=on -o log-no-proxy-w-debug www.mcafee.com
Setting --html-extension (htmlextension) to 1
Setting --convert-links (convertlinks) to 1
Setting --delete-after (deleteafter) to 1
Setting --proxy (useproxy) to on
Setting --output-file (logfile) to log-no-proxy-w-debug
2.56sec 0.02usr 0.03sys (2.14% cpu)
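One way to see whether connections are actually being reused in either run is to count the relevant lines in the log written by -o. This is only a rough sketch: it assumes the log contains wget's usual "Connecting to ..." and "Reusing existing connection to ..." messages (the exact wording may differ between versions), and count_connections is just a name made up for this example.

```shell
# Count new vs. reused connections in a wget log (as written by -o).
# Assumption: new connections are logged as "Connecting to ..." and
# reused ones as "Reusing existing connection to ..."; adjust the
# patterns if your wget build words them differently.
count_connections() {
    log=$1
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps the function usable under "set -e".
    new=$(grep -c '^Connecting to ' "$log" || true)
    reused=$(grep -c '^Reusing existing connection to ' "$log" || true)
    echo "new=$new reused=$reused"
}
```

Running `count_connections log-no-proxy-w-debug` after each of the two commands above would show whether the proxied run opens a new connection per object. Note that both commands write to the same log file name, so check the log before starting the second run.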
You are running on Windows. MS networking isn't known for its
speed -- especially on open/close operations, but I wouldn't think it
would be that bad. They deliberately put in slowdowns on non-server
editions of Windows starting in XP-SP2 on opening some types
of network connections -- that could be part of the cause -- but
again, a 17 second delay seems unreasonable.
It also depends on what your proxy server does. I know
squid has parameters (persistent_request_timeout, client_lifetime,
pconn_timeout) to set the timeout for re-usable connections.
While the defaults in a standard 'squid' setup are reasonable,
you didn't specify what proxy you were using, nor do we know how
it is configured.
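If it does turn out to be squid, something like the following would show whether any of those three timeouts have been set explicitly. Treat it as a sketch: the default config path is a guess (it varies by install), and show_squid_timeouts is a hypothetical helper name.

```shell
# Print any explicit settings of the three timeout directives in a squid.conf.
# Commented-out lines are skipped, so no output means none of them
# has been overridden and squid's built-in defaults apply.
show_squid_timeouts() {
    conf=${1:-/etc/squid/squid.conf}   # assumed path; pass your own as $1
    grep -E '^[[:space:]]*(persistent_request_timeout|client_lifetime|pconn_timeout)[[:space:]]' "$conf" || true
}
```

For example, `show_squid_timeouts /usr/local/squid/etc/squid.conf` on the proxy machine; a very short pconn_timeout there might be worth a closer look.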
For what it's worth -- I tried the wget command on
my Windows XP box. I only tested the 'with-proxy' case, since
my Windows box isn't on the external net (it has to go through
the linux proxy). It came out with times similar to those
run on the proxy machine: 2.68sec 0.04usr 0.10sys (5.72% cpu)
I'd check the proxy. My linux wget is 1.10.1, and my windows
wget (under cygwin) is 1.10.2.
Good luck.