> wget --version
GNU Wget 1.12 built on linux-gnu.

To reproduce:

Go to any SourceForge project and download a file whose URL contains a space. Copy the "direct link" from the download page into wget -i-. Run wireshark and press ^D in the wget input stream. If the upstream strips spaces (e.g. squid, the default setting in pfSense), the download goes round in circles.

The bug does not occur when the URL is passed on the command line. I always use -i- because of all the shell crud in URLs. I am using the openSUSE 11.4 version, but its only source code change is additional support for libproxy.

Problem:

Looking at the source, main.c calls url_parse() for each URL given on the command line. For -i, it calls retrieve_from_file() instead. retrieve_from_file() (in retr.c) reads a list of URLs from the given file. It then calls url_parse() only if IRI is enabled (which in my version of wget is not even compiled in). Hence the URL is never parsed and never encoded before being downloaded with retrieve_url(). That's a bug.

The fix is probably to always call url_parse() in retrieve_from_file(), and not only when IRI is turned on.

If this goes to a mailing list, please cc me on replies, I am not subscribed.

Thanks,

Volker

-- 
Volker Kuhlmann
http://volker.dnsalias.net/
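
To illustrate the difference between the two code paths described in the report, here is a rough Python model of the behaviour. It is not wget's code: encode_url(), urls_from_input(), the always_parse flag and the example.invalid URL are all made up for this sketch, and urllib.parse.quote() merely stands in for the encoding that url_parse() is expected to perform.

```python
from urllib.parse import quote

# Characters left untouched. "%" is kept so existing escapes are not doubled.
# This set is an assumption for the sketch, not wget's actual table.
SAFE = "/:?&=#%@+,;"


def encode_url(url):
    """Hypothetical stand-in for the encoding step done by url_parse()."""
    return quote(url, safe=SAFE)


def urls_from_input(lines, iri_enabled=False, always_parse=False):
    """Toy model of the -i path as the report describes it.

    URLs are encoded only when IRI is enabled, unless always_parse
    (the proposed fix) is set.
    """
    out = []
    for line in lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines in the input
        if iri_enabled or always_parse:
            url = encode_url(url)
        out.append(url)
    return out


lines = ["http://example.invalid/project/my file.tar.gz\n"]
print(urls_from_input(lines))                     # reported behaviour: space kept
print(urls_from_input(lines, always_parse=True))  # proposed behaviour: space encoded
```

With always_parse left off and IRI disabled, the URL reaches the download step with a literal space in it, which is the request an upstream proxy can then mangle. With always_parse set, the space is sent as %20, matching what happens for URLs given on the command line.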
