Package: wget
Version: 1.12-3.1
Severity: normal

Hi,

While using "wget -c -r" on a directory of large binary files, I
noticed long delays after the "The file is already fully retrieved;
nothing to do." message.

It turns out this is because the server returned a 416 response with
Content-Type: text/html, so Wget decides to scan the local file for
links, as if it were HTML.  But the file is not HTML; only the 416
response body was.
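
To illustrate the behaviour I would expect, here is a minimal Python
sketch.  It is not Wget's actual code: the function name, its
parameters, and the set of status codes and media types are all
assumptions made for illustration.  The idea is that the Content-Type
header should only influence link scanning when the response body is
what was actually written to the local file.

```python
# Hypothetical sketch, not Wget's real code: decide whether a local file
# should be scanned for links after an HTTP response.

def should_scan_for_links(status: int, content_type: str) -> bool:
    """Return True only when the response body is what was saved to disk.

    Assumption: a 416 reply to a resume (Range) request carries an error
    page, so its Content-Type says nothing about the local file.
    """
    # Assumed: only 200 and 206 bodies end up in the local file.
    body_is_the_file = status in (200, 206)
    # Drop parameters such as "; charset=iso-8859-1" before comparing.
    media_type = content_type.split(";", 1)[0].strip().lower()
    # Assumed set of media types that count as HTML.
    is_html = media_type in ("text/html", "application/xhtml+xml")
    return body_is_the_file and is_html


# The 416 case from this report: the header says text/html, but the
# local file is a JPEG, so it should not be scanned.
print(should_scan_for_links(416, "text/html; charset=iso-8859-1"))  # False
print(should_scan_for_links(200, "text/html; charset=iso-8859-1"))  # True
print(should_scan_for_links(206, "image/jpeg"))                     # False
```

With a check along these lines, the second run in the example below
would skip the "Loaded ... no-follow" step entirely.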

Example:

$ cd /tmp
$ wget -c -d -r http://www.gnu.org/graphics/t-desktop-4-small.jpg

(The file is downloaded as expected, and not scanned for URLs)

$ wget -c -d -r http://www.gnu.org/graphics/t-desktop-4-small.jpg

(This time, notice in the debug output how the file was "Loaded" and
scanned for "no-follow" links.  This is the source of the delay on
large binary files.)

   ---response begin---
   HTTP/1.1 416 Requested Range Not Satisfiable
   Date: Mon, 16 May 2011 21:34:24 GMT
   Server: Apache
   Vary: Accept-Encoding
   Connection: close
   Content-Type: text/html; charset=iso-8859-1
   
   ---response end---
   416 Requested Range Not Satisfiable
   
       The file is already fully retrieved; nothing to do.
   
   Closed fd 3
   Loaded www.gnu.org/graphics/t-desktop-4-small.jpg (size 30195).
   no-follow in www.gnu.org/graphics/t-desktop-4-small.jpg: 0

-jim


-- System Information:
Debian Release: 6.0.1
  APT prefers stable
  APT policy: (200, 'stable'), (150, 'oldstable'), (80, 'testing'), (50, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.37-trunk-amd64 (SMP w/4 CPU cores)
Locale: LANG=POSIX, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages wget depends on:
ii  dpkg                      1.15.8.10      Debian package management system
ii  install-info              4.13a.dfsg.1-6 Manage installed documentation in 
ii  libc6                     2.11.2-10      Embedded GNU C Library: Shared lib
ii  libssl1.0.0               1.0.0d-1       SSL shared libraries

wget recommends no packages.

wget suggests no packages.

-- no debconf information


