Here's some anomalous behaviour in wget that I've been noticing.
Occasionally, when wget has to restart a download, it miscalculates the total
and remaining amounts of the transfer that it displays. Here is a
transcript of a recent transfer in which I fetched the getmail package from the
current branch of the Slackware mirror at Purdue University. The first line is
the wget command exactly as invoked. The rest is its output. I've prefixed
each line with a line number because the transcript is so long.

Line 88 shows that the package is 194641 bytes long, and line 90 shows that the
first attempt retrieves 5472 bytes of it. So the next try has 189169 bytes left
to download, but on line 112 wget reports that amount as the total size of the
file, and then subtracts the 5472 bytes already downloaded a second time to get
an erroneous amount remaining (183697).

After downloading a total of 112176 bytes at line 114, it has to restart again.
Line 136 shows exactly the same display errors as line 112. This time,
though, since more than half of the file has already been fetched, the double
subtraction of the amount already downloaded sends the displayed amount
remaining into negative territory, which at least doesn't cause wget to crash.
When the progress bar reaches its fullest extent but there's still data to
download, the equals signs just start growing backwards into the plus signs
proportionally, while the percentage indicator stays stuck at 100%, which shows
the resilience of that code as well.
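
The arithmetic behind the two misreported "Length" lines can be reproduced
with a few lines of shell. This is only a model of the numbers wget displays,
not wget's actual code; the function name displayed_after_restart is made up
for illustration, and the inputs are taken from the transcript below.

```shell
#!/bin/sh
# Hypothetical helper: models the figures shown after a REST, as described
# above. $1 = real file size, $2 = bytes already downloaded.
displayed_after_restart() {
    total=$1
    done_bytes=$2
    # The server's 150 reply gives only the bytes still to be sent,
    # and that figure is displayed as if it were the whole file.
    shown_length=$((total - done_bytes))
    # The bytes already downloaded are then subtracted a second time.
    shown_remaining=$((shown_length - done_bytes))
    echo "Length: $shown_length, $shown_remaining remaining"
}

displayed_after_restart 194641 5472     # Length: 189169, 183697 remaining
displayed_after_restart 194641 112176   # Length: 82465, -29711 remaining
```

The first call matches line 112 of the transcript and the second matches line
136, including the negative remainder.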

With this bug, the file's contents are still downloaded correctly. I have
encountered another bug, though, that occasionally causes the retry to start
downloading the file from the beginning while appending to the contents
already on disk (see below). In that case, the final on-disk file can be
repaired by using the split command to remove everything from the beginning of
the file that shouldn't be there and then reassembling and renaming the rest.

I'd also like to take this opportunity to lobby for an argument to
--server-response that lets a user specify which server responses he or
she is actually interested in seeing (or, in this case, not seeing). For this
bug report's transcript, I'd really like to suppress the 220 responses, but
the only way to do that is to pipe the output through grep, and that causes
wget to change its progress indicators, so it isn't an option here.

  1: wget --no-host-directories --read-timeout=10 --tries=0 --server-response --passive-ftp --cut-dirs=4 --output-document=slackware/n/getmail-4.6.3-noarch-1.tgz ftp://ftp.cerias.purdue.edu/pub/os/slackware/slackware-current/slackware/n/getmail-4.6.3-noarch-1.tgz
  2: --19:08:25--  ftp://ftp.cerias.purdue.edu/pub/os/slackware/slackware-current/slackware/n/getmail-4.6.3-noarch-1.tgz
  3:            => `slackware/n/getmail-4.6.3-noarch-1.tgz'
  4: Resolving ftp.cerias.purdue.edu... 128.10.252.10
  5: Connecting to ftp.cerias.purdue.edu|128.10.252.10|:21... connected.
  6: Logging in as anonymous ...
  7: 220-
  8: 220-             Welcome to the CERIAS Security FTP Archive
  9: 220-
 10: 220-All activity is logged and may be monitored. If you object to this,
 11: 220-do not log into this service. Before downloading any tools, tips,
 12: 220-tricks or other bits of information from this site, please read and
 13: 220-understand all of the implications of the information provided located
 14: 220-in the root directory.
 15: 220-
 16: 220-        Limitation of Liability - README.liability
 17: 220-            Export Restrictions - README.export
 18: 220-               Copyright Notice - README.copyright
 19: 220-
 20: 220-----------------------------------------------------------------------
 21: 220-
 22: 220-             .;,
 23: 220-             iBMMMWt.
 24: 220-              :iVRMMMMY,
 25: 220-                ,tYVWBMMMi        .+IVXVYt+.
 26: 220-                   :tYYXRMW+.   :WRt;....:+YI;
 27: 220-                      .:+i++XItVR:          ,tY.
 28: 220-                           .:+RX              tR
 29: 220-                             :M ._..__. __.    Rt
 30: 220-                     _. _ ._.tR  | [__](__     IX
 31: 220-                    (_.(/,[  +M _|_|  |.__)    XY
 32: 220-                              RY              +M.
 33: 220-                              .WV.           YM,
 34: 220-                                +VI+,    ,+VBI     --vkoser
 35: 220-                                  ,;+tYVVVt:
 36: 220-
 37: 220-
 38: 220-                           Purdue University
 39: 220-
 40: 220-                       CERIAS - Security Archive
 41: 220-                ------------------------------------
 42: 220-                Center for Education and Research in
 43: 220-                 Information Assurance and Security
 44: 220-------------------------------------------------------------------------
 45: 220- Local time is: Mon Jul  3 19:08:23 2006.  You are user 4 out of 150.
 46: 220-
 47: 220-\011The CERIAS FTP site has moved to its new home:
 48: 220-\011\011SunFire V20z server
 49: 220-\011\0111x AMD Opteron Model 242 CPU
 50: 220-\011\0111 GB RAM
 51: 220-\011\01173 GB 10,000 RPM SCSI hard drive
 52: 220-\011\0112x Gbit Ethernet
 53: 220-
 54: 220-\011The files in the  FTP Archive are now served off of a
 55: 220-\011\011Sun StorEdge RAID Array (1.5 TB total)
 56: 220-
 57: 220-   This equipment came to CERIAS as part of a very generous donation
 58: 220-                          from Sun Microsystems
 59: 220-
 60: 220-     To report a problem, contact [EMAIL PROTECTED]
 61: 220-
 62: 220-    For more information on CERIAS go to http://www.cerias.purdue.edu
 63: 220 128.10.252.10 FTP server ready
 64: --> USER anonymous
 65:
 66: 331 Anonymous login ok, send your complete email address as your password.
 67: --> PASS Turtle Power!
 68:
 69: 230 Anonymous access granted, restrictions apply.
 70: --> SYST
 71:
 72: 215 UNIX Type: L8
 73: --> PWD
 74:
 75: 257 "/" is current directory.
 76: --> TYPE I
 77:
 78: 200 Type set to I
 79: --> CWD /pub/os/slackware/slackware-current/slackware/n
 80:
 81: 250 CWD command successful
 82: --> PASV
 83:
 84: 227 Entering Passive Mode (128,10,252,10,248,254).
 85: --> RETR getmail-4.6.3-noarch-1.tgz
 86:
 87: 150 Opening BINARY mode data connection for getmail-4.6.3-noarch-1.tgz (194641 bytes)
 88: Length: 194,641 (190K) (unauthoritative)
 89:
 90:  2% [====>                                                                 
                                                                                   
                                ] 5,472         --.--K/s    ETA 06:40
 91:
 92: 19:08:42 (452.60 B/s) - Data connection: Connection timed out;
 93: 451 Transfer aborted. Broken pipe
 94: Data transfer aborted.
 95: Retrying.
 96:
 97: --19:08:43--  ftp://ftp.cerias.purdue.edu/pub/os/slackware/slackware-current/slackware/n/getmail-4.6.3-noarch-1.tgz
 98:   (try: 2) => `slackware/n/getmail-4.6.3-noarch-1.tgz'
 99: ==> CWD not required.
100: --> SIZE getmail-4.6.3-noarch-1.tgz
101:
102: 550 getmail-4.6.3-noarch-1.tgz: Permission denied
103: --> PASV
104:
105: 227 Entering Passive Mode (128,10,252,10,248,105).
106: --> REST 5472
107:
108: 350 Restarting at 5472. Send STORE or RETRIEVE to initiate transfer
109: --> RETR getmail-4.6.3-noarch-1.tgz
110:
111: 150 Opening BINARY mode data connection for getmail-4.6.3-noarch-1.tgz (189169 bytes)
112: Length: 189,169 (185K), 183,697 (179K) remaining (unauthoritative)
113:
114: 59% 
[+++++=======================================================================================================>
                                                                            ] 
112,176       --.--K/s    ETA 00:39
115:
116: 19:09:39 (1.90 KB/s) - Data connection: Connection timed out;
117: 451 Transfer aborted. Broken pipe
118: Data transfer aborted.
119: Retrying.
120:
121: --19:09:42--  ftp://ftp.cerias.purdue.edu/pub/os/slackware/slackware-current/slackware/n/getmail-4.6.3-noarch-1.tgz
122:   (try: 3) => `slackware/n/getmail-4.6.3-noarch-1.tgz'
123: ==> CWD not required.
124: --> SIZE getmail-4.6.3-noarch-1.tgz
125:
126: 550 getmail-4.6.3-noarch-1.tgz: Permission denied
127: --> PASV
128:
129: 227 Entering Passive Mode (128,10,252,10,248,190).
130: --> REST 112176
131:
132: 350 Restarting at 112176. Send STORE or RETRIEVE to initiate transfer
133: --> RETR getmail-4.6.3-noarch-1.tgz
134:
135: 150 Opening BINARY mode data connection for getmail-4.6.3-noarch-1.tgz (82465 bytes)
136: Length: 82,465 (81K), -29,711 (-29711) remaining (unauthoritative)
137:
138: 
100%[++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++==============================================================================>]
 194,641        2.98K/s    ETA 00:00
139:
140: 226 Transfer complete.
141: 19:10:21 (2.14 KB/s) - `slackware/n/getmail-4.6.3-noarch-1.tgz' saved [194641]
142:

The bug that actually mangles the resulting file is always accompanied by an
"Error in server response" message, whether or not --server-response is
turned on. I've only ever experienced it with the mirror.datapipe.net FTP
server, but that may be a coincidence. Here's a transcript that demonstrates
this bug.

  1: wget --no-host-directories --read-timeout=10 --tries=0 --server-response --cut-dirs=3 --output-document=extras/office/libchipcard2-2.1.4-i486-1frg.tgz ftp://mirror.datapipe.net/norlug/frg/frg-current/extras/office/libchipcard2-2.1.4-i486-1frg.tgz
  2: --00:40:46--  ftp://mirror.datapipe.net/norlug/frg/frg-current/extras/office/libchipcard2-2.1.4-i486-1frg.tgz
  3:            => `extras/office/libchipcard2-2.1.4-i486-1frg.tgz'
  4: Resolving mirror.datapipe.net... 64.27.65.115
  5: Connecting to mirror.datapipe.net|64.27.65.115|:21... connected.
  6: Logging in as anonymous ...
  7: 220 Welcome to DataPipe's FTP service.
  8: --> USER anonymous
  9:
 10: 331 Please specify the password.
 11: --> PASS Turtle Power!
 12:
 13: 230 Login successful.
 14: --> SYST
 15:
 16: 215 UNIX Type: L8
 17: --> PWD
 18:
 19: 257 "/"
 20: --> TYPE I
 21:
 22: 200 Switching to Binary mode.
 23: --> CWD /norlug/frg/frg-current/extras/office
 24:
 25: 250 Directory successfully changed.
 26: --> PASV
 27:
 28: 227 Entering Passive Mode (64,27,65,115,122,33)
 29: --> RETR libchipcard2-2.1.4-i486-1frg.tgz
 30:
 31: 150 Opening BINARY mode data connection for libchipcard2-2.1.4-i486-1frg.tgz (601263 bytes).
 32: Length: 601,263 (587K) (unauthoritative)
 33:
 34: 46% 
[=====================================================================================>
                                                                                   
                ] 281,808       --.--K/s    ETA 03:28
 35:
 36: 00:43:58 (1.50 KB/s) - Data connection: Connection timed out;
 37: Control connection closed.
 38: Retrying.
 39:
 40: --00:44:09--  ftp://mirror.datapipe.net/norlug/frg/frg-current/extras/office/libchipcard2-2.1.4-i486-1frg.tgz
 41:   (try: 2) => `extras/office/libchipcard2-2.1.4-i486-1frg.tgz'
 42: Connecting to mirror.datapipe.net|64.27.65.115|:21... connected.
 43: Logging in as anonymous ...
 44: 220 Welcome to DataPipe's FTP service.
 45: --> USER anonymous
 46:
 47: 331 Please specify the password.
 48: --> PASS Turtle Power!
 49:
 50: 230 Login successful.
 51: --> SYST
 52:
 53: 215 UNIX Type: L8
 54: --> PWD
 55:
 56: 257 "/"
 57: --> TYPE I
 58:
 59: 200 Switching to Binary mode.
 60: --> CWD /norlug/frg/frg-current/extras/office
 61:
 62: 250 Directory successfully changed.
 63: --> SIZE libchipcard2-2.1.4-i486-1frg.tgz
 64:
 65: 213 601263
 66: --> PASV
 67:
 68: 227 Entering Passive Mode (64,27,65,115,169,9)
 69: --> REST 281808
 70:
 71: 350 Restart position accepted (281808).
 72: --> RETR libchipcard2-2.1.4-i486-1frg.tgz
 73:
 74: 150 Opening BINARY mode data connection for libchipcard2-2.1.4-i486-1frg.tgz (601263 bytes).
 75: Length: 601,263 (587K), 319,455 (312K) remaining
 76:
 77: 65% 
[++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++=================================>
                                                                 ] 392,616       
--.--K/s    ETA 03:03
 78:
 79: 00:45:55 (1.11 KB/s) - Data connection: Connection timed out;
 80: Control connection closed.
 81: Retrying.
 82:
 83: --00:46:07--  ftp://mirror.datapipe.net/norlug/frg/frg-current/extras/office/libchipcard2-2.1.4-i486-1frg.tgz
 84:   (try: 3) => `extras/office/libchipcard2-2.1.4-i486-1frg.tgz'
 85: Connecting to mirror.datapipe.net|64.27.65.115|:21... connected.
 86: Logging in as anonymous ...
 87: 220 Welcome to DataPipe's FTP service.
 88: --> USER anonymous
 89:
 90: 331 Please specify the password.
 91: --> PASS Turtle Power!
 92:
 93:
 94: Error in server response, closing control connection.
 95: Retrying.
 96:
 97: --00:46:21--  ftp://mirror.datapipe.net/norlug/frg/frg-current/extras/office/libchipcard2-2.1.4-i486-1frg.tgz
 98:   (try: 4) => `extras/office/libchipcard2-2.1.4-i486-1frg.tgz'
 99: Connecting to mirror.datapipe.net|64.27.65.115|:21... connected.
100: Logging in as anonymous ...
101: 220 Welcome to DataPipe's FTP service.
102: --> USER anonymous
103:
104: 331 Please specify the password.
105: --> PASS Turtle Power!
106:
107: 230 Login successful.
108: --> SYST
109:
110: 215 UNIX Type: L8
111: --> PWD
112:
113: 257 "/"
114: --> TYPE I
115:
116: 200 Switching to Binary mode.
117: --> CWD /norlug/frg/frg-current/extras/office
118:
119: 250 Directory successfully changed.
120: --> PASV
121:
122: 227 Entering Passive Mode (64,27,65,115,162,74)
123: --> RETR libchipcard2-2.1.4-i486-1frg.tgz
124:
125: 150 Opening BINARY mode data connection for libchipcard2-2.1.4-i486-1frg.tgz (601263 bytes).
126: Length: 601,263 (587K) (unauthoritative)
127:
128: 
100%[========================================================================================================================================================================================>]
 601,263        2.48K/s    ETA 00:00
129:
130: 226 File send OK.
131: 00:52:09 (1.75 KB/s) - `extras/office/libchipcard2-2.1.4-i486-1frg.tgz' saved [601263]
132:

For whatever reason, wget issued a plain RETR command at line 123 without a
preceding REST 392616 command, which the final progress indicator at line 77
would seem to require. This happened right after the server response error at
line 94.

Still, all that should mean is that the already-downloaded parts were fetched
over again, and the file should be fine. But ls -l shows it's way oversized:

> extras/office$ v libchipcard2-2.1.4-i486-1frg.tgz
> -rw-r--r-- 1 garrett users 993879 2006-05-18 23:06 libchipcard2-2.1.4-i486-1frg.tgz

993879 is larger than the actual file size shown above in lines 32, 75, 126,
and 128, which is 601263. The difference is 392616 as shown in line 77. So to
fix it, I just do this:

> split -b 392616 libchipcard2-2.1.4-i486-1frg.tgz &&
> rm xaa &&
> cat x[[:lower:]][[:lower:]] > libchipcard2-2.1.4-i486-1frg.tgz &&
> rm x[[:lower:]][[:lower:]]
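
The same repair can also be sketched without temporary split pieces. It
assumes, as in the case above, that the unwanted data is exactly the leading
bytes in excess of the size the server reported. The function name
trim_to_expected is made up for illustration, and it writes to a new
".fixed" file rather than overwriting the original.

```shell
#!/bin/sh
# Hypothetical helper, not part of wget: keep only the last EXPECTED bytes
# of FILE and write them to FILE.fixed.
# $1 = oversized file, $2 = expected size in bytes.
trim_to_expected() {
    file=$1
    expected=$2
    actual=$(wc -c < "$file")
    excess=$((actual - expected))
    if [ "$excess" -lt 0 ]; then
        echo "$file is shorter than $expected bytes" >&2
        return 1
    fi
    # tail -c +K starts output at byte K (1-based), so +(excess+1)
    # skips exactly $excess leading bytes.
    tail -c "+$((excess + 1))" "$file" > "$file.fixed"
}
```

For the file above, trim_to_expected libchipcard2-2.1.4-i486-1frg.tgz 601263
would skip 993879 - 601263 = 392616 bytes, the same amount removed by the
split commands. Running gzip -t on the resulting .fixed file is one way to
check it before renaming it.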

--
Cathy Garrett
[EMAIL PROTECTED]
