Hi,

I believe I have found a bug. While downloading a large file with Wget, the 
connection failed multiple times. Wget retried with range requests until it 
had downloaded the entire file. In the resulting WARC file, all of the requests 
are present, but only the final partial response was saved.
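
One rough way to confirm the mismatch is to count the records in the WARC by 
their WARC-Type header and compare the number of request records with the 
number of response records. The following is only a minimal sketch: the 
function name count_warc_record_types is made up for illustration, and it 
assumes an uncompressed binary stream in which every record carries a correct 
Content-Length header.

```python
from collections import Counter


def count_warc_record_types(stream):
    """Count WARC records by WARC-Type in an uncompressed binary stream.

    Hypothetical helper for illustration; assumes well-formed records.
    """
    counts = Counter()
    while True:
        line = stream.readline()
        if not line:
            break  # end of stream
        if not line.startswith(b"WARC/"):
            continue  # skip the blank separator lines between records
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break  # a blank line ends the record header
            name, _, value = line.partition(b":")
            headers[name.strip().lower()] = value.strip()
        record_type = headers.get(b"warc-type", b"unknown")
        counts[record_type.decode("ascii", "replace")] += 1
        # Skip the record body so payload bytes are never parsed as headers.
        stream.read(int(headers.get(b"content-length", b"0")))
    return counts
```

For a gzip-compressed WARC, the stream could be opened with 
gzip.open(path, "rb") from the Python standard library before being passed in. 
If the behavior described above is present, the count for "request" would be 
larger than the count for "response".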

I observed this behavior with Wget/1.21.3. Arguments were: "-O" "/dev/null" 
"--warc-file" "<redacted>" "--warc-cdx" "--warc-max-size=1G" "--input-file" 
"<redacted>"

This is unfortunate, since it means that the beginning of the file was 
discarded by Wget and is missing from the WARC. Please let me know if you'd 
like me to supply any additional information.
