Greetings,
I stumbled across a bug yesterday that I can reproduce in both v1.8.2 and v1.10.2.
Apparently, a recursive get tries to re-open each file for reading after
downloading it, in order to find the links to subsequent files. The
problem is that when wget is invoked with -O - to deliver everything to
stdout, there is no file on disk to re-open, so you get the output below
(note the "No such file or directory" error). In 1.10 it appears the
error message was removed, but wget still fails to fetch recursively.
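For comparison, recursion behaves normally when the pages are saved to
disk, e.g. (depth limited to 1 just for a quick test):

$ wget -r -l 1 http://www.zdziarski.com/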
I realize there wouldn't seem to be much reason to send more than one
page to stdout, but I'm feeding everything into a statistical filter to
classify website data, and the filter doesn't care where one page ends
and the next begins. Do you know of any workaround for this, other than
saving the files and opening them after the download (which won't scale
at thousands of pages per minute)? The only alternative I've sketched
so far is below.
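The idea is to do the recursion outside of wget, so every page can still
be fetched with -O - and streamed straight to the filter. This is a
rough sh sketch, not something I've tested at volume; it assumes GNU
grep, a single host, plain HTML, and absolute double-quoted hrefs
(relative-link resolution, politeness delays, and error handling are
all omitted):

#!/bin/sh
# Crude breadth-first crawl that streams every page to stdout.
BASE="http://www.zdziarski.com"
queue=$(mktemp); seen=$(mktemp)
echo "$BASE/" > "$queue"
while read -r url; do
    grep -qxF "$url" "$seen" && continue    # skip pages already fetched
    echo "$url" >> "$seen"
    page=$(wget -q -O - "$url") || continue # fetch straight to memory
    printf '%s\n' "$page"                   # stream the page to stdout
    # naive link extraction: absolute, double-quoted hrefs on this host
    printf '%s\n' "$page" \
        | grep -o 'href="[^"]*"' \
        | sed 's/^href="//; s/"$//' \
        | grep "^$BASE" >> "$queue"
done < "$queue"  # loop ends at EOF, i.e. when no new links were enqueued
rm -f "$queue" "$seen"

Appending to the queue file while the loop reads from it gives a
breadth-first walk that terminates naturally at EOF; that works on a
regular file, though I wouldn't call it robust. A fix inside wget would
obviously be much better.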
Thanks!
$ wget -O - -r http://www.zdziarski.com > out
--15:40:06-- http://www.zdziarski.com/
=> `-'
Resolving www.zdziarski.com... done.
Connecting to www.zdziarski.com[209.51.159.242]:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 24,275 [text/html]
100%[====================================>] 24,275       163.49K/s    ETA 00:00
15:40:06 (163.49 KB/s) - `-' saved [24275/24275]
www.zdziarski.com/index.html: No such file or directory
FINISHED --15:40:06--
Downloaded: 24,275 bytes in 1 files
Jonathan