"William H. Gilmore" <[EMAIL PROTECTED]> writes:

> I have recently tripped across a bug with the version of wget shipped
> with RedHat 7.2.  When I attempt to recursively retrieve a web tree
> starting with an html link that contains a base href, wget apparently
> limits all href to base href even if another absolute path is
> specified.  You can verify this with the following command.

You didn't provide the command.  And I'm not exactly sure what you
mean by "limits all href to base href".  The way base href works is,
every URL gets merged with the base href URL.  The merging process
should correctly handle absolute paths.

For instance, if base href is "http://www.server.com/foo/", the URL
"/bar/index.html" will be merged as
"http://www.server.com/bar/index.html", i.e. the initial slash in the
URL overrode the "foo" part of the base URL.
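
To illustrate the merging rules described above, here is a small
sketch using Python's standard urllib.parse.urljoin.  This is not
Wget's own code, only a stand-in that follows the same general URL
resolution rules; the relative path and the "other.server.com" host
are made up for the example.

```python
from urllib.parse import urljoin

# The base href from the example above.
base = "http://www.server.com/foo/"

# Absolute path: the leading slash replaces the "/foo/" part of the base.
merged_absolute = urljoin(base, "/bar/index.html")

# Relative path (hypothetical): resolved inside the base directory.
merged_relative = urljoin(base, "bar/index.html")

# Full URL on another host (hypothetical): the base is ignored entirely.
merged_full = urljoin(base, "http://other.server.com/x.html")

print(merged_absolute)  # http://www.server.com/bar/index.html
print(merged_relative)  # http://www.server.com/foo/bar/index.html
print(merged_full)      # http://other.server.com/x.html
```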

> I cannot provide you with the site that I identified the problem
> with because of security reasons.

Understood.  It should still be possible to make a minimal Wget run
that demonstrates the problem with `-d -o log'.  After that, `log'
should contain a full debugging dump of the download.  Replace your
site name with "www.server.com", and the identity of your site should
be protected.
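
If replacing the name by hand is tedious, something along these lines
could do it.  This is only a rough sketch: the function name
redact_host, the host "intranet.example.org", and the sample text are
all invented for illustration, and the sample is not real Wget
output.  You would still want to read through the result for IP
addresses or other identifying strings before posting it.

```python
import re

def redact_host(log_text, real_host, placeholder="www.server.com"):
    """Replace every occurrence of real_host with placeholder, ignoring case."""
    return re.sub(re.escape(real_host), placeholder, log_text,
                  flags=re.IGNORECASE)

# Made-up sample text standing in for the contents of `log'.
sample = (
    "http://intranet.example.org/foo/index.html\n"
    "Host: INTRANET.EXAMPLE.ORG\n"
)

redacted = redact_host(sample, "intranet.example.org")
print(redacted)
```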
