Vladimir Volovich <[EMAIL PROTECTED]> writes:

> E.g., Apache 2.0 does complain on requests like "GET
> /../dir/file.html HTTP/1.0" with "HTTP/1.1 400 Bad Request" so wget
> will not work properly at all.

Wget's implementation reflects rfc1808, which explicitly requires
all extraneous ".." path elements to be retained.  In other words,
that Wget does so is no accident; it had to be separately coded into
path_simplify, as shown by this ChangeLog entry:

2003-11-14  Hrvoje Niksic  <[EMAIL PROTECTED]>

        (path_simplify): Don't swallow ".."'s at the beginning of string.
        E.g. simplify "foo/../../bar" as "../bar", not as "bar".
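
To make the retaining behaviour concrete, here is a rough Python sketch
of what that entry describes.  It is not Wget's C code: the name
simplify_keep_leading is made up for illustration, and the sketch drops
empty segments and trailing slashes, which real code would have to
preserve.

```python
def simplify_keep_leading(path: str) -> str:
    """Collapse "." and ".." segments, but keep any ".." that would
    climb above the start of the path (the rfc1808-style behaviour)."""
    out = []
    for seg in path.split("/"):
        if seg in ("", "."):
            # Simplification for this sketch: ignore empty and "." segments.
            continue
        if seg == "..":
            if out and out[-1] != "..":
                out.pop()          # ".." cancels the preceding real segment
            else:
                out.append("..")   # nothing left to cancel: retain it
        else:
            out.append(seg)
    return "/".join(out)


print(simplify_keep_leading("foo/../../bar"))  # ../bar
```

A request built from such a result is what produces the "GET /../..."
lines that servers reject.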

However, even rfc2396, released in 1998, relaxed this, stating in
section 5.2:

      g) If the resulting buffer string still begins with one or more
         complete path segments of "..", then the reference is
         considered to be in error.  Implementations may handle this
         error by retaining these components in the resolved path
         (i.e., treating them as part of the final URI), by removing
         them from the resolved path (i.e., discarding relative levels
         above the root), or by avoiding traversal of the reference.

rfc3986 (released in 2005) goes further and, as far as I can tell,
simply specifies extraneous ".." to resolve to "/".  The editors
apparently recognized the reality of virtually all current
implementations, and Wget should do the same.
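
For comparison, the following is a Python sketch of the
remove_dot_segments routine from rfc3986 section 5.2.4.  It is meant
only to show how extraneous ".." segments get discarded, not as a patch
for path_simplify; the Python rendering and variable names are my own.

```python
def remove_dot_segments(path: str) -> str:
    """Sketch of rfc3986 section 5.2.4: resolve "." and ".." segments,
    discarding any ".." that would climb above the root."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]            # "/./" becomes "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]            # "/../" becomes "/"
            if output:
                output.pop()           # drop the last output segment, if any
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if present)
            # from the input to the output.
            idx = path.find("/", 1)
            if idx == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:idx])
                path = path[idx:]
    return "".join(output)


print(remove_dot_segments("/../dir/file.html"))  # /dir/file.html
```

With this behaviour the request from the original report would go out
as "GET /dir/file.html" instead of "GET /../dir/file.html".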

I believe Frank's proposed modification is a correct fix for this.
(Except that the entire "else" block should be deleted, rather than
the two offending lines merely being commented out.)
