Vladimir Volovich <[EMAIL PROTECTED]> writes:

> E.g., Apache 2.0 does complain on requests like "GET
> /../dir/file.html HTTP/1.0" with "HTTP/1.1 400 Bad Request" so wget
> will not work properly at all.

Wget's implementation reflects rfc1808, which explicitly requires all extraneous ".." path elements to be retained. In other words, that Wget does so is no accident; it had to be separately coded into path_simplify, as shown by this ChangeLog entry:

    2003-11-14  Hrvoje Niksic  <[EMAIL PROTECTED]>

        (path_simplify): Don't swallow ".."'s at the beginning of
        string.  E.g. simplify "foo/../../bar" as "../bar", not as
        "bar".

However, even rfc2396, released in 1998, relaxed this, stating in section 5.2:

    g) If the resulting buffer string still begins with one or more
       complete path segments of "..", then the reference is
       considered to be in error.  Implementations may handle this
       error by retaining these components in the resolved path (i.e.,
       treating them as part of the final URI), by removing them from
       the resolved path (i.e., discarding relative levels above the
       root), or by avoiding traversal of the reference.

rfc3986 (released in 2005) goes further and, as far as I can tell, simply specifies that extraneous ".." segments resolve to "/". The editors apparently recognized the reality of virtually all current implementations, and Wget should do the same.

I believe Frank's proposed modification is a correct fix for this. (Except that the entire "else" block should be deleted, rather than just commenting out the two offending lines.)
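
To make the rfc3986 behavior concrete, here is a minimal Python sketch of the dot-segment removal procedure as I read section 5.2.4 of that RFC. It is an illustration only, not Wget's code (path_simplify is written in C and works differently), and the function name remove_dot_segments is my own choice for this example.

```python
def remove_dot_segments(path: str) -> str:
    """Sketch of rfc3986 section 5.2.4; illustrative, not Wget's path_simplify."""
    output = []  # completed segments, each kept with its leading "/"
    while path:
        if path.startswith("../"):
            path = path[3:]            # drop a leading "../"
        elif path.startswith("./"):
            path = path[2:]            # drop a leading "./"
        elif path.startswith("/./"):
            path = path[2:]            # "/./" collapses to "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]            # "/../" collapses to "/" ...
            if output:
                output.pop()           # ... and removes the previous segment, if any
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if present)
            # from the input to the output.
            cut = path.find("/", 1)
            if cut == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:cut])
                path = path[cut:]
    return "".join(output)


if __name__ == "__main__":
    for p in ("/../dir/file.html", "/foo/../../bar", "/a/b/c/./../../g"):
        print(p, "->", remove_dot_segments(p))
```

Under this procedure a ".." that would climb above the root has nothing left to remove and is simply discarded, so the request from Vladimir's example would go out as "GET /dir/file.html" rather than "GET /../dir/file.html".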