Bill Moseley <[EMAIL PROTECTED]> writes:

> I'm not clear what URI should do here.  In a spider fetching /../foo then
> fetches /../../foo and so on.
>
> > perl -MURI -le 'print URI->new_abs("../foo","http://root.com")->as_string'
> http://root.com/../foo
>
> I can fix $uri->path, of course.

The ".." are left on purpose.  RFC 2396 has this to say:

   g) If the resulting buffer string still begins with one or more
      complete path segments of "..", then the reference is
      considered to be in error.  Implementations may handle this
      error by retaining these components in the resolved path (i.e.,
      treating them as part of the final URI), by removing them from
      the resolved path (i.e., discarding relative levels above the
      root), or by avoiding traversal of the reference.

And it also says this in the test-case appendix:

   Parsers must be careful in handling the case where there are more
   relative path ".." segments than there are hierarchical levels in
   the base URI's path.  Note that the ".." syntax cannot be used to
   change the authority component of a URI.

      ../../../g    = http://a/../g
      ../../../../g = http://a/../../g

   In practice, some implementations strip leading relative symbolic
   elements (".", "..") after applying a relative URI calculation,
   based on the theory that compensating for obvious author errors is
   better than allowing the request to fail.  Thus, the above two
   references will be interpreted as "http://a/g" by some
   implementations.
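
For a spider that would rather discard the excess levels (the second option the RFC allows), the path can be cleaned up after resolution. What follows is only a minimal sketch, assuming the URI module is installed; new_abs_clamped is a hypothetical helper name and not part of URI itself.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper (not part of the URI module): resolve $rel against
# $base, then discard any "." or ".." segments left at the start of the path.
sub new_abs_clamped {
    my ($rel, $base) = @_;
    my $uri  = URI->new_abs($rel, $base);
    my $path = $uri->path;

    # Strip one leading "/." or "/.." segment per pass until none remain,
    # e.g. "/../../g" becomes "/../g" and then "/g".
    1 while $path =~ s{^/\.\.?(?=/|\z)}{};

    # A path that consisted only of dot segments collapses to the root.
    $path = '/' if $path eq '';

    $uri->path($path);
    return $uri;
}

print new_abs_clamped("../foo", "http://root.com"), "\n";
```

Only leading segments are touched, so a ".." that URI has already resolved against the base path is unaffected. Whether to strip at all is a policy decision for the caller, since retaining the ".." is equally valid under the RFC text quoted above.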