Daniel Pittman writes:
> On Mon, 10 Jan 2000, Pete Forman <[EMAIL PROTECTED]>
> wrote:
>
> > Kai Großjohann writes:
> > > Pete Forman <[EMAIL PROTECTED]> writes:
> > >
> > > > [...] The ftp URL scheme to access an absolute path is,
> > > > e.g. ftp://user@host/%2Fetc/magic. You might think that
> > > > ftp://user@host//etc/magic would work but that actually
> > > > specifies a relative path, i.e. ~user/etc/magic. [...]
> > >
> > > I thought that ftp://user@host/etc/magic was the equivalent of
> > > /user@host:/etc/magic, and that there was no way to specify
> > > relative file names in `ftp' URLs? Gotta read that RFC, I
> > > guess...
> >
> > Yup. Out on the Internet it rarely matters as most ftp accesses
> > are anonymous where relative and absolute pathnames are the same.
>
> Nope. That's not quite right, I fear.
>
> The ftp scheme was defined, quite carefully, to be usable just like
> a standard ftp client.
Agreed. But they missed an opportunity to specify an absolute path
more elegantly. See the end of this post.
> In the regexp `ftp://user@host/(.*)', \1 is treated as two things:
> a filename to fetch and the path /exactly/ as you would type it at
> a `cd' command.
If we're going to be picky about this, the path is a series of
elements which would be typed at several `cd' commands. An example is

    ftp://user@host/%2Fvar/spool/mail/user

    open host
    ...
    cd /var
    cd spool
    cd mail
    get user

That could have been done as a single cd only if the URL became
ftp://user@host/%2Fvar%2Fspool%2Fmail/user. Remember that the '/'s in
a URL delimit <cwd> elements. It happens to be the same character as
is used in UNIX file systems. URLs also work with DOS or mainframe
file systems.
It would be possible for an FTP client which found itself talking to a
server that groks UNIX to generate a single CWD command. That would
be an implementation detail, though.
> So, if you log in as a user and sit in your home directory by
> default you need to specify a `/' to get to the root level. If you
> don't specify that (eg: ftp://.../path/file), you get the file
> relative to your login directory.
Yes, but in a URL the only way to specify a `/' to get to the root
level is to encode it as %2F.
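Going the other way, a program that has an absolute UNIX-style path and wants an ftp URL for it has to do that encoding itself. A minimal sketch, assuming a UNIX-like server and a made-up helper name ftp_url_for_absolute_path:

```python
from urllib.parse import quote


def ftp_url_for_absolute_path(user, host, path):
    """Build an ftp URL for an absolute UNIX path (hypothetical helper)."""
    if not path.startswith("/"):
        raise ValueError("expected an absolute path: %r" % path)

    # Each path component becomes one <cwd> element (the last one is <name>).
    # Encode components one at a time so a stray ";" or "%" is escaped.
    elements = [quote(element, safe="") for element in path[1:].split("/")]
    url_path = "/".join(elements)

    # The leading "/" of the path is carried as %2F inside the first element.
    return "ftp://" + user + "@" + host + "/%2F" + url_path
```

For example, ftp_url_for_absolute_path("user", "host", "/etc/magic") gives ftp://user@host/%2Fetc/magic, the form quoted at the top of this thread.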
> Anonymous paths without the leading `/' work because you are placed at
> the top of the tree rather than in a user home directory.
That's what I was saying.
Here's the relevant section of RFC 1738 in full.
3.2. FTP
The FTP URL scheme is used to designate files and directories on
Internet hosts accessible using the FTP protocol (RFC959).
A FTP URL follow the syntax described in Section 3.1. If :<port> is
omitted, the port defaults to 21.
3.2.1. FTP Name and Password
A user name and password may be supplied; they are used in the ftp
"USER" and "PASS" commands after first making the connection to the
FTP server. If no user name or password is supplied and one is
requested by the FTP server, the conventions for "anonymous" FTP are
to be used, as follows:
The user name "anonymous" is supplied.
The password is supplied as the Internet e-mail address
of the end user accessing the resource.
If the URL supplies a user name but no password, and the remote
server requests a password, the program interpreting the FTP URL
should request one from the user.
3.2.2. FTP url-path
The url-path of a FTP URL has the following syntax:
<cwd1>/<cwd2>/.../<cwdN>/<name>;type=<typecode>
Where <cwd1> through <cwdN> and <name> are (possibly encoded) strings
and <typecode> is one of the characters "a", "i", or "d". The part
";type=<typecode>" may be omitted. The <cwdx> and <name> parts may be
empty. The whole url-path may be omitted, including the "/"
delimiting it from the prefix containing user, password, host, and
port.
The url-path is interpreted as a series of FTP commands as follows:
Each of the <cwd> elements is to be supplied, sequentially, as the
argument to a CWD (change working directory) command.
If the typecode is "d", perform a NLST (name list) command with
<name> as the argument, and interpret the results as a file
directory listing.
Otherwise, perform a TYPE command with <typecode> as the argument,
and then access the file whose name is <name> (for example, using
the RETR command.)
Within a name or CWD component, the characters "/" and ";" are
reserved and must be encoded. The components are decoded prior to
their use in the FTP protocol. In particular, if the appropriate FTP
sequence to access a particular file requires supplying a string
containing a "/" as an argument to a CWD or RETR command, it is
necessary to encode each "/".
For example, the URL <URL:ftp://myname@host.dom/%2Fetc/motd> is
interpreted by FTP-ing to "host.dom", logging in as "myname"
(prompting for a password if it is asked for), and then executing
"CWD /etc" and then "RETR motd". This has a different meaning from
<URL:ftp://myname@host.dom/etc/motd> which would "CWD etc" and then
"RETR motd"; the initial "CWD" might be executed relative to the
default directory for "myname". On the other hand,
<URL:ftp://myname@host.dom//etc/motd>, would "CWD " with a null
argument, then "CWD etc", and then "RETR motd".
FTP URLs may also be used for other operations; for example, it is
possible to update a file on a remote file server, or infer
information about it from the directory listings. The mechanism for
doing so is not spelled out here.
3.2.3. FTP Typecode is Optional
The entire ;type=<typecode> part of a FTP URL is optional. If it is
omitted, the client program interpreting the URL must guess the
appropriate mode to use. In general, the data content type of a file
can only be guessed from the name, e.g., from the suffix of the name;
the appropriate type code to be used for transfer of the file can
then be deduced from the data content of the file.
3.2.4 Hierarchy
For some file systems, the "/" used to denote the hierarchical
structure of the URL corresponds to the delimiter used to construct a
file name hierarchy, and thus, the filename will look similar to the
URL path. This does NOT mean that the URL is a Unix filename.
3.2.5. Optimization
Clients accessing resources via FTP may employ additional heuristics
to optimize the interaction. For some FTP servers, for example, it
may be reasonable to keep the control connection open while accessing
multiple URLs from the same server. However, there is no common
hierarchical model to the FTP protocol, so if a directory change
command has been given, it is impossible in general to deduce what
sequence should be given to navigate to another directory for a
second retrieval, if the paths are different. The only reliable
algorithm is to disconnect and reestablish the control connection.
==== end of quote
Now it would have been useful if the RFC had defined a null <cwd1> as
meaning "change directory to root". But it did not and there are no
relevant updates AFAIK.
Can anyone think of a reason why that special treatment of null <cwd1>
should not have been allowed?
There is a minor hurdle to doing that in emacs. Interactive file name
completion interprets two adjacent '/'s as meaning "start again from
the root".
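For concreteness, the null <cwd1> rule suggested above might look like the sketch below. The null_first_means_root flag is purely hypothetical: nothing like it is in RFC 1738. Using "/" as the root name also assumes a UNIX-like server. The function takes only the url-path, i.e. whatever follows the "/" after the host.

```python
from urllib.parse import unquote


def cwd_arguments(url_path, null_first_means_root=False):
    """Return the CWD arguments for an FTP url-path (hypothetical helper)."""
    # Everything before the last "/" is a <cwd> element; the rest is <name>.
    *cwds, _name = url_path.split("/")
    args = [unquote(cwd) for cwd in cwds]

    if null_first_means_root and args and args[0] == "":
        # Hypothetical rule, NOT part of RFC 1738: an empty first element
        # means "change directory to root". "/" is a UNIX assumption.
        args[0] = "/"
    return args
```

Under the RFC as written, the url-path "/etc/motd" (from ftp://host//etc/motd) gives a null CWD followed by "etc". With the flag set it gives "/" followed by "etc", which on a UNIX server lands in the same place as "%2Fetc/motd".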
--
Pete Forman
Western Geophysical
[EMAIL PROTECTED]