On 10/27/12 3:35 PM, Anne van Kesteren wrote:
This is covered as we do this for all URLs currently with a "relative
scheme" (http/ws/...). I know you indicated this as potentially
problematic

Let's have that fight separately.  ;)

2)  file:// URIs are parsed as a "no authority" URL in Gecko.  Quoting the
IDL comment:
...
The parser in the specification should handle these in the same way.

Same as the comment I quoted?  Or the same as something else?

I have not introduced a "no authority" concept however. The parser in
the specification also preserves the host as other user agents seem to
preserve it.

Well, the Gecko parser preserves the host at this stage assuming the URI was correctly formatted with a host. Again:

  blah://foo/bar => blah://foo/bar

The interesting things happen when you have 0, 1, or 3 slashes between ':' and "foo". The handling of "foo" after this point is a separate issue.

4)  For "no authority" URLs, including file://, on Windows and OS/2 only, if
what looks like the authority section looks like a drive letter, it's treated as
part of the path.  For example, "file://c:/" is treated as the filename
"c:\".  "Looks like a drive letter" is defined as "ASCII letter (any case),
followed by a ':' or '|' and then followed by end of string or '/' or '\\'".
I'm not sure why this is checking for '\\' again, honestly.  ;)
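
To make the "looks like a drive letter" rule concrete, here is a minimal Python sketch of the check exactly as worded above. It is not Gecko's actual code; the function name and the idea of passing in the text that follows "file://" are assumptions made for illustration.

```python
import string


def looks_like_drive_letter(after_slashes: str) -> bool:
    """Hypothetical helper: apply the quoted rule to the text after 'file://'.

    Rule as stated: an ASCII letter (any case), followed by ':' or '|',
    followed by end of string or '/' or '\\'.
    """
    if len(after_slashes) < 2:
        return False
    if after_slashes[0] not in string.ascii_letters:
        return False
    if after_slashes[1] not in ":|":
        return False
    # Either the string ends here, or the next character is a slash.
    return len(after_slashes) == 2 or after_slashes[2] in "/\\"
```

Under this sketch, "c:/" (from "file://c:/") and "C|" both count as drive letters, while "foo/bar" and "c:x" do not.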

Is this part of URL parsing or part of doing something with the
resulting URL?

In Gecko, it's part of URL parsing. More precisely, it's part of the normalization performed as part of constructing a "URL" object from a string. Since this is also how we parse URLs, it's effectively all part of the package.

But note that it would be a bit odd if file://c:/ claimed to have a host of "c" with a default port or some such...

5)  When parsing a "no authority" URL (including file://), and when item 4
above does not apply, it looks like Gecko skips everything after "file://"
up until the next '/', '?', or '#' char before parsing path stuff.
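
A rough Python sketch of the skipping described in item 5, assuming the input is whatever follows "file://" and that item 4 has already been ruled out. The function name is hypothetical and this is only an illustration of the description, not Gecko's implementation.

```python
def skip_to_path(after_slashes: str) -> str:
    """Hypothetical helper: drop everything up to the next '/', '?' or '#'.

    Returns the remainder starting at that delimiter, or an empty string
    if no delimiter is present.
    """
    for index, char in enumerate(after_slashes):
        if char in "/?#":
            return after_slashes[index:]
    return ""
```

With this sketch, "foo/bar" yields "/bar", so the "foo" that looks like a host plays no part in the parsed result.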

So the host is dropped?

In Gecko, I believe so, yes. I'm not saying this is desirable; just what Gecko does.

6)  On Windows and OS/2, when dynamically parsing a path for a "no
authority" URL (not sure whether this is actually web-exposed, fwiw...)
Gecko will do something involving looking for a path that's only an ASCII
letter followed by ':' or '|' followed by end of string.
...
7)  When doing URI equality comparisons
...
8)  When actually resolving a file:// URL

These points do not seem to be about parsing, correct?

Well, point 6 is about parsing, sort of.

7 and 8 are not, though at some point we'll need to define equality comparisons anyway.

-Boris

