On 11/02/2009, at 12:05 PM, Thomas Roessler wrote:
(diverting to www-talk, too...)
On 11 Feb 2009, at 01:20, Mark Nottingham wrote:
Yeah, I'm not completely happy with it yet. The thought was that
since blank lines don't introduce ambiguity here, they're not
harmful. OTOH one of my goals for the format is to allow existing
HTTP header and MIME parsers (e.g., in Python) to be used on the
format, and they very well may barf on a blank line.
Well, they'll barf on blank lines and declare the header over;
changing that within the parser (or just restarting it on the rest
of the file) should be relatively cheap.
This assumes that people will be comfortable modifying libraries. IME
people tend to treat them as magical black boxes that shouldn't be
opened (or even questioned) under any circumstances...
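For illustration, here is a minimal sketch of the "restart it on the rest of the file" idea, using Python's standard `email` parser as the stand-in for an existing MIME parser. The field names in `SAMPLE` and the helper names are hypothetical, not taken from the draft; the only library behaviour relied on is that the stock parser treats the first blank line as the end of the header block and hands back everything after it as the body.

```python
from email import message_from_string

# Hypothetical host-meta-style input with a blank line in the middle.
SAMPLE = "Field-A: one\nField-B: two\n\nField-C: three\n"


def parse_once(text):
    """Parse with the stock parser; the first blank line ends the header block."""
    return message_from_string(text).items()


def parse_with_restart(text):
    """Re-run the stock parser on whatever it left over as the body."""
    fields = []
    remaining = text
    while remaining.strip():
        # Drop leading blank lines so the parser sees a field line first.
        msg = message_from_string(remaining.lstrip("\r\n"))
        found = msg.items()
        if not found:
            # Nothing parseable as a field; stop rather than loop forever.
            break
        fields.extend(found)
        payload = msg.get_payload()
        remaining = payload if isinstance(payload, str) else ""
    return fields


print(parse_once(SAMPLE))
print(parse_with_restart(SAMPLE))
```

The wrapper leaves the library itself untouched, which may matter for people who would rather not open the black box.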
BTW, I notice that this draft is silent on the HTTP header syntax's
combining feature for multiple occurrences of the same field (last
paragraph of 4.2, RFC 2616); I suspect that to be one of the more
likely causes for surprises if HTTP header parsers are re-used. (No
such risk with MIME parsers.)
I'll add a note.
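As a rough sketch of the combining behaviour in question: RFC 2616, Section 4.2 allows multiple fields with the same name to be merged into one by appending each later value to the first, separated by a comma, with field names compared case-insensitively. The function name and sample fields below are hypothetical.

```python
def combine_fields(pairs):
    """Merge repeated field names into one comma-separated value, in order."""
    combined = {}
    for name, value in pairs:
        key = name.lower()  # field names are case-insensitive
        if key in combined:
            combined[key] = combined[key] + ", " + value
        else:
            combined[key] = value
    return combined


pairs = [("Field-A", "one"), ("field-a", "two"), ("Field-B", "x, y")]
print(combine_fields(pairs))
```

The surprise for a reused HTTP parser is that, after combining, a value that legitimately contains a comma (like `x, y` above) can no longer be told apart from two separate occurrences.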
Finally, why disallow whitespace stuffed folding? It's pretty
useful to make long lines editable, and I suspect that we're
assuming /host-meta to be the product of some human with emacs in
their hands. ;-) Implementing it is easy, and a given if existing
parsers are used.
Not necessarily; it's not very widely supported, IME.
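For reference, a minimal sketch of what unfolding involves, assuming the usual header convention that a line starting with a space or tab continues the previous line. The function name and sample lines are hypothetical, and this is not a claim about what any particular parser does.

```python
def unfold(lines):
    """Join continuation lines (leading space or tab) onto the previous line."""
    unfolded = []
    for line in lines:
        if line[:1] in (" ", "\t") and unfolded:
            # Replace the fold with a single space.
            unfolded[-1] = unfolded[-1].rstrip() + " " + line.strip()
        else:
            unfolded.append(line)
    return unfolded


print(unfold(["Field-A: a long", "  value", "Field-B: x"]))
```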
So, the right thing to do might be to explicitly disallow them,
both in BNF and prose. Eran, thoughts?
I'd just prefer to not have the BNF say "no empty lines", and then
have prose that says the opposite, but with a SHOULD.
5. Minting New meta-fields
Applications that wish to mint new meta-fields for use in the
host-meta format MUST register them in the host-meta
field-registry, following the procedures in Section 7.2. Field-names
MUST conform to the field-name ABNF Section 3, and field-value
syntax MUST be well-defined (e.g., using ABNF, or a reference to
the syntax of an existing header field-value). Field-values
SHOULD use the ISO-859-1 character encoding. If a field-value
applies to a scope other than the entire authority, that scope
MUST be well-defined.
Editorial nit: ISO-8859-1 is missing an 8 here.
That one always gets me, thanks.
More substantially, is there any particular reason to not just go
with utf-8 here? After all, the content type is *application*/
host-meta anyway.
Same as above; allowing existing parsers and serialisation
libraries to be used. That said, there have been many arguments in
HTTPbis that existing libraries won't harm non-ASCII characters in
transit, but IIRC no one has actually gone out and surveyed what
they do...
That suggests that it's a coin toss, unless the mythical "someone"
does that work. May I, in that event, suggest that we use a coin
biased in favor of broader internationalization, i.e., UTF-8?
Well, the other side of the coin is interoperability, something that
is also close to our collective hearts.
OTOH we're talking about a SHOULD here. Maybe it just needs more
careful guidance; i.e., that you should stick to ASCII unless you're
conveying elements for presentation to end users.
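A small sketch of why sticking to ASCII is the safe default: ASCII text has the same bytes in ISO-8859-1 and UTF-8, while anything beyond it does not, so a reader that guesses the wrong encoding either garbles the value or fails outright. The sample strings are hypothetical.

```python
ascii_value = "cafe"
non_ascii_value = "caf\u00e9"  # "café"

# ASCII-only values are byte-for-byte identical in both encodings.
same_bytes = ascii_value.encode("utf-8") == ascii_value.encode("iso-8859-1")

# UTF-8 bytes read as ISO-8859-1 decode without error but are garbled.
garbled = non_ascii_value.encode("utf-8").decode("iso-8859-1")

# ISO-8859-1 bytes read as UTF-8 are rejected.
try:
    non_ascii_value.encode("iso-8859-1").decode("utf-8")
    rejected = False
except UnicodeDecodeError:
    rejected = True

print(same_bytes, garbled, rejected)
```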
--
Mark Nottingham m...@yahoo-inc.com