(diverting to www-talk, too...)
On 11 Feb 2009, at 01:20, Mark Nottingham wrote:
Yeah, I'm not completely happy with it yet. The thought was that
since blank lines don't introduce ambiguity here, they're not
harmful. OTOH one of my goals for the format is to allow existing
HTTP header and MIME parsers (e.g., in Python) to be used on the
format, and they very well may barf on a blank line.
Well, they'll barf on blank lines and declare the header over;
changing that within the parser (or just restarting it on the rest of
the file) should be relatively cheap.
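To make the "restart it on the rest of the file" option concrete, here is a minimal sketch, assuming Python's stdlib email parser stands in for the "existing parser". The name parse_blocks is made up for illustration and is not part of the draft:

```python
from email.parser import Parser

def parse_blocks(text):
    """Parse header blocks separated by blank lines.

    The stdlib parser treats the first blank line as the end of the
    header, so we simply re-run it on whatever it left in the body.
    """
    blocks = []
    parser = Parser()
    while text.strip():
        msg = parser.parsestr(text.lstrip("\n"))
        if not msg.keys():
            # Nothing parsed as a header field; stop rather than loop forever.
            break
        blocks.append(msg.items())
        text = msg.get_payload()
    return blocks

print(parse_blocks("A: 1\nB: 2\n\nC: 3\n"))
# [[('A', '1'), ('B', '2')], [('C', '3')]]
```

The loop is only a handful of lines, which is roughly what "relatively cheap" amounts to here; a parser that cannot hand back the unparsed remainder would need a different approach.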
BTW, I notice that this draft is silent on the HTTP header syntax's
combining feature for multiple occurrences of the same field (last
paragraph of 4.2, RFC 2616); I suspect that to be one of the more
likely causes for surprises if HTTP header parsers are re-used. (No
such risk with MIME parsers.)
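For illustration, a rough sketch of what that combining rule does to repeated fields, assuming the comma-joining behaviour described in RFC 2616 section 4.2 and case-insensitive field names. The helper name combined_fields is hypothetical:

```python
from email.parser import Parser

def combined_fields(text):
    """Merge repeated fields the way an HTTP-style parser may do:
    later values are appended to the first, separated by a comma."""
    msg = Parser().parsestr(text, headersonly=True)
    combined = {}
    for name, value in msg.items():
        key = name.lower()  # field names compare case-insensitively
        if key in combined:
            combined[key] += ", " + value
        else:
            combined[key] = value
    return combined

print(combined_fields("Link: a\nlink: b\nFoo: c\n"))
# {'link': 'a, b', 'foo': 'c'}
```

A field whose value syntax itself allows commas can no longer be split back into its original occurrences after this step, which is where the surprises would come from.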
Finally, why disallow whitespace-stuffed folding? It's pretty useful
to make long lines editable, and I suspect that we're assuming
/host-meta to be the product of some human with emacs in their hands. ;-)
Implementing it is easy, and a given if existing parsers are used.
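As a sketch of how little is involved, assuming the usual header convention that a line starting with a space or tab continues the previous line and that the fold may be replaced by a single space (unfold is an illustrative name):

```python
def unfold(text):
    """Join whitespace-stuffed continuation lines onto the line before."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            # Continuation line: collapse the fold into one space.
            lines[-1] = lines[-1].rstrip() + " " + raw.strip()
        else:
            lines.append(raw)
    return lines

folded = 'Link: <http://example.com/a>;\n  rel="describedby"\nFoo: bar'
print(unfold(folded))
# ['Link: <http://example.com/a>; rel="describedby"', 'Foo: bar']
```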
So, the right thing to do might be to explicitly disallow them, both
in BNF and prose. Eran, thoughts?
I'd just prefer to not have the BNF say "no empty lines", and then
have prose that says the opposite, but with a SHOULD.
5. Minting New meta-fields
Applications that wish to mint new meta-fields for use in the
host-meta format MUST register them in the host-meta
field-registry, following the procedures in Section 7.2. Field-names
MUST conform to the field-name ABNF Section 3, and field-value
syntax MUST be well-defined (e.g., using ABNF, or a reference to
the syntax of an existing header field-value). Field-values SHOULD
use the ISO-859-1 character encoding. If a field-value applies to
a scope other than the entire authority, that scope MUST be well-
defined.
Editorial nit: ISO-8859-1 is missing an 8 here.
That one always gets me, thanks.
More substantially, is there any particular reason to not just go
with utf-8 here? After all, the content type is
*application*/host-meta anyway.
Same as above; allowing existing parsers and serialisation libraries
to be used. That said, there have been many arguments in HTTPbis
that existing libraries won't harm non-ASCII characters in transit,
but IIRC no one has actually gone out and surveyed what they do...
That suggests that it's a coin toss, unless the mythical "someone"
does that work. May I, in that event, suggest that we use a coin
biased in favor of broader internationalization, i.e., UTF-8?
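To show what is at stake in that coin toss, a small sketch of what happens when writer and reader disagree on the charset. It only exercises Python's own codecs and says nothing about what any given HTTP library does in transit; roundtrip is a made-up helper:

```python
def roundtrip(value, write_as, read_as):
    """Encode with one charset, decode with another; report failures by name."""
    try:
        return value.encode(write_as).decode(read_as)
    except UnicodeError as exc:
        return type(exc).__name__

name = "Zo\u00eb"  # "Zoë", representable in both charsets

print(roundtrip(name, "utf-8", "utf-8"))            # Zoë
print(roundtrip(name, "utf-8", "iso-8859-1"))       # ZoÃ«
print(roundtrip(name, "iso-8859-1", "utf-8"))       # UnicodeDecodeError
print(roundtrip("\u65e5\u672c", "iso-8859-1", "utf-8"))  # UnicodeEncodeError
```

The last case is the internationalization argument in one line: text outside Latin-1 cannot be written as ISO-8859-1 at all, whereas UTF-8 can carry it.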