Dave,

On 11/08/2012 03:59, Dave Thaler wrote:
> Brian Carpenter writes:
>> On 09/08/2012 22:31, Stuart Cheshire wrote:
>>> At the meeting in Vancouver, Dave Thaler made a point that I found
>>> convincing:
>>>
>>> Where is the character set for IPv6 zone IDs specified?
>> RFC 4007 doesn't do so, but can be read to imply ASCII.
>
> How? RFC 4007 says:
>> An implementation MAY support other kinds of non-null strings as
>> <zone_id>. However, the strings must not conflict with the delimiter
>> character. The precise format and semantics of additional strings is
>> implementation dependent.

Yes, it says that, but the context to me implies ASCII. We could argue
about that for a long time, so let's not bother...

>
> So it's completely implementation dependent, the only restriction being
> that % and null are disallowed.
>
>> draft-ietf-6man-uri-zoneid-02 is explicit that it refers to
>> the URI character set, which is ASCII:
>>
>> A <zone_id> SHOULD contain only ASCII characters classified
>> in RFC 3986 as "unreserved".

The draft isn't clear enough (yet) but my idea was that this was part
of the update to RFC 4007.

>>
>> But it allows percent encoding in a URI, which is necessary because of
>> the SHOULD:
>>
>> ZoneID = 1*( unreserved / pct-encoded )
>
> ZoneID needs to allow (including via percent-encoding) the same characters
> as are allowed in <zone_id> in RFC 4007. For example the ']'
> character would be legal in RFC 4007 but would have to be percent
> encoded in a URI.

Yes

>
>>> If we accept
>>> that future interface names might include non-roman characters, then we
>>> have to assume that to allow safe unambiguous use in URIs, interface
>>> names have to undergo escaping.
>> If we want to internationalise the ZoneID, that would be a whole
>> other discussion.
>
> It's already allowed by RFC 4007 as far as I know.

Well, again, it's a matter of interpretation; the question is simply
not addressed, which is a defect in the document IMHO.

>
> Stuart's email is an accurate summary of my position.

Yes, but that doesn't help with the %251 problem, which is where we got
stuck some months ago and came to the initial decision to add a new
delimiter. If people don't want to solve that problem, i.e. accept that
%251 in a URI is %1 in ping, and that %251 in ping is %25251 in a URI,
then we're done.

I'm here as a document editor, looking for guidance.

    Brian

>
> -Dave
>
>>> And if the interface name itself is going to be escaped using URI "%xx"
>>> notation, then why not escape the '%' the same way?
>> My impression is that this WG has already objected to that, which is why
>> we ended up with the current proposal. I leave the next step to the WG Chair.
>>
>> Brian
>>
>>> This argues in support of what Microsoft already did: Encode '%' as "%25".
>>>
>>> It's not my favourite outcome, but based on Dave Thaler's comment, it's
>>> the one that gets my vote.
>>>
>>> In the spirit of "be liberal with what you accept" the doc should also
>>> advocate that URI parsers are forgiving about accepting bare '%' signs
>>> -- i.e. a '%' not followed by two valid hex characters is left
>>> untouched. This lets a human user copy-and-paste "fe80::a%en1" from a
>>> "ping" command and have it work, though the strictly correct form (which
>>> URI generators should output) remains "fe80::a%25en1".
>>>
>>> Stuart Cheshire
>
--------------------------------------------------------------------
IETF IPv6 working group mailing list
ipv6@ietf.org
Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------
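
For illustration only, the encoding rules debated in this thread can be sketched in Python. This is a minimal sketch, not text from the draft or from RFC 4007. The function names `zone_to_uri` and `uri_to_zone_lenient` are hypothetical, and the sketch assumes non-ASCII zone IDs would be percent-encoded as UTF-8, which neither document specifies.

```python
import re
from urllib.parse import quote

# Hypothetical helper names; neither RFC 4007 nor the draft defines an API.
_PCT_TRIPLET = re.compile(rb"%([0-9A-Fa-f]{2})")


def zone_to_uri(addr_with_zone: str) -> str:
    """Turn the 'ping' form (fe80::a%en1) into the URI form (fe80::a%25en1)."""
    addr, delim, zone = addr_with_zone.partition("%")
    if not delim:
        return addr  # no zone ID present
    # quote() leaves RFC 3986 "unreserved" characters alone and
    # percent-encodes everything else (assumption: as UTF-8 bytes).
    return addr + "%25" + quote(zone, safe="")


def uri_to_zone_lenient(host: str) -> str:
    """Decode %xx triplets; a '%' not followed by two hex digits stays as-is."""
    raw = _PCT_TRIPLET.sub(
        lambda m: bytes([int(m.group(1), 16)]), host.encode("utf-8")
    )
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(zone_to_uri("fe80::a%en1"))
    print(uri_to_zone_lenient("fe80::a%en1"))
    print(uri_to_zone_lenient("fe80::a%251"))
```

With these rules, a zone ID of "251" pasted unescaped into a URI is indistinguishable from an escaped zone ID of "1", which is the %251 ambiguity described above; the lenient decoder only helps when the zone ID does not begin with two hex digits.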