Dave,

On 11/08/2012 03:59, Dave Thaler wrote:
> Brian Carpenter writes:
>> On 09/08/2012 22:31, Stuart Cheshire wrote:
>>> At the meeting in Vancouver, Dave Thaler made a point that I found
>>> convincing:
>>>
>>> Where is the character set for IPv6 zone IDs specified?
>> RFC 4007 doesn't do so, but can be read to imply ASCII.
> 
> How?  RFC 4007 says:
>> An implementation MAY support other kinds of non-null strings as
>> <zone_id>.  However, the strings must not conflict with the delimiter
>> character.  The precise format and semantics of additional strings is
>> implementation dependent.

Yes, it says that, but the context to me implies ASCII. We could argue
about that for a long time, so let's not bother...

> 
> So it's completely implementation dependent, the only restriction being
> that % and null are disallowed.
> 
>> draft-ietf-6man-uri-zoneid-02 is explicit that it refers to
>> the URI character set, which is ASCII:
>>
>>    A <zone_id> SHOULD contain only ASCII characters classified
>>    in RFC 3986 as "unreserved".

The draft isn't clear enough (yet), but my intent was that this
restriction would be part of the update to RFC 4007.

>>
>> But it allows percent encoding in a URI, which is necessary because of
>> the SHOULD:
>>
>>    ZoneID = 1*( unreserved / pct-encoded )
> 
> ZoneID needs to allow (including via percent-encoding) the same characters
> as are allowed in <zone_id> in RFC 4007.  For example the ']'
> character would be legal in RFC 4007 but would have to be percent
> encoded in a URI.

Yes
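
As a rough illustration of that rule (a sketch only, not text from the
draft or from RFC 4007): the Python below checks a string against the
ZoneID production quoted above and percent-encodes anything outside
"unreserved". The names is_valid_zone_id and encode_zone_id are
hypothetical, and encoding non-ASCII characters as UTF-8 bytes is an
assumption, since neither document specifies a character set.

```python
import re
from urllib.parse import quote

# ZoneID = 1*( unreserved / pct-encoded ), with "unreserved" as in RFC 3986:
# ALPHA / DIGIT / "-" / "." / "_" / "~"
_ZONE_ID = re.compile(r"(?:[A-Za-z0-9._~-]|%[0-9A-Fa-f]{2})+")


def is_valid_zone_id(text: str) -> bool:
    """Return True if text matches the ZoneID production (URI form)."""
    return _ZONE_ID.fullmatch(text) is not None


def encode_zone_id(zone: str) -> str:
    """Percent-encode an RFC 4007 <zone_id> for use in a URI.

    With safe="" only letters, digits and "-._~" are left as-is
    (Python 3.7+); non-ASCII input is encoded as UTF-8, which is an
    assumption made for this sketch.
    """
    return quote(zone, safe="")
```

With this, encode_zone_id("en]1") gives "en%5D1", which passes
is_valid_zone_id, while the raw "en]1" does not.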

> 
>>> If we accept
>>> that future interface names might include non-roman characters, then we
>>> have to assume that to allow safe unambiguous use in URIs, interface
>>> names have to undergo escaping.
>> If we want to internationalise the ZoneID, that would be a whole
>> other discussion.
> 
> It's already allowed by RFC 4007 as far as I know.

Well, again, it's a matter of interpretation; the question is
simply not addressed, which is a defect in the document IMHO.

> 
> Stuart's email is an accurate summary of my position.

Yes, but that doesn't help with the %251 problem, which is where
we got stuck some months ago and came to the initial decision to
add a new delimiter. If people don't want to solve that problem,
i.e. if they accept that %251 in a URI is %1 in ping, and that %251
in ping is %25251 in a URI, then we're done.
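
To make the mapping concrete, here is a minimal sketch of the "%25"
scheme, assuming the delimiter is always written as "%25" and the zone
itself is percent-encoded. The function names ping_to_uri and
uri_to_ping are hypothetical, and only the host part (without the
square brackets) is handled.

```python
from urllib.parse import quote, unquote


def ping_to_uri(literal: str) -> str:
    """Map 'address%zone' (the form typed into ping) to the URI host form."""
    address, delimiter, zone = literal.partition("%")
    if not delimiter:
        return address  # no zone present
    # The delimiter becomes "%25"; the zone is percent-encoded as needed.
    return address + "%25" + quote(zone, safe="")


def uri_to_ping(host: str) -> str:
    """Inverse mapping: percent-decode the URI host form exactly once."""
    return unquote(host)


print(ping_to_uri("fe80::a%1"))      # fe80::a%251
print(ping_to_uri("fe80::a%251"))    # fe80::a%25251
print(uri_to_ping("fe80::a%251"))    # fe80::a%1
```

The same string "fe80::a%251" therefore names zone "1" when it appears
in a URI and zone "251" when it is typed into ping.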

I'm here as a document editor, looking for guidance.

     Brian

> 
> -Dave
> 
>>> And if the interface name itself is going to be escaped using URI "%xx"
>>> notation, then why not escape the '%' the same way?
>> My impression is that this WG has already objected to that, which is why
>> we ended up with the current proposal. I leave the next step to the WG Chair.
>>
>>    Brian
>>
>>> This argues in support of what Microsoft already did: Encode '%' as "%25".
>>>
>>> It's not my favourite outcome, but based on Dave Thaler's comment, it's
>>> the one that gets my vote.
>>>
>>> In the spirit of "be liberal with what you accept" the doc should also
>>> advocate that URI parsers are forgiving about accepting bare '%' signs
>>> -- i.e. a '%' not followed by two valid hex characters is left
>>> untouched. This lets a human user copy-and-paste "fe80::a%en1" from a
>>> "ping" command and have it work, though the strictly correct form (which
>>> URI generators should output) remains "fe80::a%25en1".
>>>
>>> Stuart Cheshire
> 
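
For reference, the forgiving parser Stuart describes above could look
roughly like the sketch below: a '%' followed by two hex digits is
decoded, and any other '%' is left untouched. The name lenient_decode
is hypothetical, and the sketch only handles ASCII escapes.

```python
import re

# A '%' followed by exactly two hex digits is treated as an escape.
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def lenient_decode(host: str) -> str:
    """Decode valid %xx escapes once; leave any other '%' as-is."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), host)


print(lenient_decode("fe80::a%25en1"))  # fe80::a%en1
print(lenient_decode("fe80::a%en1"))    # fe80::a%en1
```

Note that a pasted zone whose name begins with two hex digits is still
decoded as an escape, so "fe80::a%251" comes out as "fe80::a%1".
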
--------------------------------------------------------------------
IETF IPv6 working group mailing list
ipv6@ietf.org
Administrative Requests: https://www.ietf.org/mailman/listinfo/ipv6
--------------------------------------------------------------------
