> Keith,
>
> It seems that your objections to this proposal are based on a very
> different view of what constitutes a "resource" to that which is
> understood in circles where URIs are commonly used. Some edge-cases
> may have been a matter for debate, but a good working approximation
> is "anything that can be identified by a URI".
In other words, anything you can attach a name to is a resource.

> Spurred by XML and related technologies (which I assert are far more
> than mere "fashion") we are seeing URIs used for a wide range of
> purposes which are not constrained by a requirement for dereferencing.
> The use of URIs for identifying arbitrary things is now a fact of
> life, and in some technical domains is proving to be extremely
> useful. You claim "harm", but I recognize no such harm.

Clarification: I claim "harm" for the proposed use of *URNs*, because
URNs were designed to be long-term stable names for (at least
potentially) network-accessible resources, whereas the proposal is to
use them as a way of generating globally unique strings, like UUIDs
or OIDs.

> Having different syntactic contexts in which names are used will
> inevitably lead to different syntactic name forms. I submit that the
> real challenge here is not to prevent the use of varying syntax, but
> to lock the various syntactic forms to a common semantic definition

Oddly enough, having different syntactic contexts also tends to cause
differences in semantic definition. In one syntactic context the
order of elements can be significant, whereas in another it is not.
One syntactic context is designed to allow individual components to
be accessed independently of the others, while another expects the
entire resource description to be available to the consumer. One
makes it easy to group related items; another has no way of
representing relationships between items. The semantic definitions
tend to be influenced by these factors.

I'm all for reuse of data models where it makes sense, but if the
goal is really to "lock the various syntactic forms to a common
semantic definition" (presumably one which is compatible with XML)
then I take strong issue with that, as the XML model is quite
dysfunctional for many purposes. (As are the others; it's just that
XML is the current bandwagon.)

> -- in this case, providing a way to create syntactic URI forms that
> can be bound to protocol semantics in a way that inhibits semantic
> drift between the different forms.

But such drift is almost inevitable. You can't recast some existing
data structure in XML, use it widely, and expect the meanings of the
protocol elements to stay the same. In essentially every example I've
seen of an attempt to do this, the meanings of the protocol elements
are changed subtly from the very beginning, usually by trying to use
XML structure to represent relationships that aren't explicit in the
original data model.

More generally, an XML representation of a data model will get used
differently than the original representation, and the semantics of
the individual protocol elements will almost certainly drift as a
result. (Actually, this happens even when you use the same
representation. RFC 822 headers had subtly different meanings on
BITNET than on the Internet, because there were enough differences
between the two user communities and the mail-reading programs they
used. Similarly, casting a data model into XML means that a different
set of tools will be used to access and manipulate that data -
indeed, that is the entire point of doing so - but this *will* cause
semantic drift in the data model between the two environments.)

Using URIs for the names of the data elements won't stop that kind of
drift.
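To make the structural point concrete, here is a minimal sketch in
Python (the XML element names are my own invention, not taken from
any specification) of the same two fields carried as flat header
fields and as nested XML. The XML form has a document order and a
containment hierarchy that simply do not exist in the flat model, and
that is exactly where the subtle reinterpretation starts:

    import email.parser
    import xml.etree.ElementTree as ET

    # Flat, header-style representation: fields are name/value pairs.
    # Order is not significant, and no field "contains" another.
    msg = email.parser.Parser().parsestr(
        "Subject: test\n"
        "X-Priority: 1\n"
        "\n"
    )
    print(sorted(msg.items()))       # order can be discarded freely

    # The same data recast in XML (hypothetical element names).  Now
    # there is a document order and a parent/child structure, and
    # consumers will start attaching meaning to both.
    doc = ET.fromstring(
        "<message>"
        "<header name='Subject'>test</header>"
        "<header name='X-Priority'>1</header>"
        "</message>"
    )
    for h in doc.iter("header"):     # iteration order IS defined here
        print(h.get("name"), h.text)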
> One of the motivating factors in this work (for me, at least, and I
> think for others) has been to draw together some of the divergent
> strands of thinking that are taking place in the IETF and W3C. W3C
> are fundamentally set on a course of using URIs as a generic space
> of identifiers. IETF have a number of well-established protocols
> that use registries to allocate names. Neither of these are going to
> change in the foreseeable future. So do we accept a Balkanization of
> Internet standards efforts, or do we try to draw them together?

Some things don't mix very well, even if they are quite useful
individually. The traditional examples are oil and water.

> A particular case in point is content negotiation. The IETF have
> prepared a specification for describing media features that uses a
> traditional form of IANA registry to bind names to features. In
> parallel with this, W3C have prepared a specification which has some
> similar goals, but which uses URIs to represent media features, and
> relies on the normal URI allocation framework to ensure the minting
> of unique names as and when needed. (I have some reservations about
> this, but that can't change what is actually happening.)

But neither do we have to endorse it just so they will use our stuff.
Especially when their using our stuff dilutes the utility of our
stuff by not requiring widespread agreement on the media features
used.

> This URN namespace proposal will provide a way to incorporate the
> IETF feature registry directly into the W3C work, in a way which is
> traceable through IETF specifications. Without this, I predict that
> the parties who are looking to use the W3C work (notably, mobile
> phone companies) will simply go away and invent their own set of
> media features, without any kind of clear relationship to the IETF
> features.

The W3C approach is encouraging them to do this anyway, by having all
media features be URIs that anyone can create and assign without any
agreement from anyone else.

> I also observe that IETF and W3C operate against somewhat differing
> background assumptions: the IETF focus on wire protocols means that
> the context in which a PDU is processed is well-understood, pretty
> much by definition of the protocol. We have protocol rendezvous
> mechanisms and state-machines and synchronization techniques that
> reduce the amount of explicit information that is needed to be
> exchanged between parties -- this is all part of efficient protocol
> design. The work of W3C (and other designers working "over the
> stack") often depends on obviating such contextual assumptions, and
> in such cases the global (context-free) qualities of URIs are
> extremely valuable. If these layers were truly isolated from each
> other, this debate would probably never arise. But there is genuine
> leakage: client preferences depend on underlying hardware
> capabilities; trust decisions may incorporate protocol addressing
> and other information, etc., etc. This proposal to allow IETF
> protocol parameter identifiers to be embedded in URI space is one
> way of controlling information in these cross-layer interactions.

I think it would be far more useful to think of things in terms of
the mapping/translation process rather than just assigning alternate
names to the protocol elements. If it happens that the translation
doesn't affect the semantics of individual elements at all, that's a
good thing, but my experience suggests that it often will. And you
don't know for sure until you start looking at how those protocol
elements will actually be used in the new environment.
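To illustrate the aliasing that the proposal sets up, here is a rough
sketch in Python. The bare tag "pix-x" is taken from the IETF media
feature work; the URN form is only my guess at the proposed style,
and the vendor URI is entirely hypothetical. Once several names exist
for one feature, nothing in the naming machinery keeps their meanings
locked together:

    # One media feature, three names.
    FEATURE_NAMES = {
        "iana-tag":   "pix-x",                        # IANA registry tag
        "ietf-urn":   "urn:ietf:params:media-feature:pix-x",  # guessed style
        "vendor-uri": "http://example.com/features#horizontalPixels",
    }

    def same_feature(a: str, b: str) -> bool:
        # This toy version just checks a hand-maintained alias table;
        # nothing in the names themselves reveals the correspondence,
        # or guarantees the aliases keep meaning the same thing.
        return {a, b} <= set(FEATURE_NAMES.values())

    print(same_feature(FEATURE_NAMES["iana-tag"],
                       FEATURE_NAMES["vendor-uri"]))  # True, by fiat only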
> Another different assumption between wire-protocols and application
> data formats: protocols are very binary -- either one is using a
> particular protocol or one is not. The years-long Internet Fax
> debates about adapting email for real-time image transmission made
> that very clear. It is not permissible to simply assume that a
> communicating party understands anything beyond the standardized
> protocol elements. And there is a very clear distinction in protocol
> specifications between what is standardized and what is private
> extension. This distinction is not so clear in application data
> formats, and while there may be a core of standardized data
> elements, it is often desirable for communities of users (or
> application designers) to agree on some common extensions -- this is
> typical of how XML application formats are deployed. Using URIs as
> identifiers (e.g. in the case of XML, as namespace identifiers)
> allows for more flexible deployment of formats, avoiding the
> problems of "X-headers" that have for so long been a bane of IETF
> application standardization/extension efforts.

Actually, they are X- headers, just globally unique ones. I'll freely
admit that such extensibility can be useful, and that having
distributed assignment of globally unique names for extension fields
is a good idea (though I've rarely seen a conflict between X-
headers), but it's a huge stretch to say that all fields should be
defined this way. (Also, I don't think that X- headers are a "bane"
or ever have been; they seem to cause far less harm than improper use
of non-X- fields; they're just currently out of fashion for reasons I
cannot fathom.)

> In summary: URIs *will* be used to identify protocol parameters. The
> IETF cannot prevent that. What the IETF can do by supporting a
> particular form of such use is to try to ensure that such use
> remains bound by a clear, authoritative chain of specifications to
> the IETF specification of what such parameters mean. The harm that
> comes from not doing this, in my view, is that we end up with a
> multiplicity of URIs that mean nearly, but not quite, the same thing
> as an IETF protocol parameter. That outcome, I submit, cannot be
> good for longer-term interoperability between IETF and other
> organizations' specifications.

The likely consequence of what is being proposed is for the URIs that
we define to mean nearly, but not quite, the same thing as an IETF
protocol parameter - but we have to try to pretend that they mean the
same thing. And it will degrade interoperability.

> > d) embed NO visible structure in the URNs - just assign each
> >    parameter value a sequence number. People who want to use
> >    those URNs in XML or whatever would need to look them up at
> >    IANA's web site.
>
> I disagree. This requirement actively works against one of the
> motivations for using URIs in application data formats; that there
> be a scalable framework for different organizations and persons to
> mint their own identifiers.

The fact that people want to use URIs in this way does not mean that
it's appropriate to use URNs in this way. If people want to mint
their own URNs, then they have to follow the rules for URNs. Those
rules *do not* permit arbitrary organizations and persons to mint
their own identifiers without explicit delegation from a URN
namespace, for very good reasons which are consistent with URNs'
purposes. The very temptation to treat URNs as if they were as
malleable as other URIs is part of what makes this proposal
dangerous.

Since I think that URNs *will* be widely misused if they are used for
protocol elements, I'd far rather have IANA assign ordinary URIs for
this - then we will still get semantic drift, but at least it won't
dilute the value of URNs.
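For reference, those rules impose a two-level structure: a namespace
identifier (NID) registered with IANA, followed by a
namespace-specific string (NSS) whose assignment rules are set by
that namespace's registration, not by whoever happens to want a name.
A minimal sketch in Python, with the syntax simplified from RFC 2141:

    import re

    # Simplified from the URN syntax of RFC 2141: "urn:" NID ":" NSS.
    # The real NSS grammar is more involved; this is illustration only.
    URN_PATTERN = re.compile(
        r"^urn:"
        r"(?P<nid>[a-zA-Z0-9][a-zA-Z0-9-]{0,31}):"  # NID: registered with IANA
        r"(?P<nss>\S+)$"                            # NSS: governed by the NID
    )

    def parse_urn(urn):
        m = URN_PATTERN.match(urn)
        if m is None:
            raise ValueError("not a URN: %r" % urn)
        return m.group("nid"), m.group("nss")

    # "isbn" is a registered namespace; assignment within it is
    # delegated to the ISBN authority, not open to arbitrary parties.
    print(parse_urn("urn:isbn:0-395-36341-1"))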
> To use an identifier, one must:
>
> (i) have a framework for assigning identifier values, in such a way
> that it is possible by some means for a human to locate its defining
> specification. I can't see how to do this without exploiting a
> visible syntactic structure in the name.

ISBNs do not have a visible syntactic structure - at least, not an
obvious one. But they're quite frequently used to look up book
information.

> (ii) have a framework for actually using the identifier in an
> application: in this case, I agree that the identifier should
> generally be treated as opaque.
>
> Also, I think (d) contradicts your goal (a): I cannot conceive any
> scalable resolution mechanism that does not in some sense depend on
> syntactic decomposition of the name.

You should really read up on the CNRI Handle System, then. There are
a lot of things I don't like about it, but it really was designed to
have exactly this property.

Keith