Re: Unicode 3.1: incomplete tags considered harmless/useful

DougEwell2 Wed, 31 Jan 2001 23:26:44 -0800
In a message dated 2001-01-31 12:19:33 Pacific Standard Time, 
[EMAIL PROTECTED] writes:

>  The section "Dangers of Incomplete Support" in section 13.7 seems to me
>  to be far too strongly worded; it should be weakened or removed
>  altogether.
>  
>  In particular, there is no reason why sequences of tag characters
>  not beginning with LANGUAGE TAG or CANCEL TAG cannot be used
>  for various purposes by private agreement.  However, as currently
>  worded, language-tag-interpreting applications SHOULD remove them,
>  contrary to the usual Unicode view of not-understood content
>  ("leave it alone").

What would be the meaning or benefit of a sequence of tag characters *not* 
beginning with a tag header in the range U+E0001 through U+E001F?  We are 
already promised that tag characters may only be used to form valid tags, so 
I don't see any benefit in allowing their use for privately defined purposes.
But clearly the restriction to U+E0001 LANGUAGE TAG and U+E007F CANCEL TAG 
will be inappropriate as soon as another type of tag is defined.
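
To make the distinction concrete, here is a rough Python sketch of how a 
process might sort runs of tag characters by their first code point.  The 
function names and the four labels are made up for illustration, and the 
sketch treats the whole block U+E0000 through U+E007F as "tag characters", 
which is a simplifying assumption.  It only looks at the first character of 
each run, so several tags written back to back would be reported as one run.

```python
LANGUAGE_TAG = 0xE0001
CANCEL_TAG = 0xE007F


def is_tag_char(cp):
    # Assumption: anything in the block U+E0000..U+E007F counts.
    return 0xE0000 <= cp <= 0xE007F


def is_tag_header(cp):
    # The tag header range mentioned above: U+E0001..U+E001F.
    return 0xE0001 <= cp <= 0xE001F


def tag_runs(text):
    """Yield each maximal run of consecutive tag code points as a list."""
    run = []
    for ch in text:
        cp = ord(ch)
        if is_tag_char(cp):
            run.append(cp)
        elif run:
            yield run
            run = []
    if run:
        yield run


def classify(run):
    """Label a run by its first code point (labels are hypothetical)."""
    first = run[0]
    if first == CANCEL_TAG:
        return "cancel"
    if first == LANGUAGE_TAG:
        return "language"
    if is_tag_header(first):
        return "unknown-header"  # a tag type not defined yet
    return "headerless"          # the case being debated here


sample = "\U000E0001\U000E0065\U000E006Eabc\U000E0066\U000E0072x\U000E007F"
labels = [classify(r) for r in tag_runs(sample)]
```

A check hard-wired to LANGUAGE TAG and CANCEL TAG would lump the 
"unknown-header" case in with "headerless", which is the problem with the 
current wording once another tag type is defined.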

>  Nor is there any reason why a CANCEL TAG should be required to exist for
>  every LANGUAGE TAG; in particular, a LANGUAGE TAG at the beginning
>  of plain text that is meant to apply to the whole text (document,
>  human-readable-string in protocols, etc.) should be unproblematic.
>  As currently worded, editors SHOULD not permit such uses.

This makes sense, and in fact I was not aware of any such requirement.  
Technical Report #7 specifically mentions the legitimate possibility of 
language-tagged text going out of scope (i.e. hitting EOF) without a CANCEL 
TAG.
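
As a rough illustration of that scoping behavior, the Python sketch below 
tracks which language tag, if any, is still in effect when the text ends.  
It assumes the usual reading of the tagging scheme: LANGUAGE TAG followed by 
tag characters U+E0020 through U+E007E spells the tag, a later language tag 
replaces the earlier one, and CANCEL TAG (alone or right after LANGUAGE TAG) 
ends the scope.  The function name is hypothetical, and no validation of the 
tag contents is attempted.

```python
LANGUAGE_TAG = 0xE0001
CANCEL_TAG = 0xE007F


def language_at_end(text):
    """Return the language tag still in scope at end of text, or None."""
    cps = [ord(c) for c in text]
    current = None
    i = 0
    while i < len(cps):
        cp = cps[i]
        if cp == LANGUAGE_TAG:
            i += 1
            if i < len(cps) and cps[i] == CANCEL_TAG:
                # LANGUAGE TAG + CANCEL TAG: cancel the language tag.
                current = None
                i += 1
                continue
            chars = []
            while i < len(cps) and 0xE0020 <= cps[i] <= 0xE007E:
                # Tag characters mirror ASCII, offset by 0xE0000.
                chars.append(chr(cps[i] - 0xE0000))
                i += 1
            current = "".join(chars) or None
        elif cp == CANCEL_TAG:
            # A bare CANCEL TAG cancels whatever is in scope.
            current = None
            i += 1
        else:
            i += 1
    return current


tagged = "\U000E0001\U000E0065\U000E006EHello, world"
still_open = language_at_end(tagged)
closed = language_at_end(tagged + "\U000E007F")
```

Here the first string hits EOF with its language tag still in scope, and 
nothing about that needs to be treated as an error.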

-Doug Ewell
 Fullerton, California