I have two questions about Plane 14 language tags as specified in
Technical Report #7:
1. Does anyone know of any implementation that interprets language tags
and actually does something with the result? I'm not looking for
code, just information and ideas.
2. (Ken and Glenn) Can you explain in a little more detail the rationale
for lowercasing the entire language tag? It seems that if RFC 1766
is the model to be followed, then the RFC 1766 casing convention
(lowercase for language, uppercase for country) might be preferred.
The exact text in the TR follows:
> Since RFC 1766 specifies that language tags are not case significant,
> it is recommended that for language tags, the entire tag be lowercased
> before conversion to Plane 14 tag characters. (This would not be
> required for Unicode conformance, but should be followed as general
> practice by protocols making use of RFC 1766 language tags, to simplify
> and speed up the processing for operations which need to identify or
> ignore language tags embedded in text.) Lowercasing, rather than
> uppercasing, is recommended because it follows the majority practice of
> expressing language tag values in lowercase letters.
I guess I don't see how lowercasing the entire tag simplifies or
speeds up anything. The hyphen that separates language from country
is outside the range of lowercase letters anyway, and processes that
want to ignore language tags must ignore the entire range from
U+E0000 through U+E007F regardless of case.
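To make the point concrete, here is a rough Python sketch of how I
read the conversion: each ASCII character of the RFC 1766 tag is
offset by 0xE0000 and the result is prefixed with U+E0001 LANGUAGE
TAG. The function names and the "lowercase" switch are my own
illustration, not anything taken from the TR.

```python
TAG_BASE = 0xE0000           # Plane 14 tag characters mirror ASCII at this offset
LANGUAGE_TAG = "\U000E0001"  # U+E0001 LANGUAGE TAG, which introduces a language tag


def to_plane14(tag: str, lowercase: bool = True) -> str:
    """Convert an RFC 1766 tag such as 'en-US' to Plane 14 tag characters.

    Hypothetical helper: 'lowercase' toggles the TR's recommendation so the
    two casing conventions can be compared.
    """
    if lowercase:
        tag = tag.lower()
    if not all(0x20 <= ord(c) <= 0x7E for c in tag):
        raise ValueError("tag must consist of printable ASCII characters")
    return LANGUAGE_TAG + "".join(chr(TAG_BASE + ord(c)) for c in tag)


def strip_tags(text: str) -> str:
    """Drop every character in U+E0000..U+E007F, whatever its case."""
    return "".join(c for c in text if not 0xE0000 <= ord(c) <= 0xE007F)


lowered = to_plane14("en-US")                    # TR recommendation
mixed = to_plane14("en-US", lowercase=False)     # RFC 1766 casing convention

# Ignoring tags is a single range test and gives the same result either way.
assert strip_tags(lowered + "hello") == strip_tags(mixed + "hello") == "hello"
```

In this sketch the hyphen maps to U+E002D under either convention, and
the filter never needs to look at case at all.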
Thanks for any pointers,
-Doug Ewell
Fullerton, California