On Thu, Apr 16, 2026 at 08:24:17PM +0100, Gavin Smith wrote:
> As I understand, the IETF language subtag registry defines which "variant"
> subtags may be used in combination with other "variant" subtags.
> 
> For example, the "grclass" subtag you used as an example is more
> specific than variants of Occitan and occurs later in the BCP 47 language
> identifier.  This is shown by the "Prefix:" entries in the registry:
> 
> Type: variant
> Subtag: grclass
> Description: Classical Occitan orthography
> Added: 2018-04-22
> Prefix: oc
> Prefix: oc-aranes
> Prefix: oc-auvern
> Prefix: oc-cisaup
> Prefix: oc-creiss
> Prefix: oc-gascon
> Prefix: oc-lemosin
> Prefix: oc-lengadoc
> Prefix: oc-nicard
> Prefix: oc-provenc
> Prefix: oc-vivaraup
> Comments: Classical written standard for Occitan developed in 1935 by
>   Alibèrt
> 
> "oc-lengadoc-grclass" would be denoted in a Texinfo input file thus:
> 
> @documentlanguage oc
> @documentlanguagevariant lengadoc, grclass
> 
> The other order, "@documentlanguagevariant grclass, lengadoc" would have
> to be considered incorrect, as "lengadoc" is only allowed under the "oc-"
> prefix, not under "oc-grclass-":
> 
> Type: variant
> Subtag: lengadoc
> Description: Languedocien
> Added: 2018-04-22
> Prefix: oc
> Comments: Occitan variant spoken in Languedoc
> 
> - (although it's uncertain whether texi2any should do the work to validate
> such usages).

I had understood that, and the examples I have seen in the IANA registry
made me think that it was well done, but I think that texi2any should
not validate the order of the variants or their association with
@documentlanguage.  Maybe later on, but to me this looks like unneeded
complexity.  Few users will use variants, and those who do can be
assumed to know what they are doing.
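
For reference, here is a minimal sketch of what such a check could look
like if it were ever wanted.  It is not texi2any code.  The table holds
only the two registry records quoted above, the name check_variants is
hypothetical, and requiring an exact match against a registered prefix
is an assumption that follows the reading in the quoted text rather
than the full BCP 47 matching rules.

```python
# Hypothetical excerpt of the registry: variant subtag -> allowed prefixes.
# Only the two records quoted in this thread are included.
PREFIXES = {
    "grclass": [
        "oc", "oc-aranes", "oc-auvern", "oc-cisaup", "oc-creiss",
        "oc-gascon", "oc-lemosin", "oc-lengadoc", "oc-nicard",
        "oc-provenc", "oc-vivaraup",
    ],
    "lengadoc": ["oc"],
}


def check_variants(language, variants):
    """Return a list of problems; an empty list means the order is accepted.

    Assumption of this sketch: each variant must follow a tag that is
    exactly one of its registered prefixes.
    """
    problems = []
    tag = language
    for variant in variants:
        allowed = PREFIXES.get(variant)
        if allowed is None:
            problems.append(f"{variant}: not in the registry excerpt")
        elif tag not in allowed:
            problems.append(f"{variant}: not allowed after '{tag}'")
        # The tag built so far is the prefix seen by the next variant.
        tag = f"{tag}-{variant}"
    return problems
```

With this table, check_variants("oc", ["lengadoc", "grclass"]) reports
no problem, while the reversed order reports that "lengadoc" is not
allowed after "oc-grclass".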

> I prefer @documentlanguagevariant to @documentlanguagevariants, as it
> is fine in the singular with a list, whereas in the plural with a single
> item in the argument would look strange.

Ok, this is what I used.

> If I understand correctly, the argument to @documentlanguagevariant
> only accepts tags that are entered as "Type: variant" in the IANA
> registry.  We do not accept region codes, scripts, or the "extension"
> subtags of BCP 47.  Notably, we do not accept "-u-sd-" extensions
> used to denote regional variants.  (Wikipedia gives "gsw-u-sd-chzh"
> as an example denoting "Swiss German as used in the Canton of Zurich".)

I agree that the -u- extensions should not be used for languages.
They could be of use for other purposes, for example collation
customization.  But this is a refinement that is not needed right now
and may never be of interest, unless we get user reports.  Having
language-specific collation is good enough, in my opinion.
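
To illustrate the restriction to variants, here is a rough sketch of a
purely syntactic filter for the @documentlanguagevariant argument.  It
is not the texi2any implementation, and parse_variant_argument is a
hypothetical name.  It only checks the shape that the RFC 5646 grammar
gives to variant subtags (5 to 8 alphanumerics, or 4 starting with a
digit), which is enough to reject region codes, scripts and the pieces
of a -u- extension.  It does not check that a subtag is registered.

```python
import re

# Shape of a variant subtag in the RFC 5646 ABNF:
#   variant = 5*8alphanum / (DIGIT 3alphanum)
VARIANT_RE = re.compile(r"(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3})")


def parse_variant_argument(argument):
    """Split a comma-separated argument into (accepted, rejected) lists.

    Items are lowercased and stripped; empty items are ignored.
    Only the shape of each item is checked.
    """
    accepted, rejected = [], []
    for item in argument.split(","):
        subtag = item.strip().lower()
        if not subtag:
            continue
        if VARIANT_RE.fullmatch(subtag):
            accepted.append(subtag)
        else:
            rejected.append(subtag)
    return accepted, rejected
```

For example, "lengadoc, grclass" is accepted in full, while each of
"u", "sd" and "chzh" is rejected, as is "u-sd-chzh" given as one item.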
 
> (This -u- extension is governed by yet another entity, the Unicode
> Consortium.  There is information on it here:
> https://www.unicode.org/reports/tr35/#u_Extension.)
> 
> So we can reference the IANA subtag registry in a simple way, without
> accepting the full complexity and scope of possibilities of BCP 47.

That was always my intention, focusing on the main parts: language,
region, script and variants.

-- 
Pat
