Hi, [...]
> As an example, while Solaris uses fr_FR.UTF-8 as the locale name for
> French France UTF-8 locale, IBM AIX uses FR_FR and HP-UX 11.11 and RHEL 5.4
> use fr_FR.utf8. (It also appears that glibc based Linux distributions allow
> some variations of the locale names in codeset part of locale name via
> some kind of codeset name normalization mechanism hence accepting not only
> fr_FR.UTF-8 and fr_FR.UTF8 but also fr_FR.utf8.)
>
> One way to resolve this interoperability/compatibility issue would be
> creating and maintaining thousands of locale name related symbolic links at
> our locale directories but that will be quite messy and very difficult to
> maintain.

I just wonder, what's wrong (messy?) about thousands of symbolic links in
a directory? I might be horribly naive, but I thought that managing files
(links) is quite an easy task for any packaging system. Moreover, the
'managing' will most probably be just adding the missing links, with no
deleting or renaming (the exception proves the rule).

The downsides I see in the alias-table solution:

a) unintuitive
If my locale does not work and similar ones are installed (consulting
locale -a), I would first look into the locale directory to see what's
there. Creating just another symlink is the next logical step.

b1) harder to manage
If we have separate packages for each locale, the shared locale_alias
would have to be dynamically modified with each package install and
remove.

b2) harder to manage
If there is a single locale_alias file and the user modifies it, we have
to merge his changes with our own patches (or whatever it is called these
days). Creating a symlink does not interfere with other packages.

c) slow
If the user has LC_ALL=blah, then each subsequent setlocale(3C) has to go
through locale_alias and compare thousands of lines to blah. setlocale is
a frequently called function ... (You could cache this blah=cs_CZ mapping
somewhere. But then there is a need for tools to purge this cache, and
the user has to find out how to do that when he changes locale_alias.)
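
To make (c) concrete, here is a minimal Python sketch of the kind of scan I
have in mind. It assumes a text table with one "alias canonical" pair per
line and '#' comments; that format, the function name resolve_alias and the
table contents are my assumptions for illustration, not something taken
from the proposal.

```python
def resolve_alias(alias_lines, name):
    """Return the canonical name for `name`, or `name` itself if no alias matches.

    The table is scanned from the top on every call, so the cost grows
    with the number of lines in the table.
    """
    for line in alias_lines:
        # Drop trailing comments, then split into whitespace-separated fields.
        fields = line.split("#", 1)[0].split()
        if len(fields) == 2 and fields[0] == name:
            return fields[1]
    return name


# Hypothetical table contents, for illustration only.
ALIAS_LINES = [
    "# alias      canonical",
    "FR_FR        fr_FR.UTF-8",
    "fr_FR.utf8   fr_FR.UTF-8",
]

print(resolve_alias(ALIAS_LINES, "FR_FR"))
print(resolve_alias(ALIAS_LINES, "blah"))
```

A name that is not in the table (the LC_ALL=blah case) is the worst case,
because every line is compared before giving up.
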
> Hence, this project proposes to have a transparent locale name alias support
> mechanism at libc with embedded locale name mapping tables as outlined at
> below to remedy the interoperability/compatibility issue and aid users
> who want to migrate from other platforms to Solaris.

Linux uses locale.alias (Debian has /etc/locale.alias and
/usr/X11R6/lib/X11/locale/locale.alias, for example). To aid Linux users,
it might be a good idea to create a locale.alias(4) man page.

> TECHNICAL DETAILS
>
> Currently, when a locale selection is made with setlocale(3C), as an example
> for 32-bit environment, the function looks for the locale shared object at
> /usr/lib/locale/<locale>/<locale>.so.3. In this process of locating the locale
> shared object, the <locale> name given to the setlocale(3C) and the <locale>
> component of the path to the locale shared object must be identical byte by
> byte.

That is a problem for symlinks. Still, would it not be easier to relax
this scheme a bit (so that a file named locale.so.3 would work for all
locales [pardon my naivety again ...])?

[...]

[ ... LC_MESSAGES ]

> If the locale name given is a canonical locale name to obsoleted Solaris
> locale names by [3] and [4] and there is no associated translated message
> object or catalog in the system with the locale name, for a better backward
> compatibility, the messaging functions will additionally look for the message
> object or catalog using the obsoleted Solaris locale names as the additional
> locale names to check on against with. Also, as a part of locale name alias
> support mechanism, if the locale name given is an accepted and supported
> locale name alias to a canonical locale name by this project and there is
> no associated translated message object or catalog in the system with
> the locale name, the messaging functions will additionally look for its
> message object or catalog by using the canonical locale name.
> Details on these are described in the updated man pages, gettext(1),
> catopen(3C), gettext(3C), and environ(5) [2] and the new man page,
> locale_alias(5) [2].
>
> These additional checkings are necessary to make our messaging functionality
> transparently work for obsoleted Solaris locales and also for the supported
> locale name aliases. The reason why the project team is explicitly updating
> the messaging function related man pages is due to that the interfaces are
> explicitly specifying the locale directories and locale names. No other
> internationalized interfaces appear requiring such explicit update on
> the man pages.
>
> The mapping tables shown at locale_alias(5) [2] are formulated from the
> data extracted from [3], [4], and some operating systems such as AIX 6.1,
> HP-UX 11.11, RHEL 5.4, Ubuntu 9.04, and the latest OpenSolaris/Solaris Nevada
> via some simple reverse engineering. They will be embedded into libc under
> read only data section. (We expect there will be no significant changes at
> the tables, if any, in the future.)

Oh, so this is not a file, but rather an ELF section? I don't see how
this aids the user if he has to file an escalation (paying for support
first) and wait some time to get updated packages from upstream, just to
make his cs_CZ.UtF-8 work.

> Although this project does not change locale(1) utility, this project also
> update the NOTES section of locale(1) man page as shown at [2] to clarify on
> the "locale -a" output that locale aliases are supported only as aliases and
> will not be shown at the output.

Just out of interest, what is the difference between a canonical locale
name and an 'additional locale name'? Why do we have to keep them
separated? Without that separation we would not need lawyer-like
documentation like
http://sac.sfbay/PSARC/2009/594/materials/setlocale.3c.diff

Maybe I just don't see the problem. A customer takes his script from
HP-UX, and suddenly its output is not in French, but rather in English.
How exactly will this quite complex change help him?
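
For reference, this is how I read the quoted LC_MESSAGES fallback, as a
minimal Python sketch. The two tables and the function name
message_lookup_names are hypothetical (the real mappings are said to be in
locale_alias(5)), and the "fr" entry is only a placeholder for an obsoleted
Solaris name. The quoted text does not say whether an alias also falls
through to the obsoleted names of its canonical name, so the sketch does
not chain the two rules.

```python
# Hypothetical tables, for illustration only.
ALIAS_TO_CANONICAL = {
    "FR_FR": "fr_FR.UTF-8",
    "fr_FR.utf8": "fr_FR.UTF-8",
}
CANONICAL_TO_OBSOLETE = {
    "fr_FR.ISO8859-1": ["fr"],  # placeholder obsoleted name
}


def message_lookup_names(name):
    """Return the locale names to try, in order, when looking for a catalog."""
    names = [name]  # the name as given is always tried first
    if name in ALIAS_TO_CANONICAL:
        # Alias rule: additionally try the canonical name.
        names.append(ALIAS_TO_CANONICAL[name])
    # Canonical rule: additionally try the obsoleted names, if any.
    names.extend(CANONICAL_TO_OBSOLETE.get(name, []))
    return names


for given in ("fr_FR.utf8", "fr_FR.ISO8859-1", "cs_CZ.UTF-8"):
    print(given, "->", message_lookup_names(given))
```

If that reading is right, the HP-UX script with fr_FR.utf8 would find the
fr_FR.UTF-8 catalog on the second try.
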

Thank you for your patience with me

-- 
Vlad