Robie Basak [2013-06-06 12:42 +0100]:
> I'd like to distinguish three different use cases of locales.
>
> 1) The locale that a sysadmin sees for commands that he types under
> his own login (whether local or over ssh). I'll call this the user
> locale.

This actually needs to be split in two: (1host) the locale as it is
configured on your host (from where you run ssh), and (1remote) the
locale that your user has configured on the remote host (where you ssh
into). These are usually defined in the respective ~/.profile or
~/.pam_environment.

> 2) The locale that system services run under (eg. logging). I'll call
> this the system locale.

... which is usually specified in /etc/default/locale, or on older
systems, /etc/environment.

> 3) The locale served to users accessing services that the server
> provides. Example: what a user using a web application gets as a
> collation order when he sorts some listing by name. I'll call this the
> service locale.

My feeling is that this will usually be equal to the system locale, or
you have to configure the particular process for that specific locale,
overriding /etc/default/locale and the environment.

Note that due to the default behaviour of our ssh, (1host) becomes
(1remote) iff the remote system has no (2) or (1remote) configured
(that's the precise case I'm picking on, as this is just broken), and
due to our default behaviour of sudo, (1remote) even becomes (2). I.e.
due to these two, these concepts can easily become mixed up.

> Example:
>
> I (English) work with a French sysadmin on a server which
> provides services to Polish customers. I'd want all three locales
> configured on my server. We might settle on C or en_GB for the system
> locale. My French co-worker may use LANG=fr(?) and expects to see
> messages in French when he uses ssh to diagnose issues.

You said the server has en_GB.UTF-8 in /etc/default/locale. In that
case, ssh'ing in would give your French sysadmin that locale if he
didn't configure anything in his remote ~/.profile; alternatively, he
puts LANG=fr_FR.UTF-8 into his remote profile.

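To make the interplay concrete, here is a minimal Python sketch of how these locales fall back to each other for an ssh login, following the precedence described in this thread ((1remote), then (2), then (1host)). The function names and the final fallback to "C" are assumptions made up for illustration; this only models the described behaviour and is not how ssh, PAM, or sudo are actually implemented.

```python
from typing import Optional


def effective_login_locale(
    remote_user: Optional[str],  # (1remote): e.g. set in the remote ~/.profile
    system: Optional[str],       # (2): e.g. from /etc/default/locale
    host: Optional[str],         # (1host): forwarded by ssh from the client
) -> str:
    """Return the locale an ssh login shell ends up with.

    Precedence as described in this thread: (1remote), then (2),
    then (1host). Falling back to "C" when nothing at all is set is
    an assumption of this sketch.
    """
    for candidate in (remote_user, system, host):
        if candidate:
            return candidate
    return "C"


def effective_sudo_locale(login_locale: str) -> str:
    """Model of the sudo behaviour described here: commands started via
    sudo inherit the invoking user's locale, so the login locale
    effectively takes the place of (2) for them."""
    return login_locale


# French sysadmin, server has en_GB.UTF-8 as system locale, no remote profile:
print(effective_login_locale(None, "en_GB.UTF-8", "fr_FR.UTF-8"))
# Same sysadmin after putting LANG=fr_FR.UTF-8 into his remote profile:
print(effective_login_locale("fr_FR.UTF-8", "en_GB.UTF-8", "fr_FR.UTF-8"))
# The problematic case: nothing configured on the server, (1host) leaks in:
print(effective_login_locale(None, None, "fr_FR.UTF-8"))
```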
(1host) only comes into play if the remote server does not have an
/etc/default/locale; then ssh would transfer his (1host) locale
(presumably fr_FR.UTF-8) to (1remote), but that wouldn't actually work
unless the remote server actually has fr_FR.UTF-8 available (which
wasn't the case in said bug reports).

> I need to see English, since I don't understand French well enough.
> The server runs a web application which uses Postgres, so Postgres
> should use a Polish collation order.

PostgreSQL can define the collation and locales per-database and thus
individually per webapp, but for the sake of argument let's pretend it
can't, and it would always use the locale of template1 (i.e. the
default database created at installation).

> Problem 1: does postgresql-9.1's postinst do a different thing depending
> on whether me or my colleague installs it? Why? Now that we have
> /etc/default/locale, wouldn't it be better to use this?

If you have a (2) set in /etc/default/locale, then it will use that. ssh
maintains the right fallback here: (1remote) wins over (2) wins over
(1host). (And again, I claim that the latter is bad behaviour.) If you
have neither (1remote) nor (2), then yes, it will depend on what your
host locale is due to how ssh and sudo combined behave, and either fail
because (1host) doesn't exist on the server, or create the default db
with just (1host). But in that case there is nothing else that
PostgreSQL could use, except perhaps for saying that "if I have a locale
defined by the environment, but it's not in /etc/default/locale,
/etc/environment, /etc/postgresql/version/cluster/environment,
~/.profile, ~/.pam_environment, or wherever else you could set
environment variables, then use C", but that would make things even less
predictable and robust IMHO.

> Problem 2: but I wanted Postgres configured with a Polish collation
> order, which doesn't happen either way.
> Maybe the postinst should use debconf to ask the user, defaulting to
> what /etc/default/locale says?

That would be overkill IMHO. Except in the pathological case above, it
works just fine, and then you often want different locales for different
DBs/applications anyway. You can create more clusters or databases, and
in each case (pg_createcluster/createdb) specify an individual locale
combination.

> > I have always considered this default behaviour of ssh unexpected and
> > wrong. It blindly applies the host locale to the remote ssh session
> > without any checks whether that locale is actually valid. In
> > particular because it only seems to do that if the remote server does
> > not have any default locale from /etc/default/locale,
> > .pam_environment, or otherwise, which usually only occurs in servers
> > where locales have not been installed and configured at all (this
> > might be the case in our cloud instances, something we ought to fix).
>
> I hope I've presented the case for passing the locale setting through in
> my use case above. Two different sysadmins want two different locales
> available when they log in. How should we cater for this?

There can be just one (2), so they need to fight out which of the two
should become it. The other can then set a different locale in their
~/.profile. This is independent of ssh's behaviour, but of course the
sudo behaviour still applies here: if the other sysadmin uses sudo, all
operations will be done under his locale.

> > So in this situation it is very likely that the locale that ssh passes
> > from the host to the remote shell will not work.
>
> IMHO, we should make it work, or drop to C if a locale isn't configured.

My preferred fix would be for ssh to simply stop passing (1host), at
least if the remote system does not have that locale available.
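As a rough illustration of such a validity check, the following Python sketch decides whether a forwarded locale is usable on the remote side. It assumes a glibc-style `locale -a` command for listing generated locales; the helper names (`normalize`, `available_locales`, `accept_forwarded_locale`) are hypothetical, and nothing like this exists in ssh today. Returning `None` stands for "do not pass it on", in which case the caller could keep the system locale or drop to C.

```python
import subprocess
from typing import Iterable, Optional, Set


def normalize(name: str) -> str:
    """Make spellings like 'fr_FR.UTF-8' and 'fr_FR.utf8' compare equal."""
    base, dot, rest = name.partition(".")
    if not dot:
        return name  # no codeset part, e.g. "C" or "POSIX"
    codeset, at, modifier = rest.partition("@")
    return base + "." + codeset.lower().replace("-", "") + at + modifier


def available_locales() -> Set[str]:
    """Ask `locale -a` which locales are generated on this system."""
    out = subprocess.run(
        ["locale", "-a"], capture_output=True, text=True, check=True
    )
    return {normalize(l.strip()) for l in out.stdout.splitlines() if l.strip()}


def accept_forwarded_locale(
    forwarded: str, available: Iterable[str]
) -> Optional[str]:
    """Return the forwarded locale only if the server actually has it."""
    known = {normalize(a) for a in available}
    return forwarded if normalize(forwarded) in known else None


# Hypothetical server that only has English locales generated; on a real
# system the list would come from available_locales().
server = ["C", "C.UTF-8", "POSIX", "en_GB.utf8"]
print(accept_forwarded_locale("fr_FR.UTF-8", server))
print(accept_forwarded_locale("en_GB.UTF-8", server))
```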
The very first time you press <tab> in the remote shell you'll get
screen clutter from error messages, apt/dpkg will spit out a plethora of
errors from perl, and the locale will not work and behave like C anyway.

It's somewhat debatable if we want to pass (1host) if it is available on
the server; I'd personally prefer not to because of our sudo behaviour,
but I have no strong opinion about it, and good arguments can be made
either way.

> > I don't understand this -- if you run your entire server without
> > locales, then a lot of stuff will not work; e. g. your cited
>
> No, perhaps I haven't been clear. I want multiple locales configured, so
> that on a multi-user system users see their own locales when they shell
> in.

That's fine. But in the bug you cited the problem was that none of the
locales involved were configured on the server.

> But because we don't know what these locales might be, we can't easily
> configure them in advance. So a warning of an unconfigured locale is
> great, but that doesn't stop it being broken. That's why I'm proposing
> dropping the user down to C until the misconfiguration is fixed.

As I said in my reply to Steve, I'm happy to make the package
installation succeed, but not create any cluster in that case (and print
a warning instead, which might or might not be actually seen). That's
somewhat cleaner from a package installation POV, but again leads to bug
reports like "I installed postgresql and did not get a default DB",
which are rather hard to debug.

As for predicting which locales you might need on a server: If we keep
the current behaviour of ssh, then you could just generate _all_ locales
on the server to be prepared for anything?

> Alternatively, if we conclude that locales shouldn't be passed through
> ssh, and that dropping my multi-locale multi-user use case is fine

Those are not the same thing as far as I can see.
Working with multiple locales on the server is fine; what's not fine is
to throw an arbitrary LANG= setting into a remote shell and expect it to
work without checking whether it's valid.

> then sticking to the system locale (/etc/default/locale or
> whatever) would be fine for the original bug that prompted this
> thread.

That's what is already happening, as far as I can see.

> > I'd think that on a server you ought to set the system locale to
> > what you actually want, and then have your services use that, not
> > some random locale from outside that someone happens to use on
> > their workstation?
>
> Right - but the locale the user installing Postgres on is using isn't
> necessarily the system locale. So how about using /etc/default/locale
> instead of the environment-defined one?

Again, the referenced bug occurs when there is no /etc/default/locale.
Canonically, locales are defined by environment variables, and
/etc/default/locale, /etc/environment,
/etc/postgresql/version/cluster/environment, /root/.profile, and what
not are means to set them. We can probably add some heuristics there,
but if we limit it to only /etc/default/locale, we'd break other cases
which are legitimately working right now.

Thanks,

Martin
--
Martin Pitt                        | http://www.piware.de
Ubuntu Developer (www.ubuntu.com)  | Debian Developer (www.debian.org)

--
ubuntu-devel mailing list
ubuntu-devel@lists.ubuntu.com
Modify settings or unsubscribe at: https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel