Re: [Qemu-devel] [PATCH] gtk: use setlocale() for LC_MESSAGES only
On 12/18/2015 12:55 PM, Markus Armbruster wrote:
> Alberto Garcia writes:
>
>>>> > We do however have translations for a few simple strings for the GTK+
>>>> > menu items, so in order to run QEMU using the C locale, and yet have a
>>>> > translated UI let's use setlocale() for LC_MESSAGES only.
>>>> >
>>>> Not sure why I noticed it only now and if it's related to any recent
>>>> package upgrade on my side (using RHEL 7), but I noticed that
>>>> non-ASCII characters in the GTK UI strings are broken for me and git
>>>> bisect pointed to this commit.
>>>
>>> I guess we need to set LC_CTYPE too.
>>
>> That affects functions in ctype.h (isalpha(), islower(), isupper(), ...)
>> I guess that's safe?

Gnulib introduces functions named c_isalpha(), c_islower(), and so
forth, which behave identically regardless of the current locale,
precisely because locale-dependent definitions of which byte sequences
form a valid character can cause undesirable behavior. I don't know if
glib does the same, but it does indeed have the potential to affect us,
in at least util/id.c:id_wellformed(). It would be weird to let the
user's choice of locale determine which ids they can create.

> If we're guessing, then I guess it isn't. But we shouldn't be guessing.
>
> "LC_CTYPE affects the behavior of the character handling functions and
> the multibyte and wide character functions."
>
> I doubt there's much use for the latter in QEMU itself, but in
> libraries, all bets are off. I guess this is what actually screws up
> GTK.
>
> We do use the former. LC_CTYPE set to some sufficiently funky locale is
> bound to upset these uses.
>
> In short: nope, we can't just set LC_CTYPE, at least not without further
> analysis.

In fact, if LC_CTYPE and LC_COLLATE are incompatible, then strcoll() has
undefined behavior. GNU coreutils warns:

  Unless otherwise specified, all comparisons use the character
  collating sequence specified by the ‘LC_COLLATE’ locale.(1)
  [...]

  (1) If you use a non-POSIX locale (e.g., by setting ‘LC_ALL’ to
  ‘en_US’), then ‘sort’ may produce output that is sorted differently
  than you’re accustomed to. In that case, set the ‘LC_ALL’ environment
  variable to ‘C’. Note that setting only ‘LC_COLLATE’ has two
  problems. First, it is ineffective if ‘LC_ALL’ is also set. Second,
  it has undefined behavior if ‘LC_CTYPE’ (or ‘LANG’, if ‘LC_CTYPE’ is
  unset) is set to an incompatible value. For example, you get
  undefined behavior if ‘LC_CTYPE’ is ‘ja_JP.PCK’ but ‘LC_COLLATE’ is
  ‘en_US.UTF-8’.

Off-hand, we are specifically NOT calling setlocale() for the categories
that we want to leave in the C locale, so we don't have to worry about
LC_ALL throwing us off. And I'm hard-pressed to think of an example
where LC_COLLATE=C while LC_CTYPE is a multibyte character set will
cause unusual sorting artifacts (the one that coreutils is warning
against is when you have two incompatibly different multibyte character
sets involved, whereas our case is a multibyte character set for display
but a unibyte set for collation). But it is indeed a can of worms that
requires special analysis.

--
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
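To make the ctype.h concern concrete, here is a minimal sketch of
locale-independent classification in the spirit of gnulib's c_isalpha().
The helper names ascii_isalpha(), ascii_isdigit() and
example_id_wellformed() are made up for illustration, and the ID rule
shown is an assumption, not a copy of util/id.c:id_wellformed().

```c
#include <stdbool.h>
#include <stddef.h>

/* ASCII-only classification: the result never depends on setlocale(). */
static bool ascii_isalpha(unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static bool ascii_isdigit(unsigned char c)
{
    return c >= '0' && c <= '9';
}

/*
 * Hypothetical ID check (NOT the real id_wellformed()): the first
 * character must be an ASCII letter; the rest may be ASCII letters,
 * digits, '-', '.' or '_'.
 */
static bool example_id_wellformed(const char *id)
{
    if (id == NULL || !ascii_isalpha((unsigned char)id[0])) {
        return false;
    }
    for (size_t i = 1; id[i] != '\0'; i++) {
        unsigned char c = (unsigned char)id[i];
        if (!ascii_isalpha(c) && !ascii_isdigit(c) &&
            c != '-' && c != '.' && c != '_') {
            return false;
        }
    }
    return true;
}
```

Because nothing here consults the locale, a check written this way gives
the same answer whatever LC_CTYPE is set to, which is the property the
ctype.h functions lose once LC_CTYPE is imported from the environment.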
Re: [Qemu-devel] [PATCH] gtk: use setlocale() for LC_MESSAGES only
Alberto Garcia writes:

>>> > We do however have translations for a few simple strings for the GTK+
>>> > menu items, so in order to run QEMU using the C locale, and yet have a
>>> > translated UI let's use setlocale() for LC_MESSAGES only.
>>> >
>>> Not sure why I noticed it only now and if it's related to any recent
>>> package upgrade on my side (using RHEL 7), but I noticed that
>>> non-ASCII characters in the GTK UI strings are broken for me and git
>>> bisect pointed to this commit.
>>
>> I guess we need to set LC_CTYPE too.
>
> That affects functions in ctype.h (isalpha(), islower(), isupper(), ...)
> I guess that's safe?

If we're guessing, then I guess it isn't. But we shouldn't be guessing.

"LC_CTYPE affects the behavior of the character handling functions and
the multibyte and wide character functions."

I doubt there's much use for the latter in QEMU itself, but in
libraries, all bets are off. I guess this is what actually screws up
GTK.

We do use the former. LC_CTYPE set to some sufficiently funky locale is
bound to upset these uses.

In short: nope, we can't just set LC_CTYPE, at least not without further
analysis.

We should've stayed out of the GUI business.

[...]
Re: [Qemu-devel] [PATCH] gtk: use setlocale() for LC_MESSAGES only
>> > We do however have translations for a few simple strings for the GTK+
>> > menu items, so in order to run QEMU using the C locale, and yet have a
>> > translated UI let's use setlocale() for LC_MESSAGES only.
>> >
>> Not sure why I noticed it only now and if it's related to any recent
>> package upgrade on my side (using RHEL 7), but I noticed that
>> non-ASCII characters in the GTK UI strings are broken for me and git
>> bisect pointed to this commit.
>
> I guess we need to set LC_CTYPE too.

That affects functions in ctype.h (isalpha(), islower(), isupper(), ...)
I guess that's safe?

> @@ -2044,8 +2044,9 @@ void gtk_display_init(DisplayState *ds, bool full_screen, bool grab_on_hover)
>
>      s->free_scale = FALSE;
>
> -    /* LC_MESSAGES only. See early_gtk_display_init() for details */
> +    /* LC_MESSAGES+LC_CTYPE only. See early_gtk_display_init() for details */
>      setlocale(LC_MESSAGES, "");
> +    setlocale(LC_CTYPE, "");
>      bindtextdomain("qemu", CONFIG_QEMU_LOCALEDIR);
>      textdomain("qemu");

You can also modify the comment in early_gtk_display_init() to say that
"we support importing LC_MESSAGES and LC_CTYPE from the environment".

Berto
Re: [Qemu-devel] [PATCH] gtk: use setlocale() for LC_MESSAGES only
On 18.12.2015 at 14:23, Gerd Hoffmann wrote:
> On Fri, 2015-12-18 at 12:38 +0100, Kevin Wolf wrote:
> > On 10.09.2015 at 17:19, Alberto Garcia wrote:
> > > The QEMU code is not internationalized and assumes that it runs under
> > > the C locale, but if we use the GTK+ UI we'll end up importing the
> > > locale settings from the environment. This can break things, such as
> > > the JSON generator and iotest 120 in locales that use a decimal comma.
> > >
> > > We do however have translations for a few simple strings for the GTK+
> > > menu items, so in order to run QEMU using the C locale, and yet have a
> > > translated UI let's use setlocale() for LC_MESSAGES only.
> > >
> > > Signed-off-by: Alberto Garcia
> >
> > Not sure why I noticed it only now and if it's related to any recent
> > package upgrade on my side (using RHEL 7), but I noticed that non-ASCII
> > characters in the GTK UI strings are broken for me and git bisect
> > pointed to this commit.
>
> I guess we need to set LC_CTYPE too.
> Can you try whether the attached patch fixes the issue?

Yes, that works for me.

Tested-by: Kevin Wolf
Re: [Qemu-devel] [PATCH] gtk: use setlocale() for LC_MESSAGES only
On Fri, 2015-12-18 at 12:38 +0100, Kevin Wolf wrote:
> On 10.09.2015 at 17:19, Alberto Garcia wrote:
> > The QEMU code is not internationalized and assumes that it runs under
> > the C locale, but if we use the GTK+ UI we'll end up importing the
> > locale settings from the environment. This can break things, such as
> > the JSON generator and iotest 120 in locales that use a decimal comma.
> >
> > We do however have translations for a few simple strings for the GTK+
> > menu items, so in order to run QEMU using the C locale, and yet have a
> > translated UI let's use setlocale() for LC_MESSAGES only.
> >
> > Signed-off-by: Alberto Garcia
>
> Not sure why I noticed it only now and if it's related to any recent
> package upgrade on my side (using RHEL 7), but I noticed that non-ASCII
> characters in the GTK UI strings are broken for me and git bisect
> pointed to this commit.

I guess we need to set LC_CTYPE too.
Can you try whether the attached patch fixes the issue?

thanks,
  Gerd

From 54821a4b405ca31c997485b563ec5c43dd53e4ed Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann
Date: Fri, 18 Dec 2015 14:15:56 +0100
Subject: [PATCH] gtk: fix utf8 strings in the ui

Commit "2cb5d2a gtk: use setlocale() for LC_MESSAGES only" restricts
locale settings to LC_MESSAGES, to avoid bugs caused by locale-specific
number printing (LC_NUMERIC) and possibly others.

We need LC_CTYPE too to make messages with chars outside us-ascii work
correctly. Add it.

Reported-by: Kevin Wolf
Signed-off-by: Gerd Hoffmann
---
 ui/gtk.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/ui/gtk.c b/ui/gtk.c
index 47b37e1..30407a5 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -2044,8 +2044,9 @@ void gtk_display_init(DisplayState *ds, bool full_screen, bool grab_on_hover)
 
     s->free_scale = FALSE;
 
-    /* LC_MESSAGES only. See early_gtk_display_init() for details */
+    /* LC_MESSAGES+LC_CTYPE only. See early_gtk_display_init() for details */
     setlocale(LC_MESSAGES, "");
+    setlocale(LC_CTYPE, "");
     bindtextdomain("qemu", CONFIG_QEMU_LOCALEDIR);
     textdomain("qemu");
--
1.8.3.1
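A possible explanation for why LC_CTYPE matters here (an assumption, the
thread does not confirm it): GNU gettext converts translated strings to
the character encoding of the current LC_CTYPE locale, and in the C
locale that encoding is usually plain ASCII, so anything outside
us-ascii cannot be represented. The sketch below only makes the encoding
visible; show_ctype_codeset() is a made-up helper name, and the values
printed depend on the libc and on the environment it runs in.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdio.h>

/*
 * Print the character encoding implied by LC_CTYPE, first as it is on
 * entry (the "C" locale if setlocale() was never called) and then after
 * importing LC_CTYPE from the environment. Returns the codeset name in
 * effect at the end.
 */
static const char *show_ctype_codeset(void)
{
    printf("before: %s\n", nl_langinfo(CODESET));

    /* Same call the patch adds; returns NULL if the environment names
     * a locale that is not installed, in which case nothing changes. */
    setlocale(LC_CTYPE, "");

    const char *codeset = nl_langinfo(CODESET);
    printf("after:  %s\n", codeset);
    return codeset;
}
```

Calling this once from main() under, say, a UTF-8 environment would be
one way to check whether the "before" and "after" encodings differ on a
given system.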
Re: [Qemu-devel] [PATCH] gtk: use setlocale() for LC_MESSAGES only
On 10.09.2015 at 17:19, Alberto Garcia wrote:
> The QEMU code is not internationalized and assumes that it runs under
> the C locale, but if we use the GTK+ UI we'll end up importing the
> locale settings from the environment. This can break things, such as
> the JSON generator and iotest 120 in locales that use a decimal comma.
>
> We do however have translations for a few simple strings for the GTK+
> menu items, so in order to run QEMU using the C locale, and yet have a
> translated UI let's use setlocale() for LC_MESSAGES only.
>
> Signed-off-by: Alberto Garcia

Not sure why I noticed it only now and if it's related to any recent
package upgrade on my side (using RHEL 7), but I noticed that non-ASCII
characters in the GTK UI strings are broken for me and git bisect
pointed to this commit.

Kevin

>  ui/gtk.c | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
>
> diff --git a/ui/gtk.c b/ui/gtk.c
> index df2a79e..11ea2cf 100644
> --- a/ui/gtk.c
> +++ b/ui/gtk.c
> @@ -1941,7 +1941,8 @@ void gtk_display_init(DisplayState *ds, bool full_screen, bool grab_on_hover)
>
>      s->free_scale = FALSE;
>
> -    setlocale(LC_ALL, "");
> +    /* LC_MESSAGES only. See early_gtk_display_init() for details */
> +    setlocale(LC_MESSAGES, "");
>      bindtextdomain("qemu", CONFIG_QEMU_LOCALEDIR);
>      textdomain("qemu");
>
> @@ -2010,6 +2011,24 @@ void gtk_display_init(DisplayState *ds, bool full_screen, bool grab_on_hover)
>
>  void early_gtk_display_init(int opengl)
>  {
> +    /* The QEMU code relies on the assumption that it's always run in
> +     * the C locale. Therefore it is not prepared to deal with
> +     * operations that produce different results depending on the
> +     * locale, such as printf's formatting of decimal numbers, and
> +     * possibly others.
> +     *
> +     * Since GTK+ calls setlocale() by default -importing the locale
> +     * settings from the environment- we must prevent it from doing so
> +     * using gtk_disable_setlocale().
> +     *
> +     * QEMU's GTK+ UI, however, _does_ have translations for some of
> +     * the menu items. As a trade-off between a functionally correct
> +     * QEMU and a fully internationalized UI we support importing
> +     * LC_MESSAGES from the environment (see the setlocale() call
> +     * earlier in this file). This allows us to display translated
> +     * messages leaving everything else untouched.
> +     */
> +    gtk_disable_setlocale();
>      gtkinit = gtk_init_check(NULL, NULL);
>      if (!gtkinit) {
>          /* don't exit yet, that'll break -help */
> --
> 2.5.1
Re: [Qemu-devel] [PATCH] gtk: use setlocale() for LC_MESSAGES only
>  void early_gtk_display_init(int opengl)
>  {
> +    /* The QEMU code relies on the assumption that it's always run in
> +     * the C locale. Therefore it is not prepared to deal with
> +     * operations that produce different results depending on the
> +     * locale, such as printf's formatting of decimal numbers, and
> +     * possibly others.
> +     *
> +     * Since GTK+ calls setlocale() by default -importing the locale
> +     * settings from the environment- we must prevent it from doing so
> +     * using gtk_disable_setlocale().
> +     *
> +     * QEMU's GTK+ UI, however, _does_ have translations for some of
> +     * the menu items. As a trade-off between a functionally correct
> +     * QEMU and a fully internationalized UI we support importing
> +     * LC_MESSAGES from the environment (see the setlocale() call
> +     * earlier in this file). This allows us to display translated
> +     * messages leaving everything else untouched.
> +     */
> +    gtk_disable_setlocale();

Thanks. Replacing my version with this one.

cheers,
  Gerd
[Qemu-devel] [PATCH] gtk: use setlocale() for LC_MESSAGES only
The QEMU code is not internationalized and assumes that it runs under
the C locale, but if we use the GTK+ UI we'll end up importing the
locale settings from the environment. This can break things, such as
the JSON generator and iotest 120 in locales that use a decimal comma.

We do however have translations for a few simple strings for the GTK+
menu items, so in order to run QEMU using the C locale, and yet have a
translated UI let's use setlocale() for LC_MESSAGES only.

Signed-off-by: Alberto Garcia
---
 ui/gtk.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/ui/gtk.c b/ui/gtk.c
index df2a79e..11ea2cf 100644
--- a/ui/gtk.c
+++ b/ui/gtk.c
@@ -1941,7 +1941,8 @@ void gtk_display_init(DisplayState *ds, bool full_screen, bool grab_on_hover)
 
     s->free_scale = FALSE;
 
-    setlocale(LC_ALL, "");
+    /* LC_MESSAGES only. See early_gtk_display_init() for details */
+    setlocale(LC_MESSAGES, "");
     bindtextdomain("qemu", CONFIG_QEMU_LOCALEDIR);
     textdomain("qemu");
 
@@ -2010,6 +2011,24 @@ void gtk_display_init(DisplayState *ds, bool full_screen, bool grab_on_hover)
 
 void early_gtk_display_init(int opengl)
 {
+    /* The QEMU code relies on the assumption that it's always run in
+     * the C locale. Therefore it is not prepared to deal with
+     * operations that produce different results depending on the
+     * locale, such as printf's formatting of decimal numbers, and
+     * possibly others.
+     *
+     * Since GTK+ calls setlocale() by default -importing the locale
+     * settings from the environment- we must prevent it from doing so
+     * using gtk_disable_setlocale().
+     *
+     * QEMU's GTK+ UI, however, _does_ have translations for some of
+     * the menu items. As a trade-off between a functionally correct
+     * QEMU and a fully internationalized UI we support importing
+     * LC_MESSAGES from the environment (see the setlocale() call
+     * earlier in this file). This allows us to display translated
+     * messages leaving everything else untouched.
+     */
+    gtk_disable_setlocale();
     gtkinit = gtk_init_check(NULL, NULL);
     if (!gtkinit) {
         /* don't exit yet, that'll break -help */
--
2.5.1
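The effect the patch relies on can be checked outside QEMU with a small
standalone sketch: importing only LC_MESSAGES should leave LC_NUMERIC in
the C locale, so printf-style output keeps its decimal point. The two
helper names below are made up for illustration, and the sketch assumes
nothing else in the process has called setlocale() beforehand.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * Import only LC_MESSAGES from the environment, as the patch does.
 * Returns nonzero if LC_NUMERIC is still the "C" locale afterwards.
 */
static int numeric_locale_is_c_after_messages_only(void)
{
    /* May return NULL if the environment names an unknown locale;
     * either way no other category is modified. */
    setlocale(LC_MESSAGES, "");

    /* A NULL locale argument only queries the current setting. */
    const char *numeric = setlocale(LC_NUMERIC, NULL);
    return numeric != NULL && strcmp(numeric, "C") == 0;
}

/* Format 1.5 the way a JSON generator might, into buf. */
static void format_one_and_a_half(char *buf, size_t len)
{
    snprintf(buf, len, "%.1f", 1.5);
}
```

With setlocale(LC_ALL, "") in place of the LC_MESSAGES call, and an
environment such as de_DE.UTF-8 installed, the formatted string would be
expected to use a decimal comma instead, which is the breakage described
in the commit message.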