Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028

Jason Short wrote:
>> Madeline Book wrote:
>>> Oh wow, it really was trivial! But certainly not obvious for the likes of me :(. Once I ran the client with LANG=fr_FR.UTF-8 the libc messages displayed correctly and gtk/pango was very happy.
>>
>> Nor to me. Leaving open until this is documented on our web pages and in the distributed files.
>
> This is a bug in freeciv. Background: when I wrote the charset code, I divided freeciv into 3 charsets.
> ...

And where the heck is this documented?

> So in summary, if any bug reporters can track where the offending strings are coming from, fixing each incident is not too hard. In the meantime this should be listed as a known bug.

I'm really tired of this laissez-faire attitude toward bug fixing.

It may be a known bug, but that means there should be an open ticket, probably with a tracking ticket to gather the related incidents.

It is a serious bug.

And the .UTF-8 workaround needs to be clearly documented! Everywhere!

___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev
[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028

[jdorje - Sun Jan 27 06:35:40 2008]:

> Obviously. But this is not a bug that is going to be easy to verify as completely fixed. Doing so requires scanning every string (not every translated string; every string) to check for inclusions of libc output.
>
>> It is a serious bug.
>
> That depends on how many such strings there are. I suspect there are very few, thus making it a non-fatal and rather rare bug. In any case, we can quickly fix the most common places and make it such.

When I was trying to find the source of the strings in the (warclient) code I noticed that most (if not all) of the translated system strings were coming from the strerror wrappers (mystrerror, mystrsocketerror). I presume that if you want to intercept and modify the strings, that would be the best place to do it... but I have reservations about implementing such a hackish fix (since it would go against what the user implicitly or explicitly requested via the LANG environment variable).

>> And the .UTF-8 workaround needs to be clearly documented! Everywhere!
>
> What workaround?

I think Mr. Simpson meant the LANG=fr_FR.UTF-8 workaround to get the system strings in UTF-8, as mentioned previously in this thread. Incidentally, you can see some other workarounds in this thread http://freeciv.freeforums.org/viewtopic.php?t=156, though in retrospect it looks to me more like the blind leading the blind. :(

Veering slightly off-topic, but can someone more knowledgeable please explain to me how it is possible that a cast (i.e. to const gchar * from const char *) could possibly change the encoding of a string?
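For illustration, intercepting the strings at a strerror wrapper might look roughly like the sketch below. This is not freeciv's mystrerror(), whose signature is not shown in this thread: demo_strerror_utf8 is a hypothetical name, and the sketch assumes a POSIX system that provides iconv() and nl_langinfo(). It converts the strerror() text from the current locale's codeset to UTF-8 in a caller-supplied buffer.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <langinfo.h>
#include <string.h>

/* Hypothetical wrapper, NOT freeciv's mystrerror(): returns the libc
 * error text for errnum converted to UTF-8 in buf. The source encoding
 * is whatever codeset the program's current locale reports, so the
 * application must already have called setlocale() for LANG to matter. */
static const char *demo_strerror_utf8(int errnum, char *buf, size_t bufsz)
{
  char *msg = strerror(errnum);
  char *in = msg;
  char *out = buf;
  size_t inleft = strlen(msg);
  size_t outleft;
  iconv_t cd;

  assert(buf != NULL && bufsz > 0);
  outleft = bufsz - 1; /* Reserve one byte for the terminating NUL. */

  cd = iconv_open("UTF-8", nl_langinfo(CODESET));
  if (cd == (iconv_t) -1) {
    /* No converter available: fall back to the unconverted text. */
    strncpy(buf, msg, bufsz - 1);
    buf[bufsz - 1] = '\0';
    return buf;
  }

  /* On error (e.g. output buffer full) iconv() stops early; whatever
   * was converted so far is kept and terminated below. */
  (void) iconv(cd, &in, &inleft, &out, &outleft);
  iconv_close(cd);
  *out = '\0';
  return buf;
}
```

Because the result is always UTF-8, a GUI label fed from this wrapper would not depend on whether the user ran with LANG=fr_FR or LANG=fr_FR.UTF-8, while text printed directly by libc to the terminal would still use the local encoding.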
[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028

[wsimpson - Sat Jan 26 10:38:42 2008]:

> Jason Short wrote:
>>> Madeline Book wrote:
>>>> Oh wow, it really was trivial! But certainly not obvious for the likes of me :(. Once I ran the client with LANG=fr_FR.UTF-8 the libc messages displayed correctly and gtk/pango was very happy.
>>>
>>> Nor to me. Leaving open until this is documented on our web pages and in the distributed files.
>>
>> This is a bug in freeciv. Background: when I wrote the charset code, I divided freeciv into 3 charsets.
>> ...
>
> And where the heck is this documented?

In this and other RT tickets, it seems. Strange, I thought I had written this up somewhere in the code, but I can find nothing. Where should it be documented?

>> So in summary, if any bug reporters can track where the offending strings are coming from, fixing each incident is not too hard. In the meantime this should be listed as a known bug.
>
> I'm really tired of this laissez-faire attitude toward bug fixing.

Huh? I'm the one who pointed out it's a bug in the first place, remember?

> It may be a known bug, but that means there should be an open ticket, probably with a tracking ticket to gather the related incidents.

Obviously. But this is not a bug that is going to be easy to verify as completely fixed. Doing so requires scanning every string (not every translated string; every string) to check for inclusions of libc output.

> It is a serious bug.

That depends on how many such strings there are. I suspect there are very few, thus making it a non-fatal and rather rare bug. In any case, we can quickly fix the most common places and make it such.

> And the .UTF-8 workaround needs to be clearly documented! Everywhere!

What workaround?

-jason
[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028

[wsimpson - Mon Jan 21 09:51:15 2008]:

> Madeline Book wrote:
>> Oh wow, it really was trivial! But certainly not obvious for the likes of me :(. Once I ran the client with LANG=fr_FR.UTF-8 the libc messages displayed correctly and gtk/pango was very happy.
>
> Nor to me. Leaving open until this is documented on our web pages and in the distributed files.

This is a bug in freeciv.

Background: when I wrote the charset code, I divided freeciv into 3 charsets. The local charset is the one supported on the command line: this one is not under our control, and all output to the command line must be converted into it. The internal charset is the one used internally within freeciv: this is always utf-8 at the server but can be configured by the GUI at the client side (GUI writing is a lot easier if your charset is the same as your GUI library's). Meanwhile the data charset is the one used in all data files and network transactions and is utf-8.

Now, the relevant point here is that while freeciv strings as translated by _() go directly into the internal encoding (see bind_textdomain_codeset in fciconv.c), anything returned by a library is going to go into whatever encoding that library uses. In the case of libc, this is the local encoding, and thus any translatable strings returned by libc need to be converted before they can be used.

This can't really be changed; although we could hack the local encoding to switch to UTF-8 or just change the libc domain to return strings in that encoding, this would be poorly portable and would also cause stuff printed to the command line directly by libc to be in the wrong encoding.

The functions to do the conversion exist in fciconv.h and are the same ones used for reading data from the command line: local_to_internal_string_malloc and local_to_internal_string_buffer.

It would be nice to have a shortened form of these (like L_()), as has been discussed before, but since iconv requires a buffer into which to stick results this buffer must be either provided or allocated. Declaring the buffer statically is tempting but would lead to hard-to-trace bugs when L_() was used twice at once; some sort of garbage collection or buffer-rotating scheme might also come to mind for this. Perhaps

  #define L_(t,b) local_to_internal_string_buffer((t),b,sizeof(b))

might be the best way to go.

Of course the real work lies in finding all the places where L_ should be used... which will require either a full audit or just fixing it as bugs are reported. The latter would be the easiest place to start as we already have a concrete bug report of a few and it's not likely too many libc-translated strings are being used internally within freeciv.

So in summary, if any bug reporters can track where the offending strings are coming from, fixing each incident is not too hard. In the meantime this should be listed as a known bug.

-jason
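A minimal sketch of the buffer-taking L_() form proposed above. Since freeciv's local_to_internal_string_buffer() is not available outside the tree, the sketch swaps in demo_latin1_to_utf8_buffer, a hypothetical stand-in that does not use iconv and only handles an ISO-8859-1 local charset with a UTF-8 internal charset. The point is the calling convention, not the conversion itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for local_to_internal_string_buffer().
 * Assumes the local charset is ISO-8859-1 and the internal charset is
 * UTF-8; the real function goes through iconv instead. Output that
 * does not fit in buf is silently truncated at a character boundary. */
static char *demo_latin1_to_utf8_buffer(const char *text,
                                        char *buf, size_t bufsz)
{
  const unsigned char *in = (const unsigned char *) text;
  size_t out = 0;

  assert(text != NULL && buf != NULL && bufsz > 0);

  for (; *in != '\0'; in++) {
    if (*in < 0x80) {
      if (out + 1 >= bufsz) {
        break; /* Keep room for the terminating NUL. */
      }
      buf[out++] = (char) *in;
    } else {
      if (out + 2 >= bufsz) {
        break; /* Never emit half of a two-byte sequence. */
      }
      buf[out++] = (char) (0xC0 | (*in >> 6));
      buf[out++] = (char) (0x80 | (*in & 0x3F));
    }
  }
  buf[out] = '\0';
  return buf;
}

/* Two-argument form: b must be a real array (not a pointer), so that
 * sizeof(b) is the size of the buffer. */
#define L_(t, b) demo_latin1_to_utf8_buffer((t), (b), sizeof(b))
```

With one caller-owned buffer per use, two conversions can be live at once, for example L_(first, buf1) and L_(second, buf2) as arguments to the same printf-style call. A single static buffer inside the function would make both arguments point at the same storage, which is the hard-to-trace failure the message warns about.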
[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028

[book - Sun Jan 20 23:12:18 2008]:

> [wsimpson - Sun Jan 20 11:27:57 2008]:
>
>> Madeline Book wrote:
>>> I can reproduce the garbled text and gtk warnings with the released 2.1.1 and branch S2_1. As other occurrences of special characters (e.g. accented vowels) in translated messages don't show up as ?, I am led to believe it is a problem with only a few translated strings (or maybe just the one mentioned in my initial report).
>>
>> Splendid! We have about 18 hours before release of 2.1.3. If you submit your patch to po/fr.po, I'll be happy to check it in.
>
> As much as I would like to do that, after looking for those strings (I found a few more that gtk doesn't like) in po/fr.po I failed to find them. In fact they are from libc.mo (e.g. in /usr/share/locale/fr/LC_MESSAGES/ on my system).
>
> So now I am a bit confused. Obviously it cannot be the fault of the freeciv French translators anymore (sorry :)). Is it perhaps that I don't have utf8 French localization installed on my system? Or perhaps freeciv is not correctly initializing nls to return translated text from libc as utf8?
>
> I will investigate this problem some more (though I have no experience at all in properly building nls-enabled applications); hopefully someone will speak up about the trivially obvious solution. ;)

Oh wow, it really was trivial! But certainly not obvious for the likes of me :(. Once I ran the client with LANG=fr_FR.UTF-8 the libc messages displayed correctly and gtk/pango was very happy.

This then was a non-issue from the start. Anyway it is resolved now, as far as I can see.
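To see why LANG=fr_FR.UTF-8 behaves differently from LANG=fr_FR, a program can ask libc which codeset the current locale uses. The sketch below assumes a POSIX system; demo_local_codeset and demo_codeset_is_utf8 are hypothetical helper names and not part of freeciv.

```c
#include <assert.h>
#include <langinfo.h>
#include <locale.h>
#include <strings.h>

/* Returns nonzero if a codeset name, as reported by
 * nl_langinfo(CODESET), denotes UTF-8. Both spellings are accepted
 * because the name differs between systems. */
static int demo_codeset_is_utf8(const char *codeset)
{
  assert(codeset != NULL);
  return strcasecmp(codeset, "UTF-8") == 0
         || strcasecmp(codeset, "UTF8") == 0;
}

/* Adopts the locale requested by the environment (LANG, LC_ALL, ...)
 * and returns the name of the codeset libc will use for its own
 * translated messages, such as strerror() text. */
static const char *demo_local_codeset(void)
{
  setlocale(LC_ALL, "");
  return nl_langinfo(CODESET);
}
```

On a typical glibc system one would expect demo_local_codeset() to report ISO-8859-1 under LANG=fr_FR and UTF-8 under LANG=fr_FR.UTF-8, though the exact names depend on the installed locales. When demo_codeset_is_utf8() is false, libc strings need conversion before being handed to a UTF-8 GUI toolkit.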
[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028

[wsimpson - Sun Jan 20 11:27:57 2008]:

> Madeline Book wrote:
>> I can reproduce the garbled text and gtk warnings with the released 2.1.1 and branch S2_1. As other occurrences of special characters (e.g. accented vowels) in translated messages don't show up as ?, I am led to believe it is a problem with only a few translated strings (or maybe just the one mentioned in my initial report).
>
> Splendid! We have about 18 hours before release of 2.1.3. If you submit your patch to po/fr.po, I'll be happy to check it in.

As much as I would like to do that, after looking for those strings (I found a few more that gtk doesn't like) in po/fr.po I failed to find them. In fact they are from libc.mo (e.g. in /usr/share/locale/fr/LC_MESSAGES/ on my system).

So now I am a bit confused. Obviously it cannot be the fault of the freeciv French translators anymore (sorry :)). Is it perhaps that I don't have utf8 French localization installed on my system? Or perhaps freeciv is not correctly initializing nls to return translated text from libc as utf8?

I will investigate this problem some more (though I have no experience at all in properly building nls-enabled applications); hopefully someone will speak up about the trivially obvious solution. ;)
[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028

Some translated strings show garbage characters when the client (branch S2_2) runs under LANG=fr_FR. In particular, if you try to connect to a nonexistent server (e.g. 192.168.77.77), then after the failure the text in the network status bar will contain the string Aucun chemin d'acc?s pour..., i.e. accented characters appear as '?' and gtk/pango complains about invalid utf8.

I am guessing that it is a small mistake on the part of the French translation maintainer(s), and could be easily rectified by using only utf8 in translations. But I am not very knowledgeable as far as the native language support system in freeciv is concerned; perhaps you can help me to understand it better.
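The "invalid utf8" complaint can be reproduced without GTK. The sketch below is a simplified structural check, not pango's or GLib's implementation: demo_is_valid_utf8 is a hypothetical helper that only verifies lead and continuation byte patterns, whereas GLib's g_utf8_validate() is stricter (it also rejects overlong forms, for example). It is enough to show that the word "accès" encoded in ISO-8859-1 is not valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified check: returns 1 if every byte of s fits the UTF-8
 * lead/continuation pattern, 0 otherwise. Overlong encodings and
 * surrogate code points are NOT rejected here. */
static int demo_is_valid_utf8(const char *s)
{
  const unsigned char *p = (const unsigned char *) s;

  assert(s != NULL);

  while (*p != '\0') {
    size_t extra;

    if (*p < 0x80) {
      extra = 0;              /* Plain ASCII byte. */
    } else if ((*p & 0xE0) == 0xC0) {
      extra = 1;              /* Lead byte of a 2-byte sequence. */
    } else if ((*p & 0xF0) == 0xE0) {
      extra = 2;              /* Lead byte of a 3-byte sequence. */
    } else if ((*p & 0xF8) == 0xF0) {
      extra = 3;              /* Lead byte of a 4-byte sequence. */
    } else {
      return 0;               /* Stray continuation or invalid byte. */
    }
    p++;
    for (; extra > 0; extra--, p++) {
      if ((*p & 0xC0) != 0x80) {
        return 0;             /* Missing continuation byte. */
      }
    }
  }
  return 1;
}
```

In ISO-8859-1 the character è is the single byte 0xE8, which a UTF-8 decoder reads as the start of a three-byte sequence; the following ASCII 's' is not a continuation byte, so the string is rejected. The same word encoded in UTF-8 uses the bytes 0xC3 0xA8 and passes.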
[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR
URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028

[wsimpson - Sun Jan 20 02:34:24 2008]:

> Madeline Book wrote:
>> Some translated strings show garbage characters when the client (branch S2_2) runs under LANG=fr_FR.
>
> S2_2 does not have currently maintained translations. That begins next week. Please check again after the 2.1 po files are merged into 2.2.

I can reproduce the garbled text and gtk warnings with the released 2.1.1 and branch S2_1. As other occurrences of special characters (e.g. accented vowels) in translated messages don't show up as ?, I am led to believe it is a problem with only a few translated strings (or maybe just the one mentioned in my initial report).