Re: [Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR

2008-01-26 Thread William Allen Simpson

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

Jason Short wrote:
 Madeline Book wrote:
 Oh wow, it really was trivial! But certainly not obvious
 for the likes of me :(. Once I ran the client with LANG=
 fr_FR.UTF-8 the libc messages displayed correctly and gtk/
 pango was very happy.

 Nor to me.  Leaving open until this is documented on our web pages and in
 the distributed files.
 
 This is a bug in freeciv.
 
 Background: when I wrote the charset code, I divided freeciv into 3
 charsets.  ...

And where the heck is this documented?


 So in summary, if any bug reporters can track where the offending
 strings are coming from, fixing each incident is not too hard.  In the
 meantime this should be listed as a known bug.
 
I'm really tired of this laissez-faire attitude toward bug fixing.

It may be a known bug, but that means there should be an open ticket,
probably with a tracking ticket to gather the related incidents.

It is a serious bug.

And the .UTF-8 workaround needs to be clearly documented!  Everywhere!



___
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev


[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR

2008-01-26 Thread Madeline Book

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

 [jdorje - Sun Jan 27 06:35:40 2008]:
 Obviously. But this is not a bug that is going to be easy to verify as
 completely fixed.  Doing so requires scanning every string (not every
 translated string; every string) to check for inclusions of libc output.
 
  It is a serious bug.
 
 That depends on how many such strings there are.  I suspect there are
 very few, thus making it a non-fatal and rather rare bug.  In any case,
 we can quickly fix the most common places and make it such.

When I was trying to find the source of the strings in the (warclient)
code I noticed that most (if not all) of the translated system strings
were coming from the strerror wrappers (mystrerror, mystrsocketerror).
I presume if you want to intercept and modify the strings that would be
the best place to do it... but I have reservations about implementing
such a hackish fix (since it would go against what the user implicitly
or explicitly requested via the LANG environment variable).
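
As a rough illustration of intercepting at the strerror wrapper level, here
is a minimal sketch. It assumes a POSIX system with iconv and nl_langinfo;
utf8_strerror is a hypothetical name and is not freeciv's mystrerror. Note
that it only re-encodes the bytes: the message stays in the language that
LANG selected.

```c
#include <errno.h>
#include <iconv.h>
#include <langinfo.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper (not freeciv's mystrerror): return the libc message
 * for errnum, converted from the current locale's codeset to UTF-8.
 * If the conversion cannot be done, fall back to the unconverted message. */
const char *utf8_strerror(int errnum, char *buf, size_t buf_size)
{
  const char *msg = strerror(errnum);
  char *in = (char *) msg;      /* iconv() wants char **; input is not modified */
  char *out = buf;
  size_t in_left = strlen(msg);
  size_t out_left;
  size_t rc;
  iconv_t cd;

  if (buf_size == 0) {
    return msg;
  }
  out_left = buf_size - 1;      /* keep one byte for the terminator */

  cd = iconv_open("UTF-8", nl_langinfo(CODESET));
  if (cd == (iconv_t) -1) {
    return msg;
  }

  rc = iconv(cd, &in, &in_left, &out, &out_left);
  iconv_close(cd);
  if (rc == (size_t) -1) {
    return msg;
  }

  *out = '\0';
  return buf;
}
```

The caller is expected to have called setlocale(LC_ALL, "") earlier, as any
NLS-enabled program does; otherwise nl_langinfo reports the codeset of the
"C" locale.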
 
  And the .UTF-8 workaround needs to be clearly documented!  Everywhere!
 
 What workaround?

I think Mr. Simpson meant the LANG=fr_FR.UTF-8 workaround to get the
system strings in UTF-8, as mentioned previously in this thread.

Incidentally you can see some other workarounds in this thread
http://freeciv.freeforums.org/viewtopic.php?t=156, though in
retrospect it looks to me more like the blind leading the blind. :(

Veering slightly off-topic, but can someone more knowledgeable please
explain to me how it is possible that a cast (i.e. to const gchar *
from const char *) could possibly change the encoding of a string?
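
On the cast question: a pointer cast in C only changes the type the compiler
sees, never the bytes behind the pointer, so by itself it cannot change a
string's encoding. A small sketch to check this; the local gchar typedef
mirrors GLib's (gchar is a typedef for char) so that no GLib headers are
needed, and cast_preserves_bytes is a made-up name.

```c
#include <string.h>

typedef char gchar;   /* same definition as GLib's, repeated to stay standalone */

/* Returns 1 if the pointer obtained by the cast refers to the same address
 * and therefore to exactly the same bytes as the original. */
int cast_preserves_bytes(const char *s)
{
  const gchar *g = (const gchar *) s;

  return (const void *) g == (const void *) s
         && memcmp(g, s, strlen(s) + 1) == 0;
}
```

So an ISO-8859-1 string handed to gtk through such a cast is still
ISO-8859-1, which is why pango reports invalid utf8.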




[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR

2008-01-26 Thread Jason Short

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

 [wsimpson - Sat Jan 26 10:38:42 2008]:
 
 Jason Short wrote:
  Madeline Book wrote:
  Oh wow, it really was trivial! But certainly not obvious
  for the likes of me :(. Once I ran the client with LANG=
  fr_FR.UTF-8 the libc messages displayed correctly and gtk/
  pango was very happy.
 
  Nor to me.  Leaving open until this is documented on our web pages and
  in the distributed files.
  
  This is a bug in freeciv.
  
  Background: when I wrote the charset code, I divided freeciv into 3
  charsets.  ...
 
 And where the heck is this documented?

In this and other RT tickets, it seems.  Strange, I thought I had
written this up somewhere in the code, but I can find nothing.  Where
should it be documented?

  So in summary, if any bug reporters can track where the offending
  strings are coming from, fixing each incident is not too hard.  In the
  meantime this should be listed as a known bug.
  
 I'm really tired of this laissez-faire attitude toward bug fixing.

Huh?  I'm the one who pointed out it's a bug in the first place, remember?

 It may be a known bug, but that means there should be an open ticket,
 probably with a tracking ticket to gather the related incidents.

Obviously. But this is not a bug that is going to be easy to verify as
completely fixed.  Doing so requires scanning every string (not every
translated string; every string) to check for inclusions of libc output.

 It is a serious bug.

That depends on how many such strings there are.  I suspect there are
very few, thus making it a non-fatal and rather rare bug.  In any case,
we can quickly fix the most common places and make it such.

 And the .UTF-8 workaround needs to be clearly documented!  Everywhere!

What workaround?

-jason




[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR

2008-01-25 Thread Jason Short

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

 [wsimpson - Mon Jan 21 09:51:15 2008]:
 
 Madeline Book wrote:
  Oh wow, it really was trivial! But certainly not obvious
  for the likes of me :(. Once I ran the client with LANG=
  fr_FR.UTF-8 the libc messages displayed correctly and gtk/
  pango was very happy.
  
 Nor to me.  Leaving open until this is documented on our web pages and in
 the distributed files.

This is a bug in freeciv.

Background: when I wrote the charset code, I divided freeciv into 3
charsets.  The local charset is the one supported on the command line:
this one is not under our control, and all output to the command line
must be converted into it.  The internal charset is the one used
internally within freeciv: this is always utf-8 at the server but can be
configured by the GUI at the client side (GUI writing is a lot easier if
your charset is the same as your GUI library's).  Meanwhile the data
charset is the one used in all data files and network transactions and
is utf-8.

Now, the relevant point here is that while freeciv strings as translated
by _() go directly into the internal encoding (see
bind_textdomain_codeset in fciconv.c), anything returned by a library is
going to go into whatever encoding that library uses.  In the case of
libc, this is the local encoding, and thus any translatable strings
returned by libc need to be converted before they can be used.  This
can't really be changed; although we could hack the local encoding to
switch to UTF-8 or just change the libc domain to return strings in that
encoding, this would be poorly portable and would also cause stuff
printed to the command line directly by libc to be in the wrong encoding.
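
To make the conversion step concrete, here is a minimal sketch using POSIX
iconv directly. to_utf8 is a hypothetical helper, not one of the fciconv
functions, and it assumes the internal encoding is UTF-8. The source codeset
is passed in explicitly; in practice it would typically come from
nl_langinfo(CODESET) after setlocale(LC_ALL, "").

```c
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: convert the NUL-terminated string src, encoded in
 * from_codeset, into UTF-8 in dst. Returns 0 on success and -1 on failure
 * (unknown codeset, invalid input, or dst too small). */
int to_utf8(const char *from_codeset, const char *src,
            char *dst, size_t dst_size)
{
  char *in = (char *) src;      /* iconv() wants char **; input is not modified */
  char *out = dst;
  size_t in_left = strlen(src);
  size_t out_left;
  size_t rc;
  iconv_t cd;

  if (dst_size == 0) {
    return -1;
  }
  out_left = dst_size - 1;      /* keep one byte for the terminator */

  cd = iconv_open("UTF-8", from_codeset);
  if (cd == (iconv_t) -1) {
    return -1;
  }

  rc = iconv(cd, &in, &in_left, &out, &out_left);
  iconv_close(cd);
  if (rc == (size_t) -1) {
    dst[0] = '\0';
    return -1;
  }

  *out = '\0';
  return 0;
}
```

With an ISO-8859-1 source, the single byte 0xE8 (e with grave accent) becomes
the two UTF-8 bytes 0xC3 0xA8, which is what gtk/pango expects.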

The functions to do the conversion exist in fciconv.h and are the same
ones used for reading data from the command line:
local_to_internal_string_malloc and local_to_internal_string_buffer.  It
would be nice to have a shortened form of these (like L_()), as has been
discussed before, but since iconv requires a buffer into which to stick
results this buffer must be either provided or allocated.  Declaring the
buffer statically is tempting but would lead to hard-to-trace bugs when
L_() was used twice at once; some sort of garbage collection or
buffer-rotating scheme might also come to mind for this.  Perhaps

#define L_(t,b) local_to_internal_string_buffer((t),b,sizeof(b))

might be the best way to go.  Of course the real work lies in finding
all the places where L_ should be used...which will require either a
full audit or just fixing it as bugs are reported.  The latter would be
the easiest place to start as we already have a concrete bug report of a
few and it's not likely too many libc-translated strings are being used
internally within freeciv.
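
A minimal sketch of how the two-argument L_() would be used, next to the
static-buffer variant it is meant to avoid. The
local_to_internal_string_buffer below is only a stand-in stub that copies
its input so the example compiles on its own; the real one lives in
fciconv. L_static and the two test functions are made-up names.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in stub for the function declared in fciconv.h: it just copies
 * (truncating) instead of converting. Argument order follows the macro. */
char *local_to_internal_string_buffer(const char *text,
                                      char *buf, size_t bufsz)
{
  if (bufsz == 0) {
    return buf;
  }
  strncpy(buf, text, bufsz - 1);
  buf[bufsz - 1] = '\0';
  return buf;
}

#define L_(t, b) local_to_internal_string_buffer((t), (b), sizeof(b))

/* Hypothetical one-argument form with a static buffer, shown only to
 * illustrate the problem described above. */
const char *L_static(const char *t)
{
  static char buf[64];

  return local_to_internal_string_buffer(t, buf, sizeof(buf));
}

/* Returns 1: the second call overwrote the first result. */
int static_buffer_aliases(void)
{
  const char *first = L_static("first");
  const char *second = L_static("second");

  return first == second && strcmp(first, "second") == 0;
}

/* Returns 1: with caller-provided buffers both results stay valid. */
int separate_buffers_survive(void)
{
  char a[64], b[64];
  const char *first = L_("first", a);
  const char *second = L_("second", b);

  return strcmp(first, "first") == 0 && strcmp(second, "second") == 0;
}
```

One caveat with the macro: because it uses sizeof(b), the buffer argument
must be a real array. If b is a char * the macro silently passes the size
of the pointer instead.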

So in summary, if any bug reporters can track where the offending
strings are coming from, fixing each incident is not too hard.  In the
meantime this should be listed as a known bug.

-jason




[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR

2008-01-20 Thread Madeline Book

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

 [book - Sun Jan 20 23:12:18 2008]:
 
  [wsimpson - Sun Jan 20 11:27:57 2008]:
  
  Madeline Book wrote:
   I can reproduce the garbled text and gtk warnings with the released
   2.1.1 and branch S2_1. As other occurrences of special characters
   (e.g. accented vowels) in translated messages don't show up as ?,
   I am led to believe it is a problem with only a few translated strings
   (or maybe just the one mentioned in my initial report).
   
  Splendid!  We have about 18 hours before release of 2.1.3.  If you submit
  your patch to po/fr.po, I'll be happy to check it in
  
 As much as I would like to do that, after looking for
 those strings (I found a few more that gtk doesn't like)
 in po/fr.po I failed to find them. In fact they are from
 libc.mo (e.g. in /usr/share/locale/fr/LC_MESSAGES/ on my
 system). So now I am a bit confused. Obviously it cannot
 be the fault of the freeciv French translators anymore
 (sorry :)), is it perhaps that I don't have utf8 French
 localization installed on my system? Or perhaps freeciv
 is not correctly initializing nls to return translated
 text from libc as utf8? I will investigate this problem
 some more (though I have no experience at all in properly
 building nls-enabled applications); hopefully someone
 will speak up about the trivially obvious solution. ;)
 
Oh wow, it really was trivial! But certainly not obvious
for the likes of me :(. Once I ran the client with LANG=
fr_FR.UTF-8 the libc messages displayed correctly and gtk/
pango was very happy.
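
For anyone wanting to see why the suffix matters, here is a small sketch
that reports the character encoding a given locale name selects.
codeset_for is a hypothetical helper and assumes POSIX setlocale and
nl_langinfo. On glibc systems fr_FR typically reports ISO-8859-1 and
fr_FR.UTF-8 reports UTF-8, provided those locales are installed.

```c
#include <langinfo.h>
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: switch the process to locale_name (the empty string
 * means "take it from LANG/LC_* in the environment") and return the name of
 * its character encoding, or NULL if that locale is not available. */
const char *codeset_for(const char *locale_name)
{
  if (setlocale(LC_ALL, locale_name) == NULL) {
    return NULL;
  }
  return nl_langinfo(CODESET);
}
```

Calling codeset_for("") at startup and printing the result shows the
encoding in which libc will hand back its translated messages.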

This then was a non-issue from the start. Anyway it is
resolved now, as far as I can see.



[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR

2008-01-20 Thread Madeline Book

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

 [wsimpson - Sun Jan 20 11:27:57 2008]:
 
 Madeline Book wrote:
  I can reproduce the garbled text and gtk warnings with the released
  2.1.1 and branch S2_1. As other occurrences of special characters
  (e.g. accented vowels) in translated messages don't show up as ?,
  I am led to believe it is a problem with only a few translated strings
  (or maybe just the one mentioned in my initial report).
  
 Splendid!  We have about 18 hours before release of 2.1.3.  If you submit
 your patch to po/fr.po, I'll be happy to check it in
 
As much as I would like to do that, after looking for
those strings (I found a few more that gtk doesn't like)
in po/fr.po I failed to find them. In fact they are from
libc.mo (e.g. in /usr/share/locale/fr/LC_MESSAGES/ on my
system). So now I am a bit confused. Obviously it cannot
be the fault of the freeciv French translators anymore
(sorry :)), is it perhaps that I don't have utf8 French
localization installed on my system? Or perhaps freeciv
is not correctly initializing nls to return translated
text from libc as utf8? I will investigate this problem
some more (though I have no experience at all in properly
building nls-enabled applications); hopefully someone
will speak up about the trivially obvious solution. ;)




[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR

2008-01-19 Thread Madeline Book

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

Some translated strings show garbage characters when the client (branch
S2_2) runs under LANG=fr_FR. In particular, if you try to connect to a
nonexistent server (e.g. 192.168.77.77), then after the failure the text in
the network status bar will contain the string Aucun chemin d'acc?s pour...,
i.e. accented characters appear as '?' and gtk/pango complains about invalid
utf8. I am guessing that it is a small mistake on the part of the French
translation maintainer(s), and could be easily rectified by using only utf8
in translations. But I am not very knowledgeable as far as the native
language support system in freeciv is concerned; perhaps you can help me to
understand it better.





[Freeciv-Dev] (PR#40028) gtk/pango invalid utf8 warning for LANG=fr_FR

2008-01-19 Thread Madeline Book

URL: http://bugs.freeciv.org/Ticket/Display.html?id=40028 

 [wsimpson - Sun Jan 20 02:34:24 2008]:
 
 Madeline Book wrote:
  Some translated strings show garbage characters when the client (branch
  S2_2) runs under LANG=fr_FR.
 
 S2_2 does not have currently maintained translations.  That begins
 next week.
 Please check again after the 2.1 po files are merged into 2.2

I can reproduce the garbled text and gtk warnings with the released
2.1.1 and branch S2_1. As other occurrences of special characters
(e.g. accented vowels) in translated messages don't show up as ?,
I am led to believe it is a problem with only a few translated strings
(or maybe just the one mentioned in my initial report).
