Follow-up Comment #10, bug #15377 (project freeciv):

Actually I came to the exact same conclusion, Ulrik: we should simply rewrite
the character functions to work only on ASCII characters (range 0 to 127)
and leave all others alone.  Using the system-provided functions (such as
tolower()), which work in the current locale, makes no sense and never will.
This means some UTF-8 characters won't get properly converted, which
will cause minor bugs (for instance, it won't do proper case comparison on
non-ASCII letters in player/ruler names), but it will remove a much larger
set of bugs (for instance, it may currently do wrong case comparison on those
same non-ASCII letters).
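
To make the proposal concrete, here is a minimal sketch of what ASCII-only
replacements could look like.  The names ascii_tolower(), ascii_isspace() and
ascii_strcasecmp() are hypothetical, not existing freeciv functions, and the
code assumes an ASCII-compatible execution character set.

```c
/* Hypothetical helper: lower-case only ASCII 'A'..'Z'.  Every other byte,
 * including UTF-8 lead and continuation bytes (0x80..0xFF), is returned
 * unchanged.  The result does not depend on the current locale. */
static char ascii_tolower(char c)
{
  return (c >= 'A' && c <= 'Z') ? (char)(c - 'A' + 'a') : c;
}

/* Hypothetical helper: true only for the six ASCII whitespace characters
 * (space, \t, \n, \v, \f, \r).  Assumes ASCII code values, where
 * \t..\r are the contiguous range 9..13. */
static int ascii_isspace(char c)
{
  return c == ' ' || (c >= '\t' && c <= '\r');
}

/* Hypothetical helper: case-insensitive comparison that folds case for
 * ASCII letters only; all other bytes are compared by value. */
static int ascii_strcasecmp(const char *a, const char *b)
{
  for (;; a++, b++) {
    unsigned char ca = (unsigned char)ascii_tolower(*a);
    unsigned char cb = (unsigned char)ascii_tolower(*b);

    if (ca != cb) {
      return ca < cb ? -1 : 1;
    }
    if (ca == '\0') {
      return 0;
    }
  }
}
```

With this sketch "Caesar" and "CAESAR" compare equal, while the UTF-8
encodings of upper-case and lower-case e-acute compare as different.  That is
the minor bug described above, but the answer is at least the same in every
locale.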

As for making or using a UTF-8 variant of these, it's a bit more complicated
than that.  You can't have isspace(), tolower() or isalpha() functions for
UTF-8 that go byte-by-byte, because a single character can span several
bytes.  This means the same job you were doing via byte iteration over the
string has to be done entirely differently, in every case.  Additionally, in
some of the places these functions are called they are given UTF-8, in some
ASCII, and in some they may be given Latin-1 or a different character set, so
again it all requires careful auditing of the callers, of which there are a
lot.  If we go that route I'd rather rewrite all of the freeciv core to use
UCS-2 or UCS-4 (fixed-width Unicode) strings and impose that on these
functions, and then we get type-checking out of it as well.  This wouldn't be
that hard, but then we have to convert all data files (in UTF-8) and all GUI
strings (also in UTF-8, for gtk2) on both input and output, everywhere, which
is a lot of lines of change.  It's probably overkill.
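
To illustrate why byte-by-byte iteration does not carry over to UTF-8, here
is a rough sketch of stepping through a string one code point at a time.  The
names utf8_decode() and utf8_length() are hypothetical, and the decoder is
deliberately simplified: it does not reject overlong encodings or surrogate
code points, which a real implementation would have to do.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one code point starting at s into *cp.
 * Returns the number of bytes consumed (1..4), or 0 if the sequence is
 * malformed or cut short by the string terminator. */
static size_t utf8_decode(const char *s, uint32_t *cp)
{
  const unsigned char *p = (const unsigned char *)s;
  size_t len, i;

  if (p[0] < 0x80) {                    /* plain ASCII, one byte */
    *cp = p[0];
    return 1;
  } else if ((p[0] & 0xE0) == 0xC0) {   /* 110xxxxx: two-byte sequence */
    *cp = p[0] & 0x1F;
    len = 2;
  } else if ((p[0] & 0xF0) == 0xE0) {   /* 1110xxxx: three-byte sequence */
    *cp = p[0] & 0x0F;
    len = 3;
  } else if ((p[0] & 0xF8) == 0xF0) {   /* 11110xxx: four-byte sequence */
    *cp = p[0] & 0x07;
    len = 4;
  } else {
    return 0;                           /* stray continuation or bad lead */
  }

  for (i = 1; i < len; i++) {
    if ((p[i] & 0xC0) != 0x80) {        /* also stops at a premature '\0' */
      return 0;
    }
    *cp = (*cp << 6) | (p[i] & 0x3F);
  }
  return len;
}

/* Hypothetical helper: count code points rather than bytes.
 * Returns (size_t)-1 if the string is not well-formed. */
static size_t utf8_length(const char *s)
{
  size_t n = 0;

  while (*s != '\0') {
    uint32_t cp;
    size_t used = utf8_decode(s, &cp);

    if (used == 0) {
      return (size_t)-1;
    }
    s += used;
    n++;
  }
  return n;
}
```

Every loop that currently does s++ and calls a character function on *s
would need to be restructured along these lines, and the character functions
themselves would have to take a code point rather than a char.  That is the
scale of the audit described above.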

    _______________________________________________________

Reply to this item at:

  <http://gna.org/bugs/?15377>

_______________________________________________
Freeciv-dev mailing list
Freeciv-dev@gna.org
https://mail.gna.org/listinfo/freeciv-dev
