On Wed, Jul 10, 2002 at 12:25:42AM +0200, Simo Sorce wrote:
> On Tue, 2002-07-09 at 22:59, Jeremy Allison wrote:
> > But Simo, I disagree about the internal rep. I think it
> > needs to be utf8 for Samba internal strings. We already
> > have to deal with mbcs issues - this doesn't make it any
> > worse.

> Have you thought how difficult is to effectively use utf8 strings?
> search/replace/uppercase/lowercase?
> it is very difficult to manipulate correctly utf8 strings without
> introducing errors. I already experimented working with ucs2 null
> terminated strings and it is way more easy and less prone to errors.
> a character is always 2 bytes long and a byte codification doesn't
> change meaning based on which place do it takes inside a string.
> And substituting/manipulating characters in a string do not change the
> string length with ucs2!

> Can you instead tell me what are benefits of using utf8?

Well, for starters, utf8 is forwards-compatible with the full current
Unicode spec, whereas UCS-2 is truncated at 16 bits (hence Apple's use of
UTF-16). The less work that has to be done to convert Samba from UCS-2 to
UCS-4 on-the-wire when the time comes, the better.

Steve Langasek
postmodern programmer
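
To make the 16-bit truncation point concrete, here is a minimal Python sketch (not Samba code; the helper name `describe` and the two sample characters are picked purely for illustration). It compares a character inside the 16-bit range with one outside it. The second cannot be stored in a single UCS-2 unit at all, needs a surrogate pair in UTF-16, and is an ordinary four-byte sequence in UTF-8.

```python
# Illustrative sketch only: `describe` is a hypothetical helper, and the
# sample characters are arbitrary choices, not taken from the thread.

def describe(ch):
    """Report how a single character is represented in each encoding."""
    cp = ord(ch)
    utf8 = ch.encode("utf-8")
    utf16 = ch.encode("utf-16-be")  # big-endian, no byte-order mark
    return {
        "code_point": cp,
        # UCS-2 is a fixed 16-bit encoding, so it stops at U+FFFF.
        "fits_ucs2": cp <= 0xFFFF,
        "utf8_bytes": len(utf8),
        # Each UTF-16 code unit is 2 bytes; 2 units means a surrogate pair.
        "utf16_units": len(utf16) // 2,
    }

bmp = describe("\u00e9")         # U+00E9, inside the 16-bit range
astral = describe("\U00010400")  # U+10400, outside the 16-bit range

for name, info in (("U+00E9", bmp), ("U+10400", astral)):
    print(name, info)
```

The same sketch also shows the cost raised in the quoted text: the UTF-8 length varies per character (2 bytes versus 4 here), so byte offsets and character offsets are not interchangeable.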