On Wed, Jul 10, 2002 at 12:25:42AM +0200, Simo Sorce wrote:
> On Tue, 2002-07-09 at 22:59, Jeremy Allison wrote:
> > But Simo, I disagree about the internal rep. I think it
> > needs to be utf8 for Samba internal strings. We already
> > have to deal with mbcs issues - this doesn't make it any
> > worse.

> Have you thought about how difficult it is to use utf8 strings
> effectively? Search/replace/uppercase/lowercase?
> It is very difficult to manipulate utf8 strings correctly without
> introducing errors. I have already experimented with null-terminated
> ucs2 strings, and it is much easier and less error-prone.
> A character is always 2 bytes long, and the meaning of a byte doesn't
> change depending on where it sits inside a string.
> And substituting/manipulating characters in a string does not change
> the string length with ucs2!

> Can you instead tell me what the benefits of using utf8 are?

Well, for starters, utf8 is forwards-compatible with the full current
Unicode spec, whereas UCS-2 is truncated at 16 bits (hence Apple's use
of UTF-16).  The less work that has to be done to convert Samba from
UCS-2 to UCS-4 on-the-wire when the time comes, the better.
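
To make the 16-bit limit concrete, here is a rough sketch in Python (not
Samba code; the helper names encoded_sizes and fits_in_ucs2 are made up
for this illustration). It compares one code point inside the 16-bit
range with one beyond it, assuming UCS-2 means exactly one 16-bit unit
per character:

```python
# Illustrative sketch only; helper names are hypothetical, not Samba APIs.

def encoded_sizes(ch):
    """Return the byte length of a single character in three encodings."""
    return {
        "utf8": len(ch.encode("utf-8")),
        "utf16": len(ch.encode("utf-16-le")),  # LE variant avoids a BOM
        "ucs4": len(ch.encode("utf-32-le")),   # UTF-32 is UCS-4 in practice
    }

def fits_in_ucs2(ch):
    """Assumes UCS-2 = one 16-bit unit per character, so U+0000..U+FFFF only."""
    return ord(ch) <= 0xFFFF

bmp = "\u00e9"          # U+00E9, inside the 16-bit range
astral = "\U00010400"   # U+10400, needs more than 16 bits

for ch in (bmp, astral):
    print(f"U+{ord(ch):04X}", encoded_sizes(ch),
          "fits in UCS-2:", fits_in_ucs2(ch))
```

The second character has no UCS-2 form at all: UTF-16 needs a surrogate
pair (4 bytes) for it, while utf8 just uses a longer sequence of the kind
code handling it already has to cope with.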

Steve Langasek
postmodern programmer

