Re: LSB status of sarge?

Michael Stone Wed, 25 Aug 2004 16:18:19 -0500

On Wed, Aug 25, 2004 at 03:45:24PM -0500, Jeff Licquia wrote:

> As I understood it, this was the major objection.  However, I've also
> heard that the patch works this way because of some performance
> considerations when handling unibyte with the multibyte code.


I've heard that too. Bottom line is that nobody who has to maintain the
code in question wants to maintain two code paths with logic that should
be identical except in character width.

> It's my understanding that a proper multibyte implementation still uses
> fixed-width characters, just wider.  Specifically, most people told me
> that it's futile to use UTF-8 Unicode internally; instead, UTF-8 input
> should be converted to UCS-2 for internal use and then manipulated as
> multibyte.


Interesting, since UCS-2 doesn't cover the whole Unicode space. I assume
the patches are handling the necessary mapping?
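
To make the coverage point concrete, here is a minimal sketch (not taken
from the patch under discussion) showing which code points fit in a single
16-bit unit and which need a UTF-16 surrogate pair. The helper names
fits_in_ucs2 and utf16_units are made up for illustration.

```python
def fits_in_ucs2(ch: str) -> bool:
    """Return True if the single code point fits in one 16-bit unit."""
    return ord(ch) <= 0xFFFF


def utf16_units(text: str) -> int:
    """Return how many 16-bit code units UTF-16 needs to store text."""
    return len(text.encode("utf-16-le")) // 2


# Illustrative samples: ASCII, Latin-1, a BMP symbol, and a code point
# outside the BMP (U+1D11E, MUSICAL SYMBOL G CLEF).
samples = ["A", "\u00e9", "\u20ac", "\U0001D11E"]

for ch in samples:
    print(
        f"U+{ord(ch):04X}",
        f"utf8_bytes={len(ch.encode('utf-8'))}",
        f"utf16_units={utf16_units(ch)}",
        f"fits_in_ucs2={fits_in_ucs2(ch)}",
    )
```

Anything above U+FFFF fails the check, so a strictly fixed-width 16-bit
representation would have to either reject such input or map it somehow,
which is the mapping asked about above.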

> Obviously, the question is: what to do with UTF-8 in external files?


Not just files, consider command line arguments.
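
As a rough illustration of why arguments are the same problem as files:
they arrive as raw bytes and have to be decoded before any per-character
logic runs. The sketch below assumes UTF-8 input and uses Python's
"surrogateescape" error handler so that undecodable bytes survive a round
trip; decode_argument and encode_argument are hypothetical helper names,
not anything from the patch.

```python
def decode_argument(raw: bytes, encoding: str = "utf-8") -> str:
    """Decode one raw argument; invalid bytes are kept as lone surrogates."""
    return raw.decode(encoding, errors="surrogateescape")


def encode_argument(text: str, encoding: str = "utf-8") -> bytes:
    """Inverse of decode_argument: restore the original byte string."""
    return text.encode(encoding, errors="surrogateescape")


# A valid UTF-8 argument: byte length and character count differ.
raw_ok = "na\u00efve.txt".encode("utf-8")
text_ok = decode_argument(raw_ok)
print(len(raw_ok), len(text_ok))

# An argument that is not valid UTF-8 (a Latin-1 encoded e-acute).
raw_bad = b"caf\xe9"
text_bad = decode_argument(raw_bad)
print(encode_argument(text_bad) == raw_bad)
```

The byte count and the character count disagree for the first argument,
so any width or field arithmetic has to be done after decoding.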

> It may not be "the" patch, but it is "a" patch, and the lack of any
> other makes it "the" patch by default.  Certainly the other
> distributions have been taking that approach.


It may be the patch by default, but that doesn't mean it will be
included. Each distribution has to decide which ugly hacks it is willing
to support. This isn't the first time that the choices have diverged.

Mike Stone
