On 10/17/05, Peter da Silva <pe...@taronga.com> wrote:
> > Software configured to use anything other than UTF-8 (or, at the
> > very least another UTF-*) is hateful. I can't wait until
> > single-byte charsets are a thing of the past.

> Variable width character sets are themselves hateful.  I'll go further
> and say that they are a spectacularly stupid idea, and that whoever
> decreed them needs shootin'.  Yes, being able to represent more than
> 220-odd characters is a Good Idea.  So damnit, just use 32-bit - or
> 64-bit - characters.  Then you'll be able to seek!

Agreed. Don't piddle around with UCS-2, go straight to UCS-4.
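
For anyone following along, the seeking argument can be sketched roughly as below. This is only an illustration in Python, not anything from the posts above: the helper names char_at_utf32 and char_at_utf8 are made up, and "character" here means a single code point, which is an assumption that ignores combining sequences.

```python
def char_at_utf32(buf: bytes, i: int) -> str:
    """Fixed width: code point i occupies bytes [4*i, 4*i + 4)."""
    if i < 0 or 4 * i + 4 > len(buf):
        raise IndexError(i)
    return buf[4 * i : 4 * i + 4].decode("utf-32-le")


def char_at_utf8(buf: bytes, i: int) -> str:
    """Variable width: scan from the start, counting lead bytes."""
    count = -1
    start = None
    for pos, b in enumerate(buf):
        if b & 0xC0 != 0x80:  # not a continuation byte (10xxxxxx)
            count += 1
            if count == i:
                start = pos
            elif count == i + 1:
                return buf[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(i)
    return buf[start:].decode("utf-8")


text = "na\u00efve caf\u00e9 \u65e5\u672c\u8a9e"
utf32 = text.encode("utf-32-le")  # no BOM, 4 bytes per code point
utf8 = text.encode("utf-8")       # 1 to 4 bytes per code point

for i, expected in enumerate(text):
    assert char_at_utf32(utf32, i) == expected
    assert char_at_utf8(utf8, i) == expected
```

The fixed-width lookup is a single slice, while the UTF-8 lookup has to walk the buffer. The price is four bytes for every code point, ASCII included.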

In fact ISO-10646 is way too conservative for my tastes. I think each
distinct set of case transformation rules should have four 16-bit planes
allocated to it, so that truly internationalised characters will be able
to reliably toggle case when c&0x10000000 by flipping c&0x20000000. The
current set of wishy-washy unified characters with c&0x10000000 == 0 should
be left to rot like the hateful legacy things that they are.

What do you mean by "reliably toggle case"? My understanding is that
while Western European alphabets tend to have two cases, many
languages have more than two. Thus I believe it is actually
meaningless to "toggle case" in an internationalized context.
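
One way to see the problem, as a rough sketch in Python rather than a definitive treatment: the helper names toggle_ascii and round_trips are invented for this example, and the results depend on the Unicode case mappings built into Python 3's str methods.

```python
def toggle_ascii(c: str) -> str:
    """Flip case by toggling bit 0x20; only valid for ASCII letters."""
    if c.isascii() and c.isalpha():
        return chr(ord(c) ^ 0x20)
    return c


def round_trips(c: str) -> bool:
    """True if upper-casing and then lower-casing gives c back."""
    return c.upper().lower() == c


# ASCII really does toggle with a single bit.
assert toggle_ascii("a") == "A" and toggle_ascii("A") == "a"
assert round_trips("a")

# German sharp s upper-cases to two letters, so there is nothing to flip.
assert "\u00df".upper() == "SS"
assert not round_trips("\u00df")

# U+01C5 is a titlecase digraph: a third form between upper and lower.
assert "\u01c5".upper() == "\u01c4"
assert "\u01c5".lower() == "\u01c6"
assert "\u01c6".title() == "\u01c5"

# Dotless i upper-cases to plain I, which lower-cases to dotted i.
assert "\u0131".upper() == "I"
assert not round_trips("\u0131")
```

A single toggle bit assumes every letter has exactly one partner of the other case, and each of the non-ASCII samples above breaks that assumption in a different way.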

Anyway, if you look at alphabets and languages as software they are
all hateful. Unfortunately sacrificing backwards compatibility to
resolve the problems in this sphere is completely nonviable. :-)

--
perl -Mre=debug -e "/just|another|perl|hacker/"
