Re: [OT?] QBCS

2003-09-01 Thread Tex Texin
Doug,

In most industry usages, MBCS refers to variable width encodings, not fixed
width.

tex

Doug Ewell wrote:

> Paradoxically (at least to me), the term "multi-byte character set"
> refers to a fixed-width encoding, such as UCS-2.  The official name of
> ISO/IEC 10646 is "Universal Multiple-Octet Coded Character Set."

-- 
-
Tex Texin   cell: +1 781 789 1898   mailto:[EMAIL PROTECTED]
Xen Master  http://www.i18nGuy.com
 
XenCraft    http://www.XenCraft.com
Making e-Business Work Around the World
-



RE: [OT?] QBCS

2003-08-29 Thread Marco Cimarosti
Doug Ewell wrote:
> [...]
> (BTW, pet peeve:  The word "acronym" should only be used to mean a
> pronounceable WORD ("nym") formed from the initials of other words.
> Classic examples are "scuba" and "radar."  If you can figure 
> out how to pronounce "qbcs," more power to you, but to me it's just
> an abbreviation.)

Right, sorry.

(I can pronounce ['kubks], although I wouldn't do it in front of my managers
and customers. :-)

Actually, I don't like this "QBCS" term and I'd rather avoid saying it
myself. But I wanted to be sure what other people mean when they use it.

> [...]
> > So what it really means must be "quadra-byte character
> > encoding", and both GB 18030 and UTF-32 should fit
> > into that category.
> 
> GB 18030, yes, because its code units vary from one to four bytes in
> length.  UTF-32, no, because its code units are uniformly 32 bits.

But UTF-8 fits the definition.
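
As a rough illustration of why UTF-8 fits: the lead byte of a well-formed
UTF-8 sequence fixes its length at anywhere from one to four bytes. A
minimal Python sketch follows. The function name utf8_sequence_length is
made up for this example, and it only inspects the lead byte; it does not
validate the trail bytes.

```python
def utf8_sequence_length(lead: int) -> int:
    """Return the length in bytes of the UTF-8 sequence starting with `lead`.

    Hypothetical helper for illustration only. It classifies the lead byte
    and makes no attempt to check the bytes that follow it.
    """
    if lead <= 0x7F:
        return 1              # ASCII range
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    # 0x80..0xBF are trail bytes; 0xC0, 0xC1, 0xF5..0xFF never occur.
    raise ValueError(f"0x{lead:02X} cannot start a UTF-8 sequence")


if __name__ == "__main__":
    for ch in ["A", "\u00e9", "\u4e2d", "\U00020000"]:
        encoded = ch.encode("utf-8")
        print(f"U+{ord(ch):04X}: {utf8_sequence_length(encoded[0])} byte(s)")
```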

_ Marco




Re: [OT?] QBCS

2003-08-29 Thread Doug Ewell
Lars Marius Garshol quoted Marco Cimarosti:

> | It seems that the IT world has a new acronym: "QBCS". I understand
> | that it stands for "quadra-byte character set", and I heard it used
> | to refer to GB 18030.
> |
> | My question is: is it just a fancy synonym for GB 18030 or can it also
> | refer to Unicode or other encodings?

The original term "DBCS," or "double-byte character set," refers to a
variable-width encoding where each character requires either one or two
bytes.  East Asian legacy character encodings fall into this category.

By extension, then, a "QBCS" would be a variable-width character
encoding where the code units can be anywhere from one to four bytes
long -- an apt description of GB 18030.
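
To make the DBCS/QBCS distinction concrete, here is a small Python sketch
that counts encoded bytes per character. It assumes Shift_JIS as a
representative DBCS (chosen here only as an example; it is not named
above) and relies on Python's built-in "shift_jis" and "gb18030" codecs.
byte_widths is a hypothetical helper name.

```python
def byte_widths(codec: str, chars: list[str]) -> dict[str, int]:
    """Map each sample character to its encoded length in bytes.

    Hypothetical helper for illustration; raises UnicodeEncodeError if
    the codec cannot represent one of the characters.
    """
    return {ch: len(ch.encode(codec)) for ch in chars}


# Shift_JIS as an example DBCS: one byte for ASCII, two for kana.
DBCS_SAMPLES = ["A", "\u3042"]                # LATIN A, HIRAGANA A
# GB 18030: one-, two-, and four-byte forms.
QBCS_SAMPLES = ["A", "\u4e2d", "\U00020000"]  # ASCII, BMP Han, plane 2 Han

if __name__ == "__main__":
    print(byte_widths("shift_jis", DBCS_SAMPLES))
    print(byte_widths("gb18030", QBCS_SAMPLES))
```

Note that GB 18030 has no three-byte form in this sample: the widths that
show up are one, two, and four.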

Paradoxically (at least to me), the term "multi-byte character set"
refers to a fixed-width encoding, such as UCS-2.  The official name of
ISO/IEC 10646 is "Universal Multiple-Octet Coded Character Set."

(BTW, pet peeve:  The word "acronym" should only be used to mean a
pronounceable WORD ("nym") formed from the initials of other words.
Classic examples are "scuba" and "radar."  If you can figure out how to
pronounce "qbcs," more power to you, but to me it's just an
abbreviation.)

> This must be an oxymoron, in the sense that character sets don't
> really have a byte width, being completely abstract assignments of
> abstract characters to abstract numbers.

This is technically true, but the terms SBCS and DBCS are so entrenched
in the industry that it doesn't seem useful to try to deprecate them
now.

> So what it really means must be "quadra-byte character encoding", and
> both GB 18030 and UTF-32 should fit into that category.

GB 18030, yes, because its code units vary from one to four bytes in
length.  UTF-32, no, because its code units are uniformly 32 bits.
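
The difference can be checked roughly in Python. This sketch assumes the
built-in "gb18030" and "utf-32-be" codecs and a hypothetical helper,
is_fixed_width. It only tests the sample characters it is given, so a
True result is evidence of fixed width, not proof.

```python
def is_fixed_width(codec: str, chars: list[str]) -> bool:
    """True if every sample character encodes to the same number of bytes.

    Hypothetical helper for illustration; it says nothing about
    characters outside `chars`.
    """
    lengths = {len(ch.encode(codec)) for ch in chars}
    return len(lengths) == 1


# ASCII letter, BMP Han character, supplementary-plane Han character.
SAMPLES = ["A", "\u4e2d", "\U00020000"]

if __name__ == "__main__":
    # "utf-32-be" is used rather than "utf-32" so that no byte order
    # mark is prepended to each encoded character.
    for codec in ("utf-32-be", "gb18030"):
        print(codec, is_fixed_width(codec, SAMPLES))
```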

-Doug Ewell
 Fullerton, California
 http://users.adelphia.net/~dewell/




Re: [OT?] QBCS

2003-08-28 Thread Lars Marius Garshol

* Marco Cimarosti
|
| It seems that the IT world has a new acronym: "QBCS". I understand
| that it stands for "quadra-byte character set", and I heard it used
| to refer to GB 18030.
| 
| My question is: is it just a fancy synonym for GB 18030 or can it also
| refer to Unicode or other encodings?

This must be an oxymoron, in the sense that character sets don't
really have a byte width, being completely abstract assignments of
abstract characters to abstract numbers.

So what it really means must be "quadra-byte character encoding", and
both GB 18030 and UTF-32 should fit into that category.

-- 
Lars Marius Garshol, Ontopian http://www.ontopia.net
GSM: +47 98 21 55 50  http://www.garshol.priv.no




[OT?] QBCS

2003-08-28 Thread Marco Cimarosti
It seems that the IT world has a new acronym: "QBCS". I understand that it
stands for "quadra-byte character set", and I heard it used to refer to GB
18030.

My question is: is it just a fancy synonym for GB 18030 or can it also refer
to Unicode or other encodings?

Thanks in advance.

_ Marco