Simon Law wrote:
<< In Oracle9i our next Database Release shipping this summer, we have introduced
support for two new Unicode character sets. ...>>
New character *sets* ???
> From: Carl W. Brown [mailto:[EMAIL PROTECTED]]
>
> I resisted calling it FTF-8 (Funky Transfer Format - 8), but if you want to
> call it Weird Transfer Format - 8, I don't have any real objections.
Well, that's ONE possible translation of "WTF"...
/|/|ike
wn'; Simon Law; [EMAIL PROTECTED]
Subject: RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and
email)
If you have this funny encoding please don't call it UTF8 because it is not
UTF8 and will only confuse users. You could call it OTF8 or something like
that but not UTF8.
If you have this funny encoding please don't call it UTF8 because it is not
UTF8 and will only confuse users. You could call it OTF8 or something like
that but not UTF8.
How about "WTF-8"?
Sorry - I couldn't resist.
/|/|ike
ut
not UTF8.
Carl
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Simon Law
Sent: Wednesday, May 30, 2001 11:02 AM
To: [EMAIL PROTECTED]
Subject: Re: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)
Hi Folks,
Over the last f
acters from all scripts
are
represented in 2 bytes.
Comments?
-Original Message-
From: Simon Law [mailto:[EMAIL PROTECTED]]
Sent: Wednesday, May 30, 2001 8:02 PM
To: [EMAIL PROTECTED]
Subject: Re: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)
Hi Folks,
O
someone emits the b
michka
- Original Message -
From: "Simon Law" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Wednesday, May 30, 2001 11:01 AM
Subject: Re: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)
> Hi Folks,
>
> Over the last
Simon,
Would you care to answer (officially) why exactly Oracle needs anything
to be done here? Per the spec, it is not illegal for a process to interpret
5/6-byte supplementary characters; it is only illegal to emit them. It seems
that Oracle and everyone else is well covered with the existi
8 and UTF-32 system that sort like
UTF-16 is folly.
Carl
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Simon Law
Sent: Wednesday, May 30, 2001 11:02 AM
To: [EMAIL PROTECTED]
Subject: Re: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in we
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
> According to the proposal, UTF-8S and UTF-32S would not have the same
> status: they wouldn't be for interchange; they'd just be for representation
> internal to a given system, like UTF-EBCDIC (which, I think I heard, has
> not actual
Tuesday, May 29, 2001 3:47 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and
email)
Carl,
> Ken,
>
> UTF-8s is essentially a way to ignore surrogate processing. It allows a
> company to encode UTF-16
From: "Carl W. Brown" <[EMAIL PROTECTED]>;
To: [EMAIL PROTECTED];
Cc:
Date: 01/05/30 0:46
Subject: RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and email)
>Ken,
>
>I suspect that Oracle is specifically pushing for this standard because
ay, May 29, 2001 3:47 PM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: RE: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and
email)
Carl,
> Ken,
>
> UTF-8s is essentially a way to ignore surrogate processing. It allows a
> company to encode UTF-16 wit
Carl,
> Ken,
>
> UTF-8s is essentially a way to ignore surrogate processing. It allows a
> company to encode UTF-16 with UCS-2 logic.
>
> The problem is that by not implementing surrogate support you can introduce
> subtle errors. For example it is common to break buffers apart into
> segment
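For illustration (not part of the original message): a minimal Python sketch of the buffer-splitting pitfall described above. The helper names `utf16_units`, `split_naive`, and `split_pair_aware` are hypothetical, and the sample string is chosen only to put a surrogate pair on a chunk boundary.

```python
import struct

def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    return list(struct.unpack(f">{len(data) // 2}H", data))

def split_naive(units, size):
    """UCS-2 style split: cut every `size` code units, ignoring surrogates."""
    return [units[i:i + size] for i in range(0, len(units), size)]

def split_pair_aware(units, size):
    """Split into chunks of at most `size` units without cutting a pair.

    Assumes size >= 2; with size == 1 a pair cannot be kept together.
    """
    chunks, start = [], 0
    while start < len(units):
        end = min(start + size, len(units))
        ends_on_high = 0xD800 <= units[end - 1] <= 0xDBFF
        # Back off one unit if the chunk would end on a high surrogate.
        if end < len(units) and ends_on_high and end - start > 1:
            end -= 1
        chunks.append(units[start:end])
        start = end
    return chunks

units = utf16_units("a\U00010000b")      # 0x61, 0xD800, 0xDC00, 0x62
naive = split_naive(units, 2)            # pair is cut between chunks
aware = split_pair_aware(units, 2)       # pair stays in one chunk
```

In the naive split the first chunk ends with a lone high surrogate and the second begins with a lone low surrogate, so neither chunk is well-formed UTF-16 on its own.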
.
Carl
-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
Behalf Of Kenneth Whistler
Sent: Tuesday, May 29, 2001 11:18 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and
email)
Doug
Antoine Leca wrote:
> Jianping Yang wrote:
> >
> > As a matter of fact, the surrogate or supplementary character was not defined
> > in the past,
>
> How long is "the past"? I remember reading about these surrogates the first
> time I put my hands on a draft copy of ISO 10646. It was nearly six
Doug wrote:
> UTF-8 and UTF-32 should absolutely not be similarly hacked to maintain some
> sort of bizarre "compatibility" with the binary sorting order of UTF-16.
> UTC should not, and almost certainly will not, endorse such a proposal on the
> part of the database vendors.
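For illustration (not part of the original message): a minimal Python sketch of the binary sorting difference at issue. The two sample code points and the helper name `binary_order` are chosen here for the example.

```python
# A BMP character above the surrogate range, and a supplementary character.
bmp = "\uff5e"         # U+FF5E
supp = "\U00010000"    # U+10000

def binary_order(a, b, encoding):
    """Compare two strings by the raw bytes of the given encoding."""
    ea, eb = a.encode(encoding), b.encode(encoding)
    if ea < eb:
        return "<"
    if ea > eb:
        return ">"
    return "="

# Big-endian forms are used so that byte order matches code unit order.
orders = {
    enc: binary_order(bmp, supp, enc)
    for enc in ("utf-8", "utf-16-be", "utf-32-be")
}
```

UTF-8 and UTF-32 place U+FF5E before U+10000, which is code point order. UTF-16 places U+10000 first, because its lead surrogate 0xD800 is numerically smaller than 0xFF5E.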
I would be l
On 05/27/2001 08:03:37 PM Jianping Yang wrote:
>>But it seems to me that we've lived without
>>Premise B in the past, and that it won't benefit us to adopt it now. Why
>>bother with it? Why not continue doing what we already know how to do?
>As a matter of fact, the surrogate or supplementary c
From: "Jianping Yang" <[EMAIL PROTECTED]>
> As a matter of fact, the surrogate or supplementary
> character was not defined in the past, so we could
> live without Premise B in the past. But now the
> supplementary character is defined and will soon be
> supported, we have to bother with it.
Poo
-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]On
Behalf Of [EMAIL PROTECTED]
Sent: Monday, May 28, 2001 3:30 AM
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: ISO vs Unicode UTF-8 (was RE: UTF-8 signature in web and
email)
In a message dated 2001-05-26 16:00:47 Pacific Day
In a message dated 2001-05-26 16:00:47 Pacific Daylight Time,
[EMAIL PROTECTED] writes:
> The issue is this: Unicode's three encoding forms don't sort in the same
> way when sorting is done using that most basic and
> valid-in-almost-no-locales-but-easy-and-quick approach of simply comparing
Jianping Yang wrote:
>
> As a matter of fact, the surrogate or supplementary character was not defined
> in the past,
How long is "the past"? I remember reading about these surrogates the first
time I put my hands on a draft copy of ISO 10646. It was nearly six years ago.
Or do you mean that it
★じゅういっちゃん★
EKYWY TXLY NPZ P MPVD XPHYV LPWWQY
NKT ZPN XT WYPZTX PE PMM ET HPWWD
"EYX EKTSZPXV'Z HTWY GSX
P XSHOYW EKPX TXY
PXV LTHHQEHYXE, ET HY, QZ RSQEY ZLPWD"
>>
>> >There was another abomination proposed. Oracle rather than adding UTF-16
>> >support proposed that non plan
I don't want to argue on this lengthy email, but only point out two facts:
>According to the proposal, UTF-8S and UTF-32S would not have the same
>status: they wouldn't be for interchange; they'd just be for representation
>internal to a given system, like UTF-EBCDIC (which, I think I heard, has
>not
>If you think something abominable is happening, please raise a loud voice
>and flood UTC members with e-mail and tell everyone what you think and why
>you think it. Nobody can hear you when you mumble.
>
>And it helps if you have solid technical and philosophical arguments to
>convey.
Well, I w
Some people said things like...
>There was another abomination proposed.
>I was choosing not to mention the abominable.
The abominable steam-rollers of history squish those who don't scream and
run; and the few weak survivors are forever cleaning up the resulting
messes.
If you think someth
Unicode UTF-8 (was RE: UTF-8 signature in web and
email)
On 05/25/2001 12:21:13 PM Carl W. Brown wrote:
>Peter,
>
>There was another abomination proposed.
I was choosing not to mention the abominable.
- Peter
It was not shot down entirely... in baseball terms, the umpire said "Foul
tip, strike two" (strike one was the last time). :-)
michka
- Original Message -
From: <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Friday, May 25, 2001 12:49 PM
Subject: RE: ISO vs Unic
Peter,
There was another abomination proposed. Oracle, rather than adding UTF-16
support, proposed that non-plane-0 characters be encoded to and from UTF-8 by
encoding each half of the surrogate pair as a separate UTF-8 character.
This way they could encode UTF-16 using the UCS-2 encoding into two 3
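For illustration (not part of the original message): a minimal Python sketch of the encoding described above, in which each surrogate of a pair is written as its own 3-byte sequence. The function name `encode_surrogates_separately` is hypothetical, and the sketch relies on Python's `surrogatepass` error handler to emit bytes that standard UTF-8 does not permit.

```python
def encode_surrogates_separately(text):
    """Encode text so each supplementary character becomes two 3-byte
    sequences, one per UTF-16 surrogate, instead of one 4-byte sequence.

    Assumes the input contains no lone surrogates of its own.
    """
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            v = cp - 0x10000
            high = 0xD800 + (v >> 10)      # lead surrogate
            low = 0xDC00 + (v & 0x3FF)     # trail surrogate
            for unit in (high, low):
                out += chr(unit).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8")
    return bytes(out)

standard = "\U00010000".encode("utf-8")                  # 4 bytes
pairwise = encode_surrogates_separately("\U00010000")    # 3 + 3 bytes
```

Here `standard` is the 4-byte form F0 90 80 80, while `pairwise` is the 6-byte form ED A0 80 ED B0 80. The 6-byte form sorts before the UTF-8 bytes of U+FF5E (EF BD 9E), matching UTF-16 binary order rather than code point order.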