Cherlin [mailto:[EMAIL PROTECTED]]
Sent: Saturday, July 29, 2000 2:33 PM
To: Unicode List
Subject: Re: FW: Oracle and Surrogate Pairs
At 2:41 AM -0800 7/25/2000, [EMAIL PROTECTED] wrote:
>Hi all,
> I have been developing/converting software to support multiple
>languages, especially
them.
>Thanks & Regards,
>Samir Mehrotra,
>i-flex Solutions Limited,
>a CitiCorp venture capital company
>at SEI-CMM level 5.
>[EMAIL PROTECTED]
>
>> -Original Message-
>> From: John H. Jenkins [SMTP:[EMAIL PROTECTED]]
>> Sent:
You could define a UTF that mapped scalar values below to the same as
UTF-8, and values above to a 6 byte value. It would *not* be UTF-8, but it
can be well defined.
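
A minimal sketch of the kind of transformation described above, in Python. The threshold of 0x10000 is an assumption (the message as archived does not show the boundary values), and the function names are hypothetical; this is not Oracle's implementation. Scalar values below the threshold are encoded exactly as UTF-8, and values at or above it are split into a UTF-16 surrogate pair with each surrogate written as a 3-byte sequence.

```python
def _three_bytes(unit: int) -> bytes:
    """Write one 16-bit value using the UTF-8 three-byte bit pattern."""
    return bytes([
        0xE0 | (unit >> 12),
        0x80 | ((unit >> 6) & 0x3F),
        0x80 | (unit & 0x3F),
    ])


def encode_scalar_6byte(cp: int) -> bytes:
    """Encode one Unicode scalar value (hypothetical 6-byte scheme)."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if cp < 0x10000:
        # Assumed threshold: below it, the output is identical to UTF-8.
        return chr(cp).encode("utf-8")
    # Above it: form the UTF-16 surrogate pair, then encode each half
    # as three bytes, giving six bytes in total.
    v = cp - 0x10000
    high = 0xD800 | (v >> 10)
    low = 0xDC00 | (v & 0x3FF)
    return _three_bytes(high) + _three_bytes(low)


# Same scalar value, two different byte sequences:
six = encode_scalar_6byte(0x10000)       # six bytes
four = chr(0x10000).encode("utf-8")      # four bytes in UTF-8 proper
```

The mapping is well defined and reversible, but its output for values above the threshold differs from UTF-8, which is why it would have to be treated as a separate encoding form.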
If you look below D29 -- p. 46 at the first full paragraph -- you find that for
round tripping, UTFs are required to map
Not bad at all for Oracle, as we have an explicit requirement from our
application vendors: they want to store a surrogate pair as 6 bytes in the
database so that they get the same semantics as UTF-16.
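
To illustrate what "the same semantics as UTF-16" can mean for binary sorting, here is a small Python sketch. It is only an illustration, with hypothetical helper names, and it builds the 6-byte form by encoding each UTF-16 code unit separately (using Python's "surrogatepass" error handler); it makes no claim about how Oracle does it. It compares a character from CJK Unified Ideographs Extension B with a BMP character that lies above the surrogate range.

```python
import struct


def utf16_units(s: str) -> list:
    """Return the UTF-16 code units of s as a list of integers."""
    data = s.encode("utf-16-be")
    return list(struct.unpack(">%dH" % (len(data) // 2), data))


def six_byte_form(s: str) -> bytes:
    """Encode each UTF-16 code unit on its own, so a surrogate pair
    becomes two 3-byte sequences (six bytes in total)."""
    return b"".join(
        chr(u).encode("utf-8", "surrogatepass") for u in utf16_units(s)
    )


supplementary = "\U00020000"  # first code point of CJK Extension B
bmp_high = "\uFF5E"           # a BMP character above the surrogate range

# Binary comparison under each representation.
utf16_says_less = utf16_units(supplementary) < utf16_units(bmp_high)
six_byte_says_less = six_byte_form(supplementary) < six_byte_form(bmp_high)
utf8_says_less = supplementary.encode("utf-8") < bmp_high.encode("utf-8")
```

Here the UTF-16 and 6-byte comparisons agree with each other, while the standard UTF-8 comparison gives the opposite answer, because UTF-8 byte order follows code point order rather than UTF-16 code unit order.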
As for conforming, I don't think there is any issue here for the database
client if UTF-16 is used fo
>As the Oracle UTF8 character set definition supports surrogates by pairs of two
>3-byte sequences, to stay in sync with UTF-16 in binary sorting and code point,
This is not a conformant representation.
D29 (p. 46) states that a UTF "transforms each Unicode scalar value into a
unique sequence of code values". A
SMTP:[EMAIL PROTECTED]]
> Sent: Tuesday, July 25, 2000 8:12 AM
> To: Unicode List
> Subject: Re: Oracle and Surrogate Pairs
>
> > Does the field in question need to support literally any possible character
> > in Unicode 3.0 and beyond (since 3.0 does not have any surro
> Does the field in question need to support literally any possible character
> in Unicode 3.0 and beyond (since 3.0 does not have any surrogates
> assigned!)?
>
>
True, but within a year or so, there *will* be surrogates assigned in Unicode. One
cannot be premature in supporting them at
supporting
CJK Unified Ideographs, Extension B (especially HKSCS) which will be in
next version of the Unicode standard?
Mikko
-Original Message-
From: Jianping Yang [mailto:[EMAIL PROTECTED]]
Sent: Monday, July 24, 2000 5:08 PM
To: Mikko Lahti
Cc: Unicode List
Subject: Re: Oracle and
Mikko,
As there is no character defined in the surrogate range in Unicode 3.0,
the maximum width of a character in the Oracle UTF8 character set is 3 bytes.
I recommend sizing the column in bytes at 3 times the number of characters
you intend to store in it.
Regards,
Jianping..
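
The sizing arithmetic discussed in this thread can be sketched as follows. This is only a back-of-the-envelope helper in Python with hypothetical names, not an Oracle API: the per-character worst cases are 3 bytes when only BMP characters are stored, 4 bytes for a supplementary character in standard UTF-8, and 6 bytes when a surrogate pair is stored as two 3-byte sequences.

```python
# Worst-case bytes per character under each assumption (labels are
# hypothetical, chosen for this sketch only).
WORST_CASE_BYTES = {
    "bmp_only": 3,           # no characters beyond U+FFFF are stored
    "utf8_4byte": 4,         # supplementary characters as 4-byte UTF-8
    "surrogates_3plus3": 6,  # surrogate pair stored as two 3-byte units
}


def column_bytes(num_chars: int, scheme: str = "bmp_only") -> int:
    """Return a worst-case byte length for a column holding num_chars."""
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return num_chars * WORST_CASE_BYTES[scheme]


# A 40-character field under each assumption.
sizes = {name: column_bytes(40, name) for name in WORST_CASE_BYTES}
```

For a 40-character field this gives 120, 160, and 240 bytes respectively, which is the difference between the "3 times" and "4 times" options raised in the original question, plus the 6-byte case discussed later in the thread.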
Mikko Lahti wrote:
What is the correct wa
st" <[EMAIL PROTECTED]>
Sent: Monday, July 24, 2000 4:28 PM
Subject: Oracle and Surrogate Pairs
> What is the correct way of supporting surrogate pairs in Oracle 8? Anything
> wrong with approach of making fields 3 times longer from ASCII or should
> fields be 4 times ASCI
Title: Oracle and Surrogate Pairs
What is the correct way of supporting surrogate pairs in Oracle 8? Anything wrong with approach of making fields 3 times longer from ASCII or should fields be 4 times ASCII as per UTF-8 spec?
Later,
Mikko
Globalization Specialist
Onyx Software
[EMAIL