RE: FW: Oracle and Surrogate Pairs

2000-07-29 Thread Michael Kung
Cherlin [mailto:[EMAIL PROTECTED]] Sent: Saturday, July 29, 2000 2:33 PM To: Unicode List Subject: Re: FW: Oracle and Surrogate Pairs At 2:41 AM -0800 7/25/2000, [EMAIL PROTECTED] wrote: >Hi all, > I have been developing/converting software to support multiple >languages, especially

Re: FW: Oracle and Surrogate Pairs

2000-07-29 Thread Edward Cherlin
them. >Thanks & Regards, >Samir Mehrotra, >i-flex Solutions Limited, >a CitiCorp venture capital company >at SEI-CMM level 5. >[EMAIL PROTECTED] > >> -Original Message- >> From: John H. Jenkins [SMTP:[EMAIL PROTECTED]] >> Sent:

Re: Oracle and Surrogate Pairs

2000-07-25 Thread Mark Davis
You could define a UTF that mapped scalar values below to the same as UTF-8, and values above to a 6 byte value. It would *not* be UTF-8, but it can be well defined. If you look below D29 -- p. 46 at the first full paragraph -- you find that for round tripping, UTFs are required to map
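To make the idea concrete, here is a minimal sketch of such a non-UTF-8 transformation format. It assumes the boundary is U+10000 (the snippet above does not state it) and that input strings are well formed; the function name `encode_six_byte` is hypothetical and is not an Oracle or Unicode API.

```python
def encode_six_byte(text: str) -> bytes:
    """Encode BMP characters as in UTF-8; encode each supplementary
    character as its two UTF-16 surrogates, 3 bytes each (6 bytes total).
    Illustration only: the result is NOT conformant UTF-8."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp < 0x10000:
            # Assumed boundary: below U+10000, identical to UTF-8.
            out += ch.encode("utf-8")
        else:
            # Split the scalar value into a UTF-16 surrogate pair.
            v = cp - 0x10000
            high = 0xD800 + (v >> 10)
            low = 0xDC00 + (v & 0x3FF)
            for unit in (high, low):
                # Apply the 3-byte UTF-8 bit pattern to each surrogate.
                out += bytes([
                    0xE0 | (unit >> 12),
                    0x80 | ((unit >> 6) & 0x3F),
                    0x80 | (unit & 0x3F),
                ])
    return bytes(out)


print(encode_six_byte("\U00010000").hex())    # eda080edb080
print("\U00010000".encode("utf-8").hex())     # f0908080
```

The two encodings agree for every BMP character and differ only above the assumed boundary, where this form uses 6 bytes and UTF-8 uses 4.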

Re: Oracle and Surrogate Pairs

2000-07-25 Thread Jianping Yang
Not bad at all for Oracle as we get exact requirement from our application vendors that they want to store the surrogate as 6 bytes in database so that they can have the same semantics as UTF-16. As for conforming, I don't think there is any issue here for the database client if UTF-16 is used fo

Re: Oracle and Surrogate Pairs

2000-07-25 Thread Peter_Constable
>As Oracle UTF8 character set definition supports surrogates by a pairs of two >3-bytes to be sync with UTF-16 in binary sorting and code point, This is not a conformant representation. D29 (p. 46) states that a UTF "transforms each Unicode scalar value into a unique sequence of code values". A
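The "sync with UTF-16 in binary sorting" point in the quoted text can be checked with a small sketch. It compares byte-wise order under UTF-8, UTF-16 (big-endian), and a surrogate-pair form built by encoding each UTF-16 code unit as 3 bytes. The helper name `code_unit_form` is hypothetical, and using Python's `surrogatepass` error handler to build that form is a convenience for illustration, not a description of how Oracle implements it.

```python
def code_unit_form(text: str) -> bytes:
    """Encode each UTF-16 code unit separately with the UTF-8 bit pattern,
    so a supplementary character becomes two 3-byte sequences."""
    raw = text.encode("utf-16-be")
    units = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]
    return b"".join(chr(u).encode("utf-8", "surrogatepass") for u in units)


bmp = "\uff5e"          # a BMP character above the surrogate range
supp = "\U00010000"     # the first supplementary character

# Conformant UTF-8 sorts by scalar value: BMP first.
print(bmp.encode("utf-8") < supp.encode("utf-8"))            # True
# UTF-16 code units sort the surrogate pair first.
print(supp.encode("utf-16-be") < bmp.encode("utf-16-be"))    # True
# The surrogate-pair form follows the UTF-16 order.
print(code_unit_form(supp) < code_unit_form(bmp))            # True
```

This shows why the two orders diverge: surrogates (D800 to DFFF) sit below E000 to FFFF as code units, while the characters they represent sit above U+FFFF as scalar values.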

FW: Oracle and Surrogate Pairs

2000-07-25 Thread samir . mehrotra
SMTP:[EMAIL PROTECTED]] > Sent: Tuesday, July 25, 2000 8:12 AM > To: Unicode List > Subject: Re: Oracle and Surrogate Pairs > > > Does the field in question need to support literally any possible > character > > in Unicode 3.0 and beyond (since 3.0 does not have any surro

Re: Oracle and Surrogate Pairs

2000-07-24 Thread John H. Jenkins
> Does the field in question need to support literally any possible character > in Unicode 3.0 and beyond (since 3.0 does not have any surrogates > assigned!)? > > True, but within a year or so, there *will* be surrogates assigned in Unicode. One cannot be premature in supporting them at

Re: Oracle and Surrogate Pairs

2000-07-24 Thread Jianping Yang
supporting CJK Unified Ideographs, Extension B (especially HKSCS) which will be in next version of the Unicode standard? Mikko -Original Message- From: Jianping Yang [mailto:[EMAIL PROTECTED]] Sent: Monday, July 24, 2000 5:08 PM To: Mikko Lahti Cc: Unicode List Subject: Re: Oracle and

RE: Oracle and Surrogate Pairs

2000-07-24 Thread Mikko Lahti
, July 24, 2000 5:08 PM To: Mikko Lahti Cc: Unicode List Subject: Re: Oracle and Surrogate Pairs Mikko, As there is no character defined in surrogate range in Unicode 3.0, the maximum width for Oracle UTF8 character set is 3 bytes. Here I recommend you use 3 times the number of

Re: Oracle and Surrogate Pairs

2000-07-24 Thread Jianping Yang
Mikko, As there is no character defined in surrogate range in Unicode 3.0, the maximum width for Oracle UTF8 character set is 3 bytes. Here I recommend you use 3 times the number of characters you intend to store in a column. Regards, Jianping.. Mikko Lahti wrote: What is the correct wa
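As a rough check on the "3 times" sizing rule, the sketch below compares actual UTF-8 byte lengths against a budget of 3 bytes per UTF-16 code unit. Counting column length in UTF-16 code units is an assumption made for the illustration, and the function names are hypothetical; none of this is taken from Oracle documentation.

```python
def utf16_units(text: str) -> int:
    """Number of UTF-16 code units (a supplementary character counts as 2)."""
    return len(text.encode("utf-16-le")) // 2


def utf8_len(text: str) -> int:
    """Number of bytes in conformant UTF-8."""
    return len(text.encode("utf-8"))


def budget_bytes(units: int) -> int:
    """Assumed sizing rule: 3 bytes per UTF-16 code unit."""
    return 3 * units


sample = "A\u00e9\u4e2d\U00020000"   # 1-, 2-, 3- and 4-byte UTF-8 characters
print(utf16_units(sample))                  # 5
print(utf8_len(sample))                     # 10
print(budget_bytes(utf16_units(sample)))    # 15
```

Under that assumption the budget covers both forms: a supplementary character takes 2 code units, so it is allowed 6 bytes, which is enough for the 4-byte UTF-8 form and for a 6-byte surrogate-pair form. If lengths are counted in characters instead, a supplementary character needs 4 bytes in UTF-8 and 3 times is no longer sufficient.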

Re: Oracle and Surrogate Pairs

2000-07-24 Thread Michael \(michka\) Kaplan
st" <[EMAIL PROTECTED]> Sent: Monday, July 24, 2000 4:28 PM Subject: Oracle and Surrogate Pairs > What is the correct way of supporting surrogate pairs in Oracle 8? Anything > wrong with approach of making fields 3 times longer from ASCII or should > fields be 4 times ASCI

Oracle and Surrogate Pairs

2000-07-24 Thread Mikko Lahti
What is the correct way of supporting surrogate pairs in Oracle 8? Anything wrong with approach of making fields 3 times longer from ASCII or should fields be 4 times ASCII as per UTF-8 spec? Later, Mikko Globalization Specialist Onyx Software [EMAIL