Re: Oracle and Surrogate Pairs

2000-07-24 Thread John H. Jenkins
> Does the field in question need to support literally any possible character > in Unicode 3.0 and beyond (since 3.0 does not have any surrogates > assigned!)? > > True, but within a year or so, there *will* be surrogates assigned in Unicode. One cannot be premature in supporting them at

Re: Oracle and Surrogate Pairs

2000-07-24 Thread Jianping Yang
Mikko, Because the Oracle UTF8 character set definition supports surrogates as a pair of two 3-byte sequences, to stay in sync with UTF-16 in binary sorting and code points, you will face the same issue in determining how many bytes to allow for UTF8 as in determining how many ushorts to allow for UTF-16 if you want an exact match in surrogate su
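
A minimal sketch of the size comparison discussed in this message. It assumes the behaviour is exactly as described (each UTF-16 surrogate stored as its own 3-byte sequence); the function names are hypothetical and nothing here calls Oracle itself.

```python
def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units (ushorts) needed for text in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def standard_utf8_bytes(text: str) -> int:
    """Byte length in standard UTF-8 (supplementary characters take 4 bytes)."""
    return len(text.encode("utf-8"))


def surrogate_pair_utf8_bytes(text: str) -> int:
    """Byte length when each UTF-16 surrogate is encoded as its own 3-byte
    sequence, as the message describes for the Oracle UTF8 character set."""
    total = 0
    for ch in text:
        if ord(ch) > 0xFFFF:
            total += 6  # two surrogates, 3 bytes each
        else:
            total += len(ch.encode("utf-8"))
    return total


# ASCII, Latin-1, a BMP ideograph, and one supplementary-plane ideograph
sample = "A\u00e9\u4e2d\U00020000"
print(utf16_code_units(sample))           # 5
print(standard_utf8_bytes(sample))        # 10
print(surrogate_pair_utf8_bytes(sample))  # 12
```

Under that assumption the byte count is always bounded by 3 times the UTF-16 code unit count, which is why the two sizing questions are equivalent.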

RE: Oracle and Surrogate Pairs

2000-07-24 Thread Mikko Lahti
What is the recommendation when it comes to dealing with surrogate pairs and supporting CJK Unified Ideographs Extension B (especially HKSCS), which will be in the next version of the Unicode standard? Mikko -Original Message- From: Jianping Yang [mailto:[EMAIL PROTECTED]] Sent: Monday, J

Re: Oracle and Surrogate Pairs

2000-07-24 Thread Jianping Yang
Mikko, As there is no character defined in the surrogate range in Unicode 3.0, the maximum width for the Oracle UTF8 character set is 3 bytes. I recommend that you size a column in bytes at 3 times the number of characters you intend to store in it. Regards, Jianping. Mikko Lahti wrote: What is the correct wa
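
The 3-bytes-per-character rule can be checked with a short sketch. The helper names are hypothetical, and the check uses Python's standard UTF-8 codec rather than any Oracle API, so it only illustrates the arithmetic for characters in the Basic Multilingual Plane.

```python
def column_bytes(max_chars: int, bytes_per_char: int = 3) -> int:
    """Worst-case byte size for a column holding max_chars BMP characters."""
    return max_chars * bytes_per_char


def fits_in_column(text: str, max_chars: int) -> bool:
    """True if the UTF-8 form of text fits in a column sized by column_bytes."""
    return len(text.encode("utf-8")) <= column_bytes(max_chars)


# The largest code point in each UTF-8 length class within the BMP.
widths = [len(chr(cp).encode("utf-8")) for cp in (0x7F, 0x7FF, 0xFFFF)]
print(widths)                                   # [1, 2, 3]
print(column_bytes(20))                         # 60
print(fits_in_column("\u4e2d\u6587" * 10, 20))  # True: 20 characters, 60 bytes
```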

Re: Oracle and Surrogate Pairs

2000-07-24 Thread Michael (michka) Kaplan
Does the field in question need to support literally any possible character in Unicode 3.0 and beyond (since 3.0 does not have any surrogates assigned!)? If not, then you can actually work out how big the field needs to be from the characters being used and what is the largest per-character byte co

Oracle and Surrogate Pairs

2000-07-24 Thread Mikko Lahti
What is the correct way of supporting surrogate pairs in Oracle 8? Is there anything wrong with the approach of making fields 3 times longer than for ASCII, or should fields be 4 times the ASCII length, as per the UTF-8 spec? Later, Mikko Globalization Specialist Onyx Software [EMAIL PROTEC

Re: Bytes and Unicode

2000-07-24 Thread john
> (Torsten Mohrin) wrote:
>> Kenneth Whistler wrote:
>> So the first step to interoperability in big, interconnected system software using C is to set up fundamental header files containing well-defined datatypes of fixed sizes, to make up for the lack of same in the definition of C itse

Re: Abnormal Bytes and Unicode: (was Re: Unicode FAQ addendum)

2000-07-24 Thread Kenneth Whistler
Torsten responded:
> > The lack of fixed-size datatypes in C is now a *defect* in the language, and not an *asset* of the language.
>
> The latest revision of ISO C has introduced exact-width integer types (like "int8_t", "int16_t" and so on). These are also straightforward names rather t
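
As a rough illustration of the distinction being discussed, Python's standard ctypes module exposes both exact-width types and the platform-dependent C types. This is only a sketch of the idea, not C code, and the mapping of names to the C typedefs is an assumption made for illustration.

```python
import ctypes

# Exact-width types: the size is the same on every platform.
exact_width = {
    "int8_t": ctypes.c_int8,
    "int16_t": ctypes.c_int16,
    "int32_t": ctypes.c_int32,
    "int64_t": ctypes.c_int64,
}
for name, ctype in exact_width.items():
    print(name, ctypes.sizeof(ctype), "bytes")

# Traditional C types: the size depends on the platform and compiler ABI.
for name, ctype in {"int": ctypes.c_int, "long": ctypes.c_long}.items():
    print(name, ctypes.sizeof(ctype), "bytes on this platform")
```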

Re: Abnormal Bytes and Unicode: (was Re: Unicode FAQ addendum)

2000-07-24 Thread Torsten Mohrin
Kenneth Whistler <[EMAIL PROTECTED]> wrote:
>So the first step to interoperability in big, interconnected system software using C is to set up fundamental header files containing well-defined datatypes of fixed sizes, to make up for the lack of same in the definition of C itself.
The lack of f

RE: What is "Unicode" in Chinese?

2000-07-24 Thread Kenneth Whistler
Ed asked:
> Would it be appropriate to look at the title of GB-13000, which is ISO/IEC 10646-1 in China?

Probably not, since it is a national version of 10646, and not of Unicode. It is, at any rate: Xinxi jishu -- Tongyong duobawei bianma zifuji (UCS), i.e. "Information technology -- U

RE: What is "Unicode" in Chinese?

2000-07-24 Thread Hart, Edwin F.
Joe, Lee, Would it be appropriate to look at the title of GB-13000, which is ISO/IEC 10646-1 in China? Ed Edwin F. Hart The Johns Hopkins University Applied Physics Laboratory 11100 Johns Hopkins Road Laurel, MD 20723-6099 +1-443-778-6926 (Baltimore) +1-240-228-6926 (DC Area) +1-443-778-1093 (

What is "Unicode" in Chinese?

2000-07-24 Thread Becker, Joseph
It seems that Chinese is the only major language in which the term "Unicode" needs to be translated rather than transliterated. It may be time to gather up current usage and select an "official" translation, and perhaps to bless one or more informal ones. We have collected these candidates so f

Abnormal Bytes and Unicode: (was Re: Unicode FAQ addendum)

2000-07-24 Thread Kenneth Whistler
Paul Keinanen wrote:
> At 16.10 22.7.2000 -0800, jgo wrote:
> >> Addison wrote:
> >> 1. 1 byte != 1 character: deal with it.
> >
> >Hmm, depends on how you define "byte".
> >I've seen them in 8-bit, 12-bit, 16-bit and 18-bit varieties.
> >
> >The trouble, though, is that 1 character (in this cont
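
The "1 byte != 1 character" point quoted above can be seen in a small sketch. It assumes 8-bit bytes and uses an arbitrary sample string chosen for illustration.

```python
text = "na\u00efve \u4e2d\u6587"  # "naive" with a diaeresis, a space, two ideographs

chars = len(text)                           # code points in the string
utf8_bytes = len(text.encode("utf-8"))      # 8-bit bytes in UTF-8
utf16_bytes = len(text.encode("utf-16-le")) # 8-bit bytes in UTF-16

print(chars)        # 8
print(utf8_bytes)   # 13
print(utf16_bytes)  # 16
```

The same eight characters take a different number of bytes in each encoding, so buffer sizes have to be computed from the encoded form, not from the character count.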

RE: Unicode & various Indian font styles

2000-07-24 Thread Apurva Joshi
Hello Mr. Paresh Agarwal, Yes, it is certainly possible to make your fonts based on Unicode. The fonts would need to contain glyphs mapped to the corresponding code-points in the Indic blocks. If the fonts contain a number of consonant conjuncts, and glyph variants for vowel signs whose use is de

Unicode & various Indian font styles

2000-07-24 Thread Magda Danish (Unicode)
-Original Message- From: AGARWAL [mailto:[EMAIL PROTECTED]] Sent: Sunday, July 23, 2000 7:43 AM To: [EMAIL PROTECTED] Subject: information Dear Sir/Madam, Hello! We are a translation agency by the name MULTI-LINGUIST, operating from India. We are interested in incorporating the Unicod

Java and unicode

2000-07-24 Thread William Overington
I am learning Java and learning how to apply Unicode within Java programs. Perhaps readers might like to know of my experiences of using Java and Unicode together. I learned Java from the free course at http://www.free-ed.net and got the Java Development Kit, a later version than the one orig