> Does the field in question need to support literally any possible character
> in Unicode 3.0 and beyond (since 3.0 does not have any surrogates
> assigned!)?
>
>
True, but within a year or so, there *will* be surrogates assigned in Unicode. One
cannot be premature in supporting them at
Mikko,
Because the Oracle UTF8 character set definition supports surrogates as a pair
of two 3-byte sequences, to stay in sync with UTF-16 in binary sorting and code
points, you will have the same issue determining how many bytes to allow for
UTF8 as how many ushorts for UTF-16 if you want an exact match in surrogate
su
What is the recommendation when it comes to dealing with surrogate pairs and
supporting CJK Unified Ideographs Extension B (especially HKSCS), which will
be in the next version of the Unicode Standard?
Mikko
-Original Message-
From: Jianping Yang [mailto:[EMAIL PROTECTED]]
Sent: Monday, J
Mikko,
As no characters are defined in the surrogate range in Unicode 3.0,
the maximum width for the Oracle UTF8 character set is 3 bytes. I recommend
sizing the column at 3 times the number of characters you intend to store
in it.
Regards,
Jianping..
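To make the sizing arithmetic concrete, here is a minimal Python sketch. It is not Oracle code; the function names are hypothetical, and it simply assumes the scheme described in this thread, where a supplementary character is stored as two 3-byte sequences (one per surrogate), and compares it with standard UTF-8, where the same character takes 4 bytes.

```python
def utf8_bytes(text: str) -> int:
    """Byte length in standard UTF-8 (a supplementary character takes 4 bytes)."""
    total = 0
    for ch in text:
        cp = ord(ch)
        total += 1 if cp < 0x80 else 2 if cp < 0x800 else 3 if cp < 0x10000 else 4
    return total


def surrogate_pair_bytes(text: str) -> int:
    """Byte length when each supplementary character is stored as two
    3-byte sequences, one per surrogate (the scheme assumed from this thread)."""
    total = 0
    for ch in text:
        cp = ord(ch)
        total += 1 if cp < 0x80 else 2 if cp < 0x800 else 3 if cp < 0x10000 else 6
    return total


def utf16_units(text: str) -> int:
    """Number of 16-bit code units (ushorts) the text needs in UTF-16."""
    return sum(2 if ord(ch) >= 0x10000 else 1 for ch in text)


# 'A', e-acute, a CJK ideograph, and U+20000 (first code point of Extension B)
sample = "A\u00e9\u4e2d\U00020000"
print(utf8_bytes(sample))            # 10
print(surrogate_pair_bytes(sample))  # 12
print(utf16_units(sample))           # 5
```

Under that assumption, the "3 times" rule still holds as long as the column length is counted in UTF-16 code units rather than in characters: a surrogate pair is 2 code units and 6 bytes, so the worst case stays at 3 bytes per code unit.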
Mikko Lahti wrote:
What is the correct wa
Does the field in question need to support literally any possible character
in Unicode 3.0 and beyond (since 3.0 does not have any surrogates
assigned!)?
If not, then you can actually consider how big the field needs to be by the
characters being used and what is the largest per character byte co
Title: Oracle and Surrogate Pairs
What is the correct way of supporting surrogate pairs in Oracle 8? Is there anything wrong with the approach of making fields 3 times longer than for ASCII, or should fields be 4 times the ASCII length, as per the UTF-8 spec?
Later,
Mikko
Globalization Specialist
Onyx Software
[EMAIL PROTEC
> (Torsten Mohrin) wrote:
>> Kenneth Whistler wrote:
>> So the first step to interoperability in big, interconnected system
>> software using C is to set up fundamental header files containing
>> well-defined datatypes of fixed sizes, to make up for the lack of same
>> in the definition of C itse
Torsten responded:
> > The lack of fixed-size datatypes in C
> >is now a *defect* in the language, and not an *asset* of the language.
>
> The latest revision of ISO C has introduced exact-width integer types
> (like "int8_t", "int16_t" and so on). These are also straightforward
> names rather t
Kenneth Whistler <[EMAIL PROTECTED]> wrote:
>So the first step to interoperability in big, interconnected system
>software using C is to set up fundamental header files containing
>well-defined datatypes of fixed sizes, to make up for the lack of same
>in the definition of C itself. The lack of f
Ed asked:
>
> Would it be appropriate to look at the title of GB-13000, which is ISO/IEC
> 10646-1 in China?
>
Probably not, since it is a national version of 10646, and not of Unicode.
It is, at any rate: Xinxi jishu -- Tongyong duobawei bianma zifuji (UCS), i.e.
"Information technology -- U
Joe, Lee,
Would it be appropriate to look at the title of GB-13000, which is ISO/IEC
10646-1 in China?
Ed
Edwin F. Hart
The Johns Hopkins University Applied Physics Laboratory
11100 Johns Hopkins Road
Laurel, MD 20723-6099
+1-443-778-6926 (Baltimore)
+1-240-228-6926 (DC Area)
+1-443-778-1093 (
It seems that Chinese is the only major language in which the term "Unicode"
needs to be translated rather than transliterated. It may be time to gather
up current usage and select an "official" translation, and perhaps to bless
one or more informal ones.
We have collected these candidates so f
Paul Keinanen wrote:
> At 16.10 22.7.2000 -0800, jgo wrote:
> >> Addison wrote:
> >> 1. 1 byte != 1 character: deal with it.
> >
> >Hmm, depends on how you define "byte".
> >I've seen them in 8-bit, 12-bit, 16-bit and 18-bit varieties.
> >
> >The trouble, though, is that 1 character (in this cont
Hello Mr. Paresh Agarwal,
Yes, it is certainly possible to make your fonts based on Unicode. The fonts
would need to contain glyphs mapped to the corresponding code-points in the
Indic blocks. If the fonts contain a number of consonant conjuncts, and
glyph variants for vowel signs whose use is de
-Original Message-
From: AGARWAL [mailto:[EMAIL PROTECTED]]
Sent: Sunday, July 23, 2000 7:43 AM
To: [EMAIL PROTECTED]
Subject: information
Dear Sir/Madam,
Hello!
We are a translation agency by the name
MULTI-LINGUIST, operating from India. We are interested in incorporating the
Unicod
I am learning Java and learning how to apply Unicode within Java
programs.
Perhaps readers might like to know of my experiences of using Java and
Unicode together.
I learned Java from the free course at http://www.free-ed.net and got the
Java Development Kit, a later version than the one orig