Michael Kaplan wrote:
Thus far it is something that has been implemented in the fonts, rather than
anywhere else. For example, there are several ligatures in Tamil that will
display one way with the Latha font and the other way with Monotype Tamil
Arial (the way set out in Unicode 3.0 is
(This message is sent in UTF-8. Flames regarding that fact
will be deleted without response.)
No, those case mappings are not in error. Nor are their
canonical mappings in error. (The MICRO SIGN would have
had a canonical mapping to Greek mu, if it had not been
included in such much-used
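A small sketch of the point about MICRO SIGN, using Python's standard `unicodedata` module (the variable names are illustrative only). It assumes the Unicode data shipped with the interpreter: the uppercase of U+00B5 is Greek capital mu, and the mapping to Greek small mu is a compatibility mapping, so NFC leaves the character alone while NFKC folds it.

```python
import unicodedata

MICRO = "\u00b5"       # MICRO SIGN
MU = "\u03bc"          # GREEK SMALL LETTER MU
CAPITAL_MU = "\u039c"  # GREEK CAPITAL LETTER MU

# Case mapping: MICRO SIGN uppercases to the Greek capital letter.
upper_of_micro = MICRO.upper()

# Canonical normalization (NFC) does not touch MICRO SIGN ...
nfc_of_micro = unicodedata.normalize("NFC", MICRO)

# ... but compatibility normalization (NFKC) maps it to Greek mu.
nfkc_of_micro = unicodedata.normalize("NFKC", MICRO)

print(upper_of_micro == CAPITAL_MU, nfc_of_micro == MICRO, nfkc_of_micro == MU)
```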
Title: Chinese characters in Java Applet
Hello,
I am trying to display Chinese characters stored in Unicode format in an Oracle database through a Java applet in the browser. The applet uses JDBC calls and the thin driver.
The Oracle database resides on a Sun Solaris server. But the applet is not
John Cowan wrote:
Now suppose we have a character sequence beginning with U+FEFF U+0020.
This would be encoded as follows:
US-ASCII: (not possible)
UTF-16: 0xFE 0xFF 0xFE 0xFF 0x00 0x20 ...
UTF-16: 0xFF 0xFE 0xFF 0xFE 0x20 0x00 ...
UTF-16BE: 0xFE 0xFF 0x00 0x20 ...
UTF-16LE: 0xFF
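The byte sequences in this message can be reproduced with a short Python sketch. The variable names are illustrative. The BOM is prepended by hand for the two "UTF-16" cases, because Python's plain `utf-16` codec writes the BOM in the machine's native byte order.

```python
import codecs

# The character sequence under discussion: U+FEFF followed by U+0020.
text = "\ufeff\u0020"

# The explicit-endian encodings never add a BOM of their own.
utf16be = text.encode("utf-16-be")
utf16le = text.encode("utf-16-le")

# "UTF-16" with a BOM: signature first, then the same code units.
utf16_with_be_bom = codecs.BOM_UTF16_BE + utf16be
utf16_with_le_bom = codecs.BOM_UTF16_LE + utf16le

# US-ASCII cannot represent U+FEFF at all.
try:
    text.encode("ascii")
    ascii_possible = True
except UnicodeEncodeError:
    ascii_possible = False

for label, data in [("UTF-16 (BE BOM)", utf16_with_be_bom),
                    ("UTF-16 (LE BOM)", utf16_with_le_bom),
                    ("UTF-16BE", utf16be),
                    ("UTF-16LE", utf16le)]:
    print(label, " ".join(f"0x{b:02X}" for b in data))
```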
Thus since people who write the language sent both,
cut
Do you mean that Tamil writers *purposely* use both the "ancient" and the
"modern" forms in the same document?
What is the intent?
Yes, that is what I am saying. If you go to several of the Tamil resource
sites on the web, you can
On 06/21/2000 03:09:43 PM [EMAIL PROTECTED] wrote:
Appropriate or not, users (you know, those people who don't read the
documentation that the programmers don't write) will use text editors to
split files. They will then concatenate the files using a non-Unicode aware
tool. And they will
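A sketch of what such a concatenation can produce, assuming (this is an assumption, the message above is cut off) that each piece was saved with its own UTF-8 signature. Python's `utf-8-sig` codec writes the signature on encode and strips only a leading one on decode, so the second signature survives in mid-text as U+FEFF.

```python
# Two pieces, each saved by an editor that writes a UTF-8 signature (BOM).
part_one = "first half\n".encode("utf-8-sig")
part_two = "second half\n".encode("utf-8-sig")

# A byte-level concatenation, as a non-Unicode-aware tool would do it.
joined = part_one + part_two

# Decoding strips the signature at the very start, but not the one
# that now sits in the middle of the data.
decoded = joined.decode("utf-8-sig")
stray = decoded.count("\ufeff")

print(repr(decoded), stray)
```

Whether that embedded U+FEFF should be read as a stray signature or as a deliberate ZERO WIDTH NO-BREAK SPACE cannot be decided from the bytes alone.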
These characters are purely coded for compatibility. Unicode does not distinguish
letters by the abbreviations that they happen to be used in. There is no difference in
semantics between the "g" in "go" vs. the "g" in "12g", nor between the "Å" in "Århus"
vs. the "Å" in "15Å", nor -- for that
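The "Å" case can be checked with Python's `unicodedata` module. This sketch assumes the character meant is U+212B ANGSTROM SIGN, which has a singleton canonical decomposition to U+00C5; the constant names are illustrative.

```python
import unicodedata

ANGSTROM_SIGN = "\u212b"  # ANGSTROM SIGN, as in "15Å"
A_RING = "\u00c5"         # LATIN CAPITAL LETTER A WITH RING ABOVE, as in "Århus"

# The compatibility character decomposes canonically to the ordinary letter.
decomposition = unicodedata.decomposition(ANGSTROM_SIGN)

# After normalization the two are the same string.
nfc = unicodedata.normalize("NFC", ANGSTROM_SIGN)
nfd_sign = unicodedata.normalize("NFD", ANGSTROM_SIGN)
nfd_letter = unicodedata.normalize("NFD", A_RING)

print(decomposition, nfc == A_RING, nfd_sign == nfd_letter)
```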
[EMAIL PROTECTED] wrote:
... I think the suggestion that BOM and ZWNBSP be
de-unified, which I have heard before, may make the best sense.
*If* that's the solution, it should be done yesterday. The longer it takes, the
more implementations (and data) there will be that need to be changed.
-
On Thu, Jun 22, 2000 at 02:20:39 -0800, Parvinder Singh(EHPT) wrote:
I am trying to display Chinese characters stored in Unicode format in an
Oracle database through a Java applet in the browser. The applet uses JDBC
calls and the thin driver.
The Oracle database resides on a Sun Solaris server. But the
At 12:12 PM 06/20/2000 -0800, Kenneth Whistler wrote:
Bob Rosenberg wrote:
This was my concern, there is no way to distinguish UTF-8 from Latin-1 in
case of upper ASCII characters here.
Yes, there is - it's called a "Sanity Check". You parse the file looking for
High-ASCII. If you
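One way such a sanity check might look, as a rough sketch rather than the procedure described in the message (which is cut off here). The function name `guess_encoding` is hypothetical. It leans on the fact that bytes above 0x7F in Latin-1 text rarely happen to form well-formed UTF-8 sequences.

```python
def guess_encoding(data: bytes) -> str:
    """Heuristic guess: 'ascii', 'utf-8', or 'latin-1'."""
    # No high bytes at all: the question does not arise.
    if all(b < 0x80 for b in data):
        return "ascii"
    # High bytes present: accept UTF-8 only if every sequence is well formed.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "latin-1"
    return "utf-8"


print(guess_encoding("Århus".encode("utf-8")))    # well-formed UTF-8
print(guess_encoding("Århus".encode("latin-1")))  # 0xC5 followed by 'r' is not
print(guess_encoding(b"hello"))
```

This is only a heuristic: a Latin-1 file whose high bytes happen to form valid UTF-8 sequences (for example 0xC3 0xA9) would be reported as UTF-8.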
-Original Message-
From: Robert A. Rosenberg [mailto:[EMAIL PROTECTED]]
...
[on overlong UTF-8 sequences, a few lines down:]
faked) files. I agree that I missed the extra sanity check of looking for
the shortest string, but if I remember the rules correctly, there is no
requirement
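To make the "shortest string" check concrete, here is a sketch with a deliberately lenient two-byte decoder. The helper names are hypothetical. The bytes 0xC0 0xAF are an overlong encoding of U+002F: bit arithmetic alone accepts them, while a decoder that enforces the shortest form, such as Python's built-in one, rejects them.

```python
def lenient_two_byte(seq: bytes) -> int:
    """Combine a 2-byte UTF-8 sequence WITHOUT any shortest-form check."""
    lead, trail = seq
    return ((lead & 0x1F) << 6) | (trail & 0x3F)


def is_shortest_form_two_byte(seq: bytes) -> bool:
    """A 2-byte sequence is shortest-form only for code points >= U+0080."""
    return lenient_two_byte(seq) >= 0x80


OVERLONG_SLASH = b"\xc0\xaf"  # overlong form of U+002F '/'

lenient_value = lenient_two_byte(OVERLONG_SLASH)

# Python's own UTF-8 decoder enforces the shortest form.
try:
    OVERLONG_SLASH.decode("utf-8")
    strict_accepts = True
except UnicodeDecodeError:
    strict_accepts = False

print(hex(lenient_value), is_shortest_form_two_byte(OVERLONG_SLASH), strict_accepts)
```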
"Ayers, Mike" wrote:
Am I reading this wrong? Here's what I get:
I hand you a UTF-16 document. This document is:
FE FF 00 48 00 65 00 6C 00 6C 00 6F
..so it says "Hello". Then I say, "Oh, by the way, that's
big-endian." *POOF* The content of the document
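The effect described here can be reproduced with Python's codecs, which follow the same distinction: under the `utf-16` label a leading FE FF is consumed as a byte order mark, while under `utf-16-be` it is kept as the character U+FEFF. A minimal sketch:

```python
# The document from the message: FE FF followed by "Hello" in big-endian UTF-16.
doc = bytes.fromhex("FEFF00480065006C006C006F")

# Labelled "UTF-16": the FE FF is a signature and is not part of the text.
as_utf16 = doc.decode("utf-16")

# Labelled "UTF-16BE": the same FE FF is now content, U+FEFF.
as_utf16be = doc.decode("utf-16-be")

print(repr(as_utf16), repr(as_utf16be))
```

The bytes are identical in both cases; only the label changes whether the text is five or six characters long.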
Kenneth Whistler wrote:
Now we are pushing through the long, bureaucratic process of getting
this accepted into 10646-1, so that we maintain synchronicity with a
joint publication of it as a *standard* character.
So a fair statement of what you hope to achieve is: U+2060 will be
the zero-width
I agree Gary.
Windows 2000 Notepad, however, does not agree and writes one.
Since Notepad in prior versions of Windows was in fact the de facto standard
for HTML editor (g), clearly it is a program to be reckoned with. People
should be aware of the fact that there are going to be MANY files out
I want to write an application in Java that will store information
in a database using Unicode. Ideally the application will run
with any database that supports Unicode. One would presume that the
JDBC driver would take care of any differences between databases
so my application could be
I got some amusing results when I tried out the Altavista translation
service on segments of the new language descriptions in
http://www.unicode.org/unicode/standard/WhatIsUnicode.html
Original (English):
What is Unicode? Unicode provides a unique number for every character,
no matter
On 06/21/2000 06:33:57 PM [EMAIL PROTECTED] wrote:
The standard doesn't ever discuss the BOM in the context of UTF-8,
See section 13.6 (page 324).
Sure enough. Well, there you go: the confusion is officially sanctioned!
Peter Constable
Michael Kaplan wrote:
Thus since people who write the language sent both,
cut
Do you mean that Tamil writers *purposely* use both the "ancient" and the
"modern" forms in the same document?
What is the intent?
Yes, that is what I am saying.
Okay, I did not know (and I did not