Hi.
I took several minutes to scan through your post and I am not sure what you
are asking. Would you like to see some examples, for instance, of real
(assigned) code points that require encoding by surrogate pairs to be
represented as Java char? Looking at what you are trying to do, I think I
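To make the question above concrete, here is a minimal Python sketch of how a code point beyond U+FFFF is split into a surrogate pair, the form it takes in a sequence of 16-bit units such as Java char. The function name `to_surrogate_pair` is illustrative only, and U+1D11E (MUSICAL SYMBOL G CLEF) is an example chosen here, not one taken from the post.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a supplementary code point (U+10000..U+10FFFF) into UTF-16 surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000      # 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits select the high surrogate
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits select the low surrogate
    return high, low


high, low = to_surrogate_pair(0x1D11E)
print(f"U+1D11E -> {high:04X} {low:04X}")  # prints "U+1D11E -> D834 DD1E"
```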
Peter Constable:
On 02/20/2001 03:34:28 AM Marco Cimarosti wrote:
"Unicode is now a 32-bit character encoding standard,
although only about one million codes actually exist,
[...]
Well, it's probably a better answer to say that Unicode is a 20.1-bit
encoding since the direct
John Hudson wrote:
(In French, sans serif is normally named "antique"
Which must be very confusing to Germans and others who use
'antiqua' to
distinguish seriffed humanist types from blackletter.
Antoine Leca replied:
And you do believe that Frenchies are _not_ confused by the fact
John Hudson wrote:
At 09:05 AM 2/20/2001 -0800, Antoine Leca wrote:
(In French, sans serif is normally named "antique"
Which must be very confusing to Germans and others who use 'antiqua' to
distinguish seriffed humanist types from blackletter.
And you do believe that Frenchies
At 21:25 -0800 2001-02-20, [EMAIL PROTECTED] wrote:
And perhaps the Mac people think of MacRoman as "8-bit ASCII."
No, we think of it as Mac Roman.
--
Michael Everson ** Everson Gunn Teoranta ** http://www.egt.ie
15 Port Chaeimhghein Íochtarach; Baile Átha Cliath 2; Éire/Ireland
Mob +353 86
On Wed, 21 Feb 2001, Werner LEMBERG wrote:
Section 10.1 of PDUTR #27 "Unicode 3.1" (2000.1.17) gives the sources of
the 42,711 new characters as:
...
CNS 11643-1992, 15th plane
Really? I thought this should be CNS 11643-1986. I think there isn't
a 15th plane in the 1992
IETF defines this as part of one of the very early RFCs on SMTP, FTP, or
TELNET but has not defined it in a separate RFC. Essentially, it is 7-bit
ASCII (ANSI X3.4-1986) but the IETF may not include the ASCII control
characters.
Ed Hart
Edwin F. Hart
[EMAIL PROTECTED]
The Johns Hopkins
What is the function of ASCII control code 0x7F (DEL) in text interchange?
Particularly, what effect or interpretation might it have in communication
protocols, terminal protocols and, especially, inside text files?
My interest is about the function of this character in *contemporary*
platforms
Apurva Joshi wrote:
Re: "Uniscribe is just an implementation of these specifications, and I hope
sincerely Microsoft will not hide some "features" into USP10.DLL in order to
kill any concurrence."
The process of adding new feature support to Uniscribe is not unlike adding
newer
Marco Cimarosti wrote:
With all that I really don't envy people who, like
Patrick Andries, have undertaken the "impossible" task of
translating the Unicode documentation into another
language, and I look with sympathy at their requests for
proof-reading...
Thank you for the kind
On Wed, Feb 21, 2001 at 06:29:29 -0800, Marco Cimarosti wrote:
What is the function of ASCII control code 0x7F (DEL) in text
interchange?
Particularly, what effect or interpretation might it have in
communication protocols, terminal protocols and, especially, inside
text files?
My
On Wed, 21 Feb 2001, Werner LEMBERG wrote:
South Korea's PKS 5700
This is a North Korean standard AFAIK.
No. AFAIK, PKS stands for 'Proposed Korean Standard' and as such PKS 5700
became KS C 5700 which in turn was renamed KS X 1005-1. Then, what is
KS X 1005-1? It's just the Korean
In a message dated 2001-02-21 07:03:46 Pacific Standard Time,
[EMAIL PROTECTED] writes:
What is the function of ASCII control code 0x7F (DEL) in text interchange?
Particularly, what effect or interpretation might it have in communication
protocols, terminal protocols and, especially,
Which systems interpret 0x7F as "interrupt process"? I know that this would
be 0x03 in DOS (^C), and 0x03, 0x04 or 0x1A in Unix (^C, ^D, and ^Z,
respectively), but I know nothing about other systems, e.g. Macintosh.
Very long ago, in the Seventh Edition of Unix, the default interrupt
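The control codes named in the question follow caret notation, in which ^X stands for the code of X with bit 0x40 flipped. A small Python sketch of that mapping, with the hypothetical helper name `caret_to_code`, shows where ^C, ^D, ^Z and DEL (written ^?) fall:

```python
def caret_to_code(letter: str) -> int:
    """Return the control code written as ^<letter> in caret notation."""
    return ord(letter.upper()) ^ 0x40  # flip bit 0x40 of the printable character


for letter in "CDZ?":
    print(f"^{letter} = 0x{caret_to_code(letter):02X}")
# prints:
# ^C = 0x03
# ^D = 0x04
# ^Z = 0x1A
# ^? = 0x7F
```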
On Wed, 21 Feb 2001, Jungshik Shin wrote:
On Wed, 21 Feb 2001, Jungshik Shin wrote:
On Wed, 21 Feb 2001, Werner LEMBERG wrote:
South Korea's PKS 5700
This is a North Korean standard AFAIK.
No. AFAIK, PKS stands for 'Proposed Korean Standard' and as such PKS 5700
became KS C 5700
On Wed, 21 Feb 2001, Jungshik Shin wrote:
On Wed, 21 Feb 2001, Werner LEMBERG wrote:
South Korea's PKS 5700
This is a North Korean standard AFAIK.
No. AFAIK, PKS stands for 'Proposed Korean Standard' and as such PKS 5700
became KS C 5700 which in turn was renamed KS X 1005-1.
Marco Cimarosti wrote:
Which systems interpret 0x7F as "interrupt process"? I know that this would
be 0x03 in DOS (^C), and 0x03, 0x04 or 0x1A in Unix (^C, ^D, and ^Z,
respectively), but I know nothing about other systems, e.g. Macintosh.
Very long ago, in the Seventh Edition of Unix, the
On Wed, Feb 21, 2001 at 09:42:53 -0800, Marco Cimarosti wrote:
1) What happens if emacs loads Doug Ewell's text file (i.e. a text file
containing "ABCdelDEF") and then saves it? Would the file's content be
changed to "ABDEF"?
No. I don't think any program interprets file contents in this
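As a rough illustration of the point that DEL is ordinary data when it sits inside a file, the following Python sketch writes the bytes "ABC", 0x7F, "DEF" to a temporary file and reads them back. It says nothing about emacs itself; it only shows that plain file I/O stores and returns the 0x7F byte unchanged. The helper name `round_trip` is made up for this example.

```python
import os
import tempfile


def round_trip(data: bytes) -> bytes:
    """Write bytes to a temporary file, read them back, and return what was read."""
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        with open(path, "rb") as handle:
            return handle.read()
    finally:
        os.remove(path)  # clean up the temporary file


original = b"ABC\x7fDEF"          # "ABC", DEL (0x7F), "DEF"
restored = round_trip(original)
print(restored == original)       # prints "True"
print(len(restored))              # prints "7"
```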
On Wed, 21 Feb 2001, Jungshik Shin wrote:
On Wed, 21 Feb 2001, Thomas Chan wrote:
The unihan.txt file ver 3.0b1 (1999.7.2) lists four K- sources as:
K0 KS C 5601-1987
K1 KS C 5657-1991
K2 PKS C 5700-1 1994
K3 PKS C 5700-2 1994
It's very clear what K0 and K1 are, and they
John Cowan wrote:
As I take it, you mean that the Bulgarian alphabet consists of those
letters of the Cyrillic script that are necessary and customary in
writing Bulgarian...
Am I reading you correctly?
Yes.
- Peter
Frank da Cruz wrote:
DEL does indeed have a use in plain text files that are encoded with
Shift-In / Shift-Out to switch between left and right halves of (say)
ISO 8859-1 without having to actually put 8-bit characters in the file.
Ditto for "higher" levels of ISO-2022 character-set
On Tue, 13 Feb 2001, Julie Doll Allen wrote:
Julie,
Thank you for your kind answer.
Just to let you know, Ken Whistler and I both filed away your comments
and those of others about p. 124. As the editor for 4.0, I've flagged
that passage for discussion in the editorial committee once we
Marco Cimarosti wrote:
What is the function of ASCII control code 0x7F (DEL) in text interchange?
Particularly, what effect or interpretation might it have in communication
protocols, terminal protocols and, especially, inside text files?
In general it has none. Some systems interpret it
The Unihan.txt file for Unicode 3.1 has finally arrived!
You can go to the beta update directory location to get your
copy:
http://www.unicode.org/Public/3.1-Update/
(Sorry, but ftp access is still broken. We'll have all the
data files mirrored for ftp access in the near future, but
not just
On Wed, 21 Feb 2001, Thomas Chan wrote:
On Wed, 21 Feb 2001, Jungshik Shin wrote:
On Wed, 21 Feb 2001, Jungshik Shin wrote:
On Wed, 21 Feb 2001, Werner LEMBERG wrote:
South Korea's PKS 5700
This is a North Korean standard AFAIK.
No. AFAIK, PKS stands for 'Proposed Korean
Marco:
Would you mind if I re-post my reply that I forgot to cc to the list?
--- missing post
What exactly _would_ be wrong with calling UNICODE a thirty-two bit encoding
with a couple of ways to represent most of the characters in a smaller
Related to the "clear" identification of plain text:
My group is trying to convince developers to implement Unicode in their
systems. So, one of our first tasks is to identify "plain text" in their
systems so that we can understand the implication and requirements for
implementing Unicode.
A
What exactly _would_ be wrong with calling UNICODE a
thirty-two bit encoding
If I have a 32 bit integer type, holding a Unicode code point, I have
11 bits left over to hold other data. That's worth knowing.
Btw, saying approximately 20.087 bits (Am I calculating that
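The figures in this exchange can be checked with a few lines of Python. This is only a sketch of the arithmetic, assuming the code space runs from U+0000 to U+10FFFF (1,114,112 code points); the variable names are illustrative.

```python
import math

# Assumed code space: U+0000..U+10FFFF, i.e. 1,114,112 code points.
CODE_SPACE_SIZE = 0x10FFFF + 1

bits_exact = math.log2(CODE_SPACE_SIZE)          # fractional bit count, about 20.087
bits_whole = (CODE_SPACE_SIZE - 1).bit_length()  # whole bits for the largest code point
spare_bits = 32 - bits_whole                     # bits left over in a 32-bit integer

print(f"{bits_exact:.3f} {bits_whole} {spare_bits}")  # prints "20.087 21 11"
```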
Hi Thomas,
I am just a newbie making noise and otherwise being obnoxious. I had
forgotten to cc the intermediate message to the mailing list, and didn't
realize it until after I posted my reply with most of Ken Whistler's reply
clipped. I'll waste even more bandwidth and paste the intermediates
We've seen several posts about the perception that Unicode is a
16 bit character set encoding. Among those, we've heard anecdotes
about the problems people have introducing newcomers to Unicode.
Here is a chapter of a reference manual I've been working on.
The original manual can be found at
Hi all,
Between January 30-31, there was a thread here entitled "ConScript
registry?", in which I mentioned[1] the possibility of non-Western
fictional scripts gobbling up codepoints, where I gave two example .jpg
files of the kinds of Chinese fictional scripts that exist.
Whether those
Since there seem to be some people here who know something about Greek
diacritics, I'm hoping someone here will be able to help me. I know very little about
Greek, as will probably become clear. I'm making a Unicode version of an ASCII
representation of an etymological dictionary,