Re: "UNICODE BOMBER STRIKES AGAIN"

2002-04-22 Thread Doug Ewell
Kenneth Whistler <[EMAIL PROTECTED]> wrote: > -- K '\0' e '\0' n '\0' Lemme see, that's 0x4B 0x00 0x65 0x00 0x6E 0x00. There's no BOM, and no external tagging as "UTF-16LE," and since this is the Internet, we don't know the endianness of the originating machine. So, based on last week's discus
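The byte-order question raised here can be sketched in a few lines of Python. This is only an illustration: the variable names are invented for the example, and only the six byte values come from the message. The same bytes read as "Ken" under one byte order and as three unrelated code points under the other.

```python
# The six bytes quoted in the message: 'K' '\0' 'e' '\0' 'n' '\0'
data = bytes([0x4B, 0x00, 0x65, 0x00, 0x6E, 0x00])

# Read as little-endian UTF-16, the bytes spell "Ken".
as_le = data.decode("utf-16-le")

# Read as big-endian UTF-16, each byte pair becomes a different code point.
as_be = data.decode("utf-16-be")

print(as_le)                                # Ken
print([f"U+{ord(c):04X}" for c in as_be])   # ['U+4B00', 'U+6500', 'U+6E00']
```

With no BOM and no external label, nothing in the bytes themselves says which of the two readings was intended.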

Re: "UNICODE BOMBER STRIKES AGAIN"

2002-04-22 Thread Doug Ewell
ろ ろ〇〇〇 <[EMAIL PROTECTED]> wrote: > Why don't they just romanize the little boxes? I would rather read, > say, romanized kana than boxes. Because if the font maker is going to go to the trouble of providing glyphs for the romanization, she might as well provide real kana glyphs. > Is the Un

Two new characters added to KS X 1001 in Dec. 1998

2002-04-22 Thread Jungshik Shin
Note to subscribers of Unicode and Linux-utf8 mailing list: The following message is about two new characters added to South Korean (ROK) nat'l coded character set standard KS X 1001 in December, 1998. Although this change is not directly related to two lists I'm copying this to, I'm taking that

Re: browsers and unicode surrogates

2002-04-22 Thread David Starner
On Mon, Apr 22, 2002 at 12:19:15PM -0400, [EMAIL PROTECTED] wrote: > My test pages don't have yet characters beyond BMP(I just > recycled a page I made a long time ago for Korean testing) . I may later > add them. (Tex, can I use your sample page? I'd rather put up a page > with some content ins

Re: unicode conversion in any amtp server (eg sendmail)

2002-04-22 Thread Jungshik Shin
On Mon, 22 Apr 2002, David Starner wrote: > On Mon, Apr 22, 2002 at 08:06:51PM +0100, x0638890 wrote: > > Hello, > > > > Can anyone tell me if there is any smtp server (eg sendmail) which can do > > automatic unicode to Windows 1252 codepage conversion of incoming emails ? > > Probably not. Wh

Re: "UNICODE BOMBER STRIKES AGAIN"

2002-04-22 Thread Tex Texin
Kenneth Whistler wrote: > There he sits in wait until you switch on, and BOM!, all your GIGS > turn to SQUARE-RAD/S and the little bytestie is laughing his SCSU off. There he sits in "symbol for synchronous idle" until you "symbol for start of text", and BOM!, all your GIGS "clockwise open circle

Re: unicode conversion in any amtp server (eg sendmail)

2002-04-22 Thread David Starner
On Mon, Apr 22, 2002 at 08:06:51PM +0100, x0638890 wrote: > Hello, > > Can anyone tell me if there is any smtp server (eg sendmail) which can do > automatic unicode to Windows 1252 codepage conversion of incoming emails ? Probably not. Why would you want to do this? It's working at the wrong lev

Re: Variant locales?

2002-04-22 Thread Peter_Constable
On 04/22/2002 02:18:52 PM John Hudson wrote: >This raises a related topic. The codes to tag combinations of script and >(the poorly named) language system in OpenType together represent a set of >specific typographic conventions that may or may not be identified along >language lines and which a

Re: "UNICODE BOMBER STRIKES AGAIN"

2002-04-22 Thread Kenneth Whistler
> Doug Ewell pun'ed: > > > > > There he sits in wait until you switch on, and BAM, all your data > > > turns to squares and the little beastie is laughing his socks off. > > > > That should have been "BOM." > > Yes, and "turns to squares" should have been "turns to replacement > characters".

Re: browsers and unicode surrogates

2002-04-22 Thread Stefan Persson
- Original Message - From: <[EMAIL PROTECTED]> To: "Unicode Mailing List" <[EMAIL PROTECTED]> Sent: den 22 april 2002 20:24 Subject: Re: browsers and unicode surrogates > Thank you for this tip. I didn't know this and ended up > 'cluttering' my filenames with charset suffices at >

Re: "UNICODE BOMBER STRIKES AGAIN"

2002-04-22 Thread Peter_Constable
On 04/22/2002 02:21:48 PM Tex Texin wrote: >Doug Ewell pun'ed: >> >> > There he sits in wait until you switch on, and BAM, all your data >> > turns to squares and the little beastie is laughing his socks off. >> >> That should have been "BOM." > >Yes, and "turns to squares" should have been "tur

Re: "UNICODE BOMBER STRIKES AGAIN"

2002-04-22 Thread ろ〇〇〇〇 ろ〇〇〇
>From: Tex Texin <[EMAIL PROTECTED]> >To: Doug Ewell <[EMAIL PROTECTED]> >CC: [EMAIL PROTECTED], [EMAIL PROTECTED] >Subject: Re: "UNICODE BOMBER STRIKES AGAIN" >Date: Mon, 22 Apr 2002 15:21:48 -0400 > >Doug Ewell pun'ed: > > > > > There he sits in wait until you switch on, and BAM, all your data

Re: Variant locales?

2002-04-22 Thread Michael (michka) Kaplan
From: <[EMAIL PROTECTED]> > (Many would say it would be even better if the structure and > definition of notion locale itself were revisited -- a common > theme on the locales list). Although until they decide on that list just what they *do* want (and move off of the " does not apply me sinc

Re: Variant locales?

2002-04-22 Thread John Hudson
At 11:40 4/22/2002, [EMAIL PROTECTED] wrote: >Before suggesting that one freely combine language and country codes (not >that that will help in this particular case), I'd like to mention the paper >I'll be presenting at IUC21, in which I suggest the need for (and propose a >draft of) a model for

Re: "UNICODE BOMBER STRIKES AGAIN"

2002-04-22 Thread Tex Texin
Doug Ewell pun'ed: > > > There he sits in wait until you switch on, and BAM, all your data > > turns to squares and the little beastie is laughing his socks off. > > That should have been "BOM." Yes, and "turns to squares" should have been "turns to replacement characters". -- ---

unicode conversion in any amtp server (eg sendmail)

2002-04-22 Thread x0638890
Hello, Can anyone tell me if there is any smtp server (eg sendmail) which can do automatic unicode to Windows 1252 codepage conversion of incoming emails ? Thank you for your support and kind regards Manuel Sepulveda [EMAIL PROTECTED]
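As a rough illustration of what such a conversion involves at the character level (this is not a feature of sendmail or any other SMTP server, and the helper name `to_cp1252` is hypothetical), here is a minimal Python sketch. It assumes the text has already been decoded to Unicode and shows that anything outside the Windows-1252 repertoire is lost.

```python
def to_cp1252(text: str) -> bytes:
    """Encode text as Windows-1252, substituting '?' for unmappable characters."""
    return text.encode("cp1252", errors="replace")

# Characters that exist in Windows-1252 survive the conversion.
print(to_cp1252("caf\u00e9 \u20ac5"))          # b'caf\xe9 \x805'

# Characters outside the code page are replaced, so the conversion is lossy.
print(to_cp1252("\u65e5\u672c\u8a9e"))         # b'???'
```

A real mail pipeline would also have to parse MIME structure and rewrite the charset labels, which this sketch does not attempt.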

Re: browsers and unicode surrogates

2002-04-22 Thread jshin
> "James H. Cloos Jr." wrote: > > > > Since you are using apache, it is quite easy to get the extra headers > > sent at the protocol level rather than having to use meta tags. > > > > You can use a Header directive in an .htaccess file a la: > > > > > > Header set Content-Language en-US > >

Re: Variant locales?

2002-04-22 Thread Michael Everson
Peter makes a good point of course. Concatenation of tags is the primary way of doing this *today* however. -- Michael Everson *** Everson Typography *** http://www.evertype.com

Re: Variant locales?

2002-04-22 Thread Peter_Constable
On 04/22/2002 12:06:18 PM Michael Everson wrote: >At 09:32 -0700 2002-04-22, Deborah Goldsmith wrote: >>I had a recent inquiry from inside Apple as to whether there was a >>registry of variants of the standard ISO locales, e.g. ja_JP.kana >>for Japanese written only with kana. Does anyone know i

Klez virus and forged mail from Unicode addresses

2002-04-22 Thread Sarasvati
Good Morning Troopers, Very rarely do I step in with a warning, but this time, the worms have gone too far. There is a virus called WORM_KLEZ.G now going around. It picks up old addresses from client address books and will send virus infected mail posing as someone you know. In particular we have

Re: How many printable characters in 3.2.0?

2002-04-22 Thread Kenneth Whistler
Stefan noted: > Also, you can add accents and such to just about any character. Shall "a" > with an acute accent be considered a printable character? And what about the > Chinese character for "one" with an acute accent? And different kinds of > accents from different scripts can be added to the
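The two cases Stefan mentions can be sketched with Python's standard `unicodedata` module. This is an illustration only, and the variable names are invented for the example: "a" plus a combining acute has a precomposed form, while the Chinese character for "one" plus the same accent does not.

```python
import unicodedata

a_acute = "a\u0301"          # 'a' + COMBINING ACUTE ACCENT
one_acute = "\u4e00\u0301"   # CJK ideograph 'one' + COMBINING ACUTE ACCENT

# NFC composes the first sequence into the single code point U+00E1 ...
composed_a = unicodedata.normalize("NFC", a_acute)
# ... but no precomposed form exists for the second, so it stays two code points.
composed_one = unicodedata.normalize("NFC", one_acute)

print(len(composed_a), len(composed_one))   # 1 2
```

Both sequences are valid Unicode text, which is part of why counting "printable characters" depends on what is being counted.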

Re: Variant locales?

2002-04-22 Thread Michael Everson
At 09:32 -0700 2002-04-22, Deborah Goldsmith wrote: >I had a recent inquiry from inside Apple as to whether there was a >registry of variants of the standard ISO locales, e.g. ja_JP.kana >for Japanese written only with kana. Does anyone know if there is >any standard that attempts to describe s

Re: How many printable characters in 3.2.0?

2002-04-22 Thread Stefan Persson
- Original Message - From: "Doug Ewell" <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Cc: "Zsigri Gyula" <[EMAIL PROTECTED]> Sent: den 22 april 2002 17:35 Subject: Re: How many printable characters in 3.2.0? > Zsigri Gyula <[EMAIL PROTECTED]> wrote: > > > How many printable characters are

Variant locales?

2002-04-22 Thread Deborah Goldsmith
I had a recent inquiry from inside Apple as to whether there was a registry of variants of the standard ISO locales, e.g. ja_JP.kana for Japanese written only with kana. Does anyone know if there is any standard that attempts to describe such things? Thanks, Deborah Goldsmith Manager, Fonts &

Re: browsers and unicode surrogates

2002-04-22 Thread jshin
On Fri, 19 Apr 2002, Tom Gewecke wrote: > > With BOM at the beginning, Netscape 4.x, Netscape 6.x/Mozilla and MS > >IE 5.x/6.x can handle them without much problem except that support > >for characters above BMP varies from browser to browser as Tex tried to > >demonstrate in his test pages.

Re: browsers and unicode surrogates

2002-04-22 Thread Steffen Kamp
Tex, >Section 5.2.1 discusses the BOM. Also see my previous mail. Ah right. I agree that a BOM is definitely a good idea there but as 0xFFFE could either be a UTF-32LE BOM or a UTF-16LE BOM followed by a U+ (shouldn't happen too often, though) it is not necessarily easy to determine the
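The ambiguity described here can be reproduced with Python's standard `codecs` module. The four-byte input below is an assumption chosen for the example: it is the one sequence where a UTF-32LE BOM and a UTF-16LE BOM followed by a zero code unit coincide.

```python
import codecs

data = b"\xff\xfe\x00\x00"

# Both BOM constants are prefixes of the same four bytes.
print(data.startswith(codecs.BOM_UTF32_LE))   # True
print(data.startswith(codecs.BOM_UTF16_LE))   # True

# Reading 1: a bare UTF-32LE BOM, i.e. an empty document.
as_utf32 = data.decode("utf-32")
# Reading 2: a UTF-16LE BOM followed by one NUL character.
as_utf16 = data.decode("utf-16")

print(repr(as_utf32))   # ''
print(repr(as_utf16))   # '\x00'
```

A sniffer therefore has to pick a precedence. Checking for the longer UTF-32 BOM first is one common choice, on the grounds that text rarely begins with a NUL.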

Re: How many printable characters in 3.2.0?

2002-04-22 Thread Doug Ewell
Zsigri Gyula <[EMAIL PROTECTED]> wrote: > How many printable characters are there in Unicode 3.2.0? I tried > desperately to find the answer at the Unicode web site but could > not. There are 95,156 total assigned characters. To find the number of "printable" characters, you must first determin
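One way to explore the question is to tally code points by General_Category and then apply whichever definition of "printable" is wanted. The sketch below uses `unicodedata.ucd_3_2_0`, the frozen Unicode 3.2.0 database that Python ships. The definition of "printable" used here (letters, marks, numbers, punctuation, symbols and space separators) is an assumption made for illustration, not one taken from the thread, and the result is not claimed to match any figure quoted above.

```python
import unicodedata
from collections import Counter

ucd = unicodedata.ucd_3_2_0   # Python's frozen copy of the Unicode 3.2.0 database

def count_by_category(db=ucd):
    """Tally every code point U+0000..U+10FFFF by its General_Category."""
    counts = Counter()
    for cp in range(0x110000):
        counts[db.category(chr(cp))] += 1
    return counts

counts = count_by_category()

# Assumed definition of "printable": categories L*, M*, N*, P*, S* plus Zs.
printable = sum(
    n for cat, n in counts.items()
    if cat[0] in "LMNPS" or cat == "Zs"
)

print(ucd.unidata_version)   # 3.2.0
print(printable)
```

Changing the filter, for example to exclude combining marks or to include format characters, gives a different total, which is the point being made about first deciding what "printable" means.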

Re: browsers and unicode surrogates

2002-04-22 Thread Tex Texin
Thanks Stefan, that's good to know. Seems a bit odd to hide the encoding abilities, especially when there is already a "more" menu pick... thanks for the info. tex Stefan Persson wrote: > > IE 5 supports more encodings than listed. For example, > the western European DOS encoding is supported (

Re: "UNICODE BOMBER STRIKES AGAIN"

2002-04-22 Thread Doug Ewell
> There he sits in wait until you switch on, and BAM, all your data > turns to squares and the little beastie is laughing his socks off. That should have been "BOM." -Doug Ewell Fullerton, California

Re: browsers and unicode surrogates

2002-04-22 Thread Tex Texin
Jim, thanks for all the info. I would prefer to not clutter filenames with encodings and locales, and the remainder I need to coordinate with my ISP. I'll talk to them and see what they let me do. tex "James H. Cloos Jr." wrote: > > > "Tex" == Tex Texin <[EMAIL PROTECTED]> writes: > > Tex>

Re: browsers and unicode surrogates

2002-04-22 Thread Stefan Persson
--- Tex Texin <[EMAIL PROTECTED]> wrote: > Jungshik Shin, > Opera 6 lists UTF-16 as an encoding. Netscape 6.2 > lists UTF-16LE. IE 6 > does not list any UTF other than UTF-8. I haven't > noticed any encodings > becoming available or unavailable when I access > pages with different > encodings, b

"UNICODE BOMBER STRIKES AGAIN"

2002-04-22 Thread Peter_Constable
FYI: http://linguistlist.org/issues/13/13-1106.html#3 - Peter --- Peter Constable Non-Roman Script Initiative, SIL International 7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA Tel: +1 972 708 7485 E-mail: <[EMAIL PROTECT

Re: browsers and unicode surrogates

2002-04-22 Thread James H. Cloos Jr.
> "Tex" == Tex Texin <[EMAIL PROTECTED]> writes: Tex> I am surprised by the "must only be used". It seems I am not Tex> conforming by including a meta statement in the utf-16 HTML Tex> page. I should either remove the statement or encode the HTML up Tex> to and including that statement as asc

Re: How to punch in Tamil Unicode characters?

2002-04-22 Thread Avarangal
visit http://www.geocities.com/avarangal/index.html I'll be contacting you after Wednesday about Keyboard Input prog that I have. - Original Message - From: "suresh ." <[EMAIL PROTECTED]> To: <[EMAIL PROTECTED]> Sent: Monday, April 22, 2002 9:16 AM Subject: How to punch in Tamil Unicod

How many printable characters in 3.2.0?

2002-04-22 Thread Zsigri Gyula
How many printable characters are there in Unicode 3.2.0? I tried desperately to find the answer at the Unicode web site but could not. Thanks, Gyula Zsigri

Re: Thai word list

2002-04-22 Thread KUSANO Takayuki
At Thu, 18 Apr 2002 15:47:19 +0200 (CEST), Werner LEMBERG wrote: > > > I may be able to help you in these area. See > > http://developer.thai.net/libinthai/ - an open-source word-break > > library > > All links are broken on this page... The Japanese host no longer > exists apparently under the

Re: browsers and unicode surrogates

2002-04-22 Thread Lars Marius Garshol
* Tex Texin | | In looking at the HTML 4.01 spec to quote the above, I noted an | interesting sentence: | "The META declaration must only be used when the character encoding | is organized such that ASCII-valued bytes stand for ASCII characters | (at least until the META element is parsed)." |
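The reason behind the quoted sentence can be sketched in Python. The `meta` string is an assumed example of a declaration, not text from the thread: a parser that scans raw bytes for the ASCII text of the META element finds it in a UTF-8 page but not in a UTF-16 one.

```python
meta = '<meta http-equiv="Content-Type" content="text/html; charset=utf-16">'

utf8_bytes = meta.encode("utf-8")
utf16_bytes = meta.encode("utf-16-le")

# In UTF-8 the ASCII-valued bytes are exactly the ASCII characters,
# so a byte-level scan can locate the declaration.
found_in_utf8 = b"<meta" in utf8_bytes
# In UTF-16 each character is interleaved with a zero byte,
# so the same byte-level scan misses it.
found_in_utf16 = b"<meta" in utf16_bytes

print(found_in_utf8)    # True
print(found_in_utf16)   # False
```

This is why a UTF-16 page has to be identified by other means, such as a BOM or the charset parameter in the HTTP header.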

How to punch in Tamil Unicode characters?

2002-04-22 Thread suresh .
Hello everybody, I wish to use Unicode Tamil characters to develop a website. Can somebody tell me how to punch in Unicode Tamil characters into my webpage? Is it sufficient for me to mention encoding=utf-16 as my charset in the webpage? Please let me know how to key-in as well as to display uni

Re: browsers and unicode surrogates

2002-04-22 Thread Tex Texin
Steffen, Section 5.2.1 discusses the BOM. Also see my previous mail. I'll talk to my ISP to see if it's possible to have the charset in the HTTP set. Thanks for noting it. It wouldn't surprise me if the W3C validator didn't support utf-16, but I'll ask the author. tex Steffen Kamp wrote: > > >I

Re: browsers and unicode surrogates

2002-04-22 Thread Tex Texin
Jungshik Shin, Hi! Just a couple of minor comments. Opera 6 lists UTF-16 as an encoding. Netscape 6.2 lists UTF-16LE. IE 6 does not list any UTF other than UTF-8. I haven't noticed any encodings becoming available or unavailable when I access pages with different encodings, but maybe there is a