Re: Variations of UTF-16 (was: Re: "UNICODE BOMBER STRIKES AGAIN")

2002-04-24 Thread John Cowan
Doug Ewell scripsit:
> The Unix and Linux world is very
> opposed to the use of BOM in plain-text files, and if they feel that way
> about UTF-8 they probably feel the same about UTF-16.

I doubt it. The trouble with BOMizing is that it makes ASCII not a subset of UTF-8, but ASCII cannot be a su
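
To make the "ASCII not a subset of UTF-8" point concrete, here is a minimal Python sketch. The sample text and the helper name is_pure_ascii are illustrative assumptions, not something from the thread. It relies only on the standard library's "utf-8-sig" codec, which prepends the UTF-8 BOM when encoding.

```python
import codecs

# Hypothetical ASCII-only file content, chosen for illustration.
text = "#!/bin/sh\necho hello\n"

plain = text.encode("utf-8")        # UTF-8 without a BOM
marked = text.encode("utf-8-sig")   # UTF-8 with a leading BOM (EF BB BF)


def is_pure_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range."""
    return all(b < 0x80 for b in data)


# Without a BOM, ASCII text is byte-for-byte identical in UTF-8.
same_as_ascii = plain == text.encode("ascii")

# With a BOM, the file starts with three non-ASCII bytes.
starts_with_bom = marked.startswith(codecs.BOM_UTF8)
```

With the BOM in place, an ASCII-only tool (for example, a kernel looking for "#!" in the first two bytes) would no longer see the bytes it expects.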

Re: Variations of UTF-16 (was: Re: "UNICODE BOMBER STRIKES AGAIN")

2002-04-24 Thread Jungshik Shin
On Wed, 24 Apr 2002, David Starner wrote:
> On Wed, Apr 24, 2002 at 09:00:17AM -0700, Doug Ewell wrote:
> > The Unix and Linux world is very
> > opposed to the use of BOM in plain-text files, and if they feel that way
> > about UTF-8 they probably feel the same about UTF-16.

The reason we're n

Re: Variations of UTF-16 (was: Re: "UNICODE BOMBER STRIKES AGAIN")

2002-04-24 Thread David Starner
On Wed, Apr 24, 2002 at 01:37:39PM -0400, [EMAIL PROTECTED] wrote:
> Err, no. That's not the point, AFAIK. The point is that traditionally
> in UNIX there hasn't been any sort of "marker" or "tag" in the beginning,
> UNIX files being flat streams of bytes. The UNIX toolset has been built
> with

RE: Variations of UTF-16 (was: Re: "UNICODE BOMBER STRIKES AGAIN")

2002-04-24 Thread jarkko . hietaniemi
> Why? The problems with a BOM in UTF-8 have to do with it being an
> ASCII-compatible encoding.

Err, no. That's not the point, AFAIK. The point is that traditionally in UNIX there hasn't been any sort of "marker" or "tag" in the beginning, UNIX files being flat streams of bytes. The UNIX tool
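
One way to see why a leading marker sits awkwardly with "flat streams of bytes" is to concatenate two BOM-marked files the way cat would. This is only a sketch: the two byte strings stand in for hypothetical files, and the concatenation case is an illustration, not an example given in the thread.

```python
# Two hypothetical BOM-marked UTF-8 "files".
file_a = "one\n".encode("utf-8-sig")
file_b = "two\n".encode("utf-8-sig")

# Byte-level concatenation, as a tool unaware of BOMs would do it.
joined = file_a + file_b

# A BOM-aware decoder strips only the marker at the very start,
# so the second file's BOM survives as U+FEFF in the middle of the text.
decoded = joined.decode("utf-8-sig")
stray_marker_index = decoded.find("\ufeff")
```

Here stray_marker_index points at the start of the second line, where the leftover marker is now part of the data.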

Re: Variations of UTF-16 (was: Re: "UNICODE BOMBER STRIKES AGAIN")

2002-04-24 Thread David Starner
On Wed, Apr 24, 2002 at 09:00:17AM -0700, Doug Ewell wrote:
> The Unix and Linux world is very
> opposed to the use of BOM in plain-text files, and if they feel that way
> about UTF-8 they probably feel the same about UTF-16.

Why? The problems with a BOM in UTF-8 have to do with it being an ASCII

Re: Variations of UTF-16 (was: Re: "UNICODE BOMBER STRIKES AGAIN")

2002-04-24 Thread Doug Ewell
Mark Davis <[EMAIL PROTECTED]> wrote:
>> I must not *call* the sequence "UTF-16," since that term is officially
>> reserved for BOM-marked text which can be either little- or big-endian,
>> or BOMless text which must be big-endian.
>
> Yes, assuming the "BUT" clause applies to (b). That is, the u
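
The rule as stated in the quoted text (a BOM selects either byte order; no BOM means big-endian) can be sketched in Python as follows. The function name decode_utf16 is a hypothetical helper, and this is a reading of the quoted sentence rather than an authoritative implementation. It is written by hand because, as far as I know, Python's own "utf-16" codec falls back to the platform's native byte order when no BOM is present.

```python
import codecs


def decode_utf16(data: bytes) -> str:
    """Decode bytes labeled "UTF-16" per the rule quoted above (a sketch)."""
    if data.startswith(codecs.BOM_UTF16_BE):   # FE FF: big-endian
        return data[2:].decode("utf-16-be")
    if data.startswith(codecs.BOM_UTF16_LE):   # FF FE: little-endian
        return data[2:].decode("utf-16-le")
    # No BOM: the quoted rule says the text must be big-endian.
    return data.decode("utf-16-be")


big_marked = decode_utf16(b"\xfe\xff\x00A")
little_marked = decode_utf16(b"\xff\xfeA\x00")
unmarked = decode_utf16(b"\x00A")
```

All three inputs decode to the same one-letter string; only the unmarked one depends on the big-endian default.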

Re: Variations of UTF-16 (was: Re: "UNICODE BOMBER STRIKES AGAIN")

2002-04-24 Thread Mark Davis
below

— Γνῶθι σαυτόν — Θαλῆς
[For transliteration, see http://oss.software.ibm.com/cgi-bin/icu/tr]
http://www.macchiato.com

- Original Message -
From: "Doug Ewell" <[EMAIL PROTECTED]>
To: "Mark Davis" <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Cc: "Kenneth Whistler" <[EMAIL PROTECTED]