Re: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Doug Ewell
Joseph Boyle wrote: > Newline problems are a good analogy. They still require bookkeeping of > different formats and attention in any new coding and cause new bugs, > even though the problem has been around for decades. Nobody is holding > their breath for any of the platforms to change their new

Re: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Tex Texin
Joseph Boyle wrote: > > Yes, the software business is largely about dealing with the BADLY WRITTEN, > the TRIVIAL, and the BRAIN-DEAD. Your point? I see we are still working on naming utf-8 formats with and without the bom. I find these quite acceptable, assuming you mean: utf8-badly-written-

RE: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Joseph Boyle
rager@;umich.edu] Sent: Monday, November 04, 2002 9:19 AM To: Unicode Mailing List Subject: Re: PRODUCING and DESCRIBING UTF-8 with and without BOM Hi, everyone, It's almost unbelievable to me how many email postings are wasted on discussions such as this UTF-8 BOM issue ... I guess it means

Re: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Michael (michka) Kaplan
From: "Joseph Boyle" <[EMAIL PROTECTED]> > No, the notation to say "BOM required (report any files without BOM)", "BOM > not allowed (report any files with BOM)", or "BOM optional (only report > files if they are not valid UTF-8 at all)", for a given file type. Well, yes. If you wanted to avoid m

RE: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Joseph Boyle
>> Yes, it's trivial to check. What's missing is the notation to tell the >> checker what to check for. >Sorry, but that is incorrect. If they know it's UTF-8, then it's either a BOM or it's not. It is three specific bytes. No, the notation to say "BOM required (report any files without BOM)", "BO
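
[Sketch] The three cases named above (BOM required, BOM not allowed, BOM optional) could be expressed to a checker roughly as follows. This is only an illustration in Python, not the checker discussed in the thread; the names BomPolicy and check_utf8 are hypothetical.

```python
import codecs
from enum import Enum


class BomPolicy(Enum):
    """Hypothetical per-file-type notation for the three cases in the thread."""
    REQUIRED = "required"    # report any files without BOM
    FORBIDDEN = "forbidden"  # report any files with BOM
    OPTIONAL = "optional"    # only report files that are not valid UTF-8 at all


def check_utf8(data: bytes, policy: BomPolicy) -> list[str]:
    """Return a list of problems found in `data` under the given policy."""
    problems = []
    # The UTF-8 BOM is the three specific bytes EF BB BF.
    has_bom = data.startswith(codecs.BOM_UTF8)
    body = data[len(codecs.BOM_UTF8):] if has_bom else data
    try:
        body.decode("utf-8")
    except UnicodeDecodeError:
        problems.append("not valid UTF-8")
    if policy is BomPolicy.REQUIRED and not has_bom:
        problems.append("BOM required but missing")
    elif policy is BomPolicy.FORBIDDEN and has_bom:
        problems.append("BOM not allowed but present")
    return problems
```

For example, check_utf8(b"abc", BomPolicy.REQUIRED) reports the missing BOM, while the same bytes pass under BomPolicy.OPTIONAL.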

Re: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Edward H Trager
Hi, everyone, It's almost unbelievable to me how many email postings are wasted on discussions such as this UTF-8 BOM issue ... I guess it means that there is a lot of BADLY WRITTEN software out there in the world ;-) With regard to READING incoming UTF-8 text streams, surely any good software d

Re: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread John Cowan
Joseph Boyle scripsit: > I haven't encountered UTF-32, SCSU, UTF-7, or BOCU-1 as transfer encodings. Alas, a member of one of the mailing lists I'm on is using an old version of Netscape, and he ends up sending UTF-7 (unless he is very careful not to) whenever he does anything non-ASCII. The tro

RE: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Joseph Boyle
ardize is irritating and unnecessary doesn't make existing software go away. -Original Message- From: Michael (michka) Kaplan [mailto:michka@;trigeminal.com] Sent: Monday, November 04, 2002 8:08 AM To: Joseph Boyle; Unicode Mailing List Subject: Re: PRODUCING and DESCRIBING UTF-8 with an

RE: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Joseph Boyle
Boyle; 'Michael (michka) Kaplan' Subject: Re: PRODUCING and DESCRIBING UTF-8 with and without BOM Joseph Boyle wrote: > Software currently under development could use the identifiers for > choosing whether to require or emit BOM, like the file requirements > checker I have

Re: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Michael (michka) Kaplan
From: "Joseph Boyle" <[EMAIL PROTECTED]> > Yes, it's trivial to check. What's missing is the notation to tell the > checker what to check for. Sorry, but that is incorrect. If they know it's UTF-8, then it's either a BOM or it's not. It is three specific bytes. > Yes, this is a good description of

Re: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Doug Ewell
Joseph Boyle wrote: > Software currently under development could use the identifiers for > choosing whether to require or emit BOM, like the file requirements > checker I have to write, and ICU/uconv. Alternatively, software could use a completely separate flag to indicate whether a BOM is to be
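
[Sketch] The alternative mentioned above, keeping the encoding name as plain UTF-8 and passing the BOM choice as a completely separate flag, might look like this on the emitting side. It is a minimal Python illustration; encode_utf8 and emit_bom are hypothetical names, not part of ICU/uconv or any tool named in the thread.

```python
import codecs


def encode_utf8(text: str, emit_bom: bool = False) -> bytes:
    """Encode text as UTF-8; the BOM is controlled by a separate flag."""
    payload = text.encode("utf-8")
    # Prepend the three-byte BOM (EF BB BF) only when asked to.
    return (codecs.BOM_UTF8 + payload) if emit_bom else payload
```

With this shape the encoding identifier never changes; only the boolean differs between producers that want the BOM and those that do not.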

Re: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Michael (michka) Kaplan
From: "Joseph Boyle" <[EMAIL PROTECTED]> Joseph, > Software currently under development could use the identifiers for choosing > whether to require or emit BOM, like the file requirements checker I have to > write, and ICU/uconv. Let's separate that into the two issues it represents: EMITTING: T

RE: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Joseph Boyle
: PRODUCING and DESCRIBING UTF-8 with and without BOM From: "Joseph Boyle" <[EMAIL PROTECTED]> > Thanks for the dozens of responses discussing consumers' behavior on > UTF-8 BOM. This is actually not what I'm concerned with, as I have to > take it as a > g

Re: PRODUCING and DESCRIBING UTF-8 with and without BOM

2002-11-04 Thread Michael (michka) Kaplan
From: "Joseph Boyle" <[EMAIL PROTECTED]> > Thanks for the dozens of responses discussing consumers' behavior on UTF-8 > BOM. This is actually not what I'm concerned with, as I have to take it as a > given that there is both software that wants UTF-8 BOM and software that > doesn't want it. > > Cou