>INCOMING TEXT: Trivial to simply check. I say (once again) it's THREE BYTES.
>If they are there then there is a BOM. Simple.

Yes, it's trivial to check. What's missing is the notation to tell the
checker what to check for.
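
A minimal sketch of what such a notation could look like inside a checker, in Python. The policy names "require", "forbid", and "allow" and the function names are hypothetical, not taken from any existing tool or registered charset label; only the three-byte UTF-8 signature EF BB BF is standard.

```python
import codecs

# Hypothetical policy names; not part of any registered charset label.
POLICIES = ("require", "forbid", "allow")


def has_utf8_bom(data: bytes) -> bool:
    """Return True if data starts with the UTF-8 signature EF BB BF."""
    return data.startswith(codecs.BOM_UTF8)


def check_bom(data: bytes, policy: str = "allow") -> bool:
    """Return True if data satisfies the given BOM policy."""
    if policy not in POLICIES:
        raise ValueError(f"unknown BOM policy: {policy!r}")
    present = has_utf8_bom(data)
    if policy == "require":
        return present
    if policy == "forbid":
        return not present
    return True  # "allow": either form is acceptable
```

The detection itself is a single prefix comparison; the policy argument is the part that needs an agreed name.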

>> The inability to update to one standard all possible consuming
>> software one might encounter (or for that matter human customers'
>> opinions) is precisely why producing and checking software has to
>> handle both possibilities.
>But the "both possibilities" are trivial and it's by no means difficult to
>do. Having a good program that refuses to do a little work to handle three
>bytes is like someone who runs a 100 mile marathon and then refuses to cross
>the finish line because the line is yellow instead of white.

Yes, this is a good description of the sad state of existing software.
Noting that failure to standardize is irritating and unnecessary doesn't
make existing software go away.

-----Original Message-----
From: Michael (michka) Kaplan [mailto:michka@;trigeminal.com] 
Sent: Monday, November 04, 2002 8:08 AM
To: Joseph Boyle; Unicode Mailing List
Subject: Re: PRODUCING and DESCRIBING UTF-8 with and without BOM


From: "Joseph Boyle" <[EMAIL PROTECTED]>

Joseph,

> Software currently under development could use the identifiers for
> choosing whether to require or emit BOM, like the file requirements
> checker I have to write, and ICU/uconv.

Let's separate that into the two issues it represents:

EMITTING: They could simply choose globally whether to emit the BOM or not.
If they wanted to get "fancy" they could have a command line option which
said whether to emit the bytes or not. But that is optional.
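
A minimal sketch of the "fancy" variant in Python: a global default of no BOM, plus an optional switch. The flag name `--bom` and the function names are hypothetical, chosen only for illustration.

```python
import argparse
import codecs


def encode_utf8(text: str, emit_bom: bool = False) -> bytes:
    """Encode text as UTF-8, optionally prefixed with the BOM."""
    data = text.encode("utf-8")
    return codecs.BOM_UTF8 + data if emit_bom else data


def parse_args(argv=None):
    """Parse the command line; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(description="Write UTF-8 output.")
    # Hypothetical flag name; the global default here is "no BOM".
    parser.add_argument("--bom", action="store_true",
                        help="prefix the output with the UTF-8 BOM")
    return parser.parse_args(argv)
```

A caller would pass `parse_args().bom` as `emit_bom`; a program that skips the option simply hard-codes one of the two values.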

INCOMING TEXT: Trivial to simply check. I say (once again) it's THREE BYTES.
If they are there then there is a BOM. Simple.

> The inability to update to one standard all possible consuming
> software one might encounter (or for that matter human customers'
> opinions) is precisely why producing and checking software has to
> handle both possibilities.

But the "both possibilities" are trivial and it's by no means difficult to do.
Having a good program that refuses to do a little work to handle three bytes
is like someone who runs a 100 mile marathon and then refuses to cross the
finish line because the line is yellow instead of white.

> What would you mean by "the right thing" as far as emitting BOM?
> Should file conversion programs only allow output of non-BOM? (or
> with-BOM?) Or should they take the specification in an argument
> separate from the charset name? As said before this unnecessarily
> requires extra logic.

Already answered --- they can make a global decision, like Notepad or other
programs do. Especially if the programmer finds the idea of making it
configurable a huge hardship, they can skip that work and simply choose
whether they want it or not....

I plead with you -- keep it SIMPLE. :-)

MichKa


