On Jun 26, 2017 1:47 AM, "Rowan Worth" <row...@dug.com> wrote:

On 26 June 2017 at 15:09, Eric Grange <egra...@glscene.org> wrote:

> Alas, there is no end in sight to the pain for the Unicode decision to not
> make the BOM compulsory for UTF-8.
>

UTF-8 is byte oriented. The very concept of byte order is nonsense in this
context as there are no multi-byte storage primitives to worry about.

> Making it optional or non-necessary basically made every single text file
> ambiguous
>

Easily solved by never including a superfluous BOM in UTF-8 text.


Some people talk about dialing a phone or refer to a remote control as
a clicker, even though most of us don't use pulse-based dialing or remote
controls that actually click.

The reality is that interchange of text requires some means to communicate
the encoding, in band or out of band. ZWNBSP (now BOM) was selected as a
handy in band way to distinguish LE from BE fixed size multi-byte text. One
could just as easily call that stupid and demand everyone use network byte
order.
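
To make that concrete, here is a small Python sketch (Python chosen only for
illustration) of how the single code point U+FEFF serializes under each
encoding form. The helper name bom_bytes is made up for this example.

```python
def bom_bytes(encoding):
    """Return the bytes that U+FEFF serializes to in the given encoding."""
    # Explicit-endian codec names are used so that Python does not
    # prepend a BOM of its own, as the plain "utf-16" codec would.
    return "\ufeff".encode(encoding)

for name in ("utf-16-le", "utf-16-be", "utf-8"):
    print(name, bom_bytes(name).hex())
```

The two UTF-16 results are byte-swapped versions of each other, which is what
makes the mark usable for detecting byte order. The UTF-8 result is a fixed
three-byte sequence that carries no byte order information at all.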

Byte Order Mark isn't perfectly descriptive when used with UTF-8. Neither
is dialing a cell phone. Language evolves.

Maybe people would prefer calling it TEI (Text Encoding Identifier). Then
we could get back to discussion of whether or not stripping U+FEFF from the
beginning of text streams is a good idea. I'm not advocating one way or
another, but if a system strips U+FEFF from a text stream after using it to
determine the encoding, surely it is reasonable to expect that for all
supported encodings. If it doesn't do that for one, it shouldn't do it for
any.
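
As a rough illustration of "strip it for all or for none", here is a Python
sketch that uses the leading signature to pick the encoding and then drops it
the same way for every supported encoding. The function name, the table, and
the UTF-8 fallback are assumptions of this example; this is not how SQLite3
or any particular tool does it.

```python
import codecs

# Longest signatures first: the UTF-32 LE BOM (FF FE 00 00) begins with
# the UTF-16 LE BOM (FF FE), so it has to be tested before it.
_SIGNATURES = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def decode_with_signature(data, default="utf-8"):
    """Return (encoding, text), with any leading signature removed.

    The fallback encoding for unsigned input is an assumption of this
    sketch; a real tool would need some out-of-band default.
    """
    for bom, encoding in _SIGNATURES:
        if data.startswith(bom):
            # Same treatment for every encoding: detect, then strip.
            return encoding, data[len(bom):].decode(encoding)
    return default, data.decode(default)
```

One caveat: UTF-16 LE text whose first character after the BOM is U+0000
would be misread as UTF-32 LE by this ordering, so it remains a heuristic.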

Does SQLite3 support UTF-16 CSV files with a BOM/TEI? If not, then it need
not for UTF-8 either. If so, perhaps it should.

As for using a signature at the beginning of UTF-8 text, it certainly can
be useful to distinguish Unicode from code pages & other incompatible
encodings.

That being said, it's not difficult to strip TEI from a file before passing
it to SQLite3 (or any other tool for that matter).
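
For example, a minimal Python sketch of such a pre-processing step might look
like this. The function name and the file paths are hypothetical, and it only
handles the UTF-8 signature.

```python
import codecs
import shutil

def strip_utf8_signature(src_path, dst_path):
    """Copy src_path to dst_path, dropping a leading EF BB BF if present."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        head = src.read(len(codecs.BOM_UTF8))
        if head != codecs.BOM_UTF8:
            # Not a signature: these bytes are real data, so keep them.
            dst.write(head)
        # Stream the remainder unchanged, without loading the whole file.
        shutil.copyfileobj(src, dst)
```

Calling strip_utf8_signature("in.csv", "clean.csv") and then importing
clean.csv leaves files without a signature byte-for-byte unchanged.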
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
