Mahmoud Al-Qudsi wrote:
> with `.import ……`, SQLite3 includes a BOM (UTF-8) as part of the first
> column of the first record.

The Unicode Standard 9.0 says in section 3.10:
| When represented in UTF-8, the byte order mark turns into the byte
| sequence <EF BB BF>. Its usage at the beginning of a UTF-8 data stream
| is neither required nor recommended by the Unicode Standard,

so you should not use it.

Treating this character as a zero width no-break space, and keeping it,
is a correct interpretation of the file.

> IMHO, this is of particular importance since the latest versions of MS
> Excel default to “UTF-8 CSV” which includes a BOM.

That's wrong:
| When converting between different encoding schemes, extreme care must
| be taken in handling any initial byte order marks. For example, if one
| converted a UTF-16 byte serialization with an initial byte order mark
| to a UTF-8 byte serialization, thereby converting the byte order mark
| to <EF BB BF> in the UTF-8 form, the <EF BB BF> would now be ambiguous
| as to its status as a byte order mark (from its source) or as an
| initial zero width no-break space. If the UTF-8 byte serialization
| were then converted to UTF-16BE and the initial <EF BB BF> were
| converted to <FE FF>, the interpretation of the U+FEFF character would
| have been modified by the conversion. This would be nonconformant
| behavior according to conformance clause C7, because the change
| between byte serializations would have resulted in modification of the
| interpretation of the text. This is one reason why the use of the
| initial byte sequence <EF BB BF> as a signature on UTF-8 byte
| sequences is not recommended by the Unicode Standard.

And Google Docs also thinks it would be a good idea to act against this
recommendation:
<https://productforums.google.com/forum/#!topic/docs/p_jCTwzuIqk>

> Would anyone be opposed to a patch to SQLite that disregarded a BOM
> when found during a csv import operation?

Well, being wrong doesn't mean that Microsoft or Google will change
their behaviour ...
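
For illustration only, here is a minimal sketch of what "disregarding a BOM during a CSV import" could look like. It is written in Python rather than as a patch to the SQLite shell, and the function name `read_csv_rows` and its `strip_bom` parameter are hypothetical, not part of any SQLite API. It assumes the whole file is already in memory as bytes and is otherwise valid UTF-8.

```python
import codecs
import csv
import io


def read_csv_rows(data: bytes, strip_bom: bool = True):
    """Parse UTF-8 CSV bytes into a list of rows (hypothetical helper).

    If strip_bom is true, a leading <EF BB BF> is treated as a signature
    and dropped; otherwise it is kept as U+FEFF in the first field.
    """
    if strip_bom and data.startswith(codecs.BOM_UTF8):
        # Only the very first three bytes are considered; a U+FEFF
        # anywhere else in the file is left untouched.
        data = data[len(codecs.BOM_UTF8):]
    text = data.decode("utf-8")
    # newline="" lets the csv module handle line endings itself.
    return list(csv.reader(io.StringIO(text, newline="")))


if __name__ == "__main__":
    sample = codecs.BOM_UTF8 + b"id,name\r\n1,Ann\r\n"
    print(read_csv_rows(sample))
    print(read_csv_rows(sample, strip_bom=False))
```

With `strip_bom=False` the first field of the first record keeps the U+FEFF character, which is the behaviour described in the original report; with the default, the signature is dropped before parsing.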

Regards,
Clemens
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users