On 27 Jun 2017, at 7:12am, Rowan Worth <row...@dug.com> wrote:
> In fact using this assumption we could dispense with the BOM entirely for
> UTF-8 and drop case 5 from the list.

If you do that, you will try to process the BOM at the beginning of a UTF-8 stream as if it is characters.

> So my question is, what advantage does
> a BOM offer for UTF-8? What other cases can we identify with the
> information it provides?

Suppose your software processes only UTF-8 files, but someone feeds it a file which begins with FE FF. Your software should recognise this and reject the file, telling the user/programmer that it can’t process it because it’s in the wrong encoding.

Processing BOMs is part of the work you have to do to make your software Unicode-aware. Without it, your documentation should state that your software handles the one flavour of Unicode it handles, not Unicode in general. There’s nothing wrong with this, if it’s all the programmer/user needs, as long as it’s correctly documented.

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users
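
For illustration, the check described above (strip a UTF-8 BOM, reject input that starts with some other encoding's BOM) might look roughly like the following in Python. This is only a sketch, not anything from SQLite: the names decode_utf8_only and WrongEncodingError are made up for the example, and it assumes the whole input is already in memory as bytes.

```python
import codecs

# Known BOM signatures. The UTF-32 entries come first so that
# UTF-32 LE (FF FE 00 00) is not mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "UTF-32 BE"),
    (codecs.BOM_UTF32_LE, "UTF-32 LE"),
    (codecs.BOM_UTF8, "UTF-8"),
    (codecs.BOM_UTF16_BE, "UTF-16 BE"),  # FE FF, the case in the mail above
    (codecs.BOM_UTF16_LE, "UTF-16 LE"),
]


class WrongEncodingError(ValueError):
    """Hypothetical error type: input carries a BOM for a non-UTF-8 encoding."""


def decode_utf8_only(data: bytes) -> str:
    """Decode UTF-8 input, dropping a UTF-8 BOM and rejecting other BOMs."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            if name != "UTF-8":
                raise WrongEncodingError(
                    f"input looks like {name}; only UTF-8 is supported"
                )
            # Remove the UTF-8 BOM so it is not processed as characters.
            data = data[len(bom):]
            break
    return data.decode("utf-8")
```

Input without any BOM is simply decoded as UTF-8, so the function only adds two behaviours: a leading EF BB BF is dropped, and a leading UTF-16 or UTF-32 BOM produces an error that names the encoding instead of a confusing decode failure further in.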