On 10/13/2012 06:53 PM, Charles Hixson wrote:
> If std.stream is being deprecated, what is the correct way to deal with
> file BOMs? This is particularly concerning utf8 files, which I
> understand to be a bit problematic, as there isn't, actually, a utf8
> BOM,

That's correct. There is just one byte order for UTF-8.

> merely a convention which isn't a part of a standard.

I am not sure about that. The Unicode standard describes UTF-8 as a sequence of 8-bit code units that follow each other in the file, so there can't be any confusion about their order. According to Wikipedia, the only use of a BOM for UTF-8 is to identify the file as having been encoded in UTF-8:

  http://en.wikipedia.org/wiki/Byte_order_mark#UTF-8

But that can't be a reliable signal. The file could just as well have been encoded in any one of the multitude of code pages. In that case, treating the first three bytes as a BOM would be taking a chance, and skipping them would drop three legitimate characters.
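For what it is worth, here is a minimal sketch of how one might skip those three bytes by hand, assuming the file really is UTF-8. The function name skipUtf8Bom is made up for this example; it is not part of Phobos. The bytes are hard-coded here to keep the program self-contained; in real use they would come from something like std.file.read.

```d
import std.stdio : writeln;
import std.utf : validate;

// The UTF-8 encoding of U+FEFF, the code point used as the BOM.
immutable ubyte[] utf8Bom = [0xEF, 0xBB, 0xBF];

/// Hypothetical helper (not a Phobos function): returns the part of
/// `data` after a leading UTF-8 BOM, or `data` unchanged if there is none.
const(ubyte)[] skipUtf8Bom(const(ubyte)[] data)
{
    if (data.length >= utf8Bom.length && data[0 .. utf8Bom.length] == utf8Bom)
        return data[utf8Bom.length .. $];
    return data;
}

void main()
{
    // "hi" with and without a leading BOM (0x68 is 'h', 0x69 is 'i').
    const(ubyte)[] withBom = [0xEF, 0xBB, 0xBF, 0x68, 0x69];
    const(ubyte)[] withoutBom = [0x68, 0x69];

    auto a = cast(const(char)[]) skipUtf8Bom(withBom);
    auto b = cast(const(char)[]) skipUtf8Bom(withoutBom);

    // Assumption: the rest of the data is UTF-8. validate() throws a
    // UTFException if it is not, which at least catches some wrong guesses.
    validate(a);
    validate(b);

    writeln(a);
    writeln(b);
}
```

Note that this only removes the bytes; it cannot tell whether they were meant as a BOM or as three characters in some other encoding.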

> But the
> std.stdio documentation doesn't so much as mention byte order marks (BOMs).
>
> If this should wait until std.io is released, then I could use
> std.stream until then, but the documentation is already warning to avoid
> using it.

As I understand it, it is all down to convention anyway. What is the meaning of the non-ASCII code 166? Only the generator of the file knows. :/
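To illustrate the point about 166 (0xA6), here is a small sketch that lists what that one byte stands for under three single-byte code pages. The function meaningOf166 and its hand-written table are made up for this example and cover only this byte; they are not a general decoder. In UTF-8, by contrast, 0xA6 is a continuation byte and is not a character on its own.

```d
import std.stdio : writefln;

/// Hypothetical helper: the code point that byte 166 (0xA6) maps to
/// under a few code pages. Hand-written and limited to this one byte.
dchar meaningOf166(string codePage)
{
    switch (codePage)
    {
        case "ISO-8859-1": return '\u00A6'; // broken bar
        case "CP437":      return '\u00AA'; // feminine ordinal indicator
        case "ISO-8859-5": return '\u0406'; // Cyrillic Byelorussian-Ukrainian I
        default:           return '\uFFFD'; // not covered by this sketch
    }
}

void main()
{
    foreach (cp; ["ISO-8859-1", "CP437", "ISO-8859-5"])
        writefln("%s: U+%04X", cp, cast(uint) meaningOf166(cp));
}
```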

Ali
