On 10/14/2012 10:28 PM, Nick Sabalausky wrote:
On Sat, 13 Oct 2012 18:53:48 -0700
Charles Hixson <charleshi...@earthlink.net> wrote:

If std.stream is being deprecated, what is the correct way to deal
with file BOMs?  This particularly concerns utf8 files, which I
understand to be a bit problematic, as there isn't, actually, a utf8
BOM, merely a convention which isn't a part of a standard.  But the
std.stdio documentation doesn't so much as mention byte order marks
(BOMs).

If this should wait until std.io is released, then I could use
std.stream until then, but the documentation already warns against
using it.

Personally, I think it's kind of cumbersome to deal with in Phobos, so
I wrote this wrapper that I use instead, which handles everything:

https://bitbucket.org/Abscissa/semitwistdtools/src/977820d5dcb0/src/semitwist/util/io.d?at=master#cl-24

And then there's the utfConvert below it if you already have the data
in memory instead of on disk.

(Maybe I should add some range capability and make a Phobos pull
request. I don't know if it'd fly though. It uses a lot of custom
endian- and bom-related code since I found the existing endian/bom
stuff in phobos inadequate. So that stuff would have to be accepted,
and then this too, and it's usually a bit of a pain to get things
approved.)

That wrapper looks very nice, but it's a lot more than what I need. I want to deal only with utf8 files, many of which have BOMs. I *can* handle that by detecting the BOM and dropping it; I don't need anything else. I was merely wondering what the appropriate approach is now that std.stream is documented as deprecated with no replacement specified. It sounds like the appropriate response is to use std.stdio and handle the BOM myself.
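
For what it's worth, a minimal sketch of that "detect it and drop it" approach might look like the following. It assumes the files really are utf8, so the only BOM of interest is the three-byte sequence EF BB BF at the very start of the file. The names stripUtf8Bom and readLinesWithoutBom are made up for illustration and are not part of Phobos.

```d
import std.stdio : File;

/// Hypothetical helper: returns `text` without a leading UTF-8 BOM
/// (the bytes EF BB BF), or `text` unchanged if no BOM is present.
inout(char)[] stripUtf8Bom(inout(char)[] text) pure nothrow @safe
{
    if (text.length >= 3
        && text[0] == 0xEF && text[1] == 0xBB && text[2] == 0xBF)
    {
        return text[3 .. $];
    }
    return text;
}

/// Hypothetical helper: reads all lines of a utf8 file via std.stdio,
/// dropping a BOM from the first line if there is one.
string[] readLinesWithoutBom(string path)
{
    string[] lines;
    bool first = true;
    foreach (line; File(path, "r").byLine())
    {
        if (first)
        {
            // A BOM can only appear at the very start of the file.
            line = stripUtf8Bom(line);
            first = false;
        }
        // byLine reuses its buffer, so copy each line before storing it.
        lines ~= line.idup;
    }
    return lines;
}
```

Only the first line needs checking, so the per-line cost is a single boolean test. If the whole file is already in memory as a string, calling stripUtf8Bom on it once does the same job.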
