On Sat, Feb 28, 2009 at 1:40 AM, jicman <cabre...@_wrc.xerox.com> wrote:
>
> Greetings.
>
> Sorry guys, please be patient with me. I am having a hard time understanding
> this Unicode, ANSI, UTF* ideas. I know how to get an UTF8 File and turn it
> into ANSI. and I know how to take a ANSI file and turn it into an UTF file.
> But, now I have a Unicode file and I need to change the content and create a
> new Unicode file with the changes in the content. I have read all kind of
> places, and I found mtext, from Chris Miller's site, by reading,
>
> http://www.prowiki.org/wiki4d/wiki.cgi?DanielKeep/TextInD
>
> Anyway, what I need is to read an Unicode file, search the strings inside,
> make changes to the file and write the changes back to an Unicode file.

You seem to be distinguishing between UTF and Unicode; it's kind of apples to oranges. Unicode is a standard for character encoding: a mapping from numbers (code points) to characters, like ASCII but much larger. UTF is a way - or rather, _several_ ways - of encoding Unicode text as bytes. There are three major encodings, UTF-8, UTF-16, and UTF-32, which correspond to D's char[], wchar[], and dchar[]. The 16- and 32-bit encodings each come in little- and big-endian versions.

When you say a "Unicode" file, do you mean it's encoded in UTF-16? If so, you can just read the file's contents as a wchar[].

If you're using Phobos, keep in mind that it provides no functionality for searching or manipulating wchar[]s, which means you'll have to convert the text to UTF-8 (char[]), make your changes, and convert it back to UTF-16 before writing it out.

If you're using Tango, you can give tango.io.UnicodeFile a shot - it will automatically transcode a file from any Unicode encoding to any other, and if your file has a BOM, it can even automatically detect which encoding it's in.
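
To make the Phobos route concrete, here is a rough sketch of the round trip: UTF-16 in, UTF-8 for the search and replace, UTF-16 back out. It is only an illustration and assumes several things not stated above: a current D compiler and standard library (std.utf, std.array), a little-endian machine, and a file that is UTF-16LE with a leading BOM. The helper name replaceInUtf16LE is made up for this example.

```d
import std.array : replace;
import std.exception : enforce;
import std.utf : toUTF16, toUTF8;

/// Hypothetical helper: takes the raw bytes of a UTF-16LE file (with BOM),
/// replaces every occurrence of `from` with `to`, and returns the new
/// file contents as UTF-16LE bytes, BOM included.
ubyte[] replaceInUtf16LE(const(ubyte)[] raw, string from, string to)
{
    // Assumption: the file starts with the UTF-16LE byte order mark FF FE.
    enforce(raw.length >= 2 && raw[0] == 0xFF && raw[1] == 0xFE,
            "expected a UTF-16LE BOM");
    enforce(raw.length % 2 == 0, "UTF-16 data must have an even byte count");

    // Reinterpret the remaining bytes as UTF-16 code units.
    // Assumption: the host is little-endian, so no byte swapping is needed.
    auto text16 = cast(const(wchar)[]) raw[2 .. $];

    // Transcode to UTF-8 so the ordinary string functions can be used.
    string text8 = toUTF8(text16);
    string changed = text8.replace(from, to);

    // Transcode back to UTF-16 and put the BOM in front again.
    wstring result16 = toUTF16(changed);
    ubyte[] bom = [0xFF, 0xFE];
    return bom ~ (cast(const(ubyte)[]) result16).dup;
}
```

You would feed it the bytes from std.file.read (cast to const(ubyte)[]) and pass the result to std.file.write. A big-endian file or one without a BOM would need extra handling that this sketch leaves out.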
