On Wed, 23 May 2012 22:02:25 +0100, Paul wrote:
This works, though it's ugly:
foreach(line; uniS.splitLines()) {
    transcode(line, latinS);
    fout.writeln((cast(char[]) latinS));
}
The Latin1String type, at the storage level, is a ubyte[]. By casting
to char[], you can get a similar-to-string thing that writeln() can
handle.
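
To illustrate the storage-level point, here is a minimal sketch (the variable names are only for illustration, not from the code above). It assumes std.encoding.transcode and shows that a single non-ASCII character ends up as one byte in a Latin1String, and that the cast only reinterprets those bytes without converting them:

```d
import std.encoding : Latin1String, transcode;

void main()
{
    // U+00E9 (e with acute accent) takes two bytes in UTF-8.
    string uni = "\u00E9";
    assert(uni.length == 2);

    // After transcoding it is a single Latin-1 byte, 0xE9.
    Latin1String latinS;
    transcode(uni, latinS);
    auto raw = cast(const(ubyte)[]) latinS;
    assert(raw.length == 1 && raw[0] == 0xE9);

    // The cast to a char array reinterprets the same byte; nothing is converted,
    // so the result is not necessarily valid UTF-8.
    auto asChars = cast(const(char)[]) latinS;
    assert(asChars.length == 1 && asChars[0] == 0xE9);
}
```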
Graham
Awesome! What a lesson! Thank you!
So if anyone is following this thread, here's my code now. This reads a
text file (encoded in Latin-1, which is ASCII plus the extended codes in
the upper half of the byte range), allows D to work with it as Unicode,
and then writes it back out as Latin-1.
I wonder how the speed of this method compares with Era's home-spun
solution?
import std.stdio;
import std.string;
import std.file;
import std.encoding;
// Main function
void main(){
    auto fout = File("out.txt","w");
    auto latinS = cast(Latin1String) read("in.txt");
    string uniS;
    transcode(latinS, uniS);
    foreach(line; uniS.splitLines()){
        transcode(line, latinS);
        fout.writeln((cast(char[]) latinS));
    }
}
The only thing which would worry me about this code is the cast(char[]) in
the final writeln. I know some parts of Phobos verify that char data is
correct UTF-8, and this line casts Latin-1 to char[], which can potentially
create invalid UTF-8 data. That said, I had a really quick look at the
Phobos code for File.writeln and I'm not sure whether this function does
any UTF-8 validation. I would be happier if the Latin-1 was written as a
stream of bytes with no assumed interpretation, IMO.
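
Something along these lines might do it. This is an untested sketch, assuming File.rawWrite is acceptable here; copyAsLatin1 is just a name I made up for illustration. Note that rawWrite writes the bytes as given, so the line terminator is written by hand as "\n":

```d
import std.encoding : Latin1String, transcode;
import std.file : read;
import std.stdio : File;
import std.string : splitLines;

// Hypothetical helper: reads a Latin-1 file, processes it as UTF-8,
// and writes each line back out as raw Latin-1 bytes.
void copyAsLatin1(string inPath, string outPath)
{
    auto fout = File(outPath, "w");
    auto latinS = cast(Latin1String) read(inPath);

    // Decode to UTF-8 so the text can be handled as a normal D string.
    string uniS;
    transcode(latinS, uniS);

    foreach (line; uniS.splitLines())
    {
        Latin1String latinLine;
        transcode(line, latinLine);
        // Write the bytes untouched instead of presenting them as char[].
        fout.rawWrite(cast(const(ubyte)[]) latinLine);
        fout.rawWrite("\n");
    }
}

void main()
{
    copyAsLatin1("in.txt", "out.txt");
}
```

That way nothing downstream ever sees the Latin-1 data typed as char[], so it doesn't matter whether writeln validates or not.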
R