On Tuesday, 3 February 2015 at 23:55:19 UTC, FG wrote:
On 2015-02-04 at 00:07, Foo wrote:
How would I use decoding for that? Isn't there a way to read
the file as UTF-8 or, even better, as Unicode?
Well, apparently the utf-8-aware foreach loop still works just
fine.
This program shows the file size and the number of Unicode
code points (one dchar each):
import core.stdc.stdio;

int main() @nogc
{
    const int bufSize = 64000;
    char[bufSize] buffer;
    size_t bytesRead, count;

    FILE* f = core.stdc.stdio.fopen("test.d", "r");
    if (!f)
        return 1;
    scope(exit) fclose(f); // close the file on every return path

    bytesRead = fread(buffer.ptr, 1, bufSize, f);
    if (bytesRead > bufSize - 1) {
        printf("File is too big\n");
        return 1;
    }
    if (!bytesRead)
        return 2;

    // A dchar loop variable over a char[] decodes UTF-8,
    // so each iteration yields one code point.
    foreach (dchar d; buffer[0 .. bytesRead])
        count++;

    // %zu is the printf format for size_t (%d expects an int)
    printf("read %zu bytes, %zu unicode characters\n",
        bytesRead, count);
    return 0;
}
Outputs for example this: read 838 bytes, 829 unicode characters
(It would be more complicated if it had to process bigger
files.)
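For bigger files, one possible approach is to stream the file through a fixed buffer and count every byte that is not a UTF-8 continuation byte. This is a different technique from the decoding foreach above: it never decodes anything, so a code point that straddles two chunks needs no special handling. It is only a sketch. `countCodePoints` and `countFile` are hypothetical names, the 4096-byte buffer size is arbitrary, and the input is assumed to be valid UTF-8.

```d
import core.stdc.stdio;

/// Counts code points in a chunk of UTF-8 by counting every byte that is
/// not a continuation byte (continuation bytes have the form 0b10xxxxxx).
/// Assumption: the input is valid UTF-8; nothing is validated here.
size_t countCodePoints(const(char)[] chunk) @nogc nothrow pure @safe
{
    size_t n;
    // A char loop variable means no decoding: one byte per iteration.
    foreach (char b; chunk)
        if ((b & 0xC0) != 0x80)
            n++;
    return n;
}

/// Hypothetical helper: streams a file through a fixed-size buffer and
/// accumulates the byte and code point totals.
/// Returns false if the file cannot be opened.
bool countFile(const(char)* path, out size_t bytes, out size_t codePoints)
    @nogc nothrow
{
    FILE* f = fopen(path, "rb");
    if (!f)
        return false;
    scope(exit) fclose(f);

    char[4096] buffer = void;
    size_t got;
    while ((got = fread(buffer.ptr, 1, buffer.length, f)) > 0)
    {
        bytes += got;
        codePoints += countCodePoints(buffer[0 .. got]);
    }
    return true;
}

unittest
{
    assert(countCodePoints("abc") == 3);
    // 'h' is 1 byte, U+00E4 and U+00DF are 2 bytes each: 5 bytes, 3 code points
    assert(countCodePoints("h\u00E4\u00DF") == 3);

    // A code point split across two chunks is still counted exactly once.
    string s = "\u00E4";
    assert(countCodePoints(s[0 .. 1]) + countCodePoints(s[1 .. 2]) == 1);
}
```

The unittest block can be run with `dmd -unittest -main -run` on the file. A call would look like `countFile("test.d", bytes, codePoints)` with two `size_t` variables.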
Using a foreach loop is such a nice idea! Thank you very much. :)
That's my code now:
----
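private:

static import m3.m3;
static import core.stdc.stdio;

alias printf = core.stdc.stdio.printf;

public:

// Reads the whole file into a buffer allocated by m3.m3.make.
// filename must be null-terminated (string literals are), because
// fopen receives filename.ptr. Returns null if the file cannot be read.
@trusted
@nogc
char[] readFile(in string filename) nothrow {
    // std.c.stdio is deprecated; core.stdc.stdio has the same bindings
    import core.stdc.stdio : FILE, SEEK_END, SEEK_SET, fopen, fclose,
        fseek, ftell, fread;

    FILE* f = fopen(filename.ptr, "rb");
    if (!f)
        return null;
    scope(exit) fclose(f);

    fseek(f, 0, SEEK_END);
    immutable pos = ftell(f); // ftell returns -1 on error
    fseek(f, 0, SEEK_SET);
    if (pos < 0)
        return null;
    immutable size_t fsize = cast(size_t) pos;

    char[] str = m3.m3.make!(char[])(fsize);
    fread(str.ptr, fsize, 1, f);
    return str;
}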
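@trusted
@nogc
@property
dstring toUTF32(in char[] s) {
    // r will never be longer than s
    dchar[] r = m3.m3.make!(dchar[])(s.length);
    // With an index variable, foreach yields the byte offset of each
    // code point in s, which would leave gaps in r after any multi-byte
    // sequence. Count the decoded code points separately instead.
    size_t n;
    foreach (dchar c; s) {
        r[n++] = c;
    }
    // Only the first n elements were written.
    return cast(dstring) r[0 .. n];
}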
@nogc
void main() {
    auto str = readFile("test_file.txt");
    scope(exit) m3.m3.destruct(str);
    if (str.length == 0)
        return; // nothing was read, so str[0] would be out of bounds

    auto str2 = str.toUTF32;
    printf("%d : %d\n", cast(int) str[0], cast(int) str2[0]);
}
----
m3 is my own module; the name stands for "manual memory management"
(three m's, hence m3). If we end up using D (which is now much more
likely), it will be our core module for memory management.
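One thing to watch with the decoding foreach: when the loop over a char[] has both an index and a dchar variable, the index is the byte offset at which each code point starts, not a running count of code points. The sketch below illustrates that and shows one way to fill a dchar buffer densely. `decodeInto` is a hypothetical name, and the caller is assumed to supply a destination with at least `s.length` elements, so no allocator is involved.

```d
/// Hypothetical helper: decodes UTF-8 from s into dest and returns the
/// filled prefix of dest. Assumption: dest.length >= s.length, which is
/// always enough because a code point takes at least one byte.
dchar[] decodeInto(const(char)[] s, dchar[] dest)
{
    size_t n; // number of code points written so far
    foreach (dchar c; s)
        dest[n++] = c;
    return dest[0 .. n];
}

unittest
{
    // U+00E4 takes two bytes in UTF-8, so 'b' starts at byte offset 2.
    string s = "\u00E4b";

    // The foreach index reports byte offsets (0 and 2), not 0 and 1.
    size_t[2] offsets;
    size_t k;
    foreach (size_t i, dchar c; s)
        offsets[k++] = i;
    assert(offsets[0] == 0 && offsets[1] == 2);

    // Counting separately gives a dense result of length 2.
    dchar[3] buf;
    auto r = decodeInto(s, buf[]);
    assert(r.length == 2);
    assert(r[0] == '\u00E4' && r[1] == 'b');
}
```

Writing `r[i] = c` with the foreach index would put 'b' at position 2 and leave position 1 at its initial value, which is why a separate counter is used here.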