Re: Utf8 to Utf32 cast cost
On Mon, 8 Jun 2015 12:59:31 +0200, Daniel Kozák via Digitalmars-d-learn wrote:

> On Mon, 08 Jun 2015 10:41:59 +
> Kadir Erdem Demir via Digitalmars-d-learn wrote:
>
> > I want to use my char array with awesome, cool std.algorithm
> > functions. Since many of this algorithms requires like slicing
> > etc.. I prefer to create my string with Utf32 chars. But by
> > default all strings literals are Utf8 for performance.
> >
> > With my current knowledge I use to!dhar to convert Utf8[](or
> > char[]) to Utf32[](or dchar[])
> >
> > dchar[] range = to!dchar("erdem".dup)
> >
> > How costly is this?
>
> import std.conv;
> import std.utf;
> import std.datetime;
> import std.stdio;
>
> void f0() {
>     string somestr = "some not so long utf8 string forbenchmarking";
>     dstring str = to!dstring(somestr);
> }
>
> void f1() {
>     string somestr = "some not so long utf8 string forbenchmarking";
>     dstring str = toUTF32(somestr);
> }
>
> void main() {
>     auto r = benchmark!(f0,f1)(1_000_000);
>     auto f0Result = to!Duration(r[0]);
>     auto f1Result = to!Duration(r[1]);
>     writeln("f0 time: ",f0Result);
>     writeln("f1 time: ",f1Result);
> }
>
> /// output ///
> f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
> f1 time: 600 ms, 979 μs, and 8 hnsecs

Please have the result of the transcode influence the program output. For example, add the first character of the UTF-32 string to some global variable and print it out. At the moment, at least in theory, you allow the compiler to deduce that f0/f1 are pure functions that return nothing, so you may be benchmarking anything from the code you wrote down to an empty loop.

I'm speaking from experience here:
https://github.com/mleise/fast/blob/master/source/fast/internal.d#L99

-- 
Marco
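The advice above could be applied roughly as follows. This is only a sketch under assumptions, not code that anyone in the thread ran: the global `sink` is a hypothetical name, and `benchmark` is imported from `std.datetime.stopwatch`, which is where current Phobos releases keep it (the 2015 code in the thread found it in `std.datetime`).

```d
import std.conv : to;
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;
import std.utf : toUTF32;

// Hypothetical global accumulator: every call folds part of its result
// in here, so the conversions have an observable effect.
uint sink;

void f0()
{
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = to!dstring(somestr);
    sink += str[0]; // use the first UTF-32 character
}

void f1()
{
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = toUTF32(somestr);
    sink += str[0];
}

void main()
{
    auto r = benchmark!(f0, f1)(1_000_000);
    writeln("f0 time: ", r[0]);
    writeln("f1 time: ", r[1]);
    // Printing the accumulator makes the program output depend on the results.
    writeln("sink: ", sink);
}
```

Whether a given compiler would really have removed the original conversions depends on the compiler and its flags; the accumulator simply takes that question off the table.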
Re: Utf8 to Utf32 cast cost
On Mon, 08 Jun 2015 11:13:25 +, "Daniel Kozak" wrote:

> BTW on ldc(ldc -O3 -singleobj -release -boundscheck=off)
> transcode is the fastest:
>
> f0 time: 1 sec, 115 ms, 48 μs, and 7 hnsecs // to!dstring
> f1 time: 449 ms and 329 μs // toUTF32
> f2 time: 272 ms, 969 μs, and 1 hnsec // transcode

Three functions, each twice as fast and twice as hidden as the one before. :)

-- 
Marco
Re: Utf8 to Utf32 cast cost
On Monday, 8 June 2015 at 18:48:17 UTC, Daniel Kozak wrote:
> Yep, but I dont care, I am the one who makes transcode faster, so I am happy with results :P.
>
> P.S. I care and probably when I have some spare time I will improve to!dstring too

Ah, so you are. I confused you with Kadir Erdem Demir.
Re: Utf8 to Utf32 cast cost
On Mon, 08 Jun 2015 18:16:57 +
Anonymouse via Digitalmars-d-learn wrote:

> On Monday, 8 June 2015 at 11:44:47 UTC, Daniel Kozák wrote:
> > No difference even with GC.disable() results are same.
>
> Profile! Callgrind is your friend~

Yep, but I dont care, I am the one who makes transcode faster, so I am happy with results :P.

P.S. I care and probably when I have some spare time I will improve to!dstring too
Re: Utf8 to Utf32 cast cost
On Monday, 8 June 2015 at 11:44:47 UTC, Daniel Kozák wrote:
> No difference even with GC.disable() results are same.

Profile! Callgrind is your friend~
Re: Utf8 to Utf32 cast cost
On Mon, 08 Jun 2015 11:32:07 +
Kagamin via Digitalmars-d-learn wrote:

> On Monday, 8 June 2015 at 10:59:45 UTC, Daniel Kozák wrote:
> > import std.conv;
> > import std.utf;
> > import std.datetime;
> > import std.stdio;
> >
> > void f0() {
> >     string somestr = "some not so long utf8 string forbenchmarking";
> >     dstring str = to!dstring(somestr);
> > }
> >
> > void f1() {
> >     string somestr = "some not so long utf8 string forbenchmarking";
> >     dstring str = toUTF32(somestr);
> > }
> >
> > void main() {
> >     auto r = benchmark!(f0,f1)(1_000_000);
> >     auto f0Result = to!Duration(r[0]);
> >     auto f1Result = to!Duration(r[1]);
> >     writeln("f0 time: ",f0Result);
> >     writeln("f1 time: ",f1Result);
> > }
> >
> > /// output ///
> > f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
> > f1 time: 600 ms, 979 μs, and 8 hnsecs
>
> Chances are you're benchmarking the GC. Try
> benchmark!(f0,f1,f0,f1,f0,f1);

No difference even with GC.disable() results are same.
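For anyone who wants to repeat the two checks mentioned here (interleaving the candidates and disabling the GC), a minimal sketch might look like the one below. It rests on assumptions: `sink` and `runs` are hypothetical names, `benchmark` comes from `std.datetime.stopwatch` as in current Phobos, and the run count is lower than the thread's 1_000_000 because nothing is collected while the GC is disabled.

```d
import core.memory : GC;
import std.conv : to;
import std.datetime.stopwatch : benchmark;
import std.stdio : writefln;
import std.utf : toUTF32;

immutable string somestr = "some not so long utf8 string forbenchmarking";

// Hypothetical accumulator that keeps the converted strings observable.
uint sink;

void f0() { sink += to!dstring(somestr)[0]; }
void f1() { sink += toUTF32(somestr)[0]; }

void main()
{
    // Assumption: fewer runs than in the thread, since every converted
    // string stays allocated while collections are switched off.
    enum runs = 100_000;

    GC.disable();
    scope (exit) GC.enable();

    // Interleave the candidates so a one-off cost is less likely to be
    // charged to only one of them.
    auto r = benchmark!(f0, f1, f0, f1, f0, f1)(runs);
    foreach (i, d; r)
        writefln("f%s, pass %s: %s", i % 2, i / 2 + 1, d);
    writefln("sink: %s", sink);
}
```

If the three passes of each function agree with each other and with a run that leaves the GC enabled, the gap between the two functions is unlikely to be a collection artifact, which matches what is reported above.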
Re: Utf8 to Utf32 cast cost
On Monday, 8 June 2015 at 10:59:45 UTC, Daniel Kozák wrote:
> import std.conv;
> import std.utf;
> import std.datetime;
> import std.stdio;
>
> void f0() {
>     string somestr = "some not so long utf8 string forbenchmarking";
>     dstring str = to!dstring(somestr);
> }
>
> void f1() {
>     string somestr = "some not so long utf8 string forbenchmarking";
>     dstring str = toUTF32(somestr);
> }
>
> void main() {
>     auto r = benchmark!(f0,f1)(1_000_000);
>     auto f0Result = to!Duration(r[0]);
>     auto f1Result = to!Duration(r[1]);
>     writeln("f0 time: ",f0Result);
>     writeln("f1 time: ",f1Result);
> }
>
> /// output ///
> f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
> f1 time: 600 ms, 979 μs, and 8 hnsecs

Chances are you're benchmarking the GC. Try benchmark!(f0,f1,f0,f1,f0,f1);
Re: Utf8 to Utf32 cast cost
On Monday, 8 June 2015 at 11:06:07 UTC, Daniel Kozák wrote:
> On Mon, 08 Jun 2015 10:51:53 +
> weaselcat via Digitalmars-d-learn wrote:
>
>> On Monday, 8 June 2015 at 10:49:59 UTC, Ilya Yaroshenko wrote:
>>> On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
>>>> I want to use my char array with awesome, cool
>>>> std.algorithm functions. Since many of this algorithms
>>>> requires like slicing etc.. I prefer to create my string
>>>> with Utf32 chars. But by default all strings literals are
>>>> Utf8 for performance.
>>>>
>>>> With my current knowledge I use to!dhar to convert
>>>> Utf8[](or char[]) to Utf32[](or dchar[])
>>>>
>>>> dchar[] range = to!dchar("erdem".dup)
>>>>
>>>> How costly is this?
>>>> Is there a way which I can have Utf32 string directly
>>>> without a cast?
>>>
>>> 1. dstring range = to!dstring("erdem"); //without dup
>>> 2. dchar[] range = to!(dchar[])("erdem"); //mutable
>>> 3. dstring range = "erdem"d; //directly
>>> 4. dchar[] range = "erdem"d.dup; //mutable
>>
>> what's wrong with http://dlang.org/phobos/std_utf.html#.toUTF32
>
> from: http://dlang.org/phobos/std_encoding.html#.transcode
>
> Supersedes:
> This function supersedes std.utf.toUTF8(), std.utf.toUTF16() and
> std.utf.toUTF32() (but note that to!() supersedes it more conveniently).

BTW on ldc(ldc -O3 -singleobj -release -boundscheck=off) transcode is the fastest:

f0 time: 1 sec, 115 ms, 48 μs, and 7 hnsecs // to!dstring
f1 time: 449 ms and 329 μs // toUTF32
f2 time: 272 ms, 969 μs, and 1 hnsec // transcode
Re: Utf8 to Utf32 cast cost
On Mon, 08 Jun 2015 10:51:53 +
weaselcat via Digitalmars-d-learn wrote:

> On Monday, 8 June 2015 at 10:49:59 UTC, Ilya Yaroshenko wrote:
> > On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
> >> I want to use my char array with awesome, cool std.algorithm
> >> functions. Since many of this algorithms requires like slicing
> >> etc.. I prefer to create my string with Utf32 chars. But by
> >> default all strings literals are Utf8 for performance.
> >>
> >> With my current knowledge I use to!dhar to convert Utf8[](or
> >> char[]) to Utf32[](or dchar[])
> >>
> >> dchar[] range = to!dchar("erdem".dup)
> >>
> >> How costly is this?
> >> Is there a way which I can have Utf32 string directly without
> >> a cast?
> >
> > 1. dstring range = to!dstring("erdem"); //without dup
> > 2. dchar[] range = to!(dchar[])("erdem"); //mutable
> > 3. dstring range = "erdem"d; //directly
> > 4. dchar[] range = "erdem"d.dup; //mutable
>
> what's wrong with http://dlang.org/phobos/std_utf.html#.toUTF32

from: http://dlang.org/phobos/std_encoding.html#.transcode

Supersedes:
This function supersedes std.utf.toUTF8(), std.utf.toUTF16() and std.utf.toUTF32() (but note that to!() supersedes it more conveniently).
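Since `transcode` hands back its result through an `out` parameter rather than a return value, a call looks a little different from `to!dstring` or `toUTF32`. A small sketch, in which the wrapper name `viaTranscode` is hypothetical and only there for illustration:

```d
import std.encoding : transcode;
import std.stdio : writeln;

// Hypothetical helper: converts UTF-8 to UTF-32 via std.encoding.transcode,
// which writes the converted string into its second (out) argument.
dstring viaTranscode(string s)
{
    dstring result;
    transcode(s, result);
    return result;
}

void main()
{
    dstring d = viaTranscode("erdem");
    assert(d == "erdem"d);
    writeln(d);
}
```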
Re: Utf8 to Utf32 cast cost
Thanks a lot, your answers are very useful to me. Nothing wrong with toUTF32, I just didn't know about it.
Re: Utf8 to Utf32 cast cost
On Mon, 08 Jun 2015 10:41:59 +
Kadir Erdem Demir via Digitalmars-d-learn wrote:

> I want to use my char array with awesome, cool std.algorithm
> functions. Since many of this algorithms requires like slicing
> etc.. I prefer to create my string with Utf32 chars. But by
> default all strings literals are Utf8 for performance.
>
> With my current knowledge I use to!dhar to convert Utf8[](or
> char[]) to Utf32[](or dchar[])
>
> dchar[] range = to!dchar("erdem".dup)
>
> How costly is this?

import std.conv;
import std.utf;
import std.datetime;
import std.stdio;

void f0() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = to!dstring(somestr);
}

void f1() {
    string somestr = "some not so long utf8 string forbenchmarking";
    dstring str = toUTF32(somestr);
}

void main() {
    auto r = benchmark!(f0,f1)(1_000_000);
    auto f0Result = to!Duration(r[0]);
    auto f1Result = to!Duration(r[1]);
    writeln("f0 time: ",f0Result);
    writeln("f1 time: ",f1Result);
}

/// output ///
f0 time: 2 secs, 281 ms, 933 μs, and 8 hnsecs
f1 time: 600 ms, 979 μs, and 8 hnsecs
Re: Utf8 to Utf32 cast cost
On Monday, 8 June 2015 at 10:49:59 UTC, Ilya Yaroshenko wrote:
> On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
>> I want to use my char array with awesome, cool std.algorithm
>> functions. Since many of this algorithms requires like slicing
>> etc.. I prefer to create my string with Utf32 chars. But by
>> default all strings literals are Utf8 for performance.
>>
>> With my current knowledge I use to!dhar to convert Utf8[](or
>> char[]) to Utf32[](or dchar[])
>>
>> dchar[] range = to!dchar("erdem".dup)
>>
>> How costly is this?
>> Is there a way which I can have Utf32 string directly without
>> a cast?
>
> 1. dstring range = to!dstring("erdem"); //without dup
> 2. dchar[] range = to!(dchar[])("erdem"); //mutable
> 3. dstring range = "erdem"d; //directly
> 4. dchar[] range = "erdem"d.dup; //mutable

what's wrong with http://dlang.org/phobos/std_utf.html#.toUTF32
Re: Utf8 to Utf32 cast cost
On Mon, 08 Jun 2015 10:41:59 +
Kadir Erdem Demir via Digitalmars-d-learn wrote:

> I want to use my char array with awesome, cool std.algorithm
> functions. Since many of this algorithms requires like slicing
> etc.. I prefer to create my string with Utf32 chars. But by
> default all strings literals are Utf8 for performance.
>
> With my current knowledge I use to!dhar to convert Utf8[](or
> char[]) to Utf32[](or dchar[])
>
> dchar[] range = to!dchar("erdem".dup)
>
> How costly is this?
> Is there a way which I can have Utf32 string directly without a
> cast?

dstring str = "erdem"d;

dstring str2 = std.utf.toUTF32(someUtf8Or16Or32String);
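To spell out the second line of that answer: `toUTF32` accepts a string of any of the three widths. A short sketch, with variable names chosen only for illustration:

```d
import std.utf : toUTF32;

void main()
{
    string  s8  = "erdem";   // UTF-8
    wstring s16 = "erdem"w;  // UTF-16
    dstring s32 = "erdem"d;  // UTF-32 already

    // Each call yields a dstring holding the same code points.
    assert(toUTF32(s8)  == "erdem"d);
    assert(toUTF32(s16) == "erdem"d);
    assert(toUTF32(s32) == "erdem"d);
}
```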
Re: Utf8 to Utf32 cast cost
On Monday, 8 June 2015 at 10:42:00 UTC, Kadir Erdem Demir wrote:
> I want to use my char array with awesome, cool std.algorithm
> functions. Since many of this algorithms requires like slicing
> etc.. I prefer to create my string with Utf32 chars. But by
> default all strings literals are Utf8 for performance.
>
> With my current knowledge I use to!dhar to convert Utf8[](or
> char[]) to Utf32[](or dchar[])
>
> dchar[] range = to!dchar("erdem".dup)
>
> How costly is this?
> Is there a way which I can have Utf32 string directly without a
> cast?

1. dstring range = to!dstring("erdem"); //without dup
2. dchar[] range = to!(dchar[])("erdem"); //mutable
3. dstring range = "erdem"d; //directly
4. dchar[] range = "erdem"d.dup; //mutable
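The four variants side by side, as a small sketch (the variable names are placeholders, since `range` cannot be declared four times in one scope):

```d
import std.conv : to;

void main()
{
    dstring a = to!dstring("erdem");    // 1. converted at run time, immutable
    dchar[] b = to!(dchar[])("erdem");  // 2. converted at run time, mutable
    dstring c = "erdem"d;               // 3. UTF-32 literal, nothing to convert
    dchar[] d = "erdem"d.dup;           // 4. mutable copy of the literal

    // All four hold the same five code points.
    assert(a == c && b == c && d == c);

    // Only the dchar[] variants can be modified in place.
    b[0] = 'E';
    d[0] = 'E';
    assert(b == "Erdem"d && d == "Erdem"d);
}
```

When the text is known at compile time, variants 3 and 4 avoid the UTF-8 to UTF-32 conversion entirely; variants 1 and 2 are the ones to use for strings that only exist at run time.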
Utf8 to Utf32 cast cost
I want to use my char array with the awesome, cool std.algorithm functions. Since many of these algorithms require things like slicing, I prefer to create my string with UTF-32 chars. But by default all string literals are UTF-8 for performance.

With my current knowledge I use to!dchar to convert UTF-8 (char[]) to UTF-32 (dchar[]):

dchar[] range = to!dchar("erdem".dup)

How costly is this? Is there a way I can have a UTF-32 string directly, without a cast?
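As background to the question, the difference between the two encodings shows up as soon as a string contains a non-ASCII character. The sketch below uses "çay" purely as a hypothetical sample word and assumes the source file is saved as UTF-8:

```d
import std.range : walkLength;

void main()
{
    string  s = "çay";   // UTF-8: 'ç' occupies two code units
    dstring d = "çay"d;  // UTF-32: one code unit per code point

    assert(s.length == 4);       // .length counts code units
    assert(d.length == 3);
    assert(s.walkLength == 3);   // Phobos ranges decode a string to dchars

    assert(d[1 .. 3] == "ay"d);  // dstring slices are by code point
    assert(s[2 .. 4] == "ay");   // string slices are by byte offset
}
```

Slicing a `string` is still possible, but the indices are byte offsets, which is why a `dstring` can be more convenient when an algorithm needs random access by character.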