Interesting. I'd have thought the "extra copy" would be an
overall slowdown, but I guess that's not the case.
I also tried your strategy of adding '\n' to the buffer, but I
was getting some bad output on Windows. I'm not sure why "\n\n"
works, though. On *nix, I'd also have expected a double line
feed. Did you check the actual output?
I checked the output. The range selected is for one newline.
Appender is better than "~=", but it's not actually that good
either. Try this:
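//----
void printDiamond3(size_t N)
{
    import core.memory : GC;
    import std.stdio : write;

    // One allocation, slightly larger than needed, pre-filled with '*'.
    char* p = cast(char*)GC.malloc(N*N+16);
    p[0..N*N+16]='*';
    auto pp = p;    // write cursor
    N/=2;           // N is now the half-height
    // Emits one row: overwrite the leading stars with spaces,
    // skip past the row's stars, then write the line terminator.
    enum code = q{
        pp[0 .. N - n] = ' ';
        pp+=(1+N+n);
        version(Windows)
        {
            pp[0 .. 2] = "\r\n";
            pp+=2;
        }
        else
        {
            pp[0] = '\n';
            ++pp;
        }
    };
    foreach (n; 0 .. N + 1) {mixin(code);}
    foreach_reverse(n; 0 .. N ) {mixin(code);}
    write(p[0 .. pp-p]);
}
//----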
This makes just one allocation of roughly the right size. It also
eagerly fills the entire array with '*', since I *figure*
that's faster than many separate small writes.
I could be mistaken about that, but I imagine that
pre-allocating and not using Appender is definitely a boost.
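
For what it's worth, here is a minimal sketch of the same idea (one up-front allocation, eager '*' fill, spaces and terminators overwritten in place) using a bounds-checked slice instead of a raw pointer and a nested function instead of the string mixin. The names `diamondRows` and `emitRow` are made up for illustration, and the sketch always writes '\n' rather than switching on the platform. I haven't benchmarked it against the version above, so treat it as a readability variant, not a faster one.

```d
import std.stdio : write;

// Hypothetical helper (not from the thread): builds the whole diamond
// in one buffer and returns the used part, so it can be checked before printing.
char[] diamondRows(size_t size)
{
    size_t half = size / 2;
    size_t rows = 2 * half + 1;

    // Upper bound: every row has at most `rows` characters plus one '\n'.
    auto buf = new char[](rows * (rows + 1));
    buf[] = '*';            // eager fill, as in the post
    size_t pos = 0;         // write cursor

    void emitRow(size_t n)
    {
        buf[pos .. pos + half - n] = ' ';   // leading spaces
        pos += half + n + 1;                // spaces plus 2*n + 1 stars
        buf[pos++] = '\n';                  // assumption: '\n' on every platform
    }

    foreach (size_t n; 0 .. half + 1) emitRow(n);
    foreach_reverse (size_t n; 0 .. half) emitRow(n);
    return buf[0 .. pos];
}

void main()
{
    write(diamondRows(5));
}
```

Returning the slice instead of printing inside the function makes it easy to compare the output against the Appender version before timing either one.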
OK, I'll try it. I was happy that Appender was pretty fast.