On Sunday, 15 April 2012 at 02:56:21 UTC, Joseph Rushton Wakeling
wrote:
On Saturday, 14 April 2012 at 19:51:21 UTC, Joseph Rushton
Wakeling wrote:
GDC has all the regular gcc optimization flags available
IIRC. The ones on the
GDC man page are just the ones specific to GDC.
I'm not talking about compiler flags, but the "inline" keyword
in the C++ source
code. I saw some discussion about "@inline" but it seems not
implemented (yet?).
Well, that is not a priority for D anyway.
About compiler optimizations, -finline-functions and -fweb are
part of -O3. I
tried to compile with -no-bounds-check, but it made no difference
for DMD and GDC.
It probably is part of -release as q66 said.
Ah yes, you're right. I do wonder if your seeming speed
differences are magnified because the whole operation is only
2-4 seconds long: if your algorithm were operating over a
longer timeframe I think you'd likely find the relative speed
differences decrease. (I have a memory of a program that took
~0.004s with a C/C++ version and 1s with D, and the difference
seemed to be just startup time for the D program.)
Well, this doesn't seem to be true:
1.2MB compressible
encode:
C++: 0.11s (100%)
D-inl: 0.14s (127%)
decode:
C++: 0.12s (100%)
D-inl: 0.16s (133%)
~200MB compressible
encode:
C++: 17.2s (100%)
D-inl: 21.5s (125%)
decode:
C++: 16.3s (100%)
D-inl: 24.5s (150%)
3.8GB, barely compressible
encode:
C++: 412s (100%)
D-inl: 512s (124%)
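To check the startup-time idea independently of file size, one option is to time only the work itself from inside the program and then compute the ratio the same way as in the figures above. This is only a minimal sketch: it assumes a recent Phobos with `std.datetime.stopwatch`, and `relativePercent`, `secondsFor` and the summing loop are hypothetical stand-ins, not part of the benchmarked code.

```d
import std.datetime.stopwatch : AutoStart, StopWatch;
import std.math : round;
import std.stdio : writefln;

/// Runtime relative to a baseline, as a whole percentage (baseline = 100).
int relativePercent(double seconds, double baselineSeconds)
{
    return cast(int) round(100.0 * seconds / baselineSeconds);
}

/// Times only the call to `work`, so process startup is not counted.
double secondsFor(void delegate() work)
{
    auto sw = StopWatch(AutoStart.yes);
    work();
    sw.stop();
    return sw.peek.total!"usecs" / 1e6;
}

void main()
{
    // Hypothetical stand-in for the real encode loop.
    ulong sum;
    immutable elapsed = secondsFor({ foreach (i; 0 .. 1_000_000) sum += i; });
    writefln("work took %.6f s (checksum %s)", elapsed, sum);

    // The ~200MB encode figures quoted above: 21.5s for D vs 17.2s for C++.
    writefln("D relative to C++: %s%%", relativePercent(21.5, 17.2));
}
```

If the in-program time tracks the wall-clock time closely, startup cost is not what separates the two binaries.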
What really amazes me is the difference between g++, DMD and
GDC in size of the executable binary. 100 orders of magnitude!
I remarked on it in another topic before, with a simple "hello
world". I need to update that thread, now that I've got DMD working.
BTW, it is 2 orders of magnitude.
3 remarks about the D code. One is that much of it still seems
very "C-ish"; I'd be interested to see how speed and executable
size differ if things like the file opening, or the reading of
characters, are done with more idiomatic D code.
Sounds stupid as the C stuff should be fastest, but I've been
surprised sometimes at how using idiomatic D formulations can
improve things.
Well, it may indeed be faster, especially the IO, which depends
on things like buffering and so on. But for this I just
wanted something as close to the C++ code as possible.
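For reference, a more idiomatic way to read a file in D is to iterate over it in chunks with `File.byChunk` instead of pulling one character at a time through the C stdio calls. The sketch below is only an illustration: `sumBytes`, the 64 KiB chunk size and the scratch file name are hypothetical, and whether this is actually faster for the compressor would have to be measured.

```d
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File, writeln;

/// Sums all bytes of a file, reading it in 64 KiB chunks.
ulong sumBytes(string path)
{
    ulong total;
    // byChunk reuses one internal buffer, so each chunk is only valid
    // until the next iteration.
    foreach (ubyte[] chunk; File(path, "rb").byChunk(64 * 1024))
        foreach (b; chunk)
            total += b;
    return total;
}

void main()
{
    // Hypothetical scratch file, created only for this demonstration.
    immutable path = buildPath(tempDir(), "bychunk_demo.bin");
    ubyte[] data = [1, 2, 3, 4];
    write(path, data);
    scope (exit) remove(path);

    writeln(sumBytes(path));
}
```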
Second remark is more of a query -- can Predictor.p() and
.update really be marked as pure? Their result for a given
input actually varies depending on the current values of cxt
and ct, which are modified outside of function scope.
Yeah, I don't know. I just threw those qualifiers
at the compiler to see what sticks. And I was testing the
decode speed specifically to see easily if the output was corrupted.
But maybe it wasn't corrupted because the compiler doesn't
optimize based on "pure" yet... there was no speed difference
either... so...
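On the "pure" question: in D, a pure member function is still allowed to read and write the object's own fields through `this`; what it may not touch is mutable global state. So if cxt and ct are member fields of Predictor (an assumption here, the code is not shown in this post), marking p() and update() as pure compiles, but it does not promise the same result on every call. A minimal sketch with a hypothetical `Counter` type:

```d
struct Counter
{
    int count; // hypothetical stand-in for state such as cxt/ct

    // Compiles: a pure member function may read and write through `this`.
    int next() pure
    {
        return ++count;
    }
}

int globalState; // mutable module-level state

// This one would be rejected if marked pure, because it modifies
// mutable module-level state.
int nextGlobal()
{
    return ++globalState;
}

void main()
{
    Counter c;
    assert(c.next() == 1);
    assert(c.next() == 2); // same (empty) argument list, different result
}
```

Because such a function can mutate its object, the compiler cannot simply cache or drop repeated calls to it, which would be consistent with seeing neither corruption nor a speedup.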
Third remark -- again a query -- why the GC.disable ... ?
Just to be sure that the speed difference wasn't caused by the
GC. It didn't make any speed difference either, and it is indeed a
bad idea.
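If the GC has to be ruled out for a measurement, a less drastic option than disabling it for the whole program is to pause it only around the section being measured and re-enable it on the way out. A minimal sketch, where `squares` is a hypothetical stand-in for the measured work:

```d
import core.memory : GC;
import std.stdio : writeln;

/// Builds an array by appending, with automatic collections paused meanwhile.
int[] squares(int n)
{
    GC.disable();             // pause automatic collection cycles
    scope (exit) GC.enable(); // always re-enable, even if an exception is thrown

    int[] result;
    foreach (i; 0 .. n)
        result ~= i * i;      // still allocates from the GC heap
    return result;
}

void main()
{
    writeln(squares(5));
}
```

Note that allocation still goes through the GC heap while collections are paused, so memory use can only grow inside that section.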