On Friday, 14 December 2012 at 18:27:29 UTC, Rob T wrote:
I created a D library wrapper for sqlite3 that uses a
dynamically constructed result list for returned records from a
SELECT statement. It works in a similar way to a C++ version
that I wrote a while back.
The D code is D code, not a cloned up version of my earlier C++
code, so it makes use of many of the features of D, and one of
them is the garbage collector.
When running comparison tests between the C++ version and the D
version, both compiled using performance optimization flags,
the C++ version runs 3x faster than the D version, which was
very unexpected. If anything I was hoping for a performance
boost out of D or at least the same performance levels.
I remembered reading about people having performance problems
with the GC, so I tried a quick fix, which was to disable the
GC before the SELECT is run and re-enable afterwards. The
result of doing that was a 3x performance boost, making the DMD
compiled version run almost as fast as the C++ version. The DMD
compiled version is now only 2 seconds slower on my stress test
runs of a SELECT that returns 200,000+ records with 14 fields.
Not too bad! I may get identical performance if I compile using
gdc, but that will have to wait until it is updated to 2.061.
Fixing this was a major relief since the code is expected to be
used in a commercial setting. I'm wondering though, why the GC
causes such a large penalty, and what negative effects, if any,
there will be when disabling the GC temporarily. I know that
memory won't be reclaimed until the GC is re-enabled, but is
there anything else to worry about?
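
For reference, the quick fix described above could look roughly
like the sketch below. It is not the actual wrapper code:
fetchRows and fetchRowsWithoutCollections are hypothetical
stand-ins for whatever builds the result list. Only GC.disable,
GC.enable and GC.collect from core.memory are real API. Two
things to keep in mind: disable/enable calls nest, so each
GC.disable needs a matching GC.enable, and the runtime may still
run a collection while disabled if it would otherwise run out of
memory.

```d
import core.memory : GC;

// Hypothetical stand-in for the code that builds the result list
// of a SELECT; the real wrapper would read rows from sqlite3.
string[][] fetchRows(size_t rowCount, size_t fieldCount)
{
    string[][] rows;
    rows.reserve(rowCount);                 // one up-front allocation for the outer array
    foreach (r; 0 .. rowCount)
    {
        auto row = new string[](fieldCount);
        foreach (f; 0 .. fieldCount)
            row[f] = "value";
        rows ~= row;
    }
    return rows;
}

// Runs fetchRows with automatic collections suppressed.
string[][] fetchRowsWithoutCollections(size_t rowCount, size_t fieldCount)
{
    GC.disable();                           // suppress automatic collections
    scope (exit) GC.enable();               // re-enable even if an exception is thrown
    return fetchRows(rowCount, fieldCount);
}

unittest
{
    auto rows = fetchRowsWithoutCollections(1_000, 14);
    assert(rows.length == 1_000);
    assert(rows[0].length == 14);
    GC.collect();                           // optionally collect at a moment of our choosing
}
```

Using scope (exit) rather than a plain GC.enable() call at the
end keeps the GC from staying disabled if the query throws.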
I feel it's worth commenting on my experience as feedback for
the D developers and anyone else starting off with D.
Coming from C++ I *really* did not like having the GC, it made
me very nervous, but now that I'm used to having it, I've come
to like having it up to a point. It really does change the way
you think and code. However as I've discovered, you still have
to always be thinking about memory management issues because
the GC can incur a huge performance penalty in certain
situations. I also NEED to know that I can always go full
manual where necessary. There's no way I would want to give up
that kind of control.
The trade off with having a GC seems to be that by default, C++
apps will perform considerably faster than equivalent D apps
out-of-the-box, simply because the manual memory management is
fine tuned by the programmer as the development proceeds. With
D, when you simply let the GC take care of business, then you
are not necessarily fine tuning as you go along, and when you
do not take the resulting performance hit into consideration it
means that your apps will likely perform poorly compared to a
C++ equivalent. However, building the equivalent app in D is a
much more pleasant experience in terms of the programming
productivity gain. The code is simpler to deal with, and
there's less to worry about with pointers and other memory
management issues.
What I have not yet had the opportunity to explore, is using D
in full manual memory management mode. My understanding is that
if I take that route, then I cannot use certain parts of the
std lib, and will also lose a few of the nice features of D
that make it fun to work with. I'm not fully clear though on
what to expect, so if there's any detailed information to look
at, it would be a big help.
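
As a rough illustration of what "full manual" can mean in
practice, here is a minimal sketch that allocates a struct with
the C heap and constructs it in place. Record, makeRecord and
freeRecord are hypothetical names used only for this example;
malloc/free (core.stdc.stdlib), emplace (std.conv) and destroy
are real. Features that allocate from the GC heap, such as new,
array appending and concatenation, associative arrays and
closures, are the ones that have to be avoided or replaced in
this style. Later compiler releases also provide an @nogc
attribute that makes the compiler check this.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

// Hypothetical record type, for illustration only.
struct Record
{
    int id;
    double value;

    this(int id, double value)
    {
        this.id = id;
        this.value = value;
    }
}

// Allocates a Record on the C heap and constructs it in place.
Record* makeRecord(int id, double value)
{
    auto mem = malloc(Record.sizeof);
    if (mem is null)
        assert(0, "out of memory");
    return emplace(cast(Record*) mem, id, value);
}

// Runs the destructor (if any) and returns the memory to the C heap.
void freeRecord(Record* r)
{
    if (r is null)
        return;
    destroy(*r);
    free(r);
}

unittest
{
    auto r = makeRecord(1, 2.5);
    scope (exit) freeRecord(r);   // manual ownership: we must free it ourselves
    assert(r.id == 1);
    assert(r.value == 2.5);
}
```

Note that memory obtained this way is not scanned by the GC, so
it should not hold the only reference to GC-allocated data
unless it is registered with GC.addRange.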
I wonder what can be done to allow a programmer to go fully
manual, while not losing any of the nice features of D?
Also, I think everyone agrees we really need a better GC, and I
wonder once we do get a better GC, what kind of overall
improvements we can expect to see?
Thanks for listening.
--rt
I have lots of experience with GC-enabled languages, even for
systems programming (Oberon & Active Oberon), and I think there
are a few issues to consider:
- D's GC still has a lot of room to improve, so some of the
issues you have found might eventually get improved;
- Having GC support does not mean calling new like crazy; one
still needs to think about how to code in a GC-friendly way;
- Make proper use of weak references in case they are available;
- The runtimes of GC-enabled languages usually offer some way to
peek inside, allowing the developer to understand how the GC is
working and what might be improved;
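
To make the second and last points concrete for D, here is a
small sketch, under some assumptions. squaresNaive and
squaresReserved are made-up functions for illustration. Appender
and its reserve method are part of std.array; GC.stats comes
from core.memory and is only available in reasonably recent
druntime versions, so treat that part as optional.

```d
import core.memory : GC;
import std.array : appender;

// Allocation-heavy version: the array may be reallocated
// several times as it grows, leaving garbage behind.
int[] squaresNaive(size_t n)
{
    int[] result;
    foreach (i; 0 .. n)
        result ~= cast(int)(i * i);
    return result;
}

// GC-friendlier version: reserve the capacity once, then fill.
int[] squaresReserved(size_t n)
{
    auto app = appender!(int[])();
    app.reserve(n);
    foreach (i; 0 .. n)
        app.put(cast(int)(i * i));
    return app.data;
}

unittest
{
    import std.stdio : writefln;

    auto a = squaresNaive(1_000);
    auto b = squaresReserved(1_000);
    assert(a == b);

    // Peek into the runtime: bytes currently used and free in the GC heap.
    auto s = GC.stats();
    writefln("GC heap: %s bytes used, %s bytes free", s.usedSize, s.freeSize);
}
```

Both functions return the same data; the difference is only in
how many intermediate allocations the GC has to deal with.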
The benefit of having a GC is a safer way to manage memory
across multiple modules, especially when ownership is not
clear.
Even in C++ I seldom do manual memory management nowadays, if
working on new codebases. Of course, others will have a different
experience.
Other than that, thanks for sharing your experience.
--
Paulo