On Friday, 14 December 2012 at 18:27:29 UTC, Rob T wrote:
I created a D library wrapper for sqlite3 that uses a dynamically constructed result list for returned records from a SELECT statement. It works in a similar way to a C++ version that I wrote a while back.

The D code is idiomatic D, not a port of my earlier C++ code, so it makes use of many of the features of D, and one of them is the garbage collector.

When running comparison tests between the C++ version and the D version, both compiled using performance optimization flags, the C++ version runs 3x faster than the D version, which was very unexpected. If anything, I was hoping for a performance boost out of D, or at least the same level of performance.

I remembered reading about people having performance problems with the GC, so I tried a quick fix, which was to disable the GC before the SELECT is run and re-enable it afterwards. The result of doing that was a 3x performance boost, making the DMD-compiled version run almost as fast as the C++ version. The DMD-compiled version is now only 2 seconds slower on my stress test runs of a SELECT that returns 200,000+ records with 14 fields. Not too bad! I may get identical performance if I compile using gdc, but that will have to wait until it is updated to 2.061.
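A minimal sketch of that kind of quick fix, for readers who want to try it. The `Row` type and `buildResults` function are hypothetical stand-ins for the wrapper's result-building code, not the actual library; only `GC.disable` and `GC.enable` from `core.memory` are real druntime calls.

```d
import core.memory : GC;
import std.stdio : writeln;

// Hypothetical record type standing in for one row of a SELECT result.
struct Row
{
    string[] fields;
}

// Hypothetical stand-in for the loop that builds the result list.
Row[] buildResults(size_t rowCount, size_t fieldCount)
{
    // Suspend automatic collections while the result list is built.
    GC.disable();
    // scope(exit) runs on every exit path, including exceptions,
    // so collections are always allowed again afterwards.
    scope (exit) GC.enable();

    Row[] rows;
    rows.reserve(rowCount);
    foreach (i; 0 .. rowCount)
    {
        auto fields = new string[](fieldCount);
        fields[] = "value"; // placeholder for real column data
        rows ~= Row(fields);
    }
    return rows;
}

void main()
{
    auto rows = buildResults(1_000, 14);
    writeln(rows.length);
}
```

Two points from the `core.memory` documentation seem worth keeping in mind: calls to `GC.disable` nest, so each one needs a matching `GC.enable`, and the runtime may still run a collection while disabled if it has to, for example when memory runs out.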

Fixing this was a major relief since the code is expected to be used in a commercial setting. I'm wondering, though, why the GC causes such a large penalty, and what negative effects, if any, there will be when disabling the GC temporarily. I know that memory won't be reclaimed until the GC is re-enabled, but is there anything else to worry about?

I feel it's worth commenting on my experience as feedback for the D developers and anyone else starting off with D.

Coming from C++, I *really* did not like having the GC; it made me very nervous. Now that I'm used to it, I've come to like having it, up to a point. It really does change the way you think and code. However, as I've discovered, you still always have to think about memory management issues, because the GC can impose a huge performance penalty in certain situations. I also NEED to know that I can always go full manual where necessary. There's no way I would want to give up that kind of control.

The trade-off with having a GC seems to be that, by default, C++ apps will perform considerably faster than equivalent D apps out of the box, simply because the manual memory management is fine-tuned by the programmer as development proceeds. With D, when you simply let the GC take care of business, you are not necessarily fine-tuning as you go along, and if you do not take the resulting performance hit into consideration, your apps will likely perform poorly compared to a C++ equivalent. However, building the equivalent app in D is a much more pleasant experience in terms of programming productivity. The code is simpler to deal with, and there's less to worry about with pointers and other memory management issues.

What I have not yet had the opportunity to explore is using D in full manual memory management mode. My understanding is that if I take that route, then I cannot use certain parts of the std lib, and will also lose a few of the nice features of D that make it fun to work with. I'm not fully clear, though, on what to expect, so if there's any detailed information to look at, it would be a big help.
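As a rough illustration of what the manual route can look like, here is a small sketch that takes a buffer from the C heap instead of the GC heap. The function name `sumWithManualBuffer` is made up for this example; `malloc` and `free` come from `core.stdc.stdlib`.

```d
import core.stdc.stdlib : malloc, free;
import std.stdio : writeln;

// Hypothetical example: fill a manually managed buffer and sum it.
int sumWithManualBuffer(size_t count)
{
    if (count == 0)
        return 0;

    // Raw memory from the C heap; the GC neither owns nor scans it.
    auto raw = cast(int*) malloc(int.sizeof * count);
    if (raw is null)
        return -1; // allocation failed

    // Release the memory on every exit path.
    scope (exit) free(raw);

    // Slicing the pointer gives normal, bounds-checked array syntax.
    int[] buf = raw[0 .. count];
    foreach (i, ref v; buf)
        v = cast(int) i;

    int total = 0;
    foreach (v; buf)
        total += v;

    // Note: appending with buf ~= x would allocate a new GC array
    // and copy the data, so it is deliberately avoided here.
    return total;
}

void main()
{
    writeln(sumWithManualBuffer(5));
}
```

Slices, `foreach`, and `scope (exit)` keep working on such memory. Operations that allocate behind the scenes, such as array appending and concatenation, are the ones that pull the GC back in.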

I wonder what can be done to allow a programmer to go fully manual, while not losing any of the nice features of D?

Also, I think everyone agrees we really need a better GC, and I wonder once we do get a better GC, what kind of overall improvements we can expect to see?

Thanks for listening.

--rt

I have lots of experience with GC-enabled languages, even for systems programming (Oberon & Active Oberon).

I think there are a few issues to consider:

- D's GC still has a lot of room to improve, so some of the issues you have found might eventually get improved;

- Having GC support does not mean calling new like crazy; one still needs to think about how to code in a GC-friendly way;

- Make proper use of weak references in case they are available;

- Runtimes of GC-enabled languages usually offer ways to peek into the runtime and allow the developer to understand how the GC is working and what might be improved;
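
To make the points about GC-friendly code and peeking into the runtime a bit more concrete, here is a small sketch in D. The `fill` helper is hypothetical. `assumeSafeAppend` and `reserve` are standard array operations, and `GC.stats` is assumed to be available, which holds for recent druntime releases but not for the compiler versions discussed in this thread.

```d
import core.memory : GC;
import std.stdio : writeln;

// Hypothetical helper: refill `buf` with n values, reusing its storage
// instead of allocating a fresh array on every call.
void fill(ref int[] buf, size_t n)
{
    buf = buf[0 .. 0];      // keep the memory block, drop the contents
    buf.assumeSafeAppend(); // allow appends to overwrite the old data in place
    foreach (i; 0 .. n)
        buf ~= cast(int) i;
}

void main()
{
    int[] buf;
    buf.reserve(1024); // one up-front allocation

    const before = GC.stats(); // snapshot of GC heap usage
    foreach (pass; 0 .. 1_000)
        fill(buf, 1_000);
    const after = GC.stats();

    writeln("used before: ", before.usedSize, " bytes");
    writeln("used after:  ", after.usedSize, " bytes");
}
```

Replacing the body of `fill` with `buf = new int[](n)` and comparing the reported numbers is one simple way to see how much garbage a loop produces.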

The benefit of having a GC is a safer way to manage memory across multiple modules, especially when ownership is not clear.

Even in C++, I seldom do manual memory management nowadays when working on new codebases. Of course, others will have a different experience.

Other than that, thanks for sharing your experience.

--
Paulo
