On Mon, Jan 07, 2013 at 11:26:02PM +0100, Rob T wrote:
> On Monday, 7 January 2013 at 17:19:25 UTC, Jonathan M Davis wrote:
[...]
> >You _can_ program in D without the GC, but you lose features, and
> >there's no way around that. It may be the case that some features
> >currently require the GC when they shouldn't, but there are
> >definitely features that _must_ have the GC and _cannot_ be
> >implemented otherwise (e.g. array concatenation and closures).
> 
> Is this a hard fact, or can there be a way to make it work? For
> example what about the custom allocator idea?

Some features of D were *designed* with a GC in mind. As Jonathan has
already said, array slicing, concatenation, etc., pretty much *require*
a GC. I don't see how else you could implement code like this:

        // f and g just return slices into the caller's array; no allocation.
        int[] f(int[] arr) {
                assert(arr.length >= 4);
                return arr[2..4];
        }

        int[] g(int[] arr) {
                assert(arr.length >= 2);
                return arr[0..2];
        }

        // h returns either a slice into arr or a freshly GC-allocated
        // array (the concatenation), depending on runtime data.
        int[] h(int[] arr) {
                assert(arr.length >= 3);
                if (arr[0] > 5)
                        return arr[1..3];
                else
                        return arr[2..3] ~ 6;
        }

        void main() {
                int[] arr = [1,2,3,4,5,6,7,8];
                auto a1 = f(arr[1..5]);
                auto a2 = g(arr[3..$]);
                auto a3 = h(arr[0..6]);
                a2 ~= 123;      // may move a2 to a fresh allocation

                // Exercise for the reader: write manual deallocation
                // for this code.
        }

Yes, this code *can* be rewritten to use manual allocation, but it
would be a major pain in the neck (not to mention likely inefficient,
given the overhead of tracking where each slice points, whether anything
was reallocated, and what must be freed at the end).

Not to mention that h() defeats any static analysis the compiler could
do to keep track of what's going on: whether it returns a slice into the
input or a freshly allocated array depends on runtime data. So you're
pretty much screwed if you don't have a GC.
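
Just to make that concrete: the best the caller can do is a runtime
check. A tiny sketch (the name cameFromArr is made up, and the pointer
comparison is purely for illustration):

        // Does `result` point into `arr`, or is it a separate
        // allocation? There is no way to know at compile time.
        bool cameFromArr(int[] result, int[] arr) {
                return result.ptr >= arr.ptr
                    && result.ptr + result.length <= arr.ptr + arr.length;
        }

        // cameFromArr(h(arr[0..6]), arr) is true or false depending
        // on the value of arr[0], i.e. on runtime data.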

To do without the GC at the language level, you'd have to cripple most
of the main selling points of D arrays, reducing them to little more
than C arrays with fancy syntax, along with all the nasty caveats that
made C arrays (esp. strings) so painful to work with. In particular, h()
would need to be reimplemented with a different API: it would have to
return some kind of flag indicating whether the result is a slice into
the input or a fresh allocation. Every caller would then have to check
that flag, work out from wherever else the array is still referenced
whether the input needs to be deallocated, and so on, i.e. the usual
daily routine of a C programmer's painful life. None of this can
feasibly be automated, which means the compiler can't do it for you,
which means using D doesn't really give you any advantage here, and you
might as well just write it in straight C to begin with.
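
For concreteness, here's roughly what a manually-managed h() might have
to look like (h_manual and its out-parameters are made up for this
sketch; error handling omitted):

        import core.stdc.stdlib : malloc, free;

        // Hypothetical manual-memory variant of h(): the caller must be
        // told whether the result has to be free()d or not.
        int* h_manual(int* arr, size_t len,
                      out size_t outLen, out bool needsFree)
        {
                assert(len >= 3);
                outLen = 2;
                if (arr[0] > 5) {
                        needsFree = false;
                        return arr + 1;         // slice into caller's buffer
                } else {
                        needsFree = true;
                        auto p = cast(int*) malloc(2 * int.sizeof);
                        p[0] = arr[2];
                        p[1] = 6;
                        return p;               // caller must free() this
                }
        }

        // ...and every single call site turns into something like:
        //
        //      size_t n;
        //      bool mustFree;
        //      auto p = h_manual(buf.ptr, 6, n, mustFree);
        //      // use p[0 .. n] ...
        //      if (mustFree) free(p);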


> From a marketing POV, if the language can be made 100% free of the GC
> it would at least not be a deterrent to those who cannot accept having
> to use one. From a technical POV, there are definitely many situations
> where not using a GC is desirable.
[...]

I think much of the aversion to GCs is misplaced.  I used to be very
averse to GCs myself, so I totally understand where you're coming from.
I used to believe that GCs are for lazy programmers who can't be
bothered to think through their code and how to manage memory properly,
and that therefore GCs encourage sloppy coding. But then, after having
used D extensively for my personal projects, I discovered to my surprise
that having a GC actually *improved* the quality of my code -- it's much
more readable because I don't have to keep fiddling with pointers and
ownership (or worse, reference counts), and I can actually focus on how
to make the algorithms better. Not to mention the countless frustrating
hours spent chasing pointer bugs and memory leaks are all gone -- 'cos I
don't have to use pointers directly anymore.

As for performance, I have not noticed any significant performance
problems with using a GC in my D code. Now I know that there are cases
when the intermittent pause of the GC's mark-n-sweep cycle may not be
acceptable, but I suspect that 90% of applications don't even need to
care about this. Most applications won't even have any noticeable
pauses.
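
(As an aside, for the cases that do care, druntime at least lets you
choose when collections happen via core.memory.GC. A rough sketch; the
frame() function is made up:)

        import core.memory : GC;

        void frame() {
                GC.disable();   // suppress automatic collections in the
                                // hot path (they can still occur if the
                                // heap runs out of memory)
                // ... time-critical work ...
                GC.enable();
                GC.collect();   // collect at a moment of our choosing
        }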

The most prominent case where this *does* matter is in game engines,
which must squeeze out every last drop of performance from the hardware,
no matter what. But then, when you're coding a game engine, you aren't
writing general application code per se; you're engineering a
highly-polished and meticulously-tuned codebase where all data
structures are already carefully controlled and mapped out -- IOW, you
wouldn't be using GC-dependent features of D in this code anyway. So it
shouldn't even be a problem.
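
To illustrate what "not using GC-dependent features" means in practice,
a rough sketch (the names are made up): plain slicing of preallocated
storage never touches the GC heap; only concatenation and appending do.

        // Hot-path sketch: preallocated storage plus plain slicing.
        float[4096] vertexScratch;

        float[] takeScratch(size_t n) {
                assert(n <= vertexScratch.length);
                return vertexScratch[0 .. n];   // slicing: no allocation
        }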

The problem case comes when you have to interface this highly-optimized
core with application-level code, like in-game scripting or what-not. I
see a lot of advantages in splitting the scripting engine off into a
separate process from the high-performance video/whatever-handling code,
so the GC can merrily do its thing in the scripting engine (which is
targeted at script writers and level designers, who aren't into doing
pointer arithmetic just to get the highest polygon rates out of the
video hardware) without affecting the GC-independent core at all. That
way you get the best of both worlds.

Crippling the language to cater to the 10% crowd who want to squeeze
every last drop of performance from the hardware is the wrong approach
IMO.


T

-- 
"Life is all a great joke, but only the brave ever get the point." -- Kenneth 
Rexroth
