On Monday, 28 July 2014 at 20:52:01 UTC, Anton wrote:
On Monday, 28 July 2014 at 19:57:38 UTC, Carl Sturtivant wrote:
Suppose I want to use D as a system programming language to
work with a library of functions written in another language,
operating on dynamically typed data that has its own garbage
collector, such as an algebra system or the virtual machine of
a dynamically typed scripting language viewed as a library of
operations on its own data type. For concreteness, suppose the
library is written in C. (More generally, the data need not be
restricted to the kind above, but for concreteness, make that
supposition.)
Data in such a system is usually a (possibly elaborate) tagged
union, that is, essentially a struct consisting of (say) two
words, the first indicating the type and perhaps containing
some bits that indicate other attributes, and the second
containing the data, which may be held directly or indirectly.
Call this a Descriptor.
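
To make the layout concrete, here is a minimal sketch of such a two-word Descriptor in D. The tag names, the flag bit, and the helper makeInteger are all hypothetical and chosen only for illustration; a real library defines its own encoding.

```d
import std.stdio : writeln;

// Hypothetical type tags; a real library defines its own set.
enum Tag : size_t { nil, integer, str }

// Hypothetical attribute bit kept in the first word next to the tag.
enum size_t flagIndirect = size_t(1) << (size_t.sizeof * 8 - 1);

// Two machine words: a tag/flags word and a payload word.
struct Descriptor
{
    size_t word;            // type tag plus attribute bits
    union
    {
        ptrdiff_t integer;  // payload held directly
        void* block;        // payload held indirectly (library-owned storage)
    }

    Tag tag() const { return cast(Tag)(word & ~flagIndirect); }
    bool isIndirect() const { return (word & flagIndirect) != 0; }
}

// The whole point of the design: a Descriptor is small enough to pass by value.
static assert(Descriptor.sizeof == 2 * size_t.sizeof);

// Hypothetical helper that builds a directly held integer value.
Descriptor makeInteger(ptrdiff_t n)
{
    Descriptor d;
    d.word = Tag.integer;
    d.integer = n;
    return d;
}

void main()
{
    auto d = makeInteger(42);
    writeln(d.tag, " ", d.integer, " indirect=", d.isIndirect);
}
```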
Descriptors are small, so it's natural to want them held by
value and not allocated on the heap (either D's or the
library's) unless they are a part of a bigger structure that
naturally resides there. And it's natural to want them to
behave like values when passed as parameters or assigned. This
usually fits in with the sort of heterogeneous copy semantics
of such a library, where some of the dynamic types are
implicitly reference types and others are not.
The trouble is that the library's alien GC needs to be made
aware of each Descriptor when it appears and when it
disappears, so that a call of a library function that
allocates storage doesn't trigger a garbage collection that
vacuums up library allocated storage that a D Descriptor
points to, or fails to adjust a pointer inside a D Descriptor
when it moves the corresponding data, or worse, follows a
garbage pointer from an invalid D Descriptor that's gone out
of scope. This requirement applies to local variables,
parameters and temporaries, as well as to other situations,
like D arrays of Descriptors that are D-heap allocated. Ignore
the latter kind of occasion for now.
Abstract the process of informing the GC of a Descriptor's
existence as a Protect operation, and of informing it that the
Descriptor is going out of scope as an Unprotect operation.
Protect and Unprotect naturally need the address of the storage
holding the relevant Descriptor.
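
As a rough illustration of the two operations when performed by hand, the sketch below models the library's root set as a table keyed by Descriptor address. The names protect, unprotect and roots are assumptions standing in for whatever the real library exports, and the two-field Descriptor is a simplified placeholder.

```d
// Simplified placeholder for the library's two-word value.
struct Descriptor { size_t word; void* data; }

// Hypothetical stand-in for the library's root set, keyed by the
// address of the storage holding each live Descriptor.
bool[Descriptor*] roots;

// Protect: tell the (mock) GC that a Descriptor lives at this address.
void protect(Descriptor* d) { roots[d] = true; }

// Unprotect: tell the (mock) GC that the storage is about to go away.
void unprotect(Descriptor* d) { roots.remove(d); }

void main()
{
    Descriptor d;
    protect(&d);
    scope (exit) unprotect(&d);   // runs however the scope is left

    // ... calls to library functions that may allocate would go here ...
    assert((&d in roots) !is null);
}
```

Written this way, every local, parameter and temporary needs its own protect/unprotect pair, which is exactly the bookkeeping the user of the library should not have to do.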
In a nutshell, the natural requirement when interfacing to
such a library is to add Descriptor as a new value type in D
along the lines described above, with a definition such that
Protect and Unprotect operations are compiled to be performed
automatically at the appropriate junctures so that the user of
the library can forget about garbage collection to the usual
extent.
How can this requirement be fulfilled?
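
One direction that might be explored, offered only as a sketch and not as a solution: wrap the Descriptor in a D struct whose constructor and postblit call Protect and whose destructor calls Unprotect. The names Value, protect, unprotect and roots are hypothetical, and the registry is a mock. The sketch assumes a value stays at the address where it was constructed or copied; D is allowed to move structs bitwise without running the postblit (the compiler-generated assignment is one such case), so address-keyed registration can go stale, and that is not handled here.

```d
import std.stdio : writeln;

// Simplified placeholder for the library's two-word value.
struct Descriptor { size_t word; void* data; }

// Hypothetical stand-ins for the library's Protect and Unprotect.
bool[Descriptor*] roots;
void protect(Descriptor* d)   { roots[d] = true; }
void unprotect(Descriptor* d) { roots.remove(d); }

// Hypothetical wrapper that registers and unregisters itself.
struct Value
{
    Descriptor d;

    @disable this();              // forbid unregistered default-initialized Values

    this(size_t word, void* data)
    {
        d = Descriptor(word, data);
        protect(&d);              // register the freshly constructed storage
    }

    this(this) { protect(&d); }   // postblit: runs on the copy, at the copy's address

    ~this() { unprotect(&d); }    // unregister before the storage disappears
}

void main()
{
    {
        auto a = Value(1, null);  // registered by the constructor
        auto b = a;               // registered by the postblit
        writeln(roots.length);    // number of addresses currently registered
    }
    writeln(roots.length);        // after both destructors have run
}
```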
Suppose I want to do system programming... Would I choose the
option with a GC?
Just drop it. The GC is nothing but trouble. People are smart
enough to manage memory.
It's the library to interface to that has its own GC, not my
code. I just need to use D's system programming capabilities to
work around the library's nasty GC so my data used by my calls to
that library isn't trashed, and to do that efficiently and
transparently. A system programming language should be able to
efficiently interface to anything, right?