On Monday, 28 July 2014 at 20:52:01 UTC, Anton wrote:
On Monday, 28 July 2014 at 19:57:38 UTC, Carl Sturtivant wrote:
Suppose I want to use D as a system programming language to work with a library of functions written in another language, operating on dynamically typed data that has its own garbage collector, such as an algebra system or the virtual machine of a dynamically typed scripting language viewed as a library of operations on its own data type. For concreteness, suppose the library is written in C. (More generally, the data need not be restricted to the kind above, but for concreteness, make that supposition.)

Data in such a system is usually a (possibly elaborate) tagged union, that is essentially a struct consisting of (say) two words, the first indicating the type and perhaps containing some bits that indicate other attributes, and the second containing the data, which may be held directly or indirectly. Call this a Descriptor.
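To make the two-word layout concrete, here is a minimal sketch in C of what such a Descriptor might look like. Everything in it is an assumption for illustration: the tag values, the `FLAG_POINTER` attribute bit, the field names `dword` and `vword`, and the helper functions are hypothetical and not taken from any particular library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type tags; a real library defines its own set. */
enum { TAG_NULL = 0, TAG_INT = 1, TAG_STRING = 2 };

/* Hypothetical attribute bit kept in the first word next to the tag:
   set when the second word is a pointer the library's GC must trace. */
#define FLAG_POINTER ((uintptr_t)1 << 8)
#define TAG_MASK     ((uintptr_t)0xFF)

/* A two-word tagged union. */
typedef struct Descriptor {
    uintptr_t dword;          /* type tag plus attribute bits */
    union {
        intptr_t integer;     /* data held directly */
        void    *block;       /* data held indirectly, in library storage */
    } vword;
} Descriptor;

/* Build a Descriptor whose data is held directly in the second word. */
static Descriptor make_int(intptr_t n) {
    Descriptor d;
    d.dword = TAG_INT;
    d.vword.integer = n;
    return d;
}

/* Build a Descriptor whose second word points at library-owned storage. */
static Descriptor make_string(void *block) {
    Descriptor d;
    assert(block != NULL);
    d.dword = TAG_STRING | FLAG_POINTER;
    d.vword.block = block;
    return d;
}

/* Extract the type tag from the first word. */
static unsigned tag_of(Descriptor d) {
    return (unsigned)(d.dword & TAG_MASK);
}

/* Nonzero when the GC would have to follow the second word. */
static int holds_pointer(Descriptor d) {
    return (d.dword & FLAG_POINTER) != 0;
}
```

Copying such a struct by value copies both words, so a copy of a pointer-holding Descriptor refers to the same library storage as the original. That is the "implicitly reference type" behaviour mentioned in the next paragraph.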

Descriptors are small, so it's natural to want them held by value and not allocated on the heap (either D's or the library's) unless they are a part of a bigger structure that naturally resides there. And it's natural to want them to behave like values when passed as parameters or assigned. This usually fits in with the sort of heterogeneous copy semantics of such a library, where some of the dynamic types are implicitly reference types and others are not.

The trouble is that the library's alien GC needs to be made aware of each Descriptor when it appears and when it disappears. Otherwise a call of a library function that allocates storage could trigger a garbage collection that vacuums up library-allocated storage that a D Descriptor points to, or fails to adjust a pointer inside a D Descriptor when it moves the corresponding data, or worse, follows a garbage pointer from an invalid D Descriptor that has gone out of scope. This requirement applies to local variables, parameters and temporaries, as well as to other situations, like D arrays of Descriptors that are D-heap allocated. Ignore the latter kind of occasion for now.

Abstract the process of informing the GC of a Descriptor's existence as a Protect operation, and the process of informing it that the Descriptor is about to go out of scope as an Unprotect operation. Protect and Unprotect naturally need the address of the storage holding the relevant Descriptor.
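As a rough sketch of what the library side of Protect and Unprotect could look like, the C below keeps a table of addresses of live Descriptors that the collector would treat as roots. This is purely an assumption for illustration: the fixed-size table, the `MAX_ROOTS` limit, the return codes, the `protected_count` helper and the stand-in Descriptor layout are all hypothetical, and a real library may register roots quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the library's two-word Descriptor (hypothetical layout). */
typedef struct Descriptor {
    uintptr_t dword;   /* type tag and attribute bits */
    uintptr_t vword;   /* data, held directly or as a pointer */
} Descriptor;

/* Hypothetical root table: addresses of Descriptors living outside the
   library's heap that the collector must scan and may have to update. */
#define MAX_ROOTS 1024
static Descriptor *roots[MAX_ROOTS];
static size_t root_count = 0;

/* Register the storage at d as a GC root.
   Returns 0 on success, -1 if the table is full. */
static int Protect(Descriptor *d) {
    assert(d != NULL);
    if (root_count == MAX_ROOTS)
        return -1;
    roots[root_count++] = d;
    return 0;
}

/* Remove the storage at d from the root table.
   Searches from the most recent entry, since locals and temporaries
   tend to be released in reverse order of creation.
   Returns 0 on success, -1 if d was not registered. */
static int Unprotect(Descriptor *d) {
    for (size_t i = root_count; i-- > 0; ) {
        if (roots[i] == d) {
            roots[i] = roots[--root_count];  /* fill the gap with the last entry */
            return 0;
        }
    }
    return -1;
}

/* Number of currently registered roots. */
static size_t protected_count(void) {
    return root_count;
}
```

Because the table stores addresses rather than copies, a Descriptor must not be moved in memory between its Protect and its Unprotect. That constraint matters for any value type that is meant to wrap it.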

In a nutshell, the natural requirement when interfacing to such a library is to add Descriptor as a new value type in D along the lines described above, with a definition such that Protect and Unprotect operations are compiled to be performed automatically at the appropriate junctures so that the user of the library can forget about garbage collection to the usual extent.

How can this requirement be fulfilled?

Suppose I want to do system programming... Would I choose the option with a GC? No thanks. A GC is just a nuisance. People are smart enough to manage memory.

It's the library to interface to that has its own GC, not my code. I just need to use D's system programming capabilities to work around the library's nasty GC so my data used by my calls to that library isn't trashed, and to do that efficiently and transparently. A system programming language should be able to efficiently interface to anything, right?
