Never thought to mention these two things in the same subject line?
Haha, well today I finally have reason to. This post is about an obscure
bug I encountered today in one of my projects, with a moral lesson on
why you really, really, ought to be using unittest blocks everywhere.

First, a bit of background. The program in which said bug occurred
consists of a module that takes user input, preprocesses it, and
instantiates a code template that produces a D code snippet. This
snippet is then saved into a temporary file, and compiled with the local
D compiler to produce a shared object. Subsequently, the program uses
POSIX's dlopen() family of functions to load the shared object, look up
the symbol of the generated function, and return a function pointer to
it. The main module then does its own processing in which it calls the
generated code via this function pointer.

The actual code is, of course, rather involved, but here's a
highly simplified version of it that captures the essentials:

// The code template.
//
// The q"..." syntax is D's built-in heredoc syntax, convenient
// for multi-line string literals.
//
// Basically, this template is just a boilerplate function
// declaration to wrap around the generated D code snippet. It's
// written as a format string, with the "%s" specifying where
// the code snippet should be inserted.
static immutable string codeTemplate = q"ENDTEMPLATE
module funcImpl;
double funcImpl(double x, double y)
{
    return %s;
}
ENDTEMPLATE";

// A convenient alias for the function pointer type that will be
// returned by the JIT compiler.
alias FuncImpl = double function(double, double);

// Compiles the given input into a shared object, loads it, and
// returns a function pointer to its entry point.
FuncImpl compile(string preprocessedInput)
{
    // Instantiate the code template and write it into a
    // temporary source file.
    import std.format : format;
    string code = format(codeTemplate, preprocessedInput);

    import std.file : write;
    enum srcFile = "/path/to/tmpfile.d";
    write(srcFile, code);

    // Compile it into a shared object with the D compiler.
    // Thanks to the wonderful API of std.process, this is a
    // cinch: no need to fiddle with fork(), execv(),
    // waitpid(), etc.
    import std.process;
    enum soFile = "/path/to/tmpfile.so";
    auto ret = execute([
        "/path/to/dmd",
        "-fPIC",
        "-shared", // make it a shared object
        "-of" ~ soFile,
        srcFile
    ]);
    if (ret.status != 0) ... // compile failed

    // Load the result as a shared library
    import core.sys.posix.dlfcn;
    import std.string : toStringz;
    void* lib = dlopen(soFile.toStringz, RTLD_LAZY | RTLD_LOCAL);
    if (lib is null) ... // handle error

    // Look up the symbol of the generated function
    auto symbol = "_D8funcImpl8funcImplFddZd"; // mangled name of funcImpl()
    auto impl = cast(FuncImpl) dlsym(lib, symbol.toStringz);
    if (impl is null) ... // handle error
    return impl;
}

void main(string[] args)
{
    auto input = getUserInput(...);
    auto snippet = preprocessInput(input);
    auto impl = compile(snippet);
    ... // do stuff
    auto result = impl(x, y); // call generated function
    ... // more stuff
}

The symbol "_D8funcImpl8funcImplFddZd" is the mangled name of
funcImpl(). A mangled name encodes a function's qualified name and
signature into a legal identifier for an object file symbol -- the
system linker does not (and should not) understand D overloading rules,
for example, so the compiler needs to generate a unique name for every
function overload. Generally, the D compiler takes care of this for us,
so we never have to worry about it in usual D code, and can simply use
the human-readable name "funcImpl", or the fully-qualified name
"funcImpl.funcImpl". However, since dlsym() doesn't understand the D
mangling scheme, in this case we need to tell it the mangled name so
that it can find the right symbol in the shared object.
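
To make the mangling scheme a little more concrete, here is a small standalone sketch (not part of the program above) that asks the compiler for mangled names via the built-in .mangleof property. The two functions are hypothetical stand-ins; only their signatures matter. The module prefix of each printed name depends on the name of the file you compile, so the sketch checks only the tail of each name.

```d
import std.algorithm.searching : endsWith;
import std.stdio : writeln;

// Hypothetical stand-in with the same signature as the generated funcImpl().
double funcImpl(double x, double y) { return x * y; }

// Hypothetical sibling taking ints, to show that parameter types are
// encoded in the mangled name.
double intImpl(int x, int y) { return x * y; }

void main()
{
    // .mangleof yields the object-file symbol name as a string.
    string doubleName = funcImpl.mangleof;
    string intName = intImpl.mangleof;

    // The leading part encodes the module name, so it varies with the
    // file this is compiled from.
    writeln(doubleName);
    writeln(intName);

    // "F" opens the parameter list of a D function, "d" is double,
    // "i" is int, "Z" closes the list, and the final letter is the
    // return type. Each identifier is prefixed with its length.
    assert(doubleName.endsWith("8funcImplFddZd"));
    assert(intName.endsWith("7intImplFiiZd"));
}
```

Compiled as a module named funcImpl, the first name should come out as the "_D8funcImpl8funcImplFddZd" string used in compile() above: "_D", then the length-prefixed module and function names, then the signature.
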

This was the original version of the program. So far so good.

In this version of the program, I didn't write many unittests for
compile(), because I felt it was ugly for unittests to have side-effects
on the host system (creating / deleting files, running external
programs, etc.). So, to my shame, this