Through Reddit I've found a nice blog post, "Using Static
Analysis And Clang To Find Heartbleed":
http://blog.trailofbits.com/2014/04/27/using-static-analysis-and-clang-to-find-heartbleed/
The part about the range analysis is nice:
Ranges of symbol values:
conj_$2{int} : { [-2147483648, -2], [0, 2147483647] }
conj_$9{uint32_t} : { [0, 6] }
Unlike D, here the Clang compiler is keeping and managing those
ranges across different lines of code. This means that here C/C++
is more modern&advanced than D.
Then the blog post shows a small C program that contains a
Heartbleed-like bug:
int data_array[] = { 0, 18, 21, 95, 43, 32, 51};
int main(int argc, char *argv[]) {
int fd;
char buf[512] = {0};
fd = open("dtin", O_RDONLY);
if(fd != -1) {
int size;
int res;
res = read(fd, &size, sizeof(int));
if(res == sizeof(int)) {
size = ntohl(size);
if(size < sizeof(data_array)) {
memcpy(buf, data_array, size);
}
memcpy(buf, data_array, size);
}
close(fd);
}
return 0;
}
Using a language a little less primitive than C helps avoid those
bugs and to avoid several other mistakes that can happen in that
C example code.
This is hypothetical equivalent D code that I think it's less
bug-prone (perhaps I've missed some little part of the original C
code):
immutable int[$] data = [0, 18, 21, 95, 43, 32, 51];
void main() {
ubyte[512] buf;
auto fIn = "dtin".File;
scope(exit) fIn.close;
immutable sizeBytes = fIn.tryRead!uint.Maybe!bswap;
if (!sizeBytes.isNull) {
if (sizeBytes.get < data.sizeof) {
copyBytes(buf, data, sizeBytes.get);
}
copyBytes(buf, data, sizeBytes.get);
}
}
Notes:
The "int[$] =" syntax means data is a fixed size array where the
length is inferred. This is a possible syntax to avoid bug-prone
code like (that currently compiles):
immutable int[8] data = [0, 18, 21, 95, 43, 32, 51];
An alternative syntax:
immutable data = [0, 18, 21, 95, 43, 32, 51]s;
scope(exit) helps avoid to forget to close the file when the
scope ends. It's not essential in D, that closes the File using
RAII and reference counting.
Instead of the ugly and bug-prone mess of "read(fd, &size,
sizeof(int));" I have used something nicer. "tryRead!uint" tries
to read an uint, and returns a Nullable!uint. The "Maybe" is a
function currently missing in Phobos
(https://issues.dlang.org/show_bug.cgi?id=12679 ), it's used to
chain other functions on nullables, so bswap gets called only if
tryRead doesn't return a null value. If tryRead returns a null,
then Maybe returns a nulled Nullable (currently I think "tryRead"
is not present in Phobos, but it's not hard to add something
similar, using parse).
copyBytes is a template function that accepts both dynamic arrays
and fixed-static arrays (if the input is a fixed-static array it
doesn't throw away this precious compile-time information, as
usually done by Phobos), that is much safer than memcpy. In
non-release mode the code inside copyBytes will give a out of
range error.
There are simple ways to reduce the template bloat caused by
handling the length of fixed size arrays in library code:
introducing in D a simple syntax to handle ghost typing (or here
ghost values), that allows to manage types (and compile-time
values) to enforce compile-time invariants, but that do not
generate distinct instantiations of code and data structures,
like templates on regular types (and regular compile-time values)
do.
The D version also uses better names that make the semantics more
clear, uses immutability (that is partially available in C too,
with "const" that is quite underused in C code), and is less
cluttered and noisy, this helps spot the bugs better.
In general I suggest to avoid using the raw C functions in D.
Better to wrap them inside something a little safer. If such
wrappers are standard Phobos functions it's even better:
https://d.puremagic.com/issues/show_bug.cgi?id=12548
Bye,
bearophile