[sqlite] Article about pointer abuse in SQLite

Richard Hipp Fri, 18 Mar 2016 18:03:54 -0400

On 3/18/16, Igor Tandetnik <igor at tandetnik.org> wrote:
> On 3/18/2016 4:40 PM, Keith Medcalf wrote:
>> There is no such thing as "undefined behaviour".  The machine code does
>> exactly what it is told to do
>
> But SQLite is not written in machine code. It is (largely) written in C.


SQLite is written in C, but the focus of testing is the resulting
machine code.  So, there is a point of view that says that it doesn't
really matter if the C code is "undefined" or not.  What matters is
whether or not the resulting machine code computes the correct answer.
That's what we test: does the machine code compute the correct answer?

Our testing standard is to cause every machine-code branch instruction
to take both decisions (jump and fall through) at least once, in an
as-deployed configuration.  That means every instruction in the
machine code is tested and testing occurs with no special compile-time
options or flags.  The code that gets tested is exactly byte-for-byte
the same machine code that gets deployed in the field.  This is what
they mean by "fly what you test and test what you fly".
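
To make the branch-coverage standard above concrete, here is a minimal
sketch in C.  The function and driver names are hypothetical and are not
taken from the SQLite sources.  The assumption is that a typical
compiler turns each operand of the short-circuit && into its own
conditional branch, so one source-level "if" needs several test cases
before every machine-code branch has gone both ways.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not from the SQLite sources): true when the
 * buffer is usable and its first byte is zero.  The condition is one
 * expression in C, but each operand of && is typically compiled to its
 * own conditional branch instruction. */
bool first_byte_is_zero(const unsigned char *buf, size_t n)
{
    if (buf != NULL && n > 0 && buf[0] == 0) {
        return true;
    }
    return false;
}

/* Hypothetical test driver: each call is chosen so that one particular
 * operand decides the outcome, so every likely branch is driven both
 * ways (jump and fall through) at least once. */
void exercise_first_byte_is_zero(void)
{
    const unsigned char zero[1] = {0};
    const unsigned char nonzero[1] = {7};

    assert(!first_byte_is_zero(NULL, 1));     /* buf != NULL is false */
    assert(!first_byte_is_zero(zero, 0));     /* n > 0 is false       */
    assert(!first_byte_is_zero(nonzero, 1));  /* buf[0] == 0 is false */
    assert(first_byte_is_zero(zero, 1));      /* all three are true   */
}
```

Statement coverage at the source level would be satisfied by the last
two calls alone.  Whether a given compiler really emits three branches
here depends on the compiler and its settings, which is one reason to
measure coverage on the machine code itself.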

That said, we also do source-code validation on SQLite.  We work to
make SQLite compile without warnings, we run it through static
analyzers, and I address any "undefined behaviors" in the source code
that John Regehr or others report.  SQLite contains thousands of
assert() statements which are really another form of source-code
validation (as the asserts do not appear in deployment builds).  Just
the other day, I wrote a custom static code analyzer for SQLite that
runs on every "make" looking for a particular kind of programming
error (see https://www.sqlite.org/src/artifact/4f65e1a6748e42f2).  We
also compile and test SQLite on as many different compilers as we can,
with all kinds of different compiler settings, and verify that they
all get the same answer, which is yet another kind of source-code
validation.
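
As an illustration of why assert() counts as source-code validation
rather than machine-code testing, here is a small sketch using only the
standard C assert macro.  The routine is hypothetical, and how SQLite
itself switches its asserts on and off is not described in this message.
In standard C, defining NDEBUG before including <assert.h> makes every
assert() expand to nothing, so the check documents and verifies an
invariant in debug builds but contributes no instructions to the
deployed binary.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical routine: sum the first n entries of a[].
 * The assert states the caller's obligation.  It is checked in builds
 * without NDEBUG and disappears entirely when NDEBUG is defined. */
long sum_array(const int *a, size_t n)
{
    long total = 0;
    size_t i;

    assert(a != NULL || n == 0);  /* invariant: no NULL with n > 0 */
    for (i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}
```

Building the same file once normally and once with -DNDEBUG (a flag
accepted by the common compilers) gives two binaries that return the
same sums for valid input; only the first one stops on a violated
invariant.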

All of this source-code validation is well and good, and I recommend
it.  But the real focus of testing in SQLite is the machine code.  And
so to a first approximation, Keith is correct: "undefined behavior" at
the source code level does not matter.

You cannot say that a program is correct by only looking at source
code.  You have to validate the machine code.  Compilers make
mistakes.  SQLite has, at various times, hit bugs in each of GCC,
Clang, and MSVC, all of which have now been fixed.  Do not trust your
compiler.

One point of view is that compilers that attempt to take advantage of
obscure "undefined behavior" to boost performance, but instead break
programs, are in fact buggy.  (Compiler writers tend to disagree with
this viewpoint.  I'm not saying it is the correct point of view, just
a point of view that is widely held.)  So watching out for obscure
"undefined behavior" that breaks your program is really the same thing
as watching out for bugs in your compiler.  Both of these happen and,
in my experience, with about the same frequency.
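
A commonly cited example of the kind of "undefined behavior" at issue is
a signed-overflow check.  The sketch below is generic C, not code from
SQLite, and the function names are hypothetical.  The first function
performs the addition and then inspects the result; because signed
overflow is undefined in C, an optimizing compiler is permitted to
assume it never happens and may simplify the check away.  The second
function asks the same question using only comparisons against INT_MAX
and INT_MIN, which are always defined.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Overflow test that relies on undefined behavior: when a + b does not
 * fit in an int, the addition itself is undefined.  Shown for contrast
 * only; never call it with operands that actually overflow. */
bool add_overflows_ub(int a, int b)
{
    int sum = a + b;
    return (b > 0 && sum < a) || (b < 0 && sum > a);
}

/* Same question answered without ever performing an overflowing
 * operation: compare against the limits before adding. */
bool add_overflows(int a, int b)
{
    if (b > 0) return a > INT_MAX - b;
    if (b < 0) return a < INT_MIN - b;
    return false;
}

/* Hypothetical self-check covering both limits and a benign case. */
void check_add_overflows(void)
{
    assert(add_overflows(INT_MAX, 1));
    assert(add_overflows(INT_MIN, -1));
    assert(!add_overflows(INT_MAX, 0));
    assert(!add_overflows(INT_MIN, INT_MAX));
    assert(!add_overflows(1, 2));
}
```

Whether the first version misbehaves depends on the compiler, its
version, and the optimization level, which is why it can pass every
test with one toolchain and fail with another.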

-- 
D. Richard Hipp
drh at sqlite.org
