On 3/18/16, Igor Tandetnik <igor at tandetnik.org> wrote:
> On 3/18/2016 4:40 PM, Keith Medcalf wrote:
>> There is no such thing as "undefined behaviour". The machine code does
>> exactly what it is told to do
>
> But SQLite is not written in machine code. It is (largely) written in C.

SQLite is written in C, but the focus of testing is the resulting machine code. So, there is a point of view that says that it doesn't really matter if the C code is "undefined" or not. What matters is whether or not the resulting machine code computes the correct answer. That's what we test: does the machine code compute the correct answer?

Our testing standard is to cause every machine-code branch instruction to take both decisions (jump and fall through) at least once, in an as-deployed configuration. That means every instruction in the machine code is tested, and testing occurs with no special compile-time options or flags. The code that gets tested is exactly, byte-for-byte, the same machine code that gets deployed in the field. This is what they mean by "fly what you test and test what you fly".

That said, we also do source-code validation on SQLite. We work to make SQLite compile without warnings, we run it through static analyzers, and I address any "undefined behaviors" in the source code that John Regehr or others report. SQLite contains thousands of assert() statements, which are really another form of source-code validation (as the asserts do not appear in deployment builds). Just the other day, I wrote a custom static code analyzer for SQLite that runs on every "make", looking for a particular kind of programming error (see https://www.sqlite.org/src/artifact/4f65e1a6748e42f2). We also compile and test SQLite on as many different compilers as we can, with all kinds of different compiler settings, and verify that they all get the same answer, which is yet another kind of source-code validation.

All of this source-code validation is well and good, and I recommend it. But the real focus of testing in SQLite is the machine code. And so, to a first approximation, Keith is correct: "undefined behavior" at the source code level does not matter. You cannot say that a program is correct by only looking at source code. You have to validate the machine code.

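For readers who want a concrete picture of the kind of "undefined behavior" under discussion, here is a minimal sketch in C. It is an illustration only and is not taken from the SQLite sources; the function name checked_add is hypothetical. It shows a signed-addition overflow test written so that the overflowing addition is never evaluated, together with an assert() that serves as a source-level check and is compiled out when NDEBUG is defined.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper for illustration; not SQLite code.
**
** A tempting way to detect overflow is to add first and test afterwards,
** for example "if( a+b < a )" when b is positive. Signed overflow is
** undefined in C, so a compiler is permitted to assume it never happens
** and to drop such a test from the generated machine code.
**
** The version below compares against INT_MAX and INT_MIN before adding,
** so no overflowing addition is ever evaluated.
**
** Returns 0 and stores the sum in *out on success. Returns 1 and leaves
** *out unchanged if the sum would not fit in an int. */
static int checked_add(int a, int b, int *out){
  assert( out!=NULL );  /* source-level check; vanishes under -DNDEBUG */
  if( b>0 && a>INT_MAX-b ) return 1;  /* sum would exceed INT_MAX */
  if( b<0 && a<INT_MIN-b ) return 1;  /* sum would fall below INT_MIN */
  *out = a + b;
  return 0;
}
```

Whether a given compiler actually removes the add-then-test form depends on the compiler and its optimization settings, which is one reason the comparison above is made before the addition rather than after it.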
Compilers make mistakes. SQLite has, at various times, hit bugs in each of GCC, Clang, and MSVC, all of which have now been fixed. Do not trust your compiler.

One point of view is that compilers that attempt to take advantage of obscure "undefined behavior" to boost performance, but instead break programs, are in fact buggy. (Compiler writers tend to disagree with this viewpoint. I'm not saying it is the correct point of view, just a point of view that is widely held.) So watching out for obscure "undefined behavior" that breaks your program is really the same thing as watching out for bugs in your compiler. Both of these happen and, in my experience, with about the same frequency.

-- 
D. Richard Hipp
drh at sqlite.org