seL4 Verification (Was: L4.sec)

Gernot Heiser Sat, 09 Jun 2007 21:40:27 -0700

On Fri, 08 Jun 2007 10:22:52 -0400, Jonathan S. Shapiro wrote:
shap> On Fri, 2007-06-08 at 09:19 +0530, Prem Mallappa wrote:
> > [1] http://ertos.nicta.com.au/research/sel4/
> 
> Looks like it is written in Haskell, anybody has/interested in 'C/C++'
> implementation?


shap> This is not quite what is happening. The seL4 project has built an
shap> executable *model* of the system in Haskell. This version is a complete
shap> but unoptimized version of seL4. It runs real application binaries. It
shap> is the one that they are verifying certain properties about. Think of
shap> the Haskell version as providing a low-level formal model.

shap> Separately, they are doing an implementation in a constrained subset of
shap> C (I think it is C rather than C++, but I'm not sure). The idea is to be
shap> able to translate this implementation into their theorem prover as well,
shap> and then to show that it corresponds to the Haskell low-level model. So
shap> there *is* an implementation in C.

shap> There are many missing links in their verification story, but it is very
shap> promising, and it is much further along than anyone else has managed to
shap> get.

Shap is essentially right. There are always some "missing links", but
they aren't where people might suspect them.

Let's look at what we have:

1) semi-formal (hacker-readable) spec written in Literal Haskell

2) formal spec (abstract model) in the Isabelle prover

3) executable model in Haskell. As shap correctly observes, this can
   run real binaries built with standard tool chains. To this end it
   is coupled with a simulator for the unprivileged part of the ARM
   ISA. All application instructions are executed by the simulator;
   when the app does a kernel trap, this is interpreted by the Haskell
   kernel. The executable model allows us to exercise the kernel, get
   experience with the API and port user-level components, all before
   a proper kernel implementation exists.

4) Isabelle representation of (3)

5) C/assembler implementation of the kernel. This is, at present, very
   much a prototype and still incomplete, but is progressing fine

6) Isabelle representation of (5)

(2) is more high-level than (1), which is why it is called the
"abstract model". It has been created by manual abstraction of (1).

(2) It is the "primary" model as far as any proofs are concerned,
whether it's the implementation proofs or proofs of security
properties (separation, confinement etc).

(3) is essentially the completed version of (1). In a way, (1) doesn't
really exist any more as a separate document. There is a single .lhs
document that produces (via latex) the human-readable manual and (via
ghc) the executable.

(4) is also produced from (3), automatically, using the (well-defined)
semantics of Haskell.

(5) is, of course, written manually, based on the Pistachio-embedded
code base.

(6) is the Isabelle equivalent of (5)

Relevant for the verification story are

a) formal proofs about security properties of (2). A number of such
   proofs (including separation properties) have been completed (in
   Isabelle), others are in progress.

b) the correspondence between (2) and (5)

(b) is done in several steps.

One such step is (5)->(6). This is done automatically, using a formal
semantics (in Isabelle) of our subset of C (plus some formal model of
the hardware for assembler code). That formal semantics exists, and
(6) is almost complete.

Another step is (2)->(4). This is a refinement proof in Isabelle,
which is now (essentially) complete. Since (4) has been produced from
(3), we could relatively easily prove that everything proved about (2)
also holds for (3) (and (1), the spec part of (3)). Whether or not
this is a worthwhile thing to do depends on the on-going significance
of (3). We haven't decided yet whether we'll do that part.

The final step that closes the gap is (4)->(6). This is the second
refinement step in Isabelle, which is work in progress, scheduled to
be complete in April next year.

Missing links are things like the verification of the prover and the C
compiler (which in the present model must be trusted). It can be
eliminated by
 - using a verified compiler (such work is progress at other places) or
 - adding a further refinement step from C to machine code. This is an
   option for the future, but we suspect that the verified compiler is
   the better approach.

And, of course, there is always the need for a hardware model. That
could be derived from the manufacturer's VHDL description, which makes
it dependent on the VHDL compiler and other tools etc. We derive it
automatically from the manufacturer's published RTL-description of the
ISA. There is, of course, no guarantee that the actual hardware
operates as described by the manufacturer, nor have we proven our RTL
translator (which would be possible, of course).

You can never eliminate all "missing links".

shap> [...] but it is very promising, and it is much further along
shap> than anyone else has managed to get.

I couldn't argue with that.

Gernot


_______________________________________________
L4-hurd mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/l4-hurd

seL4 Verification (Was: L4.sec)

Reply via email to