Re: Integrating "safe" languages into OpenBSD?

Max Hayden Chiz Mon, 15 Jan 2018 13:50:51 -0800

On Sun, Dec 3, 2017 at 7:59 PM, Nicolas Schmidt <schmi...@math.hu-berlin.de>
wrote:
> Hi,
>
> I recently watched a recording of Theo's talk on pledge at
> EuroBSDCon 2017, in which the question of memory-safe
> languages and their practical usefulness came up. Specifically,
> someone in the audience criticized the approach taken by
> OpenBSD, which (as I understand) accepts that all software
> is broken and mitigates the damage caused
> by various classes of exploits through techniques like ASLR, and
> suggested that instead one should stick to "memory safe languages"
> to avoid these exploits altogether.
>
> As a response to this, Theo asked rhetorically "Where's ls, where's
> cat, where's grep, and where's sort?", implying that noone so far
> bothered to write implementations of even the basic unix utilities
> in such a language.
>
> This brings me to the question, what if someone actually bothered?
> Under what conditions would you consider replacing one of the
> current C implementations with an implementation written in
> another, "safer" language? Note that with Cgrep and haskell-ls,
> there do in fact exist implementations/analogues of two of the
> mentioned utilities in a memory safe language (Haskell).


Sorry for the thread res, but I wanted to add something to this discussion
and didn't have a chance until now.

There's a big misconception here about the point of "safe" languages. Safe
languages are *not* a security feature.

Let's take Rust as an example. Neither of the "two remote holes" would have
been caught by Rust's features. Rust doesn't protect against integer
overflow errors. *At best*, if SSH had been written in Rust, it would have
turned the remote hole into a failed bounds check and a panic. So instead
of a remote hole, we'd have had a denial of service attack. More
realistically, if the code in question had been part of an "unsafe" block,
the same security breach would have occurred.

Similarly, Rust doesn't prevent SSL from being written with spaghetti code
and obfuscating the fact that it was taking a user provided number to use
as a bounds check on a crappy custom memory structure. The same limitations
holds for most of the security bugs recorded in the CERT C standard. The
ones that can be automatically checked can be detected by C programming
tools. The ones that can't be automatically checked don't magically go away
when you use Rust.

More fundamentally, even if you have an application written in a safe
language, you have no way of examining the binary and knowing that those
safety guarantees still hold. In an embedded system, you can do verified
programming, use a verified compiler, and then make sure that the binary
can't be modified afterward by using read-only memory. (And you could still
have a problem due to a hardware bug.) None of that stuff is available or
practical for OpenBSD. Even worse, if your "safe" system has an elaborate
run-time, like Java or JavaScript, that run-time itself becomes an attack
surface. (Just look at how many security issues these things have had over
the years.)

Moreover, *good* C code is about as safe as it can be. CCured was an
application that (in part) could automatically prove that large aspects of
C programs were safe. In practice about 90% of the code was safe as-is.
Another 9% was safe if it had a bounds check (the system couldn't verify
this). And only 1% needed special handling. If you translate the code into
Rust, you are just going to end up proving that the safe 90% is safe and
the remaining 10% will live in an "unsafe" block because of what it does.

Contra the questioner's assumption, the real point of safe languages is
that they enhance programmer productivity by handling certain repetitive
issues automatically and by allowing for easier use of higher-level
language constructs. Done well, a safe language makes your code more
compact, faster to write, and easier to reason about.

But stated that way, it's obvious that OpenBSD won't benefit from them. For
one, an operating system is inherently low-level and doesn't have much room
for higher-level constructs. For another, productivity enhancements only
count when you are writing code from scratch. Re-writing the 3M+ lines of
OpenBSD's kernel code would be an obvious waste. (Not to mention the rest
of the system.)

Don't get me wrong, I think Rust is worth using in a new project for a
sufficiently complex application. But "use the right tool for the job"
applies here. Most of what an OS does (be it the kernel, ld.so, or the
various state machines inside of priv-sep'ed services) is "low-level" and
benefits from neither the additional abstraction nor from the safety
guarantees that you'd just have to disable or work around.

In contrast to mere memory-safety, how to do *low-level* programming in a
verifiable, bug-free way is still an active area of language research.
There are some promising developments out there, and rather than rewriting
Unix tools in Rust or making yet another microkernel, people seriously
interested in improving the ecosystem should be working to make these
developments usable and accessible.

One approach would be to annotate the C code with something like ACSL and
then use an automatic proof system to show that the code satisfies the
specification. Unfortunately, we are a long way off from that. ACSL is
pretty limited in terms of what it can do and tracking resources is
currently not on the menu. Furthermore, the code-bloat from those
annotations is close to 100%, so it's probably unusable for something as
large as OpenBSD unless someone comes up with a way to auto-annotate
existing code or to otherwise reduce the amount of programmer effort.

An approach that's further along is to use a mix of dependent and linear
types. This way the type checker complains when resources aren't properly
cleaned up, when APIs are misused, when you are missing a buffer overflows
check, if your all of your locks aren't obtained and released in the proper
order, if they don't protect everything that needs protecting, and so
forth. Hypothetically, you can even use the type system to verify
high-order concepts like, "obtaining root in an unauthorized manner is
impossible". There's a front-end for C called ATS that does all of this
now. (See, e.g., https://www.youtube.com/watch?v=zt0OQb1DBko ). But as that
video makes clear, the system is far from perfect. Needing to rewrite all
of your C code aside, the code bloat from the type system is pretty
intolerable. And syntactically separating the proof-code from the
executable code makes the language hideous. (Idris hopes to be similar
someday, but isn't as far along. They have cleaner syntax, but it requires
a global code analysis as part of compilation. So, that probably isn't a
solution that OpenBSD could live with. Whether you can have a clean syntax
that also separates proof-code from executable code is an open question.)
As should be apparent from the video, this is still just one guy's research
project. It's far from production ready.

What seems to be needed is a set of annotations (like ACSL), where the
basic starting point can be automatically derived (like CCured), and that
uses dependent and linear types (like ATS) to handle low-level coding and
resource management. Failing that, if you go the ATS route and make a new
language, the core of the language needs to be very close to C
syntactically so that you can automatically migrate the C code over before
you start gradually introducing the advanced type checking features. But we
seem to be a long-way off from any of that.

That said, it might be an interesting research project to translate a few
key parts of OpenBSD into ATS to see where the programmer pain points occur
and to try and resolve them. Ideally, this would also give you insight into
how to automate the process.

Tools like this are probably the future, but someone has to make them
usable and productive first. If and when good tools become available,
OpenBSD will no doubt use them. But the OpenBSD devs can't wish such tools
into existence. And it is a mistake to think that Rust et al. are remotely
close to what good tools would require.

And, again, even in some parallel universe where OpenBSD was written in
ATS, you'd still need all of the mitigation techniques because language
safety features protect you against bugs in your own code; they don't
protect you from bugs in code you don't control.

Hope this has clarified things.

--Max Chiz
New Orleans, LA

Re: Integrating "safe" languages into OpenBSD?

Reply via email to