On Sun, Dec 3, 2017 at 7:59 PM, Nicolas Schmidt <schmi...@math.hu-berlin.de> wrote: > Hi, > > I recently watched a recording of Theo's talk on pledge at > EuroBSDCon 2017, in which the question of memory-safe > languages and their practical usefulness came up. Specifically, > someone in the audience criticized the approach taken by > OpenBSD, which (as I understand) accepts that all software > is broken and mitigates the damage caused > by various classes of exploits through techniques like ASLR, and > suggested that instead one should stick to "memory safe languages" > to avoid these exploits altogether. > > As a response to this, Theo asked rhetorically "Where's ls, where's > cat, where's grep, and where's sort?", implying that noone so far > bothered to write implementations of even the basic unix utilities > in such a language. > > This brings me to the question, what if someone actually bothered? > Under what conditions would you consider replacing one of the > current C implementations with an implementation written in > another, "safer" language? Note that with Cgrep and haskell-ls, > there do in fact exist implementations/analogues of two of the > mentioned utilities in a memory safe language (Haskell).
Sorry for the thread res, but I wanted to add something to this discussion and didn't have a chance until now. There's a big misconception here about the point of "safe" languages. Safe languages are *not* a security feature. Let's take Rust as an example. Neither of the "two remote holes" would have been caught by Rust's features. Rust doesn't protect against integer overflow errors. *At best*, if SSH had been written in Rust, it would have turned the remote hole into a failed bounds check and a panic. So instead of a remote hole, we'd have had a denial of service attack. More realistically, if the code in question had been part of an "unsafe" block, the same security breach would have occurred. Similarly, Rust doesn't prevent SSL from being written with spaghetti code and obfuscating the fact that it was taking a user provided number to use as a bounds check on a crappy custom memory structure. The same limitations holds for most of the security bugs recorded in the CERT C standard. The ones that can be automatically checked can be detected by C programming tools. The ones that can't be automatically checked don't magically go away when you use Rust. More fundamentally, even if you have an application written in a safe language, you have no way of examining the binary and knowing that those safety guarantees still hold. In an embedded system, you can do verified programming, use a verified compiler, and then make sure that the binary can't be modified afterward by using read-only memory. (And you could still have a problem due to a hardware bug.) None of that stuff is available or practical for OpenBSD. Even worse, if your "safe" system has an elaborate run-time, like Java or JavaScript, that run-time itself becomes an attack surface. (Just look at how many security issues these things have had over the years.) Moreover, *good* C code is about as safe as it can be. CCured was an application that (in part) could automatically prove that large aspects of C programs were safe. In practice about 90% of the code was safe as-is. Another 9% was safe if it had a bounds check (the system couldn't verify this). And only 1% needed special handling. If you translate the code into Rust, you are just going to end up proving that the safe 90% is safe and the remaining 10% will live in an "unsafe" block because of what it does. Contra the questioner's assumption, the real point of safe languages is that they enhance programmer productivity by handling certain repetitive issues automatically and by allowing for easier use of higher-level language constructs. Done well, a safe language makes your code more compact, faster to write, and easier to reason about. But stated that way, it's obvious that OpenBSD won't benefit from them. For one, an operating system is inherently low-level and doesn't have much room for higher-level constructs. For another, productivity enhancements only count when you are writing code from scratch. Re-writing the 3M+ lines of OpenBSD's kernel code would be an obvious waste. (Not to mention the rest of the system.) Don't get me wrong, I think Rust is worth using in a new project for a sufficiently complex application. But "use the right tool for the job" applies here. Most of what an OS does (be it the kernel, ld.so, or the various state machines inside of priv-sep'ed services) is "low-level" and benefits from neither the additional abstraction nor from the safety guarantees that you'd just have to disable or work around. In contrast to mere memory-safety, how to do *low-level* programming in a verifiable, bug-free way is still an active area of language research. There are some promising developments out there, and rather than rewriting Unix tools in Rust or making yet another microkernel, people seriously interested in improving the ecosystem should be working to make these developments usable and accessible. One approach would be to annotate the C code with something like ACSL and then use an automatic proof system to show that the code satisfies the specification. Unfortunately, we are a long way off from that. ACSL is pretty limited in terms of what it can do and tracking resources is currently not on the menu. Furthermore, the code-bloat from those annotations is close to 100%, so it's probably unusable for something as large as OpenBSD unless someone comes up with a way to auto-annotate existing code or to otherwise reduce the amount of programmer effort. An approach that's further along is to use a mix of dependent and linear types. This way the type checker complains when resources aren't properly cleaned up, when APIs are misused, when you are missing a buffer overflows check, if your all of your locks aren't obtained and released in the proper order, if they don't protect everything that needs protecting, and so forth. Hypothetically, you can even use the type system to verify high-order concepts like, "obtaining root in an unauthorized manner is impossible". There's a front-end for C called ATS that does all of this now. (See, e.g., https://www.youtube.com/watch?v=zt0OQb1DBko ). But as that video makes clear, the system is far from perfect. Needing to rewrite all of your C code aside, the code bloat from the type system is pretty intolerable. And syntactically separating the proof-code from the executable code makes the language hideous. (Idris hopes to be similar someday, but isn't as far along. They have cleaner syntax, but it requires a global code analysis as part of compilation. So, that probably isn't a solution that OpenBSD could live with. Whether you can have a clean syntax that also separates proof-code from executable code is an open question.) As should be apparent from the video, this is still just one guy's research project. It's far from production ready. What seems to be needed is a set of annotations (like ACSL), where the basic starting point can be automatically derived (like CCured), and that uses dependent and linear types (like ATS) to handle low-level coding and resource management. Failing that, if you go the ATS route and make a new language, the core of the language needs to be very close to C syntactically so that you can automatically migrate the C code over before you start gradually introducing the advanced type checking features. But we seem to be a long-way off from any of that. That said, it might be an interesting research project to translate a few key parts of OpenBSD into ATS to see where the programmer pain points occur and to try and resolve them. Ideally, this would also give you insight into how to automate the process. Tools like this are probably the future, but someone has to make them usable and productive first. If and when good tools become available, OpenBSD will no doubt use them. But the OpenBSD devs can't wish such tools into existence. And it is a mistake to think that Rust et al. are remotely close to what good tools would require. And, again, even in some parallel universe where OpenBSD was written in ATS, you'd still need all of the mitigation techniques because language safety features protect you against bugs in your own code; they don't protect you from bugs in code you don't control. Hope this has clarified things. --Max Chiz New Orleans, LA