[forwarded with Robbert’s permission]
> Begin forwarded message:
>
> Subject: Re: [Avail] Avail's soundness
> From: Mark van Gulik <[email protected]>
> Date: March 11, 2015 at 6:57:50 PM CDT
> Cc: Mark van Gulik <[email protected]>
> To: Robbert van Dalen <[email protected]>
>
> Robbert,
>
> Your statement about functional purists reminds me of something I had intended to add…
>
> The directness of Avail’s expression, its lack of syntactic cruft, is not intended to make programs short; rather, it’s to make programs expressible in a form that resembles a specification language more closely than languages normally achieve. That similarity to a specification language, even if not in a declarative form, should reduce the potential for semantic disconnect between specification and implementation.
>
> While functional purists may say they have a declarative language, I think it’s about as declarative in practice as Lisp is expressive. Just as Lisp has almost no syntax, allowing Lisp programs to manipulate other programs with ease (as long as they’re just Lisp programs), functional languages seem to have the same variety of design myopia. They’re great languages for specifying functional programs, or for describing ways in which calculation specifications can be described as functional programs. Unfortunately, that’s a completely made-up "problem domain", and functional languages seem absolutely terrible for expressing actual specifications. Almost every Haskell program I’ve ever seen has been riddled with loops manually mangled into recursion... but worst of all, they’re full of nasty tricks to tell Haskell what order to evaluate things (see discussions of foldl versus foldl’)! That’s pretty clear evidence that they’re sawing off the one leg they’re standing on: abstracting away the execution order.
>
> Functional programs never look like the solution to a real problem; they all look like a solution to the secondary problem of trying to rewrite a specification into something as awkward as a functional language, and then tweaking it until it actually runs without consuming an infinite amount of memory. The problem is that they forgot there are two jumps in “reliably” going from a software problem statement to its solution: the jump from a clearly written problem statement to a formal spec, and the jump from the formal spec to a program. Let’s grant that Haskell easily nails the second jump; but the first jump is waaaay farther than both jumps combined would be in an imperative language. In other words, what they think of as a spec is already a complete program crammed into an awkward, weak language. Unless Haskell has actual power as a specification language, it shouldn’t be used as one, and unless it has actual power as a programming language, likewise. Unfortunately it seems like a total failure at both. It’s like a taxi service that operates over a one-block radius and boasts about its short travel times.
>
> Avail’s take is to provide a way to express a problem statement as simply and directly as possible, preferably in a notation suited to describing the problem. The further such a language is from the problem domain, the more likely a change (or reinterpretation) of the requirements is to cause sweeping changes to the formal specification (and then to the program). So like Haskell, Avail tries to shorten the two jumps by shrinking one of them; in this case the opposite one, the mapping from the problem statement to the spec. But Avail is built on a sufficiently solid foundation that the second jump is also small; it’s a "simple matter" of adding support for the problem statement’s language.
> This is kind of the opposite of Haskell’s approach, which is to remove as much functionality as possible from the language for “purity”. Avail has its own purity concerns, but they exist for practical reasons, not as an academic exercise. For example, removing its core syntax allows the first jump to be unencumbered by any Avail-specific cruft (i.e., the problem-domain-specific language isn’t “dirtied” by Avail’s core syntax). And mapping everything into polymorphic methods composed of L1 nybblecodes (and built from phrases) allows tools to (some day) work more easily with Avail’s… peculiarities… even when the specific language looks nothing like core Avail.
>
> So technically in Avail there’s a third jump, or maybe more accurately there’s the step of building the problem-domain-specific vehicle with which to do the other jumping: the language. Fortunately, Avail is unmatched at building these linguistic vehicles, especially since the birth of macros.
>
> Ok, I’m done preaching to the choir. And thanks for using the word “practical” to describe the VM / library restriction dichotomy. Some people use “practical” or “pragmatic” to mean cheap and easy for the VM writer to produce, but that’s clearly not what either of us means here. More like "capable of surviving practice”.
>
> -Mark
>
>> On Mar 11, 2015, at 5:50 PM, Robbert van Dalen <[email protected]> wrote:
>>
>> Mark,
>>
>> Thank you, this is really an *exceptional* educational reply! I now believe I understand the interplay between sound first-order types (ensured by the VM) and potentially unsound second-order semantic restrictions (“ensured” by Avail programmers).
>>
>> Also, your statements:
>>
>> "There are languages that claim to do better, say rejecting programs that have no proof of correctness, but the programs that can be written in those languages are still quite restricted… and the correctness is only relative to a spec, which is just another program in a more declarative form. In my experience the spec is wrong often enough that a deep, formal proof of an implementation is almost pointless.”
>>
>> ...really stood out.
>>
>> I don’t believe such statements will persuade the functional purists, but I do think you made a legitimate and very *practical* point there.
>>
>> Ah, you can’t please everybody I guess :)
>>
>> cheers,
>> Robbert.
>>
>>> On 11 Mar 2015, at 23:10, Mark van Gulik <[email protected]> wrote:
>>>
>>> Robbert,
>>>
>>> I believe Avail breaks new ground in its balance between soundness and expressiveness. I call its type system “first-order safe”, because if the semantic restrictions are all correct, then no expression can fail to produce the right type at run time. The language’s implicit rules for propagating values ensure this. However, since it’s possible to write a semantic restriction that is incorrect, we don’t have "second-order type safety”: we can’t check the semantic restrictions for correctness, any more than we could check an arbitrarily complex program in some other language for correctness (other than type-correctness). So Avail balances its first-order soundness with second-order unsound semantic restrictions, tempered by run-time checks.
>>>
>>> There are languages that claim to do better, say rejecting programs that have no proof of correctness, but the programs that can be written in those languages are still quite restricted… and the correctness is only relative to a spec, which is just another program in a more declarative form.
>>> In my experience the spec is wrong often enough that a deep, formal proof of an implementation is almost pointless.
>>>
>>> Now, to be more specific, consider how we ensure first-order safety. The primitives declare in the VM what types they require and produce. If those declarations within the VM are correct (a requirement of any language), then what they produce can be relied on. In fact, L2 doesn’t check the results of primitives. The primitive declarations can even be more precise than a flat type signature: they say that, given a particular collection of argument types, the result is constrained to a particular type that’s a function of the input types (provided by a method in that subclass of Primitive). They also declare under what circumstances the primitives are known to succeed, known to fail, or are unknown. L2 makes use of this information as well, at each call site, for performance.
>>>
>>> But we wanted the library’s function/method types to be independent of improvements to this detailed but incomplete information provided by the VM. Otherwise, when we improved a bound on some primitive, we would produce a different type signature in the library. Typically it would just be stronger, but in theory we might have to weaken some result types to fix bugs. So we keep this detailed information out of the Avail programmer’s visibility. The “_+_” operation takes two numbers and produces a number. It’s up to a semantic restriction in the library to assert that a call site adding (say) two natural numbers can only produce a natural number, and can’t fail. In general the addition can fail, say for ∞ + (-∞), which is undefined, but the natural numbers don’t contain ±∞, so such an addition won’t fail.
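The natural-number narrowing just described can be sketched as a toy model in Haskell (the names `Range`, `natural`, and `plusRestriction` are invented for illustration and are not Avail code): a restriction for an addition can compute the call site's result range from the argument ranges, so natural + natural comes out natural again.

```haskell
-- Toy model of what a semantic restriction for `_+_` computes: a type
-- is an integer range; Nothing marks an unbounded end of the range.
-- (Invented names; not Avail's actual representation.)
data Range = Range (Maybe Integer) (Maybe Integer)
  deriving (Eq, Show)

-- The natural numbers: [0 .. infinity).
natural :: Range
natural = Range (Just 0) Nothing

-- Narrow an addition's result: each bound of the sum is the sum of the
-- corresponding argument bounds, and an unbounded end stays unbounded.
plusRestriction :: Range -> Range -> Range
plusRestriction (Range lo1 hi1) (Range lo2 hi2) =
  Range ((+) <$> lo1 <*> lo2) ((+) <$> hi1 <*> hi2)
```

Under this model, `plusRestriction natural natural` evaluates to `Range (Just 0) Nothing`, i.e. `natural` again, so a call site adding two naturals can be given the static type natural and its run-time result check skipped, which mirrors the claim in the text.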
>>> We restrict the call site by the intersection of the semantic restrictions (and the primitive’s basic result type, number), but we don’t actually check the result’s run-time type if the primitive has guaranteed at least as strong a result for those inputs.
>>>
>>> The same way we block propagation of nulls out of variables, the VM also prevents propagation of incorrectly typed information past the returns from method invocations. We also check the input types for general operations like function application, but… no surprise… L2 is able to skip the run-time checks if it can statically prove that there will be the right number of arguments, of the right types, for the provided function.
>>>
>>> So to get better performance we can simply assert stronger things in the VM, without affecting any of the types in the library. A tighter return type can even reduce the potential polymorphic calls that subsequently operate on the value. Assuming we get those assertions correct, the VM runs faster while still being type-correct. But when we express stronger semantic restrictions, we (the VM writers) no longer have the responsibility of coding them correctly, as we require in the VM, because anybody may write a semantic restriction. The code’s static type safety is as correct as those restrictions, no more and no less (assuming a correct VM), but there is always run-time type safety (again assuming a correct VM), because anything the VM can’t prove is simply checked at run time, invoking the run-time type failure handler if something goes wrong. That’s not something you want to happen very often, of course, but it helps you debug your library by refusing to propagate anything that’s the wrong type.
>>>
>>> The other reason to separate VM primitive assertions and semantic restrictions is expressivity.
>>> In theory, what you can write in Java (for primitive strengthening) and what you can write in Avail (semantic restrictions) cover the same territory, but obviously there are huge differences in expressiveness. In fact, when you write such code in Java, the Java type system is so incredibly weak that almost any gibberish will compile. That’s partly due to how we use Java’s types in the VM, but really Java doesn’t provide anything close to strong enough types to benefit us in the VM. It was originally written in Smalltalk, and we get very little mileage out of Java’s types. The A_* interfaces are only there to reduce the chance you’ll shoot yourself in the foot, and to make auto-complete suggestions more narrowly focused. If I have an A_Map and I extract a key from it, I have to broaden it from AvailObject to A_Tuple or whatever I already “know” is supposed to be the kind of key in that map. I’ve played with the idea of making A_Map generic in two parameters, but since Java generics are erased, and most values pass through Avail at some point anyhow, the Java types would just be made up. Java’s type system provides about as much help with correctness in the Avail VM as assembly language’s type system (i.e., none).
>>>
>>> Note that since semantic restrictions are written in Avail, their bodies have first-order type safety. So you get the benefit of the type rules, and of any existing semantic restrictions, even when writing new semantic restrictions. This assistance is compounded by the principle of meta-covariance (given two types t1 and t2, (t1 ⊆ t2) = (t1’s type ⊆ t2’s type)), which Avail automatically guarantees for all of its types.
>>>
>>> So maybe the right way to describe all these things is that Avail separates the concepts of:
>>> • first-order type safety (guaranteed by the VM),
>>> • second-order type safety (provided by libraries, but checked at run time), and
>>> • performance (by using type deduction within the VM).
>>>
>>> It makes sense that several years down the road we could start applying guided theorem proving to statically check the correctness of semantic restrictions. I suspect we could eventually prove the majority of them, but simply validating a user-provided proof might be easier (the name Cayenne comes to mind, but I think that project is defunct now). Some semantic restrictions, however, one may want simply to accept as axioms. I think all of this depends on how the market chooses to accept Avail. After all, people are still using PHP and JavaScript (for anything, ever).
>>>
>>> One more thing before I go: say we invoke a method and get back a result that has a semantic restriction that narrows it such that its run-time type has to be checked. Yes, we take a hit checking that value’s type, but after the check passes we know that the value really has that narrowed type. We can pass it around with confidence, and the VM will optimize using the assumption that this already-checked value has the type we expect. When passed to a primitive, this narrowed type is what gets used to deduce what type of value the primitive will produce (and whether it can fail), so the actual number of run-time checks is probably lower than you’d expect.
>>>
>>> -Mark
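The "one more thing" near the end, paying for a single run-time check and then trusting the narrowed type everywhere downstream, resembles the familiar smart-constructor pattern. A hedged Haskell sketch (invented names; not Avail's actual mechanism):

```haskell
-- Sketch of "pay for one run-time check, then trust the narrowed type"
-- via a smart constructor.  (Invented names; not Avail's mechanism.)
newtype CheckedNatural = CheckedNatural Integer
  deriving (Eq, Show)

-- The single run-time check; failing it here plays the role of invoking
-- the run-time type failure handler.
checkNatural :: Integer -> Maybe CheckedNatural
checkNatural n
  | n >= 0    = Just (CheckedNatural n)
  | otherwise = Nothing

-- Downstream code consumes the already-checked value without re-testing:
-- the narrowed type travels with the value.
double :: CheckedNatural -> CheckedNatural
double (CheckedNatural n) = CheckedNatural (2 * n)
```

Once `checkNatural` has succeeded, every consumer like `double` can rely on the invariant without repeating the test, which is why the actual number of run-time checks stays low.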