[forwarded with Robbert’s permission]

> Begin forwarded message:
> 
> Subject: Re: [Avail] Avail's soundness
> From: Mark van Gulik <[email protected]>
> Date: March 11, 2015 at 6:57:50 PM CDT
> Cc: Mark van Gulik <[email protected]>
> To: Robbert van Dalen <[email protected]>
> 
> Robbert,
> 
> Your statement about functional purists reminds me of something I had 
> intended to add…
> 
> The directness of Avail’s expression, its lack of syntactic cruft, is not 
> intended to make programs short; rather, it’s to make programs expressible 
> in a form that resembles a specification language more closely than a 
> general-purpose language normally allows.  That similarity to a 
> specification language, or at least a strong resemblance even when the form 
> is not declarative, should reduce the potential for semantic disconnect 
> between specification and implementation.
> 
> While functional purists may say they have a declarative language, I think 
> it’s about as declarative in practice as Lisp is expressive.  Lisp has 
> almost no syntax, which lets Lisp programs manipulate other programs with 
> ease, as long as those programs are themselves Lisp; functional languages 
> exhibit the same kind of design myopia.  They’re great languages for 
> specifying functional programs, or for describing ways in which calculation 
> specifications can be described as functional programs.  Unfortunately, 
> that’s a completely made-up “problem domain”, and functional languages seem 
> absolutely terrible for expressing actual specifications.  Almost every 
> Haskell program I’ve ever seen has been riddled with loops manually mangled 
> into recursion... but worst of all, they’re full of nasty tricks to tell 
> Haskell what order to evaluate things (see discussions of foldl versus 
> foldl’)!  That’s pretty clear evidence that they’re sawing off the one leg 
> they’re standing on: abstracting away the execution order.
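
The foldl versus foldl’ point can be made concrete outside Haskell.  Below is a small Python model (illustrative only; foldl_lazy and foldl_strict are invented names): the lazy version builds a chain of unevaluated thunks the way Haskell’s foldl builds unevaluated sums, while the strict version forces the accumulator at every step the way foldl’ does.

```python
def foldl_lazy(f, acc, xs):
    """Model of Haskell's foldl: build a chain of unevaluated thunks."""
    thunk = lambda: acc
    for x in xs:
        # each step wraps the previous thunk; nothing is computed yet
        thunk = (lambda prev, y: lambda: f(prev(), y))(thunk, x)
    return thunk  # forcing this later evaluates the whole nested chain

def foldl_strict(f, acc, xs):
    """Model of Haskell's foldl': force the accumulator at every step."""
    for x in xs:
        acc = f(acc, x)
    return acc

# Both compute the same sum, but the lazy one defers all the work (one
# closure per element) until the final thunk is forced.
lazy_total = foldl_lazy(lambda a, b: a + b, 0, range(100))()
strict_total = foldl_strict(lambda a, b: a + b, 0, range(100))
```

Forcing the lazy chain recurses once per element, which is exactly the memory behavior that pushes Haskell programmers toward foldl’.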
> 
> Functional programs never look like the solution to a real problem; they all 
> look like a solution to the secondary problem of trying to rewrite a 
> specification into something as awkward as a functional language, and then 
> tweaking it until it actually runs without consuming an infinite amount of 
> memory.  The problem is that they forgot there are two jumps to “reliably” 
> going from a software problem statement to its solution: the jump from a 
> clearly written problem statement to a formal spec, and the jump from the 
> formal spec to a program.  Let’s grant that Haskell easily nails the second 
> jump; but the first jump is waaaay farther than both jumps combined would be 
> in an imperative language.  In other words, what they think of as a spec is 
> already a complete program crammed into an awkward, weak language.  
> Unless Haskell has actual power as a specification language, it shouldn’t be 
> used as such, and unless it has actual power as a programming language, 
> likewise.  Unfortunately it seems like a total failure at both.  It’s like a 
> taxi service that operates over a one-block radius and boasts about its 
> short travel times.
> 
> Avail’s take is to provide a way to express a problem statement as simply and 
> directly as possible, preferably in a notation suitable for describing the 
> problem.  The further such a language is from the problem domain, the more 
> likely a change (or reinterpretation) of the requirements is to cause 
> sweeping changes to the formal specification (and then the program).  So like 
> Haskell, Avail tries to shorten the two jumps by shrinking one of them; in 
> this case the opposite one, the mapping from the problem statement to the 
> spec.  But Avail is built on a sufficiently solid foundation that the second 
> jump is also small; it’s a “simple matter” of adding support for the problem 
> statement’s language.  That’s roughly the opposite of Haskell’s approach, 
> which is to remove as much functionality as possible from the language for 
> “purity”.  Avail has its own purity concerns, but they’re there for practical 
> reasons, not as an academic exercise.  For example, removing its core syntax 
> allows the first jump to be unencumbered by any Avail-specific cruft (i.e., 
> the problem-domain-specific language isn’t “dirtied” by Avail’s core syntax). 
>  And mapping everything into polymorphic methods composed of L1 nybblecodes 
> (and built from phrases) allows tools to (some day) work more easily with 
> Avail’s… peculiarities… even when the specific language looks nothing like 
> core Avail.
> 
> So technically in Avail there’s a third jump, or maybe more accurately 
> there’s the step of building the problem-domain-specific vehicle with which 
> to do the other jumping: the language.  Fortunately, Avail is by far 
> unmatched at building these linguistic vehicles, especially since the birth 
> of macros.
> 
> Ok, I’m done preaching to the choir.  And thanks for using the word 
> “practical” to describe the VM / library restriction dichotomy.  Some people 
> use “practical” or “pragmatic” to mean cheap and easy for the VM writer to 
> produce, but that’s clearly not what either of us means here.  More like 
> “capable of surviving practice”.
> 
> -Mark
> 
> 
> 
>> On Mar 11, 2015, at 5:50 PM, Robbert van Dalen <[email protected] 
>> <mailto:[email protected]>> wrote:
>> 
>> Mark,
>> 
>> Thank you, this is really an *exceptional* educational reply!
>> I now believe I understand the interplay between sound first-order types 
>> (ensured by the VM) and - potentially unsound - second-order semantic 
>> restrictions (‘ensured’ by Avail programmers).
>> 
>> Also, your statements:
>> 
>> "There are languages that claim to do better, say rejecting programs that 
>> have no proof of correctness, but the programs that can be written in those 
>> languages are still quite restricted… and the correctness is only relative 
>> to a spec, which is just another program in a more declarative form.  In my 
>> experience the spec is wrong often enough that a deep, formal proof of an 
>> implementation is almost pointless.”
>> 
>> ..really stood out.
>> 
>> I don’t believe such statements will persuade the functional purists, but I 
>> do think you made a legitimate and very *practical* point there.
>> 
>> Ah, you can’t please everybody I guess :)
>> 
>> cheers,
>> Robbert.
>> 
>> 
>>> On 11 Mar 2015, at 23:10, Mark van Gulik <[email protected] 
>>> <mailto:[email protected]>> wrote:
>>> 
>>> Robbert,
>>> 
>>> I believe Avail breaks new ground in its balance between soundness and 
>>> expressiveness.  I call its type system “first-order safe”, because if 
>>> semantic restrictions are all correct, then no expression can fail to 
>>> produce the right type at run time.  The language’s implicit rules for 
>>> propagating values ensure this.  However, since it’s possible to write a 
>>> semantic restriction that is incorrect, we don’t have “second-order type 
>>> safety” – we can’t check the semantic restrictions for correctness, any 
>>> more than we could check an arbitrarily complex program in some other 
>>> language for correctness (other than type-correctness).  So Avail balances 
>>> its first-order soundness with second-order unsound semantic restrictions, 
>>> tempered by run time checks.
>>> 
>>> There are languages that claim to do better, say rejecting programs that 
>>> have no proof of correctness, but the programs that can be written in those 
>>> languages are still quite restricted… and the correctness is only relative 
>>> to a spec, which is just another program in a more declarative form.  In my 
>>> experience the spec is wrong often enough that a deep, formal proof of an 
>>> implementation is almost pointless.
>>> 
>>> Now to be more specific, consider how we ensure first-order safety.  The 
>>> primitives declare in the VM what types they require and produce.  If those 
>>> declarations within the VM are correct (a requirement of any language), 
>>> then what they produce can be relied on.  In fact, L2 doesn’t check the 
>>> results of primitives.  The primitive declarations are even able to be more 
>>> precise than just specifying a flat type signature.  They say that given a 
>>> particular collection of argument types, the result is constrained to a 
>>> particular type that’s a function of the input types (provided by a method 
>>> in that subclass of Primitive).  They also declare under what circumstances 
>>> the primitives are known to succeed, known to fail, or are unknown.  L2 
>>> makes use of this information as well at each call site for performance.
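
As a sketch of what such a declaration provides, here is a hypothetical Python model (the real VM is Java, and the names IntRange, DividePrimitive, and Outcome are invented for illustration): a division primitive classifies each call site as known to succeed, known to fail, or unknown, purely from the argument types.

```python
from dataclasses import dataclass
from enum import Enum

class Outcome(Enum):
    SUCCEEDS = 1   # the primitive cannot fail for these argument types
    FAILS = 2      # the primitive must fail for these argument types
    UNKNOWN = 3    # failure is possible but not certain

@dataclass(frozen=True)
class IntRange:
    """A toy stand-in for Avail's integer range types."""
    low: int
    high: int

class DividePrimitive:
    """Flat signature: (integer, integer) -> integer, refined per call site."""
    def outcome(self, dividend_type, divisor_type):
        if divisor_type.low > 0 or divisor_type.high < 0:
            return Outcome.SUCCEEDS  # zero is not a possible divisor
        if divisor_type.low == 0 and divisor_type.high == 0:
            return Outcome.FAILS     # the divisor can only be zero
        return Outcome.UNKNOWN       # zero is possible but not certain
```

An optimizer in the spirit of L2 could then compile the SUCCEEDS case with no failure branch, fold the FAILS case into an unconditional failure, and emit both paths only for UNKNOWN.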
>>> 
>>> But we wanted the library’s function/method types to be independent of 
>>> improvements to this detailed but incomplete information provided by the 
>>> VM.  Otherwise when we improved a bound on some primitive we would produce 
>>> a different type signature in the library.  Typically the new signature 
>>> would just be stronger, but in theory we might have to weaken some result 
>>> types to fix bugs.  So we keep this detailed information out of the Avail 
>>> programmer’s visibility.  The 
>>> “_+_” operation takes two numbers and produces a number.  It’s up to a 
>>> semantic restriction in the library to determine how to assert that a call 
>>> site that’s adding (say) two natural numbers can only produce a natural 
>>> number – and can’t fail.  In general the addition can fail, say for ∞ + 
>>> (-∞), which is undefined, but natural numbers don’t contain ±∞, so such an 
>>> addition won’t fail.  We restrict the call site by the intersection of the 
>>> semantic restrictions (and the primitive’s basic result type, number), but 
>>> we don’t actually check the result's run-time type if the primitive has 
>>> guaranteed at least as strong a result for those inputs.
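
The naturals example can be modeled in a few lines of Python (an illustration of the idea, not Avail syntax; Num, NATURAL, EXTENDED, and plus_restriction are invented names).  A type records whether ±∞ are among its members; the restriction narrows the result bounds and reports whether ∞ + (-∞) is even possible at this call site.

```python
from dataclasses import dataclass

INF = float("inf")

@dataclass(frozen=True)
class Num:
    """A numeric type as bounds, tracking whether ±∞ are themselves members."""
    low: float
    high: float
    has_neg_inf: bool = False
    has_pos_inf: bool = False

NATURAL = Num(0, INF)                  # [0..∞): unbounded, but ∞ is excluded
EXTENDED = Num(-INF, INF, True, True)  # [-∞..∞]: both infinities included

def plus_restriction(t1, t2):
    """Narrow the result type of _+_ and report whether the call can fail."""
    can_fail = (t1.has_pos_inf and t2.has_neg_inf) or \
               (t1.has_neg_inf and t2.has_pos_inf)  # only ∞ + (-∞) is undefined
    result = Num(t1.low + t2.low, t1.high + t2.high,
                 t1.has_neg_inf or t2.has_neg_inf,
                 t1.has_pos_inf or t2.has_pos_inf)
    return result, can_fail
```

For two naturals the model reports a natural result and no possibility of failure, matching the argument above; only when both infinities can meet does the failure branch survive.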
>>> 
>>> The same way we block propagation of nulls out of variables, the VM also 
>>> prevents propagation of incorrectly typed information past the returns from 
>>> method invocations.  We also check the input types for general operations 
>>> like function application, but… no surprise… L2 is able to skip the 
>>> run-time checks if it can statically prove that there will be the right 
>>> number of arguments of the right types for the provided function.
>>> 
>>> So to get better performance we can simply assert stronger things in the 
>>> VM, without affecting any of the types in the library.  A tighter return 
>>> type can even reduce the potential polymorphic calls that subsequently 
>>> operate on the value.  Assuming we get those assertions correct, the VM 
>>> runs faster while still being type-correct.  But when stronger semantic 
>>> restrictions are expressed in the library, we (the VM writers) no longer 
>>> bear the responsibility for coding them correctly, as we do inside the VM.  
>>> That’s because anybody may write a semantic restriction.  The code’s static 
>>> type safety is as correct as those restrictions, no more and no less 
>>> (assuming a correct 
>>> VM), but there is always run time type safety (again assuming a correct 
>>> VM), because anything the VM can’t prove is simply checked at run time, 
>>> invoking the run time type failure handler if something goes wrong.  That’s 
>>> not something you want to happen very often, of course, but it helps you 
>>> debug your library by refusing to propagate anything that’s the wrong type.
>>> 
>>> The other reason to separate VM primitive assertions and semantic 
>>> restrictions is expressivity.  In theory what you can write in Java (for 
>>> primitive strengthening) and what you can write in Avail (semantic 
>>> restrictions) cover the same territory, but obviously there are huge 
>>> differences in expressiveness.  In fact, when you write such code in Java, 
>>> the Java type system is so incredibly weak that almost any gibberish will 
>>> compile.  That’s partly due to how we use Java’s types in the VM, but 
>>> really Java’s type system doesn’t come close to being strong enough to 
>>> benefit us in the VM.  The VM was originally written in Smalltalk, and we 
>>> really get very little mileage out of Java’s types.  The A_* interfaces 
>>> are only there to reduce the chance you’ll shoot yourself in the foot, and 
>>> to make auto-complete suggestions more narrowly focused.  If I have an 
>>> A_Map and I extract a key from it, I have to broaden it from AvailObject 
>>> to A_Tuple or whatever I already “know” is supposed to be the kind of key 
>>> in that map.  I’ve played with the idea of making A_Map generic in two 
>>> parameters, but since Java generics are erased, and most values pass 
>>> through Avail at some point anyhow, the Java types would just be made up.  
>>> Java’s type system provides about as much help with correctness in the 
>>> Avail VM as assembly language’s type system (i.e., none).
>>> 
>>> Note that since semantic restrictions are written in Avail, their bodies 
>>> have first-order type safety.  So you get the benefit of the type rules and 
>>> any existing semantic restrictions even when writing new semantic 
>>> restrictions.  This assistance is compounded by the principle of 
>>> meta-covariance (given two types t1 and t2, (t1 ⊆ t2) = (t1’s type ⊆ t2’s 
>>> type)), which Avail automatically guarantees for all of its types.
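
Meta-covariance can be checked in a toy set-theoretic model (an assumed semantics for illustration, not Avail’s actual metatypes): take a type to be a set of instances, subtyping to be the subset relation, and a type’s type to be the collection of all its subtypes.

```python
from itertools import chain, combinations

def meta(t):
    """Model t's type as the set of all subtypes (subsets) of t."""
    items = sorted(t)
    return {frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))}

t1 = frozenset({1, 2})
t2 = frozenset({1, 2, 3})
t3 = frozenset({4})

assert t1 <= t2 and meta(t1) <= meta(t2)              # t1 ⊆ t2 gives t1's type ⊆ t2's type
assert not (t3 <= t2) and not (meta(t3) <= meta(t2))  # and non-subtyping is mirrored too
```

In this model the relation holds in both directions, agreeing with the equality stated above: since t is always a subtype of itself, meta(t1) ⊆ meta(t2) forces t1 ⊆ t2 as well.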
>>> 
>>> So maybe the right way to describe all these things is that Avail separates 
>>> the concepts of:
>>>     • first-order type safety (guaranteed by the VM),
>>>     • second-order type safety (provided by libraries, but checked at 
>>> runtime), and
>>>     • performance (by using type deduction within the VM).
>>> 
>>> It makes sense that several years down the road we could start applying 
>>> guided theorem proving to statically check the correctness of semantic 
>>> restrictions.  I suspect we could eventually prove the majority of them, 
>>> but simply validating a user-provided proof might be easier (the name 
>>> Cayenne comes to mind, but I think that project is defunct now).  Some 
>>> semantic restrictions, however, one may want to simply accept as axioms.  I 
>>> think all of this depends on how the market chooses to accept Avail.  After 
>>> all, people are still using PHP and JavaScript (for anything ever).
>>> 
>>> One more thing before I go:  Say we invoke a method and get back a result 
>>> that has a semantic restriction that narrows it such that its run-time type 
>>> has to be checked.  Yes, we take a hit checking that value's type, but 
>>> after the check passes we know that the value really has that narrowed 
>>> type.  We can pass it around with confidence, and the VM will optimize 
>>> using the assumption that this already-checked value has the type that we 
>>> expect.  When passed to a primitive, this narrowed type is what gets used 
>>> to deduce what type of value the primitive will produce (and whether it can 
>>> fail), so the actual number of run-time checks is probably lower than you’d 
>>> expect.
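
A minimal sketch of that check-once discipline (invented names, not the VM’s actual mechanism): the value’s type is verified a single time at the restriction boundary, after which downstream code relies on the narrowed type without re-checking.

```python
checks_performed = 0

def check_against(value, narrowed_type):
    """The one run-time check a semantic restriction costs at a call site."""
    global checks_performed
    checks_performed += 1
    if not isinstance(value, narrowed_type):
        raise TypeError(f"{value!r} is not a {narrowed_type.__name__}")
    return value

def call_with_restriction(method, narrowed_type, *args):
    # check the result once at the boundary; callers may then trust its type
    return check_against(method(*args), narrowed_type)

result = call_with_restriction(lambda a, b: a + b, int, 2, 3)
# from here on, `result` is known to satisfy the narrowed type; downstream
# operations (and the primitives they feed) need no further checks
```

A wrong-typed result trips the boundary check immediately, which is the debugging behavior described above: nothing of the wrong type propagates.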
>>> 
>>> -Mark
>> 
> 
