Re: [racket] Puzzled about type inference

Matthias Felleisen Tue, 05 Aug 2014 06:26:58 -0700

In principle your email summarizes it all. I would like to add two comments 
though:


--- 1. There was no contradiction between Sam's email (happy birthday Sam) and 
mine. My initial reaction was to fix up your program in a 'least intrusive' 
manner. That's why I left in the cast and simply pinpointed those places in the 
program where the type checker really needs help. When I found the open 
DrRacket buffer again, I saw the code in a whole new light, especially in 
violation of my (untyped) coding style and once I recognized the delta, I 
realized that I really wanted to write the program according to the Typed 
Racket philosophy: 

        add type declarations to variables and fields and function and method 
signatures. 

That should work. No code rewrites necessary. Period. -- Sam's reply points out 
that this is basically possible for your code, without the style-based rewrite. 
[We confirmed this point in an off-line exchange.] 

--- 2. Well, the 'period' part leads my to my second comment. While we have 
achieved our goal -- mostly, kind of -- for mostly functional Racket, we're not 
there yet. On occasion, we do need to insert casts. Worse, we do need to 
rewrite code. Worst, the type annotations are more of a burden than the TR 
slogan leads us to believe. 

Yes, type annotations are helpful for any future reader of the program, 
especially if they are correct. They bring across intentions, and with good 
function and type names, extensions. You need little more to 'get' the code. In 
a sense, type annotations explicate design documentation the way the HtDP 
design recipe suggests. [And yes, we also need tests to illustrate abstract 
statements and property-checking would be even better.] The code truly explains 
itself -- when combined with the rules from the Style guide (which we don't 
always follow and couldn't in the past). 

But, type annotations become really heavy in some cases (polymorphism) and 
simply labor in others. ____(: s String) (define s "hello world")___ anyone? So 
Sam injected _local type inference_ into the original TR. LOCAL means just that 
-- within a maximal S-expression TR tries to restore missing type declarations. 
Sometimes it works, sometimes it fails. 

We know from experience that MODULE-GLOBAL inference (Hindley Milner a la ML or 
Hindley-Milner soft type a la Fagan/Wright) doesn't work. Error messages are 
bad and cost more time than the reduction in labor is worth. 

Bottom line is then, TR is in an actual state and then there is the ideal that 
some of us have in our head where local type inference does a lot more and 
becomes almost as good as GLOBAL ti in terms of convenience. It is critical to 
keep this in mind. TR isn'f finished and it will continue to move from 'actual' 
to 'ideal'. 


Having said that, this thread definitely points out (1) we need to fix some of 
the write-ups about TR and (2) we should add sections about TR to the Style 
guide. We will do so ... 

-- Matthias
















On Aug 5, 2014, at 5:36 AM, Norman Gray wrote:

> 
> Matthias and Sam, hello.
> 
> This has been a very useful brief thread for me, and also, I suspect, for 
> others.
> 
> On 2014 Aug 4, at 01:22, Matthias Felleisen <matth...@ccs.neu.edu> wrote:
> 
>> In a sense, you put your finger on a sore spot of TR from R -- we are still 
>> figuring out how to help programmers along here.
> 
> Such help would be most valuable!
> 
>> ;; entirely unnecessary but helped me read; IGNORE 
>> (define-type (Maybe α) (U False α))
>> (define-type NI Nonnegative-Integer)
>> (define-type LNI (Listof NI))
>> (define-type VNI (Vectorof NI))
> 
> This is clearly a good idiom/pattern: very local, and telegraphically short, 
> abbreviations for useful types.
> 
>> Then I checked on the maps and realized that I need to help the TC along 
>> with the instantiations of vector->list (how can it know with context-free 
>> reasoning what kind of vector it is) and the same for car (how can it know 
>> what kind of list you're getting here). 
> 
> So this returns to the question of what the TC can and cannot be expected to 
> work out for itself, and I see that Sam had a couple of remarks on this later.
> 
>> (let* ((given  : LVNI '(#(1 2) #(3 4) #(5 6)))
>>      (length : NI   (1+ (apply max (map (lambda ({x : VNI}) (vector-ref x 
>> 0)) given))))
>>      (result : [Vectorof [Maybe NI]] (make-vector length #f)))
>> (for ((q : VNI (in-list given)))
>>   (vector-set! result (vector-ref q 0) (vector-ref q 1)))
>> result)
>> 
>> The above -- minus types -- is how I would probably write the code in 
>> Racket, following the style guide. (I would use three defines instead of 
>> let* but let's ignore this point.)
> 
> That looks _much_ nicer than what I had, and is still perfectly readable.
> 
>> If I were to move this code to Typed Racket, I'd add a type declaration to 
>> each identifier just like in any other (explicitly, statically) typed 
>> programming language. That's what I did, and there was no problem checking 
>> and running your code. 
> 
> So the pattern I see there, which both you and Sam have illustrated, is:
> 
>  * annotate as far 'down' the tree as possible, so that type information can 
> propagate upwards in the parse tree
> 
>  * If one liberally uses named bindings (recommended on other style grounds, 
> in any case), then each of these is a natural location for a type annotation
> 
>  * when a polymorphic-typed function takes polymorphic arguments, annotate 
> the arguments, not the function (that is, I see that, in (map car '(foo ...)) 
>  both you and Sam annotated the car, not the map); this is in harmony with 
> 'annotate as far down as possible', for some value of 'down'.
> 
> Sam said:
> 
>> The most signficant differences are (1) I'm not using `cast`, which
>> should be avoided when possible. `cast` is a runtime operation, and
>> might be hiding arbitrary bugs. (2) I've annotated the `let`-bound
>> variables, which makes life easier on the rest of the type checker and
>> helps improve error messages.
> 
> Ah, now this is something I've meant to ask about: what's the difference 
> between ann and cast?
> 
> The manual says:
> 
>> (ann e t)
>> Ensure that e has type t, or some subtype. The entire expression has type t. 
>> This is legal only in expression contexts.
> 
>> (cast e t)
>> The entire expression has the type t, while e may have any type. The value 
>> of the entire expression is the value returned by e, protected by a contract 
>> ensuring that it has type t. This is legal only in expression contexts.
> 
> I've never been sure whether the difference between "Ensure that e has type 
> t" (which sounds rather runtime-y) and "The entire expression has the type t" 
> is significant.  The only difference I could parse from this is that `cast` 
> includes a contract, which would appear to make it _safer_ than `ann`, in the 
> sense that if this were accidentally given a wrong-typed value than it would 
> cause an immediate error.
> 
> I presume that the difference is that `ann` is about static type analysis, 
> whereas `cast` is for the situation where that static analysis can't be done. 
>  Would I be correct that in the above case I can and should write
> 
>     (ann '(#(1 2) #(3 4) #(5 6)) (Listof VNI)
> 
> since this is saying that this expression has a more specific type than the 
> TC would work out by itself, and it can verify that the restriction is 
> possible.  In the real program, these numbers appear from a SQL query, and so 
> this list is actually of type (Listof (Vectorof (U Integer String))), so that 
> I should indeed be writing
> 
>    (cast (query-rows ...) (Listof VNI))
> 
> because these numbers are coming in, dynamically, from _outside_ the reach of 
> the static TC, and so the annotation needs to be dynamically verified at this 
> point.  That is: `cast` is for the 'boundary' of the type system, whereas 
> `ann` is for adding detail within it?
> 
> Ah -- I think I've just had a little epiphany.  In which case, it might be 
> useful to transplant a little of that into the ann/cast section of the 
> manual.  The examples shown there don't really bring out the distinction to 
> the naive reader. (Just by the way, `assert` seems to be doing rather similar 
> work to these two, and while I can appreciate a distinction, from the text, 
> I'd struggle to express it compactly -- a compare-and-contrast explanation in 
> the manual would be very illuminating)
> 
> (Does this imply that when TR code is called from untyped code, the arguments 
> are implicitly `cast` at call-time?)
> 
>> There are a few things that make this program require more annotations
>> than I'd like.
>> 
>> 1. TR doesn't handle automatically inferring type arguments when
>> polymorphic functions (like `map`) are given polymorphic arguments
>> like `vector->list`. While fixing this would require work, it just
>> needs to be done -- there's no reason we shouldn't be able to make
>> this work.
> 
> Righto -- that's what I wondered, whether it was 'just a matter of code' or 
> whether there was something fundamentally more complicated about this, which 
> I wasn't understanding.
> 
> Thanks, both, for your careful explanations.
> 
> Best wishes,
> 
> Norman
> 
> 
> -- 
> Norman Gray  :  http://nxg.me.uk
> SUPA School of Physics and Astronomy, University of Glasgow, UK
> 


____________________
  Racket Users list:
  http://lists.racket-lang.org/users

Re: [racket] Puzzled about type inference

Reply via email to