Marshall wrote:
>>The short answer is that I'm most directly referring to "the types in
>>the programmer's head".
>
> In the database theory world, we speak of three levels: conceptual,
> logical, physical.  In a dbms, these might roughly be compared to
> business entities described in a requirements doc (or in the
> business owners' heads), the (logical) schema encoded in the
> dbms, and the b-tree indices the dbms uses for performance.
>
> So when you say "latent types", I think "conceptual types."
That sounds plausible, but of course at some point we need to pick a
term and attempt to define it.  What I'm attempting to do with "latent
types" is to point out and emphasize their relationship to static
types, which do have a very clear, formal definition.  Despite some
people's skepticism, that definition gives us a lot of useful stuff
that can be applied to what I'm calling latent types.

> The thing about this is, both the Lisp and the Haskell programmers
> are using conceptual types (as best I can tell.)

Well, a big difference is that the Haskell programmers have language
implementations that are clever enough to tell them, statically, a
very precise type for every term in their program.  Lisp programmers
usually don't have that luxury.  And in the Haskell case, the
conceptual type and the static type match very closely.

There can be differences, e.g. a programmer might have knowledge about
the input to the program, or some programmed constraint that Haskell
isn't capable of inferring, which allows them to figure out a more
precise type than Haskell can.  (Whether that type can be expressed in
Haskell's type system is a separate question.)  In that case, you
could say that the conceptual type is different from the inferred
static type.  But most of the time, the human is reasoning about
pretty much the same types as the static types that Haskell infers.
Things would get a bit confusing otherwise.

> And also, the
> conceptual/latent types are not actually a property of the
> program;

That's not exactly true in the Haskell case (or other languages with
static type inference), assuming you accept the static/conceptual
equivalence I've drawn above.  Although static types are often not
explicitly written in the program in such languages, they are
unambiguously defined and automatically inferrable.  They are a static
property of the program, even if in many cases those properties are
only implicit with respect to the source code.  You can ask the
language to tell you what the type of any given term is.

> they are a property of the programmer's mental
> model of the program.

That's more accurate.  In languages with type inference, the
programmer still has to figure out what the implicit types are (or ask
the language to tell her).  You won't get any argument from me that
this figuring out of implicit types in a Haskell program is quite
similar to what a Lisp, Scheme, or Python programmer does.  That's one
of the places the connection between static types and what I call
latent types is the strongest.  (However, I think I just heard a sound
as though a million type theorists howled in unison.[*])

[*] most obscure inadvertent pun ever.

> It seems we have languages:
> with or without static analysis
> with or without runtime type information (RTTI or "tags")
> with or without (runtime) safety
> with or without explicit type annotations
> with or without type inference
>
> Wow.  And I don't think that's a complete list, either.

Yup.

> I would be happy to abandon "strong/weak" as terminology
> because I can't pin those terms down.  (It's not clear what
> they would add anyway.)

I wasn't following the discussion earlier, but I agree that
strong/weak don't have strong and unambiguous definitions.

> Uh, oh, a new term, "manifest."  Should I worry about that?

Well, people have used the term "manifest type" to refer to a type
that's explicitly apparent in the source, but I wouldn't worry about
it.  I just used the term to imply that at some point, the idea of
"latent type" has to be converted to something less latent.
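(To make the latent/manifest distinction concrete, here's a minimal
sketch of my own, in Python rather than the Javascript used elsewhere
in this thread.  The function is a Python rendering of the timestwo
function discussed below; the annotations and the mypy checker are my
additions, not part of the original example, and mypy's inference is
far weaker than Haskell's, so treat this as a loose analogy only.)

  from typing import reveal_type  # Python 3.11+; mypy also accepts a bare reveal_type()

  # The "latent" type in the programmer's head for this function is
  # roughly "number -> number".  Writing it down as annotations makes
  # (one approximation of) that type manifest in the source:
  def timestwo(x: float) -> float:
      return x * 2

  # A static checker can now be asked what type it assigns to any term.
  # Running mypy on this file reports, at each reveal_type() call:
  reveal_type(timestwo)      # mypy: "def (x: builtins.float) -> builtins.float"
  reveal_type(timestwo(21))  # mypy: "builtins.float"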
Once you explicitly identify a type, it's no longer latent to the
entity doing the identifying.

>>But if we ask Javascript what it thinks the type of timestwo is, by
>>evaluating "typeof timestwo", it returns "function".  That's because the
>>value bound to timestwo has a tag associated with it which says, in
>>effect, "this value is a function".
>
> Well, darn.  It strikes me that that's just a decision the language
> designers made, *not* to record complete RTTI.

No, there's more to it.  There's no way for a dynamically-typed
language to figure out that something like the timestwo function I
gave has the type "number -> number" without doing type inference, by
examining the source of the function, at which point it pretty much
crosses the line into being statically typed.

> (Is it going to be claimed that
> there is an *advantage* to having only incomplete RTTI?  It is a
> serious question.)

More than an advantage, it's difficult to do it any other way.  Tags
are associated with values.  Types in the type theory sense are
associated with terms in a program.  All sorts of values can flow
through a given term, which means that types can get complicated (even
in a nice clean statically typed language).  The only way to reason
about types in that sense is statically - associating tags with values
at runtime doesn't get you there.

This is the sense in which the static type folk object to the term
"dynamic type" - because the tags/RTTI are not "types" in the type
theory sense.  Latent types as I'm describing them are intended to
correspond more closely to static types, in terms of the way in which
they apply to terms in a program.

> Hmmm.  Another place where the static type isn't the same thing as
> the runtime type occurs in languages with subtyping.

Yes.

> Question: if a language *does* record complete RTTI, and the
> language does *not* have subtyping, could we then say that
> the runtime type information *is* the same as the static type?

You'd still have a problem with function types, as I mentioned above,
as well as with expressions like the conditional one I gave:

  (flag ? 5 : "foo")

The problem is that as the program is running, all it can ever
normally do is tag a value with the tags obtained from the path the
program followed on that run.  So RTTI isn't, by itself, going to
determine anything similar to a static type for such a term, such as
"string | number".

One way to think about "runtime" types is as being equivalent to
certain kinds of leaves on a statically-typed program syntax tree.  In
an expression like "x = 3", an inferring compiler might determine that
3 is of type "integer".  The tagged values which RTTI uses operate at
this level: the level of individual values.

However, as soon as you start dealing with other kinds of nodes in the
syntax tree - nodes that don't represent literal values, or compound
nodes (that have children) - the possibility arises that the type of
the overall expression will be more complex than that of a single
value.  At that point, RTTI can't do what static types do.

Even in the simple "x = 3" case, a hypothetical inferring compiler
might notice that elsewhere in the same function, x is treated as a
floating point number, perhaps via an expression like "x = x / 2.0".
According to the rules of its type system, our language might
determine that x has type "float", as opposed to "number" or
"integer".  It might then either treat the original "3" as a float, or
supply a conversion when assigning it to "x".
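(To make that last scenario concrete, here's a small sketch of my own
using Python with the mypy checker standing in for the hypothetical
inferring compiler.  The widening-to-float behaviour described above
is hypothetical; mypy, as the sketch shows, takes a stricter option.)

  # Sketch only: mypy standing in for the "hypothetical inferring
  # compiler" above; the function name is illustrative.
  def example() -> None:
      x = 3        # from the literal, mypy infers the static type of x: int

      # Elsewhere in the same function, x is treated as a floating point
      # number.  Rather than deciding that x is really a float, or
      # supplying a conversion, mypy simply rejects the assignment:
      x = x / 2.0  # error: Incompatible types in assignment
                   # (expression has type "float", variable has type "int")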
(Of course, some languages might just give an error and force you to
be more explicit, but bear with me for the example - it's after 3:30am
again.)

Compare this to the dynamically-typed language: it sees "3" and,
depending on the language, might decide it's a "number" or perhaps an
"integer", and tag it as such.  Either way, x ends up referring to
that tagged value.

So what type is x?  All you can really say, in the RTTI case, is that
x is a number or an integer, depending on the tag the value has.
There's no way it can figure out that x should be a float at that
point.

Of course, further down when it runs into the code "x = x / 2.0", x
might end up holding a value that's tagged as a float.  But that only
tells you the value of x at some point in time; it doesn't help you
with the static type of x, i.e. a type that either encompasses all
possible values x could have during its lifetime, or alternatively
determines how values assigned to x should be treated (e.g. cast to
float).

BTW, it's worth noting at this point, since it's implied by the last
paragraph, that a static type is only an approximation to the values
that a term can have during the execution of a program.  Static types
can (often!) be too conservative, describing possible values that a
particular term couldn't actually ever have.  This gives another hint
as to why RTTI can't be equivalent to static types.  It's only ever
dealing with the concrete values right in front of it; it can't see
the bigger static picture.

Anton

--
http://mail.python.org/mailman/listinfo/python-list