[Python-3000] Type Expressions

Birch, Bill Tue, 18 Apr 2006 02:47:39 -0700

Hi,

There is a massive amount of discussion about adding type checking to Python 
going back a long way, however most discussions gloss over exactly what type 
expressions are exactly. I'm working on this question with the goal of putting 
up a proposal of some kind to this list.


So far I have some conclusions/opinions which I'd like to share and get 
feedback and help on. Here's some points which I want to get out there:

(a) A lot of work on types does not consider composite types. A composite type 
is something like:  [int, int, str], ie a type consisting of atomic types held 
together by collections. As a programmer I want to be able to formally define 
composite types. I am parsing a CSV file I'd like to be able to declare the 
file format as a type.  These types are very important in the following 
contexts: 
  (1) reading input and validating it. (XML anyone?)
  (2) unit testing - did my code create the right output?
  (3) documenting the types my code uses for human readers in a standard way

(b) Much of the academic papers I have read on types are focussed on 
compile-time static type checking. This is understandable, but Python is a 
dynamic language and an object can change it's structure at run-time so 
programs will always outwit the compiler. So IMHO Python needs to have a 
_dynamic_ type expression system. This also gives Python programmers access to 
type expressions, they should not be the sole preserve of the compiler writers.

(b) I work with some heavy-hitting java developers who don't write a line of 
code without unit tests. Lots of unit tests. Even developers using a statically 
typed language rely on unit tests for program correctness. I trust Python 
programmers are the same. So any discussion of type systems needs to consider 
what benefits to unit test writing. IMHO dynamic type expressions with dynamic 
type checking predicates are very useful in unit tests. They can help answer 
the question "is my program creating the right structures?" Even when those 
structures are untyped collections.

(c) Simple composite type expressions with dynamic predicates are in my view 
more useful and universal than complex Interface or Adaption schemes. I would 
prefer to have a flexible and expressive type expression facility than 
Interfaces. 

(e) Unlike a compiled type system, a dynamic type system does have access to 
values. So we can define types where values are given. e.g. the type ('href', 
str) is the set of all 2-tuples where the first element is 'href'. Compiled 
static type systems don't normally allow this. 

(d) Working code. So far I have been coding some simple examples of type 
expressions with some predicates. I am (ab)using the existing Python syntax to 
get type expressions. Here are some examples:
 
        (int, int) # a tuple with two ints

        [int, etc]  # a list of ints
                        # (etc = zero or more of the previous element)

        {'b':[12, 34], 'c':list, 'a':long} # dict 

        Union(int, float)

        function(a=int, b=float, returns=float) 

        pyobject(xcoord=int, ycoord=int) # an instance

        These are all true:

        isSubType(list, [42])
        isSubType((int, etc), (int, int, int))
        isSubType(Union(int, float, str), Union(float, int))
        isSubType({'a':Number, 'b':object}, {'b':[12, 34], 'c':list, 'a':long})

(e) Python needs a succinct type expression syntax! Really.

(f) Predicates on types: "isInstance(T1, x)". Given a Python object, can x be 
used where type T1 is expected? And isSubType(T1, T2) - can T2 be used where a 
T1 is expected? This is a _structural_ subtyping predicate. Given point (e), 
isSubType() and isInstance() look like the same thing. If a value is also a 
valid type expression there's no distinction. For now my implementation of 
isSubType() accepts instances too.

(f) A dynamic type expression system also allows programmers to _generate_ 
objects from type expressions. I have a function generateRandom(T) which 
returns a random instance of type T. In a loop this is a fun way to break code. 

(g) Parametric types are simple function calls: lambda T: {'stack': [T], 'pop': 
function(returns=T) , 'push': function(newtop=T)}. Type names are variable 
bindings, just like values.

(h) Interfaces are just rather complex type expressions.

In summary a type system should be _dynamic_ and not static.  It's reflection 
API should be trivial to use. It should help programmers to answer these 
questions: Is this data I just received in the right structure? Did my program 
just output something in the right structure? What structure am I expected to 
pass to this library? My view is that a dynamic type system used in unit 
testing will find more defects than a static type system. And it will be fun.









_______________________________________________
Python-3000 mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-3000
Unsubscribe: 
http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com

[Python-3000] Type Expressions

Reply via email to