Hi,
There is a massive amount of discussion about adding type checking to Python
going back a long way, however most discussions gloss over exactly what type
expressions are exactly. I'm working on this question with the goal of putting
up a proposal of some kind to this list.
So far I have some conclusions/opinions which I'd like to share and get
feedback and help on. Here's some points which I want to get out there:
(a) A lot of work on types does not consider composite types. A composite type
is something like: [int, int, str], ie a type consisting of atomic types held
together by collections. As a programmer I want to be able to formally define
composite types. I am parsing a CSV file I'd like to be able to declare the
file format as a type. These types are very important in the following
contexts:
(1) reading input and validating it. (XML anyone?)
(2) unit testing - did my code create the right output?
(3) documenting the types my code uses for human readers in a standard way
(b) Much of the academic papers I have read on types are focussed on
compile-time static type checking. This is understandable, but Python is a
dynamic language and an object can change it's structure at run-time so
programs will always outwit the compiler. So IMHO Python needs to have a
_dynamic_ type expression system. This also gives Python programmers access to
type expressions, they should not be the sole preserve of the compiler writers.
(b) I work with some heavy-hitting java developers who don't write a line of
code without unit tests. Lots of unit tests. Even developers using a statically
typed language rely on unit tests for program correctness. I trust Python
programmers are the same. So any discussion of type systems needs to consider
what benefits to unit test writing. IMHO dynamic type expressions with dynamic
type checking predicates are very useful in unit tests. They can help answer
the question "is my program creating the right structures?" Even when those
structures are untyped collections.
(c) Simple composite type expressions with dynamic predicates are in my view
more useful and universal than complex Interface or Adaption schemes. I would
prefer to have a flexible and expressive type expression facility than
Interfaces.
(e) Unlike a compiled type system, a dynamic type system does have access to
values. So we can define types where values are given. e.g. the type ('href',
str) is the set of all 2-tuples where the first element is 'href'. Compiled
static type systems don't normally allow this.
(d) Working code. So far I have been coding some simple examples of type
expressions with some predicates. I am (ab)using the existing Python syntax to
get type expressions. Here are some examples:
(int, int) # a tuple with two ints
[int, etc] # a list of ints
# (etc = zero or more of the previous element)
{'b':[12, 34], 'c':list, 'a':long} # dict
Union(int, float)
function(a=int, b=float, returns=float)
pyobject(xcoord=int, ycoord=int) # an instance
These are all true:
isSubType(list, [42])
isSubType((int, etc), (int, int, int))
isSubType(Union(int, float, str), Union(float, int))
isSubType({'a':Number, 'b':object}, {'b':[12, 34], 'c':list, 'a':long})
(e) Python needs a succinct type expression syntax! Really.
(f) Predicates on types: "isInstance(T1, x)". Given a Python object, can x be
used where type T1 is expected? And isSubType(T1, T2) - can T2 be used where a
T1 is expected? This is a _structural_ subtyping predicate. Given point (e),
isSubType() and isInstance() look like the same thing. If a value is also a
valid type expression there's no distinction. For now my implementation of
isSubType() accepts instances too.
(f) A dynamic type expression system also allows programmers to _generate_
objects from type expressions. I have a function generateRandom(T) which
returns a random instance of type T. In a loop this is a fun way to break code.
(g) Parametric types are simple function calls: lambda T: {'stack': [T], 'pop':
function(returns=T) , 'push': function(newtop=T)}. Type names are variable
bindings, just like values.
(h) Interfaces are just rather complex type expressions.
In summary a type system should be _dynamic_ and not static. It's reflection
API should be trivial to use. It should help programmers to answer these
questions: Is this data I just received in the right structure? Did my program
just output something in the right structure? What structure am I expected to
pass to this library? My view is that a dynamic type system used in unit
testing will find more defects than a static type system. And it will be fun.
_______________________________________________
Python-3000 mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-3000
Unsubscribe:
http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com