Fun with decorators and unification dispatch

2005-09-10 Thread talin at acm dot org
Been playing around a bit more with developing my python inference
engine, and I thought it might be of general interest (plus, I am
finding the criticism useful).

My original goal was to brush up on my math skills. Now, I've long felt
that the best way to learn a language *thoroughly* is to write a
compiler for it. So why not write a compiler for math?

However, algebra and calculus aren't just about evaluating an
expression and getting the answer, they are about *manipulating*
mathematical expressions, applying transformations to them. Systems
that do this (such as Mathematica and Yacas) are called computer
algebra systems, so I decided to see how hard it would be to implement
one in Python.

A computer algebra system is essentially a specialized kind of expert
system, in other words it is a set of transformation rules, where an
input expression is matched against a database of patterns, and the
result of the database query determines what transformation is to be
made.

Most such systems start by implementing their own, specialized
programming language, but I wanted instead to see if I could add a
inference engine capability to Python itself.

So what I have is a dispatch mechanism that maintains a database of
Python functions, keyed by the input pattern. Unlike multimethod
systems, the input patterns are not merely a list of types, but can be
trees containing both constants and variables.

Here's a trivial example, using the classic recursive algorithm for
computing the factorial of a number:

# Declare that Factor is a generic function
Factorial = Function()

# Define Factorial( 0 )
@Arity( 0 )
def Factorial():
return 1

# Define Factorial( x )
@Arity( MatchInteger.x )
def Factorial( x ):
return x * Factorial( x - 1 )

print Factorial( 12 )

Function produces an instance of a generic function, which is a
callable object. When called, it searches its list of patterns to
determine the function to dispatch.

The Arity decorator adds the function as a special case of the
generic function. It adds the specific function to the generic's
internal pattern database, and also returns the generic function as its
return result, so that that (in this case) the name Factorial is
bound to the generic function object.

MatchInteger is a class that produces a matching object, otherwise
known as a bound variable. In this case, the variable's name is x.
(It overloads the __getattr__ method to get the variable name.) When
the pattern matcher encounters a matching object, it attempts to bind
the corresponding portion of the expression to that variable. It does
this by adding the mapping of variable name to value into a dictionary
of variable bindings.

When the function is called, the dictionary of variable bindings is
expanded and passed to the function (i.e. func( *bindings )), so that
the variable that was bound to x now gets passed in as the x
parameter of the function.

MatchInteger itself is a type of qualified match (that is, a variable
that only binds under certain conditions), and could be defined as:

MatchInteger = QualifiedMatch( lambda x: type( x ) is int )

(Although it is not in fact defined this way.) QualifiedMatch takes a
list of matching preducates, which are applied to the expression before
binding can take place.

Here's a more complex example, which is a set of rules for simplifying
an expression:

Simplify = Function()

# x = x
@Arity( MatchAny.x )
def Simplify( x ):
return x

# x * 0 = 0
@Arity( ( Multiply, MatchAny.x, 0 ) )
def Simplify( x ):
return 0

# x * 1 = x
@Arity( ( Multiply, MatchAny.x, 1 ) )
def Simplify( x ):
return Simplify( x )

# x + 0 = x
@Arity( ( Add, MatchAny.x, 0 ) )
def Simplify( x ):
return Simplify( x )

# x + x = 2x
@Arity( ( Add, MatchAny.x, MatchAny.x ) )
def Simplify( x ):
return (Multiply, 2, Simplify( x ) )

# General recursion rule
@Arity( ( MatchAny.f, MatchAny.x, MatchAny.y ) )
def Simplify( f, x, y ):
return ( Simplify( f ), Simplify( x ), Simplify( y ) )

And in fact if I call the function:

print Pretty( Simplify( Parse( (x + 2) * 1 ) ) )
print Pretty( Simplify( Parse( x * 1 + 0 ) ) )
print Pretty( Simplify( Parse( y + y ) ) )
print Pretty( Simplify( Parse( (x + y) + (x + y) ) ) )

It prints:

x + 2
x
2 * y
2 * (x + y)

The argument matcher tries to prioritize matches so that more specific
matches (i.e. containing more constants) are matched before more
general matches. This is perhaps too unsophisticated a scheme, but it
seems to have worked so far.

The pattern matcher also looks to see if the object being matched has
the commute or associate property. If it finds commute it
attempts to match against all posssible permutations of the input
arguments (with special optimized logic for functions of one and two
arguments). If the associate property is found, it first tries to
flatten the expression (transforming (+ a (+ b c)) into (+ a b c), and
then generating all possible partitions of the arguments into two sets,
before attempting 

Re: Fun with decorators and unification dispatch

2005-09-10 Thread talin at acm dot org
Yes, it was a typo.

Even thought the function has not yet been bound to the name
Factorial when it calls the decorator, the function's __name__
attribute is set to it, so I use that to look up the name of the
generic.

Here''s the source for Arity:

def Arity( *pattern ):
A function decorator that defines a specific arity of a generic
function. This registers the
specific implementation with the generic function (which must be in
the global scope.)

def inner( f ):
if isinstance( f, Function ):
generic = f
f = generic.last_defined
else:
name = f.__name__
if not name in f.func_globals:
raise Exception( Generic function  + name +  has not
been defined. )
generic = f.func_globals[ name ]
generic.name = name
generic.last_defined = f
generic.add_arity( pattern, f )
return generic
return inner

There's a couple of kludges here:

1) The Generic doesn't know its own name until you define at least one
specialization for it. Otherwise, you would have to say:

Factorial = Function( Factorial )

which I consider verbose and redundant.

2) The whole last_defined thing is to allow multiple arities for a
single specialized function. Since the arity normally returns the
generic, and not the specialized func, as the return value, that means
that any additional decorators will be applied to the generic rather
than the specialized func. last_defined is a kludge for getting around
that, but only for decorators that understand the situation.

-- 
http://mail.python.org/mailman/listinfo/python-list


Replacement for lambda - 'def' as an expression?

2005-09-06 Thread talin at acm dot org
I've been reading about how lambda is going away in Python 3000 (or
at least, that's the stated intent), and while I agree for the most
part with the reasoning, at the same time I'd be sad to see the notion
of anonymous functions go - partly because I use them all the time.

Of course, one can always create a named function. But there are a lot
of cases, such as multimethods / generics and other scenarios where
functions are treated as data, where you have a whole lot of functions
and it can be tedious to come up with a name for each one.

For example, my current hobby project is implementing pattern matching
similar to Prolog in Python. The dispatcher I am making allows you to
create overloaded versions of a function that take different patterns
as their input arguments, so that Simplify( (add, x, y) ) calls a
different method than Simplify( (log, x) ) -- in other words, the
choice of which code is executed is based on the structure of the tuple
that is passed into it. However, in order for this to work, I need to
be able to assign a block of Python code to a particular pattern, and
having to invent a named function for each pattern is a burden :)

Anyway, here's an example, then, of how 'def' could be used:

add = def( a, b ):
   return a + b

The lack of a function name signals that this is an anonymous function.
The def keyword defines the function using the same syntax as always -
the arguments are in parentheses, and are unevaluated; The colon marks
the beginning of a suite.

In fact, it looks a lot like the existing lambda, with a couple of
differences:

1) It uses the familiar def keyword, which every Python beginner
understands, instead of the somewhat unfamiliar lambda
2) The arguments are enclosed in parentheses, instead of a bare tuple
followed by a colon, again reiterating the similarity to the normal
usage of def.
3) The statements are a real suite instead of a pseudo-suite - they can
consist of multiple lines of statements.

Like all statements whose last argument is a suite, you can put the
body of the function on a single line:

add = def( a, b ): return a + b

(If this were Perl, you could also omit the return, since in Perl the
last evaluated expression in the function body is what gets returned if
there's no explicit return statement.)

What about passing an anonymous function as an argument, which is the
most common case? This gets tricky, because you can't embed a suite
inside of an expression. Or can you?

The most powerful option would be to leverage the fact that you can
already do line breaks inside of parentheses. So the def keyword
would tell the parser to restart the normal indentation calculations,
which would terminate whenever an unmatched brace or paren was
encountered:

a = map(
   (def( item ):
  item = do_some_calculation( item )
  return item
   ), list )

The one-liner version looks a lot prettier of course:

a = map( (def( item ): return item * item), list )

And it looks even nicer if we switch the order of the arguments around,
since you can now use the final paren of the enclosing function call to
terminate the def suite.

a = map( list, def( item ): return item * item )

Unfortunately, there's no other good way I can think of to signal the
end of the block of statements without introducing some radical new
language construct.

(Besides, if being an expression is good enough for 'yield', why
shouldn't def get the same privilege? :)

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Replacement for lambda - 'def' as an expression?

2005-09-06 Thread talin at acm dot org
I like the decorator idea. Unfortunately, the version of Python I am
using is pre-decorator, and there are various issues involved in
upgrading on Mac OS X (due to the built-in Python 2.3 being used by the
OS itself.) I'll have to look into how to upgrade without breaking too
much...

Some further examples of what I am trying to do. First let me state
what my general goal is: There are lots of inference engines out there,
from Prolog to Yacas, but most of them rely on a custom interpreter.
What I want to find out is if I can build a solver, not by creating a
new language on top of Python, but rather by giving solver-like
capabilities to a Python programmer. Needless to say, this involves a
number of interesting hacks, and part of the motivation for my
suggestion(s) is reducing the hack factor.

So, at the risk of being visited by Social Services for my abuse of
Python Operators, here's a sample of how the sovler works:

# Define a function with multiple arities
Simplify = Function()

# Define some arities. We overload __setitem__ to define an arity.
# Param is a class who'se metaclass defines __getattr__ to return a new
instance
# of Param with the given parameter name.
Simplify[ ( add, Param.x, 0 ) ] = lamba x: return Simplify( x )# x
+ 0 = x
Simplify[ ( mul, Param.x, 1 ) ] = lamba x: return Simplify( x )# x
* 1 = x
Simplify[ ( mul, Param.x, 0 ) ] = lamba x: return 0   #
x * 0 = 0
Simplify[ Param.x ] = lamba x: return x
 # Fallback case

# Invoke the function. Should print the value of x
print Simplify( (add, x, 0) )

Of course, what I really want is not def or lambda, what I really want
is to be able to define functions that take suites as arguments. But
that would be crazy talk :)

Define( Simplify, args ):
   code

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: 'isa' keyword

2005-09-02 Thread talin at acm dot org
Thanks for all the respones :) I realized up front that this suggestion
is unlikely to gain approval, for reasons eloquently stated above.
However, there are still some interesting issues raised that I would
like to discuss.

Let me first respond to a few of the comments:

What's the difference between this and ``isinstance`` ?

What's the difference between 'in' and 'has_key()? 1) Its shorter and
more readable, 2) it can be overridden to mean different things for
different container types.

 What's wrong with:
 if image.isa(gif):
 elif image.isa(jpeg):
elif image.isa(png):

That forces the classification logic to be put into the instance,
rather than in the category. With the in keyword, the __contains__
function belongs to the container, not the contained item, which is as
it should be, since an item can be in multiple containers.

 Especially conidering that checking parameters with isinstance is
 considered bad form with Python's duck typing.

Here's an example where the strict OOP style of programming breaks
down. I'll use SCons as an example. In SCons, there is a Depends
function that can take a filename, a list of filenames, or a build
target (which is a python object). So the logic looks like this:

def Depends( target ):
   if isinstance( target, str ):
...
   elif isinstance( target, list ):
   ...
   elif isinstance( target, BuildTarget ):
   ...
   else: error

You can't use method overloading here, because you are dealing with
builtin python objects (except for the BuildTarget).

I can think of several different cases where you would have to resort
to logic like this:
  -- Where you are trying to distinguish between built-n python types
  -- Where you are trying to distinguish between types that are created
by another, independent module and which you can't modify
  -- Where you are trying to do multi-method dispatch logic.

As an example of the latter, imagine a music sequencer application that
edits a stream of Midi events. Lets suppose that there are various
classes of events:

   Note On
   Note Off
   Aftertouch
   Pitchbend
   Control Change

In addition, lets suppose that we  have a variety of different ways of
editing these events:

   Piano Roll Editor - edits in horizontal piano roll form
   Drum Machine editor - notes are placed on a grid
   Event List Editor - a text list of events
   Music Notation Editor - uses conventional notes and staves

All of these editors operate on the same underlying data, which is a
stream of events. Each of these editors has a repaint function to
render the stream of events. So the rendering of the event depends
*both* on the class of the event, and the class of the editor.

So you could organize it this way:

class PianoRollEditor:
   def repaint( event ):
  if isinstance( event, Note ): # draw note
  elif isinstance( event, Aftertouch ): # draw
  ...etc

You could also invert the logic (which is somewhat clumsier):

class Note:
   def repaint( context ):
  if isinstance( context, PianoRollEditor ): ...
  etc...

Now, I realize that some folks have built multi-method dispatch systems
for Python (I did one myself at one point.) However, these tend to be
somewhat slow and clunky without language support (and no, I am not
asking for Python to become Dylan or CLOS.) But it would be nice to
know if there was a cleaner way to solve the above problems...

-- 
http://mail.python.org/mailman/listinfo/python-list


'isa' keyword

2005-09-01 Thread talin at acm dot org
Although I realize the perils of even suggesting polluting the Python
namespace with a new keyword, I often think that it would be useful to
consider defining an operator for testing whether or not an item is a
member of a category.

Currently, we have the 'in' operator, which tests for membership within
a container, and that works very well -- in particular, it allows such
membership tests to be expressed in very natural way. So for example,
whereas in C++ I always have to say:

if (dependencies.find( name ) != dependencies.end())

in Python I can simply say:

if name in dependencies:

...which is much more readable and intuitive. At the same time,
however, I recognize that there is a logical difference between
membership in a container, and membership in a category. For example,
although a bear is a member of the class of mammals, it doesn't make as
much to say if bear in mammal. Similarly, you wouldn't want to use
the 'in' keyword as a replacement for isinstance(), i.e. if name in
str.

I propose the word 'isa' because the term 'isa hierarchy' is commonly
used to indicate a tree of types. So the syntax would look like this:

if bear isa mammal:
if name isa str:

(I suppose it would look prettier to put a space between is and a,
but there are many obvious reasons why you don't want a to be a
keyword!)

The isa operator would of course be overloadable, perhaps by an
accessor functions called __isa__, which works similarly to
__contains__. The potential uses for this are not limited to
isinstance() sugar, however. For example:

if image isa gif:
elif image isa jpeg:
elif image isa png:

In this case, we're not testing for object identity (which tests if two
variables are referring to the same object), or even object equivalence
(which tests of two objects are of equal value), nor are we testing for
membership within a container -- instead we're testing for membership
with a type hierarchy, where 'type' can be defined to mean whatever the
programmer wants.

Of course, at this point, I am sure that someone will point out that I
should be using method overloading and inheritance rather than explicit
testing of types. However, try writing an efficient __cmp__ function
solely by method overloading -- or any other function that deals with
more two object argument, where the action to be taken depends on the
combination of types of both arguments. This can be solved with
multi-method dispatch, but such methods are complex, non-standard, and
have somewhat dubious performance characteristics. Its generally faster
and simpler to dispatch based on the type of one of the arguments, and
then test the types of the other arguments.

-- 
http://mail.python.org/mailman/listinfo/python-list