So now I am at the point where Bicicleta needs only native functions in order to run any interesting programs --- things like fac:

    {env: fac = {fac: x = 3, '()' = fac.x.'<'{lt: arg1=2}.'()'.if_true{i: then=1, else=fac.x.'*'{mu: arg1=env.fac{f: x=fac.x.'-'{m: arg1=1}.'()'}.'()'}.'()'}.'()'} }.fac{f: x = 4}.'()'

or fib:

    {env: fib = {fib: x = 3, '()' = fib.x.'<'{lt: arg1=2}.'()'.if_true{i: then=1, else = env.fib{f: x=fib.x.'-'{m: arg1=1}.'()'}.'()'.'+'{p: arg1 = env.fib{f: x=fib.x.'-'{m: arg1=2}.'()'}.'()' } }.'()' }}.fib{f: x = 5}.'()'

These parse, but they return errors, because numbers don't yet have
any methods. Fac depends on integers having '*' and '-' methods that
return integers, and a '<' method that returns a boolean (i.e.
something with an if_true method that returns its then or else). Fib
is similar, but wants '+' instead of '*'.

So what's the best way to implement this? Here are some approaches
I've seen before.

The C++/Perl/CLOS/OCaml Approach: Primitive Objects Aren't Objects
------------------------------------------------------------------

In C++, Perl, CLOS, and OCaml, primitive objects don't have any
methods, and you can't inherit from them. You work with them through
other mechanisms, neither methods nor inheritance, that each language
provides. In CLOS, you can at least define methods on them, but they
don't come with any to start with.

In all four cases, I think this is the result of retrofitting an
object system to an existing non-object-oriented language.

This is not an approach I am considering.

The Python Approach: Native Objects And Native Functions
--------------------------------------------------------

In Python, primitive objects like strings and numbers are different
kinds of objects from user-defined objects. (This is less true now
than before the "class-type unification".) They use different
mechanisms for looking up their properties (such as methods), and you
cannot add properties to them the way you can add properties to normal
objects. Before the class-type unification, you couldn't inherit from
them, either. Now you can, but the process has some pitfalls.
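To make the Python behavior concrete, here is a small sketch for a current Python 3 interpreter. The names `Plain`, `MyInt`, `as_words`, and `try_to_add_property` are made up for illustration; the pitfall shown (arithmetic on a subclass instance silently returning a plain `int`) is only one of several.

```python
def try_to_add_property(obj):
    """Return True if a new attribute can be attached to obj."""
    try:
        obj.as_words = lambda: "three"
        return True
    except AttributeError:
        # Built-in ints have no per-instance attribute dictionary.
        return False


class Plain:
    """An ordinary user-defined class; its instances accept new attributes."""


class MyInt(int):
    """Inheriting from int works since the unification."""

    def as_words(self):
        return "three" if self == 3 else "many"


added_to_int = try_to_add_property(3)          # fails for a primitive
added_to_plain = try_to_add_property(Plain())  # works for a normal object

n = MyInt(3)
words = n.as_words()       # the subclass method is available on n
total = n + 1              # inherited '+' comes from int ...
total_type = type(total)   # ... so the result is a plain int, not a MyInt
```

So a method added in the subclass disappears after the first arithmetic operation unless every operator is overridden to rewrap its result.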

This approach discourages the introduction of methods like Squeak's
asWords method:

    3252523 asWords
    'three million, two hundred fifty-two thousand, five hundred twenty-three'

because it really wouldn't be worth it to write that in C.

Another kind of primitive object is the "built-in function", meaning a
function written in C.

The Wheat Approach: Invisible Native Methods
--------------------------------------------

In Wheat, primitive objects like strings belong to some class whose
path is hardcoded into the language interpreter. If you put ordinary
user-defined methods in that class, they start working on all objects
of that primitive type. However, there are also non-user-defined
methods in these classes, stuck there by the interpreter at startup.
This means that there are two places to look, the user-defined methods
in the class and the ones the interpreter installed, to see whether,
say, strings override some particular method or inherit the version in
the standard object.

The Squeak Approach: Objects With Some Hidden State, And Primitive Methods
--------------------------------------------------------------------------

Squeak has primitive methods, which are executed directly by the
interpreter, and some objects like SmallInteger that don't have any
instance variables, and whose contents can therefore only be accessed
through primitives.

If you browse ByteString>>at:, you will see a line at the beginning
that says <primitive: 63>. The comment above it says, "See Object
documentation whatIsAPrimitive," and it turns out Object has a class
method called "whatIsAPrimitive", consisting mostly of a long comment,
part of which reads as follows:

    When the Smalltalk interpreter begins to execute a method which
    specifies a primitive response, it tries to perform the primitive
    action and to return a result. If the routine in the interpreter
    for this primitive is successful, it will return a value and the
    expressions in the method will not be evaluated.
    If the primitive routine is not successful, the primitive 'fails',
    and the Smalltalk expressions in the method are executed instead.
    These expressions are evaluated as though the primitive routine
    had not been called.

There are also certain class-selector pairs for native methods that
are not looked up through the normal method lookup mechanism, which
the comments claim is for efficiency; but I suspect that this is also
necessary to provide a base case for the recursion involved in things
like method lookup.

There are about 149 methods in Squeak3.8-6665full.image that specify
primitive responses.

The comment about "if the primitive routine is not successful" above
is interesting. The most apparent reason the primitive routine would
not be successful is that it may not be implemented in the Squeak
virtual machine running your image at the moment, but there are other
possibilities as well. For example, SmallInteger>>* has a primitive
implementation that fails on overflow, which lets the Smalltalk code
in the method fall back to arbitrary-precision arithmetic. Similarly,
BlockContext>>valueWithArguments: fails if the number of arguments is
wrong, and the error reporting is all handled by the fallback code.

The Lua Approach: I Don't Know What It Is
-----------------------------------------

Apparently Lua has a "userdata" type which is basically an opaque
pointer into C-land. It also has, I think, native functions which
call into C when you call them, just as in Python. But you can define
a "metatable" on a piece of userdata to arrange for table lookup on
the userdata to return things, such as functions, in Lua or C, to be
invoked as methods.

This seems very similar to the Wheat approach. But I should finish
reading the Lua manual before passing judgment.

My Approach For the First Bicicleta Prototype
---------------------------------------------

At the bottom level, I am taking a Python-like approach: I have
primitive objects, with a fixed set of primitive methods, from which
you cannot inherit.

However, integer literals and string literals do not evaluate to these
objects. An integer literal, for example, evaluates the expression
"prog.sys.machine_integer" and derives from its result by overriding
the method 'clientdata' to return the primitive object representing
that particular integer. This level of indirection allows the program
to wrap more methods around the primitive object, and even allows
integer literals in certain parts of the program to evaluate to
something different.
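The indirection can be sketched in Python roughly as follows. This is only an illustration of the idea, not the prototype's code: `Primitive`, `Obj`, `send`, `derive`, `literal`, `value`, and `fancy_integer` are hypothetical names, and only `machine_integer` and `clientdata` come from the description above.

```python
class Primitive:
    """Opaque primitive integer with a fixed set of methods (hypothetical)."""
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value

    def __init_subclass__(cls, **kwargs):
        # Model "from which you cannot inherit".
        raise TypeError("primitive objects cannot be inherited from")

    def add(self, other):
        return Primitive(self.value + other.value)

    def sub(self, other):
        return Primitive(self.value - other.value)

    def mul(self, other):
        return Primitive(self.value * other.value)


class Obj:
    """User-level object: methods take self and are found along a parent chain."""

    def __init__(self, parent=None, **methods):
        self.parent = parent
        self.methods = methods

    def derive(self, **overrides):
        return Obj(self, **overrides)

    def send(self, name, *args):
        obj = self
        while obj is not None:
            if name in obj.methods:
                # Late binding: the method sees the original receiver.
                return obj.methods[name](self, *args)
            obj = obj.parent
        raise AttributeError(name)


def _binary(op):
    """Build a user-level method that delegates to a primitive operation."""
    def method(self, other):
        result = op(self.send("clientdata"), other.send("clientdata"))
        # The result derives from the receiver, overriding only 'clientdata'.
        return self.derive(clientdata=lambda _self: result)
    return method


# Stand-in for the result of evaluating prog.sys.machine_integer.
machine_integer = Obj(
    add=_binary(Primitive.add),
    sub=_binary(Primitive.sub),
    mul=_binary(Primitive.mul),
    value=lambda self: self.send("clientdata").value,
)


def literal(n, base=None):
    """What an integer literal evaluates to in this sketch."""
    prim = Primitive(n)
    base = machine_integer if base is None else base
    return base.derive(clientdata=lambda _self: prim)


product = literal(3).send("mul", literal(4)).send("value")

# Wrapping an extra method around the primitive, and making literals in
# one part of the program evaluate to something different:
fancy_integer = machine_integer.derive(double=lambda self: self.send("add", self))
doubled = literal(21, base=fancy_integer).send("double").send("value")
```

Here `product` comes out as 12 and `doubled` as 42. Because every result derives from its receiver, the extra `double` method survives arithmetic, which is what the plain Python subclassing approach does not give you for free.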