On May 9, 2014, at 2:35 PM, Robbert van Dalen wrote:
> how do avail classes/objects work?
Heh, get ready for another gem! Ok, where to start... I guess I'll cover
everything, if only for the amusement of onlookers.
The basics. In Avail, an object is any value whose type is a specialization of
the type known as "object". An object is conceptually (and implementationally
at the moment) a map from field keys (atoms) to values. For example, if x and
y are atoms, then you can build an object in which x maps to 5 and y maps to
"cheese". Note that the object is immutable and identityless, just like a map.
The fields of an object are permanent, and contribute towards the object's
equality, hash, and type. Sounds simple so far, right?
Object types are similarly represented with a field type map which maps from
atoms to types. Objects and object types are different kinds of things (i.e.,
an object type is not itself an object). An object type whose field type map
has x → integer and y → string has as an instance the object which I specified
above.
An object is an instance of an object type if it has all of the fields
specified in the object type, and the values in those fields are instances of
the types specified in the corresponding field type map of the object type.
About what you'd expect from the above example. But note that this definition
allows additional fields to be present in an object while maintaining
conformance with some type. Also, due to the fact that Avail's types are very
detailed and precise, it should be obvious that object types covary by their
field types. Thus, an object type whose field type map has x → [0..255] and y
→ <character…|6> is a subtype of an object type whose field type map has only x
→ number. So the most general object type is the one whose field type map is
empty.
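To make the instance and subtype rules concrete, here is a rough model in Python (not Avail). Everything in it is an assumption made for illustration: the Atom class, the helper names is_instance and is_subtype, and the use of Python classes as stand-ins for Avail types. Python dicts are also mutable, unlike Avail objects, so treat this as a sketch of the rules only.

```python
from numbers import Number


class Atom:
    """Hypothetical stand-in for an Avail atom: a unique field key."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"<atom {self.name}>"


x, y, z = Atom("x"), Atom("y"), Atom("z")

# An object, modeled as a map from atoms to values.
cheese = {x: 5, y: "cheese"}

# Object types, modeled as maps from atoms to field types.
# Python classes stand in for Avail types here.
x_number = {x: Number}
x_int_y_str = {x: int, y: str}
most_general = {}  # empty field type map: every object conforms


def is_instance(obj, obj_type):
    """obj has every field that obj_type names, with a conforming value."""
    return all(
        atom in obj and isinstance(obj[atom], field_type)
        for atom, field_type in obj_type.items()
    )


def is_subtype(sub, sup):
    """sub has every field that sup names, with a covariantly narrower type."""
    return all(
        atom in sub and issubclass(sub[atom], field_type)
        for atom, field_type in sup.items()
    )


print(is_instance(cheese, x_int_y_str))
print(is_instance({x: 5, y: "cheese", z: None}, x_int_y_str))  # extra field
print(is_subtype(x_int_y_str, x_number))
```

Note that is_instance and is_subtype have the same shape: extra fields are always allowed, and only the fields named by the more general type are checked.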
This allows us to create subtypes by covariant specialization: A "point" type
whose x and y fields must be numbers can be specialized to an "integer point"
type whose x and y fields must be integers.
It allows us to multiply-inherit: An object type C can be simultaneously
compatible with both an object type A with field type map {x→[0..255]}, and
object type B with field type map {x→integer, y→integer}. The type
intersection of A and B, which is the most general type compatible with both A
and B, has field type map {x→[0..255], y→integer}. An instance of this
intersection type is always an instance of A and an instance of B. Similarly,
A and B's type union is the most specific common ancestor of both A and B,
which in this case is the object type with field type map {x→integer}. An
instance of A or B is always an instance of this union type.
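The intersection and union of A and B above can be worked through mechanically. The following Python sketch (not Avail) assumes field types are just inclusive integer ranges written as (low, high) pairs, uses strings in place of atoms, and approximates the union of two ranges by their hull; all function names are hypothetical.

```python
import math

INTEGER = (-math.inf, math.inf)  # stand-in for Avail's "integer"
BYTE = (0, 255)                  # stand-in for [0..255]


def range_intersection(a, b):
    # If the result has low > high it is empty; this sketch does not model that.
    return (max(a[0], b[0]), min(a[1], b[1]))


def range_union(a, b):
    # Simplification: the smallest single range covering both.
    return (min(a[0], b[0]), max(a[1], b[1]))


def type_intersection(a, b):
    """Keep every field from either map; narrow the fields they share."""
    result = dict(a)
    for key, field_type in b.items():
        if key in result:
            result[key] = range_intersection(result[key], field_type)
        else:
            result[key] = field_type
    return result


def type_union(a, b):
    """Keep only the fields both maps share; widen each one."""
    return {key: range_union(a[key], b[key]) for key in a.keys() & b.keys()}


def is_instance(obj, obj_type):
    return all(
        key in obj and low <= obj[key] <= high
        for key, (low, high) in obj_type.items()
    )


A = {"x": BYTE}
B = {"x": INTEGER, "y": INTEGER}

both = type_intersection(A, B)   # fields: x in [0..255], y any integer
either = type_union(A, B)        # fields: x any integer

c = {"x": 7, "y": -3}
print(is_instance(c, both), is_instance(c, A), is_instance(c, B))
```

Intersection takes the union of the field keys and union takes the intersection of the field keys, which is the usual duality for record-like types.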
The class declaration syntax simplifies some of these details, but doesn't
attempt to conceal the semantics. It also automatically generates a variety of
methods like "_'s⁇x". But unlike traditional object oriented languages,
Avail's only boundary of encapsulation is the module, not the class, so we
can't directly have private fields – what would they be private to? Instead,
privacy is handled through the existing uniform mechanism of modular name
visibility. If you can't get to the atom "x", you can't extract the value
associated with that atom from an object. And unlike C++'s privacy-is-access
policy, Avail's privacy-is-visibility policy allows one to have a different x
field (different atoms) for points and for polynomials. If you choose to
inherit from both you might even end up with two x fields. But it's only
someone looking at the object textually that might be confused, never the
runtime machinery.
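A small Python model (not Avail) of the two-x-fields situation, assuming atoms behave like unique key objects that happen to share a printed name. The Atom class and the variable names are hypothetical. Python lets you enumerate a dict's keys, so this model does not capture the visibility restriction itself, only the fact that the two fields never collide.

```python
class Atom:
    """Hypothetical stand-in for an Avail atom. Equality is identity."""

    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return f"<atom {self.name}>"


# Imagine each of these being created privately inside a different module.
point_x = Atom("x")
polynomial_x = Atom("x")

# An object that "inherits from both" carries two distinct x fields.
both = {point_x: 3, polynomial_x: "indeterminate"}

print(len(both))           # two separate fields
print(both[point_x])       # reachable only by code that can see point_x
print(both[polynomial_x])  # reachable only by code that can see polynomial_x
print(list(both))          # a textual dump shows two atoms both named x
```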
> for instance, what’s the difference between an explicit class and a regular
> class?
> (my guess is that an instance of an explicit class has its member slots
> pre-allocated)
An explicit subclass introduces an extra generated atom as a field. The
corresponding value is simply constrained to be that same atom. That's what an
explicit subclass is, but here's why: Sometimes you want to talk about
subclasses that are merely natural specializations of existing classes. Like
in the discussion of point types above, "integer point" would be an ordinary
subclass of "point". All it does is strengthen the type constraints for the x
and y fields. Similarly, adding a "z" field might be sufficient to define a
"three dimensional point". However, if we wanted to introduce "normal point"
whose self-dot-product is always 1, we would probably make it an explicit
subclass of point. The new "explicit-normal point" field would be in the
object type's field type map, so we couldn't accidentally create a normal point
unless we really intended to, since just providing x and y values at object
construction time would no longer be enough. Conversely, an integer point
would automatically be created whenever the coordinates happened to both be
integers. Note that the hidden fields get populated automatically by the
generated constructor method.
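The difference between an ordinary and an explicit subclass can be sketched in Python (not Avail) as follows. This assumes field types are modeled as predicate functions; the names Atom, exactly, new_point and new_normal_point are hypothetical, and the sketch does not try to enforce the self-dot-product constraint itself.

```python
class Atom:
    """Hypothetical stand-in for an Avail atom."""

    def __init__(self, name):
        self.name = name


x, y = Atom("x"), Atom("y")
# The extra atom that the explicit subclass declaration would generate.
explicit_normal_point = Atom("explicit-normal point")


def is_number(value):
    return isinstance(value, (int, float))


def is_integer(value):
    return isinstance(value, int)


def exactly(atom):
    """A field type whose only instance is the given atom."""
    return lambda value: value is atom


point = {x: is_number, y: is_number}
integer_point = {x: is_integer, y: is_integer}  # ordinary subclass
normal_point = {**point, explicit_normal_point: exactly(explicit_normal_point)}


def is_instance(obj, obj_type):
    return all(atom in obj and check(obj[atom]) for atom, check in obj_type.items())


def new_point(px, py):
    return {x: px, y: py}


def new_normal_point(px, py):
    # The generated constructor fills in the hidden field automatically.
    return {x: px, y: py, explicit_normal_point: explicit_normal_point}


p = new_point(1, 0)
n = new_normal_point(1, 0)
print(is_instance(p, integer_point))  # conforms just by having integer fields
print(is_instance(p, normal_point))   # lacks the hidden field
print(is_instance(n, normal_point), is_instance(n, point))
```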
> i also wonder how class fields (smalltalk object slots?) are laid-out in
> memory.
At the moment, an object has a slot that holds the field map, and an object
type has a slot that holds the field type map. It's pretty simple, and not
terribly efficient, but we have some interesting possibilities planned for
optimizing object layouts. And once we start inlining non-primitive methods
and doing general escape analysis, we'll get what we hope is a huge performance
boost for object operations – by not doing them most of the time.
> does an object keep a reference to it’s class (and to it’s meta class,
> recursively)?
A value doesn't have to hold onto its type, since the value can generate it at
will; types never have identity, so asking twice and getting two different
AvailObjects is no big deal. Strictly speaking, every value's exact type is an
"instance type" (the special case of an enumeration type for a single value)
that just refers back to the value itself. We postpone creating such instance
types as long as possible when doing things like type checking and multi-method
dispatching. Each value (via its descriptor) is also intrinsically capable of
producing the most specific non-enumeration type (which the internal
documentation calls a "kind") that it is an instance of. There's quite a
complicated dance in the descriptors to appropriately delegate and
double-dispatch the various type-computing operations efficiently. Values with
different descriptors will use different techniques. The JUnit tests check a
few hundred thousand cases to make sure essential properties like reflexivity
and transitivity of the is-subtype relation hold. These tests have saved our
butts quite effectively in the past, due to the sheer volume and complexity of
the relevant code.
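The flavor of such property checks can be shown at toy scale in Python. This is not the actual JUnit suite; it assumes a tiny universe of object types over two field keys, with inclusive integer ranges as the only field types, and brute-forces reflexivity and transitivity of a hypothetical is_subtype function.

```python
import math
from itertools import product

INTEGER = (-math.inf, math.inf)
BYTE = (0, 255)


def range_is_subtype(a, b):
    return b[0] <= a[0] and a[1] <= b[1]


def is_subtype(sub, sup):
    """sub has every field of sup, each with a narrower (or equal) range."""
    return all(
        key in sub and range_is_subtype(sub[key], field_type)
        for key, field_type in sup.items()
    )


def all_object_types(keys, field_types):
    """Every field type map in which each key is absent or has one of field_types."""
    choices = [None, *field_types]
    for combo in product(choices, repeat=len(keys)):
        yield {key: t for key, t in zip(keys, combo) if t is not None}


types = list(all_object_types(["x", "y"], [BYTE, INTEGER]))

reflexive = all(is_subtype(t, t) for t in types)
transitive = all(
    is_subtype(a, c)
    for a, b, c in product(types, repeat=3)
    if is_subtype(a, b) and is_subtype(b, c)
)
print(len(types), reflexive, transitive)
```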
As it turns out, a variable does have a slot for directly holding onto its
kind, which is always a variable type. A variable type holds onto its content
type, which is the restriction on what values can be placed in the variable.
We allocated the extra slot in variables because (1) profiling showed it was
able to significantly improve performance without having any effect on Avail's
semantics, and (2) both the internal VM and generic operations like "↓_`?=_"
(P_011_SetValue, the runtime-casting assignment operation) need to be able to
query a variable's content type to ensure runtime type safety.
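In Python terms (not the VM's Java), the arrangement looks roughly like the sketch below. The class and method names are hypothetical, and a Python class stands in for the content type; the point is only that the variable holds its kind directly, and that assignment consults the kind's content type before storing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariableType:
    """Hypothetical model of a variable type: it holds a content type."""
    content_type: type


class Variable:
    def __init__(self, kind):
        self.kind = kind        # the extra slot: always a VariableType
        self._value = None
        self._assigned = False

    def set_value(self, value):
        # Runtime-checked assignment: reject values outside the content type.
        if not isinstance(value, self.kind.content_type):
            raise TypeError(
                f"{value!r} is not an instance of {self.kind.content_type.__name__}"
            )
        self._value = value
        self._assigned = True

    def get_value(self):
        if not self._assigned:
            raise ValueError("variable is unassigned")
        return self._value


counter = Variable(VariableType(int))
counter.set_value(5)
print(counter.get_value())

try:
    counter.set_value("cheese")
except TypeError as error:
    print("rejected:", error)
```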
The Bagwell hashed set bins and map bins also have lazy slots for keeping track
of the nearest kind for the contained elements. This significantly improves
the efficiency of type testing large sets and maps, especially when a set or
map is a minor edit away from a previous version for which the nearest kinds
were already computed.
We also cache hashes of those bins in additional slots, which can make
comparison of unequal sets and maps very, very fast.
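The hash-caching idea can be illustrated with a flat Python class. This is not a Bagwell-style bin, and HashedBin and its members are hypothetical names; it only shows a lazily filled hash slot letting an equality test bail out before any element-by-element work.

```python
class HashedBin:
    """Toy immutable collection with a lazily computed, cached hash."""

    deep_comparisons = 0  # counts how often the slow path actually runs

    def __init__(self, elements):
        self._elements = frozenset(elements)
        self._hash = None  # lazy slot, filled in on first request

    def cached_hash(self):
        if self._hash is None:
            combined = 0
            for element in self._elements:
                combined ^= hash(element)  # order-independent combination
            self._hash = combined
        return self._hash

    def __hash__(self):
        return self.cached_hash()

    def __eq__(self, other):
        if not isinstance(other, HashedBin):
            return NotImplemented
        if self.cached_hash() != other.cached_hash():
            return False  # fast path: different hashes mean unequal
        HashedBin.deep_comparisons += 1
        return self._elements == other._elements


original = HashedBin([1, 2, 3])
edited = HashedBin([1, 2, 4])
same = HashedBin([3, 2, 1])

print(original == edited, HashedBin.deep_comparisons)
print(original == same, HashedBin.deep_comparisons)
```

Equal hashes still require the full comparison, so the cache speeds up only the unequal case.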
> how does a class in the avail world get mapped to a class in the jvm world
> (is that even possible, or only for explicit classes)?
Java is utterly incapable of modeling Avail's type system. Within the Avail
VM, the Java code uses AvailObject, A_BasicObject, A_Tuple, A_Number, etc.
Everything else about the Avail type system is merely homogeneous runtime data
as far as Java is concerned. The decision to distance ourselves from the
substrate's type system is what allows Avail to have such a novel type system.
Other languages tend to stay within the literal rules of the substrate's type
system, which I think is a great explanation for the apparent paucity of
interesting developments in type theory (and practice). Electric cars have
taken *forever* to get anywhere partly because of the installed infrastructure
for gasoline delivery, and the tools and techniques for gas vehicles that have
had so much investment over past decades.
> questions… questions …
Keep them coming! The website covers a lot of ground, but for much of the
remainder it's much more efficient for us to answer questions than to just pick
a story and tell it.
