[jvm-l] Re: Attributes and items

Charles Oliver Nutter Tue, 25 Mar 2008 12:27:54 -0700

Comments inline below...

Attila Szegedi wrote:
> So, here I am trying to further my metaobject protocol library. In  
> order to get further with it, I tried the eat-my-own-dog-food approach  
> and decided to - after having written a MOP for POJOs - to try and  
> write actual MOP implementations for some dynamic language  
> implementations, most notably, Jython and JRuby.


Excellent. JRuby 1.1 will be out soon, and then this sort of work (and 
the improved integration with Java and other languages it brings) will 
become a high priority for us as well.

> And pretty soon, I hit an obvious design problem (which is okay,  
> really - this is still an exercise in exploring the right approach).  
> Namely, lots of languages actually have two namespaces when it comes  
> to accessing data belonging to an object. I'll refer to one of them as  
> "attributes" (as that's what they're called in both Ruby and Python)  
> and the other are "items", which are elements of some container  
> object. All objects have attributes, but only some objects  
> (containers) have items.

Some clarification for Ruby. Ruby doesn't distinguish between the public 
representation of attributes and methods. Attributes *are* just accessor 
methods that return values. There's no way to iterate only attributes or 
only methods, because they are the same structure and stored in the same 
way. But Ruby does have a concept of instance variables, represented as 
always-protected entries in a per-object dictionary. The set of instance 
variables is not determined ahead of time, and can grow as the program 
runs. The only way to access an object's instance variables is from 
within that object...or by wiring up attribute accessors (which just 
create methods).
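
To make that concrete, here is a minimal sketch. The Point class and its 
tag! method are made up for illustration; only attr_accessor, 
public_instance_methods, instance_variables and respond_to? are core Ruby.

```ruby
# Hypothetical class, for illustration only.
class Point
  attr_accessor :x          # defines two ordinary methods: x and x=

  def initialize(x)
    @x = x                  # instance variable, created on first assignment
  end

  def tag!(label)
    @label = label          # a new instance variable appears at run time
  end
end

pt = Point.new(1)

# The "attribute" shows up as plain methods next to tag!.
accessor_methods = Point.public_instance_methods(false).sort

ivars_before = pt.instance_variables
pt.tag!("origin")
ivars_after = pt.instance_variables

# @label has no accessor, so there is no public way to read it by name.
label_visible = pt.respond_to?(:label)

p accessor_methods   # [:tag!, :x, :x=]
p ivars_before       # [:@x]
p ivars_after        # [:@x, :@label]
p label_visible      # false
```

Nothing in the method list marks x and x= as "attributes" rather than 
methods, which is why a methods-only view is probably enough for Ruby.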

I think the MOP would be entirely satisfactory if it represented only 
methods for Ruby, but I understand this may not be the case for other 
languages.

> Some languages don't make a distinction, most notably, JavaScript. In  
> JavaScript, all objects are containers and they only have items (and  
> an item can be a function, in which case it functions as a method on  
> the object). Can't get much more generic than that, right? Other  
> languages (Ruby, Python) will distinguish between the two; the  
> containers are arrays/lists and hashes/dictionaries/maps. As a matter  
> of fact, it helps thinking of Java as having the distinction -  
> JavaBeans properties are the attributes, and arrays, Maps, and Lists  
> will have items.

It's also important to note here that in general Ruby is more about 
"maps" than "lists". There is a core class that's a list, but methods 
and instance variables and global variables and constants are all held 
in hash-like structures. But true, they're not all the same exact 
structure, and most of them you can't really access as primitive data 
structures.

> Now, to make matters a bit more complicated, in lots of languages the  
> container API is actually just a syntactic sugar. Give an object a []  
> and a []= method, and it's a container in Ruby! Give it __getitem__,  
> __setitem__, and few others, and it's a container in Python! Honestly,  
> this is okay - as a byproduct of duck typing, one shouldn't expect  
> there be any sort of an explicit declaration of "containerness", right?

In Ruby, the container API is defined by more than [], really. [] and 
[]= are just method calls...syntactic sugar for foo.[](key) and 
foo.[]=(key,value). They may make an object look like a collection, but 
they don't mean it *is* a collection. And they're frequently used for 
other syntactic magic. However...your idea of defining some of 
these collection operations as an additional protocol across languages 
is a great one. The CLR already has low-level operations to abstract 
collection gets/sets, which get translated into collection operations 
across languages. So if a language is defined to handle them, its 
collections are transportable to other languages without too much fuss.
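
A small sketch of the [] point, assuming a made-up Settings class. The 
explicit foo.[](key) call form and Proc#[] are core Ruby; everything else 
here is hypothetical.

```ruby
# Hypothetical class: it answers [] and []= but is not a collection.
class Settings
  def initialize
    @values = {}
  end

  def [](key)
    @values.fetch(key.to_s, "unset")
  end

  def []=(key, value)
    @values[key.to_s] = value.to_s
  end
end

s = Settings.new
s[:color] = "red"            # sugar for s.[]=(:color, "red")
s.[]=(:size, 10)             # the same method call without the sugar

sugar    = s[:color]         # sugar for s.[](:color)
explicit = s.[](:size)

# It looks like a collection, but it is not Enumerable and has no each.
looks_like_collection = s.is_a?(Enumerable) || s.respond_to?(:each)

# [] used for something unrelated to containers: Proc#[] just calls the proc.
square  = ->(n) { n * n }
squared = square[4]

p sugar                   # "red"
p explicit                # "10"
p looks_like_collection   # false
p squared                 # 16
```

So a MOP that treated "responds to []" as "is a container" would 
misclassify both Settings and Proc here.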

> Bottom line is, I feel this is a big deal to solve in interoperable  
> manner, as the raison d'être of the MOP would be to allow  
> interoperability between programs written in different languages  
> within a single JVM; I imagine in most cases the programs will pass  
> complex data structures built out of lists and dictionaries to one  
> another, so it feels... essential to get this right. It also feels  
> like something that can rightfully belong in a generic MOP as most  
> languages do have the concept of ordered sequences and associative  
> arrays. Of course, I might also be wrong here; it is also an essential  
> goal to not end up with a baroque specification that contains  
> everything plus the kitchen sink. I.e. you might notice that my  
> current MOP effort for now doesn't have a concept of the class, as not  
> all languages have a class concept; funnily enough the concept of  
> sequences and maps actually looks more important for interop (since  
> it's more general) to me right now than the concept of a class.

In Ruby, I think the problem is not as bad as you think. Almost everyone 
working with collections in Ruby uses Array and Hash or descendants of 
them, since they have everything you'd need from such data structures 
plus nice literal syntaxes. Supporting the concept of lists and 
dictionaries in the MOP as being Array and Hash in Ruby would obviously 
be a minimum requirement...but it might also be a suitable maximum as 
well. Far more interesting and powerful, I think, is how the MOP can be 
appropriately wired in to the formal coercion protocols of a given 
language, such as the to_* methods in Ruby. These coercion methods take 
two forms...those typically used by programmers explicitly to coerce 
values (to_i, to_s, to_a, ...) and those mostly used internally for 
implicit coercion (to_int, to_str, to_ary). The protocols aren't 100% 
defined (because nothing is in Ruby), but they are fairly well 
understood and key to type interop in Ruby...and therefore key to 
language interop in the MOP.
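
Here is a rough sketch of the two forms. Celsius, PathName and Pair are 
invented names; the behavior shown (String#+ calling to_str, multiple 
assignment calling to_ary, interpolation calling to_s) is standard Ruby.

```ruby
# Hypothetical classes, for illustration only.
class Celsius
  def initialize(deg)
    @deg = deg
  end

  def to_s                 # explicit conversion only
    "#{@deg} C"
  end
end

class PathName
  def initialize(path)
    @path = path
  end

  def to_str               # implicit conversion: "I can stand in for a String"
    @path
  end
end

class Pair
  def initialize(left, right)
    @left = left
    @right = right
  end

  def to_ary               # implicit conversion: "I can stand in for an Array"
    [@left, @right]
  end
end

joined        = "dir: " + PathName.new("/tmp")   # String#+ calls to_str
first, second = Pair.new(1, 2)                   # multiple assignment calls to_ary
interpolated  = "temp: #{Celsius.new(20)}"       # interpolation calls to_s

# An object with only to_s is not implicitly converted by String#+.
begin
  "temp: " + Celsius.new(20)
  explicit_only_rejected = false
rescue TypeError
  explicit_only_rejected = true
end

p joined                   # "dir: /tmp"
p [first, second]          # [1, 2]
p interpolated             # "temp: 20 C"
p explicit_only_rejected   # true
```

A foreign object exposed through the MOP could presumably answer to_str 
or to_ary in the same way and be accepted by Ruby core methods with no 
further glue.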

> So, here am I wondering whether this is something that can be made  
> sufficiently unified across the languages to the point that if a Ruby  
> program is given a Python dictionary, and it calls []= on it, it  
> actually ends up being translated into a __setitem__ call. The goal  
> seems worthwhile, and is certainly possible but I'm not entirely sure  
> how much of an effort will it take. There's only one way to find out  
> though (doing it), but I'd really appreciate some debate and feedback  
> here before I embark on this.

I think perhaps we want to do a quick survey of how most of the key 
languages represent collections and try to find the commonality. Where 
possible, we should always represent collections in a way that all 
languages can see them as such and use them as such, but we also need to 
define a bit more clearly where the demarcation is between the low-level 
"collection" concept and higher-level duck-typed operations, so we don't 
pull in too much for too little gain.

- Charlie

