Comments inline below...

Attila Szegedi wrote:
> So, here I am trying to further my metaobject protocol library. In
> order to get further with it, I tried the eat-my-own-dog-food approach
> and decided to - after having written a MOP for POJOs - to try and
> write actual MOP implementations for some dynamic language
> implementations, most notably, Jython and JRuby.

Excellent. JRuby 1.1 will be out soon, and then this sort of work (and
the improved integration with Java and other languages it brings) will
become a high priority for us as well.

> And pretty soon, I hit an obvious design problem (which is okay,
> really - this is still an exercise in exploring the right approach).
> Namely, lots of languages actually have two namespaces when it comes
> to accessing data belonging to an object. I'll refer to one of them as
> "attributes" (as that's what they're called in both Ruby and Python)
> and the other are "items", which are elements of some container
> object. All objects have attributes, but only some objects
> (containers) have items.

Some clarification for Ruby. Ruby doesn't distinguish between the public
representation of attributes and methods. Attributes *are* just accessor
methods that return values. There's no way to iterate only attributes or
only methods, because they are the same structure and stored in the same
way.

But Ruby does have a concept of instance variables, represented as
always-protected entries in a per-object dictionary. The set of instance
variables is not determined ahead of time, and can grow as the program
runs. The only way to access an object's instance variables is from
within that object...or by wiring up attribute accessors (which just
creates methods).

I think the MOP would be entirely satisfactory if it represented only
methods for Ruby, but I understand this may not be the case for other
languages.

> Some languages don't make a distinction, most notably, JavaScript. In
> JavaScript, all objects are containers and they only have items (and
> an item can be a function, in which case it functions as a method on
> the object). Can't get much more generic than that, right? Other
> languages (Ruby, Python) will distinguish between the two; the
> containers are arrays/lists and hashes/dictionaries/maps. As a matter
> of fact, it helps thinking of Java as having the distinction -
> JavaBeans properties are the attributes, and arrays, Maps, and Lists
> will have items.

It's also important to note here that in general Ruby is more about
"maps" than "lists". There is a core class that's a list, but methods
and instance variables and global variables and constants are all held
in hash-like structures. But true, they're not all the same exact
structure, and most of them you can't really access as primitive data
structures.

> Now, to make matters a bit more complicated, in lots of languages the
> container API is actually just a syntactic sugar. Give an object a []
> and a []= method, and it's a container in Ruby! Give it __getitem__,
> __setitem__, and few others, and it's a container in Python! Honestly,
> this is okay - as a byproduct of duck typing, one shouldn't expect
> there be any sort of an explicit declaration of "containerness", right?

In Ruby, the container API is defined by more than [], really. [] and
[]= are just method calls...syntactic sugar for foo.[](key) and
foo.[]=(key,value). They may make an object look like a collection, but
they don't mean it *is* a collection. And they're frequently used for
other syntactic magic.

However...your thought about defining some of these collection
operations as an additional protocol across languages is a great one.
The CLR already has low-level operations to abstract collection
gets/sets, which get translated into collection operations across
languages. So if a language is defined to handle them, its collections
are transportable to other languages without too much fuss.

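To make the point about [] and []= concrete, here is a minimal Ruby
sketch. The Settings class is made up for illustration and is not part
of any MOP or of JRuby; it only shows that the bracket syntax dispatches
to ordinary methods, and that answering to [] says little about whether
an object is a collection.

```ruby
# Hypothetical class, for illustration only: it answers [] and []= but
# offers no each, no size, and does not include Enumerable.
class Settings
  def initialize
    @values = {}
  end

  # Invoked for settings[key]
  def [](key)
    @values[key.to_s]
  end

  # Invoked for settings[key] = value
  def []=(key, value)
    @values[key.to_s] = value
  end
end

settings = Settings.new
settings[:color] = "red"                # sugar for settings.[]=(:color, "red")
sugar    = settings[:color]             # sugar for settings.[](:color)
explicit = settings.[](:color)          # the same call without the sugar
dynamic  = settings.send(:[], :color)   # and again, as a plain dynamic send

# [] also shows up where no container is involved at all:
add_one = proc { |n| n + 1 }
result  = add_one[41]                   # Proc#[] simply calls the proc

looks_like_container = settings.respond_to?(:[])
is_enumerable        = settings.is_a?(Enumerable)
proc_has_brackets    = add_one.respond_to?(:[])
```

A MOP that treated "responds to []" as "is a container" would classify
both Settings and a Proc as containers here, which is one reason to
define the collection protocol separately from the bracket methods.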
> Bottom line is, I feel this is a big deal to solve in interoperable
> manner, as the raison d'être of the MOP would be to allow
> interoperability between programs written in different languages
> within a single JVM; I imagine in most cases the programs will pass
> complex data structures built out of lists and dictionaries to one
> another, so it feels... essential to get this right. It also feels
> like something that can rightfully belong in a generic MOP as most
> languages do have the concept of ordered sequences and associative
> arrays. Of course, I might also be wrong here; it is also an essential
> goal to not end up with a baroque specification that contains
> everything plus the kitchen sink. I.e. you might notice that my
> current MOP effort for now doesn't have a concept of the class, as not
> all languages have a class concept; funnily enough the concept of
> sequences and maps actually looks more important for interop (since
> it's more general) to me right now than the concept of a class.

In Ruby, I think the problem is not as bad as you think. Almost everyone
working with collections in Ruby uses Array and Hash or descendants of
them, since they have everything you'd need from such data structures
plus nice literal syntaxes. Supporting the concept of lists and
dictionaries in the MOP as being Array and Hash in Ruby would obviously
be a minimum requirement...but it might also be a suitable maximum as
well.

Far more interesting and powerful, I think, is how the MOP can be
appropriately wired in to the formal coercion protocols of a given
language, such as the to_* methods in Ruby. These coercion methods take
two forms...those typically used by programmers explicitly to coerce
values (to_i, to_s, to_a, ...) and those mostly used internally for
implicit coercion (to_int, to_str, to_ary).
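
The difference between the two families can be sketched in plain Ruby.
The Celsius, Label, and Pair classes below are invented for this
example; the only real API involved is the core behaviour of Array#[],
String#+, and multiple assignment, which look for to_int, to_str, and
to_ary respectively.

```ruby
# Hypothetical wrapper type, used only to illustrate the two families.
class Celsius
  def initialize(degrees)
    @degrees = degrees
  end

  # Explicit conversions: called by the programmer on purpose.
  def to_i
    @degrees.to_i
  end

  def to_s
    "#{@degrees} C"
  end

  # Implicit conversions: called by core methods when they need a real
  # Integer or String and were handed something else.
  def to_int
    @degrees.to_i
  end

  def to_str
    to_s
  end
end

temp = Celsius.new(2)

explicit_int = temp.to_i                      # programmer asks for an Integer
picked       = %w[zero one two three][temp]   # Array#[] calls to_int
joined       = "Reading: " + temp             # String#+ calls to_str

# An object with only the explicit form is not coerced implicitly.
class Label
  def to_s
    "label"
  end
end

begin
  "Name: " + Label.new
  implicit_failed = false
rescue TypeError
  implicit_failed = true
end

# to_ary is what lets an object be destructured like an Array.
class Pair
  def initialize(left, right)
    @left = left
    @right = right
  end

  def to_ary
    [@left, @right]
  end
end

first, second = Pair.new(:a, :b)              # multiple assignment calls to_ary
```

For a MOP, the implicit methods are probably the more useful hook: an
object from another language that should pass as a Ruby list would need
to answer to_ary, not just to_a.
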
The protocols aren't 100% defined (because nothing is in Ruby), but they
are fairly well understood and key to type interop in Ruby...and
therefore key to language interop in the MOP.

> So, here am I wondering whether this is something that can be made
> sufficiently unified across the languages to the point that if a Ruby
> program is given a Python dictionary, and it calls []= on it, it
> actually ends up being translated into a __setitem__ call. The goal
> seems worthwhile, and is certainly possible but I'm not entirely sure
> how much of an effort will it take. There's only one way to find out
> though (doing it), but I'd really appreciate some debate and feedback
> here before I embark on this.

I think perhaps we want to do a quick survey of how most of the key
languages represent collections and try to find the commonality. Where
possible, we should always represent collections in a way that all
languages can see them as such and use them as such, but we also need to
define a bit more clearly where the demarcation is between the low-level
"collection" concept and higher-level duck-typed operations, so we don't
pull in too much for too little gain.

- Charlie

--
You received this message because you are subscribed to the Google
Groups "JVM Languages" group.
For more options, visit this group at
http://groups.google.com/group/jvm-languages?hl=en
