Thanks for taking the time to think this through. Let me make sure I understand. What you are proposing would decouple the runtime from the objects but still make it accessible via the ClassLoader. That seems to be a major part of the problem, but doesn't it still leave the metaclass hierarchy coupled to the object? If we persist (or serialize) the object, we would want to replace the metaclass reference with enough information to be able to restore the metaclass reference when the object is paged back in from disk (or deserialized). I was thinking that we had to decouple the runtime so that we wouldn't be persisting things like threads and decouple the metaclass so that we wouldn't be storing metaclass hierarchy and all of the method dictionaries. Of course, storing the classes and methods would open the door to having a distributed, transactional IDE, so maybe we would want to flush that to the persistent store, too. But it seems like that would not be what we want for serialization.
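To make the metaclass question concrete, here is a minimal Java sketch of what I have in mind: the metaclass reference is transient, only a name is written with the object, and the reference is looked up again lazily after the object is read back. Everything in it (MetaClass, REGISTRY, Obj, the name-based lookup) is a hypothetical stand-in for illustration, not JRuby's actual classes, and it assumes metaclasses have stable names, which is exactly the open question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class MetaclassNameSketch {
    /** Hypothetical stand-in for a metaclass; deliberately NOT Serializable. */
    static final class MetaClass {
        final String name;
        MetaClass(String name) { this.name = name; }
    }

    /** Hypothetical per-VM registry mapping names to live metaclasses. */
    static final Map<String, MetaClass> REGISTRY = new ConcurrentHashMap<>();

    static MetaClass define(String name) {
        return REGISTRY.computeIfAbsent(name, MetaClass::new);
    }

    static final class Obj implements Serializable {
        private static final long serialVersionUID = 1L;

        // Not written to the stream; null again after deserialization.
        private transient MetaClass metaClass;
        // The only type information that is persisted.
        final String metaClassName;
        final String payload;

        Obj(MetaClass metaClass, String payload) {
            this.metaClass = metaClass;
            this.metaClassName = metaClass.name;
            this.payload = payload;
        }

        /** Reconnects lazily, by name, the first time the metaclass is needed. */
        MetaClass getMetaClass() {
            if (metaClass == null) {
                metaClass = REGISTRY.get(metaClassName);
                if (metaClass == null) {
                    throw new IllegalStateException("unknown metaclass: " + metaClassName);
                }
            }
            return metaClass;
        }
    }

    static byte[] serialize(Obj o) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        }
        return buf.toByteArray();
    }

    static Obj deserialize(byte[] bytes) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Obj) in.readObject();
        }
    }
}
```

The sketch sidesteps singleton classes and anonymous classes, which have no obvious name to store; those would need some other identifying information.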

Comments?

On Jul 3, 2007, at 12:33 AM, Charles Oliver Nutter wrote:

Alan McKean wrote:
Since the lack of Java serialization of JRuby objects stops us dead in our tracks when trying to hook up our persistence engine, I am interested in either getting someone on this end to work on it or jumping in myself. In either case, I need some background on the JRuby runtime architecture and some guidance on particular issues. The issues are about how to detach an object from its runtime elements and how to restore them when the object gets reloaded into memory:

1) When we first tried saving a JRuby object to our database, we saw it drag along a gaggle of runtime objects. Given that it might be loaded into a different VM when it is brought in from the database, is reconnecting the object to a particular runtime important? If so, is there a way of determining which of the available runtimes would be best to connect it to?

2) Detaching an object from its 'runtime' variable and making the 'metaclass' variable transient lets us store the object in our database without dragging much else along. But we need to reconnect things when the object is reloaded into memory. Is there a canonical name for the metaclass that we could store in the database along with the instance? If not, what information is available for reconnecting?

We persist type information in our Java product by storing the fully-qualified name of the class with the object, then lazily loading and initializing the connection (using the name) when we reload the object into memory. Will this work in JRuby? If someone has thought through a strategy for deserializing a JRuby object and restoring its connections to its runtime, I would love to hear about it.

I've been looking into this a bit tonight. This email represents me rambling.

Marking metaclass and finalizer as transient are no-brainers. I'm going to go ahead and commit that.

I'm going to see what would be needed to remove getRuntime everywhere it's needed. It would be a big job...be back in a few minutes...

...ok I'm back. I think it's doable. Here's more rambling thoughts.

The runtime connection is used for a few things:

1. to construct other objects

This is mostly a self-fulfilling prophecy. Objects require a runtime when they're created, so all objects need runtime available to create objects. If we break that chain, a number of places that depend on runtime disappear.

2. to locate classes in order to construct objects

This is a little harder to eliminate. In order to construct a Ruby "String" object, you need to have access to the "String" metaclass. That means having access to the place where the "String" metaclass is stored, currently in the runtime. Again, this is largely self-fulfilling; you need access to a metaclass to construct an object, so you need to locate the metaclass, and since the metaclasses are currently rooted in the runtime, you need the runtime. But the runtime dependency is largely peripheral to the use case.

3. to access runtime-global and thread-local data at execution time

This is probably the hardest to eliminate. Every thread Ruby code creates or encounters is associated with a ThreadContext, which contains extra thread-local state needed for executing Ruby code. Every external thread that touches a given runtime is "adopted" and given a ThreadContext and a Ruby "Thread" avatar to represent it. So a given Java thread may have many Ruby "Thread" and "ThreadContext" objects associated with it, one pair per runtime it has touched. This allows us to share threads across runtimes, rather than having a given thread bound to a given runtime exclusively, as in many other JVM languages. But it also requires that we locate the runtime first, and the ThreadContext through it. Therefore, we have the runtime dependency.
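As a rough illustration of the "one ThreadContext per runtime per Java thread" arrangement, here is a minimal Java sketch. MiniRuntime and this ThreadContext are hypothetical stand-ins, not JRuby's real classes, and giving each runtime its own ThreadLocal is just one assumed way to get that mapping.

```java
final class ThreadAdoptionSketch {
    /** Hypothetical per-thread execution state for one runtime. */
    static final class ThreadContext {
        final Thread owner;
        ThreadContext(Thread owner) { this.owner = owner; }
    }

    /** Hypothetical runtime; each instance keeps its own thread-local contexts. */
    static final class MiniRuntime {
        // A separate ThreadLocal per runtime: the same Java thread gets a
        // distinct ThreadContext in every runtime it touches.
        private final ThreadLocal<ThreadContext> current =
            ThreadLocal.withInitial(() -> new ThreadContext(Thread.currentThread()));

        /** "Adopts" the calling thread on first use, then returns the same context. */
        ThreadContext getCurrentContext() {
            return current.get();
        }
    }
}
```

Note that the lookup starts from the runtime: code holding only a Java thread cannot find its ThreadContext without first knowing which runtime it is executing in.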

This essentially sums up all the major reasons why we have so many dependencies in code on access to a runtime object. And ultimately, requiring access to a runtime object obliterates the possibility of third-party manipulation and transport of Ruby objects.

So to summarize, the three actual reasons we depend on runtime being present are as follows:

1. to access and maintain types associated with a specific ruby worldspace
2. to access and maintain state associated with a specific ruby worldspace
3. to provide execution state and primitives for code running in a specific ruby worldspace

Now let's rewrite the list by substituting in a different concept for our top-level ruby worldspace:

1. to access and maintain types associated with a specific ClassLoader
2. to access and maintain state associated with a specific ClassLoader
3. to provide execution state and primitives for code running in a specific ClassLoader

So let's examine how we'd solve these issues.

First off, IRubyObject.getRuntime(). Let's assume that the classloader that loads the Ruby class is our chosen, ultimate worldspace:

public Ruby getRuntime() {
  // Assumes Ruby.class was loaded by a JRubyClassLoader; the cast fails otherwise.
  JRubyClassLoader cl = (JRubyClassLoader)Ruby.class.getClassLoader();
  return cl.getRuntime();
}

Everything else largely falls out of this. Starting up a new instance of JRuby largely becomes the act of constructing the top-level classloader in which it will live and telling it to "go".

JRubyClassLoader cl = new JRubyClassLoader(..., properties);
cl.evalScript("puts 'hello'", "(eval)");

Everything lives underneath the classloader, and since all classes have access to that classloader, all code can retrieve the runtime associated with it.
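Here is a small self-contained Java sketch of that lookup, under the assumption that the worldspace hangs off a custom classloader and that code finds it by walking up the parent chain. World, WorldClassLoader, and worldOf are hypothetical names for illustration; no JRubyClassLoader with this shape exists yet.

```java
final class ClassLoaderWorldSketch {
    /** Hypothetical stand-in for the runtime / worldspace. */
    static final class World {
        final String name;
        World(String name) { this.name = name; }
    }

    /** Hypothetical classloader that owns exactly one worldspace. */
    static final class WorldClassLoader extends ClassLoader {
        private final World world;

        WorldClassLoader(ClassLoader parent, String name) {
            super(parent);
            this.world = new World(name);
        }

        World getWorld() { return world; }
    }

    /**
     * Finds the nearest worldspace at or above the given loader,
     * or returns null if none of its ancestors is a WorldClassLoader.
     */
    static World worldOf(ClassLoader loader) {
        for (ClassLoader c = loader; c != null; c = c.getParent()) {
            if (c instanceof WorldClassLoader) {
                return ((WorldClassLoader) c).getWorld();
            }
        }
        return null;
    }
}
```

Walking the parent chain, rather than casting the immediate loader, is a design choice in the sketch: it lets classes defined by child loaders under the worldspace loader still find their runtime.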

Would something like this work? The view from inside the classloader seems pretty reasonable...we already have this root context and partitioning as part of Java's classloader support, and it seems fairly natural to use it. But I'm not well-enough versed in Java serialization to know if this will solve our deserialization issues. It may require you to have more control over the object stream...but of course if you have control over the object stream, you could also just have it ask a specific runtime to unmarshal objects, avoiding the issue completely.
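For the "control over the object stream" option, a minimal Java sketch might look like the following: a custom ObjectInputStream carries the target runtime, and each object's readObject hook picks it up from the stream. MiniRuntime, RuntimeObjectInputStream, and Obj are hypothetical names; this assumes whoever deserializes is willing to use the custom stream class.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

final class RuntimeAwareStreamSketch {
    /** Hypothetical stand-in for the runtime; never serialized. */
    static final class MiniRuntime {
        final String id;
        MiniRuntime(String id) { this.id = id; }
    }

    /** Stream that knows which runtime the objects should be attached to. */
    static final class RuntimeObjectInputStream extends ObjectInputStream {
        final MiniRuntime runtime;

        RuntimeObjectInputStream(InputStream in, MiniRuntime runtime) throws IOException {
            super(in);
            this.runtime = runtime;
        }
    }

    static final class Obj implements Serializable {
        private static final long serialVersionUID = 1L;

        transient MiniRuntime runtime; // skipped on write
        final String value;

        Obj(MiniRuntime runtime, String value) {
            this.runtime = runtime;
            this.value = value;
        }

        // Standard serialization hook: restore fields, then reattach the runtime
        // if the caller supplied one through the stream.
        private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
            in.defaultReadObject();
            if (in instanceof RuntimeObjectInputStream) {
                runtime = ((RuntimeObjectInputStream) in).runtime;
            }
        }
    }

    static byte[] write(Obj o) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(o);
        }
        return buf.toByteArray();
    }

    static Obj read(byte[] bytes, MiniRuntime target) throws IOException, ClassNotFoundException {
        try (RuntimeObjectInputStream in =
                 new RuntimeObjectInputStream(new ByteArrayInputStream(bytes), target)) {
            return (Obj) in.readObject();
        }
    }
}
```

With a plain ObjectInputStream the runtime field simply stays null in this sketch, which is the case the classloader lookup would have to cover.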

Thoughts? More ideas?

- Charlie


---------------------------------------------------------------------
To unsubscribe from this list please visit:

   http://xircles.codehaus.org/manage_email

