Thanks for taking the time to think this through. Let me make sure I
understand. What you are proposing would decouple the runtime from
the objects but still make it accessible via the ClassLoader. That
seems to be a major part of the problem, but doesn't it still leave
the metaclass hierarchy coupled to the object? If we persist (or
serialize) the object, we would want to replace the metaclass
reference with enough information to be able to restore the metaclass
reference when the object is paged back in from disk (or
deserialized). I was thinking that we had to decouple the runtime so
that we wouldn't be persisting things like threads, and decouple the
metaclass so that we wouldn't be storing the metaclass hierarchy and all
of the method dictionaries. Of course, storing the classes and
methods would open the door to having a distributed, transactional
IDE, so maybe we would want to flush that to the persistent store,
too. But it seems like that would not be what we want for serialization.
Comments?
On Jul 3, 2007, at 12:33 AM, Charles Oliver Nutter wrote:
Alan McKean wrote:
Since the lack of Java serialization of JRuby objects stops us
dead in our tracks when trying to hook up our persistence engine,
I am interested in either getting someone on this end to work on
it or jumping in myself. In either case, I need some background on
the JRuby runtime architecture and some guidance on particular
issues. The issues are about how to detach an object from its
runtime elements and how to restore them when the object gets
reloaded into memory:
1) When we first tried saving a JRuby object to our database, we
saw it drag along a gaggle of runtime objects. Given that it might
be loaded into a different VM when it is brought in from the
database, is reconnecting the object to a particular runtime important? If
so, is there a way of determining which of the available runtimes
would be best to connect it to?
2) Detaching an object from its 'runtime' variable and making the
'metaclass' variable transient lets us store the object in our
database without dragging much else along. But we need to
reconnect things when the object is reloaded into memory. Is there
a canonical name for the metaclass that we could store in the
database along with the instance? If not, what information is
available for reconnecting? We persist type information in our
Java product by storing the fully-qualified name of the class with
the object, then lazily loading and initializing the connection
(using the name) when we reload the object to memory. Will this
work in JRuby?
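
A minimal sketch of the store-the-name approach described above, in plain
Java. Every type here (MetaClassSketch, MetaClassRegistry,
PersistentObjectSketch) is a hypothetical stand-in and not part of JRuby.
It assumes the metaclass has a stable name to look up, which would not
hold for anonymous or singleton classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a Ruby metaclass (not JRuby's RubyClass).
final class MetaClassSketch {
    final String name;

    MetaClassSketch(String name) {
        this.name = name;
    }
}

// Hypothetical name-to-metaclass table, standing in for whatever runtime
// the object is reloaded into.
final class MetaClassRegistry {
    private static final Map<String, MetaClassSketch> BY_NAME = new ConcurrentHashMap<>();

    static MetaClassSketch lookup(String name) {
        return BY_NAME.computeIfAbsent(name, MetaClassSketch::new);
    }
}

final class PersistentObjectSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String metaClassName;          // stored with the instance
    private transient MetaClassSketch metaClass; // skipped by serialization
    final String payload;

    PersistentObjectSketch(MetaClassSketch metaClass, String payload) {
        this.metaClassName = metaClass.name;
        this.metaClass = metaClass;
        this.payload = payload;
    }

    // Lazily reconnect by name the first time the metaclass is needed.
    MetaClassSketch getMetaClass() {
        if (metaClass == null) {
            metaClass = MetaClassRegistry.lookup(metaClassName);
        }
        return metaClass;
    }

    // Serialize to bytes and read back, as a database round trip would.
    static PersistentObjectSketch roundTrip(PersistentObjectSketch in) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(in);
            }
            try (ObjectInputStream src =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (PersistentObjectSketch) src.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Only the name and the instance data reach the stream; the metaclass
hierarchy and method dictionaries stay behind, and the reference is
rebuilt on first use after reload.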
If someone has thought through a strategy for deserializing a
JRuby object and restoring its connections to its runtime, I would
love to hear about it.
I've been looking into this a bit tonight. This email represents me
rambling.
Marking metaclass and finalizer as transient are no-brainers. I'm
going to go ahead and commit that.
I'm going to see what would be needed to remove the dependency on
getRuntime everywhere it's used. It would be a big job...be back in a few
minutes...
...ok I'm back. I think it's doable. Here are more rambling thoughts.
The runtime connection is used for a few things:
1. to construct other objects
This is mostly a self-fulfilling prophecy. Objects require a
runtime when they're created, so all objects need a runtime available
to create objects. If we break that chain, a number of places that
depend on runtime disappear.
2. to locate classes in order to construct objects
This is a little harder to eliminate. In order to construct a Ruby
"String" object, you need to have access to the "String" metaclass.
That means having access to the place where the "String" metaclass
is stored, currently in the runtime. Again, this is largely self-
fulfilling; you need access to a metaclass to construct an object,
so you need to locate the metaclass, and since the metaclasses are
currently rooted in the runtime, you need the runtime. But the
runtime dependency is largely peripheral to the use case.
3. to access runtime-global and thread-local data at execution time
This is probably the hardest to eliminate. Every thread Ruby code
creates or encounters is associated with a ThreadContext, which
contains extra thread-local state needed for executing Ruby code.
Every external thread that touches a given runtime is "adopted" and
given a ThreadContext and a Ruby "Thread" avatar to represent it.
So a given Java thread may have many Ruby "Thread" and
"ThreadContext" objects associated with it, one per runtime it has touched.
This allows us to share threads across runtimes, rather than having
a given thread bound to a given runtime exclusively, as in many
other JVM languages. But it also requires that we locate the
runtime, and therefore the ThreadContext, in a different way.
Therefore, we have the runtime dependency.
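
To make the third point concrete, here is a rough sketch of "one
ThreadContext per runtime per Java thread". RuntimeSketch and
ThreadContextSketch are hypothetical names, and a per-runtime ThreadLocal
is only one assumed way to get this behavior, not a description of how
JRuby actually implements adoption.

```java
// Hypothetical per-thread execution state for one runtime.
final class ThreadContextSketch {
    final RuntimeSketch runtime;
    final Thread javaThread;

    ThreadContextSketch(RuntimeSketch runtime, Thread javaThread) {
        this.runtime = runtime;
        this.javaThread = javaThread;
    }
}

// Hypothetical runtime that "adopts" any Java thread that touches it.
final class RuntimeSketch {
    // Each runtime owns its own ThreadLocal, so a single Java thread ends up
    // with one context per runtime it has touched.
    private final ThreadLocal<ThreadContextSketch> current =
        ThreadLocal.withInitial(() -> new ThreadContextSketch(this, Thread.currentThread()));

    // The first call on a given thread adopts it; later calls on the same
    // thread return the same context.
    ThreadContextSketch getCurrentContext() {
        return current.get();
    }
}
```

Note that finding the context still starts from the runtime, which is the
dependency the paragraph above describes.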
This essentially sums up all the major reasons why we have so many
dependencies in code on access to a runtime object. And ultimately,
requiring access to a runtime object obliterates the possibility of
third-party manipulation and transport of Ruby objects.
So to summarize, the three actual reasons we depend on runtime
being present are as follows:
1. to access and maintain types associated with a specific ruby
worldspace
2. to access and maintain state associated with a specific ruby
worldspace
3. to provide execution state and primitives for code running in a
specific ruby worldspace
Now let's rewrite the list by substituting in a different concept
for our top-level ruby worldspace:
1. to access and maintain types associated with a specific ClassLoader
2. to access and maintain state associated with a specific
ClassLoader
3. to provide execution state and primitives for code running in a
specific ClassLoader
So let's examine how we'd solve these issues.
First off, IRubyObject.getRuntime(). Let's assume that the
classloader that loads the Ruby class is our chosen, ultimate
worldspace:
    public Ruby getRuntime() {
        JRubyClassLoader cl = (JRubyClassLoader) Ruby.class.getClassLoader();
        return cl.getRuntime();
    }
Everything else largely falls out of this. Starting up a new
instance of JRuby largely becomes the act of constructing the top-
level classloader in which it will live and telling it to "go".
    JRubyClassLoader cl = new JRubyClassLoader(..., properties);
    cl.evalScript("puts 'hello'", "(eval)");
Everything lives underneath the classloader, and since all classes
have access to that classloader, all code can retrieve the runtime
associated with it.
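
A self-contained sketch of the lookup idea, under the assumption that the
runtime hangs off a custom ClassLoader. WorldRuntime,
RuntimeClassLoaderSketch and RuntimeLocator are hypothetical names, not
JRuby API. Unlike a direct cast, this version walks the parent chain and
returns null if no runtime-holding loader is found.

```java
// Hypothetical stand-in for the runtime ("worldspace") object.
final class WorldRuntime {
    final String label;

    WorldRuntime(String label) {
        this.label = label;
    }
}

// Hypothetical loader that owns exactly one runtime.
class RuntimeClassLoaderSketch extends ClassLoader {
    private final WorldRuntime runtime;

    RuntimeClassLoaderSketch(ClassLoader parent, WorldRuntime runtime) {
        super(parent);
        this.runtime = runtime;
    }

    WorldRuntime getRuntime() {
        return runtime;
    }
}

final class RuntimeLocator {
    // Walk up the loader chain until a runtime-holding loader is found.
    static WorldRuntime find(ClassLoader start) {
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            if (cl instanceof RuntimeClassLoaderSketch) {
                return ((RuntimeClassLoaderSketch) cl).getRuntime();
            }
        }
        return null; // loaded outside any runtime-holding loader
    }

    // Convenience: start from the loader that defined the given class.
    static WorldRuntime forClass(Class<?> type) {
        return find(type.getClassLoader());
    }
}
```

Classes defined by the bootstrap loader report a null ClassLoader, so
code loaded outside the runtime's loader gets no runtime at all; any real
design along these lines would need an answer for that case.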
Would something like this work? The view from inside the
classloader seems pretty reasonable...we already have this root
context and partitioning as part of Java's classloader support, and
it seems fairly natural to use it. But I'm not well-enough versed
in Java serialization to know if this will solve our
deserialization issues. It may require you to have more control
over the object stream...but of course if you have control over the
object stream, you could also just have it ask a specific runtime
to unmarshal objects, avoiding the issue completely.
Thoughts? More ideas?
- Charlie
---------------------------------------------------------------------
To unsubscribe from this list please visit:
http://xircles.codehaus.org/manage_email