Hi Dennis,

Reasoning, and hopefully the whys, are below.

Dennis Reedy wrote:
Hi Peter,

I was hoping to take a step back for a second; perhaps it's just me that seems 
to have my head spinning of late on this list. I may have missed some things, 
but we've discussed many issues over the past week:

- How to advertise the DL jar(s) a service vends, allowing a client to download 
requisite jars that allow the jars to be loaded from a local (trusted) location
Yes, we can use an Entry or, as Chris pointed out, if we annotate MarshalledInstances using a new Maven URL scheme, we can extract that info and make it available via MarshalledServiceItem (an abstract class that extends ServiceItem).

- Given the capability above, the need for a codebase service may not be 
required
Agreed.

- Conventions on how to develop River services, as it relates to jar naming, 
packaging and what dependencies are between the various artifacts
- How to possibly move forward with utilizing Maven repositories and the 
implied capabilities of published artifacts
- The development of a maven archetype to allow a developer to easily create a 
working project in seconds
Yes to all above.

Your attention to detail and the documentation of how class loader interactions 
with regards to security is great. I'd like to understand the requirements of 
what you have documented below, the urge to refactor MarshalledInstance, and 
why the new class loader hierarchy needs to be added to River.

The urge to refactor MarshalledInstance is to allow the URL annotation to be requested directly and passed via StreamServiceRegistrar. Combined with delayed unmarshalling of proxies via MarshalledServiceItem, this allows the client to provision and provide an alternate CodeSource if need be.

StreamServiceRegistrar returns a ResultStream<ServiceItem>, so you have to check with instanceof MarshalledServiceItem.
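As a rough sketch of that check: the types below are stand-ins written for illustration (the `Sketch...` classes and `stillMarshalled` are hypothetical, not River's API), and a plain `Iterable` takes the place of `ResultStream`.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Stand-ins for the River types discussed above; names and shapes are assumptions.
class SketchServiceItem {
    final Object service; // null while the proxy is still marshalled
    SketchServiceItem(Object service) { this.service = service; }
}

abstract class SketchMarshalledServiceItem extends SketchServiceItem {
    SketchMarshalledServiceItem() { super(null); }
    abstract URL[] getCodeSourceAnnotation();
}

final class InstanceofCheckSketch {
    // Collects the items whose proxies have not been unmarshalled yet.
    static List<SketchMarshalledServiceItem> stillMarshalled(Iterable<SketchServiceItem> results) {
        List<SketchMarshalledServiceItem> out = new ArrayList<>();
        for (SketchServiceItem item : results) {
            if (item instanceof SketchMarshalledServiceItem) {
                out.add((SketchMarshalledServiceItem) item);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        SketchServiceItem plain = new SketchServiceItem("already a live proxy");
        SketchServiceItem delayed = new SketchMarshalledServiceItem() {
            @Override URL[] getCodeSourceAnnotation() { return new URL[0]; }
        };
        System.out.println(stillMarshalled(Arrays.asList(plain, delayed)).size());
    }
}
```

Items that fail the check are ordinary, already unmarshalled ServiceItems and can be used as before.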

The new packaging scheme can be applied to distributed objects also, provided we create an implementation of CodebaseAccessClassLoader (contributed by Gregg to replace RMIClassLoaderSPI) that performs or requests local Maven archive provisioning.

The new ClassLoader hierarchy is needed to solve the class identity (fully qualified runtime class name = class + ClassLoader), class visibility, isolation and versioning problems that PreferredClassProvider only partially solves.

Perhaps I'm just missing some fundamental issues, but maybe we need to take 
some time and determine the whys before the hows? Is this direction fundamental 
to the OSGi direction that you're taking? If so, how does this impact non-OSGi 
based systems?
The changes are OSGi agnostic. OSGi will live in the application space, so while the changes benefit OSGi, they are independent of it; the same benefits apply to other software, and OSGi isn't required.

I realised that fundamentally OSGi uses ClassLoaders for isolating software into components, so implementation classes aren't exposed outside of their module. This is something OSGi does very well, and it also manages security concerns very well. Something else I realised is that OSGi's use of ClassLoaders is not optimum for distributed systems: there are difficulties determining the correct ClassLoader during deserialization, because OSGi wasn't designed with Serialization in mind. Distributed computing introduces another dimension, like going from 2D to 3D. In OSGi, you only have one bundle version combination loaded (you can have many bundles of different versions, but I believe typically only one of each unique bundle instance; you can have the same package version exported by differently versioned bundles). So how do you determine the correct ClassLoader during unmarshalling? In River we may have many proxies using the same jar version; however, we don't want the proxy's implementation to get all tied up in the local application bundles. We'd be allowing the smart proxy to pollute the local application space, and some parts of the local application could see the proxy implementation.

In our new ClassLoader tree, a smart proxy can have its own personal ClassLoader, because the ContextClassLoader will be that of the proxy when returned objects are deserialized, since the proxy initiated the communication with the remote Service host. A client's parameter implementation, on the other hand, cannot have its own ClassLoader and must share one with other clients that use the same codebase and version. At the remote Service host these classes have no link to a ClassLoader, with only the Codebase and Version to go by, since the communication wasn't initiated from there. There could otherwise be many ClassLoaders containing that codebase version and not enough information to find the right one. The last thing I want to do is require the client to have an identity or location just to deal with deserialization of parameters at the Service node.

Rather than take, "how you use OSGi" and apply it to River, I decided to understand why they solved their problems the way they did and learn from it. It is a very good solution to the problem they've solved. However with our solution we can solve the deserialization issue for distributed applications utilising OSGi.

Currently River uses Permission grants based on ClassLoader (so does OSGi). What I realised was that I needed a finer-grained Permission grant, and having many ProtectionDomains inside one ClassLoader is about as fine as you can get. Only one ClassLoader is used for the API space, for class identity reasons: it allows maximum sharing of API classes, because you just can't control and coordinate ClassLoader visibility in someone else's JVM without overcoming some serious trust issues (simpler is better, I don't even want to attempt to solve them!). There is however one compromise with my approach.
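A small sketch of what per-jar grants could look like, using only `java.security` classes. It models the grants alone: the jar URLs and the `domainFor` helper are made up for illustration, and no classes are actually defined. A real loader would hand each jar's CodeSource to `SecureClassLoader.defineClass` so that classes from different jars in the same ClassLoader end up in different ProtectionDomains.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.security.CodeSource;
import java.security.Permission;
import java.security.Permissions;
import java.security.ProtectionDomain;
import java.security.cert.Certificate;
import java.util.PropertyPermission;

final class DomainPerJarSketch {
    // Builds a ProtectionDomain with a fixed (static) permission set for one jar.
    static ProtectionDomain domainFor(String jarUrl, Permission... granted) {
        try {
            URL url = URI.create(jarUrl).toURL();
            Permissions perms = new Permissions();
            for (Permission p : granted) {
                perms.add(p);
            }
            return new ProtectionDomain(new CodeSource(url, (Certificate[]) null), perms);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(jarUrl, e);
        }
    }

    public static void main(String[] args) {
        // Two API jars that would share one ClassLoader but receive different grants.
        ProtectionDomain api = domainFor("file:/repo/service-api.jar");
        ProtectionDomain other = domainFor("file:/repo/other-api.jar",
                new PropertyPermission("user.home", "read"));
        Permission wanted = new PropertyPermission("user.home", "read");
        System.out.println(api.implies(wanted));   // this jar was granted nothing
        System.out.println(other.implies(wanted)); // only this jar holds the grant
    }
}
```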

By loading all API classes into the same ClassLoader, we cannot have duplicate classes, so we must always load the latest API version, which must not break backward compatibility. If the backward compatibility constraints are hampering your design, it's simply better to deprecate a package and append a number to change the package name (or create a completely new API jar).

org.some.thing
org.some.thing2

The reason we version packages is so we don't have to rename them when they break backward compatibility. This makes sense for implementations, but not for API. If you're going to have long-lived persistent objects, they belong in the API space. If you don't need to persist your objects, why not have an interface and throwaway class implementations? This solves the problems of Serialization exposing class internal state and of evolution. Extend the interface if you want new methods.
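To make the interface-plus-throwaway-implementation idea concrete, here is a minimal sketch. `Greeter`, `Greeter2` and `SimpleGreeter` are invented names; in the scheme above the two interfaces would live in the API jar and the class in an implementation jar.

```java
// Hypothetical API type, as it was first published.
interface Greeter {
    String greet(String name);
}

// New functionality arrives by extending the interface, so old clients keep working.
interface Greeter2 extends Greeter {
    String greetAll(String... names);
}

// A throwaway implementation: free to change, because clients only see the interfaces.
final class SimpleGreeter implements Greeter2 {
    @Override public String greet(String name) {
        return "Hello " + name;
    }
    @Override public String greetAll(String... names) {
        return "Hello " + String.join(", ", names);
    }
}

final class InterfaceEvolutionSketch {
    public static void main(String[] args) {
        Greeter oldView = new SimpleGreeter();      // what an old client sees
        System.out.println(oldView.greet("Dennis"));
        if (oldView instanceof Greeter2) {          // a newer client can probe for the extension
            System.out.println(((Greeter2) oldView).greetAll("Dennis", "Chris"));
        }
    }
}
```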

If a JVM has been running a long time, a new API version may have been released. Clients using only the old API functionality won't be able to see or utilise the new functionality until we restart the JVM. That is the compromise. But I figure it's not too bad a compromise once APIs have stabilised and go into longer development cycles. I can handle having to restart my JVM once every 6 months.

I think Michael Warres got to the crux of the problem with his publication on ClassLoader issues. My interpretation of what he said is that perhaps Java should tear apart the multiple ClassLoader concerns of Security, Isolation and Identity and start again. I've chosen what appears to me to be the best compromise based on Java ClassLoaders today.

So this new ClassLoader hierarchy should play nice with Maven, OSGi and other stuff too, because now the API is visible to everything below in the ClassLoader hierarchy, while the implementations below, don't expose themselves, instead, everything cooperates through the API.

OSGi can be used to synchronize ClassLoader visibility between two separate JVMs; however, that still requires the implementer to deal with deserialization issues. With our solution, we won't have to worry much about ClassLoader issues. With Maven, we won't have to worry about lost codebases either.

Yep, it has been a bit of a head spin, needed your help to work out the details before I forgot them.

There is one more detail I'd like to include in the jar archive: a list of the permissions the jar needs. I'd like to use the same format OSGi uses; it's been done before, so why be different? This is to solve the "what grants does it need?" problem, so we can minimise permission grants.
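For reference, a hedged sketch of reading such a list. It assumes the OSGi local-permissions style as I understand it: one encoded permission per line in the form `(type "name" "actions")`, with `#` or `//` starting a comment. The parser below is illustrative only (`PermissionListSketch` and `Entry` are invented names) and does not handle escaped quotes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PermissionListSketch {
    // One parsed line: permission class name, target name and actions (the last two may be null).
    static final class Entry {
        final String type;
        final String name;
        final String actions;
        Entry(String type, String name, String actions) {
            this.type = type;
            this.name = name;
            this.actions = actions;
        }
    }

    // Matches lines such as: (java.io.FilePermission "/tmp/-" "read")
    private static final Pattern LINE = Pattern.compile(
            "\\(\\s*([\\w.$]+)(?:\\s+\"([^\"]*)\")?(?:\\s+\"([^\"]*)\")?\\s*\\)");

    static List<Entry> parse(List<String> lines) {
        List<Entry> out = new ArrayList<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#") || line.startsWith("//")) {
                continue; // blank line or comment
            }
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                throw new IllegalArgumentException("Unparseable permission: " + line);
            }
            out.add(new Entry(m.group(1), m.group(2), m.group(3)));
        }
        return out;
    }
}
```

The parsed entries could then be compared against what a deployer is willing to grant before any class from the jar is loaded.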

One more step towards the net...
Thanks

Dennis

On May 24, 2010, at 10:34 PM, Peter Firmstone wrote:

Thanks Chris,

Sounds like it's time for some MarshalledInstance refactoring?

Perhaps a Maven (generic if possible) URL scheme (with message digest support). We 
also need an annotation (or naming convention) that indicates whether proxies can share 
ClassLoader & ProtectionDomain space, dictated by static variables and common 
Principals.

A new constructor for MarshalledInstance that accepts an alternate URL too.

... and two new methods in MarshalledInstance:
Object get(ClassLoader cl, CodeSource[] cs, boolean verifyCodeBaseIntegrity);
URL[] getCodeSourceAnnotation();

Then MarshalledServiceItem could include new methods:

public URL[] getCodeSourceAnnotation();
public Object getService( CodeSource[] cs );
// If cs == null or cs is missing a CodeSource, use the default URL.
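One possible reading of that fallback rule, as a sketch: it assumes the supplied CodeSources are matched to the annotated URLs by position, which the proposal does not actually specify. `CodeSourceFallbackSketch`, `effectiveCodeSources` and the file URLs are invented for illustration.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.security.CodeSource;
import java.security.cert.Certificate;

final class CodeSourceFallbackSketch {
    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec, e);
        }
    }

    // For each annotated URL, prefer the caller's CodeSource at the same index;
    // otherwise fall back to a CodeSource built from the annotated (default) URL.
    static CodeSource[] effectiveCodeSources(URL[] annotation, CodeSource[] supplied) {
        CodeSource[] result = new CodeSource[annotation.length];
        for (int i = 0; i < annotation.length; i++) {
            boolean provided = supplied != null && i < supplied.length && supplied[i] != null;
            result[i] = provided
                    ? supplied[i]
                    : new CodeSource(annotation[i], (Certificate[]) null);
        }
        return result;
    }

    public static void main(String[] args) {
        URL[] annotation = {
            url("file:/remote/service-proxy.jar"),
            url("file:/remote/service-api.jar")
        };
        CodeSource local = new CodeSource(url("file:/local/repo/service-proxy.jar"),
                (Certificate[]) null);
        // Only the first jar has a local replacement; the second keeps its default URL.
        for (CodeSource cs : effectiveCodeSources(annotation, new CodeSource[] { local })) {
            System.out.println(cs.getLocation());
        }
    }
}
```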

Note here that while unmarshalling has been delayed, I haven't relinquished control of ClassLoaders or ProtectionDomains. For example, the client can use OSGi without dictating that the Service must also; none of the serialized instances from method returns will need to be deserialized by OSGi, avoiding the OSGi deserialization issue altogether. The client application doesn't have to deal with these concerns directly: we could write multiple ResultStreamFilters that can be chained. The filter that matches the URL scheme will unmarshal the service, and the filter sequence will dictate the preferred unmarshalling. The filter responsible for successful unmarshalling would construct a new ServiceItem that no longer needs unmarshalling, so the next unmarshalling filter would ignore it, allowing it to pass through. After it is unmarshalled, another filter will check method constraints.
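A toy model of such a filter chain, to show the pass-through behaviour. Everything here is a stand-in: `Item` replaces ServiceItem, a `UnaryOperator` replaces a ResultStreamFilter, the "maven:" prefix is a placeholder for whatever URL scheme is chosen, and the "unmarshalling" just records which filter handled the item. The method-constraint filter is not modelled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class FilterChainSketch {
    // Stand-in for a service item; a null proxy means it is still marshalled.
    static final class Item {
        final String annotation; // codebase annotation, e.g. "maven:..." (hypothetical scheme)
        final Object proxy;
        Item(String annotation, Object proxy) {
            this.annotation = annotation;
            this.proxy = proxy;
        }
        boolean isMarshalled() {
            return proxy == null;
        }
    }

    // An unmarshalling filter for one URL scheme; items it does not own pass through untouched.
    static UnaryOperator<Item> unmarshaller(String scheme) {
        return item -> {
            if (!item.isMarshalled() || !item.annotation.startsWith(scheme + ":")) {
                return item;
            }
            // Placeholder for real unmarshalling: record which filter did the work.
            return new Item(item.annotation, "proxy unmarshalled by " + scheme + " filter");
        };
    }

    // Applies the filters in order; the order expresses the preferred unmarshalling.
    static List<Item> apply(List<Item> items, List<UnaryOperator<Item>> chain) {
        List<Item> out = new ArrayList<>();
        for (Item item : items) {
            Item current = item;
            for (UnaryOperator<Item> filter : chain) {
                current = filter.apply(current);
            }
            out.add(current);
        }
        return out;
    }
}
```

Because an already unmarshalled item is returned as-is, later filters in the chain never touch it, which is the pass-through described above.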

Method parameters that originate from client ClassLoaders will be unmarshalled 
in the Application ClassLoader space on the Service implementation node. This 
is where things get hairy if the Service API method parameters are non-final, 
abstract or interfaces.  Any class that belongs to a Service API jar will be 
safely loaded into the Jini Platform ClassLoader space in its own 
ProtectionDomain.  Client returned parameter classes, however, will need their 
own ClassLoaders.

If the Service API is loaded into a Parent ClassLoader (Jini Platform 
ClassLoader) at the Service implementation node and API parameters are 
extended, the client classes will need their own ClassLoader space at the 
Service implementation end. Since a service may serve many clients, these 
ClassLoaders must be shared, based on identical CodeSource and Principals.  The 
client classes will only be accessible via the Service API interfaces or 
classes (they are abstracted).
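A minimal sketch of that sharing rule: one ClassLoader per (codebase, principals) pair, created on first use. Principals are reduced to plain names here for brevity, and `SharedLoaderSketch` and `loaderFor` are invented names; the parent argument stands in for the Jini Platform ClassLoader.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class SharedLoaderSketch {
    private final ClassLoader parent; // stands in for the Jini Platform ClassLoader
    private final Map<List<Object>, ClassLoader> loaders = new ConcurrentHashMap<>();

    SharedLoaderSketch(ClassLoader parent) {
        this.parent = parent;
    }

    static URL url(String spec) {
        try {
            return URI.create(spec).toURL();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(spec, e);
        }
    }

    // Returns the single loader shared by every client with this codebase and these principals.
    ClassLoader loaderFor(URL codebase, Set<String> principalNames) {
        // URL.equals may resolve host names, so the key uses the external form instead.
        List<Object> key = Arrays.<Object>asList(
                codebase.toExternalForm(), new HashSet<String>(principalNames));
        return loaders.computeIfAbsent(key,
                k -> new URLClassLoader(new URL[] { codebase }, parent));
    }
}
```

Two clients presenting the same jar and the same principals get the same loader; a different principal set yields a separate one.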

ANY CLIENT THAT IMPLEMENTS AN API Interface or extends an API parameter will 
need to make its implementation package jar publicly available.  Like the 
proxy implementation, it is free to change; however, it should be versioned 
appropriately, like the proxy, and have its own jar.  (This is where the Java 
Package Version Spec comes in handy: we can annotate classes with Package 
version and local CodeSource.)  The CodeSource might contain a file URL; 
however, it will contain the jar archive name (which is why Dennis wants to 
name packages with their versions, which can't hurt!) and, given the Package 
Version Spec, it will work for OSGi bundles as well as Maven.  A client using 
an OSGi bundle must remember that all of the implementing classes should be in 
the same bundle, and that the Service node may not be utilising OSGi, so it 
shouldn't attempt to use any OSGi services in Service API parameter 
implementations.

The version spec will identify compatibility of classes. The closest compatible 
local CodeSource may be used; otherwise a new ClassLoader will be used.  Each 
client will either share all compatible CodeSource and Principals or have their 
own ClassLoader space.
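A sketch of "closest compatible", under the assumption that compatible means the same or a later dotted-decimal version, which mirrors how `java.lang.Package.isCompatibleWith` compares specification versions. The class and method names are invented, and a null result stands for "create a new ClassLoader".

```java
import java.util.List;

final class VersionMatchSketch {
    // Compares dotted decimal versions such as "1.2.10"; missing components count as 0.
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Returns the lowest local version that is >= required, or null if none qualifies.
    static String closestCompatible(String required, List<String> localVersions) {
        String best = null;
        for (String v : localVersions) {
            if (compare(v, required) >= 0 && (best == null || compare(v, best) < 0)) {
                best = v;
            }
        }
        return best;
    }
}
```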

Gregg, do you think we could use your service-client.jar for client parameter 
implementations or would this cause confusion?

Perhaps we should use:

service-param.jar

So to really round it off:

Service Implementers must produce versioned manifest jar archives of:

  Smart Proxy:

  Implementation jar: service.jar (depends on service-api.jar)
  API jar:            service-api.jar
  Smart proxy jar:    service-proxy.jar (depends on service-api.jar)
  Selfish Smart proxy jar:  service-iproxy.jar (depends on service-api.jar)

  Dumb Proxy:

  Implementation jar: service.jar (depends on service-api.jar)
  API jar:            service-api.jar


Client Implementers must produce versioned manifest jar archives of:

  Client Parameter extensions:   service-param.jar

If you didn't guess correctly, the Selfish Smart proxy jar is the one whose 
proxies cannot share the same ClassLoader and ProtectionDomain.


ClassLoader Structure (In addition to all your helpful comments on river-dev, 
thanks also to Jim, Tim & Mike for planting the seed):

             System ClassLoader
                     |
            Extension ClassLoader (incl jsk-policy.jar)
                     |
            Jini Platform ClassLoader (incl jsk-platform.jar, *-api.jar)
                     |
      _______________|__________________________________
     |                            |                     |
Application ClassLoader    Proxy ClassLoaders     Parameter Impl ClassLoaders
(Apps & Service Impl)      (Smart Proxies)        (Remote client parameter classes)
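The lower part of this tree can be sketched with plain `URLClassLoader`s. This is only an illustration of the parent/child wiring: `LoaderTreeSketch` is an invented name, the jar lists are left to the caller, and whatever loaders the JVM itself provides sit above the platform loader.

```java
import java.net.URL;
import java.net.URLClassLoader;

final class LoaderTreeSketch {
    final ClassLoader platform;      // Jini Platform ClassLoader (jsk-platform.jar, *-api.jar)
    final ClassLoader application;   // Apps & Service Impl
    final ClassLoader proxy;         // one smart proxy's loader
    final ClassLoader parameterImpl; // remote client parameter classes

    LoaderTreeSketch(URL[] platformJars, URL[] appJars, URL[] proxyJars, URL[] paramJars) {
        // The JVM's own loaders remain the ancestors of the platform loader.
        platform = new URLClassLoader(platformJars, ClassLoader.getSystemClassLoader());
        // The three children are siblings: each sees the API through the platform
        // loader, but none can see classes defined by the other two.
        application = new URLClassLoader(appJars, platform);
        proxy = new URLClassLoader(proxyJars, platform);
        parameterImpl = new URLClassLoader(paramJars, platform);
    }

    public static void main(String[] args) {
        URL[] none = new URL[0];
        LoaderTreeSketch tree = new LoaderTreeSketch(none, none, none, none);
        System.out.println(tree.proxy.getParent() == tree.platform);
        System.out.println(tree.application.getParent() == tree.parameterImpl.getParent());
    }
}
```

Standard parent-first delegation then gives the visibility described above: API classes resolve in the shared parent, implementations stay private to their own branch.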


Advice History:

Jim:   Use common Interfaces and classes in Parent ClassLoaders
Tim:   Thanks for research on Dependency Trees and ClassLoader Trees, and 
guidance.
Mike:  Research paper on ClassLoader issues.

Thanks & Praise worth mentioning:

Bob Scheifler and others for Jini's strong Security foundation.
Bill Venners for the ServiceUI, it is truly innovative

(hint: come back)


Christopher Dolan wrote:
Isn't List<URL> already present in the MarshalledInstance?  Why repeat
this as an Entry?  Wouldn't it be easier to just add a public accessor
to deserialize the list of URLs from MarshalledInstance.locBytes?

I apologize if this was already explained, but there's been a LOT of
email to read on this list lately.

Chris

-----Original Message-----
From: Dennis Reedy [mailto:[email protected]]
Sent: Saturday, May 22, 2010 9:29 AM
To: [email protected]
Subject: Re: Maven repository Entry was Re: Codebase service?

[CJD] ... <snip> ...

I would just go with a List<String> dlJars;

With this you could provide support for retrieving the DL jar(s) for
non-maven systems as well. If the dlJars property contains 1 element and
is of the form groupId:artifactId:version:classifier, then maven
resolution gets used. Otherwise the DL jars can be obtained using the
codebase of the advertising service.
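The decision rule above could be sketched like this. `DlJarsSketch` and `resolutionFor` are invented names, and the extra check that no part contains a slash is an added assumption, there to keep URLs with a port number from being mistaken for Maven coordinates.

```java
import java.util.List;

final class DlJarsSketch {
    enum Resolution { MAVEN, CODEBASE }

    // A single element of the form groupId:artifactId:version:classifier selects
    // Maven resolution; anything else is fetched from the service's codebase.
    static Resolution resolutionFor(List<String> dlJars) {
        if (dlJars.size() == 1) {
            String[] parts = dlJars.get(0).split(":", -1);
            if (parts.length == 4 && allPlainTokens(parts)) {
                return Resolution.MAVEN;
            }
        }
        return Resolution.CODEBASE;
    }

    // Rejects empty parts and anything that looks like a path or URL fragment.
    private static boolean allPlainTokens(String[] parts) {
        for (String p : parts) {
            if (p.isEmpty() || p.contains("/")) {
                return false;
            }
        }
        return true;
    }
}
```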

For maven resolution, I think you'll also want to either provide support
for parsing your maven settings.xml or include the repositories to go
find the artifact if it's not present. If the artifact is retrieved from
the repository it will have a message digest alongside it (with
either a .sha1 or .md5 extension). That can be compared against a locally
computed digest (HttpmdUtil.computeDigest()) to detect updates. But that
comparison really only needs to take place for snapshots, since by
definition releases are considered immutable.
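A sketch of that comparison, using the JDK's `MessageDigest` as a stand-in for `HttpmdUtil.computeDigest()` so the example needs nothing beyond the standard library. `DigestCheckSketch` and `needsUpdate` are invented names, and treating a version ending in `-SNAPSHOT` as a snapshot follows the usual Maven convention.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DigestCheckSketch {
    // Hex-encoded SHA-1 of the given bytes.
    static String sha1Hex(byte[] content) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-1");
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest(content)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-1 is a required JDK algorithm
        }
    }

    // Releases are treated as immutable, so only snapshot versions are compared.
    static boolean needsUpdate(String version, byte[] localJar, String publishedSha1) {
        if (!version.endsWith("-SNAPSHOT")) {
            return false;
        }
        // A .sha1 file may carry a trailing file name; keep only the first token.
        String published = publishedSha1.trim().split("\\s+")[0];
        return !sha1Hex(localJar).equalsIgnoreCase(published);
    }
}
```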

IMO supporting transitive deps is a must-have; without that we really
don't get that far. A DL artifact may depend on another DL artifact, and
that DL artifact may have deps as well.
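Resolving the transitive set amounts to a graph walk. A minimal sketch, assuming the direct dependencies of each artifact are already known (in practice they would come from each artifact's POM); `TransitiveDepsSketch` and the artifact names in the usage are made up.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TransitiveDepsSketch {
    // Returns root plus everything reachable from it, breadth-first, each artifact once.
    static Set<String> closure(String root, Map<String, List<String>> directDeps) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String artifact = queue.remove();
            if (!seen.add(artifact)) {
                continue; // already resolved; this also guards against cycles
            }
            queue.addAll(directDeps.getOrDefault(artifact, Collections.emptyList()));
        }
        return seen;
    }
}
```

This ignores version conflict resolution, which a real Maven resolver would also have to handle.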






