Re: Maven repository Entry was Re: Codebase service?

Peter Firmstone Mon, 24 May 2010 23:06:16 -0700

Hi Dennis,

Reasoning, and hopefully the whys, below.


Dennis Reedy wrote:

Hi Peter,

I was hoping to take a step back for a second; perhaps it's just me whose head 
seems to be spinning of late on this list. I may have missed some things, 
but we've discussed many issues over the past week:

- How to advertise the DL jar(s) a service vends, allowing a client to download 
requisite jars that allow the jars to be loaded from a local (trusted) location

Yes, we can use an Entry, or as Chris pointed out, if we annotate MarshalledInstances using a new Maven URL schema, we can extract that info and make it available via MarshalledServiceItem (an abstract class that extends ServiceItem).

- Given the capability above, a codebase service may not be 
required

Agreed

- Conventions on how to develop River services, as it relates to jar naming, 
packaging and what dependencies are between the various artifacts
- How to possibly move forward with utilizing Maven repositories and the 
implied capabilities of published artifacts
- The development of a maven archetype to allow a developer to easily create a 
working project in seconds

Yes to all above.

Your attention to detail and your documentation of class loader interactions 
with regard to security are great. I'd like to understand the requirements of 
what you have documented below, the urge to refactor MarshalledInstance, and 
why the new class loader hierarchy needs to be added to River.

The urge to refactor MarshalledInstance is to allow the URL annotation to be requested directly and passed via StreamServiceRegistrar, combined with delayed unmarshalling of proxies via MarshalledServiceItem, so the client can provision and provide an alternate CodeSource if need be.

StreamServiceRegistrar returns a ResultStream<ServiceItem>, so you have to check with instanceof MarshalledServiceItem.
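
As a rough illustration of that check (a sketch only: SketchServiceItem and SketchMarshalledServiceItem are hypothetical stand-ins for ServiceItem and the proposed MarshalledServiceItem, and a plain Iterable stands in for ResultStream):

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for net.jini.core.lookup.ServiceItem.
class SketchServiceItem {
    final Object service;
    SketchServiceItem(Object service) { this.service = service; }
}

// Stand-in for the proposed MarshalledServiceItem: the proxy stays
// marshalled, but its codebase annotation can be read up front.
abstract class SketchMarshalledServiceItem extends SketchServiceItem {
    SketchMarshalledServiceItem() { super(null); }
    abstract URL[] getCodeSourceAnnotation();
}

final class InstanceofCheckSketch {
    // Collect the codebase annotations of every result that is still marshalled.
    static List<URL[]> annotationsOf(Iterable<? extends SketchServiceItem> results) {
        List<URL[]> out = new ArrayList<>();
        for (SketchServiceItem item : results) {
            if (item instanceof SketchMarshalledServiceItem) {
                out.add(((SketchMarshalledServiceItem) item).getCodeSourceAnnotation());
            }
        }
        return out;
    }
}
```

Items that are not a MarshalledServiceItem are simply passed over, so older registrars that return ordinary ServiceItems would keep working.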

The new packaging scheme can be applied to distributed objects also, provided we create an implementation of CodebaseAccessClassLoader (contributed by Gregg to replace RMIClassLoaderSPI) that performs or requests local Maven archive provisioning.

The new ClassLoader hierarchy is needed to solve the class identity (fully qualified runtime class name = class + ClassLoader), class visibility, isolation and versioning problems that PreferredClassProvider only partially solves.

Perhaps I'm just missing some fundamental issues, but maybe we need to take 
some time and determine the whys before the hows? Is this direction fundamental 
to the OSGi direction that you're taking? If so, how does this impact non-OSGi 
based systems?

The changes are OSGi agnostic. OSGi will live in the application space, so while the changes benefit OSGi, they are independent of it; the same benefits apply to other software and OSGi isn't required.

I realised that fundamentally OSGi uses ClassLoaders for isolating software into components, so implementation classes aren't exposed outside of their module. OSGi does this very well, and it also manages security concerns very well. Something else I realised: OSGi's use of ClassLoaders is not optimal for distributed systems, because it is difficult to determine the correct ClassLoader during deserialization. OSGi wasn't designed with Serialization in mind. Distributed computing introduces another dimension, like going from 2D to 3D. In OSGi you only have one bundle-version combination loaded (you can have many bundles of different versions, but I believe typically only one of each unique bundle instance; you can have the same package version exported by differently versioned bundles). So how do you determine the correct ClassLoader during unmarshalling? In River we may have many proxies using the same jar version, but we don't want a proxy's implementation to get tied up in the local application bundles. That would allow the smart proxy to pollute the local application space, since some parts of the local application could see the proxy implementation.

In our new ClassLoader tree, a smart proxy can have its own personal ClassLoader, because the context ClassLoader will be that of the proxy during deserialization of returned objects, since the proxy initiated the communication with the remote Service host. The reason a client's parameter implementation cannot have its own ClassLoader, and must share one with other clients that use the same codebase and version, is that it has no link to a ClassLoader at the remote Service host, with only the codebase and version to go by, since they didn't initiate the communication. There could otherwise be many ClassLoaders containing that codebase version, with not enough information to find the right one. The last thing I want to do is require the client to have an identity or location just to deal with deserialization of parameters at the Service node.

Rather than take "how you use OSGi" and apply it to River, I decided to understand why they solved their problems the way they did and learn from it. It is a very good solution to the problem they've solved. However, with our solution we can solve the deserialization issue for distributed applications utilising OSGi.

Currently River uses Permission grants based on ClassLoader (so does OSGi). What I realised was that I needed a finer-grained Permission grant, and having many ProtectionDomains inside one ClassLoader is about as fine as you can get. Only one ClassLoader is used for the API space, for class identity reasons: it allows maximum sharing of API classes, because you just can't control and coordinate ClassLoader visibility in someone else's JVM without overcoming some serious trust issues (simpler is better, I don't even want to attempt to solve them!). There is, however, one compromise with my approach.

By loading all API classes into the same ClassLoader, we cannot have duplicate classes, so we must always load the latest API version, which must not break backward compatibility. If the backward compatibility constraints are hampering your design, it's simply better to deprecate a package and append a number to change the package name (or create a completely new API jar):


org.some.thing
org.some.thing2

The reason we version packages is so we don't have to rename them when they break backward compatibility. This makes sense for implementations, but not for API. If you're going to have long-lived persistent objects, they belong in the API space. If you don't need to persist your objects, why not have an interface and throwaway class implementations? This solves the problem of Serialization exposing class internal state and evolution. Extend the interface if you want new methods.

If a JVM has been running a long time, a new API version may have been released; clients using only the old API functionality won't be able to see or utilise the new functionality until we restart the JVM. That is the compromise. But I figure it's not too bad a compromise once APIs have stabilised and gone into longer development cycles. I can handle having to restart my JVM once every 6 months.

I think Michael Warres got to the crux of the problem with his publication on ClassLoader issues. My interpretation of what he said is that perhaps Java should tear apart the multiple ClassLoader concerns of Security, Isolation and Identity and start again. I've chosen what appears to me to be the best compromise based on Java ClassLoaders today.

So this new ClassLoader hierarchy should play nice with Maven, OSGi and other stuff too, because now the API is visible to everything below it in the ClassLoader hierarchy, while the implementations below don't expose themselves; instead, everything cooperates through the API.

OSGi can be used to synchronize ClassLoader visibility between two separate JVMs, however that still requires the implementer to deal with deserialization issues. With our solution, we won't have to worry much about ClassLoader issues. With Maven, we won't have to worry about lost codebases either.

Yep, it has been a bit of a head spin; I needed your help to work out the details before I forgot them.

There is one more detail I'd like to include in the jar archive: a list of permissions the jar needs. I'd like to use the same format OSGi uses, because it's been done before, so why be different? This is to solve the "what grants does it need?" problem, so we can minimise permission grants.
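
To make the format concrete: as far as I understand it, OSGi bundles list their local permissions in OSGI-INF/permissions.perm, one parenthesised entry per line of the form (type "name" "actions"), with # or // starting a comment. The parser below is only a sketch under that assumption; PermissionListSketch and Entry are hypothetical names and nothing here is existing River or OSGi API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PermissionListSketch {
    // One requested permission: class name plus optional name and actions.
    static final class Entry {
        final String type, name, actions;
        Entry(String type, String name, String actions) {
            this.type = type; this.name = name; this.actions = actions;
        }
    }

    // ( type ["name" ["actions"]] )
    private static final Pattern LINE = Pattern.compile(
        "\\(\\s*([\\w.$]+)(?:\\s+\"([^\"]*)\")?(?:\\s+\"([^\"]*)\")?\\s*\\)");

    // Parse the text of a permissions file, skipping blank and comment lines.
    static List<Entry> parse(String text) {
        List<Entry> out = new ArrayList<>();
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#") || line.startsWith("//")) {
                continue;
            }
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                throw new IllegalArgumentException("Bad permission line: " + line);
            }
            out.add(new Entry(m.group(1), m.group(2), m.group(3)));
        }
        return out;
    }
}
```

A deployer could then grant exactly the parsed entries to the jar's ProtectionDomain, or refuse to load the jar if it asks for more than local policy allows.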


One more step towards the net...

Thanks

Dennis

On May 24, 2010, at 10:34 PM, Peter Firmstone wrote:

Thanks Chris,

Sounds like it's time for some MarshalledInstance refactoring?

Perhaps a Maven (generic if possible) URL schema (with message digest support). We 
also need an annotation (or naming convention) that indicates whether proxies can share 
ClassLoader & ProtectionDomain space, dictated by static variables and common 
Principals.

A new constructor for MarshalledInstance that accepts an alternate URL too.

... and two new methods in MarshalledInstance:
Object get(ClassLoader cl, CodeSource[] cs, boolean verifyCodeBaseIntegrity);
URL[] getCodeSourceAnnotation();

Then MarshalledServiceItem could include new methods:

public URL[] getCodeSourceAnnotation();
public Object getService( CodeSource[] cs );
//If cs == null || cs missing a CodeSource use default URL.

Note here that while unmarshalling has been delayed, I haven't relinquished control of ClassLoaders or ProtectionDomains. E.g. the client can use OSGi without dictating that the Service must also; none of the serialized instances from method returns will need to be deserialized by OSGi, avoiding the OSGi deserialization issue altogether.

The client application doesn't have to deal with these concerns directly. We could write multiple ResultStreamFilters that can be chained: the filter that matches the URL schema will unmarshal the service, and the filter sequence will dictate the preferred unmarshalling. The filter responsible for successful unmarshalling would construct a new ServiceItem that is no longer marshalled, so the next unmarshalling filter would ignore it, allowing it to pass through. After it is unmarshalled, another filter will check method constraints.
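
The chaining idea could look roughly like the following. This is a sketch under assumptions: MarshalledResult, UnmarshallingFilter and SchemeFilter are hypothetical names, the "mvn" scheme is only an example, and the real filters would provision jars and deserialize the proxy where the comment indicates.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for a lookup result whose proxy is still marshalled.
final class MarshalledResult {
    final String codebaseUrl;
    MarshalledResult(String codebaseUrl) { this.codebaseUrl = codebaseUrl; }
}

// Hypothetical filter contract: return a proxy only if the scheme is recognised.
interface UnmarshallingFilter {
    Optional<Object> tryUnmarshal(MarshalledResult item);
}

final class SchemeFilter implements UnmarshallingFilter {
    private final String scheme;
    SchemeFilter(String scheme) { this.scheme = scheme; }

    @Override
    public Optional<Object> tryUnmarshal(MarshalledResult item) {
        if (!item.codebaseUrl.startsWith(scheme + ":")) {
            return Optional.empty();   // not ours, let the next filter try
        }
        // Real code would provision the jar and deserialize the proxy here.
        return Optional.of("proxy loaded via " + scheme);
    }
}

final class FilterChainSketch {
    // The order of the list is the order of preference.
    static Object unmarshal(List<UnmarshallingFilter> chain, MarshalledResult item) {
        for (UnmarshallingFilter filter : chain) {
            Optional<Object> proxy = filter.tryUnmarshal(item);
            if (proxy.isPresent()) {
                return proxy.get();
            }
        }
        throw new IllegalStateException("No filter handles " + item.codebaseUrl);
    }
}
```

Once one filter succeeds the item is no longer marshalled, so later unmarshalling filters have nothing to do and a constraint-checking filter can run last.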


Method parameters that originate from client ClassLoaders will be unmarshalled 
in the Application ClassLoader space on the Service implementation node. This 
is where things get hairy if the Service API method parameters are non-final, 
abstract or interfaces.  Any class that belongs to a Service API jar will be 
safely loaded into the Jini Platform ClassLoader space in its own 
ProtectionDomain.  Client returned parameter classes, however, will need their 
own ClassLoaders.

If the Service API is loaded into a parent ClassLoader (Jini Platform 
ClassLoader) at the Service implementation node and API parameters are 
extended, the client classes will need their own ClassLoader space at the 
Service implementation end. Since a service may serve many clients, these 
ClassLoaders must be shared, based on identical CodeSource and Principals.  The 
client classes will only be accessible via the Service API interfaces or 
classes (they are abstracted).
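
A minimal sketch of that sharing rule, assuming the CodeSource is identified by its location string and the Principals by name (both simplifications), with the hypothetical class SharedParamLoaders acting as the cache:

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class SharedParamLoaders {
    // Cache key: identical CodeSource and identical Principals share a loader.
    private static final class Key {
        final String codeSource;
        final Set<String> principals;

        Key(String codeSource, Set<String> principals) {
            this.codeSource = codeSource;
            this.principals = Collections.unmodifiableSet(new HashSet<>(principals));
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return codeSource.equals(k.codeSource) && principals.equals(k.principals);
        }

        @Override public int hashCode() {
            return 31 * codeSource.hashCode() + principals.hashCode();
        }
    }

    private final ConcurrentMap<Key, ClassLoader> loaders = new ConcurrentHashMap<>();
    private final ClassLoader parent;   // e.g. the loader holding the *-api.jar classes

    SharedParamLoaders(ClassLoader parent) { this.parent = parent; }

    // Returns the shared loader for this CodeSource/Principals pair, creating it once.
    ClassLoader loaderFor(String codeSource, Set<String> principals) {
        // Real code would pass the provisioned jar URL instead of an empty array.
        return loaders.computeIfAbsent(new Key(codeSource, principals),
                k -> new URLClassLoader(new URL[0], parent));
    }
}
```

Two clients presenting the same service-param jar under the same Principals get one loader; a different Principal set gets a separate one.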

ANY CLIENT THAT IMPLEMENTS AN API Interface or extends an API parameter will 
need to make its implementation package jar publicly available.  Like the 
proxy implementation, it is free to change, however it should be versioned 
appropriately, like the proxy, and have its own jar.  (This is where the Java 
Package Version Spec comes in handy: we can annotate classes with Package 
version and local CodeSource.)  The CodeSource might contain a file URL, 
however it will contain the jar archive name (which is why Dennis wants to 
name packages with their versions, which can't hurt!) and given the Package 
Version Spec, it will work for OSGi bundles as well as Maven.  A client using 
an OSGi bundle must remember that all of the implementing classes should be in 
the same bundle, and that the Service node may not be utilising OSGi, so it 
shouldn't attempt to use any OSGi services in Service API parameter 
implementations.

The version spec will identify compatibility of classes; the closest compatible 
local CodeSource may be used, otherwise a new ClassLoader will be used.  Each 
client will either share all compatible CodeSource and Principals or have their 
own ClassLoader space.
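
One way to read "closest compatible" is sketched below. It assumes dotted-decimal versions, treats a local version as compatible when it is greater than or equal to the requested one (in the spirit of java.lang.Package.isCompatibleWith), and picks the lowest such version. Those rules and the class name VersionMatchSketch are assumptions, not something the thread has settled.

```java
import java.util.List;
import java.util.Optional;

final class VersionMatchSketch {
    // Compare dotted-decimal versions numerically; missing components count as 0.
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    // Lowest local version that is >= the required one, if there is any.
    static Optional<String> closestCompatible(List<String> localVersions, String required) {
        String best = null;
        for (String v : localVersions) {
            if (compare(v, required) >= 0 && (best == null || compare(v, best) < 0)) {
                best = v;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

An empty result is the "otherwise a new ClassLoader will be used" case.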

Gregg, do you think we could use your service-client.jar for client parameter 
implementations or would this cause confusion?

Perhaps we should use:

service-param.jar

So to really round it off:

Service Implementers must produce versioned manifest jar archives of:

  Smart Proxy:

  Implementation jar: service.jar (depends on service-api.jar)
  API jar:            service-api.jar
  Smart proxy jar:    service-proxy.jar (depends on service-api.jar)
  Selfish Smart proxy jar:  service-iproxy.jar (depends on service-api.jar)

  Dumb Proxy:

  Implementation jar: service.jar (depends on service-api.jar)
  API jar:            service-api.jar


Client Implementers must produce versioned manifest jar archives of:

  Client Parameter extensions:   service-param.jar

In case you didn't guess, the Selfish Smart proxy jar is the one whose 
proxies cannot share the same ClassLoader and ProtectionDomain.


ClassLoader Structure (In addition to all your helpful comments on river-dev, 
thanks also to Jim, Tim & Mike for planting the seed):

             System ClassLoader
                     |
            Extension ClassLoader (incl jsk-policy.jar)
                     |
            Jini Platform ClassLoader (incl jsk-platform.jar, *-api.jar)
                     |
      _______________|__________________________________
     |                            |                     |
Application ClassLoader    Proxy ClassLoader's    Parameter Impl ClassLoader's
(Apps & Service Impl)      (Smart Proxy's)        (Remote client parameter classes)
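
The lower part of that tree can be sketched with plain URLClassLoaders. LoaderTreeSketch is a hypothetical name, the parent is passed in rather than assumed, and the jar URLs are left to the caller; this only shows the parent/child wiring, not provisioning or security.

```java
import java.net.URL;
import java.net.URLClassLoader;

final class LoaderTreeSketch {
    final ClassLoader platform;      // jsk-platform.jar plus every *-api.jar
    final ClassLoader application;   // apps and service implementations

    LoaderTreeSketch(ClassLoader parent, URL[] platformJars, URL[] applicationJars) {
        this.platform = new URLClassLoader(platformJars, parent);
        this.application = new URLClassLoader(applicationJars, platform);
    }

    // Each smart proxy gets a loader of its own, a sibling of the application loader.
    ClassLoader newProxyLoader(URL[] proxyJars) {
        return new URLClassLoader(proxyJars, platform);
    }

    // Remote client parameter classes also sit directly under the platform loader.
    ClassLoader newParameterLoader(URL[] paramJars) {
        return new URLClassLoader(paramJars, platform);
    }
}
```

Because every child delegates to the platform loader first, API classes resolve to the same Class objects everywhere, while the implementation classes in sibling loaders stay invisible to each other.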


Advice History:

Jim:     Use common Interfaces and classes in Parent ClassLoaders
Tim:    Thanks for research on Dependency Trees and ClassLoader Trees, and 
guidance.
Mike:  Research paper on ClassLoader issues.

Thanks & Praise worth mentioning:

Bob Scheifler and others for Jini's strong Security foundation.
Bill Venners for the ServiceUI, it is truly innovative

(hint: come back)


Christopher Dolan wrote:

Isn't List<URL> already present in the MarshalledInstance?  Why repeat
this as an Entry?  Wouldn't it be easier to just add a public accessor
to deserialize the list of URLs from MarshalledInstance.locBytes?

I apologize if this was already explained, but there's been a LOT of
email to read on this list lately.

Chris

-----Original Message-----
From: Dennis Reedy [mailto:[email protected]]
Sent: Saturday, May 22, 2010 9:29 AM
To: [email protected]
Subject: Re: Maven repository Entry was Re: Codebase service?

[CJD] ... <snip> ...

I would just go with a List<String> dlJars;


With this you could provide support for retrieving the DL jar(s) for
non-maven systems as well. If the dlJars property contains 1 element and
is of the form groupId:artifactId:version:classifier, then maven
resolution gets used. Otherwise the DL jars can be obtained using the
codebase of the advertising service.
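
That decision rule could be sketched as follows. DlJarsSketch and Coordinate are hypothetical names, and the check is deliberately strict: exactly one element, exactly four non-empty colon-separated parts, and no slashes (so URLs and plain jar names fall through to codebase resolution).

```java
import java.util.List;
import java.util.Optional;

final class DlJarsSketch {
    static final class Coordinate {
        final String groupId, artifactId, version, classifier;
        Coordinate(String groupId, String artifactId, String version, String classifier) {
            this.groupId = groupId;
            this.artifactId = artifactId;
            this.version = version;
            this.classifier = classifier;
        }
    }

    // Present: use Maven resolution. Empty: fetch the jars from the service's codebase.
    static Optional<Coordinate> mavenCoordinate(List<String> dlJars) {
        if (dlJars.size() != 1) {
            return Optional.empty();
        }
        String[] parts = dlJars.get(0).split(":", -1);
        if (parts.length != 4) {
            return Optional.empty();
        }
        for (String part : parts) {
            if (part.isEmpty() || part.contains("/")) {
                return Optional.empty();
            }
        }
        return Optional.of(new Coordinate(parts[0], parts[1], parts[2], parts[3]));
    }
}
```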

For maven resolution, I think you'll also want to either provide support
for parsing your maven settings.xml or include the repositories to go
find the artifact if it's not present. If the artifact is retrieved from
the repository it will have a message digest along side of it (with
either a .sha1 or .md5 extension). That can be used to compare a locally
computed digest HttpmdUtil.computeDigest() for updates. But that
comparison really only needs to take place for snapshots, since by
definition releases are considered immutable.
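
A sketch of that comparison, using java.security.MessageDigest as a stand-in for HttpmdUtil.computeDigest() so the example is self-contained. DigestCheckSketch is a hypothetical name, and the "-SNAPSHOT" suffix test is the usual Maven convention, assumed here.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class DigestCheckSketch {
    // Hex-encoded SHA-1 of the given bytes.
    static String sha1Hex(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1").digest(data);
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    // Releases are treated as immutable, so only snapshots are ever re-checked.
    static boolean needsUpdate(String version, byte[] localJar, String publishedSha1File) {
        if (!version.endsWith("-SNAPSHOT")) {
            return false;
        }
        // Some .sha1 files hold "digest  filename"; keep only the first token.
        String published = publishedSha1File.trim().split("\\s+")[0];
        return !sha1Hex(localJar).equalsIgnoreCase(published);
    }
}
```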

IMO supporting transitive deps is a must have, without that we really
don't get that far. A DL artifact may depend on another DL artifact, and
that DL artifact may have deps as well.
