"...since at the time the industry believed that distributed objects were going to save us from complexity.) Many of the sins of serialization were committed in the desire to get that last .1%, but the cost and benefit of that last .1% are woefully out of balance."

The following are probably non-goals, but something to consider or keep in mind relating to distributed objects:

There are four types of distributed objects:

  1. Immutable value / data Object types.
  2. Shared Mutable Objects.
  3. Unshared Mutable Objects.
  4. Remote Objects / Services (best for managing shared mutable state).

The second type of distributed object causes much pain and should be discouraged. The first three types of distributed objects can have class resolution issues, but these are solvable.

A lot of folks also have problems deserializing objects when class visibility is different at each end; I'm guessing this would be the same for value types.

For example, OSGi folk recommend using primitive parameter types for remote OSGi services.
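
To make that concrete, here's a minimal sketch of a remote service interface restricted to primitives and Strings; the interface and method names are hypothetical, purely for illustration:

    // Hypothetical remote service interface restricted to primitives and
    // Strings, so no application classes need to be resolved at either end.
    public interface TemperatureService {
        // Primitive parameters and return types avoid class resolution
        // entirely; only the interface itself must be visible at both ends.
        double currentTemperature(String stationId);
        void recordReading(String stationId, long timestampMillis, double celsius);
    }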

RMI annotates streams with codebase annotations. Jini Extensible Remote Invocation used to do that too.

The problem with RMI and Jini codebase annotations is that if you resolve your classes locally, you lose the codebase annotations when re-serializing data; and because class visibility can be different at different endpoints, you end up with all sorts of class resolution issues. See "Class Loading Issues in Java™ RMI and Jini™ Network Technology" by Michael Warres:
https://pdfs.semanticscholar.org/143f/468fcbdafd20f2b8c27fe5e0a869913b641a.pdf

The solution of course is simple: ensure that you deserialize into the same module that you serialized from, especially when deserializing in another JVM, so class resolution is identical.
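
One way to pin resolution to a known module is to resolve every class in the stream against that module's ClassLoader. This is only a sketch of the idea, not our actual implementation:

    import java.io.IOException;
    import java.io.InputStream;
    import java.io.ObjectInputStream;
    import java.io.ObjectStreamClass;

    // Sketch: resolve every class in the stream against one controlled
    // ClassLoader (e.g. the loader of the module that serialized the data),
    // so class visibility is identical at both endpoints.
    public class ModuleObjectInputStream extends ObjectInputStream {
        private final ClassLoader moduleLoader;

        public ModuleObjectInputStream(InputStream in, ClassLoader moduleLoader)
                throws IOException {
            super(in);
            this.moduleLoader = moduleLoader;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            try {
                return Class.forName(desc.getName(), false, moduleLoader);
            } catch (ClassNotFoundException e) {
                // Primitive descriptors ("int", "double", ...) aren't loadable
                // by name; fall back to the default resolution for those.
                return super.resolveClass(desc);
            }
        }
    }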

We serialize a lot of complex object graphs; none are circular. The module used for serialization should have visibility of the entire graph of object classes.

So if we're using OSGi modules and provide a network / remote service (not to be confused with an OSGi remote service), we ensure the proxies for these services have the same module installed at the client and server endpoints. The service is represented by a Java interface, and the client makes calls on the interface's methods. This interface may be implemented by what is called a smart proxy, which is encapsulated by a module that is dynamically downloaded at runtime, or by a dynamically generated reflection Proxy using an InvocationHandler.
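
For the reflection Proxy flavour, here's a minimal sketch; the service interface and handler body are hypothetical, and a real smart proxy would marshal the call to the remote endpoint and apply security constraints:

    import java.lang.reflect.InvocationHandler;
    import java.lang.reflect.Proxy;

    // Hypothetical service interface; the same interface (same module) must
    // be visible at both the client and server endpoints.
    interface LookupService {
        String find(String name);
    }

    public class ProxyDemo {
        public static void main(String[] args) {
            // The InvocationHandler stands in for the machinery that would
            // marshal the call and send it to the remote endpoint.
            InvocationHandler handler = (proxy, method, methodArgs) ->
                    "stub result for " + method.getName();

            LookupService service = (LookupService) Proxy.newProxyInstance(
                    LookupService.class.getClassLoader(),
                    new Class<?>[] { LookupService.class },
                    handler);

            System.out.println(service.find("printer"));
        }
    }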

We still provide an option for codebase annotations for client parameter objects, where a client subclasses parameter types and passes them to the service, but this is discouraged; it is provided for backward compatibility only. Where the parameters are also interfaces, the client can implement a remote object and pass it as a parameter instead; in our system, this causes a module identical to the client's to be loaded in the server to resolve the remote object classes, without using stream codebase annotations.

Incidentally, if you're curious how this happens: a proxy is sent (I guess you can call it a serialization proxy :) ) and authenticated by the remote end, with security constraints applied; then the remote end asks the proxy for a codebase URL, which is loaded into a ClassLoader with controlled visibility (this is extensible using a ServiceProvider or OSGi service); then the smart proxy is deserialized into this loader by calling a method on the serialization proxy.
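
A rough sketch of that sequence, with every name illustrative rather than the actual JGDMS API:

    import java.net.URL;
    import java.net.URLClassLoader;

    // Illustrative sketch of the bootstrap sequence described above:
    // authenticate the serialization proxy, ask it for its codebase, build a
    // ClassLoader with controlled visibility, then deserialize into it.
    public final class SmartProxyBootstrap {

        // Hypothetical stand-in for the small trusted object sent first.
        interface BootstrapProxy {
            void authenticate();                        // apply security constraints
            URL codebase();                             // where the classes live
            Object deserializeProxy(ClassLoader loader) throws Exception;
        }

        public static Object load(BootstrapProxy bootstrap, ClassLoader parent)
                throws Exception {
            bootstrap.authenticate();             // verify the remote end first
            URL codebase = bootstrap.codebase();  // then ask for its codebase
            // Controlled visibility: only this codebase plus the chosen parent.
            ClassLoader loader =
                    new URLClassLoader(new URL[] { codebase }, parent);
            return bootstrap.deserializeProxy(loader); // proxy resolves here
        }
    }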

By limiting scope, we can still have 99% of the benefits of distributed objects, without the pain.

Incidentally, apart from the complexity of class resolution, what really limited distributed computing was IPv4. IPv6 removes the network addressing limitations placed on distributed computing.

So I'd make the following qualifications:

  1. Use only primitive types when serializing between different languages.
  2. Serialize Java language Object types and primitives only between
     JVMs when class visibility is uncontrolled.
  3. When serializing other object types, ensure they are immutable if
     shared and that class visibility is identical and managed at both
     endpoints.
  4. Do not serialize objects whose classes may not be resolvable
     (when you need to depend on annotated streams and uncontrolled
     class resolution, for example); find another way to solve the problem.

We've had 20 years to iron out the wrinkles. :)

Regards,

Peter.

On 23/08/2019 7:36 AM, Peter Firmstone wrote:
Hi Sean,

Regarding the section entitled "Why not write a new serialization library?", unlike the serialization libraries listed, our purpose was to be able to securely deserialize untrusted data, while maintaining backward serial form compatibility with Java Serialization, provided it didn't compromise security.

We don't use blacklists or whitelists; we grant DeserializationPermission instead. It doesn't have the granularity of whitelists, but then classes that implement @AtomicSerial are supposed to be hardened implementations in any case.
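
For illustration only, assuming DeserializationPermission behaves like a standard java.security.BasicPermission (the real JGDMS class and its target names may differ):

    import java.security.BasicPermission;

    // Sketch only: a BasicPermission-style DeserializationPermission checked
    // before a class may take part in deserialization.
    public class DeserializationPermission extends BasicPermission {

        public DeserializationPermission(String target) {
            super(target);
        }

        // How a deserializer might gate an untrusted stream: no grant,
        // no deserialization, regardless of blacklists or whitelists.
        static void checkAllowed(String target) {
            SecurityManager sm = System.getSecurityManager();
            if (sm != null) {
                sm.checkPermission(new DeserializationPermission(target));
            }
        }
    }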

If it can be of use, feel free to experiment with it, hopefully it might help with some of your design decisions:

https://github.com/pfirmstone/JGDMS/tree/trunk/JGDMS/jgdms-platform/src/main/java/org/apache/river/api/io

Much of the code on this site provides implementation examples as well.

Regards,

Peter.

On 20/08/2019 7:55 AM, Sean Mullan wrote:
Brian Goetz (copied) has done a lot of thinking in the serialization area, so I have copied him. Not sure if you have seen it but he recently posted a document about some of his ideas and possible future directions for serialization: http://cr.openjdk.java.net/~briangoetz/amber/serialization.html

--Sean

On 8/17/19 10:22 PM, Peter Firmstone wrote:
Thanks Sean,

You've gone to some trouble to answer my question, which demonstrates you have considered it.

I donate some time to help maintain Apache River, derived from Sun's Jini. Once Jini depended on RMI; today, not so much. It still has some dependencies on some RMI interfaces, but doesn't utilise JRMP, although it provides some backward compatibility to enable it.

But my point is, we heavily utilise Java Serialization, and have an independent implementation of a subset of Java Serialization (originating from Apache Harmony). We do this for security, as we use an annotated serialization constructor. Serial form is unchanged; we have Serializers for commonly used Java library objects. For example, we have a "PermissionSerializer", but we don't have a "PermissionCollectionSerializer" or "PermissionsSerializer" (for java.security.Permissions). Incidentally, we have found we do not need the ability to serialize circular object graphs. Throwable is an object that has a circular object graph, but that circular object graph can be linked up after deserialization.
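
As an illustration of the Serializer idea, a hypothetical PermissionSerializer might capture the permission's class name, target name and actions, and rebuild it through the public (String, String) constructor most Permission classes define. This is a sketch, not our actual code, and a real implementation must validate these strings:

    import java.io.InvalidObjectException;
    import java.io.ObjectStreamException;
    import java.io.Serializable;
    import java.lang.reflect.Constructor;
    import java.security.Permission;

    // Hypothetical sketch of a Permission serializer: capture the permission
    // class, target name and actions, then rebuild on deserialization.
    public final class PermissionSerializer implements Serializable {
        private static final long serialVersionUID = 1L;

        private final String type;    // permission class name
        private final String name;    // target name
        private final String actions; // actions, may be empty or null

        public PermissionSerializer(Permission p) {
            this.type = p.getClass().getName();
            this.name = p.getName();
            this.actions = p.getActions();
        }

        // Rebuild via the (String, String) constructor most Permission
        // classes define; resolve the class in a controlled loader.
        private Object readResolve() throws ObjectStreamException {
            try {
                Class<?> cl = Class.forName(type);
                Constructor<?> ctor = cl.getConstructor(String.class, String.class);
                return ctor.newInstance(name, actions);
            } catch (ReflectiveOperationException e) {
                throw new InvalidObjectException("cannot recreate permission: " + e);
            }
        }
    }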

Permission implementing Serializable is probably not too much of a threat, as these objects are effectively immutable after lazy initialization.

ProtectionDomain calls java.security.Permissions::setReadOnly during its construction.

ProtectionDomain::getPermissions returns its internal java.security.Permissions. If this is serialized, the readOnly internal state can be written to, as the internal object references are accessible from within the stream.

Admittedly, the attacker would already need to have some privilege, to have access to a ProtectionDomain, so it's a path of privilege escalation. I'm not talking about gadget attacks and deserialization of untrusted data, I'm talking about breaking encapsulation.

Even though we are heavily dependent on Java Serialization, we are very careful when we implement it, and avoid implementing it when possible. Hindsight is 20:20, but given we are now seeing some Java SE backward compatibility breakages, perhaps it might be worth considering breaking serialization. I don't mean we necessarily need to break object serial form; rather, making the Java serialization API explicit, with a subset of existing API features, would make long-term maintenance and security less of a burden, along with removing support for serialization of some objects where it is seldom used, perhaps via a JEP that requests developers to consider which library objects actually need to be serializable.

Something we do in our Java Serialization API is require that mutable deserialized objects are defensively copied during object construction (serial fields are deserialized before an object is constructed; the deserialized fields are accessible via a parameter passed in during construction). We have tools that assist developers to check that deserialized Java Collections contain the expected object types, for example, so during object construction the developer has to replace the Collection with a new instance and copy the contents to the new Collection after checking the type of each object contained therein. Also, we don't actually serialize Java Collections; we have standard serial forms for List, Set and Map, so these serial forms are equal, similar to the List, Set and Map contracts. By doing this, Collections don't actually need to implement Serializable at all, as a Serializer becomes responsible for their serialization. This also means that all Collections must be accessed by interfaces, rather than implementation classes, so the deserialization constructor must defensively copy them into their preferred Collection instance. It's a bit like dependency injection.
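
Here's a minimal sketch of that defensive-copy pattern; the class and field names are made up, and in our API the untrusted List would arrive via the deserialization constructor's argument rather than a plain parameter:

    import java.util.ArrayList;
    import java.util.List;

    // Sketch of the defensive-copy pattern: the deserialized List arrives as
    // an untrusted interface-typed argument; the constructor checks each
    // element's type and copies into its preferred implementation before use.
    public final class NameRegistry {
        private final List<String> names;

        public NameRegistry(List<?> untrusted) {
            List<String> copy = new ArrayList<>(untrusted.size());
            for (Object o : untrusted) {
                if (!(o instanceof String)) {
                    throw new IllegalArgumentException(
                            "unexpected element type: " + o.getClass());
                }
                copy.add((String) o); // defensive copy, element by element
            }
            this.names = copy;
        }
    }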

I know it would take time, and there would be some pain, but long term it would save a lot of maintenance developer time.

Regards,

Peter.

On 17/08/2019 12:50 AM, Sean Mullan wrote:
On 8/15/19 8:18 PM, Peter Firmstone wrote:
Hi Roger,

+1 for writeReplace

Personally I'd like to see some security classes break backward compatibility and remove support for serialization, as it allows someone to get references to internal objects, especially since these classes are cached by the JVM. This makes PermissionCollection.setReadOnly() very easy to bypass: just add permissions to the internal collections once you have a reference to them.

Does anyone have any use cases for serializing these objects?

These objects are easy to re-create by sending or receiving and parsing strings, because they are built from text-based policy files, and when you do that, you are validating input, so I never did fully understand why they were made serializable.

This is briefly explained on page 61 in the "Inside Java 2 Platform Security" book [1]:

"The Permission class implements two interfaces: java.security.Guard and java.io.Serializable. For the latter, the intention is that Permission objects may be transported to remote machines, such as via Remote Method Invocation (RMI), and thus a Serializable representation is useful."

The Permission class was introduced in Java SE 1.2 so there were different motivations back then :)

--Sean

[1] https://www.oracle.com/technetwork/java/javaee/index-141918.html


