Re: Serialzation PREVIOUSLY: RFR: 8229773: Resolve permissions for code source URLs lazily

Peter Firmstone Thu, 22 Aug 2019 22:22:16 -0700

I probably should have vetted this before hitting send... let me know if you need any clarifications.


Cheers,


Peter.

On 23/08/2019 12:59 PM, Peter Firmstone wrote:

"...since at the time the industry believed that distributed objects were going to save us from complexity.) Many of the sins of serialization were committed in the desire to get that last .1%, but the cost and benefit of that last .1% are woefully out of balance."
The following are probably non-goals, but something to consider or keep in mind, relating to distributed objects:
There are four types of distributed objects:

  1. Immutable value / data Object types.
  2. Shared Mutable Objects.
  3. Unshared Mutable Objects.
  4. Remote Objects / Services (best for managing shared mutable state).
The second type of distributed object causes much pain and should be discouraged. The first three types of distributed objects can have class resolution issues, but these are solvable.
A lot of folks also have problems deserializing objects when class visibility is different at both ends; I'm guessing this would be the same for value types.
For example, OSGi folk recommend using primitive parameter types for remote OSGi services.
RMI annotates streams with codebase annotations. Jini Extensible Remote Invocation used to do that too.
The problem with RMI and Jini codebase annotations is that if you resolve your classes locally, you lose the codebase annotations when re-serializing data, and because class visibility can be different at different endpoints, you end up with all sorts of class resolution issues. See "Class Loading Issues in Java™ RMI and Jini™ Network Technology" by Michael Warres: https://pdfs.semanticscholar.org/143f/468fcbdafd20f2b8c27fe5e0a869913b641a.pdf
The solution of course is simple: ensure that you deserialize into the same module that you serialized from, especially when deserializing in another JVM, so class resolution is identical.
We serialize a lot of complex object graphs, none are circular. The module used for serialization should have visibility of the entire graph of object classes.
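As a rough illustration of deserializing into one fixed resolution context, the sketch below pins every class lookup in a stream to a single ClassLoader by overriding ObjectInputStream.resolveClass. The names LoaderPinnedObjectInputStream and PinnedDeserializationDemo are hypothetical and are not taken from River or JGDMS; in a real deployment the loader would presumably be the one belonging to the module that performed the serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: resolves every class in the stream through one fixed ClassLoader. */
class LoaderPinnedObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderPinnedObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            // Look the class up in the pinned loader, never in the caller's loader.
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Primitive type names (e.g. "int") are not found by Class.forName;
            // the default implementation knows how to map them.
            return super.resolveClass(desc);
        }
    }
}

class PinnedDeserializationDemo {
    /** Serializes a value and reads it back with all classes resolved by the given loader. */
    static Object roundTrip(Object value, ClassLoader loader) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in = new LoaderPinnedObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()), loader)) {
                return in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        List<String> graph = new ArrayList<>(Arrays.asList("a", "b"));
        Object copy = roundTrip(graph, PinnedDeserializationDemo.class.getClassLoader());
        System.out.println(graph.equals(copy));
    }
}
```

The point of the design is that class resolution no longer depends on whichever loader happens to be on the call stack at the receiving end.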
So if we're using OSGi modules, and provide a network / remote service (not to be confused with an OSGi remote service), we ensure the proxies for these services have the same module installed at the client and server endpoints. The service is represented by a Java interface and the client makes calls on the interface's methods. This interface may be implemented by what is called a smart proxy, which is encapsulated by a module that is dynamically downloaded at runtime, or by a reflection Proxy using an InvocationHandler that is generated dynamically.
We still provide an option for codebase annotations for client parameter objects, where a client subclasses parameter types and passes them to the service, but this is discouraged; it is provided for backward compatibility only. Where the parameters are also interfaces, the client can implement a remote object and pass it as a parameter instead. In our system, this will cause a module to be loaded in the server identical to that at the client to resolve the remote object classes, without using stream codebase annotations.
Incidentally, if you're curious how this happens, a proxy is sent {I guess you can call it a serialization proxy :) } and authenticated by the remote end, security constraints are applied, then the remote end asks the proxy for a codebase URL, which is loaded into a ClassLoader with controlled visibility (this is extensible using a ServiceProvider or OSGi service), then the proxy is deserialized into this by calling a method on the serialization proxy.
By limiting scope, we can still have 99% of the benefits of distributed objects, without the pain.
Incidentally, apart from the complexity of class resolution, what really limited distributed computing was IPv4. IPv6 removes the network addressing limitations placed on distributed computing.
So I'd make the following qualifications:

  1. Use only primitive types when serializing between different
     languages.
  2. Serialize Java language Object types and primitives only between
     JVMs when class visibility is uncontrolled.
  3. When serializing other object types, ensure they are immutable if
     shared and that class visibility is identical and managed at both
     endpoints.
  4. Do not serialize objects whose classes may not be resolvable
     (when you need to depend on annotated streams and uncontrolled
     class resolution for example), find another way to solve the
     problem.

We've had 20 years to iron out the wrinkles. :)

Regards,

Peter.

On 23/08/2019 7:36 AM, Peter Firmstone wrote:
Hi Sean,
Regarding the section entitled "Why not write a new serialization library?", unlike the serialization libraries listed, our purpose was to be able to securely deserialize untrusted data, while maintaining backward serial form compatibility with Java Serialization, provided it didn't compromise security.
We don't use blacklists or whitelists; we use permissions to grant DeserializationPermission. It doesn't have the granularity of whitelists, but then, classes that implement @AtomicSerial are supposed to be hardened implementations in any case.
If it can be of use, feel free to experiment with it; hopefully it might help with some of your design decisions:
https://github.com/pfirmstone/JGDMS/tree/trunk/JGDMS/jgdms-platform/src/main/java/org/apache/river/api/io
Much of the code on this site provides implementation examples as well.

Regards,

Peter.

On 20/08/2019 7:55 AM, Sean Mullan wrote:
Brian Goetz (copied) has done a lot of thinking in the serialization area, so I have copied him. Not sure if you have seen it but he recently posted a document about some of his ideas and possible future directions for serialization: http://cr.openjdk.java.net/~briangoetz/amber/serialization.html
--Sean

On 8/17/19 10:22 PM, Peter Firmstone wrote:
Thanks Sean,
You've gone to some trouble to answer my question, which demonstrates you have considered it.
I donate some time to help maintain Apache River, derived from Sun's Jini. Once Jini depended on RMI; today, not so much. It still has some dependencies on some RMI interfaces, but doesn't utilise JRMP, although it provides some backward compatibility to enable it.
But my point is, we heavily utilise Java Serialization, and have an independent implementation of a subset of Java Serialization (originating from Apache Harmony). We do this for security, as we use an annotated serialization constructor. Serial form is unchanged. We have Serializers for commonly used Java library objects; for example, we have a "PermissionSerializer", but we don't have a "PermissionCollectionSerializer" or "PermissionsSerializer" (for java.security.Permissions). Incidentally, we have found we do not need the ability to serialize circular object graphs. Throwable is an object that has a circular object graph, but that circular object graph can be linked up after deserialization.
Permission implementing Serializable is probably not too much of a threat, as these objects are effectively immutable after lazy initialization.
ProtectionDomain calls java.security.Permissions::setReadOnly during its construction.
ProtectionDomain::getPermissions returns the internal java.security.Permissions. If this is serialized, then the readOnly internal state can be written to, as the internal object references are accessible from within the stream.
Admittedly, the attacker would already need to have some privilege to have access to a ProtectionDomain, so it's a path of privilege escalation. I'm not talking about gadget attacks and deserialization of untrusted data, I'm talking about breaking encapsulation.
Even though we are heavily dependent on Java Serialization, we are very careful when we implement it, and avoid implementing it when possible. Hindsight is 20:20, but given we are now seeing some Java SE backward compatibility breakages, perhaps it might be worth considering breaking serialization. I don't mean we need to necessarily break object serial form, but making the Java serialization API explicit with a subset of existing API features, so that long-term maintenance and security are less of a burden, and removing support for Serialization of some objects where it is seldom used, perhaps using a JEP that requests developers to consider which library objects actually need to be serializable.
Something we do in our Java Serialization API is require that mutable deserialized objects are defensively copied during object construction (serial fields are deserialized before an object is constructed; the deserialized fields are accessible via a parameter passed in during construction). We have tools that assist developers to check that deserialized Java Collections contain the expected object types, for example, so during object construction the developer has to replace the Collection with a new instance and copy the contents to the new Collection after checking the type of each object contained therein. Also, we don't actually serialize Java Collections; we have standard serial forms for List, Set and Map, so these serial forms are equal, similar to the List, Set and Map contracts. By doing this, Collections don't actually need to implement Serializable at all, as a Serializer becomes responsible for their serialization. This also means that all Collections must be accessed by interfaces, rather than implementation classes, so the deserialization constructor must defensively copy them into their preferred Collection instance. It's a bit like dependency injection.
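A minimal sketch of the defensive-copy idea, in plain Java and under stated assumptions: the Roster class, its Map-based constructor parameter, and the copyChecked helper are hypothetical stand-ins for illustration only, and are not the @AtomicSerial API from the repository linked elsewhere in this thread. The constructor receives already-deserialized field values, checks each element's type, and copies them into a collection of its own choosing.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical class whose state is rebuilt from deserialized field values. */
final class Roster {
    private final List<String> names;

    /**
     * Hypothetical deserialization constructor: 'fields' maps field names to
     * values that were read from the stream before this object existed.
     */
    Roster(Map<String, Object> fields) {
        // Validate and copy before any reference is stored in the new object.
        this.names = Collections.unmodifiableList(
                copyChecked(fields.get("names"), String.class));
    }

    /** Copies a deserialized collection into a fresh list, rejecting unexpected element types. */
    static <T> List<T> copyChecked(Object candidate, Class<T> type) {
        if (!(candidate instanceof Collection)) {
            throw new IllegalArgumentException("expected a Collection");
        }
        List<T> copy = new ArrayList<>();
        for (Object element : (Collection<?>) candidate) {
            if (!type.isInstance(element)) {
                // Also rejects null elements, since isInstance(null) is false.
                throw new IllegalArgumentException("unexpected element type");
            }
            copy.add(type.cast(element));
        }
        return copy;
    }

    List<String> names() {
        return names;
    }
}
```

Because the object keeps only its own copy, a reference to the original collection held elsewhere in the stream cannot be used to mutate the object's state afterwards.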
I know it would take time, and there would be some pain, but long term it would save a lot of maintenance developer time.
Regards,

Peter.

On 17/08/2019 12:50 AM, Sean Mullan wrote:
On 8/15/19 8:18 PM, Peter Firmstone wrote:
Hi Roger,

+1 for writeReplace
Personally I'd like to see some security classes break backward compatibility and remove support for serialization, as it allows someone to get references to internal objects, especially since these classes are cached by the JVM. That makes PermissionCollection.setReadOnly() very easy to bypass, by adding permissions to internal collections once you have a reference to them.
Does anyone have any use cases for serializing these objects?
These objects are easy to re-create by sending or receiving and parsing strings, because they are built from text-based policy files, and when you do that, you are validating input, so I never did fully understand why they were made serializable.
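To make the string-based alternative concrete, here is a small sketch that rebuilds a Permission from text rather than from a serialized object. The one-line text format (loosely modelled on a policy-file entry), the PermissionParser name, and the fixed set of supported types are all assumptions made for illustration; the constructors of the JDK permission classes validate the name and actions themselves.

```java
import java.io.FilePermission;
import java.net.SocketPermission;
import java.security.Permission;
import java.util.PropertyPermission;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser: rebuilds a Permission from text instead of a serialized object. */
final class PermissionParser {
    // Assumed text form: <class name> "<name>", "<actions>"
    private static final Pattern ENTRY =
            Pattern.compile("^(\\S+)\\s+\"([^\"]*)\",\\s*\"([^\"]*)\"$");

    static Permission parse(String text) {
        Matcher m = ENTRY.matcher(text.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("malformed entry: " + text);
        }
        String type = m.group(1);
        String name = m.group(2);
        String actions = m.group(3);
        // Only a fixed set of types is constructed; the constructors reject
        // invalid names or actions with IllegalArgumentException.
        switch (type) {
            case "java.io.FilePermission":
                return new FilePermission(name, actions);
            case "java.net.SocketPermission":
                return new SocketPermission(name, actions);
            case "java.util.PropertyPermission":
                return new PropertyPermission(name, actions);
            default:
                throw new IllegalArgumentException("unsupported permission type: " + type);
        }
    }
}
```

For example, PermissionParser.parse("java.io.FilePermission \"/tmp/-\", \"read\"") yields a fresh FilePermission whose internal state was never reachable through a stream.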
This is briefly explained on page 61 in the "Inside Java 2 Platform Security" book [1]:
"The Permission class implements two interfaces: java.security.Guard and java.io.Serializable. For the latter, the intention is that Permission objects may be transported to remote machines, such as via Remote Method Invocation (RMI), and thus a Serializable representation is useful."
The Permission class was introduced in Java SE 1.2 so there were different motivations back then :)
--Sean

[1] https://www.oracle.com/technetwork/java/javaee/index-141918.html
