Re: Thinking about Extensible Serialization support.

Peter Firmstone Sat, 30 Jan 2021 18:03:27 -0800

You're welcome, thanks for asking :)

I'm proposing an API that allows for support and implementation of any other serialization protocol (or combined serialization transport layer), so all existing River Serializable classes would implement it, to allow a standard method of access to internal Object state for implementations of serialization and re-creation of Objects during deserialization. Personally I've found the @AtomicSerial API suitable for defensively recreating objects during deserialization, but no API currently exists for access to internal state that would make it possible to decorate existing serialization implementations, so they are pluggable into River as a configuration concern.

I've also been thinking about how to allow these serialization wrappers & implementations to be part of proxy code, that is, for protocol code that doesn't exist on the client.


People are probably wondering, how might that be possible?

😉

Cheers,

Peter.


On 31/01/2021 5:40 am, Gregg Wonderly wrote:

Thanks for putting the words here (again) for reference.  Java Serialization 
and the Web with MIME are so interlinked in time that it’s hard, sometimes, to 
think about the larger implications of interchange protocols that are transport 
and language independent.  We still pay a pretty large cost for marshal and 
unmarshal activities, and edge devices don’t always have the available 
resources for full stacks.

Both packaging like JSON and encoding like Sparkplug have an impact on 
system design!

Gregg

Sent from my iPhone

On Jan 30, 2021, at 12:05 AM, Peter Firmstone <peter.firmst...@zeus.net.au> 
wrote:

Hi Gregg,

Yes, of course, if the service was using Java Serialization, the bytes would be 
the same, but if a different serialization protocol was used, the bytes would be 
different, appropriate for the serialization protocol in use. These bytes would 
be transferred over existing transport layers, such as TCP, TLS, HTTPS etc. (and 
new transport layers when created, e.g. Bluetooth).   It would be a service 
implementation choice, via configuration, although a client might reject it 
using constraints.    The implementation would be a subclass that overrides 
functionality in BasicILFactory.

To serialize object state, one must have access to internal object state.   
Java Serialization is afforded special privileges by the JVM, not afforded to 
other serialization protocols, that allow it to access private state.

Let's say, for example, a service developer wanted to use JSON or protobuf 
instead of Java Serialization. Their reason for doing so might be that their 
server side service is written in another language, such as .NET, C++, C, etc.

In order to support other languages, other JERI protocol layers would need to 
be written in those languages also.

Extending BasicILFactory is relatively straightforward; however, methods in 
BasicInvocationHandler and BasicInvocationDispatcher with parameters and return 
types using ObjectInputStream and ObjectOutputStream would need to be replaced 
with ObjectInput and ObjectOutput.  This is possible without breaking existing 
functionality.
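
As a rough illustration of why that change is non-breaking, here is a minimal 
sketch of marshalling code typed against the java.io.ObjectInput and 
java.io.ObjectOutput interfaces.  The class and method names are hypothetical 
and are not taken from BasicInvocationHandler or BasicInvocationDispatcher; the 
only point is that ObjectOutputStream and ObjectInputStream already implement 
these interfaces, so existing callers keep working.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

// Hypothetical helper, not part of JERI: marshalling code written against the
// ObjectOutput / ObjectInput interfaces instead of the concrete stream classes.
final class InterfaceTypedMarshalling {

    // Accepts any ObjectOutput, so a different serialization protocol could
    // supply its own implementation of the interface.
    static void marshalArguments(ObjectOutput out, Object[] args) throws IOException {
        out.writeInt(args.length);
        for (Object arg : args) {
            out.writeObject(arg);
        }
        out.flush();
    }

    // Reads back what marshalArguments wrote, from any ObjectInput.
    static Object[] unmarshalArguments(ObjectInput in)
            throws IOException, ClassNotFoundException {
        Object[] args = new Object[in.readInt()];
        for (int i = 0; i < args.length; i++) {
            args[i] = in.readObject();
        }
        return args;
    }

    // Existing Java Serialization streams still satisfy the interface types.
    static Object[] roundTrip(Object[] args) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            marshalArguments(out, args);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return unmarshalArguments(in);
        }
    }

    public static void main(String[] unused) throws Exception {
        Object[] result = roundTrip(new Object[] {"hello", 42});
        System.out.println(result[0] + " " + result[1]);
    }
}
```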

For simple message passing style serialization like protobuf, each parameter 
would simply use the OutputStream and InputStream from the underlying transport 
layer to send parameters and receive return values.  The bytecodes of parameter 
and return value classes for protobuf are generated from .proto schema 
definitions.   So a simple serialization layer like protobuf doesn't need a 
Serialization API to access internal object state.

For more complex object graphs, like those JSON can support, access to object 
internal state is required, as fields are sent as name value pairs.  Like Java 
Serialization, JSON can also serialize objects containing object fields.

Java Serialization can of course transmit object graphs containing circular 
references. While re-implementing Java deserialization (to address security), I 
chose not to support circular object graphs. The only class this impacted was 
Throwable, however I didn't find it difficult to work around. This 
reimplementation of deserialization is called AtomicSerial, after its failure 
atomicity.   Developers who implement @AtomicSerial are at least required to 
implement a constructor that accepts a single parameter of type GetArg.   
GetArg extends java.io.ObjectInputStream.GetField.

https://github.com/pfirmstone/JGDMS/wiki

https://pfirmstone.github.io/JGDMS/jgdms-platform/apidocs/org/apache/river/api/io/package-summary.html

AtomicSerial's public API, as implemented by developers, is suitable for any 
deserialization framework. In JGDMS all Serializable objects also implement 
@AtomicSerial.   All classes implementing @AtomicSerial are also Serializable 
and their serial form is unchanged.

The constructor argument is caller sensitive: the namespace for each class in 
an inheritance hierarchy is private, so only the calling class can see its 
serial fields. To access object state of other classes in its own inheritance 
hierarchy, a class can create an instance of that class by calling its 
constructor and passing the GetArg instance as a parameter. This makes it 
possible to validate intra-class invariants prior to creating an object 
instance.
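
To make the constructor pattern concrete, here is a minimal sketch of 
validating invariants before an instance is created.  It does not use the 
JGDMS classes: FieldValues is a hypothetical stand-in for GetArg (it is not 
caller sensitive and only handles int fields), and Range is an invented 
example class.

```java
import java.io.InvalidObjectException;
import java.util.Map;

// Hypothetical stand-in for GetArg: a read-only name -> value view of serial
// fields. Unlike the real GetArg it is not caller sensitive.
final class FieldValues {
    private final Map<String, ?> values;

    FieldValues(Map<String, ?> values) {
        this.values = values;
    }

    // Mirrors the GetField style of lookup with a default value.
    int get(String name, int defaultValue) {
        Object v = values.get(name);
        return (v instanceof Integer) ? (Integer) v : defaultValue;
    }
}

// Invented example class with an invariant: low <= high.
final class Range {
    private final int low;
    private final int high;

    // Deserialization-style constructor. The arguments of this(...) are
    // evaluated first, so check(arg) can throw before any field is assigned.
    Range(FieldValues arg) throws InvalidObjectException {
        this(arg.get("low", 0), arg.get("high", 0), check(arg));
    }

    private Range(int low, int high, boolean checked) {
        this.low = low;
        this.high = high;
    }

    // Static, so it can run before the instance is initialised.
    private static boolean check(FieldValues arg) throws InvalidObjectException {
        if (arg.get("low", 0) > arg.get("high", 0)) {
            throw new InvalidObjectException("low must not exceed high");
        }
        return true;
    }

    int low() { return low; }
    int high() { return high; }

    public static void main(String[] unused) throws Exception {
        Range r = new Range(new FieldValues(Map.of("low", 1, "high", 5)));
        System.out.println(r.low() + ".." + r.high());
    }
}
```

Because the check is an argument to the delegating constructor call, an 
invalid stream fails with an exception and no partially constructed Range 
escapes.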

I've been thinking that all that would be required to support access to internal 
object state would be for each class to implement a static method that accepts 
an instance of its own type as well as a subclass instance of 
ObjectOutputStream.PutField.  (A subclass of PutField is required to provide some 
security around creation of this parameter, as well as discovering the calling 
class, and to provide access to the stream for writing, optionally supported.)   
PutField is simply a name -> value list of internal state, however the PutField 
parameter would need to be caller sensitive, so that each class in an object's 
inheritance hierarchy has its own private state namespace.
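
A rough sketch of what such a static method might look like follows.  Since no 
such API has been implemented, everything here is an assumption made for 
illustration: StateCollector stands in for the PutField subclass, the method 
name writeState is invented, and the declaring class is passed explicitly 
rather than discovered from the caller.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a PutField subclass: collects name -> value pairs
// and keeps a separate namespace for each class in the hierarchy.
final class StateCollector {
    private final Map<Class<?>, Map<String, Object>> state = new LinkedHashMap<>();

    // A caller sensitive implementation would discover the calling class
    // itself; this sketch takes it as a parameter to stay short.
    Map<String, Object> fieldsOf(Class<?> declaringClass) {
        return state.computeIfAbsent(declaringClass, c -> new LinkedHashMap<>());
    }
}

class Shape {
    private final String name;

    Shape(String name) {
        this.name = name;
    }

    // Static accessor: the class itself publishes its private state.
    static void writeState(Shape shape, StateCollector out) {
        out.fieldsOf(Shape.class).put("name", shape.name);
    }
}

class Circle extends Shape {
    private final double radius;

    Circle(String name, double radius) {
        super(name);
        this.radius = radius;
    }

    // Each class in the hierarchy writes only into its own namespace.
    static void writeState(Circle circle, StateCollector out) {
        out.fieldsOf(Circle.class).put("radius", circle.radius);
    }

    public static void main(String[] unused) {
        StateCollector out = new StateCollector();
        Circle c = new Circle("unit", 1.0);
        Shape.writeState(c, out);
        Circle.writeState(c, out);
        System.out.println(out.fieldsOf(Shape.class) + " " + out.fieldsOf(Circle.class));
    }
}
```

A serialization wrapper for JSON or another protocol could then walk the 
collected namespaces without needing privileged access to private fields.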

So basically a different Serialization protocol layer would have 
implementations of ObjectInput and ObjectOutput and access the objects passed 
via the Invocation layer using the public Serialization Layer API.

Currently I have not implemented any such serialization API.

--
Regards,
Peter

On 30/01/2021 10:25 am, Gregg Wonderly wrote:
Can you speak to why it would be different than the stream of bytes that 
existing serialization creates through Object methods to help clarify?

Gregg

Sent from my iPhone

On Jan 29, 2021, at 3:46 PM, Peter Firmstone <peter.firmst...@zeus.net.au> 
wrote:

A question came up recently about supporting other serialization protocols.

JERI currently has three layers to its protocol stack:

Invocation layer
Object identification layer
Transport layer

Java Serialization doesn't have a public API. I think this would be one reason 
there is no serialization layer in JERI.

One might wonder why JERI needs a serialization layer, when people can 
implement an Exporter, similar to IIOP and RMI.  The answer is quite simple: 
it allows separation of the serialization layer from the transport layer, e.g. 
TLS, TCP, Kerberos or other transport layers people may wish to implement.   
Currently someone implementing an Exporter would also require a transport layer 
and that may or may not already exist.

In recent years I re-implemented de-serialization for security reasons. While 
doing so, I created a public and explicit de-serialization API. I have not 
implemented an explicit serialization API; it, or something similar, could 
easily be used as a serialization provider interface, which would allow 
wrappers for various serialization protocols to be implemented.

--
