This is an automated email from the ASF dual-hosted git repository.
chaokunyang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/fury-site.git
The following commit(s) were added to refs/heads/main by this push:
new aaee6cf3 🔄 synced local 'docs/guide/' with remote 'docs/guide/'
aaee6cf3 is described below
commit aaee6cf3cc972b1e4561707348c90cc056a43cda
Author: chaokunyang <[email protected]>
AuthorDate: Fri Feb 7 06:12:00 2025 +0000
🔄 synced local 'docs/guide/' with remote 'docs/guide/'
---
docs/guide/java_serialization_guide.md | 187 ++++++++++++++++++++++-----------
1 file changed, 124 insertions(+), 63 deletions(-)
diff --git a/docs/guide/java_serialization_guide.md b/docs/guide/java_serialization_guide.md
index 9a792d7b..c4ee8f1d 100644
--- a/docs/guide/java_serialization_guide.md
+++ b/docs/guide/java_serialization_guide.md
@@ -102,7 +102,7 @@ public class Example {
| `compressLong` | Enables or disables long compression
for smaller size.
[...]
| `compressString` | Enables or disables string compression
for smaller size.
[...]
| `classLoader` | The classloader should not be updated;
Fury caches class metadata. Use `LoaderBinding` or `ThreadSafeFury` for
classloader updates.
[...]
-| `compatibleMode` | Type forward/backward compatibility
config. Also Related to `checkClassVersion` config. `SCHEMA_CONSISTENT`: Class
schema must be consistent between serialization peer and deserialization peer.
`COMPATIBLE`: Class schema can be different between serialization peer and
deserialization peer. They can add/delete fields independently. [See
more](#class-inconsistency-and-class-version-check).
[...]
+| `compatibleMode` | Type forward/backward compatibility
config. Also Related to `checkClassVersion` config. `SCHEMA_CONSISTENT`: Class
schema must be consistent between serialization peer and deserialization peer.
`COMPATIBLE`: Class schema can be different between serialization peer and
deserialization peer. They can add/delete fields independently. [See
more](#class-inconsistency-and-class-version-check).
[...]
| `checkClassVersion` | Determines whether to check the
consistency of the class schema. If enabled, Fury checks, writes, and checks
consistency using the `classVersionHash`. It will be automatically disabled
when `CompatibleMode#COMPATIBLE` is enabled. Disabling is not recommended
unless you can ensure the class won't evolve.
[...]
| `checkJdkClassSerializable` | Enables or disables checking of
`Serializable` interface for classes under `java.*`. If a class under `java.*`
is not `Serializable`, Fury will throw an `UnsupportedOperationException`.
[...]
| `registerGuavaTypes` | Whether to pre-register Guava types
such as `RegularImmutableMap`/`RegularImmutableList`. These types are not
public API, but seem pretty stable.
[...]
@@ -125,7 +125,7 @@ public class Example {
Single thread fury:
```java
-Fury fury=Fury.builder()
+Fury fury = Fury.builder()
.withLanguage(Language.JAVA)
// Enable reference tracking for shared/circular references.
// Disabling it gives better performance when there are no duplicate references.
@@ -137,14 +137,14 @@ Fury fury=Fury.builder()
// enable async multi-threaded compilation.
.withAsyncCompilation(true)
.build();
- byte[]bytes=fury.serialize(object);
- System.out.println(fury.deserialize(bytes));
+byte[] bytes = fury.serialize(object);
+System.out.println(fury.deserialize(bytes));
```
Thread-safe fury:
```java
-ThreadSafeFury fury=Fury.builder()
+ThreadSafeFury fury = Fury.builder()
.withLanguage(Language.JAVA)
// Enable reference tracking for shared/circular references.
// Disabling it gives better performance when there are no duplicate references.
@@ -160,10 +160,45 @@ ThreadSafeFury fury=Fury.builder()
// enable async multi-threaded compilation.
.withAsyncCompilation(true)
.buildThreadSafeFury();
- byte[]bytes=fury.serialize(object);
- System.out.println(fury.deserialize(bytes));
+byte[] bytes = fury.serialize(object);
+System.out.println(fury.deserialize(bytes));
```
+### Handling Class Schema Evolution in Serialization
+
+In many systems, the schema of a class used for serialization may change over time. For instance, fields within a class
+may be added or removed. When serialization and deserialization processes use different versions of jars, the schema of
+the class being deserialized may differ from the one used during serialization.
+
+By default, Fury serializes objects using the `CompatibleMode.SCHEMA_CONSISTENT` mode. This mode assumes that the
+deserialization process uses the same class schema as the serialization process, minimizing payload overhead.
+However, if there is a schema inconsistency, deserialization will fail.
+
+If the schema is expected to change, users must configure Fury to use `CompatibleMode.COMPATIBLE` so that
+deserialization still succeeds, i.e. to get schema forward/backward compatibility. This can be done using the
+`FuryBuilder#withCompatibleMode(CompatibleMode.COMPATIBLE)` method.
+In this compatible mode, deserialization can handle schema changes such as missing or extra fields, allowing it to
+succeed even when the serialization and deserialization processes have different class schemas.
+
+Here is an example of creating Fury to support schema evolution:
+
+```java
+Fury fury = Fury.builder()
+ .withCompatibleMode(CompatibleMode.COMPATIBLE)
+ .build();
+
+byte[] bytes = fury.serialize(object);
+System.out.println(fury.deserialize(bytes));
+```
+
+This compatible mode involves serializing class metadata into the serialized output. Despite Fury's use of
+sophisticated compression techniques to minimize overhead, there is still some additional space cost associated with
+class metadata.
+
+To further reduce metadata costs, Fury introduces a class metadata sharing mechanism, which allows the metadata to be
+sent to the deserialization process only once. For more details, please refer to the [Meta Sharing](#MetaSharing)
+section.
+
### Smaller size
`FuryBuilder#withIntCompressed`/`FuryBuilder#withLongCompressed` can be used
to compress int/long for smaller size.
@@ -184,9 +219,9 @@ For long compression, fury support two encoding:
- Otherwise write as 9 bytes: `| 0b1 | little-endian 8bytes long |`
- Fury PVL(Progressive Variable-length Long) Encoding:
- The first bit of every byte indicates whether another byte follows. If it is set, the next byte is read, until the
- first bit of next byte is unset.
+ first bit of next byte is unset.
- Negative number will be converted to positive number by `(v << 1) ^ (v >>
63)` to reduce cost of small negative
- numbers.
+ numbers.
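
The zigzag step and the continuation-bit idea described in the list above can be sketched in plain Java. This is a minimal illustration, not Fury's implementation: the class and method names are hypothetical, the decoder is the standard inverse of the formula quoted above, and `continuationBytes` counts bytes for a generic scheme with 7 payload bits per byte, which may differ from Fury's exact PVL layout.

```java
public class ZigZagSketch {
  // Maps small negative numbers to small non-negative ones,
  // using the formula quoted in the guide: (v << 1) ^ (v >> 63).
  static long zigZagEncode(long v) {
    return (v << 1) ^ (v >> 63);
  }

  // Standard inverse of zigZagEncode (an assumption, not taken from the guide).
  static long zigZagDecode(long z) {
    return (z >>> 1) ^ -(z & 1);
  }

  // Bytes needed when every byte carries 7 payload bits plus one
  // "another byte follows" bit. The argument is treated as unsigned.
  static int continuationBytes(long unsigned) {
    int n = 1;
    while ((unsigned & ~0x7FL) != 0) {
      unsigned >>>= 7;
      n++;
    }
    return n;
  }

  public static void main(String[] args) {
    for (long v : new long[] {0, -1, 1, -2, 123456789L}) {
      long z = zigZagEncode(v);
      System.out.println(v + " -> " + z + " (" + continuationBytes(z)
          + " byte(s)), decoded back to " + zigZagDecode(z));
    }
  }
}
```

Without the zigzag step, a small negative number such as `-1` has all 64 bits set and would need the maximum number of bytes in a continuation-bit scheme; after the transform it fits in a single byte.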
If a number is of `long` type and mostly can't be represented in fewer bytes, the compression won't give a good enough
result,
@@ -199,22 +234,18 @@ space savings.
Deep copy example:
```java
-Fury fury=Fury.builder()
- ...
- .withRefCopy(true).build();
- SomeClass a=xxx;
- SomeClass copied=fury.copy(a)
+Fury fury = Fury.builder().withRefCopy(true).build();
+SomeClass a = xxx;
+SomeClass copied = fury.copy(a);
```
To make Fury deep copy ignore circular and shared references, disable reference copying. In this mode, an object that is
referenced several times in an object graph is copied into different objects within one `Fury#copy` call.
```java
-Fury fury=Fury.builder()
- ...
- .withRefCopy(false).build();
- SomeClass a=xxx;
- SomeClass copied=fury.copy(a)
+Fury fury = Fury.builder().withRefCopy(false).build();
+SomeClass a = xxx;
+SomeClass copied = fury.copy(a);
```
### Implement a customized serializer
@@ -257,8 +288,8 @@ class FooSerializer extends Serializer<Foo> {
Register serializer:
```java
-Fury fury=getFury();
- fury.registerSerializer(Foo.class,new FooSerializer(fury));
+Fury fury = getFury();
+fury.registerSerializer(Foo.class, new FooSerializer(fury));
```
### Security & Class Registration
@@ -279,9 +310,9 @@ Note that class registration order is important,
serialization and deserializati
should have same registration order.
```java
-Fury fury=xxx;
- fury.register(SomeClass.class);
- fury.register(SomeClass1.class,200);
+Fury fury = xxx;
+fury.register(SomeClass.class);
+fury.register(SomeClass1.class, 200);
```
If you invoke `FuryBuilder#requireClassRegistration(false)` to disable class
registration check,
@@ -290,19 +321,20 @@ allowed
for serialization. For example, you can allow classes whose names start with `org.example.` by:
```java
-Fury fury=xxx;
-
fury.getClassResolver().setClassChecker((classResolver,className)->className.startsWith("org.example."));
+Fury fury = xxx;
+fury.getClassResolver().setClassChecker(
+ (classResolver, className) -> className.startsWith("org.example."));
```
```java
-AllowListChecker checker=new
AllowListChecker(AllowListChecker.CheckLevel.STRICT);
- ThreadSafeFury fury=new ThreadLocalFury(classLoader->{
- Fury
f=Fury.builder().requireClassRegistration(true).withClassLoader(classLoader).build();
+AllowListChecker checker = new AllowListChecker(AllowListChecker.CheckLevel.STRICT);
+ThreadSafeFury fury = new ThreadLocalFury(classLoader -> {
+ Fury f = Fury.builder().requireClassRegistration(true).withClassLoader(classLoader).build();
f.getClassResolver().setClassChecker(checker);
checker.addListener(f.getClassResolver());
return f;
- });
- checker.allowClass("org.example.*");
+});
+checker.allowClass("org.example.*");
```
Fury also provides `org.apache.fury.resolver.AllowListChecker`, an allow/disallow list based checker, to
@@ -360,30 +392,30 @@ forward/backward compatibility automatically.
// // share meta across serialization.
// .withMetaContextShare(true)
// Not thread-safe fury.
-MetaContext context=xxx;
- fury.getSerializationContext().setMetaContext(context);
- byte[]bytes=fury.serialize(o);
+MetaContext context = xxx;
+fury.getSerializationContext().setMetaContext(context);
+byte[] bytes = fury.serialize(o);
// Not thread-safe fury.
- MetaContext context=xxx;
- fury.getSerializationContext().setMetaContext(context);
- fury.deserialize(bytes)
+MetaContext context = xxx;
+fury.getSerializationContext().setMetaContext(context);
+fury.deserialize(bytes);
// Thread-safe fury
- fury.setClassLoader(beanA.getClass().getClassLoader());
- byte[]serialized=fury.execute(
- f->{
- f.getSerializationContext().setMetaContext(context);
- return f.serialize(beanA);
+fury.setClassLoader(beanA.getClass().getClassLoader());
+byte[] serialized = fury.execute(
+ f -> {
+ f.getSerializationContext().setMetaContext(context);
+ return f.serialize(beanA);
}
- );
+);
// thread-safe fury
- fury.setClassLoader(beanA.getClass().getClassLoader());
- Object newObj=fury.execute(
- f->{
- f.getSerializationContext().setMetaContext(context);
- return f.deserialize(serialized);
+fury.setClassLoader(beanA.getClass().getClassLoader());
+Object newObj = fury.execute(
+ f -> {
+ f.getSerializationContext().setMetaContext(context);
+ return f.deserialize(serialized);
}
- );
+);
```
### Deserialize non-existent classes
@@ -404,10 +436,10 @@ Fury support mapping object from one type to another type.
> Notes:
>
> 1. This mapping will execute a deep copy, all mapped fields are serialized
> into binary and
-deserialized from that binary to map into another type.
+ deserialized from that binary to map into another type.
> 2. All struct types must be registered with the same ID, otherwise Fury cannot
> map to the correct struct type.
-> Be careful when you use `Fury#register(Class)`, because fury will allocate
an auto-grown ID which might be
-> inconsistent if you register classes with different order between Fury
instance.
+ > Be careful when you use `Fury#register(Class)`, because Fury will allocate an auto-grown ID which might be
+ > inconsistent if you register classes in a different order between Fury instances.
```java
public class StructMappingExample {
@@ -460,12 +492,12 @@ the binary are generated by jdk serialization, you use
following pattern to make
then upgrade serialization to fury in an async rolling-up way:
```java
-if(JavaSerializer.serializedByJDK(bytes)){
+if (JavaSerializer.serializedByJDK(bytes)) {
ObjectInputStream objectInputStream=xxx;
return objectInputStream.readObject();
- }else{
+} else {
return fury.deserialize(bytes);
- }
+}
```
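
One plausible way to tell JDK-serialized bytes apart is to look at the stream header: `ObjectOutputStream` always starts its output with the magic number `ObjectStreamConstants.STREAM_MAGIC` (`0xACED`). The sketch below shows that check using only the JDK. The class and method names are hypothetical, and this is not presented as how `JavaSerializer.serializedByJDK` is implemented; it is a heuristic that only inspects the first two bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;

public class JdkStreamSniffer {
  // Returns true if the bytes start with the JDK serialization magic number.
  static boolean looksLikeJdkSerialized(byte[] bytes) {
    if (bytes == null || bytes.length < 2) {
      return false;
    }
    // The stream header is written big-endian.
    short magic = (short) (((bytes[0] & 0xFF) << 8) | (bytes[1] & 0xFF));
    return magic == ObjectStreamConstants.STREAM_MAGIC;
  }

  // Helper for the demo: serialize an object with plain JDK serialization.
  static byte[] jdkSerialize(Object o) {
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
      oos.writeObject(o);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bos.toByteArray();
  }

  public static void main(String[] args) {
    System.out.println(looksLikeJdkSerialized(jdkSerialize("hello")));
    System.out.println(looksLikeJdkSerialized(new byte[] {1, 2, 3}));
  }
}
```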
### Upgrade fury
@@ -482,18 +514,18 @@ serialized data
using code like following to keep binary compatibility:
```java
-MemoryBuffer buffer=xxx;
- buffer.writeVarInt32(2);
- fury.serialize(buffer,obj);
+MemoryBuffer buffer = xxx;
+buffer.writeVarInt32(2);
+fury.serialize(buffer, obj);
```
Then for deserialization, you need:
```java
-MemoryBuffer buffer=xxx;
- int furyVersion=buffer.readVarInt32()
- Fury fury=getFury(furyVersion);
- fury.deserialize(buffer);
+MemoryBuffer buffer = xxx;
+int furyVersion = buffer.readVarInt32();
+Fury fury = getFury(furyVersion);
+fury.deserialize(buffer);
```
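
The version-prefix pattern above can be illustrated without Fury at all. The sketch below is a stand-in under stated assumptions: `VersionPrefixSketch`, `frame`, and `decode` are hypothetical names, the version is written as a fixed 4-byte int rather than a varint, and the per-version decoders are ordinary functions standing in for differently shaded Fury instances.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class VersionPrefixSketch {
  // Prepends the serializer version to the payload.
  static byte[] frame(int version, byte[] payload) {
    ByteBuffer buf = ByteBuffer.allocate(Integer.BYTES + payload.length);
    buf.putInt(version).put(payload);
    return buf.array();
  }

  // Reads the version first, then hands the rest to the matching decoder.
  static <T> T decode(byte[] framed, Map<Integer, Function<byte[], T>> decoders) {
    ByteBuffer buf = ByteBuffer.wrap(framed);
    int version = buf.getInt();
    Function<byte[], T> decoder = decoders.get(version);
    if (decoder == null) {
      throw new IllegalArgumentException("No decoder registered for version " + version);
    }
    byte[] payload = new byte[buf.remaining()];
    buf.get(payload);
    return decoder.apply(payload);
  }

  public static void main(String[] args) {
    Map<Integer, Function<byte[], String>> decoders = new HashMap<>();
    decoders.put(1, b -> "v1:" + new String(b, StandardCharsets.UTF_8));
    decoders.put(2, b -> "v2:" + new String(b, StandardCharsets.UTF_8));

    byte[] framed = frame(2, "payload".getBytes(StandardCharsets.UTF_8));
    System.out.println(decode(framed, decoders));
  }
}
```

Keeping the version outside the serialized payload means the reader can pick a decoder before touching any version-specific bytes.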
`getFury` is a method that loads the corresponding Fury; you can shade and relocate
different versions of Fury to different
@@ -520,9 +552,38 @@ consistent between serialization and deserialization.
### Deserialize POJO into another type
-Fury allows you to serialize one POJO and deserialize it into a different
POJO. To achieve this, configure Fury with
+Fury allows you to serialize one POJO and deserialize it into a different POJO. Because the two POJOs have different schemas, users must configure Fury with
`CompatibleMode` set to `org.apache.fury.config.CompatibleMode.COMPATIBLE`.
+```java
+public class DeserializeIntoType {
+ static class Struct1 {
+ int f1;
+ String f2;
+
+ public Struct1(int f1, String f2) {
+ this.f1 = f1;
+ this.f2 = f2;
+ }
+ }
+
+ static class Struct2 {
+ int f1;
+ String f2;
+ double f3;
+ }
+
+ static ThreadSafeFury fury = Fury.builder()
+ .withCompatibleMode(CompatibleMode.COMPATIBLE).buildThreadSafeFury();
+
+ public static void main(String[] args) {
+ Struct1 struct1 = new Struct1(10, "abc");
+ byte[] data = fury.serializeJavaObject(struct1);
+ Struct2 struct2 = (Struct2) fury.deserializeJavaObject(data, Struct2.class);
+ }
+}
+```
+
### Use wrong API for deserialization
If you serialize an object by invoking `Fury#serialize`, you should invoke
`Fury#deserialize` for deserialization