This is an automated email from the ASF dual-hosted git repository.
chaokunyang pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/fury-site.git
The following commit(s) were added to refs/heads/main by this push:
new 82e95f7 docs: translate communtiy doc (#154)
82e95f7 is described below
commit 82e95f7a0aeb26740efc1ead2d3bbc90ad271b62
Author: YuLuo <[email protected]>
AuthorDate: Sun Aug 18 21:31:56 2024 +0800
docs: translate communtiy doc (#154)
Signed-off-by: yuluo-yx <[email protected]>
Co-authored-by: Shawn Yang <[email protected]>
---
docs/guide/DEVELOPMENT.md | 3 +-
docs/guide/graalvm_guide.md | 29 ++-
docs/guide/java_serialization_guide.md | 16 +-
docs/guide/row_format_guide.md | 12 ++
docs/guide/scala_guide.md | 27 ++-
docs/guide/xlang_serialization_guide.md | 239 +++++++++++----------
docs/guide/xlang_type_mapping.md | 6 +-
docs/specification/java_serialization_spec.md | 70 +++---
docs/specification/row_format_spec.md | 3 +-
docs/specification/xlang_serialization_spec.md | 110 +++++-----
.../current/community/community.md | 106 +++++----
.../current/community/how_to_join_community.md | 2 +-
static/img/benchmarks/README.md | 34 ++-
13 files changed, 371 insertions(+), 286 deletions(-)
diff --git a/docs/guide/DEVELOPMENT.md b/docs/guide/DEVELOPMENT.md
index 3949dcb..5cbdd86 100644
--- a/docs/guide/DEVELOPMENT.md
+++ b/docs/guide/DEVELOPMENT.md
@@ -4,7 +4,7 @@ sidebar_position: 7
id: development
---
-# How to build Fury
+## How to build Fury
Please checkout the source tree from https://github.com/apache/fury.
@@ -99,4 +99,3 @@ npm run test
- node 14+
- npm 8+
-
diff --git a/docs/guide/graalvm_guide.md b/docs/guide/graalvm_guide.md
index 3ed919f..afaaa15 100644
--- a/docs/guide/graalvm_guide.md
+++ b/docs/guide/graalvm_guide.md
@@ -4,7 +4,8 @@ sidebar_position: 6
id: graalvm_guide
---
-# GraalVM Native Image
+## GraalVM Native Image
+
GraalVM `native image` can compile java code into native code ahead to build
faster, smaller, leaner applications.
The native image doesn't have a JIT compiler to compile bytecode into machine
code, and doesn't support
reflection unless configure reflection file.
@@ -16,6 +17,7 @@ In order to use Fury on graalvm native image, you must create Fury as an **stati
the enclosing class initialize time. Then configure `native-image.properties`
under
`resources/META-INF/native-image/$xxx/native-image.properties` to tell graalvm
to init the class at native image
build time. For example, here we configure `org.apache.fury.graalvm.Example`
class be init at build time:
+
```properties
Args = --initialize-at-build-time=org.apache.fury.graalvm.Example
```
@@ -29,7 +31,9 @@ Note that Fury `asyncCompilationEnabled` option will be disabled automatically f
native image doesn't support JIT at the image run time.
## Not thread-safe Fury
+
Example:
+
```java
import org.apache.fury.Fury;
import org.apache.fury.util.Preconditions;
@@ -63,12 +67,15 @@ public class Example {
}
}
```
+
Then add `org.apache.fury.graalvm.Example` build time init to
`native-image.properties` configuration:
+
```properties
Args = --initialize-at-build-time=org.apache.fury.graalvm.Example
```
## Thread-safe Fury
+
```java
import org.apache.fury.Fury;
import org.apache.fury.ThreadLocalFury;
@@ -109,32 +116,40 @@ public class ThreadSafeExample {
}
}
```
+
Then add `org.apache.fury.graalvm.ThreadSafeExample` build time init to
`native-image.properties` configuration:
+
```properties
Args = --initialize-at-build-time=org.apache.fury.graalvm.ThreadSafeExample
```
## Framework Integration
+
For framework developers, if you want to integrate fury for serialization, you
can provided a configuration file to let
the users to list all the classes they want to serialize, then you can load
those classes and invoke
`org.apache.fury.Fury.register(Class<?>, boolean)` to register those classes
in your Fury integration class, and configure that
class be initialized at graalvm native image build time.
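As a rough illustration of the integration flow described above, the following sketch loads class names from a user-supplied list and passes each loaded class to a registration callback. `ClassListRegistrar`, `registerAll`, and the callback are hypothetical names, not part of Fury. In a real integration the callback would presumably call `Fury.register(Class<?>, boolean)` on the framework's Fury instance, and the boolean value `true` used here is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

/** Hypothetical helper for illustration only; not part of Fury. */
final class ClassListRegistrar {
  private ClassListRegistrar() {}

  /**
   * Loads every class named in classNames and hands it to the callback.
   * Blank entries and entries starting with '#' are skipped.
   */
  static List<Class<?>> registerAll(
      List<String> classNames, BiConsumer<Class<?>, Boolean> register)
      throws ClassNotFoundException {
    List<Class<?>> loaded = new ArrayList<>();
    for (String raw : classNames) {
      String name = raw.trim();
      if (name.isEmpty() || name.startsWith("#")) {
        continue;
      }
      Class<?> cls = Class.forName(name);
      // Stand-in for fury.register(cls, true); the boolean value is an assumption.
      register.accept(cls, Boolean.TRUE);
      loaded.add(cls);
    }
    return loaded;
  }
}
```

A framework integration class could call `registerAll(names, (cls, flag) -> fury.register(cls, flag))` from a static initializer, so that registration happens when the class is initialized at native image build time.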
## Benchmark
+
Here we give two class benchmarks between Fury and Graalvm Serialization.
When Fury compression is disabled:
+
- Struct: Fury is `46x speed, 43% size` compared to JDK.
- Pojo: Fury is `12x speed, 56% size` compared to JDK.
When Fury compression is enabled:
+
- Struct: Fury is `24x speed, 31% size` compared to JDK.
- Pojo: Fury is `12x speed, 48% size` compared to JDK.
See [[Benchmark.java](https://github.com/apache/fury/blob/main/integration_tests/graalvm_tests/src/main/java/org/apache/fury/graalvm/Benchmark.java)] for benchmark code.
### Struct Benchmark
+
#### Class Fields
+
```java
public class Struct implements Serializable {
public int f1;
@@ -151,8 +166,11 @@ public class Struct implements Serializable {
public double f12;
}
```
+
#### Benchmark Results
+
No compression:
+
```
Benchmark repeat number: 400000
Object type: class org.apache.fury.graalvm.Struct
@@ -164,7 +182,9 @@ JDK serialization took mills: 2254
Compare speed: Fury is 45.70x speed of JDK
Compare size: Fury is 0.43x size of JDK
```
+
Compress number:
+
```
Benchmark repeat number: 400000
Object type: class org.apache.fury.graalvm.Struct
@@ -178,7 +198,9 @@ Compare size: Fury is 0.31x size of JDK
```
### Pojo Benchmark
+
#### Class Fields
+
```java
public class Foo implements Serializable {
int f1;
@@ -187,8 +209,11 @@ public class Foo implements Serializable {
Map<String, Long> f4;
}
```
+
#### Benchmark Results
+
No compression:
+
```
Benchmark repeat number: 400000
Object type: class org.apache.fury.graalvm.Foo
@@ -200,7 +225,9 @@ JDK serialization took mills: 16266
Compare speed: Fury is 12.19x speed of JDK
Compare size: Fury is 0.56x size of JDK
```
+
Compress number:
+
```
Benchmark repeat number: 400000
Object type: class org.apache.fury.graalvm.Foo
diff --git a/docs/guide/java_serialization_guide.md b/docs/guide/java_serialization_guide.md
index de179f4..17f5573 100644
--- a/docs/guide/java_serialization_guide.md
+++ b/docs/guide/java_serialization_guide.md
@@ -4,8 +4,6 @@ sidebar_position: 0
id: java_object_graph_guide
---
-# Java object graph serialization
-
When only java object serialization needed, this mode will have better
performance compared to cross-language object
graph serialization.
@@ -179,12 +177,12 @@ bit is set, then next byte will be read util first bit of next byte is unset.
For long compression, fury support two encoding:
- Fury SLI(Small long as int) Encoding (**used by default**):
- - If long is in [-1073741824, 1073741823], encode as 4 bytes int: `|
little-endian: ((int) value) << 1 |`
- - Otherwise write as 9 bytes: `| 0b1 | little-endian 8bytes long |`
+ - If long is in [-1073741824, 1073741823], encode as 4 bytes int: `|
little-endian: ((int) value) << 1 |`
+ - Otherwise write as 9 bytes: `| 0b1 | little-endian 8bytes long |`
- Fury PVL(Progressive Variable-length Long) Encoding:
- - First bit in every byte indicate whether has next byte. if first bit is
set, then next byte will be read util
+ - First bit in every byte indicate whether has next byte. if first bit is
set, then next byte will be read util
first bit of next byte is unset.
- - Negative number will be converted to positive number by ` (v << 1) ^ (v
>> 63)` to reduce cost of small negative
+ - Negative number will be converted to positive number by `(v << 1) ^ (v >>
63)` to reduce cost of small negative
numbers.
If a number are `long` type, it can't be represented by smaller bytes mostly,
the compression won't get good enough
@@ -218,11 +216,7 @@ Fury fury=Fury.builder()
### Implement a customized serializer
-In some cases, you may want to implement a serializer for your type,
especially some class customize serialization by
-JDK
-writeObject/writeReplace/readObject/readResolve, which is very inefficient.
For example, you don't want
-following `Foo#writeObject`
-got invoked, you can take following `FooSerializer` as an example:
+In some cases, you may want to implement a serializer for your type,
especially some class customize serialization by JDK
writeObject/writeReplace/readObject/readResolve, which is very inefficient. For
example, you don't want following `Foo#writeObject` got invoked, you can take
following `FooSerializer` as an example:
```java
class Foo {
diff --git a/docs/guide/row_format_guide.md b/docs/guide/row_format_guide.md
index 076a2c0..1083297 100644
--- a/docs/guide/row_format_guide.md
+++ b/docs/guide/row_format_guide.md
@@ -5,7 +5,9 @@ id: row_format_guide
---
## Row format protocol
+
### Java
+
```java
public class Bar {
String f1;
@@ -50,7 +52,9 @@ RowEncoder<Bar> barEncoder = Encoders.bean(Bar.class);
Bar newBar = barEncoder.fromRow(barStruct);
Bar newBar2 = barEncoder.fromRow(binaryArray4.getStruct(20));
```
+
### Python
+
```python
@dataclass
class Bar:
@@ -79,10 +83,13 @@ new_foo = pickle.loads(binary)
print(new_foo.f2[100000], new_foo.f4[100000].f1, new_foo.f4[200000].f2[5])
print(f"pickle end: {datetime.datetime.now()}")
```
+
### Apache Arrow Support
+
Fury Format also supports automatic conversion from/to Arrow Table/RecordBatch.
Java:
+
```java
Schema schema = TypeInference.inferSchema(BeanA.class);
ArrowWriter arrowWriter = ArrowUtils.createArrowWriter(schema);
@@ -93,14 +100,18 @@ for (int i = 0; i < 10; i++) {
}
return arrowWriter.finishAsRecordBatch();
```
+
Python:
+
```python
import pyfury
encoder = pyfury.encoder(Foo)
encoder.to_arrow_record_batch([foo] * 10000)
encoder.to_arrow_table([foo] * 10000)
```
+
C++
+
```c++
std::shared_ptr<ArrowWriter> arrow_writer;
EXPECT_TRUE(
@@ -115,6 +126,7 @@ EXPECT_TRUE(record_batch->Validate().ok());
EXPECT_EQ(record_batch->num_columns(), schema->num_fields());
EXPECT_EQ(record_batch->num_rows(), row_nums);
```
+
```java
Schema schema = TypeInference.inferSchema(BeanA.class);
ArrowWriter arrowWriter = ArrowUtils.createArrowWriter(schema);
diff --git a/docs/guide/scala_guide.md b/docs/guide/scala_guide.md
index aa8f99b..8164c47 100644
--- a/docs/guide/scala_guide.md
+++ b/docs/guide/scala_guide.md
@@ -4,8 +4,8 @@ sidebar_position: 4
id: scala_guide
---
-# Scala serialization
Fury supports all scala object serialization:
+
- `case` class serialization supported
- `pojo/bean` class serialization supported
- `object` singleton serialization supported
@@ -15,12 +15,15 @@ Fury supports all scala object serialization:
Scala 2 and 3 are both supported.
## Install
+
```sbt
libraryDependencies += "org.apache.fury" % "fury-core" % "0.7.0"
```
## Fury creation
+
When using fury for scala serialization, you should create fury at least with
following options:
+
```scala
val fury = Fury.builder()
.withScalaOptimizationEnabled(true)
@@ -28,21 +31,25 @@ val fury = Fury.builder()
.withRefTracking(true)
.build()
```
+
Depending on the object types you serialize, you may need to register some
scala internal types:
+
```scala
fury.register(Class.forName("scala.collection.generic.DefaultSerializationProxy"))
fury.register(Class.forName("scala.Enumeration.Val"))
```
-If you want to avoid such registration, you can disable class registration by
`FuryBuilder#requireClassRegistration(false)`.
+
+If you want to avoid such registration, you can disable class registration by
`FuryBuilder#requireClassRegistration(false)` .
Note that this option allow to deserialize objects unknown types, more
flexible but may be insecure if the classes contains malicious code.
-And circular references are common in scala, `Reference tracking` should be
enabled by `FuryBuilder#withRefTracking(true)`. If you don't enable reference
tracking, [StackOverflowError](https://github.com/apache/fury/issues/1032) may
happen for some scala versions when serializing scala Enumeration.
+And circular references are common in scala, `Reference tracking` should be
enabled by `FuryBuilder#withRefTracking(true)` . If you don't enable reference
tracking, [StackOverflowError](https://github.com/apache/fury/issues/1032) may
happen for some scala versions when serializing scala Enumeration.
Note that fury instance should be shared between multiple serialization, the
creation of fury instance is not cheap.
If you use shared fury instance across multiple threads, you should create
`ThreadSafeFury` instead by `FuryBuilder#buildThreadSafeFury()` instead.
## Serialize case object
+
```scala
case class Person(github: String, age: Int, id: Long)
val p = Person("https://github.com/chaokunyang", 18, 1)
@@ -51,6 +58,7 @@ println(fury.deserializeJavaObject(fury.serializeJavaObject(p)))
```
## Serialize pojo
+
```scala
class Foo(f1: Int, f2: String) {
override def toString: String = s"Foo($f1, $f2)"
@@ -59,6 +67,7 @@ println(fury.deserialize(fury.serialize(Foo(1, "chaokunyang"))))
```
## Serialize object singleton
+
```scala
object singleton {
}
@@ -68,6 +77,7 @@ println(o1 == o2)
```
## Serialize collection
+
```scala
val seq = Seq(1,2)
val list = List("a", "b")
@@ -78,6 +88,7 @@ println(fury.deserialize(fury.serialize(map)))
```
## Serialize Tuple
+
```scala
val tuple = Tuple2(100, 10000L)
println(fury.deserialize(fury.serialize(tuple)))
@@ -86,12 +97,16 @@ println(fury.deserialize(fury.serialize(tuple)))
```
## Serialize Enum
+
### Scala3 Enum
+
```scala
enum Color { case Red, Green, Blue }
println(fury.deserialize(fury.serialize(Color.Green)))
```
+
### Scala2 Enum
+
```scala
object ColorEnum extends Enumeration {
type ColorEnum = Value
@@ -101,6 +116,7 @@ println(fury.deserialize(fury.serialize(ColorEnum.Green)))
```
## Serialize Option
+
```scala
val opt: Option[Long] = Some(100)
println(fury.deserialize(fury.serialize(opt)))
@@ -108,12 +124,13 @@ val opt1: Option[Long] = None
println(fury.deserialize(fury.serialize(opt1)))
```
-# Performance
+## Performance
+
Scala `pojo/bean/case/object` are supported by fury jit well, the performance
is as good as fury java.
Scala collections and generics doesn't follow java collection framework, and
is not fully integrated with Fury JIT in current release version. The
performance won't be as good as fury collections serialization for java.
-The execution for scala collections will invoke Java serialization API
`writeObject/readObject/writeReplace/readResolve/readObjectNoData/Externalizable`
with fury `ObjectStream` implementation. Although
`org.apache.fury.serializer.ObjectStreamSerializer` is much faster than JDK
`ObjectOutputStream/ObjectInputStream`, but it still doesn't know how use scala
collection generics.
+The execution for scala collections will invoke Java serialization API
`writeObject/readObject/writeReplace/readResolve/readObjectNoData/Externalizable`
with fury `ObjectStream` implementation. Although
`org.apache.fury.serializer.ObjectStreamSerializer` is much faster than JDK
`ObjectOutputStream/ObjectInputStream` , but it still doesn't know how use
scala collection generics.
In future we plan to provide more optimization for scala types, see
https://github.com/apache/fury/issues/682, stay tuned!
diff --git a/docs/guide/xlang_serialization_guide.md b/docs/guide/xlang_serialization_guide.md
index a68348e..57ef22d 100644
--- a/docs/guide/xlang_serialization_guide.md
+++ b/docs/guide/xlang_serialization_guide.md
@@ -7,8 +7,8 @@ id: xlang_object_graph_guide
## Cross-language object graph serialization
### Serialize built-in types
-Common types can be serialized automatically: primitive numeric types, string,
binary, array, list, map and so on.
+Common types can be serialized automatically: primitive numeric types, string,
binary, array, list, map and so on.
**Java**
@@ -64,32 +64,32 @@ import furygo "github.com/apache/fury/fury/go/fury"
import "fmt"
func main() {
- list := []interface{}{true, false, "str", -1.1, 1, make([]int32, 10),
make([]float64, 20)}
- fury := furygo.NewFury()
- bytes, err := fury.Marshal(list)
- if err != nil {
- panic(err)
- }
- var newValue interface{}
- // bytes can be data serialized by other languages.
- if err := fury.Unmarshal(bytes, &newValue); err != nil {
- panic(err)
- }
- fmt.Println(newValue)
- dict := map[string]interface{}{
- "k1": "v1",
- "k2": list,
- "k3": -1,
- }
- bytes, err = fury.Marshal(dict)
- if err != nil {
- panic(err)
- }
- // bytes can be data serialized by other languages.
- if err := fury.Unmarshal(bytes, &newValue); err != nil {
- panic(err)
- }
- fmt.Println(newValue)
+ list := []interface{}{true, false, "str", -1.1, 1, make([]int32, 10),
make([]float64, 20)}
+ fury := furygo.NewFury()
+ bytes, err := fury.Marshal(list)
+ if err != nil {
+ panic(err)
+ }
+ var newValue interface{}
+ // bytes can be data serialized by other languages.
+ if err := fury.Unmarshal(bytes, &newValue); err != nil {
+ panic(err)
+ }
+ fmt.Println(newValue)
+ dict := map[string]interface{}{
+ "k1": "v1",
+ "k2": list,
+ "k3": -1,
+ }
+ bytes, err = fury.Marshal(dict)
+ if err != nil {
+ panic(err)
+ }
+ // bytes can be data serialized by other languages.
+ if err := fury.Unmarshal(bytes, &newValue); err != nil {
+ panic(err)
+ }
+ fmt.Println(newValue)
}
```
@@ -126,6 +126,7 @@ fn run() {
```
### Serialize custom types
+
Serializing user-defined types needs registering the custom type using the
register API to establish the mapping relationship between the type in
different languages.
**Java**
@@ -255,59 +256,59 @@ import furygo "github.com/apache/fury/fury/go/fury"
import "fmt"
func main() {
- type SomeClass1 struct {
- F1 interface{}
- F2 string
- F3 []interface{}
- F4 map[int8]int32
- F5 int8
- F6 int16
- F7 int32
- F8 int64
- F9 float32
- F10 float64
- F11 []int16
- F12 fury.Int16Slice
- }
-
- type SomeClas2 struct {
- F1 interface{}
- F2 map[int8]int32
- }
- fury := furygo.NewFury()
- if err := fury.RegisterTagType("example.SomeClass1", SomeClass1{}); err
!= nil {
- panic(err)
- }
- if err := fury.RegisterTagType("example.SomeClass2", SomeClass2{}); err
!= nil {
- panic(err)
- }
- obj1 := &SomeClass1{}
- obj1.F1 = true
- obj1.F2 = map[int8]int32{-1: 2}
- obj := &SomeClass1{}
- obj.F1 = obj1
- obj.F2 = "abc"
- obj.F3 = []interface{}{"abc", "abc"}
- f4 := map[int8]int32{1: 2}
- obj.F4 = f4
- obj.F5 = fury.MaxInt8
- obj.F6 = fury.MaxInt16
- obj.F7 = fury.MaxInt32
- obj.F8 = fury.MaxInt64
- obj.F9 = 1.0 / 2
- obj.F10 = 1 / 3.0
- obj.F11 = []int16{1, 2}
- obj.F12 = []int16{-1, 4}
- bytes, err := fury.Marshal(obj);
- if err != nil {
- panic(err)
- }
- var newValue interface{}
- // bytes can be data serialized by other languages.
- if err := fury.Unmarshal(bytes, &newValue); err != nil {
- panic(err)
- }
- fmt.Println(newValue)
+ type SomeClass1 struct {
+ F1 interface{}
+ F2 string
+ F3 []interface{}
+ F4 map[int8]int32
+ F5 int8
+ F6 int16
+ F7 int32
+ F8 int64
+ F9 float32
+ F10 float64
+ F11 []int16
+ F12 fury.Int16Slice
+ }
+
+ type SomeClas2 struct {
+ F1 interface{}
+ F2 map[int8]int32
+ }
+ fury := furygo.NewFury()
+ if err := fury.RegisterTagType("example.SomeClass1", SomeClass1{}); err !=
nil {
+ panic(err)
+ }
+ if err := fury.RegisterTagType("example.SomeClass2", SomeClass2{}); err !=
nil {
+ panic(err)
+ }
+ obj1 := &SomeClass1{}
+ obj1.F1 = true
+ obj1.F2 = map[int8]int32{-1: 2}
+ obj := &SomeClass1{}
+ obj.F1 = obj1
+ obj.F2 = "abc"
+ obj.F3 = []interface{}{"abc", "abc"}
+ f4 := map[int8]int32{1: 2}
+ obj.F4 = f4
+ obj.F5 = fury.MaxInt8
+ obj.F6 = fury.MaxInt16
+ obj.F7 = fury.MaxInt32
+ obj.F8 = fury.MaxInt64
+ obj.F9 = 1.0 / 2
+ obj.F10 = 1 / 3.0
+ obj.F11 = []int16{1, 2}
+ obj.F12 = []int16{-1, 4}
+ bytes, err := fury.Marshal(obj);
+ if err != nil {
+ panic(err)
+ }
+ var newValue interface{}
+ // bytes can be data serialized by other languages.
+ if err := fury.Unmarshal(bytes, &newValue); err != nil {
+ panic(err)
+ }
+ fmt.Println(newValue)
}
```
@@ -394,6 +395,7 @@ fn complex_struct() {
```
### Serialize Shared Reference and Circular Reference
+
Shared reference and circular reference can be serialized automatically, no
duplicate data or recursion error.
**Java**
@@ -460,27 +462,27 @@ import furygo "github.com/apache/fury/fury/go/fury"
import "fmt"
func main() {
- type SomeClass struct {
- F1 *SomeClass
- F2 map[string]string
- F3 map[string]string
- }
- fury := furygo.NewFury(true)
- if err := fury.RegisterTagType("example.SomeClass", SomeClass{}); err
!= nil {
- panic(err)
- }
- value := &SomeClass{F2: map[string]string{"k1": "v1", "k2": "v2"}}
- value.F3 = value.F2
- value.F1 = value
- bytes, err := fury.Marshal(value)
- if err != nil {
- }
- var newValue interface{}
- // bytes can be data serialized by other languages.
- if err := fury.Unmarshal(bytes, &newValue); err != nil {
- panic(err)
- }
- fmt.Println(newValue)
+ type SomeClass struct {
+ F1 *SomeClass
+ F2 map[string]string
+ F3 map[string]string
+ }
+ fury := furygo.NewFury(true)
+ if err := fury.RegisterTagType("example.SomeClass", SomeClass{}); err != nil {
+ panic(err)
+ }
+ value := &SomeClass{F2: map[string]string{"k1": "v1", "k2": "v2"}}
+ value.F3 = value.F2
+ value.F1 = value
+ bytes, err := fury.Marshal(value)
+ if err != nil {
+ }
+ var newValue interface{}
+ // bytes can be data serialized by other languages.
+ if err := fury.Unmarshal(bytes, &newValue); err != nil {
+ panic(err)
+ }
+ fmt.Println(newValue)
}
```
@@ -514,7 +516,6 @@ console.log(result.bar.foo === result.foo);
**JavaScript**
Reference cannot be implemented because of rust ownership restrictions
-
### Zero-Copy Serialization
**Java**
@@ -569,23 +570,23 @@ import furygo "github.com/apache/fury/fury/go/fury"
import "fmt"
func main() {
- fury := furygo.NewFury()
- list := []interface{}{"str", make([]byte, 1000)}
- buf := fury.NewByteBuffer(nil)
- var bufferObjects []fury.BufferObject
- fury.Serialize(buf, list, func(o fury.BufferObject) bool {
- bufferObjects = append(bufferObjects, o)
- return false
- })
- var newList []interface{}
- var buffers []*fury.ByteBuffer
- for _, o := range bufferObjects {
- buffers = append(buffers, o.ToBuffer())
- }
- if err := fury.Deserialize(buf, &newList, buffers); err != nil {
- panic(err)
- }
- fmt.Println(newList)
+ fury := furygo.NewFury()
+ list := []interface{}{"str", make([]byte, 1000)}
+ buf := fury.NewByteBuffer(nil)
+ var bufferObjects []fury.BufferObject
+ fury.Serialize(buf, list, func(o fury.BufferObject) bool {
+ bufferObjects = append(bufferObjects, o)
+ return false
+ })
+ var newList []interface{}
+ var buffers []*fury.ByteBuffer
+ for _, o := range bufferObjects {
+ buffers = append(buffers, o.ToBuffer())
+ }
+ if err := fury.Deserialize(buf, &newList, buffers); err != nil {
+ panic(err)
+ }
+ fmt.Println(newList)
}
```
diff --git a/docs/guide/xlang_type_mapping.md b/docs/guide/xlang_type_mapping.md
index 4baa455..43c7e3c 100644
--- a/docs/guide/xlang_type_mapping.md
+++ b/docs/guide/xlang_type_mapping.md
@@ -10,7 +10,7 @@ Note:
- `int16_t[n]/vector<T>` indicates `int16_t[n]/vector<int16_t>`
- The cross-language serialization is not stable, do not use it in your
production environment.
-# Type Mapping
+## Type Mapping
| Fury Type | Fury Type ID | Java | Python | Javascript | C++ | Golang | Rust |
|--------------------|--------------|-----------------|----------------------|-----------------|--------------------------------|------------------|------------------|
@@ -48,7 +48,7 @@ Note:
| arrow record batch | 32 | / | / | / | / | / | / |
| arrow table | 33 | / | / | / | / | / | / |
-# Type info(not implemented currently)
+## Type info(not implemented currently)
Due to differences between type systems of languages, those types can't be
mapped one-to-one between languages.
@@ -70,6 +70,7 @@ Such information can be provided in other languages too:
Here is an example:
- Java:
+
```java
class Foo {
@Int32Type(varint = true)
@@ -77,6 +78,7 @@ Here is en example:
List<@Int32Type(varint = true) Integer> f2;
}
```
+
- Python:
```python
diff --git a/docs/specification/java_serialization_spec.md b/docs/specification/java_serialization_spec.md
index 592413a..a235725 100644
--- a/docs/specification/java_serialization_spec.md
+++ b/docs/specification/java_serialization_spec.md
@@ -4,8 +4,6 @@ sidebar_position: 1
id: fury_java_serialization_spec
---
-# Fury Java Serialization Specification
-
## Spec overview
Fury Java Serialization is an automatic object serialization framework that
supports reference and polymorphism. Fury
@@ -77,14 +75,14 @@ If schema consistent mode is enabled globally or enabled for current class, clas
- If class is registered, it will be written as a fury unsigned varint:
`class_id << 1`.
- If class is not registered:
- - If class is not an array, fury will write one byte `0bxxxxxxx1` first,
then write class name.
- - The first little bit is `1`, which is different from first bit `0` of
+ - If class is not an array, fury will write one byte `0bxxxxxxx1` first,
then write class name.
+ - The first little bit is `1`, which is different from first bit `0` of
encoded class id. Fury can use this information to determine whether
to read class by class id for
deserialization.
- - If class is not registered and class is an array, fury will write one
byte `dimensions << 1 | 1` first, then write
+ - If class is not registered and class is an array, fury will write one byte
`dimensions << 1 | 1` first, then write
component
class subsequently. This can reduce array class name cost if component
class is or will be serialized.
- - Class will be written as two enumerated fury unsigned by default:
`package name` and `class name`. If meta share
+ - Class will be written as two enumerated fury unsigned by default: `package
name` and `class name`. If meta share
mode is
enabled,
class will be written as an unsigned varint which points to index in
`MetaContext`.
@@ -145,10 +143,10 @@ Meta header is a 64 bits number value encoded in little endian order.
```
- num fields: encode `num fields << 1 | register flag(1 when class
registered)` as unsigned varint.
- - If class is registered, then an unsigned varint class id will be written
next, package and class name will be
+ - If class is registered, then an unsigned varint class id will be written
next, package and class name will be
omitted.
- - If current class is schema consistent, then num field will be `0` to
flag it.
- - If current class isn't schema consistent, then num field will be the
number of compatible fields. For example,
+ - If current class is schema consistent, then num field will be `0` to flag
it.
+ - If current class isn't schema consistent, then num field will be the
number of compatible fields. For example,
users
can use tag id to mark some field as compatible field in schema
consistent context. In such cases, schema
consistent
@@ -156,34 +154,34 @@ Meta header is a 64 bits number value encoded in little endian order.
fields info of those fields which aren't annotated by tag id for
deserializing schema consistent fields, then use
fields info in meta for deserializing compatible fields.
- Package name encoding(omitted when class is registered):
- - encoding algorithm: `UTF8/ALL_TO_LOWER_SPECIAL/LOWER_UPPER_DIGIT_SPECIAL`
- - Header: `6 bits size | 2 bits encoding flags`. The `6 bits size: 0~63`
will be used to indicate size `0~62`,
+ - encoding algorithm: `UTF8/ALL_TO_LOWER_SPECIAL/LOWER_UPPER_DIGIT_SPECIAL`
+ - Header: `6 bits size | 2 bits encoding flags`. The `6 bits size: 0~63`
will be used to indicate size `0~62`,
the value `63` the size need more byte to read, the encoding will encode
`size - 62` as a varint next.
- Class name encoding(omitted when class is registered):
- - encoding algorithm:
`UTF8/LOWER_UPPER_DIGIT_SPECIAL/FIRST_TO_LOWER_SPECIAL/ALL_TO_LOWER_SPECIAL`
- - header: `6 bits size | 2 bits encoding flags`. The `6 bits size: 0~63`
will be used to indicate size `1~64`,
+ - encoding algorithm:
`UTF8/LOWER_UPPER_DIGIT_SPECIAL/FIRST_TO_LOWER_SPECIAL/ALL_TO_LOWER_SPECIAL`
+ - header: `6 bits size | 2 bits encoding flags`. The `6 bits size: 0~63`
will be used to indicate size `1~64`,
the value `63` the size need more byte to read, the encoding will encode
`size - 63` as a varint next.
- Field info:
- - header(8
+ - header(8
bits): `3 bits size + 2 bits field name encoding + polymorphism flag +
nullability flag + ref tracking flag`.
Users can use annotation to provide those info.
- - 2 bits field name encoding:
- - encoding:
`UTF8/ALL_TO_LOWER_SPECIAL/LOWER_UPPER_DIGIT_SPECIAL/TAG_ID`
- - If tag id is used, i.e. field name is written by an unsigned
varint tag id. 2 bits encoding will be `11`.
- - size of field name:
- - The `3 bits size: 0~7` will be used to indicate length `1~7`,
the value `6` the size read more bytes,
+ - 2 bits field name encoding:
+ - encoding: `UTF8/ALL_TO_LOWER_SPECIAL/LOWER_UPPER_DIGIT_SPECIAL/TAG_ID`
+ - If tag id is used, i.e. field name is written by an unsigned varint
tag id. 2 bits encoding will be `11`.
+ - size of field name:
+ - The `3 bits size: 0~7` will be used to indicate length `1~7`, the
value `6` the size read more bytes,
the encoding will encode `size - 7` as a varint next.
- - If encoding is `TAG_ID`, then num_bytes of field name will be
used to store tag id.
- - ref tracking: when set to 1, ref tracking will be enabled for this
field.
- - nullability: when set to 1, this field can be null.
- - polymorphism: when set to 1, the actual type of field will be the
declared field type even the type if
+ - If encoding is `TAG_ID`, then num_bytes of field name will be used to
store tag id.
+ - ref tracking: when set to 1, ref tracking will be enabled for this field.
+ - nullability: when set to 1, this field can be null.
+ - polymorphism: when set to 1, the actual type of field will be the
declared field type even the type if
not `final`.
- - type id:
- - For registered type-consistent classes, it will be the registered
class id.
- - Otherwise it will be encoded as `OBJECT_ID` if it isn't `final` and
`FINAL_OBJECT_ID` if it's `final`. The
+ - type id:
+ - For registered type-consistent classes, it will be the registered class
id.
+ - Otherwise it will be encoded as `OBJECT_ID` if it isn't `final` and
`FINAL_OBJECT_ID` if it's `final`. The
meta for such types is written separately instead of inlining here
is to reduce meta space cost if object of
this type is serialized in current object graph multiple times, and
the field value may be null too.
- - Field name: If type id is set, type id will be used instead. Otherwise
meta string encoding length and data will
+ - Field name: If type id is set, type id will be used instead. Otherwise
meta string encoding length and data will
be written instead.
Field order are left as implementation details, which is not exposed to
specification, the deserialization need to
@@ -195,12 +193,12 @@ using a more compact encoding.
Same encoding algorithm as the previous layer except:
- header + package name:
- - Header:
- - If package name has been written before: `varint index + sharing
flag(set)` will be written
- - If package name hasn't been written before:
- - If meta string encoding is `LOWER_SPECIAL` and the length of
encoded string `<=` 64, then header will be
+ - Header:
+ - If package name has been written before: `varint index + sharing
flag(set)` will be written
+ - If package name hasn't been written before:
+ - If meta string encoding is `LOWER_SPECIAL` and the length of encoded
string `<=` 64, then header will be
`6 bits size + encoding flag(set) + sharing flag(unset)`.
- - Otherwise, header will
+ - Otherwise, header will
be `3 bits unset + 3 bits encoding flags + encoding flag(unset)
+ sharing flag(unset)`
## Meta String
@@ -307,17 +305,17 @@ If string has been written before, the data will be written as follows:
- size: 1~9 byte
- Fury PVL(Progressive Variable-length Long) Encoding:
- - positive long format: first bit in every byte indicates whether to have
the next byte. If first bit is set
+ - positive long format: first bit in every byte indicates whether to have
the next byte. If first bit is set
i.e. `b & 0x80 == 0x80`, then the next byte should be read until the
first bit is unset.
#### Signed long
- size: 1~9 byte
- Fury SLI(Small long as int) Encoding:
- - If long is in [-1073741824, 1073741823], encode as 4 bytes int: `| little-endian: ((int) value) << 1 |`
- - Otherwise write as 9 bytes: `| 0b1 | little-endian 8 bytes long |`
+ - If long is in [-1073741824, 1073741823], encode as 4 bytes int: `| little-endian: ((int) value) << 1 |`
+ - Otherwise write as 9 bytes: `| 0b1 | little-endian 8 bytes long |`
- Fury PVL(Progressive Variable-length Long) Encoding:
- - First convert the number into positive unsigned long by ` (v << 1) ^ (v >> 63)` ZigZag algorithm to reduce cost of
+ - First convert the number into positive unsigned long by `(v << 1) ^ (v >> 63)` ZigZag algorithm to reduce cost of
small negative numbers, then encoding it as an unsigned long.
#### Float
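For readers following the signed long hunk above, the PVL path can be sketched in Python as below. This is an illustrative sketch, not Fury's implementation: the function names are hypothetical, and two details are assumptions that the quoted text does not state, namely that 7-bit groups are emitted least-significant first and that a ninth byte, when needed, carries the remaining 8 bits (inferred from the stated `1~9 byte` size).

```python
from typing import Tuple

MASK64 = (1 << 64) - 1


def zigzag_encode(v: int) -> int:
    """Map a signed 64-bit value to unsigned: (v << 1) ^ (v >> 63)."""
    return ((v << 1) ^ (v >> 63)) & MASK64


def zigzag_decode(u: int) -> int:
    """Inverse of zigzag_encode."""
    return (u >> 1) ^ -(u & 1)


def write_var_uint64(u: int) -> bytes:
    """Encode an unsigned 64-bit value in 1 to 9 bytes.

    Assumption: least-significant 7-bit group first; the high bit of each
    of the first eight bytes says whether another byte follows.
    """
    out = bytearray()
    for _ in range(8):
        if u < 0x80:
            out.append(u)
            return bytes(out)
        out.append((u & 0x7F) | 0x80)
        u >>= 7
    # Assumption: the ninth byte holds the remaining 8 bits, no flag bit.
    out.append(u & 0xFF)
    return bytes(out)


def read_var_uint64(data: bytes) -> Tuple[int, int]:
    """Return (value, bytes consumed)."""
    result = 0
    for i in range(8):
        b = data[i]
        result |= (b & 0x7F) << (7 * i)
        if b & 0x80 == 0:
            return result, i + 1
    result |= data[8] << 56
    return result, 9


def write_signed_pvl(v: int) -> bytes:
    return write_var_uint64(zigzag_encode(v))


def read_signed_pvl(data: bytes) -> Tuple[int, int]:
    u, n = read_var_uint64(data)
    return zigzag_decode(u), n
```

Under these assumptions `write_signed_pvl(-1)` produces the single byte `b"\x01"`, which is the point of the ZigZag step: small negative numbers stay as short as small positive ones.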
diff --git a/docs/specification/row_format_spec.md b/docs/specification/row_format_spec.md
index d32c11a..f368eb7 100644
--- a/docs/specification/row_format_spec.md
+++ b/docs/specification/row_format_spec.md
@@ -4,5 +4,6 @@ sidebar_position: 2
id: fury_row_format_spec
---
-# Row Format
+## Row Format
+
Coming soon
diff --git a/docs/specification/xlang_serialization_spec.md b/docs/specification/xlang_serialization_spec.md
index 5882d00..5f79f66 100644
--- a/docs/specification/xlang_serialization_spec.md
+++ b/docs/specification/xlang_serialization_spec.md
@@ -4,9 +4,10 @@ sidebar_position: 0
id: fury_xlang_serialization_spec
---
-# Cross-language Serialization Specification
+## Cross-language Serialization Specification
> Format Version History:
+>
> - Version 0.1 - serialization spec formalized
Fury xlang serialization is an automatic object serialization framework that
supports reference and polymorphism.
@@ -42,22 +43,22 @@ also introduce more complexities compared to static
serialization frameworks. So
- set: an unordered set of unique elements.
- map: a map of key-value pairs. Mutable types such as
`list/map/set/array/tensor/arrow` are not allowed as key of map.
- time types:
- - duration: an absolute length of time, independent of any
calendar/timezone, as a count of nanoseconds.
- - timestamp: a point in time, independent of any calendar/timezone, as a
count of nanoseconds. The count is relative
+ - duration: an absolute length of time, independent of any
calendar/timezone, as a count of nanoseconds.
+ - timestamp: a point in time, independent of any calendar/timezone, as a
count of nanoseconds. The count is relative
to an epoch at UTC midnight on January 1, 1970.
- decimal: exact decimal value represented as an integer value in two's
complement.
- binary: an variable-length array of bytes.
- array type: only allow numeric components. Other arrays will be taken as
List. The implementation should support the
interoperability between array and list.
- - array: multidimensional array which every sub-array can have different
sizes but all have same type.
- - bool_array: one dimensional int16 array.
- - int8_array: one dimensional int8 array.
- - int16_array: one dimensional int16 array.
- - int32_array: one dimensional int32 array.
- - int64_array: one dimensional int64 array.
- - float16_array: one dimensional half_float_16 array.
- - float32_array: one dimensional float32 array.
- - float64_array: one dimensional float64 array.
+ - array: multidimensional array which every sub-array can have different
sizes but all have same type.
+ - bool_array: one dimensional int16 array.
+ - int8_array: one dimensional int8 array.
+ - int16_array: one dimensional int16 array.
+ - int32_array: one dimensional int32 array.
+ - int64_array: one dimensional int64 array.
+ - float16_array: one dimensional half_float_16 array.
+ - float32_array: one dimensional float32 array.
+ - float64_array: one dimensional float64 array.
- tensor: a multidimensional dense array of fixed-size values such as a NumPy
ndarray.
- sparse tensor: a multidimensional array whose elements are almost all zeros.
- arrow record batch: an arrow [record
batch](https://arrow.apache.org/docs/cpp/tables.html#record-batches) object.
@@ -197,9 +198,9 @@ differently.
of `type_id`. Schema evolution related meta will be ignored.
- If schema evolution mode is enabled globally when creating fury, and current
class is configured to use schema
consistent mode like `struct` vs `table` in flatbuffers:
- - Type meta will be add to `captured_type_defs`: `captured_type_defs[type
def stub] = map size` ahead when
+ - Type meta will be add to `captured_type_defs`: `captured_type_defs[type
def stub] = map size` ahead when
registering type.
- - Get index of the meta in `captured_type_defs`, write that index as `|
unsigned varint: index |`.
+ - Get index of the meta in `captured_type_defs`, write that index as `|
unsigned varint: index |`.
### Schema evolution
@@ -207,21 +208,24 @@ If schema evolution mode is enabled globally when
creating fury, and enabled for
using one of the following mode. Which mode to use is configured when creating
fury.
- Normal mode(meta share not enabled):
- - If type meta hasn't been written before, add `type def`
+ - If type meta hasn't been written before, add `type def`
to `captured_type_defs`: `captured_type_defs[type def] = map size`.
- - Get index of the meta in `captured_type_defs`, write that index as `|
unsigned varint: index |`.
- - After finished the serialization of the object graph, fury will start to
write `captured_type_defs`:
- - Firstly, set current to `meta start offset` of fury header
- - Then write `captured_type_defs` one by one:
- ```python
- buffer.write_var_uint32(len(writting_type_defs) - len(schema_consistent_type_def_stubs))
- for type_meta in writting_type_defs:
-     if not type_meta.is_stub():
-         type_meta.write_type_def(buffer)
- writing_type_defs = copy(schema_consistent_type_def_stubs)
- ```
+ - Get index of the meta in `captured_type_defs`, write that index as `|
unsigned varint: index |`.
+ - After finished the serialization of the object graph, fury will start to
write `captured_type_defs`:
+ - Firstly, set current to `meta start offset` of fury header
+ - Then write `captured_type_defs` one by one:
+
+ ```python
+ buffer.write_var_uint32(len(writting_type_defs) - len(schema_consistent_type_def_stubs))
+ for type_meta in writting_type_defs:
+     if not type_meta.is_stub():
+         type_meta.write_type_def(buffer)
+ writing_type_defs = copy(schema_consistent_type_def_stubs)
+ ```
+
- Meta share mode: the writing steps are same as the normal mode, but
`captured_type_defs` will be shared across
multiple serializations of different objects. For example, suppose we have a
batch to serialize:
+
```python
captured_type_defs = {}
stream = ...
@@ -234,16 +238,20 @@ using one of the following mode. Which mode to use is
configured when creating f
```
- Streaming mode(streaming mode doesn't support meta share):
- - If type meta hasn't been written before, the data will be written as:
+ - If type meta hasn't been written before, the data will be written as:
+
```
| unsigned varint: 0b11111111 | type def |
```
- - If type meta has been written before, the data will be written as:
+
+ - If type meta has been written before, the data will be written as:
+
```
| unsigned varint: written index << 1 |
```
+
`written index` is the id in `captured_type_defs`.
- - With this mode, `meta start offset` can be omitted.
+ - With this mode, `meta start offset` can be omitted.
> The normal mode and meta share mode will forbid streaming writing since it
> needs to look back for update the start
> offset after the whole object graph writing and meta collecting is finished.
> Only in this way we can ensure
@@ -281,33 +289,33 @@ Meta header is a 64 bits number value encoded in little
endian order.
```
- num fields: encode `num fields` as unsigned varint.
- - If the current type is schema consistent, then num_fields will be `0` to
flag it.
- - If the current type isn't schema consistent, then num_fields will be the
number of compatible fields. For example,
+ - If the current type is schema consistent, then num_fields will be `0` to
flag it.
+ - If the current type isn't schema consistent, then num_fields will be the
number of compatible fields. For example,
users can use tag id to mark some fields as compatible fields in schema
consistent context. In such cases, schema
consistent fields will be serialized first, then compatible fields will
be serialized next. At deserialization,
Fury will use fields info of those fields which aren't annotated by tag
id for deserializing schema consistent
fields, then use fields info in meta for deserializing compatible fields.
- type id: the registered id for the current type, which will be written as an
unsigned varint.
- field info:
- - header(8
+ - header(8
bits): `3 bits size + 2 bits field name encoding + polymorphism flag +
nullability flag + ref tracking flag`.
Users can use annotation to provide those info.
- - 2 bits field name encoding:
- - encoding:
`UTF8/ALL_TO_LOWER_SPECIAL/LOWER_UPPER_DIGIT_SPECIAL/TAG_ID`
- - If tag id is used, i.e. field name is written by an unsigned
varint tag id. 2 bits encoding will be `11`.
- - size of field name:
- - The `3 bits size: 0~7` will be used to indicate length `1~7`,
the value `7` indicates to read more bytes,
+ - 2 bits field name encoding:
+ - encoding: `UTF8/ALL_TO_LOWER_SPECIAL/LOWER_UPPER_DIGIT_SPECIAL/TAG_ID`
+ - If tag id is used, i.e. field name is written by an unsigned varint
tag id. 2 bits encoding will be `11`.
+ - size of field name:
+ - The `3 bits size: 0~7` will be used to indicate length `1~7`, the
value `7` indicates to read more bytes,
the encoding will encode `size - 7` as a varint next.
- - If encoding is `TAG_ID`, then num_bytes of field name will be
used to store tag id.
- - ref tracking: when set to 1, ref tracking will be enabled for this
field.
- - nullability: when set to 1, this field can be null.
- - polymorphism: when set to 1, the actual type of field will be the
declared field type even the type if
+ - If encoding is `TAG_ID`, then num_bytes of field name will be used to
store tag id.
+ - ref tracking: when set to 1, ref tracking will be enabled for this field.
+ - nullability: when set to 1, this field can be null.
+ - polymorphism: when set to 1, the actual type of field will be the
declared field type even the type if
not `final`.
- - field name: If tag id is set, tag id will be used instead. Otherwise
meta string encoding `[length]` and data will
+ - field name: If tag id is set, tag id will be used instead. Otherwise meta
string encoding `[length]` and data will
be written instead.
- - type id:
- - For registered type-consistent classes, it will be the registered
type id.
- - Otherwise it will be encoded as `OBJECT_ID` if it isn't `final` and
`FINAL_OBJECT_ID` if it's `final`. The
+ - type id:
+ - For registered type-consistent classes, it will be the registered type
id.
+ - Otherwise it will be encoded as `OBJECT_ID` if it isn't `final` and
`FINAL_OBJECT_ID` if it's `final`. The
meta for such types is written separately instead of inlining here
is to reduce meta space cost if object of
this type is serialized in current object graph multiple times, and
the field value may be null too.
@@ -401,10 +409,10 @@ Notes:
- size: 1~9 byte
- Fury SLI(Small long as int) Encoding:
- - If long is in `[0, 2147483647]`, encode as 4 bytes int: `| little-endian: ((int) value) << 1 |`
- - Otherwise write as 9 bytes: `| 0b1 | little-endian 8 bytes long |`
+ - If long is in `[0, 2147483647]`, encode as 4 bytes int: `| little-endian: ((int) value) << 1 |`
+ - Otherwise write as 9 bytes: `| 0b1 | little-endian 8 bytes long |`
- Fury PVL(Progressive Variable-length Long) Encoding:
- - positive long format: first bit in every byte indicates whether to have the next byte. If first bit is set
+ - positive long format: first bit in every byte indicates whether to have the next byte. If first bit is set
i.e. `b & 0x80 == 0x80`, then the next byte should be read until the first bit is unset.
#### signed int64
@@ -416,10 +424,10 @@ Notes:
- size: 1~9 byte
- Fury SLI(Small long as int) Encoding:
- - If long is in `[-1073741824, 1073741823]`, encode as 4 bytes int: `| little-endian: ((int) value) << 1 |`
- - Otherwise write as 9 bytes: `| 0b1 | little-endian 8 bytes long |`
+ - If long is in `[-1073741824, 1073741823]`, encode as 4 bytes int: `| little-endian: ((int) value) << 1 |`
+ - Otherwise write as 9 bytes: `| 0b1 | little-endian 8 bytes long |`
- Fury PVL(Progressive Variable-length Long) Encoding:
- - First convert the number into positive unsigned long by `(v << 1) ^ (v >> 63)` ZigZag algorithm to reduce cost of
+ - First convert the number into positive unsigned long by `(v << 1) ^ (v >> 63)` ZigZag algorithm to reduce cost of
small negative numbers, then encoding it as an unsigned long.
#### float32
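The SLI rule in the signed int64 hunk above can be sketched as follows. This is a minimal illustration with hypothetical function names, not code from Fury. It assumes the reader tells the two forms apart by the lowest bit of the first byte, which follows from the quoted layout: the 4-byte form stores `value << 1` (low bit clear) and the 9-byte form starts with `0b1`.

```python
import struct
from typing import Tuple

SLI_MIN = -(1 << 30)      # -1073741824
SLI_MAX = (1 << 30) - 1   #  1073741823


def write_sli_int64(value: int) -> bytes:
    """Encode a signed 64-bit value as 4 bytes when small, else 9 bytes."""
    if SLI_MIN <= value <= SLI_MAX:
        # Shifting left by one keeps the result inside int32 and leaves
        # the low bit clear as the "small" marker.
        return struct.pack("<i", value << 1)
    # Marker byte 0b1 followed by the full little-endian 8-byte long.
    return b"\x01" + struct.pack("<q", value)


def read_sli_int64(data: bytes) -> Tuple[int, int]:
    """Return (value, bytes consumed)."""
    if data[0] & 1 == 0:
        (shifted,) = struct.unpack_from("<i", data, 0)
        return shifted >> 1, 4
    (value,) = struct.unpack_from("<q", data, 1)
    return value, 9
```

For example, `write_sli_int64(1)` gives `b"\x02\x00\x00\x00"`, while any value outside the 31-bit range costs nine bytes.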
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/community/community.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/community/community.md
index ef3f56e..eb3d4ed 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/community/community.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/community/community.md
@@ -1,91 +1,85 @@
---
-title: Community
+title: 社区
sidebar_position: 0
id: community
---
+Apache Fury 是一个由社区驱动的开源项目,项目的蓬勃发展得益于社区贡献。
+我们邀请您根据自己的意愿尽可能地参与项目。以下是几种贡献方式:
-Apache Fury is a volunteer project and it thrives on the contributions of its
community.
-We invite you to participate as much or as little as you wish. Here are
several ways to contribute:
+- 使用 Apache Fury 并分享使用体验和反馈问题;
+- 为项目提供最佳实践示例;
+- 报告错误并修复;
+- 贡献代码和参与文档建设。
-- Use our project and share feedback.
-- Provide use-cases for the project.
-- Report bugs and contribute fixes.
-- Contribute code and documentation improvements.
+## 邮件列表
-## Mailing list
-
-| Name | Desc | Subscribe | Unsubscribe | Post | Archive |
+| 邮件列表 | 描述 | 订阅 | 取消订阅 | 发送邮件 | 活动 |
|-------------------------|---------------------------------------------|-------------------------------------------------------|-----------------------------------------------------------|------------------------------------|-----------------------------------------------------------------------|
-| [email protected] | Development related discussions | [Subscribe](mailto:[email protected]) | [Unsubscribe](mailto:[email protected]) | [Post](mailto:[email protected]) | [Archive](https://lists.apache.org/[email protected]) |
-| [email protected] | All commits to our repositories | [Subscribe](mailto:[email protected]) | [Unsubscribe](mailto:[email protected]) | Read only list | [Archive](https://lists.apache.org/[email protected]) |
+| [email protected] | 开发相关讨论 | [订阅](mailto:[email protected]) | [取消订阅](mailto:[email protected]) | [发送邮件](mailto:[email protected]) | [邮件列表活动](https://lists.apache.org/[email protected]) |
+| [email protected] | 仓库的所有 commits | [订阅](mailto:[email protected]) | [取消订阅](mailto:[email protected]) | 只读的邮件列表 | [邮件列表活动](https://lists.apache.org/[email protected]) |
-Please make sure subscribe to any list before attempting to post.
+在尝试发送邮件之前,请确保订阅上述的邮件列表。
-If you are not subscribed to the mailing list, your message will either be
rejected or you won't receive the response.
+**如果您没有订阅邮件列表,您的邮件将被拒绝或不会收到回复。**
-### How to subscribe to a mailing list
+### 如何订阅邮件列表
-To post messages, subscribe first by:
+要发送邮件至邮件列表,请先通过以下方式订阅:
-1. Sending an email to [email protected] with `listname`
replaced accordingly.
-2. Replying to the confirmation email you'll receive, keeping the subject line
intact.
-3. You'll then get a welcome email, and the subscription succeeds.
+1. 发送电子邮件至 [email protected],并相应替换 `listname`;
+2. 回复您将收到的确认电子邮件,保持邮件主题行完整;
+3. 然后您将收到一封欢迎的电子邮件,订阅成功。
-When discussing code snippets in emails, ensure:
+在讨论电子邮件中的代码片段时,请确保:
-- You do not link to files in external services, as such files can change, get
deleted or the link might break and thus
- make an archived email thread useless.
-- You paste text instead of screenshots of text.
-- You keep formatting when pasting code in order to keep the code readable.
-- There are enough import statements to avoid ambiguities.
+- 您不要链接到外部服务中的文件,因为此类文件可能会更改、被删除或链接可能会中断,从而使存档的电子邮件线程变得无用;
+- 您粘贴文本而不是文本屏幕截图;
+- 粘贴代码时保持格式,以保持代码可读;
+- 有足够的导入语句以避免产生代码歧义。
## Slack
-You can join
-the [Apache Fury™ community on Slack](https://join.slack.com/t/fury-project/shared_invite/zt-1u8soj4qc-ieYEu7ciHOqA2mo47llS8A).
+您可以加入[Slack 上的 Apache Fury™ 社区](https://join.slack.com/t/fury-project/shared_invite/zt-1u8soj4qc-ieYEu7ciHOqA2mo47llS8A)。
-There are a couple of community rules:
+这里有一些社区规则:
-- Be respectful and nice.
-- All important decisions and conclusions must be reflected back to the
mailing lists. "If it didn't happen on a mailing
- list, it didn't happen." - The [Apache
Mottos](https://theapacheway.com/on-list/).
-- Use Slack threads to keep parallel conversations from overwhelming a channel.
-- Please do not direct message people for troubleshooting, issue assigning and
PR review. These should be picked-up
- voluntarily.
+- 保持尊重和友善;
+- 所有重要的决定和结论都必须反映到邮件列表中。 “如果这没有在邮件列表中有相关的讨论记录,则代表它不生效” ;
+- [The Apache Way](https://theapacheway.com/on-list/);
+- 使用 Slack 线程来防止并行对话淹没当前的对话频道;
+- 请不要直接向邮件列表发送 Bug fix、Issue 分配和 Code Review 消息。这些内容应该被社区贡献者自愿处理并分配。
-## Issue tracker
+## Issue 跟踪
-We use GitHub Issues to track all issues:
+我们使用 GitHub Issues 来跟踪所有 Issues:
-- code related issues: https://github.com/apache/fury/issues
-- website related issues: https://github.com/apache/fury-site/issues
+- 代码相关问题:https://github.com/apache/fury/issues
+- 网站相关问题:https://github.com/apache/fury-site/issues
-You need to have a [GitHub account](https://github.com/signup) in order to
create issues.
-If you don't have a [GitHub account](https://github.com/signup), you can post
an email to [email protected].
+您需要有一个 [GitHub 帐户](https://github.com/signup) 才能创建问题。
+如果您没有 [GitHub 帐户](https://github.com/signup),您可以发送电子邮件至 [email protected]。
-### Bug reports
+### 报告 Bug
-To report a bug:
+您在报告 Bug 之前,应该:
-- Verify that the bug does in fact exist.
-- Search the [issue tracker](https://github.com/apache/fury/issues) to verify there is no existing issue reporting the bug you've found.
-- Create a [bug report](https://github.com/apache/fury/issues/new?assignees=&labels=bug&projects=&template=bug_report.yml) on issue tracker.
-- If possible, dive into the source code of fury, and submit a patch for the bug you reported, this helps ensure the bug
- will be fixed quickly.
+- 验证该 Bug 确实存在;
+- 搜索 [Issue List](https://github.com/apache/fury/issues) 以确保不存在相关 Bug。
+- 在 Issue List 中创建 [bug 报告](https://github.com/apache/fury/issues/new?assignees=&labels=bug&projects=&template=bug_report.yml)。
+- 如果可能的话,深入研究 Apache Fury 的源代码,并针对您报告的 Bug 提交补丁,这有助于快速修复 Bug。
-### Reporting a Vulnerability
+### 报告安全漏洞
-Apache Fury is a project of the [Apache Software Foundation](https://apache.org/) and follows the [ASF vulnerability handling process](https://apache.org/security/#vulnerability-handling).
+Apache Fury 是 [Apache 软件基金会](https://apache.org/) 的一个项目,遵循 [ASF 漏洞处理流程](https://apache.org/security/#vulnerability-handling)。
-To report a new vulnerability you have discovered please follow the [ASF vulnerability reporting process](https://apache.org/security/#reporting-a-vulnerability), which explains how to send us details privately.
+要报告您发现的新的安全漏洞,请遵循 [ASF 漏洞报告流程](https://apache.org/security/#reporting-a-vulnerability),该流程解释了如何私下向社区维护者发送详细的漏洞信息。
-### Enhancement
+### New Feature
-Enhancements or new feature proposals are also welcome. The more concrete and
rationale the proposal is, the greater the
-chance it will be incorporated into future releases.
+欢迎您增强功能或新功能建议。提案越具体、越合理,您在 Fury 社区的影响力就越大。它有可能在之后版本发布。
-## Source code
+### 项目源代码
-- fury core repository: https://github.com/apache/fury
-- fury website repository: https://github.com/apache/fury-site
+- Fury Core 存储库:https://github.com/apache/fury
+- Fury 网站存储库:https://github.com/apache/fury-site
diff --git a/i18n/zh-CN/docusaurus-plugin-content-docs/current/community/how_to_join_community.md b/i18n/zh-CN/docusaurus-plugin-content-docs/current/community/how_to_join_community.md
index 51cf010..8203570 100644
--- a/i18n/zh-CN/docusaurus-plugin-content-docs/current/community/how_to_join_community.md
+++ b/i18n/zh-CN/docusaurus-plugin-content-docs/current/community/how_to_join_community.md
@@ -1,5 +1,5 @@
---
-title: 如何加入Fury社区
+title: 如何加入 Fury 社区
sidebar_position: 0
id: how_to_join_community
---
diff --git a/static/img/benchmarks/README.md b/static/img/benchmarks/README.md
index 1bc562a..d380e51 100644
--- a/static/img/benchmarks/README.md
+++ b/static/img/benchmarks/README.md
@@ -1,5 +1,7 @@
# Java Benchmarks
+
## System Environment
+
- Operation System:4.9.151-015.x86_64
- CPU:Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz
- Byte Order:Little Endian
@@ -9,14 +11,19 @@
- L3 cache: 33792K
## JMH params
+
Don't skip **warm up**, otherwise the results aren't accurate.
+
```bash
-f 1 -wi 3 -i 3 -t 1 -w 2s -r 2s -rf cs
```
## Benchmark Data:
+
### Struct
+
Struct is a class with 100 primitive fields:
+
```java
public class Struct {
public int f1;
@@ -27,8 +34,11 @@ public class Struct {
public double f99;
}
```
+
### Struct2
+
Struct2 is a class with 100 boxed fields:
+
```java
public class Struct {
public Integer f1;
@@ -39,16 +49,23 @@ public class Struct {
public Double f99;
}
```
+
### MediaContent
+
MEDIA_CONTENT is a class from [jvm-serializers](https://github.com/eishay/jvm-serializers/blob/master/tpc/src/data/media/MediaContent.java).
+
### Sample
+
SAMPLE is a class from [kryo benchmark](https://github.com/EsotericSoftware/kryo/blob/master/benchmarks/src/main/java/com/esotericsoftware/kryo/benchmarks/data/Sample.java)
## Benchmark Plots
+
### Serialize to heap buffer
+
Serialize data java byte array.
#### Java schema consistent serialization
+
The deserialization peer must have same class definition with the
serialization peer.
No class forward/backward compatibility are supported in this mode.
@@ -60,6 +77,7 @@ No class forward/backward compatibility are supported in this
mode.
</p>
#### Java schema compatible serialization
+
The deserialization peer can have different class definition with the
serialization peer.
Class forward/backward compatibility are supported in this mode.
@@ -71,6 +89,7 @@ Class forward/backward compatibility are supported in this
mode.
</p>
#### Java schema consistent deserialization
+
The deserialization peer must have same class definition with the
serialization peer.
No class forward/backward compatibility are supported in this mode.
@@ -82,6 +101,7 @@ No class forward/backward compatibility are supported in
this mode.
</p>
#### Java schema compatible deserialization
+
The deserialization peer can have different class definition with the
serialization peer.
Class forward/backward compatibility are supported in this mode.
<p align="center">
@@ -92,9 +112,11 @@ Class forward/backward compatibility are supported in this
mode.
</p>
### Off-heap serialization
+
Serialize data off-heap memory.
#### Java schema consistent serialization
+
The deserialization peer must have same class definition with the
serialization peer.
No class forward/backward compatibility are supported in this mode.
<p align="center">
@@ -105,6 +127,7 @@ No class forward/backward compatibility are supported in
this mode.
</p>
#### Java schema compatible serialization
+
The deserialization peer can have different class definition with the
serialization peer.
Class forward/backward compatibility are supported in this mode.
<p align="center">
@@ -115,6 +138,7 @@ Class forward/backward compatibility are supported in this
mode.
</p>
#### Java schema consistent deserialization
+
The deserialization peer must have same class definition with the
serialization peer.
No class forward/backward compatibility are supported in this mode.
<p align="center">
@@ -125,6 +149,7 @@ No class forward/backward compatibility are supported in
this mode.
</p>
#### Java schema compatible deserialization
+
The deserialization peer can have different class definition with the
serialization peer.
Class forward/backward compatibility are supported in this mode.
<p align="center">
@@ -135,10 +160,13 @@ Class forward/backward compatibility are supported in
this mode.
</p>
### Zero-copy serialization
-Note that zero-copy serialization just avoid the copy in serialization, if you send data to other machine, there may be copies.
+
+Note that zero-copy serialization just avoid the copy in serialization, if you send data to other machine, there may be copies.
But if you serialize data between processes on same node and use shared-memory, if the data are in off-heap before serialization, then other processes can read this buffer without any copies.
+
#### Java zero-copy serialize to heap buffer
+
<p align="center">
<img width="24%" alt=""
src="zerocopy/zero_copy_bench_serialize_BUFFER_to_array_tps.png">
<img width="24%" alt=""
src="zerocopy/zero_copy_bench_serialize_BUFFER_to_directBuffer_tps.png">
@@ -147,6 +175,7 @@ But if you serialize data between processes on same node
and use shared-memory,
</p>
#### Java zero-copy serialize to direct buffer
+
<p align="center">
<img width="24%" alt=""
src="zerocopy/zero_copy_bench_deserialize_BUFFER_from_array_tps.png">
<img width="24%" alt=""
src="zerocopy/zero_copy_bench_deserialize_BUFFER_from_directBuffer_tps.png">
@@ -155,7 +184,9 @@ But if you serialize data between processes on same node
and use shared-memory,
</p>
## Benchmark Data
+
### Java Serialization
+
| Lib | Benchmark | bufferType | objectType | references | Tps |
| ------- | ------- | ------- | ------- | ------- | ------- |
| Fst | serialize | array | SAMPLE | False | 915907.574306 |
@@ -464,6 +495,7 @@ But if you serialize data between processes on same node
and use shared-memory,
| Protostuff | deserialize | directBuffer | STRUCT2 | False | 425523.315814 |
### Java Zero-copy
+
| Lib | Benchmark | array_size | bufferType | dataType | Tps |
| ------- | ------- | ------- | ------- | ------- | ------- |
| Fst | deserialize | 200 | array | PRIMITIVE_ARRAY | 219333.990504 |