This is an automated email from the ASF dual-hosted git repository.

pandalee pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/fory.git


The following commit(s) were added to refs/heads/main by this push:
     new 49746f346 docs(python): add row format doc (#2499)
49746f346 is described below

commit 49746f34646dd81e0c2e3ea751be57b9c6788c34
Author: Shawn Yang <[email protected]>
AuthorDate: Sat Aug 23 10:42:41 2025 +0800

    docs(python): add row format doc (#2499)
    
    ## What does this PR do?
    
    <!-- Describe the purpose of this PR. -->
    
    ## Related issues
    
    #2498
    
    ## Does this PR introduce any user-facing change?
    
    <!--
    If any user-facing interface changes, please [open an
    issue](https://github.com/apache/fory/issues/new/choose) describing the
    need to do so and update the document if necessary.
    -->
    
    - [ ] Does this PR introduce any public API change?
    - [ ] Does this PR introduce any binary protocol compatibility change?
    
    ## Benchmark
    
    <!--
    When the PR has an impact on performance (if you don't know whether the
    PR will have an impact on performance, you can submit the PR first, and
    if it will have impact on performance, the code reviewer will explain
    it), be sure to attach a benchmark data here.
    -->
---
 python/README.md | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 75 insertions(+), 1 deletion(-)

diff --git a/python/README.md b/python/README.md
index fb29c7dc5..963c8409a 100644
--- a/python/README.md
+++ b/python/README.md
@@ -46,7 +46,7 @@ print(fory.deserialize(data))
 
 ### Cross-language Serialization
 
-Fory excels at cross-language serialization. You can serialize data in Python 
and deserialize it in another language like Java or Go, and vice-versa.
+Apache Fory excels at cross-language serialization. You can serialize data in 
Python and deserialize it in another language like Java or Go, and vice-versa.
 
 Here's an example of how to serialize an object in Python and deserialize it 
in Java:
 
@@ -95,6 +95,80 @@ public class ReferenceExample {
 }
 ```
 
+### Row Format Zero-Copy Partial Serialization
+
+Apache Fory provides a random-access row format, which maps a typed nested 
struct into a binary and reads its nested elements without deserializing the 
whole binary. This minimizes the deserialization overhead for huge objects 
when you only need to access part of the data. You can also encode huge 
objects into a binary, write it to a file, and then mmap that file into memory 
to reduce memory overhead.
+
+**Python**
+
+```python
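+from dataclasses import dataclass
+from typing import Dict, List
+
+import pyarrow as pa
+import pyfory
+
+
+@dataclass
+class Bar:
+    f1: str
+    f2: List[pa.int64]
+
+
+@dataclass
+class Foo:
+    f1: pa.int32
+    f2: List[pa.int32]
+    f3: Dict[str, pa.int32]
+    f4: List[Bar]
+
+
+# Build a row encoder from the dataclass schema.
+encoder = pyfory.encoder(Foo)
+foo = Foo(f1=10, f2=list(range(1_000_000)),
+          f3={f"k{i}": i for i in range(1_000_000)},
+          f4=[Bar(f1=f"s{i}", f2=list(range(10))) for i in range(1_000_000)])
+# Encode the whole object into the row-format binary.
+binary: bytes = encoder.to_row(foo).to_bytes()
+# Wrap the binary for random access to nested elements.
+foo_row = pyfory.RowData(encoder.schema, binary)
+print(foo_row.f2[100000], foo_row.f4[100000].f1, foo_row.f4[200000].f2[5])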
+```
+
+**Java**
+
+```java
+public class Bar {
+  String f1;
+  List<Long> f2;
+}
+
+public class Foo {
+  int f1;
+  List<Integer> f2;
+  Map<String, Integer> f3;
+  List<Bar> f4;
+}
+
+RowEncoder<Foo> encoder = Encoders.bean(Foo.class);
+Foo foo = new Foo();
+foo.f1 = 10;
+foo.f2 = IntStream.range(0, 1000000).boxed().collect(Collectors.toList());
+foo.f3 = IntStream.range(0, 1000000).boxed().collect(Collectors.toMap(i -> "k"+i, i->i));
+List<Bar> bars = new ArrayList<>(1000000);
+for (int i = 0; i < 1000000; i++) {
+  Bar bar = new Bar();
+  bar.f1 = "s"+i;
+  bar.f2 = LongStream.range(0, 10).boxed().collect(Collectors.toList());
+  bars.add(bar);
+}
+foo.f4 = bars;
+// The binary row can be read zero-copy by Python
+BinaryRow binaryRow = encoder.toRow(foo);
+// The binary row can also be data produced by Python
+Foo newFoo = encoder.fromRow(binaryRow);
+// zero-copy read of `List<Integer> f2`
+BinaryArray binaryArray2 = binaryRow.getArray(1);
+// zero-copy read of `List<Bar> f4`
+BinaryArray binaryArray4 = binaryRow.getArray(3);
+// zero-copy read of the 11th element of `List<Bar> f4`
+BinaryRow barStruct = binaryArray4.getStruct(10);
+
+// zero-copy read of the 6th element of `f2` in the 11th element of `List<Bar> f4`
+barStruct.getArray(1).getInt64(5);
+RowEncoder<Bar> barEncoder = Encoders.bean(Bar.class);
+// deserialize part of data.
+Bar newBar = barEncoder.fromRow(barStruct);
+Bar newBar2 = barEncoder.fromRow(binaryArray4.getStruct(20));
+```
+
 ## Useful Links
 
 - **[Project Website](https://fory.apache.org)**
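
The new README section mentions writing an encoded binary to a file and then mmap-ing it so that only the accessed parts are read. The following is a minimal sketch of that idea using only the Python standard library. It does not use the Fory row format: the fixed-width int64 layout and the helper names `write_int64_column` and `read_int64_at` are assumptions made for illustration.

```python
import mmap
import os
import struct
import tempfile


def write_int64_column(path, values):
    """Write values back to back as fixed-width little-endian int64."""
    with open(path, "wb") as f:
        for v in values:
            f.write(struct.pack("<q", v))


def read_int64_at(buf, index):
    """Read one element by offset arithmetic, without decoding the rest."""
    return struct.unpack_from("<q", buf, index * 8)[0]


# Hypothetical file name inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "column.bin")
write_int64_column(path, range(100_000))

# Map the file read-only; the OS pages in only the regions that are touched.
with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as buf:
    element = read_int64_at(buf, 54_321)
    size_on_disk = len(buf)

print(element, size_on_disk)
```

Because every element has a fixed width, its offset is computed directly from its index, which is what makes random access possible without parsing the preceding data. A real row format also has to handle variable-width and nested fields, which this sketch leaves out.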

