This is an automated email from the ASF dual-hosted git repository.
yuxia pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/fluss.git
The following commit(s) were added to refs/heads/main by this push:
new 613d35436 [lake/lance] add Flink memory usage note (#1909)
613d35436 is described below
commit 613d35436bd35b17d03fd330795cfd1a43d53931
Author: xx789 <[email protected]>
AuthorDate: Fri Oct 31 13:49:00 2025 +0800
[lake/lance] add Flink memory usage note (#1909)
---
website/docs/streaming-lakehouse/integrate-data-lakes/lance.md | 2 ++
1 file changed, 2 insertions(+)
diff --git a/website/docs/streaming-lakehouse/integrate-data-lakes/lance.md b/website/docs/streaming-lakehouse/integrate-data-lakes/lance.md
index 082ef04a7..af435973e 100644
--- a/website/docs/streaming-lakehouse/integrate-data-lakes/lance.md
+++ b/website/docs/streaming-lakehouse/integrate-data-lakes/lance.md
@@ -72,6 +72,8 @@ Additionally, when following the [Start Datalake Tiering
Service](maintenance/ti
> **NOTE**: Fluss v0.8 only supports tiering log tables to Lance.
+> **NOTE**: The Lance connector uses the Arrow Java library, which operates
on off-heap memory. To prevent `java.lang.OutOfMemoryError: Direct buffer
memory` errors in the Flink TaskManager, increase the value of
`taskmanager.memory.task.off-heap.size` in `<FLINK_HOME>/conf/config.yaml` to
at least `'512m'` (e.g., `taskmanager.memory.task.off-heap.size: 512m`). You
may need to raise this value further (for example, to `'1g'`) depending on your
workload and data size.
+
Then, the datalake tiering service continuously tiers data from Fluss to
Lance. The parameter `table.datalake.freshness` controls how often
Fluss writes data to Lance tables. By default, the data freshness is 3 minutes.
You can also specify Lance table properties when creating a datalake-enabled
Fluss table by using the `lance.` prefix within the Fluss table properties
clause.
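
As a minimal sketch of the memory setting described in the note added by this commit, the fragment below shows how the entry might look in `<FLINK_HOME>/conf/config.yaml`. The key and the `512m` value come from the note itself; the commented `1g` alternative is the note's example of a higher value, not a measured recommendation, and the right size depends on your workload.

```yaml
# <FLINK_HOME>/conf/config.yaml
# Off-heap memory reserved for task code; the Arrow Java library used by the
# Lance connector allocates its buffers here. 512m is the minimum suggested
# in the note.
taskmanager.memory.task.off-heap.size: 512m

# For larger workloads, a higher value such as 1g may be needed instead:
# taskmanager.memory.task.off-heap.size: 1g
```

Restart the Flink TaskManager after changing this file so that the new memory configuration takes effect.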