Hi, Cheng.
Thanks for driving this work. A few comments are below:
1: >> "When creating primary key tables, we enforce table.datalake.enabled = 
false."
So, will it throw an unsupported-operation exception, or just set 
`table.datalake.enabled = false` regardless of what users set?
Also, it seems to conflict with "For both log tables and primary key tables, 
xxx." since IIUC, only log tables are supported.

2: >> "The size of a fragment is controlled by the configuration option 
datalake.lance.batch_size"
Is it a per-table option? If so, I'd suggest renaming it to 
`lance.batch_size` to follow the convention we have for Paimon.

3: >> "When lake committer needs to find out the bucket end offset of committed 
lake snapshot, it has to reconstruct this information by reading the entire 
artificial bucket and offset columns from lake "
Can we commit the bucket end offset to Lance via storageOptions? Looking into 
the lance-spark code, it seems possible to pass storageOptions while committing 
to Lance.
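
To make the trade-off in point 3 concrete, here is a minimal sketch in plain 
Python. It does not call any Lance or Fluss API. The property key format 
`fluss.bucket.<id>.end_offset` and all function names are hypothetical, and 
"end offset" is assumed to mean the highest offset seen per bucket. It 
contrasts rebuilding the offsets by scanning every (bucket, offset) pair with 
reading them back from a small string map stored alongside the commit.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical key layout for commit-time properties (an assumption,
# not an existing Lance or Fluss convention).
PREFIX = "fluss.bucket."
SUFFIX = ".end_offset"


def end_offsets_by_scan(rows: Iterable[Tuple[int, int]]) -> Dict[int, int]:
    """Rebuild per-bucket end offsets by reading every (bucket, offset) pair.

    Cost grows with the number of rows in the lake table.
    """
    ends: Dict[int, int] = {}
    for bucket, offset in rows:
        if offset > ends.get(bucket, -1):
            ends[bucket] = offset
    return ends


def encode_offsets(ends: Dict[int, int]) -> Dict[str, str]:
    """Flatten per-bucket end offsets into string key/value properties."""
    return {f"{PREFIX}{bucket}{SUFFIX}": str(offset) for bucket, offset in ends.items()}


def decode_offsets(props: Dict[str, str]) -> Dict[int, int]:
    """Recover per-bucket end offsets from the properties, ignoring other keys.

    Cost grows with the number of buckets, not the number of rows.
    """
    ends: Dict[int, int] = {}
    for key, value in props.items():
        if key.startswith(PREFIX) and key.endswith(SUFFIX):
            bucket = int(key[len(PREFIX):-len(SUFFIX)])
            ends[bucket] = int(value)
    return ends
```

If the committer could attach such a map at commit time, finding the end 
offsets of the last committed snapshot would only need `decode_offsets` on 
the stored properties instead of a full scan of the two artificial columns.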


Best regards,
Yuxia

----- Original Message -----
From: "Wang Cheng" <[email protected]>
To: "dev" <[email protected]>
Sent: Friday, July 4, 2025, 12:26:43 PM
Subject: [SPAM][DISCUSS] FIP-5: Support tiering Fluss data to Lance

Hi all,


Lance is a popular table format designed for performant AI workloads. To enable 
the integration between Fluss and the multimodal AI data lake ecosystem, I'd 
like to propose FIP-5: Support tiering Fluss data to Lance [1].


Any feedback and suggestions on this proposal are welcome!


[1]: 
https://cwiki.apache.org/confluence/display/FLUSS/FIP-5%3A+Support+tiering+Fluss+data+to+Lance



Regards,
Cheng


