Currently, Flink's ability to union read data from the datalake and Fluss is
tightly coupled with Paimon's implementation, which limits its flexibility and
extensibility. Paimon-related classes are hard-coded in the fluss-flink module.
This makes it difficult for Flink to support union read against other
datalakes, and for other compute engines such as Spark and Trino to integrate
with the union read capability. Moreover, the tight coupling obscures the core
logic of union read, making the code harder to maintain and evolve.

To address this, I'd like to propose FIP-6: Decouple Flink union read with
paimon[1], which seeks to decouple union read from Paimon by introducing
well-defined interfaces and extension points that Paimon should implement. By
doing so, Flink can support a wider range of datalakes. Furthermore, the
standardized interfaces will allow other compute engines to integrate with
Fluss's union read capability.

Your feedback and suggestions on this proposal are welcome. Looking forward to
a productive discussion!

[1]: https://cwiki.apache.org/confluence/display/FLUSS/FIP-6%3A+Decouple+Flink+union+read+with+paimon

Best regards, 
Yuxia 
