Hi, Paimon Devs, I’d like to start a discussion about PIP-16[1].

Position deletes are one way to implement the Merge-On-Read (MOR) structure, 
and other formats such as Iceberg and Delta have already adopted them. 
By combining position deletes with Paimon's LSM tree, we can create a new 
position deletion mode unique to Paimon.
Under this mode, writing incurs extra overhead (a lookup plus writing the 
delete file), but reading can retrieve data directly using 
"data + filter with position delete", avoiding the additional cost of merging 
different files. 
Furthermore, this mode can be easily integrated into native engine solutions 
like Spark + Gluten in the future, thereby significantly enhancing read 
performance.
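
To make the write/read trade-off concrete, here is a minimal sketch in Python. 
It is only an illustration, not the design in PIP-16: the names Table, 
DataFile, write and read are hypothetical, the linear scan stands in for the 
real LSM lookup, and it assumes keys are unique within each written file.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    name: str
    rows: list  # (key, value) tuples; a row's position is its list index


@dataclass
class Table:
    files: list = field(default_factory=list)
    # Position deletes: file name -> set of deleted row positions.
    deletes: dict = field(default_factory=dict)

    def write(self, name, rows):
        # Write-side overhead: look up each incoming key in the existing
        # files and record the position of the row it supersedes.
        # (A linear scan here; an assumption standing in for an LSM lookup.)
        incoming = {key for key, _ in rows}
        for f in self.files:
            for pos, (key, _) in enumerate(f.rows):
                if key in incoming:
                    self.deletes.setdefault(f.name, set()).add(pos)
        self.files.append(DataFile(name, list(rows)))

    def read(self):
        # Read side: "data + filter with position delete". Each file is
        # filtered on its own, with no merge across files.
        result = []
        for f in self.files:
            deleted = self.deletes.get(f.name, set())
            result.extend(
                row for pos, row in enumerate(f.rows) if pos not in deleted
            )
        return result


table = Table()
table.write("f1", [(1, "a"), (2, "b")])
table.write("f2", [(2, "c"), (3, "d")])
print(table.read())
```

In this sketch the second write marks position 1 of f1 as deleted, so the 
reader skips that row without comparing f1 and f2 against each other.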

Looking forward to your questions and suggestions.

Best, zouxxyy

[1] https://cwiki.apache.org/confluence/x/Tws4EQ
