Thanks Yang for driving this work.
Can we clarify at the beginning of the FIP document that this filter
evaluation works on a best-effort basis? Specifically, it only performs
coarse-grained block skipping by leveraging RecordBatch statistics. To be
honest, the table.newScan().filter(recordBatchFilter) API gave me the
impression that the server side performs row-by-row filtering.
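To make the distinction concrete, here is a minimal sketch of why statistics-based evaluation is inherently best-effort. All names (BatchStats, mightContain) are hypothetical and not the actual Fluss API: a min/max check can only prove that a batch contains no match; when the range overlaps the predicate, the whole batch is returned even if no individual row matches, so the consumer must still filter row by row.

```java
// Hypothetical sketch of coarse-grained batch skipping via min/max
// statistics. BatchStats and mightContain are illustrative names only.
public class BatchSkipSketch {
    // Per-column statistics carried by a RecordBatch.
    record BatchStats(long min, long max) {
        // Best-effort check for the predicate "col == value": false means
        // the batch provably has no match and can be skipped; true only
        // means "might match", so rows must still be filtered individually.
        boolean mightContain(long value) {
            return value >= min && value <= max;
        }
    }

    public static void main(String[] args) {
        BatchStats stats = new BatchStats(10, 20);
        // 5 lies outside [10, 20]: the batch is safely skipped.
        System.out.println(stats.mightContain(5));   // false
        // 15 lies inside [10, 20], but the batch may still hold no row
        // equal to 15 (e.g. only rows 10 and 20) -- a false positive.
        System.out.println(stats.mightContain(15));  // true
    }
}
```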
Regards,
Cheng
------------------ Original ------------------
From: "dev" <[email protected]>
Date: Thu, Aug 7, 2025 11:11 AM
To: "dev" <[email protected]>
Subject: [DISCUSS] FIP-10: Support Log RecordBatch Filter Pushdown
Hello Fluss Community,
I propose initiating discussion on FIP-10: Support Log RecordBatch Filter
Pushdown (
https://cwiki.apache.org/confluence/display/FLUSS/FIP-10%3A+Support+Log+RecordBatch+Filter+Pushdown).
This optimization aims to improve the performance of Log table queries and
is now ready for community feedback.
This FIP introduces RecordBatch-level filter pushdown to enable early
filtering at the storage layer, thereby optimizing CPU, memory, and network
resources by skipping non-matching log record batches.
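The resource savings come from batches that are never deserialized or shipped at all. The sketch below illustrates that effect under stated assumptions; the Batch record and bytesAfterPushdown helper are hypothetical names for illustration, not the Fluss implementation.

```java
import java.util.List;

// Hypothetical sketch of RecordBatch-level filter pushdown at the
// storage layer: batches whose statistics rule out any match are
// skipped, saving CPU, memory, and network transfer. Names are
// illustrative only.
public class PushdownScanSketch {
    // A log batch with per-batch key statistics and its wire size.
    record Batch(long minKey, long maxKey, int sizeBytes) {}

    // Returns the bytes that survive pushdown of "key == target".
    static int bytesAfterPushdown(List<Batch> log, long target) {
        int shipped = 0;
        for (Batch b : log) {
            // Coarse-grained skip: provably no match in this batch.
            if (target < b.minKey() || target > b.maxKey()) continue;
            // Surviving batches are still shipped whole; row-level
            // filtering happens later on the consumer side.
            shipped += b.sizeBytes();
        }
        return shipped;
    }

    public static void main(String[] args) {
        List<Batch> log = List.of(
            new Batch(0, 9, 1024),
            new Batch(10, 19, 1024),
            new Batch(20, 29, 1024));
        // Only the middle batch can contain key 15, so two thirds of
        // the traffic is avoided.
        System.out.println(bytesAfterPushdown(log, 15)); // 1024
    }
}
```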
A proof-of-concept (PoC) has been implemented on the logfilter branch of
https://github.com/platinumhamburg/fluss and is ready for testing and
preview.