Ilya Soin created FLINK-39637:
---------------------------------

             Summary: Filter push-down for the savepoint table connector
                 Key: FLINK-39637
                 URL: https://issues.apache.org/jira/browse/FLINK-39637
             Project: Flink
          Issue Type: Improvement
          Components: API / State Processor
            Reporter: Ilya Soin


h3. Problem

The savepoint table connector reads keyed state from Flink savepoints via SQL. 
Currently, every query - even one with a _WHERE_ clause on the primary key - 
must restore the state backend and iterate over every key in every key group. 
For large savepoints with millions of keys, this makes point lookups and 
small-range queries unnecessarily expensive.
h3. Proposed solution

Implement _SupportsFilterPushDown_ on the savepoint table source, enabling the 
Flink SQL planner to push key predicates ({_}=, IN, BETWEEN, <, <=, >, >={_}, 
and combinations via {_}AND/OR{_}) directly into the savepoint scan.
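
A minimal sketch of what that hook could look like ({_}SupportsFilterPushDown{_} 
and its nested {_}Result{_} type are the actual planner API; the class name and 
the predicate check are hypothetical placeholders):
{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.flink.table.connector.source.abilities.SupportsFilterPushDown;
import org.apache.flink.table.expressions.ResolvedExpression;

/**
 * Hypothetical savepoint table source. The real class would also implement
 * ScanTableSource; only the filter push-down hook is sketched here.
 */
public class SavepointTableSource implements SupportsFilterPushDown {

    // Key predicates that the savepoint scan will evaluate itself.
    private final List<ResolvedExpression> keyFilters = new ArrayList<>();

    @Override
    public Result applyFilters(List<ResolvedExpression> filters) {
        List<ResolvedExpression> accepted = new ArrayList<>();
        List<ResolvedExpression> remaining = new ArrayList<>();
        for (ResolvedExpression filter : filters) {
            if (isSupportedKeyPredicate(filter)) {
                accepted.add(filter); // =, IN, BETWEEN, <, <=, >, >=, AND/OR on the key
            } else {
                remaining.add(filter); // everything else stays with the planner
            }
        }
        keyFilters.addAll(accepted);
        // Reporting which filters were accepted lets the planner keep only
        // the remaining ones in the Filter node above the scan.
        return Result.of(accepted, remaining);
    }

    private boolean isSupportedKeyPredicate(ResolvedExpression expr) {
        // Placeholder: walk the expression tree and verify that it references
        // only the key column with the supported comparison operators.
        return false;
    }
}
{code}
The accepted expressions would then be translated into concrete target keys 
(for equality filters) and a key predicate (for the general case) that drive 
the two pruning levels below.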

Filtering would be applied at two levels:
 # *Split pruning* - For equality filters ({_}WHERE k = 42 or WHERE k IN (1, 2, 
3){_}), the connector would determine which key groups contain the target keys 
and create input splits only for those key groups. This avoids restoring the 
state backend for irrelevant portions of the savepoint altogether.
 # *Key iteration pruning* - Within each input split, the key iterator would be 
wrapped with a filter that skips non-matching keys before they reach the state 
reader function. This applies to all supported filter types, including range 
predicates, where split-level pruning is not possible. A sketch of both levels 
follows this list.
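
A rough sketch of both levels, assuming the target keys and a key predicate 
have already been extracted from the accepted expressions 
({_}KeyGroupRangeAssignment{_} is Flink's actual key-group assignment utility; 
the class and method names around it are illustrative):
{code:java}
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.function.Predicate;

import org.apache.flink.runtime.state.KeyGroupRangeAssignment;

public final class KeyPruningSketch {

    /**
     * Level 1, split pruning: map each target key from an equality predicate
     * to its key group, so input splits are created only for key groups that
     * can contain a match.
     */
    public static Set<Integer> keyGroupsToRead(List<Object> equalityKeys, int maxParallelism) {
        Set<Integer> keyGroups = new HashSet<>();
        for (Object key : equalityKeys) {
            // Uses the same hash-based assignment Flink applied when the
            // savepoint was written, so each key maps to the right group.
            keyGroups.add(KeyGroupRangeAssignment.assignToKeyGroup(key, maxParallelism));
        }
        return keyGroups;
    }

    /**
     * Level 2, key iteration pruning: wrap a split's key iterator so that
     * non-matching keys are dropped before they reach the state reader
     * function. This also covers range predicates, for which split pruning
     * is not possible.
     */
    public static <K> Iterator<K> filteredKeys(Iterator<K> keys, Predicate<K> keyPredicate) {
        return new Iterator<K>() {
            private K next;
            private boolean hasNext = advance();

            private boolean advance() {
                while (keys.hasNext()) {
                    K candidate = keys.next();
                    if (keyPredicate.test(candidate)) {
                        next = candidate;
                        return true;
                    }
                }
                return false;
            }

            @Override
            public boolean hasNext() {
                return hasNext;
            }

            @Override
            public K next() {
                if (!hasNext) {
                    throw new NoSuchElementException();
                }
                K result = next;
                hasNext = advance();
                return result;
            }
        };
    }
}
{code}
Split pruning relies on the savepoint's max parallelism (which fixes the 
number of key groups); that value is recorded in the savepoint metadata.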

h3. Scope
 * Filter push-down applies only to state keys, not to state values.
 * Initially, push-down supports only primitive key types (e.g. {_}INT, 
STRING{_}); composite key support can be added in a follow-up ticket.


