[
https://issues.apache.org/jira/browse/HBASE-30150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Viraj Jasani resolved HBASE-30150.
----------------------------------
Fix Version/s: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.6
Hadoop Flags: Reviewed
Resolution: Fixed
> Propagate filter hints through composite filters
> ------------------------------------------------
>
> Key: HBASE-30150
> URL: https://issues.apache.org/jira/browse/HBASE-30150
> Project: HBase
> Issue Type: Improvement
> Components: Filters, Scanners
> Reporter: Shubham Roy
> Assignee: Shubham Roy
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2, 2.6.6
>
>
> h3. Context
> HBASE-29974 introduced two new Filter API methods,
> {{getHintForRejectedRow(Cell)}} and {{getSkipHint(Cell)}}, that allow
> filters to provide seek hints when rows are rejected by {{filterRowKey}} or
> when cells are structurally skipped before {{filterCell}} is reached
> (time-range gates, column-set exclusion, version-limit exhaustion). These
> methods are correctly delegated through {{FilterWrapper}}, but the composite
> filter wrappers do not propagate them.
> h3. Problem
> {{FilterListWithAND}}, {{FilterListWithOR}}, {{SkipFilter}}, and
> {{WhileMatchFilter}} do not override or delegate {{getHintForRejectedRow}} or
> {{getSkipHint}}. They inherit the no-op default from {{FilterBase}}, which
> returns {{null}}. This means:
> * A filter graph like {{FilterList(AND, MultiRowRangeFilter,
> ColumnPrefixFilter)}} will silently ignore any hint provided by sub-filters.
> * Almost all real-world HBase filter configurations use {{FilterList}} to
> compose filters. Until this JIRA is resolved, the hint optimization from
> HBASE-29974 only benefits standalone (non-composed) filter usage.
> * For CDC/replication use cases that combine filters with AND (e.g., a
> skip-scan filter combined with a time-range-aware filter), the seek hint path
> is effectively dead code.
> This was explicitly documented as a limitation in the Javadoc of both new
> methods in HBASE-29974 and deferred to this follow-up JIRA.
> h3. Scope
> The following classes need to override {{getHintForRejectedRow}} and
> {{getSkipHint}} with appropriate composition semantics:
> * *{{FilterListWithAND}}* — all sub-filters must agree on row rejection
> before a hint is meaningful. When multiple sub-filters provide hints, the
> composed hint should be the furthest one (furthest forward for forward
> scans, furthest backward for reversed scans). This is safe because an AND
> list accepts a row only if every sub-filter accepts it, so no row short of
> the furthest hint can be accepted.
> * *{{FilterListWithOR}}* — any sub-filter rejecting a row may provide a
> hint, but the composed hint must be the least aggressive (closest to current
> position) since other sub-filters may still accept intermediate rows.
> * *{{SkipFilter}}* — should delegate to the wrapped filter if the wrapped
> filter provides a hint.
> * *{{WhileMatchFilter}}* — should delegate to the wrapped filter if the
> wrapped filter provides a hint.
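The delegation described for the two single-filter wrappers can be sketched as follows. This is a minimal illustration, not the code from the pull request: {{HintSource}}, {{FixedHintSource}}, and {{DelegatingWrapperSketch}} are hypothetical names, and the sketch assumes hints are plain {{byte[]}} row keys, whereas the real methods take a {{Cell}} (their return type is not stated in the description above).

```java
/**
 * Hypothetical stand-in for the two hint methods. The real HBase methods
 * take a Cell; this sketch uses raw row keys to stay self-contained.
 */
interface HintSource {
  /** Row key to seek to after a row was rejected, or null for "no hint". */
  byte[] getHintForRejectedRow(byte[] rejectedRow);

  /** Row key to seek to after a cell was structurally skipped, or null. */
  byte[] getSkipHint(byte[] skippedRow);
}

/** Illustrative sub-filter that always answers with the same hint (possibly null). */
final class FixedHintSource implements HintSource {
  private final byte[] hint;

  FixedHintSource(byte[] hint) {
    this.hint = hint;
  }

  @Override
  public byte[] getHintForRejectedRow(byte[] rejectedRow) {
    return hint;
  }

  @Override
  public byte[] getSkipHint(byte[] skippedRow) {
    return hint;
  }
}

/** Single-filter wrapper in the role that SkipFilter or WhileMatchFilter plays. */
final class DelegatingWrapperSketch implements HintSource {
  private final HintSource wrapped;

  DelegatingWrapperSketch(HintSource wrapped) {
    this.wrapped = wrapped;
  }

  @Override
  public byte[] getHintForRejectedRow(byte[] rejectedRow) {
    // Forward the call unchanged; a null from the wrapped filter stays null.
    return wrapped.getHintForRejectedRow(rejectedRow);
  }

  @Override
  public byte[] getSkipHint(byte[] skippedRow) {
    // No wrapper state is read or written here, so the call has no side effects.
    return wrapped.getSkipHint(skippedRow);
  }
}
```

Because the wrapper only forwards the call, a wrapped filter without a hint still yields {{null}}, which matches the inherited default behavior.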
> h3. Key Design Considerations
> * *Hint composition for AND semantics:* when sub-filter A hints to row-X
> and sub-filter B hints to row-Y, the AND-list should use {{max(row-X, row-Y)}}
> for forward scans and {{min(row-X, row-Y)}} for reversed scans. The
> furthest hint is safe because ALL filters must accept a row, so every row
> before that hint is already rejected by at least one sub-filter.
> * *Hint composition for OR semantics:* the OR-list should use {{min(row-X,
> row-Y)}} for forward scans and {{max(row-X, row-Y)}} for reversed scans — the
> closest hint is required because ANY filter accepting means the row should
> not be skipped.
> * *Null handling:* if any sub-filter returns {{null}} (no hint), the
> composed result depends on the operator. For AND, null from one filter means
> "no opinion": the other hint can still be used. For OR, null from one filter
> means "no shortcut available": the entire composition must fall back to
> {{null}}.
> * *{{getSkipHint}} statelessness contract:* the composition must respect
> the contract that {{getSkipHint}} implementations must not modify filter
> state. The composite override should call sub-filters' {{getSkipHint}} and
> compose results without side effects.
> * *Reversed scan direction:* hint composition must be direction-aware,
> consistent with the contracts documented in HBASE-29974.
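The composition rules listed above (furthest hint for AND, closest hint for OR, operator-specific null handling, direction awareness) can be sketched as below. This is an illustration under stated assumptions rather than the merged implementation: {{HintCompositionSketch}} and its methods are hypothetical names, hints are modeled as {{byte[]}} row keys instead of cells, and rows are ordered by unsigned lexicographic comparison.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper; hints are modeled as row keys, with null meaning "no hint". */
final class HintCompositionSketch {
  private HintCompositionSketch() {
  }

  /** AND: ignore null hints and keep the hint furthest along the scan direction. */
  static byte[] composeAnd(List<byte[]> hints, boolean reversed) {
    byte[] best = null;
    for (byte[] hint : hints) {
      if (hint == null) {
        continue; // "no opinion": the remaining hints can still be used
      }
      if (best == null || isFurther(hint, best, reversed)) {
        best = hint;
      }
    }
    return best;
  }

  /** OR: any null hint disables the shortcut; otherwise keep the closest hint. */
  static byte[] composeOr(List<byte[]> hints, boolean reversed) {
    byte[] best = null;
    for (byte[] hint : hints) {
      if (hint == null) {
        return null; // "no shortcut available": fall back to no hint at all
      }
      if (best == null || isFurther(best, hint, reversed)) {
        best = hint;
      }
    }
    return best;
  }

  /** True if a lies further along the scan direction than b. */
  static boolean isFurther(byte[] a, byte[] b, boolean reversed) {
    // Unsigned lexicographic order, assumed here to model row key ordering.
    int cmp = Arrays.compareUnsigned(a, b);
    return reversed ? cmp < 0 : cmp > 0;
  }
}
```

Both methods are pure functions of their arguments, which is one way to respect the statelessness contract. Callers that need to pass null hints can build the list with {{Arrays.asList}}, since {{List.of}} rejects null elements.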
> h3. Test Plan
> * Unit tests for {{FilterListWithAND}} and {{FilterListWithOR}} hint
> composition: single hint provider, multiple hint providers, mixed
> null/non-null, forward and reversed scans
> * Unit tests for {{SkipFilter}} and {{WhileMatchFilter}} delegation
> * Integration tests with composed filter graphs exercising the hint path
> end-to-end (e.g., {{FilterList(AND, hintFilter, noHintFilter)}})
> * Regression tests ensuring existing {{FilterList}} behavior is unchanged
> when no sub-filter overrides the new methods
> h3. References
> * Parent JIRA:
> [HBASE-29974|https://issues.apache.org/jira/browse/HBASE-29974]
> * Master PR: [apache/hbase#7882|https://github.com/apache/hbase/pull/7882]