airborne12 opened a new pull request, #61584:
URL: https://github.com/apache/doris/pull/61584
## Summary
- Fix `visitMatch()` crash ("SlotReference in Match failed to get Column")
when MATCH references alias slots that lost column metadata (e.g.,
`CAST(variant['key'] AS VARCHAR) AS fn`)
- Add graceful fallback in `ExpressionTranslator.visitMatch()` when slot
metadata is missing
- Add a new rewrite rule, `PushDownMatchPredicateAsVirtualColumn`, that
extracts MATCH from join/filter predicates and pushes it down as a virtual
column on OlapScan for inverted index evaluation
## Problem
MATCH crashes when the following chain of conditions holds:
1. The left side of MATCH is an alias over a non-trivial expression (Cast,
ElementAt, etc.), so `Alias.toSlot()` loses the `originalColumn`/`originalTable`
metadata.
2. An OR predicate references join-dependent columns (`l.objectId IS NOT NULL`,
EXISTS mark `$c$1`), which prevents MATCH from being pushed below the join.
3. As a result, MATCH is stuck at the join layer and references a
metadata-less alias slot, so `visitMatch()` throws.
**Reproducer:**
```sql
WITH contacts AS (
  SELECT objectId, CAST(overflowProperties['string_8'] AS VARCHAR) AS firstName
  FROM objects_small WHERE portalId = 865815822
),
lists AS (
  SELECT objectId FROM lists_v2 WHERE portalId = 865815822
)
SELECT o.objectId
FROM contacts o LEFT JOIN lists l ON o.objectId = l.objectId
WHERE firstName MATCH_ANY 'john' OR l.objectId IS NOT NULL;
-- ERROR: SlotReference in Match failed to get Column
```
## Solution
1. **`ExpressionTranslator.visitMatch()`**: use
`getOriginalColumn().orElse(null)` instead of `orElseThrow()`. When
column/table metadata is missing, `invertedIndex` is `null` and BE falls back
to slow-path expression evaluation.
2. **`PushDownMatchPredicateAsVirtualColumn`** (new rewrite rule): traces
the MATCH's alias slot back through the Project to the original column
expression, creates a virtual column `(original_expr MATCH_ANY 'term')` on
OlapScan, and replaces the MATCH in the predicate with a reference to that
boolean slot. BE then evaluates it via `fast_execute()` using the inverted index.
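
To make the first change concrete, here is a minimal, self-contained sketch of the `orElseThrow()` versus `orElse(null)` behavior. `ToyMatchTranslator` and its nested `Slot` are hypothetical stand-ins, not the real Doris classes, and the strings returned by `evaluationPath` are illustrative labels only.

```java
import java.util.Optional;

// Toy model only: these names are hypothetical and do not mirror the Doris API.
final class ToyMatchTranslator {
    /** A slot whose column metadata may be missing, e.g. after aliasing a CAST. */
    static final class Slot {
        final String name;
        final Optional<String> originalColumn;

        private Slot(String name, Optional<String> originalColumn) {
            this.name = name;
            this.originalColumn = originalColumn;
        }

        static Slot withColumn(String name, String column) {
            return new Slot(name, Optional.of(column));
        }

        static Slot withoutColumn(String name) {
            return new Slot(name, Optional.empty());
        }
    }

    // Strict lookup: missing metadata aborts translation with an exception.
    static String resolveStrict(Slot slot) {
        return slot.originalColumn.orElseThrow(
                () -> new IllegalStateException("SlotReference in Match failed to get Column"));
    }

    // Lenient lookup: missing metadata yields null instead of throwing.
    static String resolveLenient(Slot slot) {
        return slot.originalColumn.orElse(null);
    }

    // A null column means no inverted index can be attached, so the slow path is used.
    static String evaluationPath(Slot slot) {
        return resolveLenient(slot) == null ? "slow path" : "inverted index";
    }
}
```

With this model, a slot built by `Slot.withoutColumn("fn")` makes `resolveStrict` throw, while `evaluationPath` simply reports the slow path.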
**Plan transformation:**
```
Before:
Filter(fn MATCH_ANY 'john' OR l.objectId IS NOT NULL)   ← crashes or slow path
└── Join → Project[CAST(col) as fn] → OlapScan
After:
Filter(__match_vc OR l.objectId IS NOT NULL)   ← boolean reference, no crash
└── Join → Project[fn, __match_vc] → OlapScan[virtualColumns=[(CAST(col) MATCH_ANY 'john')]]
                                                  ↑ inverted index fast path
```
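
The transformation above can be sketched on a toy plan in which expressions are plain strings. This is only an illustration under stated assumptions: `ToyMatchPushDown`, `Disjunct`, `Result`, and the `__match_vc_<n>` naming scheme are hypothetical and are not taken from the actual rule implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model only: expressions are strings and all names here are hypothetical.
final class ToyMatchPushDown {
    /** One disjunct of the filter: either `slot MATCH_ANY 'term'` or an opaque predicate. */
    static final class Disjunct {
        final String slot; // non-null only for MATCH disjuncts
        final String term; // non-null only for MATCH disjuncts
        final String text; // non-null only for opaque disjuncts

        private Disjunct(String slot, String term, String text) {
            this.slot = slot;
            this.term = term;
            this.text = text;
        }

        static Disjunct match(String slot, String term) {
            return new Disjunct(slot, term, null);
        }

        static Disjunct other(String text) {
            return new Disjunct(null, null, text);
        }

        boolean isMatch() {
            return slot != null;
        }
    }

    /** The rewritten filter plus the virtual columns to attach to the scan. */
    static final class Result {
        final String filter;
        final List<String> virtualColumns;

        Result(String filter, List<String> virtualColumns) {
            this.filter = filter;
            this.virtualColumns = virtualColumns;
        }
    }

    // projectAliases maps an alias slot name to the expression the Project computes for it.
    static Result rewrite(List<Disjunct> disjuncts, Map<String, String> projectAliases) {
        List<String> virtualColumns = new ArrayList<>();
        List<String> rewritten = new ArrayList<>();
        for (Disjunct d : disjuncts) {
            String source = d.isMatch() ? projectAliases.get(d.slot) : null;
            if (source != null) {
                // Trace the alias back to its expression and move the MATCH to the scan.
                virtualColumns.add("(" + source + " MATCH_ANY '" + d.term + "')");
                rewritten.add("__match_vc_" + (virtualColumns.size() - 1));
            } else if (d.isMatch()) {
                // Not an alias from the Project: leave the MATCH where it is.
                rewritten.add(d.slot + " MATCH_ANY '" + d.term + "'");
            } else {
                rewritten.add(d.text);
            }
        }
        return new Result(String.join(" OR ", rewritten), virtualColumns);
    }
}
```

In this sketch the filter keeps its OR structure; only the MATCH disjunct is swapped for a boolean slot name, which is why the join-dependent disjunct no longer blocks index evaluation at the scan.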
## Test plan
- [x] Manual test: verified with variant subcolumn + EXISTS + OR, LEFT JOIN
+ OR
- [ ] Regression test
- [ ] Unit Test
🤖 Generated with [Claude Code](https://claude.com/claude-code)