924060929 commented on PR #62304:
URL: https://github.com/apache/doris/pull/62304#issuecomment-4853573094
This PR implements the IS NULL optimization by removing the commented-out
META-type `visitIsNull` and re-implementing it as an `ACCESS_NULL` path suffix
(`[col, NULL]`). The removed version was:
// context.setType(ColumnAccessPathType.META);
This takes the same "read-mode in the path string" direction as the OFFSET
tier (#62205), and it is the NULL half of the namespace collision I filed
there: a struct field literally named `null` is read as the NULL meta by BE's
case-insensitive `StringCaseEqual`, so the struct enters `NULL_MAP_ONLY` and
skips all sub-columns. Verified on a current build:
```sql
CREATE TABLE t(id INT, s STRUCT<`null`:INT, x:INT>)
DUPLICATE KEY(id) DISTRIBUTED BY HASH(id) BUCKETS 1
PROPERTIES("replication_num"="1");
INSERT INTO t VALUES(1, named_struct('null',100,'x',200));
SELECT s.`null` FROM t; -- pruning ON -> NULL (wrong); OFF -> 100 (correct)
```
Two things specific to this PR:
1. The `stripNullSuffixPaths` comment here already documents the BE trap
verbatim ("Struct/Array/Map iterators treat a leading NULL sub-path as
NULL_MAP_ONLY and skip all children") — the behavior was known, but only the
legitimate-meta path was guarded; the field-name entry was missed.
2. NULL is more dangerous than OFFSET: the struct reader actually acts on
`NULL_MAP_ONLY`, while it has no `OFFSET_ONLY` branch — so an `offset` field
survives by luck but a `null` field does not.
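To make the asymmetry concrete, here is a toy Python model of a struct reader that takes its read mode from the path string. It is a sketch, not Doris code: `is_meta`, `read_struct`, and the dict-based row are hypothetical stand-ins, and only the case-insensitive match and the missing `OFFSET_ONLY` branch are taken from the behavior described above.

```python
# Toy model only; none of these names exist in the Doris BE.

def is_meta(component: str, meta: str) -> bool:
    # Stand-in for a case-insensitive comparison such as StringCaseEqual.
    return component.lower() == meta.lower()


def read_struct(row: dict, sub_path: list):
    """Read one struct value; sub_path holds the components after the column name."""
    head = sub_path[0]
    if is_meta(head, "NULL"):
        # NULL_MAP_ONLY: only the null map is read and all children are
        # skipped, so the requested field value comes back as NULL.
        return None
    # Structs have no OFFSET_ONLY branch, so "offset" falls through to an
    # ordinary field lookup.
    return row[head]


row = {"null": 100, "x": 200, "offset": 7}
print(read_struct(row, ["x"]))       # 200: ordinary field
print(read_struct(row, ["offset"]))  # 7: survives because nothing acts on it
print(read_struct(row, ["null"]))    # None: collides with the NULL meta suffix
```

In this model the `offset` field is safe only until someone adds an `OFFSET_ONLY` branch to the struct reader, which is why "survives by luck" is the right description.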
The functional IS NULL pruning itself tests correctly (scalar / struct /
variant / mixed / IS NOT NULL / WHERE pushdown all match the non-pruned
results). The concern is the design direction: encoding read-mode as a path
component reintroduces the namespace it then collides on. Keeping read-mode as
the orthogonal `DATA`/`META` type dimension — the one this PR removed — avoids
it.
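A minimal sketch of that alternative, assuming the read mode travels next to the path instead of inside it. `AccessType`, `AccessPath`, and `read_struct` are hypothetical names modeled loosely on the `DATA`/`META` distinction; they are not the actual FE or BE types.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AccessType(Enum):
    DATA = "DATA"
    META = "META"


@dataclass(frozen=True)
class AccessPath:
    components: tuple                      # field names only, never a read mode
    type: AccessType = AccessType.DATA     # read mode kept as a separate dimension


def read_struct(row: Optional[dict], path: AccessPath):
    # The mode is decided by path.type alone, so no field name can be
    # mistaken for a meta request, whatever its spelling.
    if path.type is AccessType.META:
        return {"is_null": row is None}
    return row[path.components[-1]]


row = {"null": 100, "x": 200}
field_read = AccessPath(("s", "null"))           # models SELECT s.`null`
null_check = AccessPath(("s",), AccessType.META)  # models s IS NULL

print(read_struct(row, field_read))  # 100
print(read_struct(row, null_check))  # {'is_null': False}
```

Because the two requests differ in `type` rather than in a path component, the field named `null` and the null-map read can no longer be confused.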
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]