924060929 commented on PR #64436:
URL: https://github.com/apache/doris/pull/64436#issuecomment-4853411497

   Correctness is fine here: full access is a safe upper bound. But the 
implementation **over-reads**. It builds a fresh `CollectorContext` with 
`ACCESS_ALL` and discards the incoming context, which forces a full element 
read even when only metadata is needed.
   
   Example: in `cardinality(array_map(x->1, arr))` the lambda body doesn't 
reference the item, so only the array length is needed. The access path comes 
out as `[arr, *]` (full element read), but it should be `[arr, OFFSET]`. 
With an element-independent body, `array_map`/`array_count`/`array_exists` 
depend only on the array's length, not its elements.
   
   A more principled approach: when the body doesn't reference the item, branch 
on the function's *result* semantics instead of always full-reading:
   - derived value (array_map / array_count / array_exists) → `[arr, OFFSET]` 
(metadata only)
   - original elements (array_filter / array_first) → full read
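
   A minimal sketch of that branch, to make the rule concrete. The class, 
method, and set names are hypothetical and not Doris's actual API; paths are 
modelled as plain string lists, and the case where the body does reference 
the item is left as a conservative full read because it is out of scope here.

```java
import java.util.List;
import java.util.Set;

public final class LambdaAccessPathSketch {
    // Hypothetical classification: functions whose result is derived from the
    // lambda's return values rather than from the original elements.
    private static final Set<String> DERIVED_RESULT =
            Set.of("array_map", "array_count", "array_exists");

    /**
     * Picks the access path for the array argument of a lambda function.
     *
     * @param function           lambda function name, e.g. "array_map"
     * @param column             name of the array column, e.g. "arr"
     * @param bodyReferencesItem whether the lambda body uses its item parameter
     */
    public static List<String> accessPath(String function, String column,
                                          boolean bodyReferencesItem) {
        if (!bodyReferencesItem && DERIVED_RESULT.contains(function)) {
            // Only the number of elements matters: metadata-only read.
            return List.of(column, "OFFSET");
        }
        // Either the original elements are returned (array_filter, array_first)
        // or the body reads them: fall back to a full element read.
        return List.of(column, "*");
    }
}
```

   With this sketch, `accessPath("array_map", "arr", false)` gives 
`[arr, OFFSET]`, while `accessPath("array_filter", "arr", false)` stays at 
`[arr, *]`.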
   
   Related: `array_sort` is in the same `return collectArrayPathInLambda(...)` 
(pruning) branch but returns the original elements, reordered. If the 
comparator reads only part of an element, pruning to that part can drop data 
that `array_sort` still has to return. That's the flip side of the same 
"blanket rule vs. decide-by-result-semantics" issue and worth a separate look.
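
   The concern can be illustrated with a small stand-alone simulation. This is 
not Doris code and has not been checked against Doris's actual behaviour: 
struct elements are modelled as maps with hypothetical fields `a` and `b`, 
and `readPruned` stands in for a reader that materializes only the fields 
the comparator touches.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public final class ArraySortPruningSketch {
    /** Simulates a pruned read: keeps only the requested fields of each element. */
    public static List<Map<String, Integer>> readPruned(
            List<Map<String, Integer>> arr, Set<String> fields) {
        List<Map<String, Integer>> out = new ArrayList<>();
        for (Map<String, Integer> element : arr) {
            Map<String, Integer> kept = new TreeMap<>(element);
            kept.keySet().retainAll(fields); // drop every field not requested
            out.add(kept);
        }
        return out;
    }

    /** Stand-in for array_sort with a comparator that reads only field "a". */
    public static List<Map<String, Integer>> sortByA(List<Map<String, Integer>> arr) {
        List<Map<String, Integer>> sorted = new ArrayList<>(arr);
        sorted.sort(Comparator.comparing((Map<String, Integer> e) -> e.get("a")));
        return sorted;
    }
}
```

   Sorting the output of `readPruned(arr, Set.of("a"))` orders the elements 
correctly, but every returned element has lost `b`, whereas sorting the 
full read keeps it.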
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

