foxtail463 opened a new pull request, #64036:
URL: https://github.com/apache/doris/pull/64036

   ### What problem does this PR solve?
   
   Problem Summary:
   
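   Nested MV rewrite needs to distinguish two different identities during fuzzy
   StructInfo collection: the table family that an MV definition covers, and the
   exact relation set of one concrete memo candidate. Consider this shape: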
   ```sql
   -- Query side: base table + view.
   SELECT ...
   FROM fact_src t
   LEFT JOIN dim_full d0
     ON ...
   LEFT JOIN v_dim_full_non_double d1
     ON ...;
   
   -- v_dim_full_non_double is a view over dim_full.
   CREATE VIEW v_dim_full_non_double AS
   SELECT ...
   FROM dim_full
   WHERE double_flag = '0';
   
   -- Child MVs.
   CREATE MATERIALIZED VIEW mv_fact AS
   SELECT ...
   FROM fact_src;
   
   CREATE MATERIALIZED VIEW mv_dim_full AS
   SELECT ...
   FROM dim_full;
   
   CREATE MATERIALIZED VIEW mv_dim_full_view_non_double AS
   SELECT ...
   FROM v_dim_full_non_double;
   
   -- Target MV side: child MVs.
   CREATE MATERIALIZED VIEW mv_target AS
   SELECT ...
   FROM mv_fact t
   LEFT JOIN mv_dim_full d0
     ON ...
   LEFT JOIN mv_dim_full_view_non_double d1
     ON ...;
   ```
   In this shape, child rewrite can first introduce MV scan relations into the
   memo. The parent group should then be able to build a candidate plan from
   those MV scan relations and match mv_target.
   
   The old StructInfo candidate path used the table/common-table-id based cache
   key in StructInfoMap's candidate map to organize memo candidates. That key
   only describes the table family covered by one MV definition; it is a
   search-space key, not the identity of a concrete candidate. The exact
   candidate identity is relationIdSet, which describes the relations contained
   by one memo candidate plan tree.
   
   In the example above, the rewritten scan candidate for mv_dim_full and the
   rewritten scan candidate for mv_dim_full_view_non_double can fall into the
   same table/common-table-id cache key while representing different
   relationIdSet values. If one candidate overwrites or is reused as the other,
   the parent mv_target candidate is assembled with the wrong child relation, so
   the final target MV rewrite becomes path-sensitive and may fail.
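
   The collision can be reproduced in isolation. The following is a minimal
   sketch, not Doris code: the class `CoarseKeyCollision`, the table id 10, and
   the relation ids 3 and 4 are made-up values, plain strings stand in for
   candidate plans, and it assumes both MV scan candidates resolve to the same
   dim_full table family.

   ```java
   import java.util.BitSet;
   import java.util.HashMap;
   import java.util.Map;

   public class CoarseKeyCollision {
       // Build a BitSet from a list of ids.
       static BitSet bits(int... ids) {
           BitSet set = new BitSet();
           for (int id : ids) {
               set.set(id);
           }
           return set;
       }

       // Cache keyed only by the coarse table family: one slot per family.
       static Map<BitSet, String> coarseCache() {
           BitSet dimFullFamily = bits(10); // hypothetical table id of dim_full
           Map<BitSet, String> cache = new HashMap<>();
           cache.put(dimFullFamily, "scan(mv_dim_full), relationIdSet={3}");
           // Same family, different relation: this put replaces the entry above.
           cache.put(dimFullFamily, "scan(mv_dim_full_view_non_double), relationIdSet={4}");
           return cache;
       }

       // Cache keyed by the exact relationIdSet: both candidates survive.
       static Map<BitSet, String> exactCache() {
           Map<BitSet, String> cache = new HashMap<>();
           cache.put(bits(3), "scan(mv_dim_full)");
           cache.put(bits(4), "scan(mv_dim_full_view_non_double)");
           return cache;
       }

       public static void main(String[] args) {
           System.out.println("coarse key entries: " + coarseCache().size());
           System.out.println("exact key entries: " + exactCache().size());
       }
   }
   ```

   With the coarse key, whichever candidate is stored last is the one the parent
   sees, which is why the outcome depends on the order in which child rewrites
   run.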
   
   This refactor makes the identity boundary explicit:
   
   - use table ids only to expand the relation search space for an MV
   - use exact relationIdSet as the StructInfo candidate identity
   - cache candidates by target relation search space, with exact relationIdSet
     as the inner key
   - register tableId -> relationId when catalog relations enter memo, including
     nested MV scan relations
   - clear StructInfoMap candidate caches when relation identity changes
   - keep candidate plan materialization lazy until StructInfo is actually
     needed
   
   This lets base-table, view-derived, and rewritten MV-scan candidates coexist
   under the same coarse table family without overwriting each other.
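
   The identity boundary listed above can be sketched as a two-level cache. This
   is an illustration under stated assumptions, not the StructInfoMap
   implementation: `CandidateCacheSketch`, `LazyCandidate`, and every method
   name here are hypothetical, ids are plain integers, and a string stands in
   for a materialized plan.

   ```java
   import java.util.BitSet;
   import java.util.Collections;
   import java.util.HashMap;
   import java.util.Map;
   import java.util.function.Supplier;

   public class CandidateCacheSketch {
       // A candidate whose plan is built at most once, on first use.
       static final class LazyCandidate {
           private final Supplier<String> planBuilder;
           private String plan; // null until materialized

           LazyCandidate(Supplier<String> planBuilder) {
               this.planBuilder = planBuilder;
           }

           String plan() {
               if (plan == null) {
                   plan = planBuilder.get();
               }
               return plan;
           }

           boolean isMaterialized() {
               return plan != null;
           }
       }

       // tableId -> ids of the relations registered for that table.
       private final Map<Integer, BitSet> tableToRelations = new HashMap<>();
       // Outer key: relation search space. Inner key: exact relationIdSet.
       private final Map<BitSet, Map<BitSet, LazyCandidate>> candidates = new HashMap<>();

       static BitSet bits(int... ids) {
           BitSet set = new BitSet();
           for (int id : ids) {
               set.set(id);
           }
           return set;
       }

       // Called when a catalog relation (including an MV scan) enters the memo.
       void registerRelation(int tableId, int relationId) {
           tableToRelations.computeIfAbsent(tableId, k -> new BitSet()).set(relationId);
           candidates.clear(); // relation identity changed, cached candidates are stale
       }

       // Table ids are used only to expand the relation search space.
       BitSet searchSpace(int... tableIds) {
           BitSet space = new BitSet();
           for (int tableId : tableIds) {
               BitSet relations = tableToRelations.get(tableId);
               if (relations != null) {
                   space.or(relations);
               }
           }
           return space;
       }

       // Keys are cloned so later mutation by the caller cannot corrupt the maps.
       void addCandidate(BitSet space, BitSet relationIdSet, Supplier<String> planBuilder) {
           candidates.computeIfAbsent((BitSet) space.clone(), k -> new HashMap<>())
                   .putIfAbsent((BitSet) relationIdSet.clone(), new LazyCandidate(planBuilder));
       }

       LazyCandidate get(BitSet space, BitSet relationIdSet) {
           return candidates.getOrDefault(space, Collections.emptyMap()).get(relationIdSet);
       }

       int candidateCount(BitSet space) {
           return candidates.getOrDefault(space, Collections.emptyMap()).size();
       }
   }
   ```

   In this sketch, two candidates that share a search space but differ in
   relationIdSet occupy separate inner entries, and registering a new relation
   drops every cached candidate so that nothing built against the old relation
   set is reused.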
   
   ### Check List (For Author)
   
   - Test <!-- At least one of them must be included. -->
       - [x] Regression test
       - [x] Unit Test
       - [ ] Manual test (add detailed scripts or steps below)
       - [ ] No need to test or manual test. Explain why:
           - [ ] This is a refactor/code format and no logic has been changed.
           - [ ] Previous test can cover this change.
           - [ ] No code files have been changed.
           - [ ] Other reason <!-- Add your reason?  -->
   
   - Behavior changed:
       - [ ] No.
       - [x] Yes. <!-- Explain the behavior change -->
   
   - Does this need documentation?
       - [x] No.
       - [ ] Yes. <!-- Add document PR link here. eg: https://github.com/apache/doris-website/pull/1214 -->
   
   ### Check List (For Reviewer who merge this PR)
   
   - [ ] Confirm the release note
   - [ ] Confirm test cases
   - [ ] Confirm document
   - [ ] Add branch pick label <!-- Add branch pick label that this PR should merge into -->
   
   

