yaooqinn commented on PR #55708: URL: https://github.com/apache/spark/pull/55708#issuecomment-4387841858
Closing this PR. After re-reading the change with fresh eyes, the upstream user-visible motivation is too thin: in apache/spark master, no code path passes a sub-tree with an orphan `CTERelationRef` (no enclosing `WithCTE`) to `CacheManager.lookupCachedDataInternal`. `Dataset.cache()` / `CACHE TABLE` cache the full analyzed plan with `WithCTE` at the top, and `useCachedData`'s `transformDown` walk over a wider query never matches a cached entry on a sub-tree below `WithCTE` (no cached plan is shaped like an orphan `CTERelationRef` body in normal usage).

The real consumer of the fix is downstream CTE-materialization / CTE-in-memory-cache prototyping that explicitly probes `CacheManager` with sub-trees rooted at `CTERelationRef`. That work is not being upstreamed at this point, so this contract fix has no consumer in master.

Happy to revive this (or fold it into a larger PR) when there's a real upstream consumer, e.g. if a CTE-materialization rule that probes the cache lands. The branch `SPARK-56738` is preserved on my fork for that.
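For anyone skimming the thread, the matching argument can be sketched with a toy model. This is not Spark code: `Plan`, `with_cte`, and `use_cached_data` are hypothetical stand-ins, and plain structural equality is assumed in place of whatever comparison `CacheManager` actually performs. It only illustrates why a cache entry rooted at `WithCTE` is never hit by a sub-tree below `WithCTE`.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Plan:
    """Toy logical plan node (hypothetical, not a Spark class)."""
    name: str
    arg: str = ""
    children: Tuple["Plan", ...] = ()


def with_cte(cte_def: Plan, body: Plan) -> Plan:
    # Toy analogue of an analyzed plan: WithCTE sits at the root.
    return Plan("WithCTE", children=(cte_def, body))


def use_cached_data(cache: Set[Plan], plan: Plan) -> Plan:
    # Top-down walk: swap in a cached relation at the first node that
    # equals a cached plan, otherwise recurse into the children.
    # Assumption: structural equality stands in for the real lookup.
    if plan in cache:
        return Plan("InMemoryRelation", arg=plan.name)
    rewritten = tuple(use_cached_data(cache, c) for c in plan.children)
    return Plan(plan.name, plan.arg, rewritten)


cte_def = Plan("CTERelationDef", "t", (Plan("Range", "0, 10"),))
orphan_body = Plan("Project", "id", (Plan("CTERelationRef", "t"),))

# What the toy cache holds after caching the full query: a WithCTE-rooted plan.
cached_plan = with_cte(cte_def, orphan_body)
cache = {cached_plan}

# A wider query whose body merely contains the same Project over the CTE ref.
wider = with_cte(cte_def, Plan("Filter", "id > 1", (orphan_body,)))

print(use_cached_data(cache, cached_plan).name)   # same full plan
print(use_cached_data(cache, wider) == wider)     # wider query, walked top-down
print(orphan_body in cache)                       # direct probe with the orphan body
```

In this model only the identical full plan is replaced. The wider query comes back unchanged, and probing with the orphan body finds nothing, because every cached entry is rooted at `WithCTE`.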
