adriangb commented on PR #21621:
URL: https://github.com/apache/datafusion/pull/21621#issuecomment-4466063490

   Hey @kumarUjjawal, I'm afraid I don't have clear answers off the top of my
   head.
   
   > Sort(exprs, fetch=N) → Join → Sort(exprs, fetch=N), the outer Sort becomes
   > redundant whenever the join is provably 1-to-≤1 on the preserved key (e.g.
   > unique constraint on the other side, or upstream DISTINCT/GROUP BY on the
   > join column). In that case the pushed Sort's ordering survives the join
   > and the outer Sort + LIMIT could collapse to a plain Limit(N).
   
   This makes a lot of sense, and I agree with it.
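
   A rough sketch of the rewrite described in the quote above, for illustration only. It uses hypothetical toy types (`Plan`, `eliminate_redundant_outer_sort`, and the `left_rows_preserved_once_in_order` flag are all made up here), not DataFusion's actual `LogicalPlan` or optimizer rule API. It assumes some earlier analysis has already proven that the join emits each row of the sorted side exactly once and in order:

```rust
// Toy plan nodes for illustration only; NOT DataFusion's LogicalPlan.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Sort { exprs: Vec<String>, fetch: Option<usize>, input: Box<Plan> },
    Join {
        left: Box<Plan>,
        right: Box<Plan>,
        // Assumption: set by some (hypothetical) analysis that proved each
        // left row appears exactly once in the join output, in left order.
        left_rows_preserved_once_in_order: bool,
    },
    Limit { fetch: usize, input: Box<Plan> },
}

// Rewrites Sort(exprs, fetch=N) over Join over Sort(exprs, fetch=N) into
// Limit(N) over the same Join, but only when the join is flagged as
// preserving the left rows and their order. Anything else is returned as-is.
fn eliminate_redundant_outer_sort(plan: Plan) -> Plan {
    match plan {
        Plan::Sort { exprs, fetch: Some(n), input } => {
            let redundant = match &*input {
                Plan::Join { left, left_rows_preserved_once_in_order: true, .. } => matches!(
                    &**left,
                    Plan::Sort { exprs: inner, fetch: Some(m), .. }
                        if *inner == exprs && *m == n
                ),
                _ => false,
            };
            if redundant {
                Plan::Limit { fetch: n, input }
            } else {
                Plan::Sort { exprs, fetch: Some(n), input }
            }
        }
        other => other,
    }
}
```

   The sketch only matches the exact three-node shape and only the left input; how the "preserved once, in order" fact would actually be derived is the open question in this thread.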
   
   > Does the FD info on Join's output schema already carry enough to detect
   > this, or would it need more wiring?
   
   Sorry, what is an FD?
   
   > Better as a new rule (EliminateRedundantOuterSort) or an extension to an
   > existing one (EliminateLimit / a Sort-dedup pass)?
   
   I am not familiar with what the existing rules (e.g. EliminateLimit) do. If
   this fits well in an existing rule, that's best, but we shouldn't put two
   unrelated things in the same rule just for convenience.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

