bmarcott commented on issue #27096: [SPARK-28148][SQL] Repartition after join is not optimized away
URL: https://github.com/apache/spark/pull/27096#issuecomment-572408308

Thanks for taking a look!

Yes, the reason it is here is that the shuffle/sort is introduced by EnsureRequirements itself, which makes the user-added sorts/shuffles unnecessary. Yeah, it felt a little hacky for optimization code to live in a rule called EnsureRequirements.

I'd like someone more familiar with the overall planner design to suggest whether I should go with the 1st or the 2nd option.

For the 2nd option, won't I need to create a new physical node for both the repartition and the sort? Each would be essentially a dummy physical node that relies on EnsureRequirements to add the necessary sorts/partitioning based on `requiredChildDistribution` and `requiredChildOrdering`.
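
To make the 2nd option concrete, here is a rough toy model in plain Python. It is not Spark code and does not use Spark's API: `Node`, `RepartitionMarker`, `SortMarker`, `ensure_requirements`, and `build_plan` are hypothetical names invented for this sketch, and partitioning/ordering are reduced to tuples of column names. It only illustrates the idea of a marker node that declares a requirement and leaves it to an EnsureRequirements-style pass to decide whether a shuffle or sort is actually needed.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Cols = Tuple[str, ...]


@dataclass
class Node:
    """Toy physical plan node (hypothetical; not Spark's SparkPlan)."""
    name: str
    children: List["Node"] = field(default_factory=list)
    # What this node requires of every child; None means "no requirement".
    required_child_distribution: Optional[Cols] = None
    required_child_ordering: Optional[Cols] = None
    # What this node produces; None means "inherit from the first child".
    own_partitioning: Optional[Cols] = None
    own_ordering: Optional[Cols] = None


def output_partitioning(node: Node) -> Cols:
    if node.own_partitioning is not None:
        return node.own_partitioning
    return output_partitioning(node.children[0]) if node.children else ()


def output_ordering(node: Node) -> Cols:
    if node.own_ordering is not None:
        return node.own_ordering
    return output_ordering(node.children[0]) if node.children else ()


def ensure_requirements(node: Node) -> Node:
    """Insert Shuffle/Sort under a node only where a child fails its requirement."""
    new_children = []
    for child in node.children:
        child = ensure_requirements(child)
        dist = node.required_child_distribution
        if dist is not None and output_partitioning(child) != dist:
            # A shuffle establishes the partitioning but destroys any ordering.
            child = Node("Shuffle", [child], own_partitioning=dist, own_ordering=())
        order = node.required_child_ordering
        if order is not None and output_ordering(child) != order:
            child = Node("Sort", [child], own_ordering=order)
        new_children.append(child)
    node.children = new_children
    return node


def count(node: Node, name: str) -> int:
    """Count nodes with the given name in the plan tree."""
    return int(node.name == name) + sum(count(c, name) for c in node.children)


def build_plan(user_key: str) -> Node:
    """Join on k, followed by a user repartition and sort on user_key."""
    join = Node("Join", [Node("ScanA"), Node("ScanB")],
                required_child_distribution=("k",),
                required_child_ordering=("k",))
    # Marker nodes: they do no work, they only declare requirements.
    repartition = Node("RepartitionMarker", [join],
                       required_child_distribution=(user_key,))
    return Node("SortMarker", [repartition],
                required_child_ordering=(user_key,))


if __name__ == "__main__":
    for key in ("k", "j"):
        plan = ensure_requirements(build_plan(key))
        print(key, count(plan, "Shuffle"), count(plan, "Sort"))
```

In this toy model, repartitioning and sorting on the join key `k` adds nothing beyond the two shuffles and two sorts the join already needs, while using a different key `j` adds one more of each above the join. Whether real marker nodes would behave this cleanly inside Spark's planner is exactly the open design question above.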