Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/20560

@gatorsmile thanks for your comment. I moved it to a separate rule and added more tests. As for the added value of this rule, I see 3 main points:

1. Let's imagine that a user exposes a cached, sorted relation that other users can query via JDBC. Those users cannot know that the table is already sorted, so they may write queries that cause an unnecessary sort.
2. Many tools that generate SQL automatically are not very smart about it, so they can produce queries that cause unneeded sorts.
3. I think this also enables more interesting use cases. What I am thinking of is that some data sources may store sorted data, and if we can express this in the logical plan, we can avoid unneeded sorts.

What do you think? Thanks.
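To make the idea concrete, here is a minimal sketch of such a rule on a toy logical plan. It is written in plain Python and is not Spark's Catalyst code: the `Relation`, `Filter`, and `Sort` classes and the `remove_redundant_sorts` function are hypothetical names used only for illustration, and sort direction is ignored (everything is assumed ascending).

```python
from dataclasses import dataclass
from typing import Tuple


# Hypothetical toy plan nodes; these are NOT Spark classes.
@dataclass(frozen=True)
class Relation:
    name: str
    ordering: Tuple[str, ...] = ()  # columns the stored data is already sorted by


@dataclass(frozen=True)
class Filter:
    condition: str
    child: object


@dataclass(frozen=True)
class Sort:
    order: Tuple[str, ...]
    child: object


def output_ordering(plan):
    """Return the ordering a plan node is known to produce."""
    if isinstance(plan, Relation):
        return plan.ordering
    if isinstance(plan, Filter):
        # Dropping rows does not change the relative order of the rest.
        return output_ordering(plan.child)
    if isinstance(plan, Sort):
        return plan.order
    return ()


def satisfies(existing, required):
    """True if the existing ordering starts with the required columns."""
    return existing[:len(required)] == required


def remove_redundant_sorts(plan):
    """Drop every Sort whose child already produces the requested ordering."""
    if isinstance(plan, Sort):
        child = remove_redundant_sorts(plan.child)
        if satisfies(output_ordering(child), plan.order):
            return child
        return Sort(plan.order, child)
    if isinstance(plan, Filter):
        return Filter(plan.condition, remove_redundant_sorts(plan.child))
    return plan


# A cached relation that is already sorted by "ts" (point 1 above).
cached = Relation("cached_events", ordering=("ts",))

# Another user, unaware of the existing ordering, asks for a sort on "ts".
query = Sort(("ts",), Filter("user_id = 42", cached))
optimized = remove_redundant_sorts(query)

# A sort on a different column is still needed and must be kept.
other = remove_redundant_sorts(Sort(("user_id",), cached))
```

In this sketch `optimized` is just the `Filter` over the cached relation, while `other` keeps its `Sort`. The same check would apply to point 3 if a data source could report its ordering the way `Relation.ordering` does here.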