GitHub user henryr commented on the issue:

    https://github.com/apache/spark/pull/21049
  
    I might be a bit of a hardliner on this, but I think it's correct to 
eliminate the `ORDER BY` from common table expressions (MSSQL, for example, 
agrees with me; see [this 
link](https://docs.microsoft.com/en-us/sql/t-sql/queries/with-common-table-expression-transact-sql?view=sql-server-2017#guidelines-for-creating-and-using-common-table-expressions)).
    
    However, given the principle of least surprise, I agree it might be a good 
idea to at least start with scalar and nested subqueries, and leave inline 
views for another day. Restricting the rule that way might be a bit harder to 
implement (I think the rule will need a whitelist of operators below which it's 
OK to eliminate sorts), and in general I think there'll be some missed 
opportunities, but it's a start :)
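
    To make the whitelist idea concrete, here is a rough sketch on a toy plan 
tree. It is plain Python, not Catalyst code: the `Plan` class, the 
`strip_sorts` function, and the contents of `ORDER_INSENSITIVE` are all 
assumptions made up for illustration. A `Sort` is dropped only while every 
operator between it and the subquery root is on the whitelist.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Plan:
    """Toy single-child logical plan node (hypothetical, not a Spark class)."""
    op: str                          # e.g. "Scan", "Filter", "Project", "Sort", "Limit"
    child: Optional["Plan"] = None   # None for leaf nodes


# Hypothetical whitelist: operators whose result does not depend on the
# order of their input rows, so a Sort below them can be removed safely.
ORDER_INSENSITIVE = frozenset({"Filter", "Project"})


def strip_sorts(plan, order_irrelevant):
    """Remove Sort nodes whose ordering cannot be observed by the consumer.

    `order_irrelevant` is True when whoever consumes `plan` does not care
    about row order, e.g. the body of a scalar or nested subquery.
    """
    if plan is None:
        return None
    if plan.op == "Sort" and order_irrelevant:
        # The sort has no observable effect here: skip it and keep going.
        return strip_sorts(plan.child, True)
    # Order stays irrelevant below this node only if the node is whitelisted.
    child_irrelevant = order_irrelevant and plan.op in ORDER_INSENSITIVE
    return replace(plan, child=strip_sorts(plan.child, child_irrelevant))


# Subquery body: Project <- Sort <- Filter <- Scan; the Sort can go.
subquery = Plan("Project", Plan("Sort", Plan("Filter", Plan("Scan"))))
cleaned = strip_sorts(subquery, order_irrelevant=True)

# Limit is not whitelisted, so the Sort feeding it must be kept.
top_n = Plan("Limit", Plan("Sort", Plan("Scan")))
kept = strip_sorts(top_n, order_irrelevant=True)
```

    Calling the function with `order_irrelevant=False` (the inline-view case 
in this toy model) leaves the plan untouched, which is where the missed 
opportunities come from.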
    
    Alternatively, we could extend the analyzed logical plan to explicitly mark 
the different subquery types (i.e. have an `InlineView` node, a `NestedSubquery` 
node and so on). That would make these optimizations easier to express, but I 
have some reservations about the semantics of introducing those nodes. What do 
you think, @dilipbiswal / @gatorsmile?
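
    As a rough illustration of why explicit marker nodes would make the rule 
easier to express, here is a toy sketch in plain Python (again, not Catalyst 
code). The `Node` class, `drop_sorts_under_markers`, and the 
`SORT_FREE_MARKERS` set are hypothetical; only the `InlineView` and 
`NestedSubquery` names come from the proposal above. With markers in the plan, 
the rule reduces to a pattern match on the marker type.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Node:
    """Toy single-child plan node (hypothetical, not a Spark class)."""
    op: str
    child: Optional["Node"] = None


# Assumption for this sketch: sorts directly under these markers are
# unobservable. "InlineView" is deliberately left out ("another day").
SORT_FREE_MARKERS = frozenset({"NestedSubquery"})


def drop_sorts_under_markers(node):
    """Remove Sort nodes that sit directly beneath a whitelisted marker."""
    if node is None:
        return None
    child = drop_sorts_under_markers(node.child)
    if node.op in SORT_FREE_MARKERS:
        # Skip any chain of Sort nodes immediately below the marker.
        while child is not None and child.op == "Sort":
            child = child.child
    return replace(node, child=child)


nested = Node("Project", Node("NestedSubquery", Node("Sort", Node("Scan"))))
inline = Node("Project", Node("InlineView", Node("Sort", Node("Scan"))))

nested_out = drop_sorts_under_markers(nested)
inline_out = drop_sorts_under_markers(inline)
```

    In this toy model the sort under `NestedSubquery` disappears while the 
one under `InlineView` is preserved. The sketch says nothing about the harder 
question of what such marker nodes should mean semantically.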

