jonahgao commented on code in PR #7581:
URL: https://github.com/apache/arrow-datafusion/pull/7581#discussion_r1448958126


##########
datafusion/expr/src/logical_plan/plan.rs:
##########
@@ -112,6 +112,8 @@ pub enum LogicalPlan {
     /// produces 0 or 1 row. This is used to implement SQL `SELECT`
     /// that has no values in the `FROM` clause.
     EmptyRelation(EmptyRelation),
+    /// A named temporary relation with a schema.
+    NamedRelation(NamedRelation),

Review Comment:
   @matthewgapp Another rationale might be to support pushing down filters to 
the working table, which may be useful if we support spilling the working table 
to disk in the future. I don't think performance would be affected, since the 
execution of the physical plans is almost the same as it is now.
   
   I implemented a demo on [this 
branch](https://github.com/jonahgao/cte-datafusion/tree/cte-poc) and in [this 
commit](https://github.com/jonahgao/cte-datafusion/commit/8729bc8b715f4cf5bbc40e2b2224824589a0c751).
 GitHub does not allow forking a repository twice, so I directly pushed it to 
another repository for convenience.
   
   In this demo, I attempted to replace the `NamedRelation` with a 
`TableProvider`, namely `CteWorkTable`. The benefit is that it avoids 
maintaining a new logical plan variant.
   
   
   Another change is that I used a structure called `WorkTable` to connect the 
`RecursiveQueryExec` and the `WorkTableExec` (previously `ContinuanceExec`). 
The advantage is that it avoids maintaining some external context information, 
such as `relation_handlers` in `TaskContext` and the `ctx` in 
`create_initial_plan`.
   
   The `WorkTable` is a shared table. It is scanned by the `WorkTableExec` 
during the execution of the recursive term, and after that execution 
completes, the results are written back to it.
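
   To make the hand-off concrete, here is a minimal std-only sketch of the 
idea, not the code from the demo branch. `Batch`, `WorkTableScan`, 
`run_recursive_query`, `count_to_five`, and the `take`/`write` method names are 
hypothetical stand-ins; the real operators work on Arrow record batches and 
streams.

```rust
use std::sync::{Arc, Mutex};

/// Stand-in for a record batch (assumption: the real code uses Arrow batches).
type Batch = Vec<i64>;

/// Hypothetical shared work table holding the rows of the previous iteration.
#[derive(Default)]
struct WorkTable {
    batches: Mutex<Option<Vec<Batch>>>,
}

impl WorkTable {
    /// Scan side: hand over the current contents, leaving the table empty.
    fn take(&self) -> Vec<Batch> {
        self.batches.lock().unwrap().take().unwrap_or_default()
    }

    /// Driver side: store the results of the iteration that just finished.
    fn write(&self, batches: Vec<Batch>) {
        *self.batches.lock().unwrap() = Some(batches);
    }
}

/// Stand-in for `WorkTableExec`: scans whatever the work table currently holds.
struct WorkTableScan {
    table: Arc<WorkTable>,
}

impl WorkTableScan {
    fn execute(&self) -> Vec<Batch> {
        self.table.take()
    }
}

/// Stand-in for `RecursiveQueryExec`: seeds the work table with the static
/// term, then re-runs the recursive term until it produces no rows.
fn run_recursive_query(
    static_term: Vec<Batch>,
    recursive_term: impl Fn(Vec<Batch>) -> Vec<Batch>,
) -> Vec<i64> {
    // The only link between the two operators is this shared handle;
    // no external context lookup is needed.
    let table = Arc::new(WorkTable::default());
    let scan = WorkTableScan { table: Arc::clone(&table) };

    let mut output = Vec::new();
    let mut current = static_term;
    while current.iter().any(|batch| !batch.is_empty()) {
        output.extend(current.iter().flatten().copied());
        // Write the latest results back so the next scan sees them.
        table.write(current);
        current = recursive_term(scan.execute());
    }
    output
}

/// Roughly: WITH RECURSIVE t(n) AS
///   (SELECT 1 UNION ALL SELECT n + 1 FROM t WHERE n < 5) SELECT * FROM t
fn count_to_five() -> Vec<i64> {
    run_recursive_query(vec![vec![1]], |input| {
        input
            .into_iter()
            .map(|batch| {
                batch
                    .into_iter()
                    .filter(|n| *n < 5)
                    .map(|n| n + 1)
                    .collect::<Batch>()
            })
            .collect()
    })
}
```

   In this sketch, `count_to_five()` yields the values 1 through 5. The point 
is only that both sides hold the same `Arc<WorkTable>`, so nothing has to be 
registered in or looked up from a task-level context.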


