Your images did not come through (ASF mail server strips them from all emails) 
but I’ll try my best to answer.

As you note, we added the Spool operator to model the case where the RelNode 
tree is a DAG (one node has more than one consumer). But other than that, 
relational algebra doesn’t have a CTE concept. (In a sense every relational 
operator is a CTE. And in relational algebra the intermediate relations don’t 
have names.)

Do you intend to create CTEs only where a node has multiple consumers, or do 
you have some other criteria for ‘cutting’ the tree?

RelToSqlConverter (and its base class SqlImplementor) is very concerned with 
‘cutting’. Given an algebra tree, it tries to create the minimal number of 
SELECT blocks. The algebra scan-filter-aggregate-filter can be done in one 
block (FROM-WHERE-GROUP BY-HAVING) but scan-aggregate-filter-aggregate requires 
two blocks (because you can’t have multiple GROUP BY in one block).

Perhaps you need a version of RelToSqlConverter with a custom ‘cutting’ policy.

Julian

PS You and your colleague should subscribe to dev@ to avoid moderation delays 
and to receive replies.

> On Jul 26, 2023, at 12:58 AM, Ken Yang <yangyx891...@gmail.com> wrote:
> 
> Dear Calcite Community,
> 
> We are trying to use Calcite to parse SQLs, by breaking complicated SQLs into 
> WITH-AS blocks. Particularly, our goal is to first parse an SQL clause into 
> Calcite Logical Plans, and then manually traverse the entire tree and use 
> WITH-AS to wrap any sub-trees with a LogicalProject root. By doing so we will 
> have an easy-to-understand SQL equivalent of the original SQL. After some 
> research, we came to believe that this feature is not readily available in 
> Calcite, and decide to reach out for help as our current approach has 
> blockers.
> 
> To illustrate our current approach and blocker, we consider the following 
> SQL, and its corresponding Logical Plan provided by Calcite:
> 
> 
> Here, with a RelNode logPlan as root, we would simply take out 
> logPlan.getInput(0) and logPlan.getInput(1), transform them back into 
> SqlNodes, and then into Strings. During the transformation process between 
> SqlNodes and Strings, we would manually wrap a "WITH SQ0 AS" and "WITH SQ1 
> AS" around the Statements, making them WITH-AS blocks as desired.
> 
> The blocker we face is that we do not know how to push SQ0 and SQ1 as tables 
> into the Logical Plan. The purpose of pushing them as LogicalTableScans into 
> the tree is we want their parent node, LogicalIntersect to result in 
> something like (SELECT * FROM SQ0) INTERSECT (SELECT * FROM SQ1) when parsed 
> back into SQL. If left untreated, the resulting SQL statement would simply 
> contain two WITH-AS blocks and the entire original SQL statement, as the 
> parser would take on the entire subquery of SQ0 and SQ1 without knowing that 
> they have acquired aliases. On the other hand, as SQ0 and SQ1 are dynamically 
> generated during the parsing of the tree, we are not sure if it is 
> appropriate to add them as new tables into the schemas while we are parsing.
> 
> With the problem clearly stated, we would sincerely appreciate if you could 
> provide us with some concrete guidance on how we can complete the task on 
> this simple case. We are aware that there are some potential solutions, e.g., 
> by pushing a Spool into the Logical Plan and override the RelToSqlConverter, 
> as suggested in 
> https://lists.apache.org/thread/k0or4xyfv4bbmgtrllg40ftysbg24y0h, but we are 
> unfortunately incapable of realizing these solutions without further concrete 
> guidance.
> 
> Please kindly respond to this thread if you know how to solve this problem. 
> We will be very grateful for your help.
> 
> Best,
> Yingxiang & Boyi
> 
> PhD, Electrical and Computer Engineering
> University of Illinois at Urbana-Champaign
> Coordinated Science Laboratory, Room 108
> Urbana, Illinois, 61801.
> 

Reply via email to